Jupyter Notebook Dataframe

Open-source Jupyter Notebook projects categorized as Dataframe

Top 5 Jupyter Notebook Dataframe Projects

  1. hamilton

    Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

    Project mention: Show HN: I built an open-source data pipeline tool in Go | news.ycombinator.com | 2024-12-17

    I always thought Hamilton [1] does a good job of giving enough visual hooks that draw you in.

    I also noticed this pattern where library authors sometimes do a bit extra in terms of discussing and even promoting their competitors, and it makes me trust them more. A “here’s why ours is better and everyone else sucks …” section always comes across as the infomercial character who is having quite a hard time peeling an apple, to the point you wonder if this is the first time they’ve used hands.

    One thing I wish for is a tool that’s essentially just Celery that doesn’t require a message broker (and can just use a database), and which is supported on Windows. There’s always a handful of edge cases where we’re pulling data from an old 32-bit system on Windows. And basically every system has some not-quite-ergonomic workaround that’s as much work as if you’d just built it yourself.

    It seems like it’s just sending a JSON message over a queue or HTTP API and the worker receives it and runs the task. Maybe it’s way harder than I’m envisioning (but I don’t think so because I’ve already written most of it).

    I guess that’s one thing I’m not clear on with Bruin: can I run workers in different physical locations and have them carry out the tasks in the right order? Or is this more of a centralized thing (meaning even if it’s K8s or Dask or Ray, those are all run in a cluster which happens to be distributed, but they’re all machines sitting in the same subnet, which isn’t the definition of a “distributed task” I’m going for)?

    [1] https://github.com/DAGWorks-Inc/hamilton

  2. kangas

    🦘 Explore multimedia datasets at scale

  3. pdpipe

    Easy pipelines for pandas DataFrames.

  4. ickle

    DataFrame, analysis & manipulation library for tiny labeled datasets

  5. Portland-Jail-Data-Crawler

    Scraper used for recording changes to Portland jail database

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
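
The comment quoted under hamilton wishes for a Celery-like tool that needs no message broker and "can just use a database", where a JSON message is stored and a worker picks it up and runs the task. A minimal sketch of that idea, using only Python's standard library, might look like the following. The table layout, the `TASKS` registry, and the function names `init_db`, `enqueue`, and `run_next` are all hypothetical choices made for illustration; this is not how Celery, Bruin, or Hamilton work.

```python
import json
import sqlite3

# Hypothetical registry mapping task names to plain Python callables.
TASKS = {"add": lambda a, b: a + b}


def init_db(conn):
    """Create the single table that stands in for a message broker."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', "
        "result TEXT)"
    )
    conn.commit()


def enqueue(conn, name, **kwargs):
    """Store the task as a JSON message and return its row id."""
    cur = conn.execute(
        "INSERT INTO tasks (name, payload) VALUES (?, ?)",
        (name, json.dumps(kwargs)),
    )
    conn.commit()
    return cur.lastrowid


def run_next(conn):
    """Claim the oldest pending task, run it, and record the result.

    Returns the task id, or None if there was nothing to claim.
    """
    row = conn.execute(
        "SELECT id, name, payload FROM tasks "
        "WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    task_id, name, payload = row
    # The status check in WHERE turns the claim into a no-op
    # if another worker has already taken this task.
    claimed = conn.execute(
        "UPDATE tasks SET status = 'running' "
        "WHERE id = ? AND status = 'pending'",
        (task_id,),
    )
    conn.commit()
    if claimed.rowcount == 0:
        return None
    result = TASKS[name](**json.loads(payload))
    conn.execute(
        "UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
        (json.dumps(result), task_id),
    )
    conn.commit()
    return task_id
```

A worker would call `run_next` in a loop against a shared database. Running tasks in a required order across machines in different physical locations, which the comment also asks about, is not addressed by this sketch.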

Jupyter Notebook Dataframe related posts

  • Show HN: Hamilton's UI – observability, lineage, and catalog for data pipelines

    1 project | news.ycombinator.com | 2 May 2024
  • Kangas: Pandas for Multimedia Datasets

    1 project | news.ycombinator.com | 3 May 2023

Index

What are some of the best open-source Dataframe projects in Jupyter Notebook? This list will help you:

# Project Stars
1 hamilton 2,018
2 kangas 1,053
3 pdpipe 718
4 ickle 14
5 Portland-Jail-Data-Crawler 2

Did you know that Jupyter Notebook is the 13th most popular programming language based on number of references?