| | nbdime | Pandas |
|---|---|---|
| Mentions | 7 | 397 |
| Stars | 2,596 | 42,039 |
| Growth | 0.3% | 0.7% |
| Activity | 8.4 | 10.0 |
| Last commit | 4 days ago | 2 days ago |
| Language | TypeScript | Python |
| License | GNU General Public License v3.0 or later | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
nbdime
- Stuff I Learned during Hanukkah of Data 2023
I remember hearing about nbdime and thinking it sounded useful, but I've never really needed it since I rarely use Jupyter in the first place. But then I made some changes to my Hanukkah of Data 2023 notebook to work with the follow-up "speed run" challenge (a new dataset and slightly tweaked clues), and the native Git diff was too noisy to be useful. nbdime came to the rescue! Here are the changes I had to make for days 2 and 3 during the speed run:
- The Jupyter+Git problem is now solved
- Ask HN: Are there any good Diff tools for Jupyter Notebooks?
[5] ReviewNB for reviewing & diff'ing notebook PRs / Commits on GitHub
Disclaimer: While I’m the author of the last two (GitPlus & ReviewNB), I’ve represented the overall landscape in an unbiased way. I've been working on this specific problem for 3+ years and regularly talk to teams who use GitHub with notebooks.
[1] https://nbdime.readthedocs.io
- Notebooks suck: change my mind
- What if Git worked with Programming Languages?
Interesting that they mentioned Jupyter Notebooks but not nbdime (https://github.com/jupyter/nbdime), a Jupyter plugin built specifically to address this problem. Without it, diffing notebooks is not feasible.
- Jupyter diff in Magit
A bit off-topic, but someone might know: I'm working with Jupyter notebook files (.ipynb), which are basically JSON files. Git diff is very noisy, so there's nbdime, which works great in the CLI. Is there a way to make Magit aware of its integration with git diff?
- The Notepad++
I use nbdime which allows you to ignore parts of a notebook (e.g. outputs) when diffing.
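To illustrate why ignoring outputs makes notebook diffs readable, here is a minimal sketch that uses only the Python standard library. It is not how nbdime is implemented. The helper names `strip_outputs`, `source_diff`, and `make_notebook` and the tiny example notebooks are hypothetical, and the sketch assumes only the usual .ipynb layout: a top-level `cells` list whose code cells carry `outputs` and `execution_count`.

```python
import copy
import difflib
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def source_diff(old: dict, new: dict) -> list[str]:
    """Unified diff of two notebooks after dropping outputs."""
    old_lines = json.dumps(strip_outputs(old), indent=1, sort_keys=True).splitlines()
    new_lines = json.dumps(strip_outputs(new), indent=1, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(old_lines, new_lines, "old.ipynb", "new.ipynb", lineterm="")
    )


def make_notebook(source: str, output_text: str, count: int) -> dict:
    """Build a one-cell notebook dict (hypothetical example data)."""
    cell = {
        "cell_type": "code",
        "execution_count": count,
        "metadata": {},
        "outputs": [{"output_type": "stream", "name": "stdout", "text": [output_text]}],
        "source": [source],
    }
    return {"cells": [cell], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}


before = make_notebook("print(total)", "41\n", 1)
rerun = make_notebook("print(total)", "42\n", 7)       # same code, new output
edited = make_notebook("print(total + 1)", "43\n", 8)  # code actually changed

rerun_diff = source_diff(before, rerun)
edited_diff = source_diff(before, edited)

print(len(rerun_diff))         # a re-run alone produces no diff lines
print("\n".join(edited_diff))  # only the changed source line shows up
```

nbdime exposes this kind of filtering through its own command-line options; its documentation lists the exact flags.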
Pandas
- PDEP-13: The Pandas Logical Type System
- PHP Doesn't Suck Anymore
- AWS Serverless Diversity: Multi-Language Strategies for Optimal Solutions
Python is a natural fit for serverless development. It boasts a vast array of libraries, including Powertools for AWS and robust libraries for data engineers. Its versatility and excellent developer experience make it a top choice for serverless projects, offering a seamless and enjoyable development experience.
- Pandas reset_index(): How To Reset Indexes in Pandas
In data analysis, managing the structure and layout of data before analyzing it is crucial. Python offers versatile tools to manipulate data, including the often-used Pandas reset_index() method.
- Deploying a Serverless Dash App with AWS SAM and Lambda
Dash is a Python framework that enables you to build interactive frontend applications without writing a single line of Javascript. Internally and in projects we like to use it in order to build a quick proof of concept for data driven applications because of the nice integration with Plotly and pandas. For this post, I'm going to assume that you're already familiar with Dash and won't explain that part in detail. Instead, we'll focus on what's necessary to make it run serverless.
- Help Us Build Our Roadmap – Pydantic
There is a pull request to integrate this into both pydantic extra types and the pandas code [1]
[1]: https://github.com/pandas-dev/pandas/issues/53999
- Stuff I Learned during Hanukkah of Data 2023
Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.
- Introducing Flama for Robust Machine Learning APIs
pandas: A library for data analysis in Python
- Exploring Open-Source Alternatives to Landing AI for Robust MLOps
Data analysis involves scrutinizing datasets for class imbalances or protected features and understanding their correlations and representations. A classical tool like pandas would be my obvious choice for most of the analysis, and I would use OpenCV or Scikit-Image for image-related tasks.
- Mastering Pandas read_csv() with Examples - A Tutorial by Codes With Pankaj
Pandas, a powerful data manipulation library in Python, has become an essential tool for data scientists and analysts. One of its key functions is read_csv(), which allows users to read data from CSV (Comma-Separated Values) files into a Pandas DataFrame. In this tutorial, brought to you by CodesWithPankaj.com, we will explore the intricacies of read_csv() with clear examples to help you harness its full potential.
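The two pandas methods named in the excerpts above, `read_csv()` and `reset_index()`, can be tried together in a small sketch. The CSV text and column names are made up for illustration, and the data is read from an in-memory `io.StringIO` buffer rather than from a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; a file path could be passed to read_csv() instead.
csv_text = """city,year,sales
Oslo,2022,10
Oslo,2023,12
Bergen,2022,7
Bergen,2023,9
"""

# read_csv() accepts a file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Filtering keeps the original row labels (here 1 and 3).
recent = df[df["year"] == 2023]

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels.
recent_renumbered = recent.reset_index(drop=True)

# After a groupby, the group key becomes the index;
# reset_index() turns it back into an ordinary column.
totals = df.groupby("city")["sales"].sum().reset_index()

print(recent_renumbered)
print(totals)
```

Without `drop=True`, `reset_index()` keeps the old labels as a new column named `index`, which is useful when the original row positions still matter.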
What are some alternatives?
jupytext - Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
Cubes - [NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
poetry-dynamic-versioning - Plugin for Poetry to enable dynamic versioning based on VCS tags
tensorflow - An Open Source Machine Learning Framework for Everyone
nvim-treesitter-context - Show code context
orange - 🍊 Orange: Interactive data analysis
webdiff - Two-column web-based git difftool
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
locust - "git diff" over abstract syntax trees
Keras - Deep Learning for humans
unison - A friendly programming language from the future
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration