jsonmerge_git_merge_driver
nbdime
| | jsonmerge_git_merge_driver | nbdime |
|---|---|---|
| Mentions | 1 | 7 |
| Stars | 0 | 2,595 |
| Growth | - | 1.0% |
| Activity | 0.0 | 8.7 |
| Last commit | over 2 years ago | about 1 month ago |
| Language | Python | TypeScript |
| License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
jsonmerge_git_merge_driver
- What if Git worked with Programming Languages?
I investigated the option of using a custom git merge driver for a project where we were planning to version control a bunch of data files using git.
Here's a proof-of-concept Python merge driver I bashed together at the time to auto-merge JSON objects: https://github.com/fcostin/jsonmerge_git_merge_driver
This never went anywhere near production, but it was very easy to put together something basic.
One complication with using a custom merge driver, as discussed by https://github.com/Praqma/git-merge-driver , is that it needs to be configured inside the `.git/config` of the repo, which is not itself version controlled. So there's additional config management overhead in rolling it out to everyone's machine. Additionally, if hosting of the git repos is outsourced, the hosting platform may not support installing and configuring a custom merge driver for the merges it performs (e.g. merges created by the github.com pull request workflow).
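To make the moving parts concrete, here is a minimal sketch of what such a driver could look like. It is not the code from the linked repo: the file name `json_merge_driver.py`, the driver name `jsonmerge`, and the conflict policy (fail whenever both sides changed the same non-object value) are all assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Illustrative three-way merge driver for JSON objects (a sketch, not the linked repo's code).

Assumed wiring (the names are hypothetical):
  .gitattributes:  *.json merge=jsonmerge
  .git/config:     [merge "jsonmerge"]
                       driver = python3 json_merge_driver.py %O %A %B
Git replaces %O, %A and %B with temp files holding the ancestor, our version
and their version. The driver writes the result to the %A file and signals a
conflict with a non-zero exit status.
"""
import json
import sys

_MISSING = object()  # marks a key that is absent in one of the versions


class MergeConflict(Exception):
    """Raised when both sides changed the same value in incompatible ways."""


def merge_values(base, ours, theirs, path="$"):
    if ours == theirs:
        return ours      # both sides agree (or neither changed anything)
    if ours == base:
        return theirs    # only their side changed
    if theirs == base:
        return ours      # only our side changed
    if isinstance(ours, dict) and isinstance(theirs, dict):
        base_dict = base if isinstance(base, dict) else {}
        merged = {}
        # Keep our key order, then append keys that only they have.
        keys = list(ours) + [k for k in theirs if k not in ours]
        for key in keys:
            result = merge_values(
                base_dict.get(key, _MISSING),
                ours.get(key, _MISSING),
                theirs.get(key, _MISSING),
                f"{path}.{key}",
            )
            if result is not _MISSING:  # _MISSING means the key was deleted
                merged[key] = result
        return merged
    raise MergeConflict(path)


def main(argv):
    base_path, ours_path, theirs_path = argv[1:4]
    versions = []
    for file_path in (base_path, ours_path, theirs_path):
        with open(file_path, encoding="utf-8") as handle:
            versions.append(json.load(handle))
    try:
        merged = merge_values(*versions)
    except MergeConflict as conflict:
        print(f"JSON merge conflict at {conflict}", file=sys.stderr)
        return 1
    with open(ours_path, "w", encoding="utf-8") as handle:
        json.dump(merged, handle, indent=2)
        handle.write("\n")
    return 0


if __name__ == "__main__" and len(sys.argv) >= 4:
    sys.exit(main(sys.argv))
```

Only the `.gitattributes` line can be committed. The `[merge "jsonmerge"]` section lives in each person's git config, which is the rollout overhead described above.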
One idea I had at the time was using external schema files (e.g. JSON Schema for JSON files) to help guide/constrain the result of the merge. I never implemented it, but it should be possible. If the schemas were also version controlled in the same git repo that stores the data, you'd need to figure out which one (and which version) to load when resolving a merge conflict of a data file. There doesn't seem to be a well-supported, robust way for a merge driver script to discover the source and destination branches, but there are some potentially fragile ways of doing it that work some of the time.
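As a rough illustration of the "constrain the result" half of that idea, a driver could check a merged object against a schema before accepting it. The sketch below is hypothetical: `schema_errors` is an invented helper that understands only the `type`, `required` and `properties` keywords, which is a small subset of JSON Schema. It also sidesteps the harder question of which schema version to load.

```python
# Map a few JSON Schema type names onto Python types.
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "null": type(None),
}


def schema_errors(value, schema, path="$"):
    """Return a list of human-readable violations (empty if the value passes)."""
    expected = schema.get("type")
    if expected is not None:
        matches = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(value, bool):
            matches = False
        if not matches:
            return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required key missing")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(schema_errors(value[key], subschema, f"{path}.{key}"))
    return errors


# Example schema and a merged result that violates it.
schema = {
    "type": "object",
    "required": ["id", "tags"],
    "properties": {"id": {"type": "integer"}, "tags": {"type": "array"}},
}
problems = schema_errors({"id": "1"}, schema)
```

A driver could treat a non-empty error list like a conflict and exit non-zero, leaving the file for a human to resolve. A real implementation would more likely use a full JSON Schema validator than this hand-rolled check.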
nbdime
- Stuff I Learned during Hanukkah of Data 2023
I remember hearing about nbdime and thinking it sounded useful, but I've never really needed it since I rarely use Jupyter in the first place. But then I made some changes to my Hanukkah of Data 2023 notebook to work with the follow-up "speed run" challenge (a new dataset and slightly tweaked clues), and the native Git diff was too noisy to be useful. nbdime came to the rescue! Here are the changes I had to make for days 2 and 3 during the speed run:
- The Jupyter+Git problem is now solved
- Ask HN: Are there any good Diff tools for Jupyter Notebooks?
[5] ReviewNB for reviewing & diff'ing notebook PRs / Commits on GitHub
Disclaimer: While I’m the author of the last two (GitPlus & ReviewNB), I’ve represented the overall landscape in an unbiased way. I've been working on this specific problem for 3+ years & regularly talk to teams who use GitHub with notebooks.
[1] https://nbdime.readthedocs.io
- Notebooks suck: change my mind
- What if Git worked with Programming Languages?
Interesting that they mentioned Jupyter Notebooks but not NBDime https://github.com/jupyter/nbdime , which is a Jupyter plugin specifically to address this problem. Without it, diffing notebooks is not feasible.
- Jupyter diff in Magit
A bit off-topic, but someone might know: I'm working with Jupyter notebook files (ipynb), which are basically JSON files. Git diff is very noisy, so there's nbdime, which works great in the CLI. Is there a way to make Magit aware of nbdime's integration with git diff?
- The Notepad++
I use nbdime which allows you to ignore parts of a notebook (e.g. outputs) when diffing.
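To see why ignoring outputs helps, recall that an `.ipynb` file is JSON in which each code cell carries its `outputs` and `execution_count` next to its `source`. The sketch below is a standard-library stand-in for the idea, not nbdime's implementation or API. The function names are invented for illustration. It blanks those fields and then runs an ordinary text diff on the result.

```python
import difflib
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counters cleared."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def diff_ignoring_outputs(old, new):
    """Unified diff of two notebooks after removing outputs and execution counts."""
    old_text = json.dumps(strip_outputs(old), indent=1, sort_keys=True).splitlines()
    new_text = json.dumps(strip_outputs(new), indent=1, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(old_text, new_text, fromfile="old", tofile="new", lineterm="")
    )


# Two runs of the same notebook: identical source, different outputs.
first_run = {
    "cells": [
        {"cell_type": "code", "source": ["x = 1\n"],
         "outputs": [{"text": "1"}], "execution_count": 3}
    ]
}
second_run = {
    "cells": [
        {"cell_type": "code", "source": ["x = 1\n"],
         "outputs": [{"text": "2"}], "execution_count": 7}
    ]
}
rerun_diff = diff_ignoring_outputs(first_run, second_run)
```

Re-running a notebook without editing it then produces an empty diff, while a change to a cell's source still shows up. nbdime goes further by diffing cell by cell rather than as flat text, so its own options are the better choice in practice.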