diffsitter
locust
| | diffsitter | locust |
|---|---|---|
| Mentions | 15 | 4 |
| Stars | 1,517 | 47 |
| Growth | - | - |
| Activity | 8.7 | 0.0 |
| Latest commit | 5 days ago | 6 months ago |
| Language | Rust | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
diffsitter
-
AST-grep(sg) is a CLI tool for code structural search, lint, and rewriting
Or https://github.com/afnanenayet/diffsitter. I've tried both and I like them. No preference or notable opinions on them yet!
-
Enable new diff option linematch (#14537) · neovim/neovim@04fbb1d
For git diffs I've been using https://github.com/afnanenayet/diffsitter
-
Difftastic, the Fantastic Diff: How it works
One more tree-sitter based diffing tool - diffsitter
https://github.com/afnanenayet/diffsitter
-
What Comes After Git
Several threads here point to difftastic: https://github.com/Wilfred/difftastic
I know a lot of people who have a lot of hope for diffsitter (or something like it): https://github.com/afnanenayet/diffsitter
Personally, I think the reason most "good" semantic diff tools are proprietary is that they take huge amounts of effort and are mostly "hacks" and "heuristics" bandaged together, and their authors don't want to reveal how the sausage was made.
But I also think "general, language agnostic AST-based semantic diff" is a mountain peak we cannot reach (probably ever), and I believe my experiments found an interesting local maximum that people are maybe sleeping on (lexer-based diffs rather than parser-based diffs): https://github.com/WorldMaker/tokdiff
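To make the lexer-based idea concrete, here is a minimal sketch that diffs two Python snippets token by token instead of line by line. It uses only the standard library (`tokenize` and `difflib`); the helper names `lex` and `token_diff` are made up for illustration, and this is not how tokdiff itself is implemented.

```python
import difflib
import io
import tokenize

# Token types that only describe layout. Dropping INDENT/DEDENT is a
# simplification for this sketch: it also hides changes in block structure.
LAYOUT_TOKENS = {
    tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}


def lex(source: str) -> list[str]:
    """Return the token strings of Python source, skipping layout-only tokens."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in LAYOUT_TOKENS]


def token_diff(old: str, new: str) -> list[tuple[str, list[str], list[str]]]:
    """Diff two sources token by token and return only the changed runs."""
    a, b = lex(old), lex(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old_src = "total = add(1,2)\n"
new_src = "total = add(1,\n            3)\n"
changes = token_diff(old_src, new_src)
```

A line-based diff would flag both lines of `new_src` as changed, while the token diff reports only the literal that differs. Pure reformatting produces no changes at all.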
-
Fast Kernel Headers: Tree -v1: Eliminate the Linux kernel's "Dependency Hell"
https://github.com/afnanenayet/diffsitter there are quite a few projects such as this one attempting to solve the issue. :)
-
Thinking about programming systems and not just languages and environments
There’s an interesting project in the semantic diff/merge space that I have been keeping an eye out for https://github.com/afnanenayet/diffsitter
-
What if Git worked with Programming Languages?
I have never used any of them, but it looks like tree-sitter based diff tools are exactly what you are searching for (like difftastic, gumtree or diffsitter).
I believe Unison is the only attempt to do this at a programming language/environment level.
For Git diffs, there is Diffsitter, which uses Tree Sitter to generate semantic diffs of code files: https://github.com/afnanenayet/diffsitter
I have not used it, but it is high on my todo list.
-
Difftastic: A syntactic diff tool
Looks great, I'll try it! FYI, there is a very similar project called diffsitter https://github.com/afnanenayet/diffsitter
- diffsitter - a tree-sitter based AST difftool to get meaningful semantic diffs
locust
-
Effective Code Browsing
Nice!
Have been working on something similar, although my use case is more about learning how code has changed across git commits: https://github.com/bugout-dev/locust
For Javascript/Typescript/React support, like you, I hooked into the Babel toolchain. Can't recommend it highly enough.
There's also a newish project called quick-lint-js which seems to have written their own from-scratch AST parser for JS, but I haven't tried it yet: https://github.com/quick-lint/quick-lint-js
Finally, another project that I know in this space is comby (I believe it is owned/maintained by the folks at Sourcegraph): https://comby.dev/
Don't know why I dumped all those links there. Just figured there may be something useful in them for you. Am also just super passionate about building knowledge about code bases by analyzing their ASTs. Nice to meet a fellow enthusiast. :)
-
What if Git worked with Programming Languages?
I maintain a free/open source project that does exactly what the author asks for: https://github.com/bugout-dev/locust.
Our tool uses git as the foundation of its functionality. It superimposes git diffs on top of ASTs.
It is insanely powerful.
For example, we use it to power semantic code search and currently support Python, Javascript, and Java. We generate a JSON object defining the AST differences between initial and terminal commits on GitHub PRs, and doing text search on the JSON objects performs surprisingly well when we want to answer questions like, "When did we add dateutils as a dependency?" or "When did we last change the /journals handler on the API?"
The Python integration currently sees the most use but if you are interested in other languages, we would be happy to support them.
Do drop me a DM if you want help getting started with Locust.
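The text-search idea from this comment can be sketched in a few lines of Python. The records and field names below are hypothetical, invented only for illustration; they are not Locust's actual output schema.

```python
import json

# Hypothetical change summaries, one per pull request. The field names are
# invented for this sketch and do not reflect Locust's real JSON output.
summaries = [
    {"pr": 1, "changes": [{"type": "dependency", "name": "dateutils", "action": "add"}]},
    {"pr": 2, "changes": [{"type": "function", "name": "journals_handler", "action": "modify"}]},
]


def search(records: list[dict], term: str) -> list[dict]:
    """Return the records whose serialized JSON contains the term (case-insensitive)."""
    needle = term.lower()
    return [record for record in records if needle in json.dumps(record).lower()]


hits = search(summaries, "dateutils")
```

Because the summaries are already structured around definitions rather than raw lines, even a plain substring match like this can answer "which PR touched X?" style questions.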
-
Diffsitter: A tree-sitter based AST difftool to get meaningful semantic diffs
My team has a similar project (Locust: https://github.com/bugout-dev/locust) where the goal is to learn the semantic meanings of code changes in git commits, GitHub PRs, etc.
Since we took git diffs as a target for semantic analysis, we have a different approach to our diffs. We start with line-by-line diffs (specifically using "git diff") and then take a semantic diff by superimposing the git diff information on top of the initial and terminal ASTs.
This makes the diff calculation cheaper because we don't have to do full diff between trees.
Haven't updated the code in a few months, but my team is actively using Locust on public GitHub repos to learn the semantics of those code bases. We do plan to do some work on it soon to make Locust easier to use (especially as a library).
Really need to sit down and take a proper look at tree-sitter. We currently support Locust diffs for Python, Javascript, and Java, but each one is custom written and implements the same basic algorithm. It looks like tree sitter might just crush this problem for us.
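The "superimpose a line diff on an AST" approach described in this comment might look roughly like the sketch below. It is a simplification under stated assumptions, not Locust's code: it uses Python's standard `ast` module, looks only at the terminal version of a file, and takes the changed line numbers as a plain set (in practice they would come from parsing `git diff` output). The function name `changed_definitions` is made up.

```python
import ast


def changed_definitions(source: str, changed_lines: set[int]) -> list[str]:
    """Name the functions and classes whose line span contains a changed line."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive.
            span = range(node.lineno, node.end_lineno + 1)
            if any(line in span for line in changed_lines):
                hits.append(node.name)
    return hits


new_source = (
    "def parse(text):\n"
    "    return text.strip()\n"
    "\n"
    "def render(value):\n"
    "    return f'<{value}>'\n"
)
# Assumption: line 5 is the only line the line-based diff reported as changed.
touched = changed_definitions(new_source, {5})
```

Each file is parsed once and the changed lines are matched against node spans, so no tree-to-tree comparison is needed, which is the cost saving the comment points to.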
- Difftastic: Syntax-aware structured diff tool
What are some alternatives?
difftastic - a structural diff that understands syntax 🟥🟩
weggli - weggli is a fast and robust semantic search tool for C and C++ codebases. It is designed to help security researchers identify interesting functionality in large codebases.
semantic-source - Parsing, analyzing, and comparing source code across many languages
gumtree - An awesome code differencing tool
nvim-treesitter-context - Show code context
tree-sitter-json - JSON grammar for tree-sitter
TypeScript - TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
dark - Darklang main repo, including language, backend, and infra
nbdime - Tools for diffing and merging of Jupyter notebooks.
git-merge-driver - Example of how to configure a custom git merge driver
diffr - Yet another diff highlighting tool