Tree-sitter: an incremental parsing system for programming tools

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tree-sitter

    An incremental parsing system for programming tools

  • > Does GitHub currently use tree-sitter for syntax highlighting?

    For some languages, yes. https://news.ycombinator.com/item?id=26227214

    > If yes, are the libraries open-source?

    They are! tree-sitter itself is open-source [1], as are all of the language parsers we've listed on the homepage [2]. The syntax highlighting support is documented here [3].

    [1] https://github.com/tree-sitter/tree-sitter

    [2] https://tree-sitter.github.io/tree-sitter/#available-parsers

    [3] https://tree-sitter.github.io/tree-sitter/syntax-highlightin...

  • tree-sitter-ruby

    Ruby grammar for tree-sitter

  • So what's cool is that while we don't handle that during parsing, you can use another set of tree-sitter features to do tree queries to achieve this. Here's the query for detecting Ruby locals: https://github.com/tree-sitter/tree-sitter-ruby/blob/32cd5a0... and here's some better documentation for how the query language works: https://tree-sitter.github.io/tree-sitter/syntax-highlightin....
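To make the idea of detecting locals with a tree query more concrete, here is a minimal Python sketch of the kind of bookkeeping such a query implies: nodes are tagged as scopes, definitions, or references, and a reference counts as a local only if a matching definition is already visible. It does not use tree-sitter's API. The `Node` class, its `kind` values, and `classify_references` are hypothetical names invented for illustration, and the sketch assumes every nested scope can see outer names (true of Ruby blocks, not of `def` bodies).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy syntax-tree node (hypothetical, not a tree-sitter type)."""
    kind: str                      # "scope", "definition", or "reference"
    name: str = ""
    children: list[Node] = field(default_factory=list)


def classify_references(root: Node) -> list[tuple[str, bool]]:
    """Walk the tree in source order and report, for each reference,
    whether a same-named definition is visible in an enclosing scope."""
    results: list[tuple[str, bool]] = []
    scopes: list[set[str]] = [set()]   # stack of name sets, innermost last

    def visit(node: Node) -> None:
        if node.kind == "scope":
            scopes.append(set())       # names defined here vanish on exit
            for child in node.children:
                visit(child)
            scopes.pop()
            return
        if node.kind == "definition":
            scopes[-1].add(node.name)
        elif node.kind == "reference":
            is_local = any(node.name in names for names in scopes)
            results.append((node.name, is_local))
        for child in node.children:
            visit(child)

    visit(root)
    return results


# Roughly: x = 1; x; puts; [1].each { x; y = 2 }; y
tree = Node("scope", children=[
    Node("definition", "x"),
    Node("reference", "x"),
    Node("reference", "puts"),
    Node("scope", children=[Node("reference", "x"), Node("definition", "y")]),
    Node("reference", "y"),
])
print(classify_references(tree))
```

In this toy tree, `puts` and the final `y` are not classified as locals, because no definition of either name is in scope at the point of reference.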

  • elisp-tree-sitter

    Emacs Lisp bindings for tree-sitter

  • Emacs does have a package to use tree-sitter [0]. I think emacs-lsp is aware of this highlighting backend and performs pretty well.

    (semantic highlighting is pretty slow for C++ with font-lock, with tree-sitter it's a breeze :))

    [0]: https://github.com/ubolonton/emacs-tree-sitter

  • tree-sitter-go

    Go grammar for tree-sitter

  • Worth calling out that the syntax highlighting support is used to highlight several languages in github.com. (Linguist is still used for the long tail of languages, but we plan to migrate more and more over to tree-sitter-based highlighting over time.)

    The query language is also what's used to drive the fuzzy/ctags-like Code Navigation feature. Both of those are powered by tree-sitter query files defined in each language's repo, like these for Go: https://github.com/tree-sitter/tree-sitter-go/tree/master/qu...
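As a rough illustration of what "ctags-like" navigation can mean, the Python sketch below indexes tagged definitions and references by name only, with no scope or type resolution. This is an assumption about the general approach, not a description of GitHub's implementation. The `Tag` record, the `build_index` and `goto_definition` helpers, and the file paths are all hypothetical; in practice the tags would come from running a language's query files over parsed source.

```python
from collections import defaultdict
from typing import NamedTuple


class Tag(NamedTuple):
    """One captured name (hypothetical record, not a tree-sitter type)."""
    name: str
    kind: str   # "definition" or "reference"
    path: str
    line: int


def build_index(tags):
    """Group tag locations by name, then by kind."""
    index = defaultdict(lambda: {"definition": [], "reference": []})
    for tag in tags:
        index[tag.name][tag.kind].append((tag.path, tag.line))
    return index


def goto_definition(index, name):
    """Return every definition site sharing this name (name-only matching)."""
    entry = index.get(name)
    return entry["definition"] if entry else []


tags = [
    Tag("Parse", "definition", "parser.go", 10),
    Tag("Parse", "definition", "legacy/parser.go", 42),
    Tag("Parse", "reference", "main.go", 7),
    Tag("Render", "reference", "main.go", 9),
]
index = build_index(tags)
print(goto_definition(index, "Parse"))
print(index["Parse"]["reference"])
```

Because matching is by name alone, a lookup for `Parse` returns both definition sites. That trade-off keeps the index cheap to build and language-agnostic, at the cost of precision.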

  • parser

    A Ruby parser. (by whitequark)

  • This is more a function of Ruby than of tree-sitter. The tree-sitter grammars for other languages are hopefully less inscrutable. For Ruby, we basically just ported whitequark's parser [1] over to tree-sitter's grammar DSL and scanner API.

    [1] https://github.com/whitequark/parser

  • tree-sitter-c

    C grammar for tree-sitter

  • [1] https://github.com/tree-sitter/tree-sitter-c/issues/51

  • nvim-treesitter

    Nvim Treesitter configurations and abstraction layer

  • There's been some recent discussion as to whether tree-sitter grammars can be used to parse markdown with some hacks, with no consensus among plugin authors:

    https://github.com/nvim-treesitter/nvim-treesitter/issues/87...

    Could you possibly chime in on that discussion and help them with any insights you might have? That would be really awesome! TIA <3

  • lsif-os

    A (mostly) language-agnostic indexer for generating LSIF data.

  • I'm curious to see if Tree-sitter can be used to provide fast and rich code navigation. I was able to implement simple goto definition/references [1], but I'm not sure whether it can be used for more advanced navigation features in a language-agnostic way.

    If you're interested, GitHub is already using it [2] for that purpose and Sourcegraph is experimenting with it [3].

    [1] https://github.com/alidn/lsif-os

  • sourcegraph

    Code AI platform with Code Search & Cody

  • [3] https://github.com/sourcegraph/sourcegraph/issues/17378

  • csharp-mode

    A major-mode for editing C# in emacs

  • Tooting my own horn, Emacs’ csharp-mode[1] is undergoing a rewrite to be 100% based on tree-sitter rather than regexps.

    The new code runs way faster and is so much nicer to work with.

    Once all the kinks are gone, I can’t imagine going back.

    [1] https://github.com/emacs-csharp/csharp-mode/blob/master/csha...

  • tree-sitter-kotlin

    Kotlin grammar for Tree-sitter

  • Since the feature launched, a Kotlin tree-sitter implementation has become available: https://github.com/fwcd/tree-sitter-kotlin

  • Moose

    MOOSE - Platform for software and data analysis. (by moosetechnology)

  • Could you compare Sourcegraph to something like Moose, FAMIX, GToolkit?

    https://github.com/moosetechnology/Moose

  • PHP Parser

    A PHP parser written in PHP

  • I wish there was a more universal format for parsers, but I just don't think there are enough people who know their stuff.

    Take PHP, a language that a lot of people use: the tree-sitter-php extension doesn't support features added in 2019, let alone features added towards the end of 2020.

    If you want an up-to-date PHP parser, there's really only one open-source parser[0] that's accurate enough to be used on PHP codebases old and new, and it's written in PHP. Then if you want to parse in a robust fashion you have to adopt a number of hacks to get everything working.

    I hadn't encountered LSIF before – can GitHub be configured to use those maps?

    [0] https://github.com/nikic/PHP-Parser

