Building a high performance JSON parser

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. simdjson

    Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

    Everything you said is totally reasonable. I'm a big fan of napkin math and theoretical upper bounds on performance.

    simdjson (https://github.com/simdjson/simdjson) claims to fully parse JSON on the order of 3 GB/sec, which is faster than OP's Go whitespace parsing! The tests ran on different hardware, so it's not apples-to-apples.

    The phrase "cannot go faster than this" is just begging for a "well ackshully", which I hate to do. But the fact that there is an existence proof of Problem A running faster in C++ SIMD than OP's Problem B in scalar Go is quite interesting and worth calling out, imho. I admit it doesn't change the rest of the post.
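
    The napkin math in this comment boils down to bytes processed divided by elapsed time. Below is a minimal Go sketch of that measurement for a scalar whitespace-counting loop. The function names and the synthetic 64 MiB input are assumptions made for illustration; this is not OP's code, and the numbers it prints depend entirely on the machine it runs on.

```go
package main

import (
	"fmt"
	"time"
)

// countWhitespace is a scalar loop that looks at one byte at a time and
// counts the four whitespace bytes JSON allows between tokens.
func countWhitespace(data []byte) int {
	n := 0
	for _, c := range data {
		if c == ' ' || c == '\t' || c == '\n' || c == '\r' {
			n++
		}
	}
	return n
}

// throughputGBps converts a byte count and an elapsed time in seconds
// into gigabytes per second (1 GB = 1e9 bytes).
func throughputGBps(bytes int, seconds float64) float64 {
	return float64(bytes) / seconds / 1e9
}

func main() {
	// Hypothetical synthetic input; a real measurement would use real JSON.
	data := make([]byte, 64<<20)
	for i := range data {
		if i%4 == 0 {
			data[i] = ' '
		} else {
			data[i] = 'x'
		}
	}

	start := time.Now()
	n := countWhitespace(data)
	elapsed := time.Since(start).Seconds()

	fmt.Printf("whitespace bytes: %d\n", n)
	fmt.Printf("throughput: %.2f GB/s\n", throughputGBps(len(data), elapsed))
}
```

    A single timed run like this is noisy; Go's testing package benchmarks would give steadier figures for a real comparison.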

  3. sonic

    A blazingly fast JSON serializing & deserializing library (by bytedance)

    Also worth looking at https://github.com/bytedance/sonic

  4. ojg

    Optimized JSON for Go

    You might want to take a look at https://github.com/ohler55/ojg. It takes a different approach with a single pass parser. There are some performance benchmarks included on the README.md landing page.

  5. JSMN

    Jsmn is a world fastest JSON parser/tokenizer. This is the official repo replacing the old one at Bitbucket

    I like how https://github.com/zserge/jsmn works. I thought it would be neat to have such a parser for https://github.com/vshymanskyy/muon

  6. muon

    µON - a compact and simple binary object notation (by vshymanskyy)

  7. pulldown-cmark

    An efficient, reliable parser for CommonMark, a standard dialect of Markdown

    I also really like this paradigm. It’s just that in old crusty null-terminated C style this is really awkward, because the input data must be copied or modified. It’s not an issue when using slices (length and pointer). Unfortunately, most of the C standard library and many operating system APIs expect null-terminated strings.

    I’ve seen this referred to as a pull parser in a Rust library? (https://github.com/raphlinus/pulldown-cmark)
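
    To make the slice point concrete, here is a minimal Go sketch of a pull-style tokenizer whose tokens are sub-slices of the input, so nothing is copied or modified. The Scanner, Token, and Kind names are invented for this sketch; it is not how jsmn or pulldown-cmark are implemented, and it only finds token boundaries without validating the JSON.

```go
package main

import "fmt"

// Kind identifies a token. The names are made up for this sketch.
type Kind int

const (
	Punct   Kind = iota // one of { } [ ] : ,
	String              // a quoted string, quotes included
	Literal             // number, true, false, or null (not validated)
)

// Token points back into the input: Text is a sub-slice, not a copy.
type Token struct {
	Kind Kind
	Text []byte
}

// Scanner is a pull-style tokenizer: the caller asks for one token at a time.
type Scanner struct {
	data []byte
	pos  int
}

func NewScanner(data []byte) *Scanner { return &Scanner{data: data} }

func isSpace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

func isDelim(c byte) bool {
	switch c {
	case '{', '}', '[', ']', ':', ',', '"':
		return true
	}
	return isSpace(c)
}

// Next returns the next token, or ok == false at the end of the input.
func (s *Scanner) Next() (tok Token, ok bool) {
	for s.pos < len(s.data) && isSpace(s.data[s.pos]) {
		s.pos++
	}
	if s.pos >= len(s.data) {
		return Token{}, false
	}
	start := s.pos
	switch s.data[s.pos] {
	case '{', '}', '[', ']', ':', ',':
		s.pos++
		return Token{Punct, s.data[start:s.pos]}, true
	case '"':
		s.pos++ // opening quote
		for s.pos < len(s.data) && s.data[s.pos] != '"' {
			if s.data[s.pos] == '\\' {
				s.pos++ // skip the escaped byte
			}
			s.pos++
		}
		if s.pos < len(s.data) {
			s.pos++ // closing quote
		} else {
			s.pos = len(s.data) // unterminated string: clamp to end of input
		}
		return Token{String, s.data[start:s.pos]}, true
	default:
		for s.pos < len(s.data) && !isDelim(s.data[s.pos]) {
			s.pos++
		}
		return Token{Literal, s.data[start:s.pos]}, true
	}
}

func main() {
	input := []byte(`{"name": "jsmn", "fast": true}`)
	s := NewScanner(input)
	for tok, ok := s.Next(); ok; tok, ok = s.Next() {
		fmt.Printf("%d %s\n", tok.Kind, tok.Text)
	}
}
```

    Because each Token.Text aliases the input buffer, the caller has to keep that buffer alive and unmodified for as long as the tokens are in use.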

  8. go-jsonschema

    A tool to generate Go data types from JSON Schema definitions.

    For JSON Schema specifically there are some tools like go-jsonschema[1], but I've never used them personally. You can, however, use something like ffjson[2] in Go to generate static serialize/deserialize functions from a struct definition.

    [1] https://github.com/omissis/go-jsonschema

    [2] https://github.com/pquerna/ffjson

  10. ffjson

    faster JSON serialization for Go
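
    As a rough picture of what a "static" serializer means, here is a hand-written Go sketch that writes a struct's fields directly instead of using reflection. The Point type and the appendString helper are hypothetical, the code covers serialization only, and it is not ffjson's actual generated output. The string escaper assumes valid UTF-8 input and handles only quotes, backslashes, and control characters.

```go
package main

import (
	"fmt"
	"strconv"
)

// Point is a hypothetical struct used only for this sketch.
type Point struct {
	Name string
	X, Y int
}

// appendString appends s as a JSON string literal. It escapes quotes,
// backslashes, and control characters, and passes other bytes through.
func appendString(buf []byte, s string) []byte {
	const hex = "0123456789abcdef"
	buf = append(buf, '"')
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '"' || c == '\\':
			buf = append(buf, '\\', c)
		case c < 0x20:
			buf = append(buf, '\\', 'u', '0', '0', hex[c>>4], hex[c&0xf])
		default:
			buf = append(buf, c)
		}
	}
	return append(buf, '"')
}

// MarshalJSON writes the fields in a fixed order without reflection.
// Defining it makes Point satisfy encoding/json's Marshaler interface.
func (p Point) MarshalJSON() ([]byte, error) {
	buf := make([]byte, 0, 64)
	buf = append(buf, `{"name":`...)
	buf = appendString(buf, p.Name)
	buf = append(buf, `,"x":`...)
	buf = strconv.AppendInt(buf, int64(p.X), 10)
	buf = append(buf, `,"y":`...)
	buf = strconv.AppendInt(buf, int64(p.Y), 10)
	buf = append(buf, '}')
	return buf, nil
}

func main() {
	b, err := Point{Name: "origin", X: 0, Y: 0}.MarshalJSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

    The appeal of generating this kind of code is that the field layout is fixed at build time, so nothing has to inspect the type at run time.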

  11. jsoncut

  12. benchmarks

    Some benchmarks of different languages

  13. jsonrepair

    Repair invalid JSON documents

    The jsonrepair tool https://github.com/josdejong/jsonrepair might interest you. It's tailored to fix JSON strings.

    I've been looking into something similar for handling partial JSON, where you only have the first n chars of a document. This is common with LLMs that stream their output to reduce latency. If one knows the JSON schema ahead of time, then one can start processing the first fields before the remaining data has fully loaded. If you have to wait for the whole thing to load, there is little point in streaming.

    Was looking for a library that could do this parsing.
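
    One way to approach this in Go, without a dedicated library, is the token-level API of the standard encoding/json Decoder. The sketch below handles each top-level field as soon as its value has fully arrived and stops with an error when the input runs out. The readFields and parsePrefix names are made up for this sketch, and it only covers a top-level object. One caveat: a number at the very end of a chunk may itself be cut short and still decode as a shorter number.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// readFields decodes the top-level fields of a JSON object from r and calls
// onField for each key/value pair that has arrived completely. It returns
// nil if the object was closed, or the decoder's error if the input ended
// early or was malformed.
func readFields(r io.Reader, onField func(key string, value interface{})) error {
	dec := json.NewDecoder(r)

	// Expect the opening '{'.
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return fmt.Errorf("expected '{', got %v", tok)
	}

	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return err
		}
		key, ok := keyTok.(string)
		if !ok {
			return fmt.Errorf("expected string key, got %v", keyTok)
		}
		var value interface{}
		if err := dec.Decode(&value); err != nil {
			return err // the value was cut off or malformed
		}
		onField(key, value)
	}

	// Consume the closing '}'; this fails if the input stopped before it.
	_, err = dec.Token()
	return err
}

// parsePrefix is a convenience wrapper that runs readFields on a string
// and collects the keys whose values were complete.
func parsePrefix(s string) (keys []string, err error) {
	err = readFields(strings.NewReader(s), func(k string, _ interface{}) {
		keys = append(keys, k)
	})
	return keys, err
}

func main() {
	// Simulate a stream that has been cut off mid-value.
	partial := `{"title": "Hello", "score": 7, "body": "Lorem ip`
	err := readFields(strings.NewReader(partial), func(k string, v interface{}) {
		fmt.Printf("%s = %v\n", k, v)
	})
	fmt.Println("stopped:", err)
}
```

    With a real network stream the reader would block until more bytes arrive rather than hitting end of input, so the callback would fire as each field completes.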

  14. go

    The Go programming language

    Obviously you can manually inline functions. That's what happened in the article.

    The comment is about having a directive or annotation to make the compiler inline the function for you, which Go does not have. IMO, the pre-inline code was cleaner. It's a shame that the compiler could not optimize it.

    There was once a proposal for this, but it's really against Go's design as a language.

    https://github.com/golang/go/issues/21536
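
    For readers who have not seen what manual inlining looks like, here is a small Go sketch with the same loop written both ways. The function names are invented and this is not the code from the article. Whether the compiler inlines the helper version on its own can be checked with go build -gcflags=-m, which prints its inlining decisions.

```go
package main

import "fmt"

// isSpace reports whether c is JSON whitespace. A helper this small is
// usually within the compiler's inlining budget, but that is not guaranteed.
func isSpace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

// skipSpaceHelper keeps the predicate in a separate function and relies on
// the compiler to inline the call.
func skipSpaceHelper(data []byte, i int) int {
	for i < len(data) && isSpace(data[i]) {
		i++
	}
	return i
}

// skipSpaceInlined is the same loop with the predicate written out by hand,
// which removes the call regardless of what the compiler decides.
func skipSpaceInlined(data []byte, i int) int {
	for i < len(data) {
		c := data[i]
		if c != ' ' && c != '\t' && c != '\n' && c != '\r' {
			break
		}
		i++
	}
	return i
}

func main() {
	data := []byte(" \t\n {\"k\": 1}")
	fmt.Println(skipSpaceHelper(data, 0), skipSpaceInlined(data, 0))
}
```

    Both functions return the same index; the trade-off is readability against a guarantee that no call overhead remains.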

  15. graphql-go-tools

    GraphQL Router / API Gateway framework written in Golang, focussing on correctness, extensibility, and high-performance. Supports Federation v1 & v2, Subscriptions & more.

    I've taken a very similar approach and built a GraphQL tokenizer and parser (amongst many other things) that's also zero memory allocations and quite fast. In case you'd like to check out the code: https://github.com/wundergraph/graphql-go-tools

  16. jsb

    Fast json <=> binary serializer library for C

    Writing a JSON parser is definitely an educational experience. I wrote one this summer for my own purposes that is decently fast: https://github.com/nwpierce/jsb

  17. json5-spec

    The JSON5 Data Interchange Format

  18. Visual Studio Code

    Visual Studio Code

  19. gqlscan

    GraphQL lexical scanner for Go

    You might also want to check out this abomination of mine: https://github.com/graph-guard/gqlscan

    I gave a talk about this; unfortunately it wasn't recorded. I tried to squeeze as much out of Go as I could, and I went crazy doing that :D

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • A Journey building a fast JSON parser and full JSONPath

    5 projects | news.ycombinator.com | 11 Oct 2023
  • Ask HN: What are some Golang tools you can't live without?

    2 projects | news.ycombinator.com | 23 May 2023
  • I wrote a JSON parsing library that makes it easy to query and even do arithmetic operations on JSON.

    3 projects | /r/golang | 3 Jul 2022
  • What is the best solution to unique data in golang

    7 projects | /r/golang | 5 Aug 2021
  • OjG now has a tokenizer that is almost 10 times faster than json.Decode

    7 projects | /r/golang | 18 Apr 2021
