simdjson-go
data-analysis
 | simdjson-go | data-analysis |
---|---|---|
Mentions | 6 | 6 |
Stars | 1,757 | 44 |
Growth | 1.0% | - |
Activity | 4.0 | 7.3 |
Last commit | 5 months ago | 10 months ago |
Language | Go | Jupyter Notebook |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
simdjson-go
- Show HN: Up to 100x Faster FastAPI with simdjson and io_uring on Linux 5.19
Speaking of Go, there's a simdjson implementation for golang too:
> Performance wise, simdjson-go runs on average at about 40% to 60% of the speed of simdjson. Compared to Golang's standard package encoding/json, simdjson-go is about 10x faster.
I haven't tried it yet but I don't really need that speed.
https://github.com/minio/simdjson-go
- How to Use AVX512 in Golang
I agree. For performance-sensitive situations, C/C++ or Rust is the only choice. However, many developers choose Go or other languages for engineering efficiency. A typical use case of SIMD in Go is simdjson-go. Besides, there are plenty of bindings and ports of simdjson. Developers working in those other languages also need the performance gains that native instructions such as SIMD provide.
- Sonic: A fast JSON serializing and deserializing library
- Whats the fastest JSON unmarshaling package as of right now?
- What is the best solution to unique data in golang
I suggest using a streaming library to parse your file, such as jstream or simdjson-go.
- I wrote yet another json parser. It may be a contender for fastest.
You can also try comparing with https://github.com/minio/simdjson-go. It uses a different API, but it would be good to compare nevertheless.
data-analysis
- Why a public database of hospital prices doesn't exist yet
- Open Database of Hospital Prices
https://github.com/dolthub/data-analysis/tree/main/transpare...
- Show HN: Up to 100x Faster FastAPI with simdjson and io_uring on Linux 5.19
Absolutely interested, on my end at least. I wrote this to manage the transparency in coverage files: https://github.com/dolthub/data-analysis/tree/main/transpare... but I'm always looking for better techniques.
Oh wow, I see you used it on those exact files. How about that.
- Healthcare datasets with multiple continuous variables
- Beyond the trillion prices: pricing C-sections in America
Details: data repository, code repository, and notebook. The linked GitHub repo gives you the tools you need to reproduce this analysis or create your own.
- I wrote some tools to find the prices of C-sections in America. Context in README
What are some alternatives?
easyjson - Fast JSON serializer for golang.
json_benchmark - Python JSON benchmarking and correctness.
jstream - Streaming JSON parser for Go
synthea - Synthetic Patient Population Simulator
jsonparser - One of the fastest alternative JSON parsers for Go that does not require a schema
jsplit - A Go program to split large JSON files into many jsonl files
sonic - A blazingly fast JSON serializing & deserializing library
japronto - Screaming-fast Python 3.5+ HTTP toolkit integrated with pipelining HTTP server based on uvloop and picohttpparser.
jsonlite - A simple, self-contained, serverless, zero-configuration, json document store.
msgspec - A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
rjson - A fast json parser for go
typedload - Python library to load dynamically typed data into statically typed data structures