data-analysis
jsplit
| | data-analysis | jsplit |
|---|---|---|
| Mentions | 6 | 2 |
| Stars | 44 | 59 |
| Growth | - | - |
| Activity | 7.3 | 10.0 |
| Last commit | 10 months ago | over 1 year ago |
| Language | Jupyter Notebook | Go |
| License | Apache License 2.0 | Apache License 2.0 |
Stars: the number of stars that a project has on GitHub. Growth: month-over-month growth in stars.
Activity: a relative number indicating how actively a project is being developed. Recent commits carry more weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.
data-analysis
- Why a public database of hospital prices doesn't exist yet
- Open Database of Hospital Prices
  https://github.com/dolthub/data-analysis/tree/main/transpare...
- Show HN: Up to 100x Faster FastAPI with simdjson and io_uring on Linux 5.19
  Absolutely interested, on my end at least. I wrote this to manage the transparency in coverage files: https://github.com/dolthub/data-analysis/tree/main/transpare... but I'm always looking for better techniques.
  Oh wow, I see you used it on those exact files. How about that.
- Healthcare datasets with multiple continuous variables
- Beyond the trillion prices: pricing C-sections in America
  Details: data repository, code repository, and notebook. The linked GitHub repo gives you the tools you need to reproduce this analysis or create your own.
- I wrote some tools to find the prices of C-sections in America. Context in README
jsplit
- Show HN: Up to 100x Faster FastAPI with simdjson and io_uring on Linux 5.19
  Regarding the hard way, this little utility does a great job of splitting larger than memory JSON documents into collections of NDJSON files:
  https://github.com/dolthub/jsplit
- [OC] The ridiculously absurd amount of pricing data that insurance companies just publicly dumped
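
The comment above describes splitting larger-than-memory JSON documents into collections of NDJSON files. The following is a minimal Python sketch of that general idea, not jsplit's actual implementation (jsplit is written in Go, and its internals are not shown here). It assumes the input is a single top-level JSON array; the names `iter_array_items` and `split_to_ndjson`, the `part-00000.ndjson` naming scheme, and the `lines_per_file` parameter are all hypothetical and chosen only for illustration.

```python
import json
from pathlib import Path
from typing import Iterator, List, TextIO

WS = " \t\r\n"


def iter_array_items(stream: TextIO, chunk_size: int = 1 << 16) -> Iterator[object]:
    """Yield elements of a top-level JSON array, reading the stream in chunks."""
    decoder = json.JSONDecoder()
    buf, eof, started = "", False, False
    while True:
        buf = buf.lstrip(WS)
        if not started:
            if buf:
                if buf[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buf, started = buf[1:], True
                continue
        elif buf[:1] == "]":
            return  # end of the array
        elif buf[:1] == ",":
            buf = buf[1:]  # separator between elements (handled leniently)
            continue
        elif buf:
            try:
                item, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                if eof:
                    raise  # no more data is coming, so the input is malformed
            else:
                rest = buf[end:].lstrip(WS)
                # Accept the value only once its terminator is visible;
                # otherwise a number such as "12" might really be "123".
                if rest[:1] in (",", "]"):
                    yield item
                    buf = rest
                    continue
        if eof:
            raise ValueError("unexpected end of input")
        chunk = stream.read(chunk_size)
        eof = not chunk
        buf += chunk


def split_to_ndjson(stream: TextIO, out_dir, lines_per_file: int = 1000) -> List[Path]:
    """Write array elements as NDJSON, starting a new file every lines_per_file items."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    out = None
    try:
        for n, item in enumerate(iter_array_items(stream)):
            if n % lines_per_file == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"part-{len(paths):05d}.ndjson"
                paths.append(path)
                out = path.open("w", encoding="utf-8")
            # One compact JSON value per line is what makes the file NDJSON.
            out.write(json.dumps(item, separators=(",", ":")) + "\n")
    finally:
        if out is not None:
            out.close()
    return paths
```

With an open file handle, a call such as `split_to_ndjson(f, "out", lines_per_file=100_000)` would write numbered `.ndjson` parts into `out/`. Only the current chunk and the element being decoded are held in memory, so the size of the whole document matters less than the size of its largest element. Each undecodable partial element is retried after the next read, which keeps the sketch short but is slower than a true incremental parser.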
What are some alternatives?
json_benchmark - Python JSON benchmarking and "correctness".
JsonReader - A JSON pull parser for PHP
synthea - Synthetic Patient Population Simulator
simdjson-go - Golang port of simdjson: parsing gigabytes of JSON per second
japronto - Screaming-fast Python 3.5+ HTTP toolkit integrated with pipelining HTTP server based on uvloop and picohttpparser.
json-buffet
msgspec - A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
price-transparency-guide - The technical implementation guide for the tri-departmental price transparency rule.
typedload - Python library to load dynamically typed data into statically typed data structures