| | csvz | parquet-wasm |
|---|---|---|
| Mentions | 3 | 6 |
| Stars | 30 | 464 |
| Growth | - | - |
| Activity | 4.4 | 9.0 |
| Last commit | over 3 years ago | 2 days ago |
| Language | | Rust |
| License | Creative Commons Zero v1.0 Universal | Apache License 2.0 |
Stars: the number of stars that a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
csvz
- CSVZ: Zipped CSV files with optional metadata (2020)
- Why isn’t there a decent file format for tabular data?
  I put some work into creating a standard, csvz, for putting CSV files and their metadata into a zip file.
  https://github.com/secretGeek/csvz
  It’s a pretty powerful concept.
  SimonW’s preferred technique of using SQLite as the means of exchange is also very powerful, particularly when combined with all of the utils he maintains.
- It's Time to Retire the CSV
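
The idea quoted above, CSV files bundled with their metadata in a single zip archive, can be sketched with the Python standard library. This is only an illustration of the general concept, not the csvz specification itself: the `metadata.json` file name, the metadata layout, and the helper names `write_bundle` and `read_bundle` are all hypothetical choices made for this example.

```python
import csv
import io
import json
import zipfile


def write_bundle(tables, metadata):
    """Pack CSV tables and a metadata file into an in-memory zip archive.

    tables: dict mapping a table name to a list of rows (first row = header).
    metadata: JSON-serializable dict. The file name "metadata.json" is a
    hypothetical choice for this sketch, not taken from the csvz spec.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, rows in tables.items():
            text = io.StringIO(newline="")
            csv.writer(text).writerows(rows)
            zf.writestr(f"{name}.csv", text.getvalue())
        zf.writestr("metadata.json", json.dumps(metadata, indent=2))
    return buf.getvalue()


def read_bundle(data):
    """Return (tables, metadata) from bytes produced by write_bundle."""
    tables = {}
    metadata = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for member in zf.namelist():
            raw = zf.read(member).decode("utf-8")
            if member.endswith(".csv"):
                reader = csv.reader(io.StringIO(raw, newline=""))
                tables[member[:-4]] = list(reader)
            elif member == "metadata.json":
                metadata = json.loads(raw)
    return tables, metadata


if __name__ == "__main__":
    tables = {"people": [["id", "name"], ["1", "Ada"], ["2", "Grace, Jr."]]}
    metadata = {"people": {"id": "integer", "name": "string"}}
    bundle = write_bundle(tables, metadata)
    print(read_bundle(bundle))
```

Keeping the column types in a side file means the CSV members stay readable by any ordinary CSV tool, while a reader that knows about the metadata can restore types after unzipping.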
parquet-wasm
- FLaNK AI Weekly for 29 April 2024
- Parquet-WASM: Rust-based WebAssembly bindings to read and write Parquet data
- Goodbye, Node.js Buffer
  nodejs-polars is Node-specific and uses native FFI. Polars can be compiled to Wasm but doesn't yet have a JS API out of the box.
  As for the fastest way to serialize Pandas data to the browser, you should use Parquet; it's the fastest to write on the Python side and read on the JS side, while also being compressed. See https://github.com/kylebarron/parquet-wasm (full disclosure, I wrote this)
- Rust 1.63.0
  I'm building WebAssembly bindings to existing Rust libraries [0] and lower-dependency geospatial tools [1]. Rust makes it very easy to bind Rust code to both WebAssembly and Python. And by avoiding some large C geospatial dependencies, we can get reliable performance in both Wasm and Python using the exact same codebase.
  [0]: https://github.com/kylebarron/parquet-wasm
  [1]: https://github.com/kylebarron/geopolars
- Why isn’t there a decent file format for tabular data?
- Recommendations when publishing a WASM library
  Looks to be a great resource. I've been working on a WASM implementation of reading and writing Apache Parquet [0], and as someone new to WASM, I've found it difficult to work out the best way of distributing the WASM so that it works on Node and through bundlers like Webpack.
  [0]: https://github.com/kylebarron/parquet-wasm
What are some alternatives?
odiff - The fastest pixel-by-pixel image visual difference tool in the world.
datasette-stripe - A web SQL interface to your Stripe account using Datasette.
cyanide - BSON documents in Elixir language
quickjs-emscripten - Safely execute untrusted Javascript in your Javascript, and execute synchronous code that uses async functions
hsv5 - HTML5 Based Alternative to CSV, TSV, JSONL, etc
transmitic - Encrypted, peer to peer, file transfer program :: https://discord.gg/tRT3J6T :: https://www.reddit.com/r/transmitic/ :: https://twitter.com/transmitic
simdjson - Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
geopolars - Geospatial extensions for Polars
ndjson.github.io - Info Website for NDJSON
ndjson-spec - Specification
rson - Rust Object Notation