Metric | parquet-wasm | odbc2parquet |
---|---|---|
Mentions | 6 | 5 |
Stars | 464 | 206 |
Growth | - | - |
Activity | 9.0 | 9.3 |
Last commit | 3 days ago | 1 day ago |
Language | Rust | Rust |
License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
parquet-wasm
- FLaNK AI Weekly for 29 April 2024
- Parquet-WASM: Rust-based WebAssembly bindings to read and write Parquet data
- Goodbye, Node.js Buffer
nodejs-polars is Node-specific and uses native FFI. Polars can be compiled to Wasm but doesn't yet have a JS API out of the box.
As for the fastest way to serialize Pandas data to the browser, you should use Parquet; it's the fastest to write on the Python side and read on the JS side, while also being compressed. See https://github.com/kylebarron/parquet-wasm (full disclosure, I wrote this)
- Rust 1.63.0
I'm building WebAssembly bindings to existing Rust libraries [0] and lower-dependency geospatial tools [1]. Rust makes it very easy to bind Rust code to both WebAssembly and Python. And by avoiding some large C geospatial dependencies we can get reliable performance in both Wasm and Python using the exact same codebase.
[0]: https://github.com/kylebarron/parquet-wasm
[1]: https://github.com/kylebarron/geopolars
- Why isn’t there a decent file format for tabular data?
- Recommendations when publishing a WASM library
Looks to be a great resource. I've been working on a WASM implementation of reading and writing Apache Parquet [0] and, being new to WASM, I've found it difficult to work out the best way of distributing the WASM so that it works on Node and through bundlers like Webpack.
[0]: https://github.com/kylebarron/parquet-wasm
odbc2parquet
- Postgres and Parquet in the Data Lake
- MySQL table data to direct parquet output
However, I found a GitHub project (odbc2parquet) that can export a table, or the output of a query, to Parquet.
- Parquet best practices
Is this a one-time task? Maybe check out ODBC2PARQUET https://github.com/pacman82/odbc2parquet
- Thoughts on Using Airbyte to read/write to S3?
I tried writing Parquet to S3 with Airbyte a few months ago and gave up. It was extremely slow for small tables and would not work at all for larger tables. I wound up using https://github.com/pacman82/odbc2parquet plus the AWS CLI.
- Extract data from ERP systems to Snowflake - Which tools (besides Airbyte)?
Yes, I have been tinkering around with odbc2parquet (https://github.com/pacman82/odbc2parquet) and storing it in a variant column. For dependency/workflow management, maybe Prefect.
What are some alternatives?
datasette-stripe - A web SQL interface to your Stripe account using Datasette.
sql-spark-connector - Apache Spark Connector for SQL Server and Azure SQL
quickjs-emscripten - Safely execute untrusted Javascript in your Javascript, and execute synchronous code that uses async functions
roapi - Create full-fledged APIs for slowly moving datasets without writing a single line of code.
transmitic - Encrypted, peer to peer, file transfer program :: https://discord.gg/tRT3J6T :: https://www.reddit.com/r/transmitic/ :: https://twitter.com/transmitic
sqlpad - Web-based SQL editor. Legacy project in maintenance mode.
geopolars - Geospatial extensions for Polars
geoparquet - Specification for storing geospatial vector data (point, line, polygon) in Parquet
odiff - The fastest pixel-by-pixel image visual difference tool in the world.
duckdb_fdw - DuckDB Foreign Data Wrapper for PostgreSQL
rson - Rust Object Notation
FreeSql - 🦄 .NET aot orm, C# orm, VB.NET orm, Mysql orm, Postgresql orm, SqlServer orm, Oracle orm, Sqlite orm, Firebird orm, 达梦 orm, 人大金仓 orm, 神通 orm, 翰高 orm, 南大通用 orm, 虚谷 orm, 国产 orm, Clickhouse orm, QuestDB orm, MsAccess orm.