Shameless plug: my project exposes the common Parquet operations via a Rust CLI tool built on the Rust API for Apache Arrow, and can be used without any Java/Hadoop/Spark dependencies. A static binary is also available.
https://github.com/manojkarthick/pqrs
We have a Relational API in addition to SQL! Here are some examples for the Python Relational API client:
https://github.com/duckdb/duckdb/blob/master/examples/python...
Plus, if you are working in Python, you can use DuckDB as the engine underneath Ibis, Fugue, Siuba, or anything that works with SQLAlchemy (using the DuckDB-engine driver)! In R, you can use dplyr or dbplyr.
DuckDB's own file format is one way to persist data (it uses a single file), but you can also write out to Parquet, or write out to Apache Arrow and then to Parquet (in a partitioned format, I believe).
Disclaimer - I write docs for DuckDB!
Hi @ritchie46 - I have just written [Raku Dan](https://github.com/p6steve/raku-Dan) as a way to scratch the "data analytics" itch in a new way. My next step is to write Dan::Polars as a Polars binding via (e.g.) Raku NativeCall. Can you point me to a good recipe for success? [email protected]