-
delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)
-
Apache Arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
I prefer Parquet (or Delta for larger datasets), and CSV for very small datasets or for ones that will later be used or edited in Excel or Google Sheets.
In fact, I have asked on the Apache Arrow GitHub how to read selected columns from a particular row group of a Parquet file: https://github.com/apache/arrow/issues/35688
The Parquet-Go library is very complex, and I have not yet succeeded in using it. So I asked whether DuckDB can provide an API for this: https://github.com/duckdb/duckdb/issues/7776