Top 6 C++ Arrow Projects
-
Apache Arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
Project mention: Show HN: Aiopandas – Async .apply() and .map() for Pandas, Faster API/LLMs Calls | news.ycombinator.com | 2025-03-15
https://github.com/apache/arrow/blob/main/python/pyarrow/tes...
pyarrow/src/arrow/python/async.h:
-
cudf
Project mention: Unleashing GPU Power: Supercharge Your Data Processing with cuDF | dev.to | 2024-06-21
cuDF Documentation
-
ustore
Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️
-
duckdb-airport-extension
The Airport extension for DuckDB enables the use of Arrow Flight with DuckDB.
That's one way of looking at it. To me this UI seems like both a useful tool and an advertisement.
There's another way this could have gone. DuckDB Labs might have published the extension as providing an official HTTP API for all to use. Then MotherDuck would simultaneously announce support for it in their UI, now with access to any and all databases, whether in-browser, anywhere through the official HTTP API, or in their managed cloud service.
I for one would like an HTTP API for some things that currently require writing my own in Python. I don't yet see much need for the UI. I'm not looking for a public, multiuser service, just something I can use locally that doesn't have to live inside a process (such as Python or a web browser). There's such an API in the extension now, but it's undocumented and in C++ [1]. There's also the option of using a third-party community extension that also provides an HTTP API [2]. Then there's one that supports remote access with Arrow Flight, but over gRPC only, it seems [3]. But an official, stable version would be nice.
[1] https://github.com/duckdb/duckdb-ui/blob/main/src/http_serve...
[2] https://duckdb.org/community_extensions/extensions/httpserve...
[3] https://github.com/Query-farm/duckdb-airport-extension
-
vinum
Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics.
-
pgeon
Project mention: Pgeon: Apache Arrow PostgreSQL connector in C++ | news.ycombinator.com | 2024-08-11
C++ Arrow related posts
-
Adding concurrent read/write to DuckDB with Arrow Flight
-
Unlocking DuckDB from Anywhere - A Guide to Remote Access with Apache Arrow and Flight RPC (gRPC)
-
Kotlin DataFrame ❤️ Arrow
-
Random access string compression with FSST and Rust
-
Unleashing GPU Power: Supercharge Your Data Processing with cuDF
-
The Simdjson Library
-
cuDF – GPU DataFrame Library
-
Index
What are some of the best open-source Arrow projects in C++? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | Apache Arrow | 15,388 |
| 2 | cudf | 8,917 |
| 3 | ustore | 573 |
| 4 | duckdb-airport-extension | 194 |
| 5 | vinum | 65 |
| 6 | pgeon | 59 |