Top 21 Dataframe Open-Source Projects
-
explorer
Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
-
rumble
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more (by RumbleDB)
-
dataframe_sql
A Python package that parses SQL and interprets it as methods that act upon existing pandas (or other types of) DataFrames that have been declared and registered
-
TableIO.jl
A glue package for reading and writing tabular data. It aims to provide a uniform api for reading and writing tabular data from and to multiple sources.
-
FloridaPropertyData
A Python-based tool for retrieving and processing property data for specific counties in Florida using Parcel ID numbers. Simplifies data retrieval and offers customization options for real estate agents, investors, and government officials.
This is because 0.1 is in actuality the floating-point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.
I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
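A minimal sketch of the difference described above, in plain Python rather than Polars. It assumes the computation under discussion is `1 // 0.1` (the comment does not spell it out), and the variable names are illustrative only. It compares Python's built-in float floor division with the "divide, then floor" approach the comment describes.

```python
import math
from decimal import Decimal

# Decimal(float) shows the exact binary value stored for the literal 0.1,
# which is slightly larger than one tenth.
exact_tenth = Decimal(0.1)

# Python's // on floats behaves as if it floors the exact quotient.
# That quotient is just below 10, so the result is 9.0.
python_floor_div = 1 // 0.1

# Dividing first rounds the quotient to the nearest double, which is 10.0;
# flooring afterwards keeps 10.
divide_then_floor = math.floor(1 / 0.1)

print(exact_tenth)
print(python_floor_div, divide_then_floor)
```

The two approaches should disagree only in cases like this one, where the rounded quotient lands exactly on an integer that the exact quotient does not quite reach.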
TileDB, Inc. | Full-Time | REMOTE | USA, Greece/EU | [https://tiledb.com](https://tiledb.com/)
TileDB has recently announced a $34 million Series B fund-raise and is actively hiring for engineers across a range of roles (SRE, backend/distributed systems, database internals, and more). You will have the opportunity to work on innovative technology that creates impact for challenging problems in genomics, geospatial, machine learning, distributed systems, and many other areas.
TileDB Cloud is the modern database, allowing developers and scientists to capture, analyze, and share any data with any tool. We build on a broad foundation of open source, maintaining the TileDB storage engine, libraries for genomics (single-cell and population), geospatial (raster, point clouds, and more), a TileDB visualization engine extending Babylon.js, and much more ([github.com/TileDB-Inc/TileDB](http://github.com/TileDB-Inc/TileDB)).
With TileDB, all data — tables, genomics, images, videos, location, time-series — is captured as multi-dimensional arrays. To supercharge this data, TileDB Cloud implements a serverless infrastructure delivering query execution, access control, data and code sharing, and distributed computing at global scale — eliminating cluster management, minimizing TCO, and promoting scientific collaboration and reproducibility.
Website: [https://tiledb.com](https://tiledb.com/) | GitHub: https://github.com/TileDB-Inc/TileDB | Blog: https://tiledb.com/blog
We are actively hiring for several roles including:
- Site Reliability Engineer (k8s, Terraform, automation, Prometheus, CloudWatch, GitOps; Golang, Python)
NumPy functionality is largely covered by https://www.gonum.org/, but for pandas I'm not sure there is an equivalent that is as widely accepted. You might try https://github.com/rocketlaunchr/dataframe-go; I have not used it, but it looks like it covers some of what you're looking for.
The Explorer library [0] in Elixir uses Polars underneath it.
[0] https://github.com/elixir-explorer/explorer
Project mention: Introducing seaborn-polars, a package allowing to use Polars DataFrames and LazyFrames with Seaborn | /r/Python | 2023-05-15
Yes, with the upcoming dataframe api protocol the implementation and API will be separated for libraries that adopt that protocol.
Julia is great: https://github.com/JuliaData or https://github.com/sl-solution/DLMReader.jl might be a good starting point.
Dataframes related posts
-
Why Python's Integer Division Floors (2010)
-
Polars 0.20 Released
-
Polars: Dataframes powered by a multithreaded query engine, written in Rust
-
Polars 0.34 is released. (A query engine focussing on DataFrame front ends)
-
Polars
-
If you could ask the creators of pandas for one additional feature, what would it be?
-
A note from our sponsor - SaaSHub
www.saashub.com | 5 May 2024
Index
What are some of the best open-source Dataframe projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | polars | 26,378 |
2 | pandera | 3,012 |
3 | TileDB | 1,764 |
4 | DataFrames.jl | 1,696 |
5 | dataframe-go | 1,112 |
6 | explorer | 977 |
7 | pdpipe | 715 |
8 | eland | 611 |
9 | DataFramesMeta.jl | 472 |
10 | datacompy | 386 |
11 | riptable | 346 |
12 | rumble | 207 |
13 | dataframe_sql | 96 |
14 | dataframe-api | 95 |
15 | red_amber | 61 |
16 | sql_to_ibis | 50 |
17 | DLMReader.jl | 26 |
18 | heidi | 25 |
19 | TableIO.jl | 13 |
20 | mainframe | 3 |
21 | FloridaPropertyData | 2 |