Top 23 Python CSV Projects
-
pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
-
Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27
I used to use q for this sort of thing. Not sure if there are better choices now, as it has been a few years.
https://harelba.github.io/q/
-
A couple of reasons which pop to mind:
- Searching a plain text data file is O(n). Searching a SQLite database that has been properly indexed, which is very easy to do nowadays with FTS5, is O(log n) worst case scenario and O(1) in the best case. This doesn't explain why SQLite over a dataframe or anything, but it definitely justifies it over plain text for large email collections.
- SQLite is really easy to write custom views and programs around. Virtually every major programming language can work with it without issue. See also: simonw's wonderful https://datasette.io/ .
- SQLite is an accepted archival format by the Library of Congress, if you ever want to go down the rabbit hole of digital preservation.
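The indexing point above can be sketched with the standard library alone. This is a minimal illustration, not the commenter's setup: it swaps FTS5 for an ordinary B-tree index, and the table name `mail`, the column names, and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a large plain-text export.
CSV_TEXT = """id,sender,subject
1,alice@example.com,Quarterly report
2,bob@example.com,Lunch on Friday
3,alice@example.com,Re: Quarterly report
"""


def load_csv_into_sqlite(csv_text: str) -> sqlite3.Connection:
    """Load CSV rows into an in-memory SQLite table and index one column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mail (id INTEGER PRIMARY KEY, sender TEXT, subject TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO mail (id, sender, subject) VALUES (?, ?, ?)",
        [(int(r["id"]), r["sender"], r["subject"]) for r in reader],
    )
    # An ordinary B-tree index: lookups by sender no longer scan every row.
    conn.execute("CREATE INDEX idx_mail_sender ON mail (sender)")
    conn.commit()
    return conn


def subjects_from(conn: sqlite3.Connection, sender: str) -> list:
    """Return the subjects of all messages from one sender, ordered by id."""
    cur = conn.execute(
        "SELECT subject FROM mail WHERE sender = ? ORDER BY id", (sender,)
    )
    return [row[0] for row in cur]


conn = load_csv_into_sqlite(CSV_TEXT)
alice_subjects = subjects_from(conn, "alice@example.com")
print(alice_subjects)
```

For full-text search over message bodies, an FTS5 virtual table would replace the plain index, provided the local SQLite build includes FTS5.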
-
csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27
-
django-import-export
Django application and library for importing and exporting data with admin integration.
This is where the django-import-export library comes in handy. It provides an easy way to import and export data in various formats, such as CSV, xlsx and more.
-
datamodel-code-generator
Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
-
ethereum-etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
-
pygraphistry
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Nice!
It's interesting from the perspective of maintenance too. You can bet most constants like warp sizes will change, so you get into things like having profiles, autotuners, or not sweating the small stuff.
We went more extreme, and nowadays focus on several layers up: By accepting the (high!) constant overheads of tools like RAPIDS cuDF, we get in exchange the ability to easily crank code with good saturation on the newest GPUs and that any data scientist can edit and extend. Likewise, they just need to understand basics like data movement and columnar analytics data reps to make GPU pipelines. We have ~1 CUDA kernel left and many years of higher-level.
As an example, this is one of the core methods of our new graph query language (think cypher on pandas/spark), and it gets Graph500 level performance on cheapo GPUs just by being data parallel with high saturation per step: https://github.com/graphistry/pygraphistry/blob/master/graph... . Despite ping-ponging a ton because cudf doesn't (yet) coalesce GPU kernel calls, it still places well, and is easy to maintain & extend.
-
Project mention: Show HN: Scraper for job listings directly from company websites | news.ycombinator.com | 2024-12-07
jobfunnel is FOSS and accepting contributions: https://github.com/PaulMcInnis/JobFunnel
Currently supports indeed, in the past supported glassdoor and others.
-
python-benedict
dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.
-
extract_otp_secrets
Extract one time password (OTP) secrets from QR codes exported by two-factor authentication (2FA) apps such as "Google Authenticator". The exported QR codes from authentication apps can be captured by camera, read from images, or read from text files. The secrets can be exported to JSON or CSV, or printed as QR codes to console.
-
CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
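For context, dialect detection in the builtin `csv` module looks roughly like the sketch below. It uses only the standard library's `csv.Sniffer`, not CleverCSV itself; the sample text and the helper name `read_with_detected_dialect` are hypothetical.

```python
import csv
import io

# Hypothetical sample: a semicolon-delimited file with a header row.
SAMPLE = "name;age;city\nalice;30;Paris\nbob;25;Oslo\n"


def read_with_detected_dialect(text: str, candidates: str = ";,\t|"):
    """Guess the delimiter with csv.Sniffer, then parse the text with it."""
    # Restricting the candidate delimiters makes the guess more predictable.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    reader = csv.reader(io.StringIO(text), dialect)
    return dialect.delimiter, list(reader)


delimiter, rows = read_with_detected_dialect(SAMPLE)
print(delimiter)
print(rows)
```

On messy real-world files the builtin sniffer can guess wrong or raise `csv.Error`, which is the gap a package with improved dialect detection aims to close.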
-
rainbow_csv
🌈 Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language
-
pytablewriter
pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
Python CSV related posts
-
A Tool I Built for Synthetic Datasets
-
XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal
-
Show HN: Fuzzy deduplicate any CSV using vector embeddings
-
Developing a CKAN Handler for MindsDB: Bridging Open Data and Machine Learning
-
Export data from Django Admin to CSV
-
Show HN: Django-import-export v4 is out
-
Plotille: Plot in the terminal using Braille dots
-
A note from our sponsor - SaaSHub
www.saashub.com | 17 May 2025
Index
What are some of the best open-source CSV projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | pandas-ai | 20,042 |
2 | q | 10,282 |
3 | datasette | 10,020 |
4 | visidata | 8,217 |
5 | csvkit | 6,177 |
6 | django-import-export | 3,195 |
7 | datamodel-code-generator | 3,183 |
8 | ethereum-etl | 3,017 |
9 | pygraphistry | 2,257 |
10 | JobFunnel | 2,010 |
11 | python-benedict | 1,567 |
12 | DataProfiler | 1,486 |
13 | extract_otp_secrets | 1,342 |
14 | CleverCSV | 1,293 |
15 | pyexcel | 1,242 |
16 | municipios-brasileiros | 1,134 |
17 | finviz | 1,116 |
18 | csvs-to-sqlite | 895 |
19 | rows | 876 |
20 | URS | 875 |
21 | rainbow_csv | 667 |
22 | pytablewriter | 626 |
23 | test-lists | 480 |