Python CSV

Open-source Python projects categorized as CSV

Top 23 Python CSV Projects

  1. pandas-ai

    Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

    Project mention: Pandas AI | news.ycombinator.com | 2025-07-18
  3. q

    q - Run SQL directly on delimited files and multi-file sqlite databases (by harelba)

    Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27

    I used to use q for this sort of thing. Not sure if there are better choices now, as it has been a few years.

    https://harelba.github.io/q/
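    The idea of running SQL over delimited data can be sketched with the Python standard library alone. This is an illustration of the general technique, not q's own code or command-line interface: the helper name `query_csv`, the table name `data`, and the sample rows (star counts taken from the index on this page) are assumptions made for the example.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table="data"):
    """Load CSV text into an in-memory SQLite table and run one SQL query.

    Hypothetical helper for illustration; the first CSV row is treated
    as the header and every value is stored as text.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', body)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


SAMPLE = "name,stars\nq,10312\ncsvkit,6242\nrows,880\n"
result = query_csv(
    SAMPLE,
    "SELECT name FROM data WHERE CAST(stars AS INTEGER) > 1000 ORDER BY name",
)
print(result)
```

    Because values are loaded as text, numeric comparisons need an explicit CAST, as in the query above.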

  4. datasette

    An open source multi-tool for exploring and publishing data

    Project mention: The current state of LLM-driven development | news.ycombinator.com | 2025-08-10

    I've been using LLM-assistance for my larger open source projects - https://github.com/simonw/datasette https://github.com/simonw/llm and https://github.com/simonw/sqlite-utils - for a couple of years now.

    Also literally hundreds of smaller plugins and libraries and CLI tools, see https://github.com/simonw?tab=repositories (now at 880 repos) and https://pypi.org/user/simonw/ (340 published packages).

    Unlike my tools.simonwillison.net stuff the vast majority of those products are covered by automated tests and usually have comprehensive documentation too.

  5. visidata

    A terminal spreadsheet multitool for discovering and arranging data

  6. csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

    Project mention: Sqawk: A fusion of SQL and Awk: Applying SQL to text-based data files | news.ycombinator.com | 2025-05-26

    I wonder how this compares to csvkit [1].

    [1]: https://csvkit.readthedocs.io/

  7. datamodel-code-generator

    Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.

  8. django-import-export

    Django application and library for importing and exporting data with admin integration.

  10. ethereum-etl

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

  11. pygraphistry

    PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

    Project mention: Initial CUDA Performance Lessons | news.ycombinator.com | 2024-10-11

    Nice!

    It's interesting from the perspective of maintenance too. You can bet most constants like warp sizes will change, so you get into things like having profiles, autotuners, or not sweating the small stuff.

    We went more extreme, and nowadays focus on several layers up: By accepting the (high!) constant overheads of tools like RAPIDS cuDF, we get in exchange the ability to easily crank code with good saturation on the newest GPUs and that any data scientist can edit and extend. Likewise, they just need to understand basics like data movement and columnar analytics data reps to make GPU pipelines. We have ~1 CUDA kernel left and many years of higher-level.

    As an example, this is one of the core methods of our new graph query language (think cypher on pandas/spark), and it gets Graph500 level performance on cheapo GPUs just by being data parallel with high saturation per step: https://github.com/graphistry/pygraphistry/blob/master/graph... . Despite ping-ponging a ton because cudf doesn't (yet) coalesce GPU kernel calls, it still places well, and is easy to maintain & extend.

  12. JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    Project mention: Show HN: Scraper for job listings directly from company websites | news.ycombinator.com | 2024-12-07

    jobfunnel is FOSS and accepting contributions: https://github.com/PaulMcInnis/JobFunnel

    Currently supports indeed, in the past supported glassdoor and others.

  13. python-benedict

    dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

    Project mention: Supercharge Your Python Dictionaries with python-benedict! | dev.to | 2025-04-18

    View the Project on GitHub
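
    To make "keypath support" concrete, here is a minimal standard-library sketch of the idea: looking up a nested value with a single separator-delimited string. It is not python-benedict's implementation; the helper name `get_by_keypath`, its parameters, and the sample dictionary are hypothetical.

```python
def get_by_keypath(data, keypath, separator=".", default=None):
    """Walk nested dicts along a separator-delimited key path.

    Hypothetical helper for illustration; returns `default` as soon as
    a key is missing or a non-dict value is reached too early.
    """
    current = data
    for key in keypath.split(separator):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


profile = {"project": {"name": "python-benedict", "formats": {"csv": True}}}
name = get_by_keypath(profile, "project.name")
has_csv = get_by_keypath(profile, "project.formats.csv")
missing = get_by_keypath(profile, "project.license", default="unknown")
print(name, has_csv, missing)
```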

  14. DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

  15. extract_otp_secrets

    Extract one time password (OTP) secrets from QR codes exported by two-factor authentication (2FA) apps such as "Google Authenticator". The exported QR codes from authentication apps can be captured by camera, read from images, or read from text files. The secrets can be exported to JSON or CSV, or printed as QR codes to console.

    Project mention: De-Googling TOTP Authenticator Codes | news.ycombinator.com | 2025-09-01

    - that opened a new need for "safe TOTP replication with offline access", and that's how I ended-up running my own vaultwarden instance and using the bitwarden clients across devices.

    I'm glad I did, and I can't recommend it more. IIRC, this¹ helped tremendously along the way.

    ¹: https://github.com/scito/extract_otp_secrets

  16. CleverCSV

    CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
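
    For context, the built-in csv module already offers basic dialect detection through `csv.Sniffer`, which is the baseline the description above refers to. The sketch below uses only the standard library, not CleverCSV's API; the helper name `read_with_detected_dialect`, the candidate delimiter set, and the sample data (star counts from the index on this page) are assumptions for the example.

```python
import csv
import io


def read_with_detected_dialect(text, candidates=",;\t|"):
    """Detect the dialect with csv.Sniffer, then parse the text with it.

    Hypothetical helper for illustration; `candidates` limits which
    delimiters the sniffer may choose from.
    """
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


SAMPLE = "name;stars\nCleverCSV;1305\npyexcel;1256\n"
delimiter, rows = read_with_detected_dialect(SAMPLE)
print(delimiter, rows)
```

    `csv.Sniffer.sniff` raises `csv.Error` when it cannot determine a delimiter, so callers working with messy files may want to catch that and fall back to a default.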

  17. pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  18. municipios-brasileiros

    Data related to Brazilian municipalities

  19. finviz

    Unofficial API for finviz.com

  20. csvs-to-sqlite

    Convert CSV files into a SQLite database

  21. URS

    Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

  22. rows

    A common, beautiful interface to tabular data, no matter the format

  23. rainbow_csv

    🌈Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language

  24. pytablewriter

    pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.

  25. test-lists

    URL testing lists intended for discovering website censorship

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python CSV related posts

  • Sqawk: A fusion of SQL and Awk: Applying SQL to text-based data files

    2 projects | news.ycombinator.com | 26 May 2025
  • A Tool I Built for Synthetic Datasets

    1 project | dev.to | 31 Mar 2025
  • XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal

    6 projects | news.ycombinator.com | 27 Mar 2025
  • Show HN: Fuzzy deduplicate any CSV using vector embeddings

    2 projects | news.ycombinator.com | 4 Nov 2024
  • Developing a CKAN Handler for MindsDB: Bridging Open Data and Machine Learning

    4 projects | dev.to | 16 Oct 2024
  • Export data from Django Admin to CSV

    1 project | dev.to | 1 Jul 2024
  • Show HN: Django-import-export v4 is out

    1 project | news.ycombinator.com | 17 May 2024

Index

What are some of the best open-source CSV projects in Python? This list will help you:

 #  Project                    Stars
 1  pandas-ai                 21,924
 2  q                         10,312
 3  datasette                 10,296
 4  visidata                   8,429
 5  csvkit                     6,242
 6  datamodel-code-generator   3,419
 7  django-import-export       3,242
 8  ethereum-etl               3,069
 9  pygraphistry               2,323
10  JobFunnel                  2,063
11  python-benedict            1,577
12  DataProfiler               1,511
13  extract_otp_secrets        1,428
14  CleverCSV                  1,305
15  pyexcel                    1,256
16  municipios-brasileiros     1,141
17  finviz                     1,124
18  csvs-to-sqlite               912
19  URS                          909
20  rows                         880
21  rainbow_csv                  672
22  pytablewriter                633
23  test-lists                   489
