Python CSV

Open-source Python projects categorized as CSV

Top 23 Python CSV Projects

  1. pandas-ai

    Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

    Project mention: PandaAI: Talk to Your Data, Not to Your Code! | dev.to | 2025-05-06

  3. q

    q - Run SQL directly on delimited files and multi-file sqlite databases (by harelba)

    Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27

    I used to use q for this sort of thing. Not sure if there are better choices now, as it has been a few years.

    https://harelba.github.io/q/

  4. datasette

    An open source multi-tool for exploring and publishing data

    Project mention: Gmail to SQLite | news.ycombinator.com | 2025-05-09

    A couple of reasons which pop to mind:

    - Searching a plain text data file is O(n). Searching a SQLite database that has been properly indexed, which is very easy to do nowadays with FTS5, is O(log n) worst case scenario and O(1) in the best case. This doesn't explain why SQLite over a dataframe or anything, but it definitely justifies it over plain text for large email collections.

    - SQLite is really easy to write custom views and programs around. Virtually every major programming language can work with it without issue. See also: simonw's wonderful https://datasette.io/ .

    - SQLite is an accepted archival format by the Library of Congress, if you ever want to go down the rabbit hole of digital preservation.
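
The indexed-search point in the comment above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. It assumes the local SQLite build has the FTS5 extension enabled (common, but not guaranteed). The table name `mail`, its columns, and the sample messages are hypothetical and do not come from the linked project.

```python
import sqlite3


def build_index(messages):
    """Load (subject, body) pairs into an in-memory FTS5 table.

    Assumes SQLite was compiled with FTS5; the CREATE statement
    raises sqlite3.OperationalError otherwise.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE mail USING fts5(subject, body)")
    conn.executemany("INSERT INTO mail (subject, body) VALUES (?, ?)", messages)
    conn.commit()
    return conn


def search(conn, query):
    """Return subjects of messages matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT subject FROM mail WHERE mail MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


# Hypothetical sample data, for illustration only.
SAMPLE_MESSAGES = [
    ("Invoice for March", "Please find the invoice attached."),
    ("Lunch on Friday?", "Are you free at noon?"),
    ("Re: travel plans", "The hotel sent an invoice yesterday."),
]

if __name__ == "__main__":
    connection = build_index(SAMPLE_MESSAGES)
    print(search(connection, "invoice"))
```

The MATCH query consults the full-text index rather than scanning every row, which is the advantage the commenter describes over searching a plain text file.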

  5. visidata

    A terminal spreadsheet multitool for discovering and arranging data

  6. csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

    Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27
  7. django-import-export

    Django application and library for importing and exporting data with admin integration.

    Project mention: Export data from Django Admin to CSV | dev.to | 2024-07-01

    This is where the django-import-export library comes in handy. It provides an easy way to import and export data in various formats, such as CSV, xlsx and more.

  8. datamodel-code-generator

    Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.

  10. ethereum-etl

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

  11. pygraphistry

    PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

    Project mention: Initial CUDA Performance Lessons | news.ycombinator.com | 2024-10-11

    Nice!

    It's interesting from the perspective of maintenance too. You can bet most constants like warp sizes will change, so you get into things like having profiles, autotuners, or not sweating the small stuff.

    We went more extreme, and nowadays focus on several layers up: By accepting the (high!) constant overheads of tools like RAPIDS cuDF, we get in exchange the ability to easily crank code with good saturation on the newest GPUs and that any data scientist can edit and extend. Likewise, they just need to understand basics like data movement and columnar analytics data reps to make GPU pipelines. We have ~1 CUDA kernel left and many years of higher-level.

    As an example, this is one of the core methods of our new graph query language (think cypher on pandas/spark), and it gets Graph500 level performance on cheapo GPUs just by being data parallel with high saturation per step: https://github.com/graphistry/pygraphistry/blob/master/graph... . Despite ping-ponging a ton because cudf doesn't (yet) coalesce GPU kernel calls, it still places well, and is easy to maintain & extend.

  12. JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    Project mention: Show HN: Scraper for job listings directly from company websites | news.ycombinator.com | 2024-12-07

    jobfunnel is FOSS and accepting contributions: https://github.com/PaulMcInnis/JobFunnel

    Currently supports Indeed; in the past supported Glassdoor and others.

  13. python-benedict

    dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

    Project mention: Supercharge Your Python Dictionaries with python-benedict! | dev.to | 2025-04-18

  14. DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

  15. extract_otp_secrets

    Extract one time password (OTP) secrets from QR codes exported by two-factor authentication (2FA) apps such as "Google Authenticator". The exported QR codes from authentication apps can be captured by camera, read from images, or read from text files. The secrets can be exported to JSON or CSV, or printed as QR codes to console.

  16. CleverCSV

    CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
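
For context on what dialect detection means, here is a minimal sketch using the standard library's `csv.Sniffer`, the built-in mechanism that the description above says CleverCSV improves on. It does not use CleverCSV's own API. The function name, candidate delimiter set, and sample text are assumptions made for illustration.

```python
import csv
import io


def read_with_sniffed_dialect(text, candidates=",;\t|"):
    """Guess the delimiter of CSV text with csv.Sniffer, then parse it.

    Returns (delimiter, rows). The candidate delimiters are an
    assumption; the sniffer can raise csv.Error on ambiguous input.
    """
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


# Hypothetical semicolon-separated sample.
SAMPLE = "name;city;age\nAda;London;36\nLinus;Helsinki;54\n"

if __name__ == "__main__":
    delimiter, rows = read_with_sniffed_dialect(SAMPLE)
    print(repr(delimiter), rows)
```

The built-in sniffer relies on simple frequency heuristics and can misjudge messy files, which is the gap a package like CleverCSV aims to address.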

  17. pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  18. municipios-brasileiros

    Data related to Brazilian municipalities

  19. finviz

    Unofficial API for finviz.com

  20. csvs-to-sqlite

    Convert CSV files into a SQLite database

  21. rows

    A common, beautiful interface to tabular data, no matter the format

  22. URS

    Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

  23. rainbow_csv

    🌈Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language

  24. pytablewriter

    pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.

  25. test-lists

    URL testing lists intended for discovering website censorship

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python CSV related posts

  • A Tool I Built for Synthetic Datasets

    1 project | dev.to | 31 Mar 2025
  • XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal

    6 projects | news.ycombinator.com | 27 Mar 2025
  • Show HN: Fuzzy deduplicate any CSV using vector embeddings

    2 projects | news.ycombinator.com | 4 Nov 2024
  • Developing a CKAN Handler for MindsDB: Bridging Open Data and Machine Learning

    4 projects | dev.to | 16 Oct 2024
  • Export data from Django Admin to CSV

    1 project | dev.to | 1 Jul 2024
  • Show HN: Django-import-export v4 is out

    1 project | news.ycombinator.com | 17 May 2024
  • Plotille: Plot in the terminal using Braille dots

    3 projects | news.ycombinator.com | 4 May 2024

Index

What are some of the best open-source CSV projects in Python? This list will help you:

#   Project                    Stars
1   pandas-ai                  20,042
2   q                          10,282
3   datasette                  10,020
4   visidata                    8,217
5   csvkit                      6,177
6   django-import-export        3,195
7   datamodel-code-generator    3,183
8   ethereum-etl                3,017
9   pygraphistry                2,257
10  JobFunnel                   2,010
11  python-benedict             1,567
12  DataProfiler                1,486
13  extract_otp_secrets         1,342
14  CleverCSV                   1,293
15  pyexcel                     1,242
16  municipios-brasileiros      1,134
17  finviz                      1,116
18  csvs-to-sqlite                895
19  rows                          876
20  URS                           875
21  rainbow_csv                   667
22  pytablewriter                 626
23  test-lists                    480
