Python CSV

Open-source Python projects categorized as CSV

Top 23 Python CSV Projects

  • pandas-ai

    Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

    Project mention: Using RAG to Build Your IDE Agents | dev.to | 2024-06-18

    In this blog, we will build a powerful IDE agent for PandasAI using Dash Agent. Then later on, we'll understand how using RAG can significantly improve LLM responses.

  • q

    q - Run SQL directly on delimited files and multi-file sqlite databases (by harelba)

  • datasette

    An open source multi-tool for exploring and publishing data

    Project mention: Show HN: SQLite Transaction Benchmarking Tool | news.ycombinator.com | 2024-07-17

    I wrote an async wrapper around SQLite in Python - I'm using a thread pool: https://github.com/simonw/datasette/blob/main/datasette/data...

    I have multiple threads for reads and a single dedicated thread for writes, which I send operations to via a queue. That way I avoid ever having two writes against the same connection at the same time.
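
    The single-writer pattern described in this comment can be sketched with the standard library alone. This is a minimal illustration, not datasette's actual code: the `SingleWriter` class, its method names, and the `items` table are hypothetical, and a temporary file stands in for a real database.

```python
import os
import queue
import sqlite3
import tempfile
import threading
from concurrent.futures import Future


class SingleWriter:
    """Hypothetical helper: funnels every write through one dedicated thread."""

    def __init__(self, path):
        self._path = path
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The write connection is created and used only on this thread,
        # so two writes can never run against it at the same time.
        conn = sqlite3.connect(self._path)
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel: shut down
                break
            sql, params, future = item
            try:
                with conn:  # commit on success, roll back on error
                    cursor = conn.execute(sql, params)
                future.set_result(cursor.rowcount)
            except Exception as exc:
                future.set_exception(exc)
        conn.close()

    def execute(self, sql, params=()):
        """Queue a write and return a Future resolving to the row count."""
        future = Future()
        self._tasks.put((sql, params, future))
        return future

    def close(self):
        self._tasks.put(None)
        self._thread.join()


# Usage: writes go through the queue, reads use their own connection.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = SingleWriter(db_path)
writer.execute("CREATE TABLE items (name TEXT)").result()
inserted = writer.execute("INSERT INTO items VALUES (?)", ("a",)).result()
writer.close()

reader = sqlite3.connect(db_path)
rows = reader.execute("SELECT name FROM items").fetchall()
reader.close()
```

    Callers block on `.result()` only when they need the outcome. Reads can run on separate connections (for example from a thread pool) because they never share the write connection.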

  • visidata

    A terminal spreadsheet multitool for discovering and arranging data

    Project mention: Data Science at the Command Line, 2nd Edition (2021) | news.ycombinator.com | 2024-05-06

    I'd like to call out one of my favorite pieces of software from the past 10 years: VisiData [1] has completely changed the way I do ad-hoc data processing, and is now my go-to for pretty much all use cases that I previously used spreadsheets for, and about half of those I previously used databases for.

    It's a TUI application, not strictly CLI, but scriptable, and I figure anyone building pipelines using tools like jq, q, awk, grep, etc. to process tabular data will find it extremely useful.

    ----

    [1]: https://visidata.org

  • csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

  • django-import-export

    Django application and library for importing and exporting data with admin integration.

    Project mention: Export data from Django Admin to CSV | dev.to | 2024-07-01

    This is where the django-import-export library comes in handy. It provides an easy way to import and export data in various formats, such as CSV, xlsx and more.

  • ethereum-etl

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

    Project mention: Blockchain transactions decoding: making wallet activity understandable | dev.to | 2023-10-27

    Event is a log entity which EVM smart contracts can emit during transaction execution. Events are very good at signalling that some action has taken place on-chain. Applications can subscribe and listen to events to trigger some off-chain logic, or they can index, transform and store events in some off-chain storage (look at The Graph protocol or Ethereum ETL).

  • datamodel-code-generator

    Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.

    Project mention: Datamodel-code-generator: Pydantic model/dataclass from OpenAPI, JSON, YAML | news.ycombinator.com | 2023-11-16
  • pygraphistry

    PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

    Project mention: Graph Data Fits in Memory | news.ycombinator.com | 2024-04-15

    Extra fun: We find most enterprise/gov graph analytics work only requires 1-2 attributes to go along with the graph index, and those attributes often are already numeric (time, $, ...) or can be dictionary-encoded as discussed here (categorical, ID, ...)... so even 'tough' billion scale graphs are fine on 1 gpu.

    Early, but that's been the basic thinking behind our new GFQL system: slice into the columns you want, and then do all the in-GPU traversals you want. In our V1, we keep things dataframe-native, including the in-GPU data representation, and are already working on the first extensions to support switching to more graph-native indexing for steps as needed.

    Ex: https://github.com/graphistry/pygraphistry/blob/master/demos...

  • JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

  • python-benedict

    dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

  • DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

    Project mention: LongRoPE: Extending LLM Context Window Beyond 2M Tokens | news.ycombinator.com | 2024-02-22

    It's been possible to skip tokenization for a long time, my team and I did it here - https://github.com/capitalone/DataProfiler

    For what it's worth, we were actually working with LSTMs with nearly a billion params back around 2016-2017. Transformers made it far more effective to train and execute, but ultimately LSTMs are able to achieve similar results, though they are slower and require more training data.

  • CleverCSV

    CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

  • pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

    Project mention: Advice on ETL and Data Sharing work process | /r/ETL | 2023-11-07

    You could try and write some simple python using the pyexcel and pandas libraries. I created a tool as a consultant with these packages that parsed spreadsheets with data from factories from all around the world. They did not lock down the Excel files used to submit data and it made it so much harder. If you go this route, I would recommend starting by putting your data into a SQLite database. Once you have your data in a database, you unlock the power of SQL for pulling reports. Also, you can port the data into a proper database if you ever need to. ChatGPT can probably get you a good chunk of the way there.
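
    The suggestion to load tabular data into SQLite and then report with SQL can be sketched as follows. This is a minimal illustration that swaps in the standard-library `csv` and `sqlite3` modules for pyexcel and pandas; the sample data, the `production` table, and its column names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a CSV export of a submitted spreadsheet.
raw = io.StringIO("factory,units\nLyon,120\nOsaka,80\nLyon,30\n")

# Parse the rows and convert the numeric column explicitly.
rows = [(record["factory"], int(record["units"])) for record in csv.DictReader(raw)]

# An in-memory database keeps the sketch self-contained; a file path works the same way.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE production (factory TEXT, units INTEGER)")
conn.executemany("INSERT INTO production (factory, units) VALUES (?, ?)", rows)
conn.commit()

# Once the data is in SQLite, a report is a single query.
totals = conn.execute(
    "SELECT factory, SUM(units) FROM production GROUP BY factory ORDER BY factory"
).fetchall()
conn.close()
```

    Here `totals` holds one `(factory, total_units)` tuple per factory. The same schema and queries carry over largely unchanged if the data later moves to a larger database.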

  • extract_otp_secrets

    Extract one time password (OTP) secrets from QR codes exported by two-factor authentication (2FA) apps such as "Google Authenticator". The exported QR codes from authentication apps can be captured by camera, read from images, or read from text files. The secrets can be exported to JSON or CSV, or printed as QR codes to console.

    Project mention: Show HN: AuthWin – Authenticator App for Windows | news.ycombinator.com | 2024-03-03

    This library uses the GPL v3 license: https://github.com/scito/extract_otp_secrets?tab=GPL-3.0-1-o...

    Your options are to either go open-source or remove the library.

  • municipios-brasileiros

    Data related to Brazilian municipalities

  • finviz

    Unofficial API for finviz.com

  • csvs-to-sqlite

    Convert CSV files into a SQLite database

  • rows

    A common, beautiful interface to tabular data, no matter the format

  • URS

    Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

    Project mention: Nitter Shutting Down | news.ycombinator.com | 2024-01-27

    If they don't want you to use their API just respect their wishes and scrape Reddit. https://github.com/JosephLai241/URS it's the only moral thing we can do.

  • rainbow_csv

    🌈Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language

  • pytablewriter

    pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.

  • sterraxcyl

    Instagram OSINT tool to export and analyse followers | following with their details

    Project mention: Tool to see mutual followers of several Instagram pages? | /r/OSINT | 2023-11-17
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python CSV related posts

  • Export data from Django Admin to CSV

    1 project | dev.to | 1 Jul 2024
  • Show HN: Django-import-export v4 is out

    1 project | news.ycombinator.com | 17 May 2024
  • Plotille: Plot in the terminal using Braille dots

    3 projects | news.ycombinator.com | 4 May 2024
  • Friends don't let friends export to CSV

    7 projects | news.ycombinator.com | 25 Mar 2024
  • And I thought amazing fics suddenly being deleted was a myth

    1 project | /r/AO3 | 18 Nov 2023
  • Advice on ETL and Data Sharing work process

    1 project | /r/ETL | 7 Nov 2023
  • CSV2Notion Neo - Upload & Merge CSV Data with Images to Notion Database.

    1 project | /r/Airtable | 26 Oct 2023

Index

What are some of the best open-source CSV projects in Python? This list will help you:

Rank  Project                  Stars
1     pandas-ai                12,553
2     q                        10,183
3     datasette                9,389
4     visidata                 7,790
5     csvkit                   5,956
6     django-import-export     2,999
7     ethereum-etl             2,920
8     datamodel-code-generator 2,616
9     pygraphistry             2,119
10    JobFunnel                1,770
11    python-benedict          1,485
12    DataProfiler             1,414
13    CleverCSV                1,249
14    pyexcel                  1,198
15    extract_otp_secrets      1,113
16    municipios-brasileiros   1,088
17    finviz                   1,052
18    csvs-to-sqlite           872
19    rows                     865
20    URS                      780
21    rainbow_csv              614
22    pytablewriter            606
23    sterraxcyl               532


Did you know that Python is the most popular programming language based on number of mentions?