Top 23 Python CSV Projects
-
Project mention: I wrote this iCalendar (.ics) command-line utility to turn common calendar exports into more broadly compatible CSV files. | /r/commandline | 2023-03-24
CSV utilities (still haven't picked a favorite one...): https://github.com/harelba/q https://github.com/BurntSushi/xsv https://github.com/wireservice/csvkit https://github.com/johnkerl/miller
-
Project mention: Little Data: How do we query personal data? (2013) | news.ycombinator.com | 2024-03-01
I'm a fan of simonw's datasette/dogsheep ecosystem https://datasette.io/
-
[4] "Is it possible to "flatten" structured data (like JSON?)": https://github.com/saulpw/visidata/discussions/1605
-
csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
-
django-import-export
Django application and library for importing and exporting data with admin integration.
django-import-export provides a sophisticated framework for importing data. Good if you need to do this on a regular basis and need to do some work on the data before writing to the database.
-
ethereum-etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Project mention: Blockchain transactions decoding: making wallet activity understandable | dev.to | 2023-10-27
An event is a log entity that EVM smart contracts can emit during transaction execution. Events are very good at signalling that some action has taken place on-chain. Applications can subscribe and listen to events to trigger some off-chain logic, or they can index, transform and store events in some off-chain storage (look at The Graph protocol or Ethereum ETL).
-
datamodel-code-generator
Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
Project mention: Datamodel-code-generator: Pydantic model/dataclass from OpenAPI, JSON, YAML | news.ycombinator.com | 2023-11-16
-
pygraphistry
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Project mention: The "missing" graph datatype already exists. It was invented in the '70s | news.ycombinator.com | 2024-03-05
If you enjoy this kind of thinking, we recently released GFQL for dataframe-native graph querying & compute
Imagine Neo4j Cypher, except no need for a database -- just import it -- and automatically vectorizes for significantly faster CPU+GPU performance. This is fundamentally similar to the kinds of implementations a datalog approach enables. (And indeed one of the alternative interfaces we were considering!)
We've run it on 100M+ edge graphs on some of the cheapest GPUs you can get, and are getting ready for the next rev with aggregate compute: https://github.com/graphistry/pygraphistry/blob/master/demos...
-
-
python-benedict
dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.
-
Project mention: LongRoPE: Extending LLM Context Window Beyond 2M Tokens | news.ycombinator.com | 2024-02-22
It's been possible to skip tokenization for a long time, my team and I did it here - https://github.com/capitalone/DataProfiler
For what it's worth, we were actually working with LSTMs with nearly a billion params back around 2016-2017. Transformers made it far more effective to train and execute, but ultimately LSTMs are able to achieve similar results, though they are slower and require more training data.
-
CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
There’s things like this, but I consider the existence of messy, non standard CSV files (backed by a decade of experience dealing with the problem) a strong reason to not use the format ever.
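For context on what "dialect detection" means here, the following is a minimal sketch using only the builtin `csv` module's `csv.Sniffer`, which is the baseline that CleverCSV says it improves on. It does not use CleverCSV itself. The sample text, the candidate delimiter set, and the helper name `read_with_detected_dialect` are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample: a semicolon-delimited file, as often seen in exports.
SAMPLE = "name;age;city\nAlice;30;Paris\nBob;25;Berlin\n"


def read_with_detected_dialect(text, candidates=";,\t|"):
    """Guess the delimiter with the stdlib Sniffer, then parse the text.

    `candidates` restricts the guess to a known set of delimiters, which
    makes the stdlib heuristic more predictable on small samples.
    """
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


delimiter, rows = read_with_detected_dialect(SAMPLE)
print(delimiter)  # the guessed delimiter
print(rows[1])    # first data row, split on that delimiter
```

On genuinely messy files the stdlib heuristic may guess wrong or raise `csv.Error`, which is the kind of case a dedicated detection package is meant to handle.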
-
You could try and write some simple python using the pyexcel and pandas libraries. I created a tool as a consultant with these packages that parsed spreadsheets with data from factories from all around the world. They did not lock down the Excel files used to submit data and it made it so much harder. If you go this route, I would recommend starting by putting your data into a SQLite database. Once you have your data in a database, you unlock the power of SQL for pulling reports. Also, you can port the data into a proper database if you ever need to. ChatGPT can probably get you a good chunk of the way there.
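The "put your data into a SQLite database, then use SQL for reports" step described above might look roughly like the sketch below. It uses only the standard library (`csv` and `sqlite3`) rather than pyexcel or pandas, and the sample data, the `production` table, its column names, and both function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one factory spreadsheet saved as CSV.
SAMPLE_CSV = """factory,month,units
Lyon,2024-01,120
Lyon,2024-02,135
Osaka,2024-01,90
"""


def load_csv_into_sqlite(csv_text, conn):
    """Insert the CSV rows into a `production` table; return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["factory"], r["month"], int(r["units"])) for r in reader]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS production "
        "(factory TEXT, month TEXT, units INTEGER)"
    )
    # Parameterized inserts avoid quoting problems with messy cell values.
    conn.executemany("INSERT INTO production VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def units_per_factory(conn):
    """Example report: total units per factory, sorted by factory name."""
    cur = conn.execute(
        "SELECT factory, SUM(units) FROM production "
        "GROUP BY factory ORDER BY factory"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # use a file path to persist the data
load_csv_into_sqlite(SAMPLE_CSV, conn)
print(units_per_factory(conn))  # [('Lyon', 255), ('Osaka', 90)]
```

Once the rows are in a table, further reports are just additional SQL queries, and the same schema can later be moved to a server database if needed.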
-
-
https://github.com/mariostoev/finviz may be helpful to you
-
extract_otp_secrets
Extract one time password (OTP) secrets from QR codes exported by two-factor authentication (2FA) apps such as "Google Authenticator". The exported QR codes from authentication apps can be captured by camera, read from images, or read from text files. The secrets can be exported to JSON or CSV, or printed as QR codes to console.
Project mention: Show HN: AuthWin – Authenticator App for Windows | news.ycombinator.com | 2024-03-03
This library uses the GPL v3 license: https://github.com/scito/extract_otp_secrets?tab=GPL-3.0-1-o...
Your options are to either go open-source or remove the library.
-
-
-
If they don't want you to use their API just respect their wishes and scrape Reddit. https://github.com/JosephLai241/URS it's the only moral thing we can do.
-
pytablewriter
pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
-
rainbow_csv
🌈Rainbow CSV - Vim plugin: Highlight columns in CSV and TSV files and run queries in SQL-like language
Probably not an exact fit, but this plugin came to mind: rainbow_csv
-
-
-
A note from our sponsor - SaaSHub
www.saashub.com | 19 Mar 2024
Index
What are some of the best open-source CSV projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | q | 10,092 |
2 | datasette | 8,764 |
3 | visidata | 7,328 |
4 | csvkit | 5,776 |
5 | django-import-export | 2,835 |
6 | ethereum-etl | 2,792 |
7 | datamodel-code-generator | 2,196 |
8 | pygraphistry | 2,022 |
9 | JobFunnel | 1,709 |
10 | python-benedict | 1,385 |
11 | DataProfiler | 1,342 |
12 | CleverCSV | 1,197 |
13 | pyexcel | 1,168 |
14 | municipios-brasileiros | 1,048 |
15 | finviz | 996 |
16 | extract_otp_secrets | 919 |
17 | rows | 859 |
18 | csvs-to-sqlite | 854 |
19 | URS | 709 |
20 | pytablewriter | 591 |
21 | rainbow_csv | 564 |
22 | sterraxcyl | 467 |
23 | test-lists | 394 |