Python open-data

Open-source Python projects categorized as open-data

Top 23 Python open-data Projects

  1. CKAN

    CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.

    Project mention: CKAN – The open source data management system | news.ycombinator.com | 2024-12-04
  3. open-thoughts

    Fully open data curation for reasoning models

    Project mention: Hugging Face is looking for reasoning datasets beyond math, science and coding | dev.to | 2025-04-16

    OpenThoughts-114k generation code

  4. opendata.cern.ch

    Source code for the CERN Open Data portal

    Project mention: New Proofs Probe the Limits of Mathematical Truth | news.ycombinator.com | 2025-02-04

    https://opendata.cern.ch/

    I've gotten into 3D printing, and load and temperature data of different filaments is always appreciated.

    Mixing materials together, microscopic images, etc...

    I get a lot of value from YouTubers who simply follow a consistent methodology of endurance or break testing products or materials. Teardowns and documentation of internals, performance statistics, etc...

    Channels like CNCKitchen or ProjectFarm are excellent citizen scientists for example.

  5. Herbie

    Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the University of Utah Pando Archive System. (by blaylockbk)

    Project mention: Show HN: Gribstream.com – Historical Weather Forecast API | news.ycombinator.com | 2024-12-20

    "GenCast predicts weather and the risks of extreme conditions with state-of-the-art accuracy" (2024) https://deepmind.google/discover/blog/gencast-predicts-weath...

    "Probabilistic weather forecasting with machine learning" (2024) ; GenCast paper https://www.nature.com/articles/s41586-024-08252-9

    blaylockbk/Herbie: https://github.com/blaylockbk/Herbie :

    > Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the Pando Archive System

    The Herbie docs mention GFS GraphCast but not yet GenCast? https://herbie.readthedocs.io/en/stable/gallery/noaa_models/...

  6. pudl

    The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.

  7. meteostat-python

    Access and analyze historical weather and climate data with Python.

  8. innovationgraph

    GitHub Innovation Graph

  10. wetterdienst

    Open weather data for humans.

  11. UCF-SST-CitySim1-Dataset

    Official github page of UCF SST CitySim Dataset

  12. upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  13. nycdb

    Database of NYC Housing Data

  14. images

    Public domain photos of Members of the United States Congress (by unitedstates)

  15. Kotori

    A flexible data historian based on InfluxDB, Grafana, MQTT, and more. Free, open, simple.

  16. PatZilla

    PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.

  17. open-grid-emissions

    Tools for producing high-quality hourly generation and emissions data for U.S. electric grids

  18. osmand_map_creation

    OSM data + open address data compiled for use in OSMAnd

  19. dribdat

    Self-hosted challenge board for sweeter hackathons 🐝

  20. Bus-Departure-Board

    A selection of Python programs which will retrieve live bus and rail UK open data and output it to a ER-OLEDM032 (256X64) display screen.

  21. wikdict-gen

    Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project

  22. at-python

    API for Python

  23. dashmap.io

    DashMap is an open source web platform that gathers, analyses and visualises urban data.

  24. datapusher-plus

    Push data into the CKAN Datastore fast & reliably while inferring, calculating & suggesting metadata using Jinja2 Formulas defined in your scheming metadata schema. It pushes real good!

    Project mention: Developing a CKAN Handler for MindsDB: Bridging Open Data and Machine Learning | dev.to | 2024-10-16

    CKAN serves as a data catalog, organizing metadata and actual data in its databases incorporating Datapusher Plus, powered by the lightning-fast QSV library.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
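
Several entries above (CKAN, datapusher-plus) revolve around CKAN portals, which can be queried over HTTP. The following is a minimal sketch, assuming the portal exposes CKAN's standard Action API (version 3) and its package_search action. The helper names package_search_url, dataset_titles and search are hypothetical and chosen for illustration; they are not part of CKAN itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def package_search_url(portal: str, query: str, rows: int = 5) -> str:
    """Build a package_search URL for a CKAN portal (Action API v3)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(payload: dict) -> list[str]:
    """Pull dataset titles out of a decoded package_search response."""
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    # Fall back to the dataset's machine name when no title is set.
    return [pkg.get("title") or pkg["name"] for pkg in payload["result"]["results"]]


def search(portal: str, query: str, rows: int = 5) -> list[str]:
    """Query a live portal and return matching titles (needs network access)."""
    with urlopen(package_search_url(portal, query, rows), timeout=30) as resp:
        return dataset_titles(json.load(resp))
```

For example, search("https://catalog.data.gov", "air quality") would list up to five matching dataset titles, provided that portal still serves the standard API at that path.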

Python open-data related posts

  • Hugging Face is looking for reasoning datasets beyond math, science and coding

    2 projects | dev.to | 16 Apr 2025
  • Open Thoughts: open data curation for reasoning models

    1 project | news.ycombinator.com | 22 Feb 2025
  • New Proofs Probe the Limits of Mathematical Truth

    1 project | news.ycombinator.com | 4 Feb 2025
  • CKAN – The open source data management system

    1 project | news.ycombinator.com | 4 Dec 2024
  • LHC experiments at CERN observe quantum entanglement at the highest energy yet

    1 project | news.ycombinator.com | 22 Sep 2024
  • Latest JavaScript News, Updates, and Tutorials

    1 project | dev.to | 9 Aug 2024
  • Open Source takes center stage at United Nations

    5 projects | news.ycombinator.com | 17 Jul 2024

Index

What are some of the best open-source open-data projects in Python? This list will help you:

#   Project                               Stars
1   CKAN                                  4,805
2   open-thoughts                         2,053
3   opendata.cern.ch                        715
4   Herbie                                  634
5   pudl                                    553
6   meteostat-python                        523
7   innovationgraph                         492
8   wetterdienst                            396
9   UCF-SST-CitySim1-Dataset                388
10  upgini                                  337
11  nycdb                                   230
12  images                                  185
13  Kotori                                  117
14  PatZilla                                108
15  open-australian-legal-corpus-creator     94
16  open-grid-emissions                      85
17  osmand_map_creation                      82
18  dribdat                                  75
19  Bus-Departure-Board                      54
20  wikdict-gen                              51
21  at-python                                46
22  dashmap.io                               45
23  datapusher-plus                          39

