open-data

Top 23 open-data Open-Source Projects

  • CKAN

    CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share, and use data. It powers catalog.data.gov, open.canada.ca/data, and data.humdata.org, among many other sites.

  • Project mention: Open Source Flask-based web applications | dev.to | 2023-07-11

    CKAN: The Open Source Data Portal Software

  • common-voice

    Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

  • Project mention: OpenAI's Whisper is another case study in Colonisation | news.ycombinator.com | 2024-02-06

    Mozilla's Common Voice project (https://commonvoice.mozilla.org/) is creating an open dataset for many minority languages to make it easier to support them in STT systems. If you speak one of these languages, please consider donating a few minutes of your voice.

  • Killed by Google

    Part guillotine, part graveyard for Google's doomed apps, services, and hardware.

  • Project mention: With Vids, Google thinks it has the next big productivity tool for work | news.ycombinator.com | 2024-04-09
  • open-data

    Free football data from StatsBomb

  • Project mention: How to practice data analytics skills | news.ycombinator.com | 2023-12-25
  • fma

    FMA: A Dataset For Music Analysis

  • awesome-open-geoscience

    Curated from repositories that make our lives as geoscientists, hackers and data wranglers easier or just more awesome

  • kuwala

    Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We have set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demograp

  • Project mention: Show HN: GeoSage – A ETL Webtool for Geo and Demographics Data from the Open Web | news.ycombinator.com | 2023-10-05

    --> Google Trends Data for Regions (Coming Soon)

    The tool goes beyond our previously published CLI tool (https://github.com/kuwala-io/kuwala/tree/master/kuwala) by providing a hostable solution with a user-friendly interface. We have not open-sourced it yet but a demo is available here: https://geosage.kuwala.io/.

    Urban planners can utilize movement data to analyze foot traffic in different city zones. Marketers can leverage demographic data to tailor campaigns more effectively. Developers can build their apps on top of it.

    To round it up, GeoSage brings:

    Unified Data Management: Access data from OSM, Facebook, and soon Google, all in one place.

  • Transform-to-Open-Science

    Transformation to Open Science

  • Project mention: Levels of Open Access · nasa/Transform-to-Open-Science · Discussion #454 · GitHub | /r/Open_Access_tracking | 2023-04-30
  • opendata.cern.ch

    Source code for the CERN Open Data portal

  • Project mention: Observable 2.0, a static site generator for data apps | news.ycombinator.com | 2024-02-15

    I think the idea of Framework is really good, but static data limits the applications, excluding monitoring and other cases in which the data is constantly changing, but the dashboard can stay as it is. For example, I'd love to see a revamped Framework version of the LHC beam monitor and related pages (see https://op-webtools.web.cern.ch/vistar/, but check again in 2 months or so, when the accelerator will be running).

    In high-energy physics, ROOT is /the/ toolkit for data analysis, and I guess jsROOT (https://root.cern.ch/js/) could also be used to load data to be shown in Framework dashboards. I thought the idea of Framework as a blogging engine with powerful data visualization built-in could be very interesting. Think, for example, about physicists pulling open data (https://opendata.cern.ch) and writing about their analysis or someone pulling data from https://ourworldindata.org/ in their own visualizations to support their case while writing about a particular subject, etc.

  • Anahita

    Anahita is a platform and framework for developing open science and knowledge sharing applications on a social networking foundation.

  • fraud-detection-handbook

    Reproducible Machine Learning for Credit Card Fraud Detection - Practical Handbook

  • awesome-portugal-data

    🇵🇹 List of open data repositories in Portugal

  • mais

    ⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/

  • Herbie

    Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the University of Utah Pando Archive System. (by blaylockbk)

  • Project mention: Struggling to find archive forecast data. Looking for help | /r/meteorology | 2023-12-08

    Thank you everyone! I've found what I needed: the HRRR-B Python package by Brian Blaylock. It's fantastic for downloading and reading HRRR grib2 files and works great for my project. Highly recommended!

  • meteostat-python

    Access and analyze historical weather and climate data with Python.

  • Project mention: Historical weather data | /r/croatia | 2023-06-15

    Try: https://github.com/meteostat/meteostat-python

  • UCF-SST-CitySim1-Dataset

    Official GitHub page of the UCF SST CitySim Dataset

  • wetterdienst

    Open weather data for humans.

  • innovationgraph

    GitHub Innovation Graph

  • Project mention: GitHub Innovation Graph | news.ycombinator.com | 2024-02-05
  • upgini

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

  • Project mention: The fastest way to improve quality of ML model on tabular data | /r/learnmachinelearning | 2023-06-18

    web: https://upgini.com

  • kamu-cli

    New generation decentralized data lake and a streaming data pipeline

  • free-exercise-db

    Open Public Domain Exercise Dataset in JSON format, over 800 exercises with a browsable public searchable frontend

  • awesome-italian-public-datasets

    A selection of interesting open datasets from the Italian Public Administration and Civic Data use cases

  • nycdb

    Database of NYC Housing Data

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source open-data projects? This list will help you:

Rank Project Stars
1 CKAN 4,253
2 common-voice 3,247
3 Killed by Google 2,345
4 open-data 2,200
5 fma 2,108
6 awesome-open-geoscience 1,337
7 kuwala 755
8 Transform-to-Open-Science 655
9 opendata.cern.ch 635
10 Anahita 430
11 fraud-detection-handbook 429
12 awesome-portugal-data 386
13 mais 381
14 Herbie 374
15 meteostat-python 352
16 UCF-SST-CitySim1-Dataset 334
17 wetterdienst 324
18 innovationgraph 319
19 upgini 290
20 kamu-cli 275
21 free-exercise-db 258
22 awesome-italian-public-datasets 248
23 nycdb 182
