Top 23 open-data Open-Source Projects
-
CKAN
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
-
common-voice
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
awesome-open-geoscience
Curated from repositories that make our lives as geoscientists, hackers and data wranglers easier or just more awesome
-
kuwala
Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demograp
-
Anahita
Anahita is a platform and framework for developing open science and knowledge sharing applications on a social networking foundation.
-
fraud-detection-handbook
Reproducible Machine Learning for Credit Card Fraud Detection - Practical Handbook
-
mais
⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/
-
Herbie
Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the University of Utah Pando Archive System. (by blaylockbk)
-
upgini
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
-
free-exercise-db
Open Public Domain Exercise Dataset in JSON format, over 800 exercises with a browsable public searchable frontend
-
awesome-italian-public-datasets
A selection of interesting open datasets from the Italian Public Administration and civic data use cases
CKAN The Open Source Data Portal Software
Project mention: OpenAI's Whisper is another case study in Colonisation | news.ycombinator.com | 2024-02-06
Mozilla's Common Voice Project (https://commonvoice.mozilla.org/) is creating an open dataset for many minority languages to make it easier to support them in STT systems. If you speak one of these languages please consider donating a few minutes of your voice.
Project mention: With Vids, Google thinks it has the next big productivity tool for work | news.ycombinator.com | 2024-04-09
Project mention: Show HN: GeoSage – A ETL Webtool for Geo and Demographics Data from the Open Web | news.ycombinator.com | 2023-10-05
--> Google Trends Data for Regions (Coming Soon)
The tool goes beyond our previously published CLI tool (https://github.com/kuwala-io/kuwala/tree/master/kuwala) by providing a hostable solution with a user-friendly interface. We have not open-sourced it yet but a demo is available here: https://geosage.kuwala.io/.
Urban planners can utilize movement data to analyze foot traffic in different city zones. Marketers can leverage demographic data to tailor campaigns more effectively. Developers can build their apps on top of it.
To round it up, GeoSage brings:
Unified Data Management: Access data from OSM, Facebook, and soon Google, all in one place.
Project mention: Levels of Open Access · nasa/Transform-to-Open-Science · Discussion #454 · GitHub | /r/Open_Access_tracking | 2023-04-30
Project mention: Observable 2.0, a static site generator for data apps | news.ycombinator.com | 2024-02-15
I think the idea of Framework is really good, but static data limits the applications, excluding monitoring and other cases in which the data is constantly changing, but the dashboard can stay as it is. For example, I'd love to see a revamped Framework version of the LHC beam monitor and related pages (see https://op-webtools.web.cern.ch/vistar/, but check again in 2 months or so, when the accelerator will be running).
In high-energy physics, ROOT is /the/ toolkit for data analysis, and I guess jsROOT (https://root.cern.ch/js/) could also be used to load data to be shown in Framework dashboards. I thought the idea of Framework as a blogging engine with powerful data visualization built-in could be very interesting. Think, for example, about physicists pulling open data (https://opendata.cern.ch) and writing about their analysis or someone pulling data from https://ourworldindata.org/ in their own visualizations to support their case while writing about a particular subject, etc.
Project mention: Struggling to find archive forecast data. Looking for help | /r/meteorology | 2023-12-08
Thank you everyone! I've found what I needed: the HRRR-B Python package by Brian Blaylock. It's fantastic for downloading and reading HRRR grib2 files and works great for my project. Highly recommended!
Try: https://github.com/meteostat/meteostat-python
Project mention: The fastest way to improve quality of ML model on tabular data | /r/learnmachinelearning | 2023-06-18
web: https://upgini.com
open-data related posts
- With Vids, Google thinks it has the next big productivity tool for work
- Google Axion Processors, our new Arm-based CPUs
- Google's Decision to Effectively Kill-off Small Sites
- Calls grow for Sundar Pichai to step down from Google CEO position
- Google's Gemini Headaches Spur $90B Selloff
- Our Company Is Doing So Well That You're All Fired
- Gemini Ultra now available in Google Bard
Index
What are some of the best open-source open-data projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | CKAN | 4,253 |
2 | common-voice | 3,247 |
3 | Killed by Google | 2,345 |
4 | open-data | 2,200 |
5 | fma | 2,108 |
6 | awesome-open-geoscience | 1,337 |
7 | kuwala | 755 |
8 | Transform-to-Open-Science | 655 |
9 | opendata.cern.ch | 635 |
10 | Anahita | 430 |
11 | fraud-detection-handbook | 429 |
12 | awesome-portugal-data | 386 |
13 | mais | 381 |
14 | Herbie | 374 |
15 | meteostat-python | 352 |
16 | UCF-SST-CitySim1-Dataset | 334 |
17 | wetterdienst | 324 |
18 | innovationgraph | 319 |
19 | upgini | 290 |
20 | kamu-cli | 275 |
21 | free-exercise-db | 258 |
22 | awesome-italian-public-datasets | 248 |
23 | nycdb | 182 |