awesome-public-real-time-datasets
FLaNK-Halifax
Metric | awesome-public-real-time-datasets | FLaNK-Halifax |
---|---|---|
Mentions | 8 | 14 |
Stars | 434 | 1 |
Growth | 18.7% | - |
Activity | 5.1 | 7.3 |
Last commit | 15 days ago | 7 months ago |
Language | TypeScript | - |
License | Creative Commons Zero v1.0 Universal | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
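The exact formulas behind these numbers are not given here, so the following is only a minimal sketch of how such metrics could be computed. The exponential decay, the 30-day half-life, the percentile mapping, and all function names are assumptions made for illustration, not the site's actual method.

```python
from bisect import bisect_left


def month_over_month_growth(stars_now: int, stars_last_month: int) -> float:
    """Percentage change in stars relative to the previous month."""
    if stars_last_month <= 0:
        raise ValueError("previous month's star count must be positive")
    return (stars_now - stars_last_month) * 100.0 / stars_last_month


def weighted_commit_score(commit_ages_days: list[float],
                          half_life_days: float = 30.0) -> float:
    """Sum of per-commit weights, so recent commits count more than old ones.

    Assumption: a commit's weight halves every `half_life_days`.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_rating(raw_score: float, all_raw_scores: list[float]) -> float:
    """Map a raw score to a 0-10 scale by percentile rank (assumed mapping).

    A rating of 9.0 means 90% of tracked projects have a strictly lower
    raw score, i.e. the project is in the top 10%.
    """
    ranked = sorted(all_raw_scores)
    below = bisect_left(ranked, raw_score)  # count of strictly lower scores
    return round(10.0 * below / len(ranked), 1)
```

With ten tracked projects whose raw scores are 1 through 10, the project scoring 10 would get a rating of 9.0 under this mapping, which matches the "top 10%" reading of the example above.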
awesome-public-real-time-datasets
- List of publicly available datasets with real-time data
- FLaNK Stack Weekly for 20 Nov 2023
- Bytewax: Stream processing library built using Python and Rust
- Public Real-Time Datasets and Sources
- What are some good publicly available real-time data sources?
  Added for now - https://github.com/bytewax/awesome-public-real-time-datasets/commit/94ca4a3d40dc212690c6cdc22c107289b4268661
  I am attempting to source via the wisdom of the crowd here. I often find it hard to find good real-time data sources for learning about streaming, prototyping, or building hobby projects. I started researching and then created an "Awesome List" in a GitHub repo - https://github.com/bytewax/awesome-public-real-time-datasets.
- Ask HN: What are some public real-time data sources?
  I started an awesome list with real-time data sources here: https://github.com/bytewax/awesome-public-real-time-datasets. Have any datasets or data sources I should add to this list? Comment below or PRs welcome :).
FLaNK-Halifax
- FLaNK Stack Weekly for 27 November 2023
- FLaNK Stack Weekly for 20 Nov 2023
- FLaNK Stack Weekly for 13 November 2023
- FLaNK Stack Weekly 06 Nov 2023
- FLaNK Stack Weekly for 30 Oct 2023
- FLaNK Stack Weekly 23 Oct 2023
- FLaNK Stack Weekly 16 October 2023
- FLaNK Stack Weekly 09 Oct 2023
- FLaNK Stack Weekly 2 October 2023
- FLaNK Stack for 25 September 2023
What are some alternatives?
datagen - Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.
rivet - The open-source visual AI programming environment and TypeScript library
screenshot-to-code - Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
vimGPT - Browse the web with GPT-4V and Vimium
RedfinScraper - Scrapes Redfin data.
SeaGOAT - local-first semantic code search engine
mockingbird - Mockingbird is a mock streaming data generator
flink-cdc - Flink CDC is a streaming data integration tool
superset - Apache Superset is a Data Visualization and Data Exploration Platform
CML_AMP_Intelligent-QA-Chatbot-with-NiFi-Pinecone-and-Llama2 - The prototype deploys an Application in CML using a Llama2 model from Hugging Face to answer questions augmented with knowledge extracted from the website. This prototype introduces Pinecone as a database for storing vectors for semantic search.
depthai-python - DepthAI Python Library
co-tracker - CoTracker is a model for tracking any point (pixel) on a video.