datagen
awesome-public-real-time-datasets
Metric | datagen | awesome-public-real-time-datasets
---|---|---
Mentions | 7 | 8
Stars | 135 | 358
Growth | 3.0% | 8.4%
Activity | 6.1 | 5.0
Last commit | about 2 months ago | about 2 months ago
Language | TypeScript | -
License | Apache License 2.0 | Creative Commons Zero v1.0 Universal
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
datagen
- What are your favorite tools or components in the Kafka ecosystem?
  For fake data, shameless plug for https://github.com/MaterializeInc/datagen/tree/main
- What are some good publicly available real-time data sources?
- Simulating Streaming Data for Fraud Detection with Datagen CLI
  Building and testing a real-time fraud detection application requires a continuous stream of realistic data. But generating that data can be a challenge. That's why we recently created the Datagen CLI, a simple tool that helps you create believable fake data using the FakerJS API.
- How train my SQL skills with real world data engineering problems ?
  Generate fake data with a normalized schema of your choosing with this tool from Materialize, then denormalize it and build a warehouse model.
- FLiPN-FLaNK Stack Weekly for 20 March 2023
- Datagen CLI: Stream Fake Relational Data
awesome-public-real-time-datasets
- List of publicly available datasets with real-time data
- FLaNK Stack Weekly for 20 Nov 2023
- Bytewax: Stream processing library built using Python and Rust
- Public Real-Time Datasets and Sources
- What are some good publicly available real-time data sources?
  Added for now - https://github.com/bytewax/awesome-public-real-time-datasets/commit/94ca4a3d40dc212690c6cdc22c107289b4268661
  I am attempting to source via the wisdom of the crowd here. I often find it hard to find good real-time data sources for learning about streaming, prototyping, or building hobby projects. I started researching and then created an "Awesome List" in a GitHub repo - https://github.com/bytewax/awesome-public-real-time-datasets.
- Ask HN: What are some public real-time data sources?
  I started an awesome list with real-time data sources here: https://github.com/bytewax/awesome-public-real-time-datasets . Have any datasets or data sources I should add to this list? Comment below or PRs welcome :).
What are some alternatives?
ChatGLM-6B - ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型
screenshot-to-code - Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
CloudDemo2023 - 2023 Demos
RedfinScraper - Scrapes Redfin data.
halp - A CLI tool to get help with CLI tools 🐙
superset - Apache Superset is a Data Visualization and Data Exploration Platform
mockingbird - Mockingbird is a mock streaming data generator
cf-url-shortener - URL Shortener Cloudflare function that uses Upstash Redis and Kafka along with https://materialize.com
depthai-python - DepthAI Python Library
debezium-ui - A web UI for Debezium; Please log issues at https://issues.redhat.com/browse/DBZ.
Scada-LTS - Scada-LTS is an Open Source, web-based, multi-platform solution for building your own SCADA (Supervisory Control and Data Acquisition) system.