DataDreamer
nifi
Metric | DataDreamer | nifi
---|---|---
Mentions | 5 | 35
Stars | 667 | 4,449
Growth | 10.3% | 2.6%
Activity | 8.5 | 9.9
Last commit | 13 days ago | 2 days ago
Language | Python | Java
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DataDreamer
- FLaNK AI - 01 April 2024
- FLaNK Stack 26 February 2024
- FLaNK Stack Weekly 19 Feb 2024
- DataDreamer
- Ask HN: What have you built with LLMs?
  We've built a prompting, synthetic data generation, and training library called DataDreamer: https://github.com/datadreamer-dev/DataDreamer
nifi
- FLaNK Stack Weekly 19 Feb 2024
- Ask HN: What are some unpopular technologies you wish people knew more about?
- FLaNK Stack Weekly for 13 November 2023
- Ask HN: What low code platforms are worth using?
  Apache NiFi (https://nifi.apache.org/).
  It uses the concept of flow-based programming. It is underacknowledged, but the tool is very flexible. I have used it as an event bus for all the third-party integrations.
- Apache Nifi: easy to use, powerful, reliable system to process, distribute data
- Tool decision - What architecture would you choose and why?
- Help with choosing techstack for a new DE team
  Presently setting up Apache NiFi + Apache MiNiFi for the ETL portion of my work. NiFi was easy enough to figure out, but the docs for MiNiFi have been a pain due to differences between the Java and C++ versions. I ended up configuring it entirely with the Java version so that it was easier to search for answers about the MiNiFi YAML syntax.
- MS SQL Change Data Capture
  Found it
- Is there something like airflow but written in Scala/Java?
  Apache Camel, Apache NiFi, Spring Cloud
- Json splitting and Rerouting (new to nifi)
  NiFi, like most Apache projects, does most of its discussion on its mailing lists, but also has a Slack.
What are some alternatives?
speedb - A RocksDB compliant high performance scalable embedded key-value store
Logstash - Logstash - transport and process your logs, events, or other data
tracecat - 😼 The open source alternative to Tines / Splunk SOAR. Build AI-assisted workflows, orchestrate alerts, and close cases fast.
superset - Apache Superset is a Data Visualization and Data Exploration Platform
CML_AMP-to-Airgapped - Download the AMP catalog for an offline (airgapped) deployment of the AMP catalog.
FLaNK-python-processors - Many processors
meltano - Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
Apache Cassandra - Mirror of Apache Cassandra
django-project-template - The Django project template I use, for installation with django-admin.