Metric | nifi | Apache Camel
---|---|---
Mentions | 35 | 21
Stars | 4,410 | 5,318
Growth | 1.8% | 0.7%
Activity | 9.9 | 10.0
Last commit | 8 days ago | 4 days ago
Language | Java | Java
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
nifi
- FLaNK Stack Weekly 19 Feb 2024
- Ask HN: What are some unpopular technologies you wish people knew more about?
- FLaNK Stack Weekly for 13 November 2023
- Ask HN: What low code platforms are worth using?
Apache NiFi (https://nifi.apache.org/).
It uses the concept of flow-based programming. It is underacknowledged, but this tool is very flexible. I have used it as an event bus for all the third-party integrations.
- Apache Nifi: easy to use, powerful, reliable system to process, distribute data
- Tool decision - What architecture would you choose and why?
- Help with choosing techstack for a new DE team
Presently setting up Apache NiFi + Apache MiNiFi for the ETL portion of my work. NiFi was easy enough to figure out, but the docs for MiNiFi have been a pain due to differences between the Java and C++ versions. I then configured it entirely with the Java version so that it was easier to search for answers about the MiNiFi YAML syntax.
- MS SQL Change Data Capture
Found it
- Is there something like airflow but written in Scala/Java?
Apache Camel, Apache NiFi, Spring Cloud
- Json splitting and Rerouting (new to nifi)
NiFi, like most Apache projects, does most of its discussion on its mailing lists, but it also has a Slack.
Apache Camel
- Show HN: Winglang – a new Cloud-Oriented programming language
- Ask HN: What is the correct way to deal with pipelines?
"correct" is a value judgement that depends on lots of different things. Only you can decide which tool is correct. Here are some ideas:
- https://camel.apache.org/
- https://www.windmill.dev/
- https://github.com/huginn/huginn
Your idea about a queue (in redis, or postgres, or sqlite, etc) is also totally valid. These off-the-shelf tools I listed probably wouldn't give you a huge advantage IMO.
- Is there something like airflow but written in Scala/Java?
Apache Camel, Apache NiFi, Spring Cloud
- Why messaging is much better than REST for inter-microservice communications
This reminds me more of Apache Camel[0] than other things it's being compared to.
> The process initiator puts a message on a queue, and another processor picks that up (probably on a different service, on a different host, and in different code base) - does some processing, and puts its (intermediate) result on another queue
This is almost exactly the definition of message routing (ie: Camel).
I'm a bit doubtful about the pitch because the solution is presented as enabling you to maintain synchronous-style programming while achieving the benefits of async processing. This just isn't true; these are fundamental tradeoffs. If you need a synchronous answer back, then no amount of queuing, routing, prioritisation, etc. will save you when the fundamental resource providing that answer is unavailable. The ultimate outcome, that your synchronous client now hangs indefinitely waiting for a reply message instead of erroring hard and fast, is not desirable at all. If you go into this ad hoc, and build in a leaky abstraction that asynchronous things are actually synchronous and vice versa, before you know it you are going to have unstable behaviour or, even worse, deadlocks all over your system. The worst part is that the true state of the system is now hidden in which messages are pending in transient message queues everywhere.
What really matters here is to fundamentally design things from the start with patterns that allow you to be very explicit about what needs to be synchronous vs async (building on principles of idempotency, immutability, coherence, to maximise the cases where async is the answer).
The notion of Apache Camel is to make all these decisions first-class elements of your framework and then to extract the routing layer as a dedicated construct. The fact that it generalises beyond message queues (treating literally anything that can provide a piece of data as a message provider) is a bonus.
[0] https://camel.apache.org/
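The queue hand-off quoted above, and the point about failing fast instead of hanging, can be illustrated with a minimal sketch in plain Java. This is not Camel's API: it uses only `java.util.concurrent`, and the class name `QueuePipelineSketch`, the helper names, and the uppercase "processing" step are assumptions made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueuePipelineSketch {

    // Hypothetical processing stage: takes a message from `in`,
    // transforms it, and puts the intermediate result on `out`.
    static Thread startUppercaseStage(BlockingQueue<String> in, BlockingQueue<String> out) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = in.take();      // blocks until a message arrives
                    out.put(message.toUpperCase());  // hand the result to the next queue
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // stop cleanly when interrupted
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive for this sketch
        worker.start();
        return worker;
    }

    // Waits a bounded time for a reply. Returns null if nothing arrived,
    // so the caller can fail fast instead of hanging indefinitely.
    static String awaitReply(BlockingQueue<String> replies, long timeoutMillis)
            throws InterruptedException {
        return replies.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> requests = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> results = new ArrayBlockingQueue<>(16);
        startUppercaseStage(requests, results);

        requests.put("order-42");
        System.out.println(awaitReply(results, 1000)); // ORDER-42

        // No processor is attached to this queue, so the bounded wait
        // gives up and returns null rather than blocking forever.
        BlockingQueue<String> unattended = new ArrayBlockingQueue<>(16);
        System.out.println(awaitReply(unattended, 100)); // null
    }
}
```

The bounded `poll` is the explicit part: the caller has to decide what a missing reply means, rather than pretending the exchange is synchronous.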
- Can I continuously write to a CSV file with a python script while a Java application is continuously reading from it?
Since you're writing a Java app to consume this, I highly recommend Apache Camel to do the consuming of messages for it. You can trivially aim it at file systems, message queues, databases, web services and all manner of other sources to grab your data for you, and you can change your mind about what that source is, without having to rewrite most of your client code.
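The benefit described here, swapping the data source without rewriting the consuming code, can be sketched in plain Java. This is only an illustration of the idea, not Camel's API: the `MessageSource` interface, both implementations, and the `drain` helper are hypothetical names. The file source re-reads the whole file on every poll, which is fine for a sketch but not for large files, and it may see a partially written last line if the writer is mid-write.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SwappableSourceSketch {

    /** Hypothetical abstraction: anything that can hand over pending messages. */
    interface MessageSource {
        List<String> poll() throws IOException;
    }

    /** Returns the lines added to a file since the previous poll. */
    static final class FileLineSource implements MessageSource {
        private final Path file;
        private int linesSeen = 0;

        FileLineSource(Path file) {
            this.file = file;
        }

        @Override
        public List<String> poll() throws IOException {
            List<String> all = Files.readAllLines(file);
            int start = Math.min(linesSeen, all.size()); // tolerate a truncated file
            List<String> fresh = new ArrayList<>(all.subList(start, all.size()));
            linesSeen = all.size();
            return fresh;
        }
    }

    /** Stand-in for a queue, database, or web service source. */
    static final class InMemorySource implements MessageSource {
        private final List<String> pending = new ArrayList<>();

        void add(String message) {
            pending.add(message);
        }

        @Override
        public List<String> poll() {
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            return batch;
        }
    }

    /** Consumer code: identical no matter which source is plugged in. */
    static int drain(MessageSource source, Consumer<String> handler) throws IOException {
        List<String> batch = source.poll();
        batch.forEach(handler);
        return batch.size();
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("feed", ".csv");
        Files.write(csv, List.of("id,value", "1,10"));

        FileLineSource fileSource = new FileLineSource(csv);
        drain(fileSource, line -> System.out.println("file: " + line));

        // Simulate the writer appending a row; only the new row is delivered.
        Files.write(csv, List.of("2,20"), StandardOpenOption.APPEND);
        drain(fileSource, line -> System.out.println("file: " + line));

        InMemorySource memorySource = new InMemorySource();
        memorySource.add("3,30");
        drain(memorySource, line -> System.out.println("memory: " + line));
    }
}
```

In Camel itself, the equivalent switch would be a change to the route's endpoint rather than a new class, which is the convenience the comment is pointing at.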
- S3 to S3 transform
For a simple sequential pipeline, my go-to would be Apache Camel. As soon as you want complexity, it's either Apache NiFi or a microservice architecture.
- 🗞️ We have just released our JBang! catalog 🛍️
🐪 Apache Camel : Camel JBang, A JBang-based Camel app for easily running Camel routes.
- 7GUIs of Java/Object Oriented Design?
- System Design: Enterprise Service Bus (ESB)
Apache Camel
- Advanced: Java, JVM and general knowledge
So, my advice is this. Expand your knowledge. Pursue higher education on topics you are familiar with, but also explore topics you are not. Read documentation, but question it. I just found out about something called Apache Camel today that I am excited to read up on. Why is it better than Spring? Is it really? What's happening here? This is always what excites me as a developer and engineer. There is so much to learn.
What are some alternatives?
Logstash - Logstash - transport and process your logs, events, or other data
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
superset - Apache Superset is a Data Visualization and Data Exploration Platform
Apache Kafka - Mirror of Apache Kafka
Apache Pulsar - Apache Pulsar - distributed pub-sub messaging system
meltano - Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Apache ActiveMQ Artemis - Mirror of Apache ActiveMQ Artemis
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Spring Boot - Spring Boot
Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
Aeron - Efficient reliable UDP unicast, UDP multicast, and IPC message transport