| | conduit | plumber |
|---|---|---|
| Mentions | 7 | 19 |
| Stars | 345 | 2,043 |
| Growth | 3.2% | 0.4% |
| Activity | 9.5 | 7.7 |
| Latest commit | 8 days ago | about 1 month ago |
| Language | Go | Go |
| License | Apache License 2.0 | MIT License |
- Stars: the number of stars that a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
conduit
- Pulling CDC data from Postgres
I'd like to mention Conduit + its Postgres connector. The Pg connector comes built-in, so all that is needed is a single Conduit binary to get started. It relies on WAL, but the connector creates the replication slot itself (if needed).
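As a sketch of what "a single Conduit binary" looks like in practice, here is a minimal pipeline configuration file wiring the built-in Postgres source (in logical-replication CDC mode) to a file destination. The setting names (`url`, `tables`, `cdcMode`) and plugin identifiers are assumptions from memory and may differ by Conduit version; check the Postgres connector documentation for the exact keys:

```yaml
version: "2.2"
pipelines:
  - id: pg-cdc-example
    status: running
    connectors:
      - id: pg-source
        type: source
        plugin: builtin:postgres
        settings:
          # Conduit creates the replication slot itself if it is missing
          url: postgres://user:pass@localhost:5432/mydb
          tables: orders
          cdcMode: logrepl
      - id: file-dest
        type: destination
        plugin: builtin:file
        settings:
          path: /tmp/orders.out
```

With a file like this in the pipelines directory, starting the one binary is all that's needed; no separate connect cluster or worker processes.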
- How to connect already setup kafka cluster to mongodb?
GitHub - ConduitIO/Conduit: Data Integration for Production Data Stores. Conduit is meant to be a bit more general-purpose than Kafka Connect and is an easy drop-in replacement. We're working hard to make that even easier. We are still in the early stages of this project and are trying to build more and more connectors. You can find out more about our connector roadmap on the GitHub repo. Our connector philosophy is to be real-time first, and double down on change data capture (CDC) capabilities, all with permissive licensing.
- What services have you guys used for CDC (Change Data Capture) for SQL as well as NoSQL databases?
If you're looking for a tool with a UI, and one whose functionality you can easily extend with your own custom data connectors, you might also want to take a look at Conduit, another open-source tool we've developed to make building and running real-time data infrastructure more straightforward and less time-consuming.
- Alternative Kafka Integration Framework to Kafka Connect?
You might want to check out: https://github.com/conduitio/conduit
- Where is the modern data stack for software engineers?
This is why we are working on a project called Conduit at Meroxa. We hope to change the experience software engineers have with data.
- Conduit: Data Integration for Production Data Stores
- Conduit: Data Integration Tool for Production Data Stores written in Go
plumber
- plumber VS kaf - a user-suggested alternative
2 projects | 12 Jan 2024
- 14 DevOps and SRE Tools for 2024: Your Ultimate Guide to Stay Ahead
Streamdal
- Show HN: Streamdal – an open-source tail -f for your data
4. Go to the provided UI (or run the CLI app) and be able to peek into what your app is reading or writing, like with `tail -f`.
And that's basically it. There's a bunch more functionality in the project but we find this to be the most immediately useful part. Every developer we've shown this to has said "I wish I had this at my gig at $company" - and we feel exactly the same. We are devs and this is what we’ve always wanted, hundreds of times - a way to just quickly look at the data our software is producing in real-time, without having to jump through any hoops.
If you want to learn more about the "why" and the origin of this project - you can read about it here: https://streamdal.com/manifesto
— — —
HOW DOES IT WORK?
The SDK establishes a long-running session with the server (using gRPC) and "listens" for commands that are forwarded to it all the way from the UI -> server -> SDK.
The commands are things like: "show me the data that you are currently consuming", "apply these rules to all data that you produce", "inspect the schema for all data", and so on.
The SDK interprets the command and either executes Wasm-based rules against the data it's processing or if it's a `tail` request - it'll send the data to the server, which will forward it to the UI for display.
The SDK IS part of the critical path but it does not have a dependency on the server. If the server is gone, you won't be able to use the UI or send commands to the SDKs, but that's about it - the SDKs will continue to work and attempt to reconnect to the server behind the scenes.
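The command flow described above can be sketched as a toy Go model. Everything here is hypothetical (the real SDKs speak gRPC to the server, not channels): the point is only to show the shape of the design, where the app keeps calling the SDK on its critical path, and a "tail" command merely switches on best-effort forwarding of a copy of each message, so a slow or absent server never blocks the app.

```go
package main

import "fmt"

// Command mimics an instruction forwarded from the UI -> server -> SDK.
type Command struct {
	Kind string // e.g. "tail", "apply_rules"
}

// SDK sits on the app's message path. When tailing is on, it also
// forwards a copy of each message toward the server.
type SDK struct {
	tailing bool
	server  chan string // stand-in for the gRPC session; may be nil
}

// Handle interprets a command received over the long-running session.
func (s *SDK) Handle(cmd Command) {
	if cmd.Kind == "tail" {
		s.tailing = true
	}
}

// Process is on the critical path: it always returns the message to the
// app, and only best-effort forwards a copy when tailing is active.
func (s *SDK) Process(msg string) string {
	if s.tailing && s.server != nil {
		select {
		case s.server <- msg:
		default: // server slow or gone: drop the copy, never block the app
		}
	}
	return msg
}

func main() {
	server := make(chan string, 1)
	sdk := &SDK{server: server}

	fmt.Println(sdk.Process("order-1")) // no tail command yet: nothing forwarded
	sdk.Handle(Command{Kind: "tail"})   // UI asked to "show me the data"
	fmt.Println(sdk.Process("order-2")) // a copy now goes to the server for the UI
	fmt.Println("server saw:", <-server)
}
```

The `default:` branch in `Process` is the whole "no dependency on the server" property in miniature: losing the server costs you the tail view, not the pipeline.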
— — —
TECHNICAL BITS
The project consists of a lot of "buzzwordy" tech: we use gRPC, gRPC-Web, Protobuf, Redis, Wasm, Deno, ReactFlow, and probably a few other things.
The server is written in Go, all of the Wasm is Rust, and the UI is TypeScript. There are SDKs for Go, Python, and Node. We chose these languages for the SDKs because we've been working in them daily for the past 10+ years.
The reasons for the tech choices are explained in detail here: https://docs.streamdal.com/en/resources-support/open-source/
— — —
LAST PART
OK, that's it. What do you think? Is it useful? Can we answer anything?
- If you like what you're seeing, give our repo a star: https://github.com/streamdal/streamdal
- In memory message broker, any recommendations?
Check out plumber https://github.com/streamdal/plumber if you want all the Postgres changes sent to basically any type of broker queue https://docs.streamdal.com/en/data-ingestion/relay/postgresql-cdc/. I would say NATS JetStream is probably the way to go if you have K8s running already. It's a dead-simple service written in Go. Just make sure you allocate enough memory to JetStream.
- Pulling CDC data from Postgres
I recommend Streamdal. The connecting agent is open source and distributed by default, so it will scale horizontally WAY better than Debezium. All data ingested is indexed into parquet as well, and you can do serverless functions/transforms on the platform to reduce Snowflake compute costs.
- Data Pipelines - how do you build data pipelines for sources not available in today’s ELT tools (Fivetran, Talend, Airbyte)? Old fashioned scripts and YOLO?
For the CDC and event-driven part of the stack, Plumber is a great free tool. That project is going to be adding sampling soon too, which can definitely help with the cost of ETL.
- Open source project ideas
Check out https://github.com/batchcorp/plumber if you want to get into event-driven systems.
- What would you rewrite in Golang?
That’s awesome to see. My coworker and I always figured Go would be perfect for this. Going to be a serious amount of work! I see you use NATS as well. Big fan of it. Check out our project https://github.com/batchcorp/plumber if you end up needing to inspect or send messages while developing against it.
- I want to participate in a golang open source projects. Have any suggestions or recommendations?
Check out plumber https://github.com/batchcorp/plumber and join our Slack https://launchpass.com/streamdal; we've got a pretty knowledgeable group.
- batchcorp/plumber: A Swiss Army knife CLI tool for interacting with Kafka, RabbitMQ, and other messaging systems.
What are some alternatives?
turbine-go - Turbine Library for Go
akhq - Kafka GUI for Apache Kafka to manage topics, topics data, consumers group, schema registry, connect and more...
dozer - Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.
kowl - Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, managing consumer groups, and exploring real-time data with time-travel debugging. [Moved to: https://github.com/redpanda-data/console]
sqlpipe - SQLpipe makes it easy to move the result of one query from one database to another.
kafka_manager - Simplifies eventing between microservices using kafka with kafka-go client
bytewax - Python Stream Processing
FASTER - Fast persistent recoverable log and key-value store + cache, in C# and C++.
Benthos - Fancy stream processing made operationally mundane
Enqueue - Message Queue, Job Queue, Broadcasting, WebSockets packages for PHP, Symfony, Laravel, Magento. DEVELOPMENT REPOSITORY - provided by Forma-Pro
deprecated-core - 🔮 Instill Core contains components for supporting Instill VDP and Instill Model
message-db - Microservice native message and event store for Postgres