gcs-connector-for-apache-kafka vs secor
Metric | gcs-connector-for-apache-kafka | secor
---|---|---
Mentions | 1 | 3
Stars | 65 | 1,837
Growth | - | 0.3%
Activity | 8.3 | 0.0
Last commit | 20 days ago | 4 days ago
Language | Java | Java
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
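As a rough illustration of the two figures described above, the sketch below computes month-over-month star growth and reads an activity score as a "top X%" share. The function names, the previous-month star count of 1,831, and the linear mapping from activity to percentile are assumptions made for illustration; the page gives only the single "9.0 means top 10%" example and does not publish its formula.

```python
from typing import Optional


def month_over_month_growth(previous_stars: int, current_stars: int) -> Optional[float]:
    """Percentage change in stars between two consecutive monthly snapshots."""
    if previous_stars <= 0:
        # Without a positive baseline the percentage is undefined.
        return None
    return (current_stars - previous_stars) / previous_stars * 100.0


def top_share_from_activity(activity: float) -> float:
    """Assumed linear reading of the score: 9.0 -> top 10% of tracked projects."""
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity is assumed to lie between 0 and 10")
    return (10.0 - activity) * 10.0


# Hypothetical snapshot: 1,831 stars last month, 1,837 stars now.
growth = month_over_month_growth(1831, 1837)
print(f"growth: {growth:.1f}%")
print(f"activity 8.3 -> top {top_share_from_activity(8.3):.0f}% (under the linear assumption)")
```

Under that assumption, a score of 8.3 would place a project roughly in the top 17% of tracked projects, while a score of 0.0 would place it at the bottom of the ranking.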
gcs-connector-for-apache-kafka
- Kafka to GCS Persistence Tools
Connector: Low Amount of Users
secor
- Kafka to GCS Persistence Tools
secor: Seems a bit on the older side
- Storing Kafka messages to DynamoDB
Since you are trying to archive this data, a simpler solution is to use something like Secor to archive the data from Kafka to S3. It is also much cheaper than DynamoDB.
- ELT, Data Pipeline
Once our Kafka producer was working, we needed a consumer to pull the data and push it to GCS. After some research on GitHub, we found Secor from Pinterest to be a viable option. Although it is a great piece of software, it did not map ideally to our design, so we submitted a few pull requests to make the necessary changes to the Secor project, both for our own use and for the benefit of the open source community. These ranged from updating the setup docs (PR268, PR271, PR277) to adding a flexible upload directory structure with hourly support (PR275) and support for a partitioned parser with no offset folder (PR279). We also added a flexible delimited file reader and writer option (PR291) for better control over file structure. The diagram below shows our current ELT pipeline running in production.
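The idea of an upload directory structure with hourly support, mentioned in the excerpt above, can be pictured with a short sketch. This is not Secor's implementation or configuration: the function name, the `dt=`/`hr=` folder naming, and the sample topic and file name are all assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def hourly_object_path(topic: str, timestamp_ms: int, filename: str) -> str:
    """Build an object path partitioned by UTC date and hour (hypothetical layout)."""
    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    # One folder per day and one per hour keeps each upload batch easy to locate.
    return f"{topic}/dt={ts:%Y-%m-%d}/hr={ts:%H}/{filename}"


# Hypothetical message timestamp (milliseconds since the Unix epoch).
print(hourly_object_path("events", 1_700_000_000_000, "part-0.json"))
# prints "events/dt=2023-11-14/hr=22/part-0.json"
```

Deriving the folders from the message timestamp rather than the upload time means late-arriving records still land in the hour they belong to, which is one common reason for partitioning archives this way.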
What are some alternatives?
strimzi-kafka-operator - Apache Kafka® running on Kubernetes
kafkat - KafkaT-ool
debezium - Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Scio - A Scala API for Apache Beam and Google Cloud Dataflow.
ksql - The database purpose-built for stream processing applications.
Apache Kafka - Mirror of Apache Kafka
mongo-kafka - MongoDB Kafka Connector
fluent-plugin-kafka - Kafka input and output plugin for Fluentd
kafka-ui - Open-Source Web UI for Apache Kafka Management
Thingsboard - Open-source IoT Platform - Device management, data collection, processing and visualization.
kafka-connect-elasticsearch - Kafka Connect Elasticsearch connector
graylog - Free and open log management