spark-cassandra-connector VS kafka-journal

Compare spark-cassandra-connector vs kafka-journal and see how they differ.

spark-cassandra-connector

DataStax Connector for Apache Spark to Apache Cassandra (by datastax)

kafka-journal

Event sourcing journal implementation using Kafka as main storage (by evolution-gaming)

                spark-cassandra-connector    kafka-journal
Mentions        1                            2
Stars           1,930                        110
Growth          -0.1%                        0.9%
Activity        5.1                          9.0
Last commit     7 days ago                   1 day ago
Language        Scala                        Scala
License         Apache License 2.0           MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

spark-cassandra-connector

Posts with mentions or reviews of spark-cassandra-connector. We have used some of these posts to build our list of alternatives and similar projects.
  • Reading from cassandra in Spark does not return all the data when using JoinWithCassandraTable
    1 project | /r/apachespark | 9 Mar 2022
    This works perfectly fine and I get all the data I'm expecting. However, if I change spark.cassandra.sql.inClauseToJoinConversionThreshold (see https://github.com/datastax/spark-cassandra-connector/blob/master/doc/reference.md) to something lower, like 20, I hit the threshold (my cross-product is 10*10=100) and JoinWithCassandraTable is used. I suddenly do not get all the data, and on top of that I get duplicated rows for some of it. It looks like I'm completely missing some of the partition keys, while others return duplicated rows (this quick analysis might be wrong, however).
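
The arithmetic in the post above can be sketched in a few lines. This is only an illustration of the threshold check as the poster describes it, not the connector's actual code: the function names are hypothetical, and the rule that conversion happens when the cross product is strictly greater than the threshold is an assumption.

```python
from math import prod

# Illustrative only: mirrors the arithmetic in the post, not the connector's source.
# Function names are hypothetical.


def in_clause_cross_product(in_value_sets):
    """Number of key combinations produced by several IN clauses."""
    return prod(len(values) for values in in_value_sets)


def would_convert_to_join(in_value_sets, threshold):
    """Assumed rule: convert to a join when the cross product exceeds the threshold."""
    return in_clause_cross_product(in_value_sets) > threshold


# Two IN clauses with 10 values each, as in the post (10 * 10 = 100).
in_sets = [list(range(10)), list(range(10))]
size = in_clause_cross_product(in_sets)

# With the threshold lowered to 20, the post's query crosses it.
lowered = would_convert_to_join(in_sets, threshold=20)

# With a hypothetical higher threshold of 1000, it does not.
raised = would_convert_to_join(in_sets, threshold=1000)
```

Under these assumptions, `size` is 100, so the query crosses a threshold of 20 but not one of 1000, which matches the poster's observation that lowering the setting changes which read path is used.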

kafka-journal

Posts with mentions or reviews of kafka-journal. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-10-29.
  • James Roper on the future of Lagom.
    2 projects | /r/scala | 29 Oct 2021
    Other people seem to do it even without Lightbend support, and I am sure Lightbend can do it much better: https://github.com/evolution-gaming/kafka-journal
  • Streaming journals
    1 project | /r/Akka | 9 Jun 2021
    The closest is Evolution Kafka Journal (https://github.com/evolution-gaming/kafka-journal), which is battle-tested and tuned for high load, but is barely documented and doesn't provide any tools or APIs.

What are some alternatives?

When comparing spark-cassandra-connector and kafka-journal you can also consider the following projects:

deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

akka-persistence-cassandra - A replicated Akka Persistence journal backed by Apache Cassandra

Quill - Compile-time Language Integrated Queries for Scala

Reactive-kafka - Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.

GCP Datastore Akka Persistence Plugin - akka-persistence-gcp-datastore is a journal and snapshot store plugin for akka-persistence using Google Cloud Firestore in Datastore mode.

opa-kafka-plugin - Open Policy Agent (OPA) plug-in for Kafka authorization