On Efficiently Partitioning a Topic in Apache Kafka

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • liftbridge

    Lightweight, fault-tolerant message streams.

  • https://liftbridge.io/

    Apache Pulsar might be worth a look. It's actually more complex under the hood than Kafka, but it has a lot of built-in features that either aren't in FOSS Kafka yet (like tiered storage), won't be until Confluent no longer dominates the PMC (like an integrated schema registry), or just can't be done very nicely, if at all (like decent multi-tenancy).

    That said, it's a fast-moving target. The code quality was patchy in places when I last looked, as was the documentation for both it and BookKeeper, and the admin overhead is higher (managing bookies, brokers, and ZooKeeper vs. just brokers and ZK with Kafka, or, once KRaft is production-ready, just brokers).

  • pykafka

    Discontinued Apache Kafka client for Python; high-level & low-level consumer/producer, with great performance.

  • I just wanted to mention that pykafka is currently unmaintained and archived on GitHub:

    https://github.com/Parsely/pykafka

    pykafka was originally developed and maintained by my team at Parse.ly, but we no longer maintain it. We instead encourage folks to use confluent-kafka-python, which is what we have ourselves switched to in our production systems:

    https://github.com/confluentinc/confluent-kafka-python

    (pykafka was developed at a time before Confluent invested in their own Python binding. Some of the history of the project is described in this 2016 blog post[1] and our original 2015 announcement[2].)

    [1]: https://blog.parse.ly/pykafka-now/

    [2]: https://blog.parse.ly/announcing-pykafka-python-support-for-...

  • confluent-kafka-python

    Confluent's Kafka Python Client


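For readers following the recommendation to move from pykafka to confluent-kafka-python, here is a minimal sketch of a keyed producer. It is an illustration under assumptions, not code from the thread: the broker address `localhost:9092`, the topic name, and the helper names `build_producer_config`, `delivery_report`, and `send_keyed` are all hypothetical. As long as the partition count stays the same, the client's default partitioner should route messages with the same key to the same partition, which is usually the starting point for any partitioning scheme.

```python
from typing import Iterable, Tuple


def build_producer_config(bootstrap_servers: str,
                          client_id: str = "example-producer") -> dict:
    """Build a config dict for confluent_kafka.Producer (hypothetical helper)."""
    return {
        "bootstrap.servers": bootstrap_servers,  # assumed broker address
        "client.id": client_id,
        "acks": "all",  # wait for all in-sync replicas to acknowledge
    }


def delivery_report(err, msg) -> None:
    """Delivery callback: called once per message with the final result."""
    if err is not None:
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()} [{msg.partition()}] "
              f"at offset {msg.offset()}")


def send_keyed(topic: str,
               records: Iterable[Tuple[str, str]],
               bootstrap_servers: str = "localhost:9092") -> None:
    """Produce (key, value) pairs; the key determines the target partition."""
    # Imported lazily so the helpers above work without the package installed.
    from confluent_kafka import Producer

    producer = Producer(build_producer_config(bootstrap_servers))
    for key, value in records:
        producer.produce(topic, key=key, value=value,
                         on_delivery=delivery_report)
        producer.poll(0)  # serve delivery callbacks without blocking
    producer.flush()  # block until all queued messages are delivered
```

With a broker running, a call such as `send_keyed("page-views", [("user-1", "home"), ("user-1", "pricing")])` should report both messages landing on the same partition, since they share a key.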
