Definitely used in production ;) and at considerable scale.
It runs in thousands of clusters daily, both in CSP-hosted offerings (including our own ClickHouse Cloud) and at customers running the OSS release.
Never accept any claims at face value and always test. But in this case it is quite battle-hardened (e.g. the Jepsen tests run 3x daily: https://github.com/ClickHouse/ClickHouse/tree/master/tests/j...)
But yes, ZooKeeper is pretty amazing. We are building on the backs of giants.
I'd also argue that Raft vs. ZAB is an important production-scale conversation. But, as the blog says, ZooKeeper is a better option when you require scalability with a read-heavy workload.
Coincidentally, as someone who worked on this blog, I was surprised (and pleased!) to see that we are not the only ones who felt the need to build a ZooKeeper alternative.
Looks like folks at StreamNative did as well, with their Oxia project: https://github.com/streamnative/oxia. They were just talking about this yesterday at Confluent Current ("Introducing Oxia: A Scalable Zookeeper Alternative" was the title of their talk). https://streamnative.io/blog/introducing-oxia-scalable-metad...
Seems to be a trend :)
That's a _very_ incorrect statement. You can use any OpenJDK distribution you want (OpenJDK is GPLv2 with the Classpath Exception) to run Apache ZooKeeper without needing any agreement with Oracle or paying any fee. The Oracle JDK is just Oracle's commercial version of their OpenJDK distribution with Oracle support.
You can use the OpenJDK distro shipped with your Linux distro (Red Hat, Debian, etc.), you can use Microsoft's OpenJDK distro [1], you can use the Eclipse OpenJDK distro, you can use Amazon's OpenJDK distro [3], and there are a whole bunch more.
[1] https://www.microsoft.com/openjdk
It's been a few years since I've checked in with distributed lock services. Why would someone adopt ZooKeeper after etcd gained maturity? I recall seeing benchmarks more than 5 years ago where a naive proxy like zetcd[0] outperformed ZooKeeper itself in many ways and offered more consistent latencies. etcd has gotten lots of battle-testing as Kubernetes' core datastore, but I can also see how that has shaped its design in a way that might not fit other projects.
As an anecdote, FoundationDB (and Kafka?) also replaced their usage of ZooKeeper.
[0]: https://github.com/etcd-io/zetcd
That's true - C++ libraries are typically bug-ridden and require exhaustive efforts to clean up.
But the latest bugs found by the ClickHouse continuous integration system in the related library were fixed about a year ago:
https://github.com/eBay/NuRaft/pull/373
Any thoughts here on Fly's Corrosion? https://github.com/superfly/corrosion