Apache Thrift VS Apache Avro

Compare Apache Thrift and Apache Avro to see how they differ.

                 Apache Thrift        Apache Avro
Mentions         6                    7
Stars            8,878                2,039
Growth           1.2%                 2.3%
Activity         8.8                  9.2
Latest commit    1 day ago            1 day ago
Language         C++                  Java
License          Apache License 2.0   Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

Apache Thrift

Posts with mentions or reviews of Apache Thrift. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-01-01.

Apache Avro

Posts with mentions or reviews of Apache Avro. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-01-18.
  • Serialization
    2 projects | dev.to | 18 Jan 2022
    When serializing a value, we convert it to a different sequence of bytes. This sequence is often a human-readable string (all the bytes can be read and interpreted by humans as text), but not necessarily. The serialized format can be binary. Binary data (example: an image) is still bytes, but makes use of non-text characters, so it looks like gibberish in a text editor. Binary formats won't make sense unless deserialized by an appropriate program. An example of a human-readable serialization format is JSON. Examples of binary formats are Apache Avro and Protobuf.
  • Dreaming and Breaking Molds – Establishing Best Practices with Scott Haines
    3 projects | dev.to | 8 Dec 2021
    Scott: It's like a very large row of Avro data that had everything you could possibly ever need. It was like 115 columns. Most things were null, and it became every data type you'd ever want. It's like, is it mobile? Look for mobile_. It's like, this is really crappy. I didn't know about, I guess, the hardships of data engineering at that point. Because this was the first time where I was like, okay, you're on the ground basically pulling data now, and now we're going to do stuff with it. We're going to power our whole entire application with it. And I remember that just being exciting. The gears were turning. I was waking up super early. I wanted to go in to just to work on it more. It was the first thing where it's like, man, that's just like the coolest thing in the whole entire world.
  • Apache Hudi - The Streaming Data Lake Platform
    8 projects | dev.to | 27 Jul 2021
    Hudi is designed around the notion of base file and delta log files that store updates/deltas to a given base file (called a file slice). Their formats are pluggable, with Parquet (columnar access) and HFile (indexed access) being the supported base file formats today. The delta logs encode data in Avro (row-oriented) format for speedier logging (just like Kafka topics, for example). Going forward, we plan to inline any base file format into log blocks in the coming releases, providing columnar access to delta logs depending on block sizes. Future plans also include ORC base/log file formats, unstructured data formats (free-form JSON, images), and even tiered storage layers in event-streaming systems/OLAP engines/warehouses, working with their native file formats.
  • Getting started with Kafka Connector for Azure Cosmos DB using Docker
    6 projects | dev.to | 6 Jul 2021
    So far we have dealt with JSON, a commonly used data format. But Avro is heavily used in production due to its compact format, which leads to better performance and cost savings. To make it easier to deal with Avro data schemas, there is Confluent Schema Registry, which provides a serving layer for your metadata along with a RESTful interface for storing and retrieving your Avro (as well as JSON and Protobuf) schemas. We will use the Docker version for the purposes of this blog post.
  • Tips for Designing Apache Kafka Message Payloads
    3 projects | dev.to | 29 Apr 2021
    Avro: Small and schema-driven. Apache Avro is a serialisation system that keeps the data tidy and small, which is ideal for Kafka records. The data structure is described with a schema (example below) and messages can only be created if they conform with the requirements of the schema. The producer takes the data and the schema, produces a message that goes to the Kafka broker, and registers the schema with a schema registry. The consumers do the same in reverse: take the message, ask the schema registry for the schema, and assemble the full data structure. Avro has a strong respect for data types, requires all payloads to conform with the schema, and since data such as field names is encoded in the schema rather than repeated in every payload, the overall payload size is reduced.
  • Scala 3.0 serialization
    5 projects | reddit.com/r/scala | 30 Mar 2021
    For binary serialization using Avro there's Vulcan which is released for 3.0.0-RC1 and will shortly be released for 3.0.0-RC2. (Disclosure: I'm a maintainer)
  • Looking for simple avro like serialization format
    3 projects | reddit.com/r/rust | 22 Jan 2021
    You can make use of the official C library with rust-bindgen and wrap what you need from there.
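
Several of the posts above contrast human-readable JSON with schema-driven binary formats, where field names live in the schema instead of in every payload. The following is a minimal sketch of that idea in plain Python. It is not the Avro library: it is a hand-rolled encoder that handles only flat records with "long" and "string" fields, following the zigzag varint and length-prefixed string encodings described in the Avro specification. The schema, the record, and all function names are hypothetical and exist only for illustration.

```python
import json

# Hypothetical schema, made up for this example. The field names live here,
# not in the encoded payload.
SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}


def encode_long(n):
    """Zigzag-encode a signed 64-bit integer, then write it as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Read one zigzag varint starting at pos; return (value, next position)."""
    z = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(schema, record):
    """Concatenate the field values in schema order; no field names are written."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] == "long":
            out += encode_long(value)
        elif field["type"] == "string":
            raw = value.encode("utf-8")
            out += encode_long(len(raw)) + raw  # length prefix, then UTF-8 bytes
        else:
            raise ValueError("unsupported type in this sketch: " + field["type"])
    return bytes(out)


def decode_record(schema, buf):
    """Rebuild the record by walking the same schema over the bytes."""
    pos = 0
    record = {}
    for field in schema["fields"]:
        if field["type"] == "long":
            record[field["name"]], pos = decode_long(buf, pos)
        elif field["type"] == "string":
            length, pos = decode_long(buf, pos)
            record[field["name"]] = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("unsupported type in this sketch: " + field["type"])
    return record


record = {"id": 42, "name": "Ada"}
as_json = json.dumps(record).encode("utf-8")
as_binary = encode_record(SCHEMA, record)
print(len(as_json), len(as_binary))
print(decode_record(SCHEMA, as_binary))
```

For this record the binary payload is five bytes (0x54 for the id, 0x06 for the string length, then the bytes of "Ada"), while the JSON text repeats both field names. The trade-off is that the bytes cannot be interpreted without the schema, which is why the reader has to obtain it from somewhere, for example a schema registry as the posts describe.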

What are some alternatives?

When comparing Apache Thrift and Apache Avro you can also consider the following projects:

gRPC - The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)

Protobuf - Protocol Buffers - Google's data interchange format

SBE - Simple Binary Encoding (SBE) - High Performance Message Codec

ZeroMQ - ZeroMQ core engine in C++, implements ZMTP/3.1

Cap'n Proto - Cap'n Proto serialization/RPC system - core tools and C++ library

Apache Parquet - Apache Parquet

iceberg - Apache Iceberg

nanomsg - nanomsg library

Big Queue - A big, fast and persistent queue based on memory mapped file.

Apache Orc - Apache ORC - the smallest, fastest columnar storage for Hadoop workloads