Avro

Top 23 Avro Open-Source Projects

  • Apache Avro

    Apache Avro is a data serialization system.

  • Project mention: Open Table Formats Such as Apache Iceberg Are Inevitable for Analytical Data | news.ycombinator.com | 2024-01-18

    Apache AVRO [1] is one but it has been largely replaced by Parquet [2] which is a hybrid row/columnar format

    [1] https://avro.apache.org/

  • rq

    Record Query - A tool for doing record analysis and transformation (by dflemstr)

  • schema-registry

    Confluent Schema Registry for Kafka

  • Project mention: JR, quality Random Data from the Command line, part I | dev.to | 2023-05-07

    So, is JR yet another faking library written in Go? Yes and no. JR indeed implements most of the APIs in fakerjs and Go fake it, but it's also able to stream data directly to stdout, Kafka, Redis and more (Elastic and MongoDB coming). JR can talk directly to Confluent Schema Registry, manage json-schema and Avro schemas, easily maintain coherence and referential integrity. If you need more than what is OOTB in JR, you can also easily pipe your data streams to other cli tools like kcat thanks to its flexibility.

  • examples

    Apache Kafka and Confluent Platform examples and demos (by confluentinc)

  • DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

  • Project mention: LongRoPE: Extending LLM Context Window Beyond 2M Tokens | news.ycombinator.com | 2024-02-22

    It's been possible to skip tokenization for a long time, my team and I did it here - https://github.com/capitalone/DataProfiler

    For what it's worth, we actually were working with LSTMs with nearly a billion params back in 2016-2017 area. Transformers made it far more effective to train and execute, but ultimately LSTMs are able to achieve similar results, though slow & require more training data.

  • avsc

    Avro for JavaScript :zap:

  • pmacct

    pmacct is a small set of multi-purpose passive network monitoring tools [NetFlow IPFIX sFlow libpcap BGP BMP RPKI IGP Streaming Telemetry].

  • Project mention: NetFlow-equivalent analysis for mirrored traffic | /r/networking | 2023-07-12

    If you want a tool that can ingest from a span port and generate netflow or IPFIX there is pmacct. This should work with your existing tooling that collects netflow data.

  • adam

    ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

  • kafkactl

    Command Line Tool for managing Apache Kafka

  • Cinchoo ETL

    ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)

  • Avro4s

    Avro schema generation and serialization / deserialization for Scala

  • Project mention: Scala 3 Macros: How to Read Annotations | /r/scala | 2023-06-26

    for example https://github.com/sksamuel/avro4s - check AvroName and AvroNamespace

  • vscode-data-preview

    Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files

  • SlimMessageBus

    Lightweight message bus interface for .NET (pub/sub and request-response) with transport plugins for popular message brokers.

  • NoProto

    Flexible, Fast & Compact Serialization with RPC

  • compendium-client

    Mu (μ) is a purely functional framework for building micro services.

  • mongo-kafka

    MongoDB Kafka Connector

  • Project mention: Difficulty configuring log4j when deploying code as plugin for an app | /r/CodingHelp | 2023-10-27

    I am working on a custom Kafka-Mongo sink connector (specifically, a custom WriteModelStrategy to be used with the official Mongo sink connector here: https://github.com/mongodb/mongo-kafka ). My code is not a standalone, executable Java application but rather a JAR that augments the functionality of another Java application.

  • kafka-connect-file-pulse

    🔗 A multipurpose Kafka Connect connector that makes it easy to parse, transform and stream any file, in any format, into Apache Kafka

  • Project mention: Kafka Connect Filepulse 2.13.0 is now available! This version includes support for SFTP and Alibaba OSS. It also contains many bug fixes and improvements. 🚀 | /r/apachekafka | 2023-09-15

  • ABRiS

    Avro SerDe for Apache Spark structured APIs.

  • srclient

    Golang Client for Schema Registry

  • rumble

    ⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more (by RumbleDB)

  • avro4k

    Avro support for kotlinx.serialization

  • Project mention: People who use Spring and Kotlin... | /r/Kotlin | 2023-09-26

    I've just recently implemented Google Pub/Sub with Avro serialization using Avro4k library and it works fine https://github.com/avro-kotlin/avro4k

  • clickhouse-sink-connector

    Replicate data from MySQL, Postgres and MongoDB to ClickHouse

  • datagen

    Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.

  • Project mention: What are your favorite tools or components in the Kafka ecosystem? | /r/apachekafka | 2023-05-31

    For fake data, shameless plug for https://github.com/MaterializeInc/datagen/tree/main

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Avro projects? This list will help you:

 #  Project                    Stars
 1  Apache Avro                2,756
 2  rq                         2,254
 3  schema-registry            2,136
 4  examples                   1,845
 5  DataProfiler               1,362
 6  avsc                       1,245
 7  pmacct                     1,014
 8  adam                         967
 9  kafkactl                     748
10  Cinchoo ETL                  735
11  Avro4s                       716
12  vscode-data-preview          522
13  SlimMessageBus               433
14  NoProto                      362
15  compendium-client            326
16  mongo-kafka                  322
17  kafka-connect-file-pulse     305
18  ABRiS                        221
19  srclient                     221
20  rumble                       207
21  avro4k                       180
22  clickhouse-sink-connector    173
23  datagen                      133
