sirix VS zed

Compare sirix vs zed and see what their differences are.

sirix

SirixDB is an embeddable, bitemporal, append-only database system and event store, storing immutable lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach. (by sirixdb)

zed

A novel data lake based on super-structured data (by brimdata)
              sirix                                      zed
Mentions      44                                         13
Stars         1,083                                      1,305
Growth        2.0%                                       2.5%
Activity      9.1                                        9.5
Last commit   7 days ago                                 8 days ago
Language      Java                                       Go
License       BSD 3-clause "New" or "Revised" License    BSD 3-clause "New" or "Revised" License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

sirix

Posts with mentions or reviews of sirix. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-24.
  • Show HN: Integer Map Data Structure
    3 projects | news.ycombinator.com | 24 Jan 2024
    We're using a similar trie structure as the main document (node) index in SirixDB[1]. Lately, I got some inspiration from ART and HAMT for different page sizes, basically for the rightmost inner pages: the node IDs are generated by a simple sequence generator, so all inner pages (we call them IndirectPage) except for the rightmost ones are fully occupied (the tree height is adapted dynamically depending on the size of the stored data). Currently, 1024 references to indirect child pages are always stored, but I'll experiment with smaller sizes, as the inner nodes are simply copied for each new revision, whereas the leaf pages storing the actual data are versioned themselves with a novel sliding snapshot algorithm.

    From the unique 64-bit nodeId assigned to each record, you can compute the page and the reference to follow on each level of the trie with some bit shifting.

    [1] https://github.com/sirixdb/sirix
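The bit shifting described in this post can be sketched roughly as follows. This is a minimal illustration, not SirixDB's code: the fan-out of 1024 references per indirect page comes from the post, while the leaf page size (`LEAF_BITS`) and the function names `trie_height` and `trie_path` are assumptions made for the example.

```python
# Sketch only. FANOUT_BITS reflects the 1024 references mentioned in the post;
# LEAF_BITS and all function names are hypothetical.
FANOUT_BITS = 10                       # 2**10 = 1024 references per indirect page
FANOUT_MASK = (1 << FANOUT_BITS) - 1
LEAF_BITS = 10                         # assumed: 1024 records per leaf page


def trie_height(max_page_key):
    """Number of indirect-page levels needed to address max_page_key."""
    height = 1
    while max_page_key >> (FANOUT_BITS * height):
        height += 1
    return height


def trie_path(node_id, height):
    """Return (reference offsets from the root down, slot inside the leaf page)."""
    page_key = node_id >> LEAF_BITS                 # which leaf page holds the record
    slot = node_id & ((1 << LEAF_BITS) - 1)         # position inside that leaf page
    offsets = [
        (page_key >> (FANOUT_BITS * level)) & FANOUT_MASK
        for level in range(height - 1, -1, -1)      # topmost level first
    ]
    return offsets, slot
```

Because node IDs come from a sequence generator, the height only grows when the largest page key no longer fits, which matches the remark that the tree height adapts to the amount of stored data.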

  • Endatabas: A SQLite-inspired, SQL document database with full history
    3 projects | news.ycombinator.com | 1 Dec 2023
    I'm working on something similar for the JVM, though not with document semantics but at a much finer-grained level.

    JSON is shredded during an initial import into a tree structure with fine-grained nodes. Thus, an import can be done with very low memory consumption (provided that auto-commit issues a sync to disk before RAM space is exceeded). Furthermore, it doesn't require a WAL for consistency. Instead, the indexes are stored in a log structure using a persistent tree (as in, every commit creates a new tree root). A sliding snapshot algorithm makes sure that only a fragment of a page has to be copied on a write.

    Thus, it's also a perfect candidate for an event store, storing both the (lightweight) snapshots and, optionally, tracking the changes.

    https://github.com/sirixdb/sirix

    The architecture is described over here:

    https://sirix.io/docs/concepts.html

    Furthermore, I'm working on a tutorial for local client usage (work in progress):

    https://sirix.io/docs/jsoniq-tutorial.html

    Kind regards

  • Show HN: Bitemporal, Binary JSON Based DBS and Event Store
    6 projects | news.ycombinator.com | 13 Nov 2023
    If anyone is up to building a new frontend, that would be awesome (of course, work could also be split between interested people) :-)

    https://github.com/sirixdb/sirix/issues/627

  • Show HN: Light implementation of Event Sourcing using PostgreSQL as event store
    9 projects | news.ycombinator.com | 31 Oct 2023
    I'm working on an append-only (immutable) (bi)temporal DBS[1] in my spare time. It transforms CRUD operations into an event store, automatically providing an audit log for each stored node, while the nodes are stored with immutable node IDs, which never change. As the stored contents are based on a custom binary JSON format, a rolling hash can also optionally be built to check whether a whole subtree has changed.

    The system uses persistent index data structures to share unchanged pages between revisions.

    Intermittent full snapshots are omitted. Rather, the snapshot is spread over several revisions by applying a sliding snapshot algorithm to the data pages (thus avoiding write peaks, while at most a predefined number of page fragments has to be read in parallel to reconstruct a page in memory).

    [1] https://sirix.io | https://sirix.io/docs/concepts.html
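One way to read the sliding snapshot idea from this post is sketched below. It is an illustration under stated assumptions, not SirixDB's implementation: the window size `WINDOW`, the dict-based page fragments, and the function names are all hypothetical, and deletions are ignored. Each new fragment holds the changed records plus any record that would otherwise slide out of the window, so a page can always be rebuilt from at most `WINDOW` fragments.

```python
WINDOW = 3  # assumed: maximum number of fragments read to rebuild a page


def write_fragment(fragments, changes):
    """Append the page fragment for a new revision.

    fragments: list of dicts (slot -> value), oldest first.
    changes:   dict of the slots modified in the new revision.
    """
    n = len(fragments)
    fragment = dict(changes)
    if n >= WINDOW:
        leaving = fragments[n - WINDOW]        # slides out of the window
        staying = fragments[n - WINDOW + 1:]   # still readable afterwards
        for slot, value in leaving.items():
            # Carry over records that no newer fragment in the window holds.
            if slot not in fragment and all(slot not in f for f in staying):
                fragment[slot] = value
    fragments.append(fragment)


def reconstruct(fragments, revision=None):
    """Rebuild the page as of a 1-based revision; the newest fragment wins."""
    end = len(fragments) if revision is None else revision
    page = {}
    for fragment in reversed(fragments[max(0, end - WINDOW):end]):
        for slot, value in fragment.items():
            page.setdefault(slot, value)
    return page
```

In this sketch no revision ever writes the full page again after the first one, which is where the avoided write peaks come from, and older revisions stay reconstructible from their own window of fragments.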

  • Show HN: Evolutionary (binary) JSON data store (full immutable revision history)
    3 projects | news.ycombinator.com | 21 Oct 2023
    I already posted the project a couple of years ago and it gained some interest, but a lot has been done since then, especially regarding performance: a completely new JSON store, a REST API, various refactored internals, an improved JSONiq-based query engine allowing updates, a now already dated web UI, a new Kotlin-based CLI, and a Python and a TypeScript client to ease the use of Sirix...

    The first prototypes of a precursor date back to 2005.

    So, what is it all about?

    I'm working on an evolutionary data store in my spare time[1]. It is based on the idea of getting rid of the need for a second trx log (the WAL) by using a persistent tree-of-tries index (preserving the previous revision through copy-on-write and path copying to the root) as the log itself. Only a single read/write txn is permitted at a time, in parallel to N read-only txns, which are bound to specific revisions when they start. The single writer is permitted on a per-resource basis within a database (a resource is comparable to a table/relation in a relational DB); reads do not involve any locks at all.

    The idea is that the system atomically swaps the tree root to the new version (replicated). If something fails, the log can simply be truncated to the former tree root.
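Copy-on-write with path copying and a root swap per commit can be sketched as below. To keep the example short it uses a plain binary search tree rather than the tree of tries the post describes, and `Node`, `put`, `get`, and `Resource` are hypothetical names, not SirixDB's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def put(node, key, value):
    """Return a new root; only the nodes on the path to key are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, put(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, put(node.right, key, value))
    return Node(key, value, node.left, node.right)


def get(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


class Resource:
    """Keeps one immutable root per revision; revision 0 is empty."""

    def __init__(self):
        self.roots = [None]

    def commit(self, changes):
        root = self.roots[-1]
        for key, value in changes.items():
            root = put(root, key, value)
        self.roots.append(root)      # publishing the new root is the "swap"
        return len(self.roots) - 1   # the new revision number

    def read(self, key, revision=-1):
        return get(self.roots[revision], key)
```

Readers bound to an older revision keep using their root untouched, which is why this style of design needs no read locks, and unchanged subtrees are shared between revisions by reference.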

    Thus, the system has many similarities with Git (structural sharing of unchanged nodes/pages) and ZFS snapshots (regarding the latter the keyed trie has been inspired by ZFS, as well as that checksums for child pages are stored in parent pages in the references to the child pages)[2].

    You can of course simply execute time travel queries on the whole revision history, add commit comments and the author to answer questions such as who committed what at which point in time and why...

    The system does not simply copy full data pages; instead, it applies a sliding snapshot versioning algorithm to keep storage space to a minimum.

    Thus, it's best suited for fast flash drives with fast random reads and sequential writes. Data is never overwritten, thus audit trails are given for free.

    The system stores fine-grained JSON nodes, thus the structure and size of an object have almost no limits. A path summary is built, which is an unordered set of all paths to leaf nodes in the tree and enables various optimizations. Furthermore, a rolling hash is optionally built; during inserts, all ancestor node hashes are adapted.
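Adapting ancestor hashes on insert can be illustrated with a small sketch. The post does not say which hash function or combination rule SirixDB uses, so the additive scheme below (subtree hash = own hash plus the children's subtree hashes, modulo 2^64) and the names `own_hash`, `Node`, and `recompute` are assumptions chosen for the example.

```python
import hashlib

MASK = (1 << 64) - 1  # keep hashes in 64 bits


def own_hash(node_id, value):
    """Hash of a single node, ignoring its descendants."""
    digest = hashlib.sha256(f"{node_id}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class Node:
    def __init__(self, node_id, value, parent=None):
        self.node_id = node_id
        self.value = value
        self.parent = parent
        self.children = []
        self.hash = own_hash(node_id, value)
        if parent is not None:
            parent.children.append(self)
            # Adapt every ancestor by adding the inserted node's hash.
            ancestor = parent
            while ancestor is not None:
                ancestor.hash = (ancestor.hash + self.hash) & MASK
                ancestor = ancestor.parent


def recompute(node):
    """Subtree hash computed from scratch, used here only as a check."""
    total = own_hash(node.node_id, node.value)
    for child in node.children:
        total = (total + recompute(child)) & MASK
    return total
```

With such a scheme, an insert costs one update per ancestor, and comparing the hash of a node in two revisions is enough to tell whether anything in its subtree changed.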

    Furthermore it optionally keeps track of update operations and the ctx nodes involved during txn commits. Thus, you can easily get the changes between revisions, you can check the full history of nodes, as well as navigate in time to the first revision, the last revision, the next and previous revision of a node...

    You can also open a revision at a specific system time, revert to a revision, and commit a new version while preserving all revisions in between.

    As mentioned, objects can be arbitrarily nested, so there are almost no limits on their number, and updates are cheap.

    A dated Jupyter notebook with some examples can be found in [3] and overall documentation in [4].

    The query engine[5] Brackit is retargetable (a couple of interfaces and rewrite rules have to be implemented for DB systems) and especially finds implicit joins and applies known algorithms from the relational DB systems world to optimize joins and aggregate functions due to set-oriented processing of the operators.[6]

    I've given an interview in [7], but I'm usually very nervous, so don't judge too harshly.

    Give it a try and happy coding!

    Kind regards

    Johannes

    [1] https://sirix.io | https://github.com/sirixdb/sirix

    [2] https://sirix.io/docs/concepts.html

    [3] https://colab.research.google.com/drive/1NNn1nwSbK6hAekzo1YbED52RI3NMqqbG#scrollTo=CBWQIvc0Ov3P

    [4] https://sirix.io/docs/

    [5] http://brackit.io

    [6] https://colab.research.google.com/drive/19eC-UfJVm_gCjY--koOWN50sgiFa5hSC

    [7] https://youtu.be/Ee-5ruydgqo?si=Ift73d49w84RJWb2

  • Evolutionary, JSON data store (keeping the full revision history)
    3 projects | news.ycombinator.com | 20 Oct 2023
  • Immutable Data
    2 projects | news.ycombinator.com | 26 Jun 2023
    You can use Datomic for instance (mentioned already in your article IIRC!?) or SirixDB[1], on which I'm working in my spare time.

    The idea is an indexed append-only log-structure and to use a functional tree structure (sharing unchanged nodes between revisions) plus a novel algorithm to balance incremental and full dumps of database pages using a sliding window instead.

    [1] https://sirix.io | https://github.com/sirixdb/sirix

  • Java opensource projects that need help from community.
    13 projects | /r/java | 20 May 2023
    Append-only database system (based on a persistent index structure): https://github.com/sirixdb/sirix or a retargetable query compiler https://github.com/sirixdb/brackit
  • Looking to help out on some open source projects
    4 projects | /r/opensource | 17 Apr 2023
    You can work on a temporal data store called SirixDB: https://github.com/sirixdb/sirix
  • SirixDB - an embeddable, evolutionary database system
    2 projects | /r/java | 3 Apr 2023

zed

Posts with mentions or reviews of zed. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-05.
  • Ask HN: What projects are trying to reinvent core software infrastructure?
    2 projects | news.ycombinator.com | 5 Aug 2023
  • The Zed Project | Zed
    1 project | /r/dataengineering | 25 May 2023
    1 project | /r/programming | 25 May 2023
  • VAST 3.0 released. Open-Source Security Data Pipelines with Kusto-like syntax
    2 projects | /r/cybersecurity | 15 Mar 2023
    VAST is an open-source SecDataOps project for working with data from open-source security tools. Version 3.0 adds a pipeline syntax similar to Splunk, Kusto, PRQL, and Zed.
  • The Magic of Small Databases
    3 projects | news.ycombinator.com | 28 Jan 2023
  • zed
    1 project | /r/devopspro | 20 May 2022
  • Super-Structured Data: Rethinking the Schema
    3 projects | news.ycombinator.com | 17 May 2022
    Cool, I didn't realize you used sqlite-utils for your performance demo!

    It's not particularly designed for speed - it should be fast as far as Python code goes (I use some generator tricks to stream data and avoid having to load everything into memory at once) but I wouldn't expect "sqlite-utils insert" to win any performance competitions with tools written in other languages.

    Those benchmarks against sqlite itself are definitely interesting. I'm looking forward to playing with the "native ZNG support for Python" mentioned on https://github.com/brimdata/zed/blob/main/docs/libraries/pyt... when that becomes available.

  • Zq: An Easier (and Faster) Alternative to Jq
    36 projects | news.ycombinator.com | 26 Apr 2022
    Hi, all. Author here. Thanks for all the great feedback.

    I've learned a lot from your comments and pointers.

    The Zed project is broader than "a jq alternative" and my bad for trying out this initial positioning. I do know there are a lot of people out there who find jq really confusing, but it's clear if you become an expert, my arguments don't hold water.

    We've had great feedback from many of our users who are really productive with the blend of search, analytics, and data discovery in the Zed language, and who find manipulating eclectic data in the ZNG format to be really easy.

    Anyway, we'll write more about these other aspects of the Zed project in the coming weeks and months, and in the meantime, if you find any of this intriguing and want to kick the tires, feel free to hop on our slack with questions/feedback or file GitHub issues if you have ideas for improvements or find bugs.

    Thanks a million!

    https://github.com/brimdata/zed

  • The many uses of mock data
    4 projects | dev.to | 1 Jan 2022
    In my observation, mock data has tended to be used in a rather loose, slipshod, careless manner. Unlike documentation, it is treated as the garbage of software material. (Sometimes even referred to as "garbage data"). People will try to avoid writing it by using elaborate "generators" such as jFairy or zed.
  • Internet Object – A JSON alternative data serialization format
    6 projects | news.ycombinator.com | 24 Oct 2021
    There are a few examples in the ZSON spec...

    https://github.com/brimdata/zed/blob/main/docs/formats/zson....

    And you can easily see whatever data you'd like formatted as ZSON using the "zq" CLI tool, but I just made this gist (with some data from the brimdata/zed-sample-data repo) so you can have a quick look (the bstring stuff is a little noisy and an artifact of the data source being Zeek)... https://gist.github.com/mccanne/94865d557ca3de8abfd3eb09e8ac...

What are some alternatives?

When comparing sirix and zed you can also consider the following projects:

CXXGraph - Header-Only C++ Library for Graph Representation and Algorithms

simdjson - Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

keycloak-kafka - Keycloak module to produce events to kafka

yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor

hash4j - Dynatrace hash library for Java

jid - json incremental digger

sqlglot - Python SQL Parser and Transpiler

feedback - Public feedback discussions for: GitHub for Mobile, GitHub Discussions, GitHub Codespaces, GitHub Sponsors, GitHub Issues and more! [Moved to: https://github.com/github-community/community]

Sinatra - Classy web-development dressed in a DSL (official / canonical repo)

gojq - Pure Go implementation of jq

scim-for-keycloak - a third party module that extends keycloak by SCIM functionality

awesome-semantic-web - A curated list of various semantic web and linked data resources.