sirix

SirixDB is an embeddable, temporal, evolutionary database system, which uses an append-only approach to store immutable revisions. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach. (by sirixdb)

sirix reviews and mentions

Posts with mentions or reviews of sirix. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-07-29.
  • Semantic Diff for SQL
    7 projects | news.ycombinator.com | 29 Jul 2022
  • Dolt Is Git for Data
    5 projects | news.ycombinator.com | 23 Jun 2022
    My research project[1], which I'm working on in my spare time, is all about versioning and efficiently storing small-sized revisions of the data, as well as allowing sophisticated time travel queries for audits and analysis.

    Of course all secondary user-defined, typed indexes are also versioned.

    Basically, the technical idea is to map a huge tree of index tries (with revisions as indexed leaf pages at the top level and a document index as well as secondary indexes on the second level) to an append-only file. To reduce write amplification and the size of each snapshot, data pages are compressed and also versioned through a sliding snapshot algorithm. Thus, Sirix does not simply do a copy-on-write per page. Instead, it writes the nodes that have changed in the current revision plus the nodes that fall out of the sliding window (which is why it needs a drive with fast random reads).

    [1] https://github.com/sirixdb/sirix
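
The sliding snapshot idea described in the comment above can be sketched roughly as follows. This is a minimal illustration, not SirixDB's actual implementation: the names `WINDOW`, `commit`, and `reconstruct` are hypothetical, records are plain dictionary entries, and compression is left out. Each commit writes the changed records plus any record whose only copy would otherwise slide out of the window, so a full page can always be rebuilt from at most `WINDOW` fragments.

```python
from typing import Dict, List, Optional

WINDOW = 3  # hypothetical window size: a page is rebuilt from at most this many fragments

# One page fragment: record key -> value (None marks a deleted record).
Fragment = Dict[int, Optional[str]]


def commit(fragments: List[Fragment], changes: Fragment, window: int = WINDOW) -> None:
    """Append the page fragment for a new revision."""
    new_revision = len(fragments)
    leaving = new_revision - window  # index of the fragment that slides out of the window
    fragment: Fragment = {}
    if leaving >= 0:
        # Keys that a newer fragment inside the window already covers.
        still_covered = set()
        for newer in fragments[leaving + 1:]:
            still_covered.update(newer)
        # Carry over live records whose only copy is in the leaving fragment.
        for key, value in fragments[leaving].items():
            if value is not None and key not in still_covered and key not in changes:
                fragment[key] = value
    fragment.update(changes)
    fragments.append(fragment)


def reconstruct(fragments: List[Fragment], revision: int, window: int = WINDOW) -> Dict[int, str]:
    """Rebuild the full page as of `revision` from at most `window` fragments."""
    first = max(0, revision - window + 1)
    page: Fragment = {}
    for fragment in fragments[first:revision + 1]:  # oldest to newest
        page.update(fragment)  # newer fragments override older ones
    return {key: value for key, value in page.items() if value is not None}


fragments: List[Fragment] = []
commit(fragments, {1: "a", 2: "b", 3: "c"})  # revision 0
commit(fragments, {2: "b2"})                 # revision 1
commit(fragments, {3: None})                 # revision 2 deletes record 3
commit(fragments, {4: "d"})                  # revision 3 also carries record 1 forward
```

In this sketch, revision 3 stores only record 4 and the carried-over record 1, yet `reconstruct(fragments, 3)` still yields the complete page by reading the three newest fragments. Older revisions remain readable the same way, since no fragment is ever overwritten.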

  • Ask HN: Do you prefer Svelte or SolidJS?
    4 projects | news.ycombinator.com | 2 Jun 2022
    Hello,

    I want to find enthusiastic OSS frontend developers for my JSON data store project[1], which is able to retain the full revision history of a database resource (binary JSON) through small-sized copy-on-write snapshots of the main index tree of tries and a novel sliding snapshot algorithm.

    As I'm a fan of compilers (http://brackit.io), I think either working on the current frontend with Svelte[2], which is currently really dated and uses Sapper, or building a new frontend using SolidJS would be great.

    What are the advantages/disadvantages of both frameworks in your opinion? I'm a backend software engineer, but maybe SolidJS is more familiar to frontend devs because of JSX and at least in benchmarks it seems to be faster. But maybe the differences except for the different syntaxes aren't that big.

    I envision visualizations for comparing revisions of resources or subtrees therein and also to visualize time travel queries. A screenshot of the old frontend: https://github.com/sirixdb/sirix/blob/master/Screenshot%20from%202020-09-28%2018-50-58.png

    Let me know which framework you'd prefer for the task at hand and what, in your opinion, the advantages/disadvantages of both are in general.

    If you want to help, it's even better. Let me know :-)

    [1] https://sirix.io || https://github.com/sirixdb/sirix

  • Implementing a Merkle Tree for an Immutable Verifiable Log
    2 projects | news.ycombinator.com | 6 May 2022
    It iterates through all revisions of a node and returns the node and the revision (and we could also add the author) when it has been updated.

    To make this even easier I could write a native Java function for use in the query.

    [1] https://sirix.io

  • Zq: An Easier (and Faster) Alternative to Jq
    36 projects | news.ycombinator.com | 26 Apr 2022
    That's one of the main steps forward for Brackit, a retargetable JSONiq query engine/compiler (http://brackit.io), the append-only data store SirixDB (https://sirix.io), and a new web frontend. My vision is to explore not only the most recent revision but also any older revision, to display the diffs, to display the results of time travel queries... Help is highly welcome, as I'm a backend engineer myself and working on the query engine and the data store itself :-)
  • Postgres Auditing in 150 lines of SQL
    10 projects | news.ycombinator.com | 9 Mar 2022
    As posted in another comment, that's basically what https://github.com/sirixdb/sirix supports, along with easily reconstructing former revisions of a JSON document, sophisticated secondary (also versioned) indexes, querying with JSONiq, and optimizations at query compile time, such as for joins and aggregates....
    10 projects | news.ycombinator.com | 9 Mar 2022
    I'd argue that's still a lot of work to do manually. However, great work and detail, thanks a lot :-)

    I'm working on a database system[1] in my spare time, which automatically retains all revisions and assigns revision timestamps during commits (a single timestamp in the RevisionRootPage). Furthermore, it is tamper-proof, and the whole storage can be verified by comparing a single UberPage hash, as in ZFS.

    Basically, it is a persistent trie-based revision index (plus document and secondary indexes) mapped to durable storage, a simple log-structured append-only file. A second file tracks revision offsets to provide binary search on an in-memory map of timestamps. As the root of the tree is atomically swapped, it does not need a WAL, which is basically another data file and can be omitted in this case.

    Besides versioning the data itself in a binary encoding similar to BSON, it tracks changes and writes simple JSON diff files for each new revision.

    Furthermore, the data pages are not simply copied on write; a sliding snapshot algorithm makes sure that mainly only changed records have to be written. Before the page fragments are written to durable storage, they are compressed and, in the future, might be encrypted.

    [1] https://sirix.io | https://github.com/sirixdb/sirix
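
The timestamp lookup mentioned in the comment above (binary search over an in-memory map of revision timestamps and file offsets) might look roughly like this. It is a sketch under assumptions: the class `RevisionIndex`, its methods, and the integer timestamps are hypothetical and say nothing about SirixDB's real file format. It relies only on commits being appended in timestamp order.

```python
from bisect import bisect_right
from typing import List, Tuple


class RevisionIndex:
    """Hypothetical in-memory view of the second file: one entry per revision."""

    def __init__(self) -> None:
        self._timestamps: List[int] = []  # commit timestamps in commit order (ascending)
        self._offsets: List[int] = []     # file offset of each revision's root page

    def append(self, timestamp: int, offset: int) -> None:
        """Record a newly committed revision."""
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("revisions must be appended in timestamp order")
        self._timestamps.append(timestamp)
        self._offsets.append(offset)

    def lookup(self, timestamp: int) -> Tuple[int, int]:
        """Return (revision number, offset) of the latest revision committed
        at or before `timestamp`, found by binary search."""
        index = bisect_right(self._timestamps, timestamp) - 1
        if index < 0:
            raise LookupError("no revision exists at or before this timestamp")
        return index, self._offsets[index]


index = RevisionIndex()
index.append(100, 0)
index.append(200, 4096)
index.append(300, 9000)
revision, offset = index.lookup(250)  # a point in time between the second and third commit
```

Here `lookup(250)` resolves to the second revision (number 1), because that was the newest commit at that point in time. A time travel query would then open the root page stored at the returned offset.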

    10 projects | news.ycombinator.com | 9 Mar 2022
    This is precisely what https://github.com/sirixdb/sirix does. A resource in a database is stored in a huge persistent structure of index pages.

    The main index is a trie, which indexes revision numbers. The leaf nodes of this trie are "RevisionRootPages". Under each RevisionRootPage, another trie indexes the main data. Data is addressed through dense, unique, and stable 64-bit integer nodeKeys. Furthermore, the user-defined secondary indexes are currently also stored as further tries under a RevisionRootPage.

    The last layer of inner pages in a trie adds references to a predefined maximum number of data page fragments. The copy-on-write architecture does not simply copy whole data pages; what is copied depends on the versioning algorithm. The default is a sliding snapshot algorithm, which copies changed/inserted/deleted nodes plus nodes that fall out of a predefined window (the window is usually small, as the page fragments have to be read from random locations in parallel to reconstruct a full page). This reduces the amount of data to store for each new revision. The inner pages of the trie (as well as the data pages) are not page-aligned, so they might be small. Furthermore, they are compressed before being written to persistent storage.

    Currently, it offers a single read-write transaction on a resource plus read-only transactions without any locks.
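
One way to picture how a dense integer nodeKey could be resolved through such a trie is to split the key into a data page number and a slot, and then slice the page number into one child offset per trie level. The sketch below is purely illustrative: the constants `RECORDS_PER_PAGE`, `FANOUT_BITS`, and `LEVELS` and the function `trie_path` are assumptions made for this example, not values or names taken from SirixDB.

```python
from typing import List, Tuple

RECORDS_PER_PAGE = 512  # hypothetical number of records per data page
FANOUT_BITS = 10        # hypothetical: each inner page holds 2**10 references
LEVELS = 4              # hypothetical height of the trie


def trie_path(node_key: int) -> Tuple[List[int], int]:
    """Map a dense node key to (child offsets from the top level down, slot in the data page)."""
    if node_key < 0:
        raise ValueError("node keys are non-negative")
    page_key, slot = divmod(node_key, RECORDS_PER_PAGE)
    if page_key >= 1 << (FANOUT_BITS * LEVELS):
        raise ValueError("node key does not fit into a trie of this height")
    mask = (1 << FANOUT_BITS) - 1
    # Take FANOUT_BITS bits of the page key per level, most significant bits first.
    path = [(page_key >> (FANOUT_BITS * level)) & mask for level in reversed(range(LEVELS))]
    return path, slot


path, slot = trie_path(512 * 1024 + 5)
```

With these assumed constants, the key `512 * 1024 + 5` lands in data page 1024 at slot 5, which is reached by following child offsets `[0, 0, 1, 0]`. Because keys are dense and stable, the path to a record never changes between revisions; only the page fragments it points to differ.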

  • Select, put and delete data from JSON, TOML, YAML, XML and CSV files
    11 projects | news.ycombinator.com | 7 Mar 2022
    Regarding XQuery we just added JSON querying on top in Brackit[1] / SirixDB[2].

    Brackit is a retargetable query compiler and does a lot of optimizations at compile time, for instance optimizing joins and aggregations. It is usable as an in-memory processor or as the query processor of a database system.

    The Ph.D. thesis of Sebastian:

    Separating Key Concerns in Query Processing - Set Orientation, Physical Data Independence, and Parallelism

    http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/publicatio...

    [1] http://brackit.io

    [2] https://sirix.io

  • Brackit: A retargetable JSONiq query engine
    4 projects | news.ycombinator.com | 2 Mar 2022
    Hi all,

    Sebastian and his students did a tremendous job creating Brackit[1] in the first place as a retargetable query engine for different data stores. They worked hard to optimize aggregations and joins. Despite its clear database query engine roots, it's also usable as a standalone ad-hoc in-memory query engine.

    Sebastian did his research for his Ph.D. at the TU-Kaiserslautern at the database systems group of Theo Härder. Together with Andreas Reuter, Theo Härder coined the well-known acronym ACID for the desired properties of transactions.

    As he's currently not maintaining the project anymore, I stepped up and forked the project a couple of years ago. I'm using it for my evolutionary, immutable data store SirixDB[2], which stores the entire history of your JSON data in small-sized snapshots in an append-only file (tailored binary format similar to BSON). It's exceptionally well suited for audits, undo operations, and sophisticated analytical time travel queries.

    I've changed a lot of stuff, such that Brackit is getting more and more compatible with the JSONiq query language standard, added JSONiq update primitives and array slices as known from Python, and fixed several bugs. Furthermore, I've added interfaces for temporal data stores, temporal XPath axes to navigate not only in space but also in time, temporal extension functions in SirixDB, index rewrite rules, etc.

    As Brackit can query XML, you're of course able to transform XML data to JSON and vice versa.

    Moshe and I are working on a Jupyter Notebook / Tutorial[3] for interactive queries.

    We're looking forward to your bug reports, issues, and questions. Contributions are, of course, highly welcome. Maybe even implementations for other data stores or common query optimizations.

    Furthermore, we'd gladly see further (university-based?) research.

    It should, for instance, be possible to add SIMD vector instructions in the future, as the query engine is already set-oriented and processes sets of tuples for the so-called FLWOR expressions (see JSONiq). Brackit rewrites FLWOR expression trees in the AST to a pipeline of operations to port optimizations from relational query engines for efficient join processing and aggregate expressions. Furthermore, certain parts of the queries are parallelizable, as detailed in Sebastian's thesis. We also envision a stage for the compiler to use distributed processing (the first research used MapReduce, but we can now use better-suited approaches, of course).

    Kind regards

    Johannes

    [1] https://github.com/sirixdb/brackit

    [2] https://sirix.io | https://github.com/sirixdb/sirix

    [3] https://colab.research.google.com/drive/19eC-UfJVm_gCjY--koO...
