Brackit: A retargetable JSONiq query engine

brackit

21 46 6.9 Java

Query processor with proven optimizations, ready to use for your JSON store to query semi-structured data with JSONiq. Can also be used as an ad-hoc in-memory query processor.

Hi all,
Sebastian and his students did a tremendous job creating Brackit[1] as a retargetable query engine for different data stores, and they worked hard to optimize aggregations and joins. Despite its clear roots as a database query engine, it is also usable as a standalone, ad-hoc in-memory query engine.
Sebastian did his Ph.D. research at TU Kaiserslautern in the database systems group of Theo Härder. Together with Andreas Reuter, Theo Härder coined the well-known acronym ACID for the desired properties of transactions.
As Sebastian is no longer maintaining the project, I stepped up and forked it a couple of years ago. I'm using it for my evolutionary, immutable data store SirixDB[2], which stores the entire history of your JSON data as small snapshots in an append-only file (a tailored binary format similar to BSON). It's exceptionally well suited for audits, undo operations, and sophisticated analytical time travel queries.
I've changed a lot to make Brackit more and more compatible with the JSONiq query language standard: I added JSONiq update primitives and array slices as known from Python, and fixed several bugs. Furthermore, I've added interfaces for temporal data stores, temporal XPath axes to navigate not only in space but also in time, temporal extension functions in SirixDB, index rewrite rules, and more.
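To make "array slices as known from Python" concrete, here is a minimal Java sketch of the Python slice semantics for a start and an end bound (negative indices count from the end, omitted bounds default to the array edges, out-of-range bounds are clamped). The class and method names are hypothetical and chosen only for illustration; this is not Brackit's implementation, and it does not show Brackit's query syntax. The optional step is left out to keep the sketch short.

```java
import java.util.List;

// Hypothetical helper, not part of Brackit: mimics Python's items[start:end].
final class PythonStyleSlice {

    // A null bound means "omitted", as in items[:end] or items[start:].
    static <T> List<T> slice(List<T> items, Integer start, Integer end) {
        int n = items.size();
        int from = normalize(start == null ? 0 : start, n);
        int to = normalize(end == null ? n : end, n);
        if (from >= to) {
            return List.of(); // empty slice, as in Python
        }
        return List.copyOf(items.subList(from, to));
    }

    // Negative indices count from the end; the result is clamped to [0, n].
    static int normalize(int index, int n) {
        if (index < 0) {
            index += n;
        }
        return Math.max(0, Math.min(index, n));
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(10, 20, 30, 40, 50);
        System.out.println(slice(xs, 1, 3));     // like xs[1:3]  -> [20, 30]
        System.out.println(slice(xs, null, -1)); // like xs[:-1]  -> [10, 20, 30, 40]
        System.out.println(slice(xs, -2, null)); // like xs[-2:]  -> [40, 50]
    }
}
```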
As Brackit can query XML, you're of course able to transform XML data to JSON and vice versa.
Moshe and I are working on a Jupyter Notebook / Tutorial[3] for interactive queries.
We're looking forward to your bug reports, issues, and questions. Contributions are, of course, highly welcome, perhaps even implementations for other data stores or common query optimizations.
Furthermore, we'd gladly see further (university-based?) research.
It should, for instance, be possible to use SIMD vector instructions in the future, as the query engine is already set-oriented and processes sets of tuples for the so-called FLWOR expressions (see JSONiq). Brackit rewrites FLWOR expression trees in the AST into a pipeline of operations in order to port optimizations from relational query engines for efficient join processing and aggregation. Furthermore, certain parts of queries are parallelizable, as detailed in Sebastian's thesis. We also envision a compiler stage for distributed processing (the first research used MapReduce, but better-suited approaches are available now).
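The idea of turning a FLWOR expression into a pipeline of operations over tuples can be sketched in a few lines of Java. The sketch below is an assumption-laden illustration, not Brackit's code: the names `Op`, `forBind`, `letBind`, and `select` are hypothetical, a tuple is modeled as a plain map from variable names to values, and Java streams stand in for the engine's operators. It evaluates the equivalent of `for $x in (1, 2, 3, 4, 5, 6) let $sq := $x * $x where $sq mod 2 = 0 return $sq`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: each FLWOR clause becomes an operator on a stream of tuples.
final class FlworPipelineSketch {

    // An operator consumes a stream of tuples and produces a stream of tuples.
    interface Op extends UnaryOperator<Stream<Map<String, Object>>> {}

    // Returns a copy of the tuple with one additional variable binding.
    static Map<String, Object> extend(Map<String, Object> tuple, String var, Object value) {
        Map<String, Object> copy = new HashMap<>(tuple);
        copy.put(var, value);
        return copy;
    }

    // "for $var in seq": emits one output tuple per item of the sequence.
    static Op forBind(String var, Function<Map<String, Object>, List<Object>> seq) {
        return in -> in.flatMap(t -> seq.apply(t).stream().map(v -> extend(t, var, v)));
    }

    // "let $var := expr": binds exactly one value per input tuple.
    static Op letBind(String var, Function<Map<String, Object>, Object> expr) {
        return in -> in.map(t -> extend(t, var, expr.apply(t)));
    }

    // "where pred": drops tuples that do not satisfy the predicate.
    static Op select(Predicate<Map<String, Object>> pred) {
        return in -> in.filter(pred);
    }

    // Runs the operators in order, starting from a single empty tuple,
    // then applies the "return" expression to every surviving tuple.
    static List<Object> run(List<Op> pipeline, Function<Map<String, Object>, Object> ret) {
        Map<String, Object> empty = Map.of();
        Stream<Map<String, Object>> tuples = Stream.of(empty);
        for (Op op : pipeline) {
            tuples = op.apply(tuples);
        }
        return tuples.map(ret).collect(Collectors.toList());
    }

    static List<Object> example() {
        List<Op> pipeline = List.of(
            forBind("x", t -> List.of(1, 2, 3, 4, 5, 6)),
            letBind("sq", t -> (int) t.get("x") * (int) t.get("x")),
            select(t -> (int) t.get("sq") % 2 == 0));
        return run(pipeline, t -> t.get("sq"));
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [4, 16, 36]
    }
}
```

Because every clause is a separate operator over tuples, a rewrite stage could in principle reorder them (for example, push a `select` closer to the `forBind` that produces its input) or swap in a join or aggregation operator, which is the kind of relational-style optimization the paragraph above refers to. The sketch itself performs no such rewriting.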
Kind regards
Johannes
[1] https://github.com/sirixdb/brackit
[2] https://sirix.io | https://github.com/sirixdb/sirix
[3] https://colab.research.google.com/drive/19eC-UfJVm_gCjY--koO...

sirix

44 1,085 9.1 Java

SirixDB is an embeddable, bitemporal, append-only database system and event store, storing immutable lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Implementing a Merkle Tree for an Immutable Verifiable Log
2 projects | news.ycombinator.com | 6 May 2022

Show HN: Brackit – a retargetable JSONiq based query engine for JSON
3 projects | news.ycombinator.com | 1 Mar 2022

Show HN: Bitemporal, Binary JSON Based DBS and Event Store
6 projects | news.ycombinator.com | 13 Nov 2023

Show HN: Evolutionary (binary) JSON data store (full immutable revision history)
3 projects | news.ycombinator.com | 21 Oct 2023

Evolutionary, JSON data store (keeping the full revision history)
3 projects | news.ycombinator.com | 20 Oct 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Xquery Java JSON HacktoberFest temporal-data
Post date: 2 Mar 2022