| | brackit | miller |
|---|---|---|
| Mentions | 21 | 63 |
| Stars | 46 | 8,559 |
| Growth | - | - |
| Activity | 6.9 | 9.0 |
| Latest commit | 3 months ago | 7 days ago |
| Language | Java | Go |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
brackit
- Show HN: Bitemporal, Binary JSON Based DBS and Event Store
-
Show HN: Evolutionary (binary) JSON data store (full immutable revision history)
I already posted the project a couple of years ago and it gained some interest, but a lot has been done since then, especially regarding performance: a completely new JSON store, a REST API, various refactored internals, an improved JSONiq-based query engine allowing updates, a (now already dated) web UI, a new Kotlin-based CLI, and Python and TypeScript clients to ease the use of Sirix...
The first prototypes of a precursor date back to 2005.
So, what is it all about?
I'm working on an evolutionary data store in my spare time[1]. It is based on the idea of getting rid of the need for a second transaction log (the WAL) by using a persistent index tree of tries as the log itself, preserving previous revisions through copy-on-write and path copying up to the root. Only a single read/write txn is permitted at a time, running in parallel with N read-only txns, each bound to a specific revision at start time. The single writer is permitted per resource (comparable to a table/relation in a relational DB) within a database; reads do not involve any locks at all.
The idea is that the system atomically swaps the tree root to the new version (replicated). If something fails, the log can simply be truncated to the former tree root.
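To make the idea concrete, here is a minimal, self-contained Java sketch (illustrative only, not SirixDB's actual code) of path copying with an atomic root swap: an update copies just the nodes on the path from the changed leaf to the root, untouched subtrees are shared between revisions, and publishing a revision is a single atomic pointer swap, so readers never take locks.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of copy-on-write path copying with an atomic root swap.
final class PersistentTree {

    record Node(String key, String value, Node left, Node right) {}

    private final AtomicReference<Node> root = new AtomicReference<>();

    /** Readers pin a revision by reading the current root once. */
    Node snapshot() {
        return root.get();
    }

    /** The single writer publishes a new revision by swapping the root. */
    void commit(Node newRoot) {
        root.set(newRoot);
    }

    /** Copy-on-write insert: only the path back to the root is copied. */
    static Node insert(Node n, String key, String value) {
        if (n == null) return new Node(key, value, null, null);
        int c = key.compareTo(n.key);
        if (c < 0) return new Node(n.key, n.value, insert(n.left, key, value), n.right);
        if (c > 0) return new Node(n.key, n.value, n.left, insert(n.right, key, value));
        return new Node(key, value, n.left, n.right); // same key: replace the value
    }
}
```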
Thus, the system has many similarities with Git (structural sharing of unchanged nodes/pages) and ZFS snapshots (the keyed trie was inspired by ZFS, as was storing checksums for child pages in the parent pages' references to them)[2].
You can of course simply execute time travel queries over the whole revision history, and add commit comments and an author to answer questions such as who committed what, at which point in time, and why...
The system does not simply copy full data pages; it applies a sliding snapshot versioning algorithm to keep storage space to a minimum.
Thus, it's best suited for flash drives with fast random reads and sequential writes. Data is never overwritten, so audit trails come for free.
The system stores fine-granular JSON nodes, so the structure and size of an object have almost no limits. A path summary is built: an unordered set of all paths to leaf nodes in the tree, which enables various optimizations. Furthermore, a rolling hash is optionally built, with all ancestor node hashes adapted during inserts.
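To illustrate the rolling-hash idea (this is a sketch, not SirixDB's actual scheme): since copy-on-write already copies every node on the path to the root, each copied node can recompute its hash from its children's hashes, Merkle-tree style, so every revision carries a verifiable root hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

// Illustrative sketch: a node's hash covers its content plus its
// children's hashes, so an insert only refreshes hashes on the
// copied root path while shared subtrees keep their old hashes.
final class HashedNode {
    final String key, value, hash;
    final HashedNode left, right;

    HashedNode(String key, String value, HashedNode left, HashedNode right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
        this.hash = digest(key + "|" + value
                + "|" + (left == null ? "" : left.hash)
                + "|" + (right == null ? "" : right.hash));
    }

    static String digest(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(md.digest(s.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```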
Furthermore, it optionally keeps track of update operations and the context nodes involved during txn commits. Thus, you can easily get the changes between revisions, check the full history of nodes, and navigate in time to the first, last, next, and previous revision of a node...
You can also open a revision at a specific system time, revert to a revision, and commit a new version while preserving all revisions in between.
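Continuing the earlier sketch (again illustrative, not the actual API): if every commit appends its root to a revision list, "reverting" is just committing an old root again as the newest revision, and all intermediate revisions stay readable.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: a revision log over immutable tree roots.
final class RevisionLog {
    private final List<PersistentTree.Node> revisions = new ArrayList<>();

    synchronized int commit(PersistentTree.Node root) {
        revisions.add(root);
        return revisions.size() - 1; // revision number
    }

    synchronized PersistentTree.Node open(int revision) {
        return revisions.get(revision);
    }

    /** Revert: re-commit an older root as the newest revision. */
    synchronized int revertTo(int revision) {
        return commit(open(revision));
    }
}
```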
As mentioned, objects can be arbitrarily nested, so there are almost no limits on their size, and updates are cheap.
A dated Jupyter notebook with some examples can be found in [3] and overall documentation in [4].
The query engine[5], Brackit, is retargetable (a couple of interfaces and rewrite rules have to be implemented per DB system). In particular, it finds implicit joins and applies well-known algorithms from the relational DB world to optimize joins and aggregate functions, thanks to set-oriented processing of the operators.[6]
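For example, a query of roughly this shape (standard JSONiq with hypothetical collection names; Brackit's dialect may differ in details) contains an implicit join across two iterations, which the compiler can rewrite from a nested loop into a set-oriented join:

```
for $store in collection("stores")
for $order in collection("orders")
where $store.id eq $order.storeid
return { "store": $store.name, "total": $order.total }
```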
I've given an interview in [7], but I'm usually very nervous, so don't judge too harshly.
Give it a try and happy coding!
Kind regards
Johannes
[1] https://sirix.io | https://github.com/sirixdb/sirix
[2] https://sirix.io/docs/concepts.html
[3] https://colab.research.google.com/drive/1NNn1nwSbK6hAekzo1YbED52RI3NMqqbG#scrollTo=CBWQIvc0Ov3P
[4] https://sirix.io/docs/
[5] http://brackit.io
[6] https://colab.research.google.com/drive/19eC-UfJVm_gCjY--koOWN50sgiFa5hSC
[7] https://youtu.be/Ee-5ruydgqo?si=Ift73d49w84RJWb2
- Evolutionary JSON data store (keeping the full revision history)
-
Java open-source projects that need help from the community.
Append-only database system (based on a persistent index structure): https://github.com/sirixdb/sirix or a retargetable query compiler: https://github.com/sirixdb/brackit
-
What's Wrong with Java/Spring
[2] http://brackit.io
-
Ask HN: Do you prefer Svelte or SolidJS?
Hello,
I want to find enthusiastic OSS frontend developers for my JSON data store project[1], which retains the full revision history of a database resource (binary JSON) through small copy-on-write snapshots of the main index tree of tries and a novel sliding snapshot algorithm.
As I'm a fan of compilers (http://brackit.io), I think either working on the current frontend with Svelte[2], which is really dated and uses Sapper, or building a new frontend using SolidJS would be great.
What are the advantages/disadvantages of both frameworks in your opinion? I'm a backend software engineer, but maybe SolidJS is more familiar to frontend devs because of JSX, and at least in benchmarks it seems to be faster. Then again, maybe the differences, apart from the syntax, aren't that big.
I envision visualizations for comparing revisions of resources or subtrees therein and also to visualize time travel queries. A screenshot of the old frontend: https://github.com/sirixdb/sirix/blob/master/Screenshot%20from%202020-09-28%2018-50-58.png
Let me know which framework you'd prefer for the task at hand, and what you think the advantages/disadvantages of each are in general.
If you want to help, it's even better. Let me know :-)
[1] https://sirix.io || https://github.com/sirixdb/sirix
-
Implementing a Merkle Tree for an Immutable Verifiable Log
Basically JSONiq, with a few minor syntax differences.
Our query engine/compiler can be used by other data stores as well:
http://brackit.io
-
Zq: An Easier (and Faster) Alternative to Jq
That's one of the main steps forward for Brackit, a retargetable JSONiq query engine/compiler (http://brackit.io), the append-only data store SirixDB (https://sirix.io), and a new web frontend. My vision is to explore not only the most recent revision but also any older revision, to display the diffs, and to display the results of time travel queries... Help is highly welcome, as I'm a backend engineer myself, working on the query engine and the data store :-)
- Brackit - a flexible query compiler for JSON, separating key concerns in query processing
- Flexible JSON Query Compiler – Separating Key Concerns in Query Processing
miller
- Qsv: Efficient CSV CLI Toolkit
-
jq 1.7 Released
jq and miller[1] are essential parts of my toolbelt, right up there with awk and vim.
[1]: https://github.com/johnkerl/miller
-
Perl first commit: a “replacement” for Awk and sed
> This works really well if your problem can be solved in one or two liners.
My personal comfort threshold is around the 100-line mark. It's even possible to write maintainable shell scripts up to 500 lines, but it mostly depends on the problem you're trying to solve, and the discipline of the programmer to follow best practices (use sane defaults, ShellCheck, etc.).
> It goes bad very quickly when, say, you have two CSV files and want to join them the SQL way.
In that case we're talking about structured data, and, yeah, Perl or Python would be easier to work with. That said, depending on the complexity of the CSV, you can still go a long way with plain Bash, using IFS/read(1) or tr(1) to split CSV columns. That wouldn't be very robust, but there are tools that handle CSV specifically[1], which can be composed in a shell script just fine; see the one-liner below.
So it's always a balancing act between being productive quickly with a shell script and reaching for a programming language once the tools aren't a good fit or maintenance becomes an issue.
[1]: https://miller.readthedocs.io/
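For instance (file names hypothetical), Miller can do the SQL-style CSV join from the quoted comment in one line:

```sh
# Join orders.csv against customers.csv on the shared "id" column;
# CSV in, pretty-printed table out. Miller handles headers and quoting.
mlr --icsv --opprint join -j id -f customers.csv orders.csv
```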
-
Need help on cleaning this data!!
where mlr is from https://github.com/johnkerl/miller
-
Running weekly average
if this class of problems (i.e., CSV/TSV data) is your main target, you may find Miller (https://github.com/johnkerl/miller) much more useful in the long run
-
GQL: A new SQL like query language for .git files written in Rust
That said, you may be interested in Miller (https://github.com/johnkerl/miller) which provides similar capabilities for CSV, JSON, and XML files. It doesn't use a SQL grammar, but that's just the proverbial lipstick on the thing. I'm not the author, but I have used it and I see some parallels in use cases at the very least.
- johnkerl/miller: Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
-
Any cli utility to create ascii/org mode tables?
worth giving Miller a shot
-
I wrote this iCalendar (.ics) command-line utility to turn common calendar exports into more broadly compatible CSV files.
CSV utilities (still haven't picked a favorite one...): https://github.com/harelba/q https://github.com/BurntSushi/xsv https://github.com/wireservice/csvkit https://github.com/johnkerl/miller
- Miller: Like Awk, sed, cut, join, and sort for CSV, TSV, and tabular JSON
What are some alternatives?
sirix - SirixDB is an embeddable, bitemporal, append-only database system and event store, storing immutable lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach.
visidata - A terminal spreadsheet multitool for discovering and arranging data
jmespath.py - JMESPath is a query language for JSON.
xsv - A fast CSV command line toolkit written in Rust.
textql - Execute SQL against structured text like CSV or TSV
jq - Command-line JSON processor [Moved to: https://github.com/jqlang/jq]
gron - Make JSON greppable!
dasel - Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
zed - A novel data lake based on super-structured data
csvtk - A cross-platform, efficient and practical CSV/TSV toolkit in Golang
yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor