| | crux | mergestat-lite |
|---|---|---|
| Mentions | 16 | 10 |
| Stars | 1,475 | 3,419 |
| Growth | - | 0.3% |
| Activity | 9.7 | 6.3 |
| Last commit | over 2 years ago | 3 days ago |
| Language | Clojure | Go |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
crux
- Speeding Up `Atan2f` by 50x
-
Bridging the Blockchain / Database Divide (Temporal Graph Queries for Corda)
Hi, a couple of my colleagues spent some time working on this integration with our open source database product (https://opencrux.com), and I'm curious to know - has anyone done similar things to connect Corda with a secondary off-the-shelf query engine?
- Crux 1.18.0 Is Out
-
Crux 1.18.0 is out!
For more details, see the release notes.
-
Looking for Intermediate & Advanced SQL Users for Research
The context is that I work on https://opencrux.com, which offers a bi-temporal Datalog query layer (as well as SQL) that more or less addresses the intersection of the two, since Datalog is great for expressing recursive queries.
-
How to query Datomic, Datascript, Asami, or other graph databases
I suppose another somewhat important distinction, once again performance related, is that graph databases will typically track index statistics to aid with query planning. For example, Crux uses stored knowledge of attribute-value cardinalities (recently via HyperLogLog) to optimise the join order of a query - this can make a big difference when attempting to traverse large graphs efficiently.
-
Free project to practice sql ?
Agreed, recursive querying & bitemporal modelling in SQL are non-trivial problems, and the combination of the two is harder still. For an alternative perspective on tackling such problems I'd suggest looking at Datalog, which makes recursion a breeze, and a database with first-class bitemporality - both of which feature in https://opencrux.com (which I happen to work on :))
-
Ask HN: What under-the-radar technology are you super excited about?
I work on Crux so can share a few details about our implementation of Datalog. The query is compiled into a kind of Worst-Case Optimal Join algorithm [0] which means that certain types of queries (e.g. cyclic graph-analytical queries, like counting triangles) are generally more efficient than what is possible with a non-WCOJ query execution strategy. However, the potency of this approach relies on the query planner calculating a good ordering of variables for the join order, and this is a hard problem in itself.
Crux is usually very competent at selecting a sensible variable ordering but when it makes a bad choice your query will take an unnecessary performance hit. The workaround for these situations is to break your query into smaller queries (since we don't wish to support any kind of hinting). Over the longer term we will be continuing to build more intelligent heuristics that make use of advanced population statistics. For instance we are about to merge a PR that uses HyperLogLog to inform attribute selectivity: https://github.com/juxt/crux/pull/1472
[0] https://cs.stanford.edu/people/chrismre/papers/paper49.Ngo.p...
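To make the "counting triangles" example in the comment above concrete, here is a minimal Python sketch of the variable-at-a-time idea behind worst-case optimal joins: bind one variable, then narrow the candidates for the next by intersecting adjacency sets. This is an illustration under simplifying assumptions (an in-memory, undirected graph with comparable vertex ids); the function name `count_triangles` is hypothetical and this is not Crux's actual implementation or query planner.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected graph, binding one variable at a time."""
    # Orient every edge from the smaller to the larger vertex so that each
    # triangle a < b < c is discovered exactly once.
    successors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        successors[lo].add(hi)

    total = 0
    for a, a_succ in successors.items():   # bind variable a
        for b in a_succ:                    # bind b using edge (a, b)
            # Bind c by intersecting the candidates from edges (a, c) and (b, c).
            total += len(a_succ & successors.get(b, set()))
    return total


if __name__ == "__main__":
    sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
    print(count_triangles(sample_edges))
```

The order in which variables are bound does not change the answer here, but on skewed data it can change how many intermediate candidates are examined, which is the variable-ordering problem the comment describes.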
- Bitemporal History
- Git as a NoSql Database
mergestat-lite
-
SQLite Doesn't Use Git
You can query git with this: https://github.com/mergestat/mergestat if you like the idea.
-
A SQLite extension for reading large files line-by-line
Hey, author here, happy to answer any questions! Also check out this notebook for a deeper dive into sqlite-lines, along with a slick WASM demonstration and more thoughts on the codebase itself https://observablehq.com/@asg017/introducing-sqlite-lines
I really dig SQLite, and I believe SQLite extensions will push it to another level. I rarely reach for Pandas or other "traditional" tools and query languages, and instead opt for plain ol' SQLite and other extensions. As a shameless plug, I recently started a blog series on SQLite and related tools and extensions if you want to learn more! Next week I'll be publishing more SQLite extensions for parsing HTML + making HTTP requests https://observablehq.com/@asg017/a-new-sqlite-blog-series
A few other SQLite extensions:
- xlite, for reading Excel files, in Rust https://github.com/x2bool/xlite
- sqlean, several small SQLite extensions in C https://github.com/nalgeon/sqlean
- mergestat, several SQLite extensions for developers (mainly GitHub's API) in Go https://github.com/mergestat/mergestat
- Show HN: Contribution Graph as a Git Command
-
Exploring Git Repos With MergeStat 🔬
mergestat is an open-source tool that allows users to run SQL queries on the contents and history of git repositories.
-
The world of PostgreSQL wire compatibility
Thanks for this write up! I've been really interested in postgres compatibility in the context of a tool I maintain (https://github.com/mergestat/mergestat) that uses SQLite. I've been looking for a way to expose the SQLite capabilities over a more commonly used wire-protocol like postgres (or mysql) so that existing BI and visualization tools can access the data.
This project is an interesting one: https://github.com/dolthub/go-mysql-server that provides a MySQL interface (wire and SQL) to arbitrary "backends" implemented in go.
It's really interesting how compatibility with existing protocols has become an important feature of new databases. There's so much existing tooling that already speaks postgres (or mysql), and being able to leverage that is a huge advantage IMO.
-
Go library for printing human readable, relative time differences 🕰️
timediff is a Go package for printing human readable, relative time differences. Output is based on ranges defined in the Day.js JavaScript library, and can be customized if needed. It's currently used by the mergestat command-line interface.
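As a rough illustration of range-based relative time formatting, the Python sketch below maps a duration onto a table of upper bounds and phrases. The function name `relative_time`, the threshold values, and the wording are all assumptions chosen for this example; they are loosely modelled on the kind of ranges Day.js uses and are not taken from timediff's source or API.

```python
from datetime import timedelta

# Illustrative (upper bound in seconds, phrase) ranges. These values are
# assumptions for the sketch, not the ranges shipped by timediff or Day.js.
_RANGES = [
    (45, "a few seconds"),
    (90, "a minute"),
    (45 * 60, "{minutes} minutes"),
    (90 * 60, "an hour"),
    (22 * 3600, "{hours} hours"),
    (36 * 3600, "a day"),
]


def relative_time(delta: timedelta) -> str:
    """Describe a time difference; positive is the future, otherwise the past."""
    seconds = abs(delta.total_seconds())
    values = {
        "minutes": round(seconds / 60),
        "hours": round(seconds / 3600),
        "days": round(seconds / 86400),
    }
    phrase = "{days} days".format(**values)  # fallback beyond the last range
    for upper, template in _RANGES:
        if seconds < upper:
            phrase = template.format(**values)
            break
    return f"in {phrase}" if delta.total_seconds() > 0 else f"{phrase} ago"


if __name__ == "__main__":
    print(relative_time(timedelta(minutes=-5)))
    print(relative_time(timedelta(hours=3)))
```

Keeping the ranges in a plain list is one way to make them customizable: a caller could pass a different table without touching the formatting logic.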
- Askgit: Command-line tool for running SQL queries on Git repositories
-
Semantic Git Commit Messages
Assuming committers adhere to it, there could be some interesting use cases when combined with a tool like AskGit (https://github.com/askgitdev/askgit) for understanding what "categories" of work are being done in a codebase.
Maybe even what directories/files tend to see `fix` or `refactor` more frequently (signs of a poorly designed or "hot" area?)
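A minimal sketch of the "categories of work" idea, assuming commit subjects follow a `type: subject` or `type(scope): subject` convention. It works on a plain list of subject lines (which could come from something like `git log --format=%s`) rather than on AskGit itself; the function name `count_commit_types` and the sample subjects are hypothetical.

```python
import re
from collections import Counter

# Matches a leading "type:" or "type(scope):" prefix, e.g. "fix(parser): ...".
PREFIX = re.compile(r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]*)\))?!?:\s")


def count_commit_types(subjects):
    """Tally commit subjects by their semantic prefix; unmatched go to 'other'."""
    counts = Counter()
    for subject in subjects:
        match = PREFIX.match(subject)
        counts[match.group("type") if match else "other"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical commit subjects, for illustration only.
    sample = [
        "fix: handle empty repository",
        "fix(api): return 404 for unknown refs",
        "refactor: split query planner",
        "Update README",
    ]
    for commit_type, n in count_commit_types(sample).most_common():
        print(commit_type, n)
```

Grouping by the `scope` capture instead of `type` would be one way to approach the per-directory question, provided committers use scopes consistently.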
-
Git as a NoSql Database
I've been very curious to explore this type of use case with askgit (https://github.com/augmentable-dev/askgit) which was designed for running simple "slice and dice" queries and aggregations on git history (and change stats) for basic analytical purposes. I've been curious about how this could be applied to a small text+git based "db". Say, for regular JSON or CSV dumps.
This also reminds me of Dolt: https://github.com/dolthub/dolt which I believe has been on HN a couple times
What are some alternatives?
xtdb - An immutable database for application development and time-travel data compliance, with SQL and XTQL. Developed by @juxt
git-xargs - git-xargs is a command-line tool (CLI) for making updates across multiple GitHub repositories with a single command.
asami - A graph store for Clojure and ClojureScript
flan - A tasty tool that lets you save, load and share postgres snapshots with ease
specter - Clojure(Script)'s missing piece
sqlite-plus - The ultimate set of SQLite extensions
materialize - The data warehouse for operational workloads.
csv-sql - Command-line tool to load csv and excel (xlsx) files and run sql commands
mnm - mnm implements the TMTP protocol. Let Internet sites message members directly, instead of unreliable, insecure email. Contributors welcome! (Server)
datasette-lite - Datasette running in your browser using WebAssembly and Pyodide
dolt - Dolt – Git for Data
xlite - Query Excel spreadsheets (.xlsx, .xls, .ods) using SQLite