| | spyql | logica |
|---|---|---|
| Mentions | 23 | 19 |
| Stars | 902 | 1,680 |
| Growth | - | - |
| Activity | 0.0 | 9.1 |
| Latest Commit | over 1 year ago | 13 days ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
spyql
-
Fq: Jq for Binary Formats
I prefer a SQL-like format. It's not as complete, but it covers most day-to-day use cases. Take a look at https://github.com/dcmoura/spyql (I am the author). Congrats on fq!
-
Command-line data analytics made easy with SPyQL
SPyQL documentation: spyql.readthedocs.io
-
This Week In Python
spyql – Query data on the command line with SQL-like SELECTs powered by Python expressions
- Command-line data analytics made easy
-
Jc – JSONifies the output of many CLI tools
This is great!
I am the author of SPyQL [1]. Combining jc with SPyQL, you can easily query the JSON output and run Python commands on top of it from the command line :-) You can do aggregations and so forth in a much simpler and more intuitive way than with jq.
I just wrote a blog post [2] that illustrates this. It focuses on CSV, but the commands would be the same if you were working with JSON.
[1] https://github.com/dcmoura/spyql
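To make the kind of aggregation being described concrete, here is a plain-Python sketch of what a one-line SPyQL `GROUP BY` over jc-style JSON output would compute (the records and field names are hypothetical, not actual jc output):

```python
import json
from collections import defaultdict

# Hypothetical jc-style output: a JSON array of process records
records_json = '''
[
  {"user": "root",  "mem_kb": 1200},
  {"user": "alice", "mem_kb": 800},
  {"user": "root",  "mem_kb": 300}
]
'''

# Roughly: SELECT user, sum_agg(mem_kb) FROM json GROUP BY user
totals = defaultdict(int)
for row in json.loads(records_json):
    totals[row["user"]] += row["mem_kb"]

print(dict(totals))  # {'root': 1500, 'alice': 800}
```

SPyQL expresses this as a single query; the point of the comment is that the same thing in jq requires noticeably more ceremony.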
- The fastest command-line tools for querying large JSON datasets
-
Working with more than 10gb csv
You can import the data into a PostgreSQL/MySQL/SQLite/... database and then query the database. However, even with the right choice of indexes, it might take a while to run queries on a table with hundreds of millions of records. You can easily import your data into these databases with SPyQL: `$ spyql "SELECT * FROM csv TO sql(table=my_table_name)" | sqlite3 my.db` (you would need to create the table `my_table_name` before running the command).
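For reference, here is a minimal stdlib-only sketch of the same workflow (create the table first, then bulk-insert CSV rows), which is what the `spyql ... | sqlite3` pipeline does under the hood. The table name and columns are illustrative:

```python
import csv
import io
import sqlite3

# A tiny stand-in for the large CSV file
csv_data = io.StringIO("id,name\n1,ada\n2,grace\n")

conn = sqlite3.connect(":memory:")
# The comment's caveat: the target table must exist before loading
conn.execute("CREATE TABLE my_table_name (id INTEGER, name TEXT)")

reader = csv.reader(csv_data)
next(reader)  # skip the header row
conn.executemany("INSERT INTO my_table_name VALUES (?, ?)", reader)

count = conn.execute("SELECT COUNT(*) FROM my_table_name").fetchone()[0]
print(count)  # 2
```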
-
ClickHouse Cloud is now in Public Beta
https://github.com/dcmoura/spyql/blob/master/notebooks/json_...
And ClickHouse looks like a normal relational database - there is no need for multiple components for different tiers (like in Druid), no need for manual partitioning into "daily" and "hourly" tables (as you do in Spark and BigQuery), no need for a lambda architecture... It's refreshing how something can be both simple and fast.
- A SQLite extension for reading large files line-by-line
-
I want to convert a large JSON file into Tabular Format.
I thought this library was pretty nifty for json. It's also relatively fast compared to most json parsers: https://github.com/dcmoura/spyql
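As a baseline for the JSON-to-tabular task in this thread, the conversion for a flat array of records can be sketched with the stdlib alone (the keys here are made up); tools like spyql add querying, streaming, and nested-field handling on top:

```python
import csv
import io
import json

# Hypothetical input: a JSON array of flat records
data = json.loads('[{"name": "ada", "age": 36}, {"name": "grace", "age": 85}]')

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "age"])
writer.writeheader()
writer.writerows(data)

table = out.getvalue()
```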
logica
-
Prolog language for PostgreSQL proof of concept
If you're interested in this, I would also recommend checking out Logica[0], a Datalog-like language explicitly designed to compile to SQL queries.
0: https://logica.dev/
- Logica
- New welcome page for Logica language
-
Introduction to Datalog
> I guess the intention is to be better than SQL but then I was left with "under which circumstances?"
Excellent question.
Two of the most common use cases for databases are "transactional processing" (manipulating small numbers of rows in real time) and "analytical processing" (querying enormous numbers of rows, typically in a read-only fashion).
SQL is generally fine for transactional workloads.
But analytical queries sometimes involve multi-page queries, with lots of JOINs and CTEs. And these queries are often automatically generated.
And once you start writing actual multi-page "programs" in SQL, you may decide that it's a fairly clunky and miserable programming language. What Datalog typically buys you is a way to cleanly decompose large queries into "subroutines." And it offers a simpler syntax for many kinds of complex JOINs.
Unfortunately, there isn't really a standard dialect of Datalog, or even a particular dialect with mainstream traction. So choosing Datalog is a bit of a tradeoff: does it buy you enough, for your use case, that it's worth being a bit outside the mainstream? Maybe! But I'd love to see something like Logica gain more traction: https://logica.dev/
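For a concrete sense of the tradeoff: the classic two-rule Datalog reachability program corresponds to a recursive CTE in SQL. A sketch using SQLite (the edge data is made up), with the Datalog rules shown as comments:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edge VALUES (?, ?)",
                 [("a", "b"), ("b", "c"), ("c", "d")])

# Datalog:  reach(X, Y) :- edge(X, Y).
#           reach(X, Z) :- reach(X, Y), edge(Y, Z).
rows = conn.execute("""
    WITH RECURSIVE reach(src, dst) AS (
        SELECT src, dst FROM edge
        UNION
        SELECT reach.src, edge.dst
        FROM reach JOIN edge ON reach.dst = edge.src
    )
    SELECT src, dst FROM reach WHERE src = 'a' ORDER BY dst
""").fetchall()
print(rows)  # [('a', 'b'), ('a', 'c'), ('a', 'd')]
```

The two Datalog rules stay two short lines however the program grows, which is the "subroutine" decomposition the comment describes; the SQL version accretes CTEs and JOIN clauses.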
-
Mangle, a programming language for deductive database programming
Interesting; a Google engineer previously published a Datalog variant for BigQuery: https://logica.dev/
This new language seems similar to differential-Datalog (which is sadly in maintenance mode): https://news.ycombinator.com/item?id=33521561
- Show HN: PRQL 0.2 – Releasing a better SQL
-
Show HN: PRQL – A Proposal for a Better SQL
Looks pretty cool. I'd be interested if the README had a comparison with Google's Logica (https://github.com/EvgSkv/logica)
-
PathQuery, Google's Graph Query Language
Oh wow that is neat!
And yes, this kind of thing is why datalog is a lot more amenable to fast query plans & runtimes than prolog. This part is especially cool: https://github.com/EvgSkv/logica/blob/main/compiler/dialects...
-
Thoughts on Logica, Google's new programming language that compiles to SQL?
Google's new programming language that compiles to SQL (supporting BigQuery and Postgres) feels very exciting. Blog: https://opensource.googleblog.com/2021/04/logica-organizing-your-data-queries.html GitHub: https://github.com/EvgSkv/logica
-
Google Logica Aims To Make SQL Queries More Reusable and Readable
Going to be? It already is. In fact, one thing the article misses is right there at the bottom of the project page:
What are some alternatives?
prql - PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
scryer-prolog - A modern Prolog implementation written mostly in Rust.
malloy - Malloy is an experimental language for describing data relationships and transformations.
ungoogled-chromium-archlinux - Arch Linux packaging for ungoogled-chromium
tresql - Shorthand SQL/JDBC wrapper language, providing nested results as JSON and more
Preql - An interpreted relational query language that compiles to SQL.
prosto - Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
pxi - 🧚 pxi (pixie) is a small, fast, and magical command-line data processor similar to jq, mlr, and awk.
differential-datalog - DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.