Metric | sqlglot | polars
---|---|---
Mentions | 56 | 144
Stars | 5,511 | 26,218
Growth | - | 2.9%
Activity | 9.9 | 10.0
Last commit | 5 days ago | 5 days ago
Language | Python | Rust
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sqlglot
- The Future of MySQL is PostgreSQL: an extension for the MySQL wire protocol
This is probably referring to "zero changes to your driver code" and not "zero changes to the SQL you send over this driver".
Translating between SQL dialects is notoriously hard, and attempts to translate [1] work in about 95% of cases, but the last 5% would require 5x the amount of work. That's because "SQL dialect" also includes weird edge cases of type inference for things like COALESCE(5, FALSE) and emulation of system catalogs (pg_catalog, information_schema).
[1] https://github.com/tobymao/sqlglot
- FLaNK AI Weekly 18 March 2024
- SQLGlot: No-dependency SQL parser, transpiler, optimizer for 21 SQL dialects
- Transpile Any SQL to PostgreSQL Dialect
Recommend checking out https://github.com/tobymao/sqlglot if you are interested in this capability for other SQL dialects
Tools like this are helpful for:
- Rendering SQL in a consistent way, e.g. for snapshot testing
- This Week In Python
sqlglot – Python SQL Parser and Transpiler
- SQLglot: Python SQL Parser and Transpiler
- Build the dependency graph of your BigQuery pipelines at no cost: a Python implementation
In the project we used the Python library networkx and a DiGraph object (directed graph). To detect a table reference in a query, we use sqlglot, a SQL parser (among other things) that works well with BigQuery.
- A Primer on SQLGlot's Abstract Syntax Tree
- Show HN: SQL Polyglot
Cool! Is this built with sqlglot[1] on the back end?
[1] https://github.com/tobymao/sqlglot
- sqlglot - Amazing SQL parsing library
Wanted to give sqlglot a shoutout as it saved me a ton of time.
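The dependency-graph idea from the BigQuery excerpt above can be sketched with the standard library alone. This is only an illustration under loud assumptions: a naive regex stands in for sqlglot's parser, a plain dict plus `graphlib.TopologicalSorter` stands in for the networkx `DiGraph`, and the table names and queries are hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical pipeline: target table -> query that builds it.
QUERIES = {
    "mart.daily_sales": (
        "SELECT * FROM staging.orders o "
        "JOIN staging.customers c ON o.cid = c.id"
    ),
    "staging.orders": "SELECT * FROM raw.orders",
    "staging.customers": "SELECT * FROM raw.customers",
}

# Naive stand-in for a real SQL parser (assumption): capture the
# identifier that follows FROM or JOIN. Real SQL needs a real parser.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def build_dependencies(queries):
    """Map each target table to the set of tables its query reads from."""
    return {
        target: set(TABLE_REF.findall(sql))
        for target, sql in queries.items()
    }


deps = build_dependencies(QUERIES)

# TopologicalSorter takes node -> predecessors, so upstream tables
# come first in the resulting build order.
build_order = list(TopologicalSorter(deps).static_order())
```

A regex like this breaks on subqueries, CTEs, and quoted identifiers, which is the reason the quoted post reaches for a proper parser.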
polars
- Why Python's Integer Division Floors (2010)
This is because 0.1 is in actuality the floating point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.
I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
- Polars
https://github.com/pola-rs/polars/releases/tag/py-0.19.0
- Stuff I Learned during Hanukkah of Data 2023
That turned out to be related to pola-rs/polars#11912, and this linked comment provided a deceptively simple solution - use PARSE_DECLTYPES when creating the connection:
- Polars 0.20 Released
- Segunda linguagem (Second language)
- Polars: Dataframes powered by a multithreaded query engine, written in Rust
- Summing columns in remote Parquet files using DuckDB
- Polars 0.34 is released. (A query engine focussing on DataFrame front ends)
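The float floor-division behaviour described in the "Why Python's Integer Division Floors" excerpt above can be checked in plain Python. This is a minimal sketch: it uses `math.floor(a / b)` to mimic the `(a / b).floor()` definition the commenter describes for Polars, and does not call Polars itself.

```python
import math
from decimal import Decimal

# Exact binary value stored for the literal 0.1 (slightly above one tenth).
exact_tenth = Decimal(0.1)

# Ordinary division rounds the quotient to the nearest float: exactly 10.0.
true_div = 1 / 0.1

# Python's // floors the exact quotient, which is just below 10.
python_floordiv = 1 // 0.1

# Flooring the already-rounded quotient instead, as in (a / b).floor().
floor_of_div = math.floor(1 / 0.1)

print(exact_tenth)
print(true_div, python_floordiv, floor_of_div)
```

The two definitions disagree only when the exact quotient sits just below an integer and rounding pushes it up to that integer.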
What are some alternatives?
sqloxide - Python bindings for sqlparser-rs
vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
JSqlParser - JSqlParser parses an SQL statement and translates it into a hierarchy of Java classes. The generated hierarchy can be navigated using the Visitor Pattern
modin - Modin: Scale your Pandas workflows by changing a single line of code
Transcrypt - Python 3.9 to JavaScript compiler - Lean, fast, open!
datafusion - Apache DataFusion SQL Query Engine
zetasql - ZetaSQL - Analyzer Framework for SQL
DataFrames.jl - In-memory tabular data in Julia
duckdb - DuckDB is an in-process SQL OLAP Database Management System
datatable - A Python package for manipulating 2-dimensional tabular data structures
criterion.rs - Statistics-driven benchmarking library for Rust
Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing