Top 23 SQL Open-Source Projects
Apache Spark - A unified analytics engine for large-scale data processing
Project mention: On explaining technical stuff in a non-technical way — (Py)Spark | dev.to | 2021-04-23
The homework example illustrates, as I understand it, the (over-simplified) basic thinking behind Apache Spark and many similar frameworks and systems that use, for example, horizontal or vertical data “sharding”. You split the data into reasonable groups (called “partitions” in Spark’s case), chosen with the tasks you have to perform on the data in mind so that the work is efficient, and you distribute those partitions to an ideally equal number of workers (or to as many workers as your system can provide). These workers can be on the same machine or on different ones, e.g. one worker per machine (node). There must be a coordinator for this whole effort, which collects the information needed to perform the task and redistributes the load in case of failure. A (network) connection between the coordinator and the workers is also necessary, so that they can communicate and exchange data and information, or even re-partition the data, either after a failure or when the computation requires it (e.g. we calculate something on each row of data independently, but then need to group those rows by a key). There is also the concept of doing things in a “lazy” way and of using caching to keep intermediate results, so that not everything has to be calculated from scratch every time.
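The partition, distribute, and re-partition steps described above can be sketched in plain Python. This is a minimal illustration and not Spark's API: the function names (`partition`, `map_partition`, `shuffle_by_key`, `reduce_partition`, `run_job`) and the sample rows are hypothetical, and a thread pool stands in for workers that in a real cluster could live on separate machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_partitions):
    """Split rows into num_partitions roughly equal groups (round-robin)."""
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return parts


def map_partition(part):
    """Row-by-row work that one worker can do on its own partition."""
    return [(name, score * 2) for name, score in part]


def shuffle_by_key(parts, num_partitions):
    """Re-partition so that all rows sharing a key land in the same partition."""
    shuffled = [[] for _ in range(num_partitions)]
    for part in parts:
        for key, value in part:
            shuffled[hash(key) % num_partitions].append((key, value))
    return shuffled


def reduce_partition(part):
    """Sum the values per key within one partition."""
    totals = defaultdict(int)
    for key, value in part:
        totals[key] += value
    return dict(totals)


def run_job(rows, num_workers=3):
    """The 'coordinator': hands partitions to workers and collects the results."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        mapped = list(pool.map(map_partition, partition(rows, num_workers)))
        shuffled = shuffle_by_key(mapped, num_workers)
        reduced = list(pool.map(reduce_partition, shuffled))
    result = {}
    for partial in reduced:
        # After the shuffle, each key lives in exactly one partition,
        # so merging the partial results never overwrites a total.
        result.update(partial)
    return result


rows = [("ann", 1), ("bob", 2), ("ann", 3), ("cy", 4), ("bob", 5)]
print(run_job(rows))
```

The shuffle is the step that needs communication between workers: the doubling can run on each partition independently, but the per-key totals are only correct once every row for a given key sits in the same partition. Failure handling, laziness, and caching are left out of this sketch.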
TiDB is an open source distributed HTAP database compatible with the MySQL protocol
Project mention: TiGraph: 8,700x Computing Performance Achieved by Combining Graphs + the RDBMS Syntax | dev.to | 2021-04-05
The three hackers on the TiGraph team are all top developers in the TiDB community:
An easy-to-use multi SQL dialect ORM tool for Node.js
Project mention: Debugging Chronicles: Serverless offline + Sequelize | dev.to | 2021-05-02
We immediately checked the dependency updates, and indeed Sequelize had been updated. We also found an issue on their GitHub and some questions on StackOverflow mentioning a similar error. Nevertheless, reverting to the previous working version had no effect at all. The error was still happening and we had no clue what other dependency could have something to do with an error so deep in the Sequelize codebase.
CockroachDB - the open source, cloud-native distributed SQL database.
Project mention: #30DaysofAppwrite : Appwrite’s building blocks | dev.to | 2021-05-03
Appwrite uses MariaDB as the default database for project collections, documents, and all other metadata. Appwrite is agnostic to the database you use under the hood and support for more databases like Postgres, CockroachDB, MySQL and MongoDB is currently under active development! 😊
SQL powered operating system instrumentation, monitoring, and analytics.
Project mention: Is there a way to scan a network for computers running specific software (Java in this case) | reddit.com/r/sysadmin | 2021-04-26
Many options exist. OSQuery is one, and it's free, and it can be used to grab a bunch of other system information which might be useful at a later date. https://osquery.io/
ClickHouse® is a free analytics DBMS for big data
Project mention: Little Analyst in a Big Data Pond | reddit.com/r/datascience | 2021-04-29
As many have already mentioned, this is more of a data engineering problem than a data science one. Try to build ETL pipelines for storing the data in a data lake or a data warehouse (more organised). Make sure the pipelines are reliable and have a fallback mechanism to ensure consistency. Check out open-source DBs such as https://clickhouse.tech (an open-source OLAP database management system), or you can get started with Postgres as well. https://airbyte.io is an open-source project that provides data integrations/pipelines.
MyBatis SQL mapper framework for Java
A query builder for PostgreSQL, MySQL and SQLite3, designed to be flexible, portable, and fun to use.
Project mention: Generate TypeScript definitions from PostgreSQL | dev.to | 2021-04-14
I've been enjoying using the Knex.js database client for quite some time when implementing GraphQL API backends. One thing that it currently lacks, though, is the ability to generate strongly typed (TypeScript) models from the actual database schema.
Dapper - a simple object mapper for .Net (by DapperLib)
Project mention: Is it possible to convert sql string to LINQ expression? | reddit.com/r/csharp | 2021-04-26
Are you maybe searching for something like Dapper?
The official home of the Presto distributed SQL query engine for big data
Project mention: Inside Presto Optimizer | dev.to | 2021-04-19
We will use the Presto Foundation fork version 0.245 for this blog post.
An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
Project mention: TimescaleDB Raises $40M | news.ycombinator.com | 2021-05-05
Fair point about adaptive chunking. You sound like a long-term user!
There is always a trade-off between getting features to users quickly to experiment and incrementally improve, versus doing it always very conservatively.
When we launched adaptive chunking (introduced in 0.11, deprecated in 1.2), we explicitly marked it as beta and default off, to hopefully reflect that. 
The approach we are now taking with Timescale Analytics is to have an explicit distinction between experimental features (which will be part of a distinct "experimental" schema in the database, and must be expressly turned on with appropriate warnings) and stable features. Hopefully this can help find a good balance between stability and velocity, but feedback welcome!
Go MySQL Driver is a MySQL driver for Go's (golang) database/sql package (by go-sql-driver)
Project mention: Web Development in Go: Middleware, Templating, Databases & Beyond | dev.to | 2021-01-27
For example, here's how to use the MySQL driver package with database/sql:
Dolt – It's Git for Data
Project mention: Git as a NoSql Database | news.ycombinator.com | 2021-04-05
I've been very curious to explore this type of use case with askgit (https://github.com/augmentable-dev/askgit), which was designed for running simple "slice and dice" queries and aggregations on git history (and change stats) for basic analytical purposes. I've been curious about how this could be applied to a small text+git based "db". Say, for regular JSON or CSV dumps.
This also reminds me of Dolt: https://github.com/dolthub/dolt which I believe has been on HN a couple times
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Project mention: Questions you would get asked on an interview? | reddit.com/r/devops | 2021-01-28
I think the link you're looking for is https://github.com/bregman-arie/devops-exercises
The lightweight, distributed relational database built on SQLite
Project mention: Is it possible to distribute a Sqlite database across several servers? | reddit.com/r/sqlite | 2021-05-05
q - Run SQL directly on CSV or TSV files (by harelba)
The core infrastructure backend (API, database, Docker, etc). (by bitwarden)
Project mention: What are some excellent Github projects that really showcase best practices and great architecture and design? | reddit.com/r/csharp | 2021-05-05
I really enjoy reading https://github.com/bitwarden/server
A safe, extensible ORM and Query Builder for Rust
Project mention: diesel.exe - Application Error | reddit.com/r/rust | 2021-04-20
I managed to install the diesel CLI as shown on the getting started page, but when I try to run the diesel commands from the command prompt, an error box pops up saying: "The application was unable to start correctly (0xc000007b). Click OK to close the application." Apparently there was a similar issue previously (https://github.com/diesel-rs/diesel/issues/2034), but they just said it's probably some missing DLLs. How do I know which DLLs are missing? Any ideas on how to fix this issue?
Database migrations. CLI and Golang library.
Project mention: 🎉 The Create Go App project has grown to v2, but is still easier, better, faster & stronger | dev.to | 2021-05-06
postgres — configured PostgreSQL container with apply migrations (by golang-migrate/migrate tool) for backend.
Azure Data Studio is a data management tool that enables working with SQL Server, Azure SQL DB and SQL DW from Windows, macOS and Linux. (by microsoft)
Project mention: Drawbridge: What SQL Server on Linux is built on | news.ycombinator.com | 2021-01-13
Cool! How do I enable MySQL support?
This issue led me to believe it's not implemented yet: https://github.com/Microsoft/azuredatastudio/issues/4904
And searching for MySQL or MariaDB on the extensions marketplace nets zero results.
Universal command-line interface for SQL databases
Project mention: Reading database metadata (schema) | reddit.com/r/golang | 2021-04-29
A few months ago I started working on adding \d* commands to usql that would make it possible to list and describe various database objects, like tables, views, indexes, etc. I started looking for existing solutions in Go and stumbled upon this issue: https://github.com/golang/go/issues/7408
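To make "list and describe database objects" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not how usql implements its \d* commands (usql is written in Go and targets many databases). The helper names `list_objects` and `describe_table` and the sample schema are hypothetical, and the `sqlite_master` table and `PRAGMA table_info` are specific to SQLite.

```python
import sqlite3


def list_objects(conn):
    """Return (type, name) for each table, view, and index in the schema."""
    cur = conn.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE type IN ('table', 'view', 'index') "
        "ORDER BY type, name"
    )
    return cur.fetchall()


def describe_table(conn, table):
    """Return (column_name, declared_type) pairs for one table."""
    # PRAGMA arguments cannot be bound as parameters,
    # so the identifier is quoted by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]


# Hypothetical sample schema in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE INDEX idx_users_email ON users (email);
    CREATE VIEW user_emails AS SELECT email FROM users;
    """
)

print(list_objects(conn))
print(describe_table(conn, "users"))
```

Other databases expose the same kind of metadata through different catalogs, which is part of why a database-agnostic tool needs a separate introspection layer per backend.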
What are some of the best open-source SQL projects? This list will help you: