tiflash
tidb
Metric | tiflash | tidb |
---|---|---|
Mentions | 5 | 27 |
Stars | 929 | 36,134 |
Growth | 1.3% | 1.0% |
Activity | 9.7 | 10.0 |
Last commit | 2 days ago | 2 days ago |
Language | C++ | Go |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
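As a rough illustration of the growth figure, the sketch below estimates how many stars each project gained over the last month. It assumes growth is computed as (current - previous) / previous; the page does not state the exact formula, and `stars_gained` is a hypothetical helper name.

```python
def stars_gained(current_stars: int, growth_percent: float) -> float:
    """Estimate the stars gained over the last month.

    Assumption: growth = (current - previous) / previous. The page only
    says "month over month growth", so this formula is a guess.
    """
    previous = current_stars / (1 + growth_percent / 100)
    return current_stars - previous


# Star counts and growth percentages taken from the table above.
for name, stars, growth in [("tiflash", 929, 1.3), ("tidb", 36134, 1.0)]:
    gained = stars_gained(stars, growth)
    print(f"{name}: roughly {gained:.0f} stars gained last month")
```

Under that assumption, a similar growth percentage translates into very different absolute numbers, because tidb starts from a much larger base.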
tiflash
-
Significantly faster quicksort using SIMD
This is great, and can definitely help a lot of database and big data projects. I can immediately imagine this being a perfect match for one open source HTAP system (https://github.com/tigraph/tidb), which uses SIMD in its columnar processing engine TiFlash (https://github.com/pingcap/tiflash).
-
Best language for database kernel development?
One of the founders of TiDB/TiKV here, from [PingCAP](https://pingcap.com).
I thought about this problem with my peers when I started to build [TiDB](https://github.com/pingcap/tidb) seven years ago. At that time, nearly all of us were familiar with Go, so we decided to use Go to build the SQL layer of TiDB. Thanks to Go, we could develop TiDB very quickly, and we released the first MVP in half a year. I clearly remember the feeling when we first ran TPC-C successfully. Although the tpmC was just 1 at that time, it was a good start for us.
But Go had some problems: the GC was not good back then, the fair scheduling could cause latency problems, and data races could sometimes happen. So when we decided to build a distributed storage layer (aha, [TiKV](https://github.com/tikv/tikv)), we wanted to use another language to guarantee safety. I really admire our courage - we chose Rust, which had just reached 1.0 and was missing lots of libraries at that time. Now it seems that this was an awesome choice: TiKV has graduated from CNCF and is used as a building block not only for TiDB, but also for other distributed systems. Thanks, Rust.
When TiDB started being used in many companies, we found that our customers not only ran lots of online transactions in TiDB, but also wanted to run some real-time analytic queries directly, because the data was already in TiDB. So we decided to build an HTAP database by introducing a column store beside TiKV: this is [TiFlash](https://github.com/pingcap/tiflash). We built TiFlash based on ClickHouse, so of course we use C++.
As you can see, to build just one integrated database, TiDB, we use at least three languages, and each language was introduced for its own reason. We can treat the distributed database as a service system: each service can be built in your favorite language, and the services are linked by gRPC, as TiDB does now. You may object: "hey, guys, you are building a database, performance is very important". Yes, this is true, but we are also building a complex distributed system, especially on the cloud. Scale-out, elasticity, and user experience are important too. This is a trade-off for an engineer :-)
- TiFlash: The columnar storage engine of TiDB, is now open sourced
- Tiflash, Yet another columnar storage engine based on ClickHouse
- TiFlash: Analytical Engine for TiDB
tidb
-
A MySQL compatible database engine written in pure Go
tidb has been around for a while, it is distributed, written in Go and Rust, and MySQL compatible. https://github.com/pingcap/tidb
Somewhat relatedly, StarRocks is also MySQL compatible, written in Java and C++, but it's tackling OLAP use-cases. https://github.com/StarRocks/starrocks
-
Show HN: GitHub Organization Analytics
It's a MySQL-compatible database for scale and real-time analytics https://github.com/pingcap/tidb
- TiDB: An open-source distributed MySQL compatible database
- TiDB: Open-source, cloud-native, distributed, MySQL compatible database
- Embed hard-coded SQL into binaries for a cleaner look!
-
Ask HN: Who is hiring? (January 2023)
PingCAP | https://www.pingcap.com | Database Engineer, Product Manager, Developer Advocate and more | Remote in California | Full-time
We work on a MySQL compatible distributed database called TiDB https://github.com/pingcap/tidb/ and key-value store called TiKV.
TiDB is written in Go and TiKV is written in Rust.
More roles and locations are available on https://www.pingcap.com/careers/
-
A database written purely in Go
Look up CockroachDB or TiDB
- MySQL-mimic - Python implementation of the MySQL server wire protocol.
- Apache Pegasus – A distributed key-value storage system
-
What is your experience with mixed workload (OLTP and OLAP) databases?
OLTP usually comes with a high throughput of transactions, which usually means the ratio of writes (e.g., IUD: insert, update, delete) to reads (e.g., select) is above 4 or 5, or even higher. There are some good benchmarks for testing OLTP workloads, like TPC-C (https://www.tpc.org/tpcc/), and some for testing OLAP workloads, like TPC-H (https://www.tpc.org/tpch/). For mixed or hybrid OLTP and OLAP (called HTAP; see this blog for some background https://en.pingcap.com/blog/the-beauty-of-htap-tidb-and-allo...), TPC-H was originally designed for this; however, it has several drawbacks and doesn't actually reflect real-world workloads. A newer research work from UC Berkeley proposed an HTAP benchmark called TAOBench (https://www.vldb.org/pvldb/vol15/p1965-cheng.pdf), which is pretty interesting and worth checking out.
As for HTAP systems, as mentioned in the blog above, there are quite a few industrial products, such as Google's recently announced AlloyDB (https://cloud.google.com/alloydb), Snowflake's UniStore (https://www.snowflake.com/workloads/unistore/), and one of the most popular open source projects, TiDB (https://github.com/pingcap/tidb), which has been deployed by many business applications.
Hopefully these may help a little bit :-)
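To make the write-to-read ratio from the comment above concrete, here is a minimal sketch that classifies SQL statements by their leading keyword and computes the ratio. The function name, the keyword sets, and the sample workload are all hypothetical; a real workload analysis would rely on the database's own statement statistics rather than string matching.

```python
# Hypothetical classification by leading SQL keyword (an illustration only).
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}
READ_VERBS = {"SELECT"}


def write_read_ratio(statements):
    """Return writes / reads for a list of SQL statement strings.

    Statements that are neither writes nor reads are ignored. If there
    are no reads, returns infinity when writes exist and 0.0 otherwise.
    """
    writes = 0
    reads = 0
    for sql in statements:
        words = sql.split()
        if not words:
            continue  # skip empty strings
        verb = words[0].upper()
        if verb in WRITE_VERBS:
            writes += 1
        elif verb in READ_VERBS:
            reads += 1
    if reads == 0:
        return float("inf") if writes else 0.0
    return writes / reads


# Made-up sample workload: 8 writes and 2 reads.
sample = (
    ["INSERT INTO orders VALUES (1)"] * 5
    + ["UPDATE orders SET qty = 2 WHERE id = 1"] * 3
    + ["SELECT * FROM orders WHERE id = 1"] * 2
)
print(f"write/read ratio: {write_read_ratio(sample):.1f}")
```

A ratio in the range the comment mentions would correspond to a write-heavy workload by that description; lower values mean a more read-heavy mix.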
What are some alternatives?
vops
vitess - Vitess is a database clustering system for horizontal scaling of MySQL.
cockroach - CockroachDB - the open source, cloud-native distributed SQL database.
oceanbase - OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.
InfluxDB - Scalable datastore for metrics, events, and real-time analytics
go-mysql-elasticsearch - Sync MySQL data into elasticsearch
go-mysql - a powerful mysql toolset with Go
dgraph - The high-performance database for modern applications
kingshard - A high-performance MySQL proxy
jaeger - CNCF Jaeger, a Distributed Tracing Platform
etcd - Distributed reliable key-value store for the most critical data of a distributed system
migrate - Database migrations. CLI and Golang library.