trusearch
clickhouse-backup
| trusearch | clickhouse-backup
---|---|---
Mentions | 1 | 5
Stars | 5 | 1,154
Growth | - | 2.3%
Activity | 5.4 | 9.7
Last commit | almost 3 years ago | 8 days ago
Language | Go | Go
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
trusearch
-
Roll Your Own Minilanguages with Mini-Interpreters (1989)
These days you can benefit from minilanguages without actually implementing them. It is surprisingly easy to embed JS or Lua into your program.
Recently I was looking for Linux games that match my taste on a Russian torrent site. Its search engine is not particularly good: it only searches torrent titles, while each torrent description contains the genre, year, and a lot of other information that is well suited to machine interpretation.
So I found an XML dump of the tracker database (about 22 GB) with torrent descriptions and built a tool that processes torrent records with an arbitrary JS script to filter them. Like AWK, but suited to this specific XML input and scripted in JS.
This way I could filter games by year, genre, and game engine. The JS engine I used is implemented in pure Go, so it doesn't introduce additional runtime dependencies. With this approach it's easy to write any search query, aggregation, and so on.
BTW, here is the project link: https://github.com/Snawoot/trusearch
clickhouse-backup
-
Backing up Plausible Analytics database
I set up Plausible Analytics in my Kubernetes cluster and am trying to figure out how to properly back up and restore the ClickHouse database. I am trying to use https://github.com/AlexAkulov/clickhouse-backup but it only supports tables of the MergeTree family. Plausible uses a table named schema_migrations, which is of type TinyLog, so it's skipped during backups. This makes restores useless: the table is restored empty, so when Plausible starts it tries to run all the migrations, which fail because the other tables already exist.
-
Difference in data size for the Clickhouse backups
Already fixed in https://github.com/AlexAkulov/clickhouse-backup/issues/224
-
The ClickHouse Community
Similar experience to vulkoingim: steep learning curve but quite stable once deployed properly.
Schema management in zookeeper has been the biggest pain point for us. Occasionally individual clickhouse shards will get out of sync during a schema update, which can be hard to diagnose.
We use a heavily modified version of clickhouse-backup[1], which works well for us.
As for hands-off replica reboot: you must have an automated process to reapply the same schema which exists in zookeeper, otherwise it won't resync. If the local schema gets out of sync with that in zookeeper, then you'll have issues again.
I expect a lot of these ergonomics issues will be fixed over time. It's already much easier to use than it was 3 years ago, and even if progress on usability and reducing the learning curve is slow, the database performance makes it worth it.
[1] https://github.com/AlexAkulov/clickhouse-backup
-
ClickHouse incremental backups
clickhouse-backup allows us to perform local backups, which are always full, and either full or incremental uploads to remote storage. In my previous post I talked about how to perform full backups and uploads. Now we are going to review all the steps required to work with incremental uploads. This way we can upload a full backup to our remote storage once a week and perform incremental uploads every day in between.
-
Backup and restore with clickhouse-backup
We can automate this process thanks to clickhouse-backup.
What are some alternatives?
goroutine-inspect - An interactive tool to analyze Golang goroutine dump.
chproxy - Open-Source ClickHouse http proxy and load balancer
slackdump - Save or export your private and public Slack messages, threads, files, and users locally without admin privileges.
jaeger-clickhouse - Jaeger ClickHouse storage plugin implementation
onedump - Effortlessly database dump with one command (and zero dependencies WIP).
Trickster - Open Source HTTP Reverse Proxy Cache and Time Series Dashboard Accelerator
xj2go - Convert xml and json to go struct
s3backup - A super simple solution for backup
infernal_js - Infernal Runner CPC (HTML5)
flow-pipeline - A set of tools and examples to run a flow-pipeline (sFlow, NetFlow)
wal-g - Archival and Restoration for databases in the Cloud
clickhouse-bulk - Collects many small inserts to ClickHouse and send in big inserts