-
A related database using ideas from Clickhouse:
https://github.com/VictoriaMetrics/VictoriaMetrics
-
I found Apache Lucene really easy to use, but I haven't used it at scale:
https://lucene.apache.org/
-
Promtail/Loki (https://github.com/grafana/loki) is an alternative to ELK. It seems more lightweight, but it is definitely less featureful. The integration with Grafana/Prometheus seems nice, but I've only toyed with it, not used it in production.
-
Typesense
Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
-
> SQL is a perfect language for analytics.
Slightly off topic, but I strongly agree with this statement and wonder why the languages used for a lot of data science work (R, Python) don't have such a strong focus on SQL.
It might just be my brain, but SQL makes so much logical sense as a query language and, with small variances, is used to directly query so many databases.
In R, why learn the data.table (OK, speed) or dplyr paradigms, when SQL can be easily applied directly to dataframes? There are libraries to support this like sqldf[1], tidyquery[2] and duckdf[3] (author). And I'm sure the situation is similar in Python.
This is not a post against great libraries like data.table and dplyr, which I do use from time to time. It's more of a question about why SQL is not more popular as the query language du jour for data science.
[1] https://cran.r-project.org/web/packages/sqldf/index.html
[2] https://github.com/ianmcook/tidyquery
[3] https://github.com/phillc73/duckdf
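For the Python side, here is a minimal sketch of the same idea using only the standard library: copy the rows into an in-memory SQLite table and query them with plain SQL. The sample data, the table name `df`, and the helper `query_rows` are made up for illustration; dedicated libraries handle real dataframes and types far better than this.

```python
import sqlite3

# Hypothetical sample data standing in for a data frame (a list of row tuples).
rows = [
    ("setosa", 5.1),
    ("setosa", 4.9),
    ("virginica", 6.3),
    ("virginica", 5.8),
]


def query_rows(rows, sql):
    """Load rows into a temporary in-memory SQLite table named df and run sql on it."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE df (species TEXT, sepal_length REAL)")
        con.executemany("INSERT INTO df VALUES (?, ?)", rows)
        return con.execute(sql).fetchall()
    finally:
        con.close()


# Group and aggregate with ordinary SQL instead of a dataframe-specific API.
result = query_rows(
    rows,
    "SELECT species, ROUND(AVG(sepal_length), 2) AS mean_len "
    "FROM df GROUP BY species ORDER BY species",
)
print(result)
```

The copy into SQLite is the obvious cost of this approach, which is one reason it can be slow on large data.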
-
MeiliSearch
A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.
https://github.com/meilisearch/MeiliSearch has been getting a lot of traction recently. There is also Sphinx and its fork https://manticoresearch.com/, which are very lightweight and fast.
-
Yeah, I agree sqldf is quite slow. Fair point.
As you've seen, duckdb registers an "R data frame as a virtual table." I'm not sure what they mean by "yet" either.
Of course it is possible to write an R dataframe to an on-disk duckdb table, if that's what you want to do.
There are some simple benchmarks on the bottom of the duckdf README[1]. Essentially I found for basic SQL SELECT queries, dplyr is quicker, but for much more complex queries, the duckdf/duckdb combination performs better.
If you really want speed of course, just use data.table.
[1] https://github.com/phillc73/duckdf
-
> SQL doesn't compose all that well.
On that topic, I really enjoy working in Elixir because Ecto [1] lets you write "SQL" with Elixir's composable functional syntax. It sits somewhere between "the language is compiled to SQL" and an ORM. The Ruby-esque syntax took some getting used to, but once I was past that hurdle my productivity skyrocketed. It isn't 100% feature-complete across all the different SQL dialects, but most of what you'll need is there.
[1] https://github.com/elixir-ecto/ecto
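To make "composable" concrete, here is a rough Python sketch of the general idea: small functions that each take a query value and return a refined one, compiled to SQL only at the end. This is not Ecto's API; the `Query` class and the helpers `published` and `by_author` are hypothetical names invented for this example, and the `posts` table is sample data.

```python
import sqlite3
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Query:
    """Immutable query description; each method returns a new Query."""
    table: str
    wheres: tuple = ()
    params: tuple = ()
    order: str = ""

    def where(self, clause, *params):
        return replace(self, wheres=self.wheres + (clause,),
                       params=self.params + params)

    def order_by(self, column):
        return replace(self, order=column)

    def to_sql(self):
        # Table and column names are trusted here; only values are parameterized.
        sql = f"SELECT * FROM {self.table}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        if self.order:
            sql += f" ORDER BY {self.order}"
        return sql, self.params


# Reusable fragments that can be stacked in any combination.
def published(q):
    return q.where("published = ?", 1)


def by_author(q, name):
    return q.where("author = ?", name)


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE posts (title TEXT, author TEXT, published INTEGER)")
con.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [("b-post", "ann", 1), ("a-post", "ann", 1),
     ("draft", "ann", 0), ("other", "bob", 1)],
)

sql, params = by_author(published(Query("posts")), "ann").order_by("title").to_sql()
rows = con.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Because every step returns a new value, fragments like `published` can be shared across queries without string concatenation at each call site, which is the property raw SQL strings lack.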
-
* you can go the other way too: read Clickhouse from PostgreSQL (see https://github.com/Percona-Lab/clickhousedb_fdw, although we didn't try this)
-
Could you provide more details about the limited JOIN capabilities? AFAIK, ClickHouse has multiple join algorithms and supports on-disk joins to avoid running out of memory:
https://github.com/ClickHouse/ClickHouse/issues/10830
https://github.com/ClickHouse/ClickHouse/issues/9702#issueco...
-
I just wish there was a FOSS Loki-like solution built on ClickHouse that was stable and used in production.
I know there are a few projects (see below), but I'm not aware of anything mature.
https://github.com/QXIP/cloki-go
https://github.com/lmangani/cloki
-
sonic
🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.
I'm personally very fond of sonic [0] for full text search.
> Sonic can be used as a simple alternative to super-heavy and full-featured search backends such as Elasticsearch in some use-cases. It is capable of normalizing natural language search queries, auto-completing a search query and providing the most relevant results for a query....
> When reviewing Elasticsearch (ELS) and others, we found those were full-featured heavyweight systems that did not scale well with Crisp's freemium-based cost structure.
> At the end, we decided to build our own search backend, designed to be simple and lightweight on resources
[0] - https://github.com/valeriansaliou/sonic
-
It is good. I couldn't find any CDC for Postgres for incremental sync, so I had to use bulk update/sync, which occasionally causes performance issues. Also, some Algolia features are not available yet: https://github.com/meilisearch/instant-meilisearch/issues/21...