ClickHouse as an alternative to Elasticsearch for log storage and analysis

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • VictoriaMetrics

    VictoriaMetrics: fast, cost-effective monitoring solution and time series database

  • A related database using ideas from ClickHouse:

    https://github.com/VictoriaMetrics/VictoriaMetrics

  • Apache Solr

    Apache Lucene and Solr open-source search software

  • I found Apache Lucene really easy to use, but I haven't used it at scale:

    https://lucene.apache.org/

  • loki

    Like Prometheus, but for logs.

  • Promtail/Loki (https://github.com/grafana/loki) is an alternative to ELK. It seems more lightweight, but it definitely has fewer features. The integration with Grafana/Prometheus seems nice, but I've only toyed with it and haven't used it in production.

  • Typesense

    Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

  • tidyquery

    Query R data frames with SQL

  • > SQL is a perfect language for analytics.

    Slightly off topic, but I strongly agree with this statement and wonder why the languages used for a lot of data science work (R, Python) don't have such a strong focus on SQL.

    It might just be my brain, but SQL makes so much logical sense as a query language and, with small variances, is used to directly query so many databases.

    In R, why learn the data.table (OK, speed) or dplyr paradigms when SQL can easily be applied directly to data frames? There are libraries that support this, such as sqldf[1], tidyquery[2] and duckdf[3] (of which I am the author). And I'm sure the situation is similar in Python.

    This is not a post against great libraries like data.table and dplyr, which I do use from time to time. It's more of a question about why SQL is not more popular as the query language du jour for data science.

    [1] https://cran.r-project.org/web/packages/sqldf/index.html

    [2] https://github.com/ianmcook/tidyquery

    [3] https://github.com/phillc73/duckdf
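
To illustrate the comment's point that SQL can be applied directly to in-memory tabular data, here is a minimal sketch in Python. It uses only the standard library `sqlite3` module rather than any of the R packages named above. The sample rows, the `logs` table name, and the `query_rows` helper are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical sample data standing in for a data frame:
# each row is (service, level, latency_ms).
rows = [
    ("api", "error", 120),
    ("api", "info", 35),
    ("worker", "error", 300),
    ("worker", "error", 280),
    ("api", "error", 90),
]


def query_rows(rows, sql):
    """Load rows into an in-memory SQLite table named `logs` and run `sql` on it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE logs (service TEXT, level TEXT, latency_ms INTEGER)"
        )
        conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Count errors and average their latency per service.
result = query_rows(
    rows,
    """
    SELECT service, COUNT(*) AS errors, AVG(latency_ms) AS avg_latency
    FROM logs
    WHERE level = 'error'
    GROUP BY service
    ORDER BY service
    """,
)
print(result)
```

Copying the rows into SQLite is the simplest approach but not the fastest. The R libraries mentioned in the comment may avoid such a copy, so this sketch should not be read as a description of how they work.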

  • MeiliSearch

    A lightning-fast search API that fits effortlessly into your apps, websites, and workflow

  • https://github.com/meilisearch/MeiliSearch has been getting a lot of traction recently. There is also Sphinx and its fork https://manticoresearch.com/, which is very lightweight and fast.

  • duckdf

    🦆 SQL for R dataframes, with ducks

  • Yeah, I agree sqldf is quite slow. Fair point.

    As you've seen, duckdb registers an "R data frame as a virtual table." I'm not sure what they mean by "yet" either.

    Of course it is possible to write an R dataframe to an on-disk duckdb table, if that's what you want to do.

    There are some simple benchmarks at the bottom of the duckdf README[1]. Essentially, I found that dplyr is quicker for basic SQL SELECT queries, but the duckdf/duckdb combination performs better for much more complex queries.

    If you really want speed of course, just use data.table.

    [1] https://github.com/phillc73/duckdf

  • ecto

    A toolkit for data mapping and language integrated query.

  • > SQL doesn't compose all that well.

    On that topic, I really enjoy working in Elixir because Ecto [1] lets you write "SQL" with Elixir's composable functional syntax. It sits somewhere between "the language is compiled to SQL" and an ORM. The Ruby-esque syntax took some getting used to, but once I was past that hurdle my productivity skyrocketed. It isn't 100% feature-complete across all the different SQL dialects, but most of what you'll need is there.

    [1] https://github.com/elixir-ecto/ecto
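
The composability described in the comment above can be sketched roughly in Python. This is not Ecto's API and does not reflect how Ecto is implemented. The `Query` class and the `only_errors` and `slower_than` helpers are hypothetical names invented for illustration, and the result is run against the standard library `sqlite3` module only to show that the generated SQL is valid.

```python
import sqlite3
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Query:
    """Hypothetical immutable query description; each method returns a new Query."""
    table: str
    columns: tuple = ("*",)
    conditions: tuple = ()  # SQL fragments with ? placeholders
    params: tuple = ()      # values bound to those placeholders

    def select(self, *columns):
        return replace(self, columns=columns)

    def where(self, condition, *params):
        return replace(
            self,
            conditions=self.conditions + (condition,),
            params=self.params + params,
        )

    def to_sql(self):
        # Table and column names are interpolated directly, so they must
        # come from trusted code, never from user input.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql, self.params


# Small reusable filters that can be combined in any order.
def only_errors(query):
    return query.where("level = ?", "error")


def slower_than(query, ms):
    return query.where("latency_ms > ?", ms)


base = Query("logs")
slow_errors = slower_than(only_errors(base), 100).select("service", "latency_ms")
sql, params = slow_errors.to_sql()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (service TEXT, level TEXT, latency_ms INTEGER)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [("api", "error", 120), ("api", "info", 35),
     ("worker", "error", 300), ("api", "error", 90)],
)
matches = conn.execute(sql + " ORDER BY latency_ms", params).fetchall()
conn.close()
print(sql, params, matches)
```

Because every method returns a new `Query`, `base` is left untouched and the same filters can be reused to build other queries. That reuse is the property that raw SQL strings make awkward.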

  • clickhousedb_fdw

    PostgreSQL's Foreign Data Wrapper For ClickHouse

  • You can go the other way too: read ClickHouse from PostgreSQL (see https://github.com/Percona-Lab/clickhousedb_fdw, although we didn't try this).

  • ClickHouse

    ClickHouse® is a free analytics DBMS for big data

  • Could you provide more details about the limited JOIN capabilities? AFAIK, ClickHouse has multiple join algorithms and supports on-disk joins to avoid running out of memory:

    https://github.com/ClickHouse/ClickHouse/issues/10830

    https://github.com/ClickHouse/ClickHouse/issues/9702#issueco...

  • cloki-go-legacy

    Discontinued Clickhouse Loki API in GO (WIP)

  • I just wish there was a FOSS Loki-like solution built on ClickHouse that was stable and used in production.

    I know there are a few projects (see below), but I'm not aware of anything mature.

    https://github.com/QXIP/cloki-go

    https://github.com/lmangani/cloki

  • sonic

    🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.

  • I'm personally very fond of sonic [0] for full text search.

    > Sonic can be used as a simple alternative to super-heavy and full-featured search backends such as Elasticsearch in some use-cases. It is capable of normalizing natural language search queries, auto-completing a search query and providing the most relevant results for a query....

    > When reviewing Elasticsearch (ELS) and others, we found those were full-featured heavyweight systems that did not scale well with Crisp's freemium-based cost structure.

    > At the end, we decided to build our own search backend, designed to be simple and lightweight on resources

    [0] - https://github.com/valeriansaliou/sonic

  • meilisearch-js-plugins

    The search client to use Meilisearch with InstantSearch.

  • It is good. However, I couldn't find any CDC (change data capture) for Postgres to do incremental sync, so I had to use bulk update/sync, which occasionally causes performance issues. Also, some Algolia features are not available yet: https://github.com/meilisearch/instant-meilisearch/issues/21...

