Grafana Mimir – 1B active series TSDB

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • mimir

    Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.

  • It's hard to tell exactly how this works, but judging from the tutorial's docker-compose.yml [0], it looks like this runs as a separate API next to Prometheus, and you tell Prometheus to write [1] to Mimir. I'm unclear on how reads from it work, or whether they work at all.

    Maybe I'm completely misunderstanding.

    [0] https://github.com/grafana/mimir/blob/main/docs/sources/tuto...

    [1] https://github.com/grafana/mimir/blob/main/docs/sources/tuto...

  • thanos

    Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.

  • > I can't find any other open source time series database except Mimir/Cortex which allows this much scale (clustering options in their open source version)

    The following open source time series databases can also scale horizontally to many nodes:

    - Thanos - https://github.com/thanos-io/thanos/

    - M3 - https://github.com/m3db/m3

    - Cluster version of VictoriaMetrics - https://docs.victoriametrics.com/Cluster-VictoriaMetrics.htm... (I'm CTO at VictoriaMetrics)

    > Can we use Prometheus/Mimir as general purpose time series database?

    This depends on what you mean by "general purpose time series database". Prometheus/Mimir are optimized for storing (timestamp, value) series, where the timestamp is a unix timestamp in milliseconds and the value is a floating-point number. Each series has a name and can have an arbitrary set of additional (label=value) labels. Prometheus/Mimir aren't optimized for storing and processing series of other value types such as strings (aka logs) and complex data structures (aka events and traces).

    So, if you need to store time series with floating-point values, then Prometheus/Mimir may be a good fit. Otherwise, take a look at ClickHouse [1] - it can efficiently store and process time series with values of arbitrary types.

    [1] https://clickhouse.com/
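
The data model described in the comment above can be sketched in a few lines of Python. This is only an illustration of the shape of the data (a named series, a set of label=value pairs, and samples of millisecond timestamp plus float value); the class and method names here (`Sample`, `Series`, `append`, `identity`) are hypothetical and are not taken from Prometheus or Mimir.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Sample:
    """One point in a series: unix timestamp in milliseconds plus a float value."""
    timestamp_ms: int
    value: float


@dataclass
class Series:
    """A named series with optional label=value pairs (illustrative only)."""
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    samples: List[Sample] = field(default_factory=list)

    def append(self, timestamp_ms: int, value: float) -> None:
        # Only numeric values fit this model; strings (logs) or nested
        # structures (events, traces) are rejected in this sketch.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("value must be a number, got %r" % type(value).__name__)
        self.samples.append(Sample(timestamp_ms, float(value)))

    def identity(self) -> Tuple[str, Tuple[Tuple[str, str], ...]]:
        # A series is identified by its name plus its sorted label set.
        return (self.name, tuple(sorted(self.labels.items())))


if __name__ == "__main__":
    series = Series("http_requests_total", {"job": "api", "method": "GET"})
    series.append(1_700_000_000_000, 42)
    print(series.identity(), series.samples)
```

Trying to append a string such as a log line raises `TypeError` in this sketch, which mirrors the point that values of other types belong in a different kind of store.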

  • prometheus

    The Prometheus monitoring system and time series database.

  • Which restrictions do you have in mind?

    From a quick look, the issue seems to be about avoiding Prometheus's use of local storage, but that's a Prometheus-specific problem, not a remote-read problem.

    Remote-read is a generic protocol (https://github.com/prometheus/prometheus/blob/a1121efc18ba15...): you pass a query (start/end time and matchers) and get back data.
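
As a rough sketch of the request shape described above (a time range plus label matchers in, matching samples out), here is an in-memory Python version. It is not the actual wire format, which is defined in the linked Prometheus source; the names `Matcher`, `Query`, and `read`, the operator strings, and the store layout are assumptions made for illustration. Regex matchers are treated as fully anchored, which is how Prometheus label matching is generally documented to behave.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

Labels = Dict[str, str]
Samples = List[Tuple[int, float]]  # (timestamp_ms, value) pairs


@dataclass(frozen=True)
class Matcher:
    """A label matcher; op is one of "=", "!=", "=~", "!~" (illustrative)."""
    name: str
    op: str
    value: str

    def matches(self, labels: Labels) -> bool:
        actual = labels.get(self.name, "")  # a missing label acts like ""
        if self.op == "=":
            return actual == self.value
        if self.op == "!=":
            return actual != self.value
        if self.op == "=~":
            return re.fullmatch(self.value, actual) is not None
        if self.op == "!~":
            return re.fullmatch(self.value, actual) is None
        raise ValueError("unknown matcher op: %r" % self.op)


@dataclass(frozen=True)
class Query:
    """Start/end time in milliseconds (inclusive here) plus matchers."""
    start_ms: int
    end_ms: int
    matchers: Tuple[Matcher, ...]


def read(store: List[Tuple[Labels, Samples]], query: Query) -> List[Tuple[Labels, Samples]]:
    """Return every series whose labels satisfy all matchers, cut to the time range."""
    result = []
    for labels, samples in store:
        if all(m.matches(labels) for m in query.matchers):
            in_range = [(t, v) for t, v in samples if query.start_ms <= t <= query.end_ms]
            if in_range:
                result.append((labels, in_range))
    return result
```

A backend only needs to answer this kind of query to serve reads, which is why the protocol itself says nothing about where or how the data is stored.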

  • dskit

    Distributed systems kit

  • Much of the architecture is similar; a number of components are shared in https://github.com/grafana/dskit.

  • swagger-editor

    Swagger Editor

  • Grafana alerts (before version 8) worked great. We use them, but the Grafana 8 alerting features are half-baked at best.

    * Grafana 8 alerts removed the Image Preview, which was extremely useful during issues.

    * Grafana 8 alerts don't have any way of being stored as code. In fact the API that they provide in their docs [0][1] doesn't work, or isn't up to date.

    * The expression languages have zero documentation about them, so aren't exactly useful for things that might get a developer out of bed in the middle of the night.

    [0] https://editor.swagger.io/?url=https://raw.githubusercontent...

  • cortex

    A horizontally scalable, highly available, multi-tenant, long term Prometheus. (by cortexproject)

  • Disclosure: I work for AWS, but I don't work on the Amazon Managed Service for Prometheus. I have my own very long held opinions about Free and Open Source software, and I am only speaking for myself.

    To me, the AGPLv3 license isn't about forcing software users to "give changes back" to a project. It is about giving the permissions to users of software that are necessary for Software Freedom [1] when they access a program over a network. In practice, that means that changes often flow "upstream" to copyleft licensed programs one way or another. But it was never about obligating changes to be "given back" to upstream. In my personal opinion, you should be "free to fork" Free and Open Source Software (FOSS). Indeed, the Grafana folks seem to have decided to do that with Grafana Mimir.

    Personally, I hope that they accept contributions under the AGPLv3 license, and hold themselves to the same obligations that others are held to with regard to providing corresponding source code of derivative works when it is made available to users over a network. In my personal opinion, too often companies use a contributor agreement that excuses them from those obligations, and also allows them to sell the software to others under licenses that do not carry copyleft obligations. See [2] for a blog post that goes into some detail about this.

    If you look at the Cortex project MAINTAINERS file [3], you will see that there are two folks listed who currently work at AWS, but no company other than Grafana Labs today. I would love to see more diversity in maintainers for a project like this, as I think too many maintainers from any one company isn't the best for long term project sustainability.

    I think if you look at the Cortex Community Meeting minutes [4], you can see that AWS folks are regularly "showing up" in healthy numbers, and working collaboratively with anyone who accepts the open invitation to participate. There have been some pretty big improvements to Cortex that have merged lately, like some of the work on parallel compaction [5, 6].

    TL;DR, I think it is easy to jump to some conclusions about how things are going in a FOSS project that don't hold water if you do some cursory exploration. I think the best way to know what's going on in a project is to get involved!

    --

    [1] the rights needed to: run the program for any purpose; to study how the program works, and modify it; to redistribute copies; to distribute copies of modified versions to others

    [2] https://meshedinsights.com/2021/06/14/legally-ignoring-the-l...

    [3] https://github.com/cortexproject/cortex/blob/master/MAINTAIN...

    [4] https://docs.google.com/document/d/1shtXSAqp3t7fiC-9uZcKkq3m...

    [5] https://aws.amazon.com/blogs/opensource/scaling-cortex-with-...

    [6] https://github.com/cortexproject/cortex/pull/4624

  • m3

    M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform

  • ClickHouse

    ClickHouse® is a free analytics DBMS for big data
