baleen
vector
| | baleen | vector |
|---|---|---|
| Mentions | 1 | 95 |
| Stars | 16 | 16,187 |
| Growth | - | 5.1% |
| Activity | 0.0 | 9.9 |
| Last commit | 8 months ago | 6 days ago |
| Language | Kotlin | Rust |
| License | BSD 3-clause "New" or "Revised" License | Mozilla Public License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
baleen
- Cue, an open-source data validation language
If you are looking to do data validation from the JVM, you may try Baleen (written in Kotlin): https://github.com/ShopRunner/baleen/
I'm one of the contributors. We created a DSL in the language to describe the data and create tests. You can then use that data description to validate JSON, CSV, Avro, and so on. One of the neat things we came up with was the concept of a data trace, which is like a stack trace but is a path through the data to a particular error.
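The data-trace idea is easy to sketch. The following is an illustrative Java toy, not the actual Baleen API (names like `validatePeople` and `DataError` are invented for this example): a validator walks the data and records the path to each failing value.

```java
// Illustrative toy, NOT the real Baleen API: demonstrates the "data trace"
// idea - each validation error carries the path through the data to the
// offending value, much like a stack trace is a path through code.
import java.util.*;

public class DataTraceDemo {
    record DataError(List<String> trace, String message) {}

    // Validate that every element of `people` has a non-empty string "name".
    static List<DataError> validatePeople(List<Map<String, Object>> people) {
        List<DataError> errors = new ArrayList<>();
        for (int i = 0; i < people.size(); i++) {
            Object name = people.get(i).get("name");
            if (!(name instanceof String s) || s.isEmpty()) {
                errors.add(new DataError(
                    List.of("people", "[" + i + "]", "name"),
                    "expected a non-empty string, got " + name));
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> people = List.of(
            Map.<String, Object>of("name", "Ada"),
            Map.<String, Object>of("age", 41));
        List<DataError> errors = validatePeople(people);
        // The trace points at the second element's missing name.
        System.out.println(String.join("/", errors.get(0).trace()));  // people/[1]/name
    }
}
```

The point is only the shape of the error: a path through the data rather than a bare message.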
vector
- FLaNK AI Weekly 18 March 2024
- Vector: A high-performance observability data pipeline
Datadog bought Timber Technologies (creators of Vector) two years ago. https://www.datadoghq.com/blog/datadog-acquires-timber-techn...
Timber definitely intended to just rock out and demolish everything else out there with their agent/forwarder/aggregator tech. But it wasn't a competitive play against OTel, in my humble opinion. Timber's whole shtick is that it integrates with everything, with really flexible, good glue logic in between: a competent multi-system (logging, metrics, eventually traces) fluentd++. OTel, I want to believe, would have been part of that original vision.
It's just taking a really really long time. One can speculate how direction & velocity might have changed since the Datadog acquisition. The lack of tracing (anywhere except Datadog, so far) materializing has been a hard hard hard & sad thing to see. OG https://github.com/vectordotdev/vector/issues/1444 and newer https://github.com/vectordotdev/vector/issues/17307
Vector is fantastic software. Currently running a multi-GB/s log pipeline with it. Vector agents as DaemonSets collecting pod and journald logs then forwarding w/ vector's protobuf protocol to a central vector aggregator Deployment with various sinks - s3, gcs/bigquery, loki, prom.
The documentation is great, but it can be hard to find examples of common patterns, although that is getting better with time and a growing audience. My pro-tip has been to prefix your searches with "vector dev".
A recent contribution added an alternative to the Prometheus Pushgateway that handles counters better: https://github.com/vectordotdev/vector/issues/10304#issuecom...
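The agent-to-aggregator topology described above can be sketched as two small Vector configs. This is a minimal, hedged sketch using Vector's documented `kubernetes_logs`, `journald`, `vector`, `aws_s3`, and `loki` component types; the addresses, bucket, and labels are placeholders, not values from the comment.

```toml
# Agent (DaemonSet): collect pod and journald logs, forward over
# Vector's native protocol to the aggregator.
[sources.pods]
type = "kubernetes_logs"

[sources.journal]
type = "journald"

[sinks.to_aggregator]
type = "vector"
inputs = ["pods", "journal"]
address = "vector-aggregator:6000"   # placeholder service address
```

```toml
# Aggregator (Deployment): receive from agents, fan out to sinks.
[sources.agents]
type = "vector"
address = "0.0.0.0:6000"

[sinks.s3]
type = "aws_s3"
inputs = ["agents"]
bucket = "my-log-archive"            # placeholder bucket
region = "us-east-1"
encoding.codec = "json"

[sinks.loki]
type = "loki"
inputs = ["agents"]
endpoint = "http://loki:3100"        # placeholder Loki address
labels.source = "vector"
encoding.codec = "json"
```

The same fan-out pattern extends to the other sinks mentioned (GCS/BigQuery, Prometheus) by adding more `[sinks.*]` tables with `inputs = ["agents"]`.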
- About reading logs
We don't pull logs, we forward logs to a centralized logging service.
- Self-hosted log parser
opensearch - Amazon's fork of Elasticsearch: https://opensearch.org/docs/latest
If you do this and have distributed log sources you'd use Logstash for, bin off Logstash and use Vector (https://vector.dev/); it's better out of the box for SaaS stuff.
- Show HN: Homelab Monitoring Setup with Grafana
I think there's nothing currently that combines both logging and metrics into one easy package and visualizes it, but it's also something I would love to have.
Vector[1] would work as the agent, being able to collect both logs and metrics. But the issue would then be storing them. I'm assuming the Elastic Stack might now be able to do both, but it's just too heavy to deal with in a small setup.
A couple of months ago I took a brief look at this when setting up logging for my own homelab (https://pv.wtf/posts/logging-and-the-homelab), mostly looking at memory usage so it would fit on my Synology. Quickwit[2] and Log-Store[3] both come with built-in web interfaces that reduce the need for Grafana, but neither of them does metrics.
[1] https://vector.dev
- Lightweight logging on RPi?
I would recommend that you run Vector as a systemd service so you don't have to worry about managing it. Here is a basic config to do that: https://github.com/vectordotdev/vector/blob/master/distribution/systemd/vector.service
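For orientation, the linked unit file boils down to something like the sketch below. This is a simplified, assumed version (the real file also sets a dedicated user and a config-validation `ExecStartPre` step); prefer the distribution's own file, and treat the paths here as placeholders.

```ini
[Unit]
Description=Vector log and metrics router
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/vector --config /etc/vector/vector.yaml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With this in place, `systemctl enable --now vector` starts it at boot and restarts it on failure.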
- Monitoring traefik access logs easily
You could have a look at Grafana Loki, it's easy to run (single binary for a small setup). Shipping your logs can be done by Promtail or something like Vector. They're both lightweight log shippers with support for Loki.
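As a hedged sketch of the Vector option (the log path, Loki endpoint, and label are placeholders, not values from the comment), shipping Traefik's access log to Loki might look like:

```toml
# Tail the Traefik access log and push it to Loki.
[sources.traefik]
type = "file"
include = ["/var/log/traefik/access.log"]   # placeholder path

[sinks.loki]
type = "loki"
inputs = ["traefik"]
endpoint = "http://localhost:3100"          # placeholder Loki address
labels.job = "traefik"
encoding.codec = "text"
```

Promtail would achieve the same with a `scrape_configs` entry pointing at the same file.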
What are some alternatives?
graylog - Free and open log management
Fluentd - Fluentd: Unified Logging Layer (project under CNCF)
agent - Vendor-neutral programmable observability pipelines.
syslog-ng - syslog-ng is an enhanced log daemon, supporting a wide range of input and output methods: syslog, unstructured text, queueing, SQL & NoSQL.
OpenSearch - 🔎 Open source distributed and RESTful search engine.
tracing - Application level tracing for Rust.
qryn - qryn is a polyglot, high-performance observability framework for ClickHouse. Ingest, store and analyze logs, metrics and telemetry traces from any agent supporting Loki, Prometheus, OTLP, Tempo, Elastic, InfluxDB and many more formats and query transparently using Grafana or any other compatible client.
thanos - Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
opensearch - OpenSearch is a collection of simple formats for the sharing of search results.
helm-charts
core - OPNsense GUI, API and systems backend
kube-prometheus - Use Prometheus to monitor Kubernetes and applications running on Kubernetes