fst
vector
Metric | fst | vector
---|---|---
Mentions | 11 | 95
Stars | 1,698 | 16,187
Growth | - | 5.1%
Activity | 3.5 | 9.9
Last commit | 2 months ago | 6 days ago
Language | Rust | Rust
License | The Unlicense | Mozilla Public License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
fst
-
How to use mmap safely in Rust?
The fst crate effectively relies on mmap for it to work right. The folks here suggesting you just use the heap might be right, but only if using the heap is actually plausible. If your dictionary is GBs big (an FST might be bigger than available memory), then copying it to the heap first would be disastrous.
-
Official /r/rust "Who's Hiring" thread for job-seekers and job-offerers [Rust 1.64]
You'll love what we're working on if you're interested in the implementation of: Tantivy, Meilisearch, Finite State Transducers
-
rustc is unacceptably slow compiling long lists of constant slices
Here's an example of longest prefix matching using a FST which I based my approach on: https://github.com/BurntSushi/fst/pull/104/files
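For readers unfamiliar with the idea, longest prefix matching can be sketched with a plain byte trie: walk the input one byte at a time and remember the last node that marked the end of a key. This is only an illustration of the walk, not the fst crate's API and not the code in the linked PR; `Trie`, `Node`, `insert` and `longest_prefix` are hypothetical names. An FST follows the same kind of walk over a much more compact automaton.

```rust
use std::collections::BTreeMap;

/// One node of a plain byte trie (an illustrative stand-in for an FST state).
#[derive(Default)]
struct Node {
    children: BTreeMap<u8, Node>,
    /// Set when a key ends at this node.
    value: Option<u64>,
}

/// Hypothetical container type; not part of any real crate.
#[derive(Default)]
struct Trie {
    root: Node,
}

impl Trie {
    /// Inserts `key` with an associated `value`, creating nodes as needed.
    fn insert(&mut self, key: &[u8], value: u64) {
        let mut node = &mut self.root;
        for &b in key {
            node = node.children.entry(b).or_default();
        }
        node.value = Some(value);
    }

    /// Returns the length and value of the longest stored key that is a
    /// prefix of `input`, or `None` if no stored key is a prefix.
    fn longest_prefix(&self, input: &[u8]) -> Option<(usize, u64)> {
        let mut node = &self.root;
        // The empty key matches with length 0 if it was inserted.
        let mut best = node.value.map(|v| (0, v));
        for (i, b) in input.iter().enumerate() {
            match node.children.get(b) {
                Some(next) => node = next,
                None => break, // no transition: stop and keep the last match
            }
            if let Some(v) = node.value {
                best = Some((i + 1, v));
            }
        }
        best
    }
}
```

With the keys `foo` and `foobar` inserted, looking up `foob` stops at the last key boundary it passed and reports `foo`.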
-
Official /r/rust "Who's Hiring" thread for job-seekers and job-offerers [Rust 1.63]
Finite State Transducers
-
Wikit Desktop - A dictionary application using tauri GUI framework
As a result, I planned to implement a desktop version, and today I finished a beta. The desktop app is based on tauri, and the dictionary index uses an FST (it is an awesome index structure).
-
WordBueno.com online dictionary. Fast, no frills, mobile friendly.
WordBueno’s data is currently derived from Wiktionary. The backend is using Rust’s warp with fst for indexing.
- Show HN: WordBueno: sleek dictionary built with Rust and Svelte
-
Speed of Rust vs. C
No you don't. I've written multiple programs that load things instantly off the file system via memory maps. See the fst crate[1], for example, which is designed to work with memory maps.
Rust "works badly with memory mapped files" doesn't mean, "Rust can't use memory mapped files." It means, "it is difficult to reconcile Rust's safety story with memory maps." ripgrep for example uses memory maps because they are faster sometimes, and its safety contract[2] is a bit strained. But it works.
[1] - https://github.com/BurntSushi/fst/
[2] - https://docs.rs/grep-searcher/0.1.7/grep_searcher/struct.Mma...
-
Debian discusses vendoring again
Good catch. That's a lapse on my part. I typically would not use a crate for something like that. I've implemented fnv several times: here, here and here. Looks like I just didn't do that for globset.
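To give a sense of why fnv is small enough to write by hand rather than pull in as a dependency, here is a minimal sketch of the hash. It assumes the 64-bit FNV-1a variant; the comment above does not say which variant was used, and `fnv1a_64` is a hypothetical name, not the code from globset or any of the linked implementations.

```rust
/// 64-bit FNV offset basis (published FNV constant).
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
/// 64-bit FNV prime (published FNV constant).
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Hashes `bytes` with 64-bit FNV-1a: XOR each byte into the state,
/// then multiply by the prime, wrapping on overflow.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}
```

Hashing an empty slice returns the offset basis unchanged, which makes a quick sanity check.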
vector
- FLaNK AI Weekly 18 March 2024
-
Vector: A high-performance observability data pipeline
Datadog bought Timber Technologies (creators of Vector) two years ago. https://www.datadoghq.com/blog/datadog-acquires-timber-techn...
Timber definitely intended to just rock out & demolish everything else out there with their agent/forwarder/aggregator tech. But it wasn't a competitive play against OTel, in my humble opinion. Timber's whole shtick is that it integrates with everything, with really flexible/good glue logic in-between. A competent multi-system (logging, metrics, eventually traces) fluentd++. OTel - I want to believe - would have been part of that original vision.
It's just taking a really really long time. One can speculate how direction & velocity might have changed since the Datadog acquisition. The lack of tracing (anywhere except Datadog, so far) materializing has been a hard hard hard & sad thing to see. OG https://github.com/vectordotdev/vector/issues/1444 and newer https://github.com/vectordotdev/vector/issues/17307
Vector is fantastic software. Currently running a multi-GB/s log pipeline with it. Vector agents as DaemonSets collecting pod and journald logs then forwarding w/ vector's protobuf protocol to a central vector aggregator Deployment with various sinks - s3, gcs/bigquery, loki, prom.
The documentation is great but it can be hard to find examples of common patterns, although it's getting better with time and a growing audience.
My pro-tip has been to prefix your searches with "vector dev".
A recent contribution added an alternative to prometheus pushgateway that handles counters better: https://github.com/vectordotdev/vector/issues/10304#issuecom...
-
About reading logs
We don't pull logs, we forward logs to a centralized logging service.
-
Self hosted log parser
opensearch - Amazon fork of Elasticsearch: https://opensearch.org/docs/latest
If you do this and have distributed log sources you'd use logstash for, bin off logstash and use vector (https://vector.dev/); it's better out of the box for SaaS stuff.
-
Show HN: Homelab Monitoring Setup with Grafana
I think there's nothing currently that combines both logging and metrics into one easy package and visualizes it, but it's also something I would love to have.
Vector[1] would work as the agent, being able to collect both logs and metrics. But the issue would then be storing it. I'm assuming the Elastic Stack might now be able to do both, but it's just too heavy to deal with in a small setup.
A couple of months ago I took a brief look at that when setting up logging for my own homelab (https://pv.wtf/posts/logging-and-the-homelab). Mostly looking at the memory usage to fit it on my synology. Quickwit[2] and Log-Store[3] both come with built in web interfaces that reduce the need for grafana, but neither of them do metrics.
- [1] https://vector.dev
-
Lightweight logging on RPi?
I would recommend that you run vector as a systemd service so you don't have to worry about managing it. Here is a basic config to do that: https://github.com/vectordotdev/vector/blob/master/distribution/systemd/vector.service
-
Monitoring traefik access logs easily
You could have a look at Grafana Loki, it's easy to run (single binary for a small setup). Shipping your logs can be done by Promtail or something like Vector. They're both lightweight log shippers with support for Loki.
- Ask HN: How to build an image search service?
What are some alternatives?
graylog - Free and open log management
Fluentd - Fluentd: Unified Logging Layer (project under CNCF)
agent - Vendor-neutral programmable observability pipelines.
syslog-ng - syslog-ng is an enhanced log daemon, supporting a wide range of input and output methods: syslog, unstructured text, queueing, SQL & NoSQL.
OpenSearch - 🔎 Open source distributed and RESTful search engine.
tracing - Application level tracing for Rust.
qryn - qryn is a polyglot, high-performance observability framework for ClickHouse. Ingest, store and analyze logs, metrics and telemetry traces from any agent supporting Loki, Prometheus, OTLP, Tempo, Elastic, InfluxDB and many more formats and query transparently using Grafana or any other compatible client.
thanos - Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
opensearch - OpenSearch is a collection of simple formats for the sharing of search results.
helm-charts
core - OPNsense GUI, API and systems backend
kube-prometheus - Use Prometheus to monitor Kubernetes and applications running on Kubernetes