| | duckdf | VictoriaMetrics |
|---|---|---|
| Mentions | 3 | 97 |
| Stars | 41 | 10,900 |
| Growth | - | 2.3% |
| Activity | 0.0 | 9.9 |
| Last commit | 4 months ago | about 15 hours ago |
| Language | R | Go |
| License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
duckdf
-
DuckDB – in-process SQL OLAP database management system
Quite a while ago, when duckdb was just a duckling, I wrote an R package that supported direct manipulation of R dataframes using SQL.[1] duckdb was the engine for this.
The approach was never as fast as data.table but did approach the speed of dplyr for more complex queries.
Life had other things in store for me and I haven’t touched this library for a while now.
At the time there was no Julia connector for duckdb, but now that there is, I’d like to try this approach in that language.
[1] https://github.com/phillc73/duckdf
-
ClickHouse as an alternative to Elasticsearch for log storage and analysis
Yeah, I agree sqldf is quite slow. Fair point.
As you've seen, duckdb registers an "R data frame as a virtual table." I'm not sure what they mean by "yet" either.
Of course it is possible to write an R dataframe to an on-disk duckdb table, if that's what you want to do.
There are some simple benchmarks on the bottom of the duckdf README[1]. Essentially I found for basic SQL SELECT queries, dplyr is quicker, but for much more complex queries, the duckdf/duckdb combination performs better.
If you really want speed of course, just use data.table.
[1] https://github.com/phillc73/duckdf
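The general idea discussed in this comment, running SQL against tabular data held in memory, can be sketched as follows. This is only an illustration under loud assumptions: it uses Python's built-in `sqlite3` module as a stand-in, not duckdb or duckdf, and it copies the rows into an in-memory database rather than registering them as a virtual table. The table name `df` and its columns are hypothetical.

```python
import sqlite3

# Hypothetical in-memory "data frame": a list of (group, value) rows.
rows = [("a", 1.0), ("a", 3.0), ("b", 10.0)]

# sqlite3 is a stand-in engine here; duckdb is NOT used in this sketch.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE df (grp TEXT, value REAL)")

# Unlike a virtual-table registration, this copies the rows into the engine.
con.executemany("INSERT INTO df VALUES (?, ?)", rows)

# Query the copied data with plain SQL.
result = con.execute(
    "SELECT grp, AVG(value) FROM df GROUP BY grp ORDER BY grp"
).fetchall()
con.close()

print(result)
```

The copy step is the main difference from the virtual-table approach quoted above, which lets the engine read the data frame in place.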
-
Julia 1.6: what has changed since Julia 1.0?
That's a really good point that I'd not really thought about. I'd never really considered the difference between calling just functions versus macros.
Thinking about Query.jl and DataFramesMeta.jl, and I am for sure not an expert in either, I can't specifically speak to your `head` example, but other base functions can be combined with macros. For example, see the LINQ examples from DataFramesMeta.jl[1] where `mean` is being used. Or again the LINQ style examples in Query.jl[2], where `descending` is used in the first example, or `length` later in the Grouping examples.
Is that the kind of thing you meant?
For whatever reason, with the way my brain is wired, the LINQ style of query just works for me. I have never directly used LINQ, but do have some SQL experience. In fact, I wrote some dinky little wrapper functions[3] around duckdb[4] so I could directly query R dataframes and datatables with SQL using that backend, rather than sqldf[5].
[1] https://juliadata.github.io/DataFramesMeta.jl/stable/#@linq-...
[2] https://www.queryverse.org/Query.jl/stable/linqquerycommands...
[3] https://github.com/phillc73/duckdf
[4] https://duckdb.org/
[5] https://cran.r-project.org/web/packages/sqldf/index.html
VictoriaMetrics
-
OpenTelemetry Is Too Complicated
VictoriaMetrics CTO here.
The referred library is the official OpenTelemetry package for reading metrics in the Go language [1] - more details are available at [2].
Note that we at VictoriaMetrics like the idea of a unified observability standard such as OpenTelemetry. The issue is the current otel implementation: it is bloated and very inefficient. This conflicts with our experience of observability use cases, which need a highly optimized format for metrics transfer in order to reduce the CPU and network costs of transferring and processing these metrics.
VictoriaMetrics continues investing in OpenTelemetry by providing integration docs [3] and improving the existing functionality for otel metrics ingestion [4].
[1] https://github.com/open-telemetry/opentelemetry-proto-go
[2] https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2570...
[3] https://docs.victoriametrics.com/guides/getting-started-with...
[4] https://github.com/VictoriaMetrics/VictoriaMetrics/issues/60...
-
Observability at KubeCon + CloudNativeCon Europe 2024 in Paris
Victoria Metrics
- All you need is Wide Events, not "Metrics, Logs and Traces"
-
Top 11 Grafana Alternatives in 2023
VictoriaMetrics is primarily a time-series database designed for efficiently storing and querying time-series data. It is often used as a back-end data store for time-series data generated by monitoring systems like Prometheus. VictoriaMetrics excels at handling large volumes of time-series data, offering efficient storage and query capabilities.
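To make "time-series data generated by monitoring systems like Prometheus" concrete, here is a minimal sketch that formats one sample as a line in the Prometheus text exposition format (metric name, labels, value, optional millisecond timestamp). The helper names `escape_label_value` and `format_sample`, and the example metric, are hypothetical and belong to neither Prometheus nor VictoriaMetrics.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name, labels, value, timestamp_ms=None):
    """Render one sample as a Prometheus text-format line (hypothetical helper)."""
    label_str = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    line = f"{name}{{{label_str}}}" if label_str else name
    line += f" {value}"
    if timestamp_ms is not None:
        line += f" {timestamp_ms}"  # timestamps are in milliseconds
    return line


line = format_sample("cpu_usage", {"host": "web-1", "core": "0"}, 0.75, 1700000000000)
print(line)
```

Each distinct combination of metric name and label values forms one time series, which is why label choices drive storage and query cost.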
-
InfluxDB CTO: Why We Moved from Go to Rust
Not sure I follow since there are very competitive tools written in Go such as https://victoriametrics.com for an example in this space.
-
μMon: Stupid simple monitoring
Did you try VictoriaMetrics [1] and vmagent [2]? It is a single self-contained binary without external dependencies. It requires relatively low amounts of CPU, RAM, disk space and disk IO, and it runs on ARM.
[1] https://github.com/VictoriaMetrics/VictoriaMetrics/
[2] https://docs.victoriametrics.com/vmagent.html
-
CERN swaps out databases to feed its petabyte-a-day habit
https://github.com/VictoriaMetrics/VictoriaMetrics#cardinali...
If I understand correctly, it deals with high cardinality by dropping data; operators need to monitor for this and adjust their data to lower the cardinality.
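The behaviour described in this comment can be illustrated with a toy limiter. This is a sketch under assumptions, not VictoriaMetrics's actual implementation: the class `SeriesLimiter`, its `max_series` parameter and its `dropped_rows` counter are hypothetical names chosen for illustration.

```python
class SeriesLimiter:
    """Toy cardinality limiter: admits at most max_series distinct series."""

    def __init__(self, max_series):
        self.max_series = max_series
        self.seen = set()      # series already being tracked
        self.dropped_rows = 0  # counter an operator would watch

    def admit(self, metric, labels):
        # A series is identified by its metric name plus its sorted labels.
        key = (metric, tuple(sorted(labels.items())))
        if key in self.seen:
            return True
        if len(self.seen) < self.max_series:
            self.seen.add(key)
            return True
        # Limit reached: the sample for a new series is dropped and counted.
        self.dropped_rows += 1
        return False


limiter = SeriesLimiter(max_series=2)
samples = [
    ("http_requests_total", {"path": "/a"}),
    ("http_requests_total", {"path": "/b"}),
    ("http_requests_total", {"path": "/c"}),  # third distinct series: dropped
    ("http_requests_total", {"path": "/a"}),  # already tracked: still accepted
]
accepted = [limiter.admit(name, labels) for name, labels in samples]
print(accepted, limiter.dropped_rows)
```

In this sketch existing series keep flowing once the limit is hit and only new series are rejected, so the drop counter is the signal that tells an operator the label set needs trimming.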
-
Prometheus Observability Platform: Intro
VictoriaMetrics
-
VictoriaMetrics VS openobserve - a user suggested alternative
2 projects | 30 Aug 2023
-
OpenTelemetry in 2023
You shouldn't unless you want to use the new open source standard for telemetry. You won't benefit from simplicity or performance improvements. It would be quite the opposite. You can check the actual cost of OpenTelemetry adoption here [0].
But if you ever decide to go down this path - VictoriaMetrics supports the OpenTelemetry protocol for metrics [1].
[0] https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2570
[1] https://docs.victoriametrics.com/Single-server-VictoriaMetri...
What are some alternatives?
tidyquery - Query R data frames with SQL
mimir - Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
Typesense - Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
thanos - Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
julia - The Julia Programming Language
prometheus - The Prometheus monitoring system and time series database.
loki - Like Prometheus, but for logs.
Makie.jl - Interactive data visualizations and plotting in Julia
ClickHouse - ClickHouse® is a free analytics DBMS for big data
MeiliSearch - A lightning-fast search API that fits effortlessly into your apps, websites, and workflow
InfluxDB - Scalable datastore for metrics, events, and real-time analytics