Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
The unbearable fussiness of the smart home
8 projects | news.ycombinator.com | 24 Nov 2021
Au contraire - your setup is way more sophisticated than mine.
I have some 433MHz kit, plus a bundle of various arduinos, but (for me) the complexity of getting them to talk back to base has kept me procrastinating for years.
The ESP8266/ESP32 devices, with WiFi built-in, are effectively the same price as arduinos (here in AU, via ebay) but so much more convenient because of that extra memory + the wifi. I'm going to have some frustration with the 3 vs 5 volt difference, especially with some of the more esoteric components, but so far it's been a breeze to set up.
As Outworlder observes, my back-end is way more complex than a normal human would need - I'm replicating a stack we use at work, so it's basically taking up a bit of space on my home lab. Cortex is for a serious (enterprisey) amount of long-term storage of time-series metrics. Prometheus is easy enough to set up - it scrapes web end points that contain key / value pairs in plain text, and puts those in its own time series data store. Sqlite will scale just as well, I'm sure.
If you have the bandwidth, I can recommend playing with some of these things, just in case they may make your life easier later. Prometheus (server) will run on a Raspberry Pi easily enough.
That's a code fragment to run on an ESP and present a prometheus-compatible end point with a handful of key/value pairs - if you have a spare ESP, run it up, and hit the endpoint to see what I mean. The simplicity is compelling.
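For readers without a spare ESP, the same idea can be sketched on any machine with Python. This is a minimal illustration of the plain-text key/value endpoint described above, not the code fragment the comment refers to. The metric names, the values, and the `read_sensors` helper are hypothetical placeholders.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(readings):
    """Render a dict of metric name -> value in the Prometheus text format."""
    lines = []
    for name, value in sorted(readings.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def read_sensors():
    # Hypothetical stand-in for real sensor reads; the values are made up.
    return {"room_temperature_celsius": 21.5, "room_humidity_percent": 40.0}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes /metrics by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_sensors()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics on the given port; blocks until interrupted."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and then fetching `http://localhost:8000/metrics` should show one `name value` line per metric, which is all a Prometheus scrape needs.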
If / when I go down the path of custom components plugged into arduinos - and I'd like to one day build something to measure the levels in my rainwater tanks - I think that I'd try to get those data back into an intermediary device (esp or RPi) that could present them in this opentelemetry format, as it would make it easier to swap things around in the future.
Grafana is fantastic, and can produce some gorgeous visualisations from Prometheus (and other) sources. You may even be able to get it doing something with your sqlite DB.
Running up a monitoring agent (Prometheus' node_exporter, or InfluxDB's telegraf - functionally very similar) on your laptop may be a good way to experiment with live data and visualisations with low-effort. (Note that Telegraf will by default try to push into an InfluxDB -- I'm not a huge fan of InfluxDB -- but you can configure it to provide an otel / prometheus-compatible scrapable web endpoint instead.)
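To get a feel for what such a scrapable endpoint returns, the text can be parsed in a few lines. The sketch below is a simplified reader, assuming each sample line is `name value` (optionally with a `{labels}` part glued to the name) and carries no trailing timestamp. It is not a full parser for the exposition format.

```python
def parse_metrics(text):
    """Parse Prometheus-style text into a dict of series name -> float value.

    Simplifying assumptions: comment lines start with '#', and sample lines
    have no trailing timestamp, so the value is the last whitespace-separated
    token on the line.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```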
Ask HN: How to built a HIGHLY scalable API monitoring tool?
4 projects | news.ycombinator.com | 16 Dec 2021
The unbearable fussiness of the smart home
8 projects | news.ycombinator.com | 24 Nov 2021
> [...] that feed into a prometheus -> cortex store, so I can then map them on Grafana.
I had to Google because I've never heard of any of those. Did I find the right ones?
Mine is much more primitive. My indoor temperature monitor is an ESP8266 that uploads the temperature to a simple PHP page that saves it in an sqlite DB. A cron job runs a Perl script every few minutes that extracts the data for the last hour, 3 hours, 12 hours, 48 hours, and since the beginning of time and uses gnuplot to produce PNG graphs. There's a static page on my server that displays those graphs.
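The store-then-window pattern described here can be sketched with Python's built-in `sqlite3` module. This is a swapped-in illustration, not the PHP and Perl the comment describes, and the table name, column names, and function names are assumptions made for the example.

```python
import sqlite3
import time

# The time windows mentioned in the comment, in hours.
WINDOWS_HOURS = [1, 3, 12, 48]


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) readings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "ts INTEGER NOT NULL, temp_c REAL NOT NULL)"
    )
    return conn


def record(conn, temp_c, ts=None):
    """Store one reading; ts is a Unix timestamp, defaulting to now."""
    ts = int(time.time()) if ts is None else ts
    conn.execute("INSERT INTO readings (ts, temp_c) VALUES (?, ?)", (ts, temp_c))
    conn.commit()


def readings_since(conn, hours, now=None):
    """Return (ts, temp_c) rows from the last `hours` hours, oldest first."""
    now = int(time.time()) if now is None else now
    cutoff = now - hours * 3600
    return conn.execute(
        "SELECT ts, temp_c FROM readings WHERE ts >= ? ORDER BY ts", (cutoff,)
    ).fetchall()
```

A cron-driven script could loop over `WINDOWS_HOURS`, call `readings_since` for each window, and hand the rows to whatever plotting tool is in use.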
My outdoor temperature monitor uses a cheap AcuRite 433 MHz indoor/outdoor thermometer I bought. I have an RPi with an RTL-SDR attached spying on the communications between the AcuRite sensor outside and the AcuRite display inside using rtl_433. A script looks at the rtl_433 output, finds the AcuRite sensor data, and puts it in an sqlite DB. I haven't yet gotten around to making something to graph it.
The nice thing about that approach is that it was also easy to add support for other 433 MHz wireless sensors near me, such as the AcuRite fridge/freezer thermometer I have. I can also see a few assorted sensors of neighbors (temperature, humidity, soil moisture, tire pressure, wind speed, wind direction, rain, and a few other random things). If I wanted to it would be easy to add them to the DB.
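A script of the kind described could look roughly like the following, assuming rtl_433 is run with JSON output (`rtl_433 -F json`) so that each decoded transmission arrives as one JSON object per line. The field names used here (`time`, `model`, `id`, `temperature_C`) and the `Acurite` model prefix are assumptions about that output and may need adjusting for a given sensor. This is not the commenter's actual script.

```python
import json


def parse_rtl433_line(line, model_prefix="Acurite"):
    """Extract (time, model, id, temperature_C) from one JSON line.

    Returns None for lines that are not JSON objects, that come from other
    sensor models, or that carry no temperature. Field names are assumed.
    """
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict):
        return None
    model = str(event.get("model", ""))
    if not model.startswith(model_prefix):
        return None
    if "temperature_C" not in event:
        return None
    return (event.get("time"), model, event.get("id"), float(event["temperature_C"]))


def filter_events(lines, model_prefix="Acurite"):
    """Yield parsed readings from an iterable of text lines (e.g. sys.stdin)."""
    for line in lines:
        parsed = parse_rtl433_line(line, model_prefix)
        if parsed is not None:
            yield parsed
```

Adding a neighbor's sensor then amounts to calling `filter_events` with a different `model_prefix`, and each yielded tuple can be inserted into the DB as-is.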
When I made a wireless tipping rain gauge recently, I used a 433 MHz transmitter module and added a decoder to rtl_433 that understands my data stream format. That gets my data into the rtl_433 output. No need to futz around with 433 MHz receiver modules, which appear to be a pain in the ass. An ATTiny85 counts the tips and runs the transmitter. The ATTiny85, the transmitter module, a battery holder, an RJ11 socket (because the rain gauge has an RJ11 connector), a board to put those things on, and a small waterproof case are pretty much the complete parts list.
I think I'm going to standardize on this general approach. For things that do not have WiFi and only need to report data, 433 MHz modules and custom decoders for rtl_433 on the RPi. For things that do have WiFi, such as any future ESP projects I do, they will just use WiFi to talk to the RPi. If anything needs to get sent outside of my LAN the RPi will handle it.
The RPi is also currently controlling a space heater in my living room, getting connection data from my cable modem periodically and recording that in an sqlite DB, and serving a simple web page that lets me quickly change inputs and volume on my Denon receiver, so I'm already pretty much committed to keeping it running all the time.
Decoders can be specified in a simple text file. Here's the one for my rain gauge as an example:
Processing large datasets from mongodb in realtime
1 project | reddit.com/r/golang | 30 Jul 2021
Not a lot to go on in your post, but you might find some inspiration from this project (written in golang) which handles huge data sets (metrics). https://cortexmetrics.io/
How are you tracking your SLA's/SLO
2 projects | reddit.com/r/sre | 3 Apr 2021
Thanos or Cortex.
2 projects | reddit.com/r/networking | 22 Mar 2021
You will probably want to look at Cortex; it's designed to be a multi-tenant database. You can either build it up yourself, or use Grafana's hosted version.
Sizing Considerations for Prometheus
3 projects | reddit.com/r/PrometheusMonitoring | 12 Mar 2021
But yes, Cortex is primarily Grafana Labs :)
Launch HN: Opstrace (YC S19) – open-source Datadog
11 projects | news.ycombinator.com | 1 Feb 2021
(3) Transparency and predictability of costs—you pay your cloud provider for the storage/network/compute for running Opstrace and can take advantage of any credits/discounts you negotiate with them. We are incentivized to help you understand exactly where you are spending money because you pay us for the value you get from our product with per-user pricing. (For more about costs, see our recent blog post here: https://opstrace.com/blog/pulling-cost-curtain-back). (4) It should be REAL Open Source with the Apache License, Version 2.0.
To get started, you install Opstrace into your AWS or GCP account with one command: `opstrace create`. This installs Opstrace in your account, creates a domain name and sets up authentication for you for free. Once logged in you can create tenants that each contain APIs for Prometheus, Fluentd/Loki and more. Each tenant has a Grafana instance you can use. A tenant can be used to logically separate domains, for example, things like prod, test, staging or teams. Whatever you prefer.
At the heart of Opstrace runs a Cortex (https://github.com/cortexproject/cortex) cluster to provide the above-mentioned scalable Prometheus API, and a Loki (https://github.com/grafana/loki) cluster for the logs. We front those with authenticated endpoints (all public in our repo). All the data ends up stored only in S3 thanks to the amazing work of the developers on those projects.
An "open source Datadog" requires more than just metrics and logs. We are actively working on a new UI for managing, querying and visualizing your data and many more features, like automatic ingestion of logs/metrics from cloud services (CloudWatch/Stackdriver), Datadog compatible API endpoints to ease migrations and side by side comparisons and synthetics (e.g. Pingdom). You can follow along on our public roadmap: https://opstrace.com/docs/references/roadmap.
We will always be open source, and we make money by charging a per-user subscription for our commercial version which will contain fine-grained authz, bring-your-own OIDC and custom domains.
We’d love to hear what your perspective is. What are your experiences related to the problems discussed here? Are you all happy with the tools you’re using today?
11 projects | news.ycombinator.com | 1 Feb 2021
Thanks for bringing this topic to this thread. I'm a physicist by heart and education myself and observe that in the software/observability industry we like to collect data much more than we're interested in properly processing and interpreting it.
> Understanding the distribution of your data (rather than just averages) is arguably the most important feature you want from a monitoring dashboard, so the weak support for quantiles is very limiting.
So much yes! It's a relief to see that we have people here in this thread (and industry) who understand this :-).
People that have a deep background and experience in experimentation, measurement, and quantification rightfully have to see the nature of the data distribution first before they feel in any way OK about proceeding with aggregates.
Parent commenter knows this, but for people reading along: using aggregates (such as mean, standard deviation, standard error, quantiles, ...) implies dropping information. Going from the full distribution to a simplified representation naturally implies that what we talk about is a lossy transformation of data. Of course, one wants to be smart about _which_ information to drop. It should be intuitive that one can only be smart about this choice when having knowledge about the underlying distribution. Often, data is not normally distributed, not Poisson-distributed, but instead somewhat uniquely distributed based on the use case -- in a way that deserves brief characterization (a quick look is often enough!); which then allows for making informed decisions about which aggregate parameters to look at -- and which pieces of information are fine to drop.
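The point about lossy aggregates can be made concrete with two made-up latency samples that share the same mean but have very different tails. The data and the nearest-rank quantile helper below are purely illustrative.

```python
import math
import statistics


def quantile(samples, p):
    """Nearest-rank quantile: smallest value with at least p of the data at or below it."""
    ordered = sorted(samples)
    index = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[index]


def summarize(samples):
    """Reduce a sample to three aggregates, discarding everything else."""
    return {
        "mean": statistics.fmean(samples),
        "p50": quantile(samples, 0.50),
        "p99": quantile(samples, 0.99),
    }


# Hypothetical request latencies in milliseconds.
steady = [100.0] * 100                  # every request takes 100 ms
spiky = [50.0] * 98 + [2550.0] * 2      # mostly fast, two very slow requests

if __name__ == "__main__":
    print("steady:", summarize(steady))
    print("spiky: ", summarize(spiky))
```

Both samples have a mean of 100 ms, so a dashboard showing only the mean cannot tell them apart. The median and the 99th percentile differ sharply, and which of them matters depends on knowing what the distribution looks like in the first place.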
> Histograms require manually specifying the distribution of your data, which is time-consuming, lossy, and can introduce significant error bands around your quantile estimates.
Yes! Great point. Honestly, I was a little bit shocked when I saw how this works in the Prometheus ecosystem. I happen to have an example for this I think: we (Opstrace) have contributed a tiny patch to Cortex where we changed the parameterization of a specific histogram metric, because the upper band was super broad, leading to a blind spot (a lack of resolution) in the range of values that was most interesting to us -- see https://github.com/cortexproject/cortex/issues/2530 and
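The blind spot caused by a broad bucket can be illustrated with a simplified re-implementation of bucket-based quantile estimation, using linear interpolation inside the bucket that contains the target rank (the approach Prometheus' `histogram_quantile` takes). The bucket boundaries and observations below are invented for the example and are not the ones from the Cortex patch.

```python
import bisect
import math


def cumulative_counts(observations, bounds):
    """Count observations <= each upper bound, as cumulative `le` buckets do."""
    ordered = sorted(observations)
    return [bisect.bisect_right(ordered, bound) for bound in bounds]


def estimate_quantile(q, bounds, counts):
    """Estimate quantile q from cumulative bucket counts by linear interpolation.

    `bounds` are ascending upper bounds, ideally ending in +inf; the lower
    bound of the first bucket is taken to be 0.
    """
    total = counts[-1]
    if total == 0:
        return math.nan
    rank = q * total
    for i, (upper, count) in enumerate(zip(bounds, counts)):
        if count < rank:
            continue
        lower = bounds[i - 1] if i > 0 else 0.0
        below = counts[i - 1] if i > 0 else 0
        if math.isinf(upper) or count == below:
            # No finite width to interpolate over: fall back to the lower edge.
            return lower
        return lower + (upper - lower) * (rank - below) / (count - below)
    return math.nan


# Hypothetical data: 100 requests that all took 1.2 seconds.
observations = [1.2] * 100
coarse_bounds = [1.0, 10.0, math.inf]
fine_bounds = [1.0, 1.25, 1.5, 10.0, math.inf]

coarse_estimate = estimate_quantile(
    0.5, coarse_bounds, cumulative_counts(observations, coarse_bounds))
fine_estimate = estimate_quantile(
    0.5, fine_bounds, cumulative_counts(observations, fine_bounds))
```

With the single broad 1 to 10 second bucket, the median is estimated as 5.5 s even though every request took 1.2 s. Adding boundaries in the interesting range brings the estimate to 1.125 s.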
Gopher Gold #14 - Wed Oct 07 2020
22 projects | dev.to | 7 Oct 2020
cortexproject/cortex (Go): A horizontally scalable, highly available, multi-tenant, long term Prometheus.
What are some alternatives?
thanos - Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
loki - Like Prometheus, but for logs.
opstrace - The Open Source Observability Distribution
windows_exporter - Prometheus exporter for Windows machines
TimescaleDB - An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
kubebuilder - Kubebuilder - SDK for building Kubernetes APIs using CRDs
Ory Kratos - Next-gen identity server (think Auth0, Okta, Firebase) with Ory-hardened authentication, MFA, FIDO2, profile management, identity schemas, social sign in, registration, account recovery, and IoT auth. Golang, headless, API-only - without templating or theming headaches.
spark-on-k8s-operator - Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
veneur - A distributed, fault-tolerant pipeline for observability data
fission - Fast and Simple Serverless Functions for Kubernetes
duf - Disk Usage/Free Utility - a better 'df' alternative