rkvdns_examples
collectd-systemd
Metric | rkvdns_examples | collectd-systemd
---|---|---
Mentions | 2 | 1
Stars | 0 | 9
Growth | - | -
Activity | 7.6 | 10.0
Last commit | 17 days ago | over 3 years ago
Language | Python | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
rkvdns_examples
-
Monitoring your logs is mostly a tarpit
Seems defeatist to me.
1) There has to be a notion that some things are worth acknowledging as "events"; this leads to the idea that what logs contain are indicators of events. It's a fundamentally philosophical notion, and it means you need to take the time to decide what constitutes an event. Harking back to machine learning and pirates: global warming may inversely correlate with pirates, but that doesn't imply causation (either way). You can't just throw statistical techniques at data looking for "hits" and assume that what you find is significant. Even if you find some indicator, as the article notes, it could change; so you should identify some canary indicators and raise events for those as well.
2) Which leads to the point about "bug parts": don't rely on a specific rare indicator, or on the failure to identify such an indicator. If you find high-reliability indicators, great, but also look for other indicators which occur more often and can be counted, and track those. For instance, an indicator that systemd is restarting /something/, that this is happening more or less frequently than before, and that it correlates with a performance observable. If it stops reporting at all, you can start with the presumption that something about logging itself changed.
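The reasoning in point 2 can be sketched as a comparison of one indicator's count across two consecutive time windows. This is only an illustration: the function name `classify_indicator`, the labels it returns, and the `tolerance` threshold are all assumptions, not something from the comment or from any tool mentioned here.

```python
def classify_indicator(previous: int, current: int, tolerance: float = 0.5) -> str:
    """Compare an indicator's count in two consecutive windows (hypothetical helper).

    `tolerance` is an assumed relative-change threshold; tune it per indicator.
    """
    if current == 0:
        # A frequently occurring indicator that goes quiet: suspect logging first.
        return "silent"
    if previous == 0:
        # Nothing to compare against yet.
        return "new"
    change = (current - previous) / previous
    if change > tolerance:
        return "more frequent"
    if change < -tolerance:
        return "less frequent"
    return "steady"
```

A "silent" result is the case where the starting presumption is that logging changed, rather than that the underlying condition went away.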
At this point my philosophical disagreement with centralized logging comes to the fore. The article says it's expensive to load stuff into Splunk. I agree, and that's why I disagree with the approach and prefer federation.
You can use the Totalizer Agent (https://github.com/m3047/rkvdns_examples/tree/main/totalizer...) to increment counters in Redis for regex-identified keys. I don't care whether you use RKVDNS to retrieve the data or something else.
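As a rough illustration of the idea of incrementing counters for regex-identified keys (not the Totalizer Agent's actual code), the sketch below matches log lines against a few patterns and calls an `incr` callback with a derived key. The patterns, the `prefix:key` naming, and the `totalize` function are all hypothetical.

```python
import re
from collections import Counter
from typing import Callable, Iterable, Pattern, Sequence, Tuple

# Hypothetical patterns: the named group "key" becomes part of the counter key.
PATTERNS: Sequence[Tuple[str, Pattern[str]]] = [
    ("restart", re.compile(r"systemd\[\d+\]: (?P<key>\S+): Scheduled restart job")),
    ("sshfail", re.compile(r"sshd\[\d+\]: Failed password for (?:invalid user )?(?P<key>\S+)")),
]

def totalize(
    lines: Iterable[str],
    incr: Callable[[str], object],
    patterns: Sequence[Tuple[str, Pattern[str]]] = PATTERNS,
) -> int:
    """Call incr("prefix:key") for each line matching a pattern; return the match count."""
    matched = 0
    for line in lines:
        for prefix, pattern in patterns:
            m = pattern.search(line)
            if m:
                incr(f"{prefix}:{m.group('key')}")
                matched += 1
                break  # first matching pattern wins
    return matched

# In-memory stand-in for the counter store, so the sketch runs without Redis.
counts: Counter = Counter()
sample = [
    "Jan 01 00:00:01 host systemd[1]: foo.service: Scheduled restart job, restart counter is at 3.",
    "Jan 01 00:00:02 host sshd[123]: Failed password for invalid user admin from 10.0.0.1 port 22 ssh2",
    "Jan 01 00:00:03 host sshd[124]: Failed password for root from 10.0.0.2 port 22 ssh2",
    "Jan 01 00:00:04 host cron[99]: nothing interesting",
    "Jan 01 00:00:05 host systemd[1]: foo.service: Scheduled restart job, restart counter is at 4.",
]
total = totalize(sample, lambda key: counts.update([key]))
```

Assuming the redis-py package is installed and a server is reachable, passing `redis.Redis().incr` as `incr` would keep the same counters in Redis instead of in memory.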
-
Ask HN: How do you monitor your systemd services?
In general this evolves to a SIEM-like solution in IT or gets added to the tag menagerie in OT.
If you're focused on "notifications are bad", note that notifications are push, and pull solutions are possible. For example, tail logs (or journalctl) and post significant events to Redis (https://github.com/m3047/rkvdns_examples/tree/main/totalizer...).
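A minimal sketch of the "tail journalctl" half of that suggestion, assuming the JSON output of `journalctl -f -o json` and the standard journal fields `PRIORITY`, `_SYSTEMD_UNIT`, and `MESSAGE`. The function names and the choice of "priority 3 (err) or more severe" as the meaning of "significant" are assumptions made for illustration.

```python
import json
import subprocess
from typing import Iterable, Iterator, Tuple

def follow_journal() -> Iterator[str]:
    """Yield JSON lines from `journalctl -f -o json` (needs a systemd host)."""
    proc = subprocess.Popen(
        ["journalctl", "-f", "-o", "json"],
        stdout=subprocess.PIPE,
        text=True,
    )
    assert proc.stdout is not None
    yield from proc.stdout

def significant_events(
    json_lines: Iterable[str], max_priority: int = 3
) -> Iterator[Tuple[str, str]]:
    """Yield (unit, message) for entries at or above the assumed severity cutoff.

    Syslog priorities run from 0 (emerg) to 7 (debug); lower is more severe.
    """
    for raw in json_lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partial or malformed lines
        if not isinstance(entry, dict):
            continue
        try:
            priority = int(entry.get("PRIORITY", 6))
        except (TypeError, ValueError):
            continue
        if priority <= max_priority:
            yield entry.get("_SYSTEMD_UNIT", "unknown"), str(entry.get("MESSAGE", ""))

# Canned input so the sketch runs without systemd.
sample_lines = [
    '{"PRIORITY": "3", "_SYSTEMD_UNIT": "foo.service", "MESSAGE": "Failed to start"}',
    '{"PRIORITY": "6", "_SYSTEMD_UNIT": "bar.service", "MESSAGE": "Started"}',
    "not json",
    '{"PRIORITY": "2", "MESSAGE": "kernel trouble"}',
]
events = list(significant_events(sample_lines))
```

On a live host, `significant_events(follow_journal())` would feed the same filter from the journal; what gets posted to Redis for each event, and under which key, is left open here. Consumers then poll the store on their own schedule, which is what makes this a pull design.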
collectd-systemd
-
Ask HN: How do you monitor your systemd services?
This combo does the job for me: grafana + riemann + influxdb and collectd as the main agent. collectd bundles many plugins so you can watch logs, monitor running processes or have something custom [1]. This setup is very light to start with and can scale well (up until you hit influxdb limits :D).
[1] https://github.com/mbachry/collectd-systemd
What are some alternatives?
Healthchecks - Open-source cron job and background task monitoring service, written in Python & Django
aioredis - asyncio (PEP 3156) Redis support
ntfy - Send push notifications to your phone or desktop using PUT/POST
uptime-kuma - A fancy self-hosted monitoring tool
Netdata - The open-source observability platform everyone needs
systemd-utils - Random systemd utilities