systemd-utils
rkvdns_examples
Metric | systemd-utils | rkvdns_examples
---|---|---
Mentions | 2 | 2
Stars | 85 | 0
Growth | - | -
Activity | 3.2 | 7.6
Last commit | over 2 years ago | 22 days ago
Language | Python | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
systemd-utils
- Ask HN: How do you monitor your systemd services?
I use the `OnFailure` property to trigger a service that emails me when a service fails, for example backups that run as systemd timer + service pairs.
I also use `failure-monitor`, which is a Python service that monitors `journald`.
Files on GitHub for those interested:
https://github.com/kylemanna/systemd-utils
- Sending Emails to Myself
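The `OnFailure` approach described in the comment above can be sketched roughly as follows. This is a minimal illustration and not the code from systemd-utils: the unit names (`backup.service`, `notify-email@.service`), the script path, the addresses, and the function `build_failure_email` are all assumptions made for the example. The block only builds the message; actually sending it (for instance via a local `sendmail` or `smtplib`) is left out.

```python
"""Hypothetical handler that an OnFailure= hook might invoke.

Assumed unit wiring (illustrative, not taken from the original post):

    # backup.service
    [Unit]
    OnFailure=notify-email@%n.service

    # notify-email@.service
    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/notify_failure.py %i
"""
from email.message import EmailMessage


def build_failure_email(unit, sender, recipient, details=""):
    """Return an email message reporting that `unit` has failed."""
    msg = EmailMessage()
    msg["Subject"] = f"[systemd] {unit} failed"
    msg["From"] = sender
    msg["To"] = recipient
    # Fall back to a generic body when no extra details (e.g. recent
    # journal output) were supplied by the caller.
    msg.set_content(details or f"Unit {unit} entered the failed state.")
    return msg


# Example: systemd would pass the failed unit's name as the template instance.
example = build_failure_email("backup.service", "root@localhost", "admin@localhost")
print(example["Subject"])
```

Using a template unit keeps the notification logic in one place: any service can opt in with a single `OnFailure=` line, and the failed unit's name arrives as the instance argument.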
rkvdns_examples
- Monitoring your logs is mostly a tarpit
Seems defeatist to me.
1) Some things have to be deemed worth acknowledging as "events", which leads to the idea that what logs contain are indicators of events. It's a fundamentally philosophical notion: you need to take the time to decide what constitutes an event. Hearkening to machine learning and pirates: global warming may inversely correlate with pirates, but that doesn't imply causation (either way). You can't just throw statistical techniques at data looking for "hits" and assume they're significant. Even if you find some indicator, as the article notes, it could change, so you should identify some canary indicators and event those as well.
2) Which leads to the point about "bug parts": don't rely on a specific rare indicator, or on the failure to identify such an indicator. If you find high-reliability indicators, great, but also look for other indicators that occur more often and can be counted, and track those. For instance, an indicator that systemd is restarting /something/, that this is happening more or less frequently, and that it correlates with a performance observable. If it stops reporting at all, you can start with the presumption that something about logging itself changed.
At this point my philosophical disagreement with centralized logging comes to the fore: it's expensive to load stuff into Splunk. I agree, and that's why I disagree with the approach and prefer federation.
You can use the Totalizer Agent (https://github.com/m3047/rkvdns_examples/tree/main/totalizer...) to increment counters in Redis for regex-identified keys. I don't care whether you use RKVDNS to retrieve the data or something else.
- Ask HN: How do you monitor your systemd services?
In general this evolves to a SIEM-like solution in IT or gets added to the tag menagerie in OT.
If you're focused on "notifications are bad" note that notifications are push, and pull solutions are possible. Tail logs (or journalctl) and post significant events to Redis (https://github.com/m3047/rkvdns_examples/tree/main/totalizer...) for example.
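The idea in the comments above, counting regex-identified log events in Redis so they can be pulled later, might look something like the sketch below. It is not the Totalizer Agent's actual implementation: the rule patterns, the key names (`restart:…`, `failed:…`), and the `InMemoryStore` stand-in are assumptions for illustration. The stand-in exposes an `incr(name, amount=1)` method so that a redis-py client, which has a method of the same shape, could be swapped in.

```python
import re


class InMemoryStore:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self.counts = {}

    def incr(self, name, amount=1):
        # Same shape as a Redis INCRBY: create the key at 0 if missing.
        self.counts[name] = self.counts.get(name, 0) + amount
        return self.counts[name]


# Hypothetical rules: (pattern, key template). Named groups fill the template.
RULES = [
    (re.compile(r"(?P<unit>\S+): Scheduled restart job"), "restart:{unit}"),
    (re.compile(r"(?P<unit>\S+): Failed with result"), "failed:{unit}"),
]


def totalize(lines, store, rules=RULES):
    """Increment one counter per matching line; return how many lines matched."""
    matched = 0
    for line in lines:
        for pattern, template in rules:
            m = pattern.search(line)
            if m:
                store.incr(template.format(**m.groupdict()))
                matched += 1
                break  # first matching rule wins
    return matched


# Illustrative log lines; in practice these could come from a tailed log
# or from `journalctl -f` piped to standard input.
sample = [
    "backup.service: Scheduled restart job, restart counter is at 1.",
    "backup.service: Failed with result 'exit-code'.",
    "some unrelated line",
]
store = InMemoryStore()
print(totalize(sample, store), store.counts)
```

Because the counters are only incremented, a consumer can poll them on its own schedule and compare successive readings, which is what makes this a pull model rather than a push notification.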
What are some alternatives?
self-hosted - Sentry, feature-complete and packaged up for low-volume deployments and proofs-of-concept
collectd-systemd - collectd plugin to monitor systemd services
NPushOver - Full fledged, async, .Net Pushover client
Healthchecks - Open-source cron job and background task monitoring service, written in Python & Django
Sentry - Developer-first error tracking and performance monitoring
aioredis - asyncio (PEP 3156) Redis support
apprise - Apprise - Push Notifications that work with just about every platform!
uptime-kuma - A fancy self-hosted monitoring tool
ntfy - Send push notifications to your phone or desktop using PUT/POST