As for metrics collection, Prometheus is what I would recommend. It scales far better than pretty much everything else out there. It's open source. It can monitor anything, not just networking devices. It's built into a lot of modern software, and some vendors support it directly.
I wrote a "better smokeping" tool that produces much more detailed data than IP SLA.
We are using Checkmk to monitor thousands of locations for reachability and performance. Basically, you use the Checkmk (www.checkmk.com) server as a central site that pings your network devices and your servers (both are important, since a slow server doesn't mean "the network is so slow(tm)"). In Checkmk you get nice graphs, but you can also export your data to Grafana (which is what we do) to build a Smokeping-like experience. Smokeping is a nice tool, but it's rather old and does not scale too well. Checks from your network devices can be implemented using IP SLA (Cisco?). There's a plugin for that: https://checkmk.com/de/integrations/cisco_ip_sla. If you want to monitor things from a (near) user perspective, Checkmk supports a distributed setup that lets you place sensors in different locations (if you don't want to implement full end-to-end monitoring using something like Robot Framework (https://github.com/simonmeggle/robotmk)). If you want deeper network visibility, you could pair Checkmk with ntopng (https://www.ntop.org/products/traffic-analysis/ntop/). That way you get a lot more than plain RTT and network interface load: IPFIX, DPI, *flow, etc.
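To make the "Smokeping-like" idea concrete, here is a rough Python sketch of the kind of per-target summary such a graph is built from: packet loss plus the spread of round-trip times over one burst of probes. This is not how Checkmk or Smokeping implement it; the function name `summarize_probes` and the input format (a list of RTTs in milliseconds, with `None` for a lost probe) are assumptions made up for illustration.

```python
from statistics import median


def summarize_probes(rtts_ms):
    """Summarize one burst of ping probes against a single target.

    rtts_ms is a hypothetical input format: one entry per probe sent,
    holding the round-trip time in milliseconds, or None if the probe
    was lost.
    """
    sent = len(rtts_ms)
    received = [r for r in rtts_ms if r is not None]
    # Loss is the share of probes that never came back.
    loss_pct = 100.0 * (sent - len(received)) / sent if sent else 0.0

    if not received:
        # Every probe was lost, so there is no latency to report.
        return {"sent": sent, "loss_pct": loss_pct,
                "min_ms": None, "median_ms": None, "max_ms": None}

    return {
        "sent": sent,
        "loss_pct": loss_pct,
        "min_ms": min(received),
        "median_ms": median(received),
        "max_ms": max(received),
    }


if __name__ == "__main__":
    # Five probes, one of them lost.
    print(summarize_probes([10.0, 12.0, None, 11.0, 30.0]))
```

Keeping min, median, and max per burst rather than a single average is what lets a graph show jitter and outliers, which is most of the point of a Smokeping-style view.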