There have been a few different iterations of NTP monitoring for various operating systems. I know that this simple script has been used across a variety of systems:
ntpq -crv | tr "," "\n" | sed -e "/^$/d" -e "s/^[[:space:]]*//" -e "s/rootdisp=/rootdispersion=/" -e "s/=/,/" | grep -E "stratum|rootdis|refid|offset"
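For anyone wanting to try the pipeline without a live ntpd, here is a minimal sketch of the same idea wrapped in a function that reads `ntpq -c rv` style text on stdin. The function name `parse_ntpq_rv` is made up for illustration, and the anchored match is a slight tightening of the original filter rather than a drop-in copy of it.

```shell
#!/bin/sh
# parse_ntpq_rv: hypothetical helper name, not part of any plugin.
# Reads "ntpq -c rv" style output on stdin and prints name,value rows
# for the four fields the one-liner above selects.
parse_ntpq_rv() {
    tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' \
        -e '/^$/d' \
        -e 's/^rootdisp=/rootdispersion=/' \
        -e 's/=/,/' |
    grep -E '^(stratum|rootdispersion|refid|offset),'
}
```

On a live system it would be fed as `ntpq -c rv | parse_ntpq_rv`. Anchoring the match to the start of the line keeps neighbouring variables whose names merely contain `offset` out of the output.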
Does anyone else have any additions or enhancements to it?
Would this be good to just add natively to the Hardware Plugin or are there too many different OS issues that could crop up?
The HARDWARE plugin hiccuped a little when we initially implemented a slightly naive “ntpq” scraper, but it now supports most of the standard implementations (see below). That support covers only the system clock, though, and not any peering info, which in itself could offer additional monitoring metrics such as round-trip times.
See timeSyncSource in the current docs for HARDWARE
I think the main challenge nowadays is supporting the variety of different NTP implementations. For Linux you have, at least:
Then, for Windows:
- builtin w32tm / AD based
- Meinberg (and derived) ntpd
and so on.
Now, putting together a plugin that checks what’s running and gets data from it is much harder. From memory, the big problem is that they all report the details subtly differently; even different major releases of ntpd have reported different stats over the years.
Don’t even go near PTP…
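As a rough illustration of the “check what’s running” step, a first pass could simply look for known client tools on the PATH. This is only a sketch: the function name `detect_time_sync` and the default candidate list are my own assumptions, and finding a binary does not prove the matching daemon is actually running or in sync.

```shell
#!/bin/sh
# detect_time_sync: hypothetical helper name.
# Prints the first candidate command found on PATH, or "none".
# With no arguments it falls back to an assumed default list.
detect_time_sync() {
    if [ "$#" -eq 0 ]; then
        set -- chronyc ntpq timedatectl w32tm
    fi
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    printf 'none\n'
    return 1
}
```

A collector could then branch on the result, for example `case "$(detect_time_sync)" in ntpq) ... ;; esac`, with each branch owning the quirks of its own tool.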
Of course, I’m probably over-thinking and over-complicating it. But regardless of how the data is collected, what we probably need is some consistency in the values we report: what matters, especially over time, for analysis?
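One small example of what consistency could mean in practice is units. ntpq reports its system offset in milliseconds, while chrony's tracking output is in seconds, so a collector might convert everything to one unit before reporting. A minimal sketch, assuming seconds as the common unit and using a made-up helper name `to_seconds`:

```shell
#!/bin/sh
# to_seconds: hypothetical helper name.
# Usage: to_seconds VALUE UNIT   where UNIT is "s", "ms" or "us".
# Prints VALUE converted to seconds with nine decimal places;
# exits non-zero on an unknown unit.
to_seconds() {
    awk -v v="$1" -v u="$2" 'BEGIN {
        if (u == "ms")      v = v / 1000
        else if (u == "us") v = v / 1000000
        else if (u != "s")  exit 1
        printf "%.9f\n", v
    }'
}
```

For instance, `to_seconds 0.456 ms` prints `0.000456000`. Whether seconds is the right common unit is exactly the kind of decision that would need agreeing on first.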
So it sounds like a one-size-fits-all approach is just not really feasible. Better to have some fit-for-purpose options and let users mix and match whatever will work with their deployments.