External uptime monitoring tools for services and websites
Curious what you guys are seeing used for external monitoring (monitoring outside your network) for critical services. Are you monitoring any home lab services in this way? Also, what tools or services are you guys seeing used for production/corporate environments to monitor uptime for web services, etc?
@jnew1213 do you guys use tools to monitor Horizon externally?
I've been using UptimeRobot.com for years to check a number of Websites and my Plex server at five-minute intervals.
Although I cannot receive emails from them that a site is off-the-air when my Internet service is down (DUH!), at least I get an estimate of the downtime once my service returns. I figure that's handy if I ever need to figure how I'd do if I had an SLA with a third-party for site hosting. (I do host one site for a third party.)
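For anyone curious how that downtime estimate translates into an SLA-style number, here is a rough sketch. It assumes you have a plain list of pass/fail results from checks taken at a fixed interval; the function name and the input format are made up for illustration and are not how Uptime Robot reports or calculates it.

```python
from datetime import timedelta


def estimate_availability(results, interval=timedelta(minutes=5)):
    """Estimate downtime and availability from fixed-interval check results.

    results  -- list of booleans, one per check (True = site responded)
    interval -- time between checks (five minutes here, matching the post above)

    Each failed check is counted as one full interval of downtime, so the
    answer is only as precise as the check interval.
    """
    total = len(results)
    if total == 0:
        raise ValueError("no check results to evaluate")
    failed = sum(1 for ok in results if not ok)
    downtime = interval * failed
    availability_pct = 100.0 * (total - failed) / total
    return downtime, availability_pct


# One day of five-minute checks is 288 results; two of them failed here.
day = [True] * 286 + [False] * 2
downtime, pct = estimate_availability(day)
print(downtime)        # 0:10:00
print(round(pct, 2))
```

Because a failed check is treated as a whole interval down, a shorter interval (such as a three-minute one) gives a tighter estimate.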
i haven't messed with anything like this for a few reasons.
1) if it's Plex, i'm sure i'll hear about it from one of my friends lol
2) if it's something critical i'll either a) be home and notice or b) not be home and unable to VPN home anyway due to said critical failure 😆
@jnew1213 @malcolm-r I have been trying out Betterstack in the last few days. Pretty cool what you can monitor for free and they have a 3 minute interval on the free side which is nice. They check from 5 regions or so (US, Europe, Asia, Australia, etc). I may spin up a blog post covering this soon. It looks like they have some nice Kubernetes and cloud monitoring and log tools as well.
I took a look. They have some offerings, but the only thing I need -- the only thing visible from outside -- is Website monitoring. Sticking with Uptime Robot.
@jnew1213 have you tried out Uptime Kuma by chance? I may be wrong, but I think it is a fork of Uptime Robot.
Tried it. No relation to Uptime Robot, which is a Website that checks Website status across the Web. Uptime Robot's minimal checking is free, with advanced options (frequency, notification types, etc.) available at various pricing tiers.
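To make "checks Website status" concrete, a single external check boils down to something like the sketch below, using only the Python standard library. This is an illustration of the general idea, not how Uptime Robot or Uptime Kuma is implemented; `check_once` is a hypothetical name, and the URL in the usage line is a placeholder.

```python
import urllib.error
import urllib.request


def check_once(url, timeout=10.0):
    """Fetch a URL once and report (is_up, detail).

    A 2xx/3xx final response counts as up. HTTP error statuses, DNS failures,
    refused connections, and timeouts all count as down.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 503.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer at all: DNS failure, refused connection, timeout.
        return False, f"unreachable: {exc}"


if __name__ == "__main__":
    # Placeholder URL; substitute the site you want to watch.
    print(check_once("https://example.com/"))
```

The catch with running this yourself is that it has to run from somewhere outside your own network, which is the whole point of the hosted services.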
I tried Uptime Kuma earlier last year. I found it offered minimal monitoring on an okay-looking dashboard, but it wasn't a substitute for a full-blown monitor. I think it's probably okay for a typical small-office kind of setup, but add a bit of "enterprisy" stuff and it quickly showed missing capabilities.
I'm using Check_MK here for monitoring, though I haven't gotten very far setting it up. The RAW version is free and very capable. We use it at work for monitoring a very large variety of infrastructure components. I have it set up in its own VM (okay, most things here are in their own VMs). That's my future for monitoring here. I just need to get into it, especially where vSphere and all the VMs and their applications are concerned.
One thing is notifications... I get them from Veeam. I get them from the Synology NASes. I get a few from other systems. They come into Outlook. Some stay and some get filtered into the bit bucket. Due to the lab aspect of things, some VMs and applications here go up and down on occasion (Horizon, for example; three pods across five connection servers, two Unified Access Gateways, one Microsoft SQL Server, four hosts, maybe an App Volumes Manager workstation and Dynamic Environment Manager installation.)
I don't want to be notified (nagged!) when some or all of that is down when I mean it to be down. So finding a way NOT to be inundated with status updates is important. I think I need a combination of active alerts in the form of emails, and passive alerts, as on a dashboard.
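One way to picture the "don't nag me when it's meant to be down" logic is a planned-downtime check in front of the email step. The sketch below is a generic illustration with made-up names (`should_email`, `planned_downtime`, the host names); it is not Check_MK's configuration or API, although monitoring tools commonly offer some form of scheduled downtime for this purpose.

```python
from datetime import datetime


def should_email(host, is_down, now, planned_downtime):
    """Decide whether a down host deserves an active alert (an email).

    planned_downtime maps a host name to a list of (start, end) datetimes
    during which the host is expected to be off. A host that is down inside
    one of its windows stays a passive, dashboard-only status.
    """
    if not is_down:
        return False
    for start, end in planned_downtime.get(host, []):
        if start <= now < end:
            return False  # down on purpose, no email
    return True


# Hypothetical lab schedule: one connection server is powered off on Saturday.
planned = {
    "horizon-cs1": [(datetime(2024, 1, 6, 9, 0), datetime(2024, 1, 6, 17, 0))],
}

print(should_email("horizon-cs1", True, datetime(2024, 1, 6, 12, 0), planned))  # False
print(should_email("horizon-cs1", True, datetime(2024, 1, 6, 18, 0), planned))  # True
```

The dashboard can still show the host as down in both cases; only the email is held back.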
It all needs to be worked out. At the office, we have a full-timer who spends hour after hour on this stuff.
@jnew1213 as always, great insights! Definitely great to hear you guys are also using Check MK for monitoring at work in your large environment. I love to see FOSS in the enterprise. Home lab is definitely a mixed bag of monitoring solutions since as you mention, we don't want to have a pager go off at night for the home lab, but still want to know when things are down or not working.
For home, I am using a mixed bag of several solutions. Since I blog and create content along with other things, I have tons of monitoring solutions running, including PRTG, CheckMK, Grafana dashboards monitoring Kubernetes and hypervisors, vRealize Suite (Aria, wondering what it will be after Broadcom) for vSphere specific monitoring, Uptime Kuma, and a few other things.
I am always on the lookout for good monitoring, though, as it seems every tool has a weak point that I end up trying to cover with something else. But that is part of the fun, I guess.
I ran PRTG a while back. When you install it, it goes around adding things to monitor on your network. That sounds okay, but I don't need a service check plus another service check plus a ping to tell me that something's down. It adds dozens of redundant checks and, of course, there's a limit of 100 total in the free version.
I tried that Grafana thing that folks show off running in their home lab. Tried it twice. Never got anything to refresh properly. It certainly produces pretty graphs, but I lost interest quick.
I think we probably have differing views on FOSS, in and out of the enterprise. Another conversation for another day. (Hint: I am in the minority here.)
@jnew1213 PRTG can definitely be overzealous on the sensors...it does like to add several things that you can prune back down. It is in the settings for the discovery to add the recommended sensors depending on what type of technology it detects. I don't think PRTG is perfect though...definitely things about it I don't like. It is one of the easiest to get started with though for monitoring lots of different infrastructure.
I go both ways when it comes to FOSS and enterprise solutions. It is hard to beat the "it just works out of the box" of enterprise-grade monitoring like Aria that is purpose-built for a specific technology. However, I am a tinkerer at heart and like to flip switches and try things, and FOSS helps me scratch that itch a bit, I guess.
Definitely, more conversations to come and things I want to pick your brain about.
This is fun. Pick away!
i run PRTG (among other things) and i like that it's mostly "point and click" to monitor devices. plus someone wrote a grafana plugin for it a while back: https://grafana.com/grafana/plugins/jasonlashua-prtg-datasource/
it hasn't been updated in quite some time, but it works for my needs.