Bug in Ontap 9.8 blocks check_netapp_shelfenv

The plugin check_netapp_shelfenv does not work in Ontap 9.8. After much back and forth, we have now received confirmation that in Ontap 9.8 the endpoint /api/private/cli/storage/shelf of the RESTful API is missing or not working. On the command line it looks like this:

    curl -X GET -u nagios:**** -k https://some.filer.com/api/private/cli/storage/shelf
    {
      "error": {
        "message": "entry doesn't exist",
        "code": "4",
        "target": "shelf"
      }
    }

This is of course unfortunate, because it means that our plugin check_netapp_shelfenv can no longer function.
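A script that talks to this endpoint can recognize the failure by looking for the "error" object in the response body. The following is a minimal sketch only, not the plugin's actual code: the function name endpoint_error is hypothetical, and mapping the error to an UNKNOWN status line is an assumption based on common Nagios conventions.

```python
import json

# Response body copied from the curl call above.
SAMPLE_BODY = '{ "error": { "message": "entry doesn\'t exist", "code": "4", "target": "shelf" } }'


def endpoint_error(body: str):
    """Return the "error" object of a REST response body, or None if there is none."""
    data = json.loads(body)
    if isinstance(data, dict):
        return data.get("error")
    return None


error = endpoint_error(SAMPLE_BODY)
if error is not None:
    # Assumption: a missing endpoint is reported as UNKNOWN rather than CRITICAL.
    print(f"UNKNOWN - {error['target']}: {error['message']} (code {error['code']})")
```

For the body shown above, the sketch reports the target, message, and code taken from the error object.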
Read full post

EMS Log-Monitoring

How to integrate the EMS log into an existing system-monitoring solution

In the article "ONTAP REST APIs: Automate Notification of High-Severity Events", Mahalakshmi describes how to use messages to get notified about system events, depending on their type and severity. It’s a really flexible and comprehensive way to monitor NetApp’s ONTAP. But what if you already have a system-monitoring solution like Nagios, Icinga, op5 Monitor or Shinken in place? In that case the destinations (the recipients of the notifications) are already defined in the monitoring system.
Read full post

RESTful Disk Check for NetApp's Ontap

The family of Check NetApp REST monitoring plugins has grown. With the check_netapp_disk container-type plugin the storage admin has a constant eye on disks that have been moved into unwanted containers. Let’s look at an example:

    $ check_netapp_disk container-type -H sim96
    NETAPP DISK CONTAINER TYPE OK - 28 disks checked
    sim96cluster-01.NET-1.28: spare
    sim96cluster-01.NET-1.27: spare
    sim96cluster-01.NET-1.26: aggregate
    sim96cluster-01.NET-1.25: aggregate
    sim96cluster-01.NET-1.24: aggregate
    sim96cluster-01.NET-1.23: aggregate
    sim96cluster-01.NET-1.22: aggregate
    sim96cluster-01.NET-1.21: aggregate
    sim96cluster-01.NET-1.20: aggregate
    sim96cluster-01.NET-1.19: spare
    sim96cluster-01.
Read full post

check_site_simple checks media-sources as well

The latest release of check_site_simple (v1.1.0.beta.1 at the time of writing) finds broken links to images and media sources as well. As in previous versions, the integrated crawler of check_site_simple flags any broken link found anywhere on the checked site, but now links to images and media files (e.g. audio or video files linked from your site) are checked too. To give users full control, we have also integrated a new switch, --disable-media-checks.
Read full post

Check NetApp's E-Series

The first stable release (v1.1.0) of our brand new Check E-Series product brings an automated health check for E-Series nodes. check_eseries_health will run the health checks regularly and automatically report any errors found.

Example:

    $ ./check_eseries_health --host=10.1.1.125 --system-id 1 [...]
    NETAPP ESERIES HEALTH CRITICAL - 16 health checks checked, 1 CRITICAL
    StorageGRID-XG-102.missingVolumes: notCompleted (CRITICAL)
    StorageGRID-XG-102.integratedHealthCheck: ok
    StorageGRID-XG-102.dbSubRecordsValidation: ok
    StorageGRID-XG-102.melEventCheck: ok
    StorageGRID-XG-102.validPassword: ok
    StorageGRID-XG-102.failedDrivesPresent: ok
    StorageGRID-XG-102.exclusiveOperations: ok
    StorageGRID-XG-102.driveCheck: ok
    StorageGRID-XG-102.nvsramDisableCfwDownloads: ok
    StorageGRID-XG-102.
Read full post

v6.x Customers: Missing --stm could fill up disk

We have discovered an issue in the new 6.x versions of check_netapp_pro that occurs when a getter contacts NetApp's new RESTful API. These getters require an explicit --stm (e.g. --stm=1h). Otherwise the collected storefiles are not deleted and could fill up the monitoring server's disk. Again, this affects installations only if both of the following conditions are met:
- the plugin's version is 6.x
- the plugin's getter contacts the RESTful API of the filer (Ontap 9.
Read full post

Performance Issues in large Environments

We have at least one confirmed report that our latest release, v6.0.0, has performance problems in large environments when many checks run at the same time. There seems to be a limit of approximately 12 checks per second due to the overhead of running a Docker container. We are working on a solution and will report here once we can provide an updated version. Customers with large environments may want to wait before upgrading to v6.
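To judge whether an installation is close to this limit, the rate can be converted into a number of checks per scheduling interval. The sketch below is plain arithmetic: the figure of 12 checks per second is the approximate value reported above, while the 5-minute check interval and the function name max_checks are assumptions for illustration.

```python
def max_checks(checks_per_second: float, interval_seconds: float) -> int:
    """Upper bound on checks that fit into one scheduling interval at a given rate."""
    return int(checks_per_second * interval_seconds)


# Assumption: every check is scheduled once per 5 minutes (300 seconds).
print(max_checks(12, 300))  # 12 * 300 = 3600
```

Under these assumptions, an environment with more than roughly 3600 checks per 5-minute interval would exceed the reported rate.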
Read full post

Dependency free version 6.0.0 released as stable

After extensive testing, we released the next major version 6.0.0 today. As already announced, this version brings:
- Easy installation and update without dependencies on Perl or Perl modules
- Getters and checks compatible with NetApp's new RESTful API
- New checks and options
More details are in the release history. This version is already available on your distributor's portal. The completely rewritten documentation is available online at docs.monitoring-plugins.pro.

First v6.0.0 Release Candidate

We just pushed the first RC for the upcoming 6.0.0 version to the Distributors Portal. Customers who would like to test are very much appreciated! (If you do not see unstable releases in your account, please check with your distributor to get access to them.) The 6.0.0 version brings a completely Dockerized and therefore dependency-free installation plus new checks and options. Please read the release history for details.
Read full post

Checking a Specific Node's Instances

Since 4.0.0 every over-all check has a feature that we call “instance-name contains node-name”. We introduced it as a replacement for the --node|--vserver parameter, which was only sporadically available and no longer exists. Since then, each instance name is prefixed with the name of its node. This allows you to use the --include|--exclude parameters to filter for all instances of a specific node. E.g. if you have a new-node with different hardware, you may want to check only the ports of this new-node:
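The plugin's own example is part of the full post. Independently of it, the idea of filtering node-prefixed instance names can be sketched as follows. This is an illustration only, not the plugin's implementation: it assumes that the filters behave like regular expressions, that node name and instance name are joined with a dot, and the port names e0a and e0b as well as the function filter_instances are hypothetical.

```python
import re


def filter_instances(names, include=None, exclude=None):
    """Keep names that match `include` (if given) and do not match `exclude` (if given)."""
    selected = []
    for name in names:
        if include and not re.search(include, name):
            continue
        if exclude and re.search(exclude, name):
            continue
        selected.append(name)
    return selected


# Hypothetical instance names in the form "<node-name>.<port>".
instances = ["old-node.e0a", "old-node.e0b", "new-node.e0a", "new-node.e0b"]

# Select only the ports of new-node by anchoring the pattern on the node prefix.
print(filter_instances(instances, include=r"^new-node\."))
```

Because every instance name starts with its node name, a single anchored pattern is enough to select or drop all instances of one node.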
Read full post