RESTful volume check for NetApp's ONTAP filer

The v1.2.0 release of our Check NetApp-REST product contains a new monitoring plugin called check_netapp_volume usage. It allows the monitoring of: the used space of each volume (in bytes), the total used space of several volumes, the average used space of several volumes, and the min/max of used space of several volumes. A typical output would look like:
$ check_netapp_volume usage -H filer -w 30GiB -c 50GiB
NETAPP VOLUME USAGE OK - 5 volumes checked
vserv_b.
Read full post
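To illustrate the kind of aggregation the volume usage teaser describes, here is a minimal Python sketch. It is not the plugin's code: the function names, the volume names, and the rule that a value must exceed a threshold to raise an alert are all assumptions made for illustration.

```python
# Hypothetical sketch, not the implementation of check_netapp_volume.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(text):
    """Convert a size such as '30GiB' into a number of bytes."""
    # Try the longest unit names first so 'GiB' wins over the bare 'B'.
    for unit in sorted(UNITS, key=len, reverse=True):
        if text.endswith(unit):
            return int(float(text[:-len(unit)]) * UNITS[unit])
    return int(text)  # no unit given: treat the value as plain bytes


def summarize(used_bytes):
    """Return total, average, min and max of the used space per volume."""
    values = list(used_bytes.values())
    return {
        "total": sum(values),
        "average": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


def state(value, warn, crit):
    """Map a value to a state; assumes alerts fire above the threshold."""
    if value > crit:
        return "CRITICAL"
    if value > warn:
        return "WARNING"
    return "OK"


# Made-up volumes, checked one by one against -w 30GiB -c 50GiB.
volumes = {"vol_a": parse_size("12GiB"), "vol_b": parse_size("42GiB")}
for name, used in volumes.items():
    print(name, state(used, parse_size("30GiB"), parse_size("50GiB")))
```

The same state function could be applied to any value from summarize, for example the total, if a single threshold for a group of volumes is wanted.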

Monitoring Check for E-Series Network Interfaces

The v1.1.0 release of our Check E-Series product also brings a new monitoring plugin called check_eseries_nic. This allows the monitoring of: the link-status (up/down), the current-speed (1Gib/s, 10Gib/s, …), and setup-error (true/false). A typical output for the link-status check would look like:
$ ./check_eseries_nic link-status -H 10.1.1.11 --system-id=1
NETAPP ESERIES NIC CRITICAL - 4 nics checked, 3 CRITICAL
wan1.280107001000000000000: down (CRITICAL), slot 1, label P2
wan0.280007001000000000000: up, slot 0, label P1.
Read full post

E-Series Free Pool Space Monitoring

We have released a new monitoring plugin called check_eseries_pool. The first version has exactly one subcommand called free-space. It monitors the free space still available for thin volumes in the disk pool. It is particularly useful when you start taking snapshots or mirroring to another system. In such cases, you will monitor the FreePoolSpace to track the changes until they are mirrored to the other system or until the snapshot is deleted.
Read full post

Less data please :-)

Less can be more. Some time ago, we received an enquiry in which a customer illustrated a general problem using the LUN check as follows: "If I run ./check_netapp_pro.pl LunState -H fas --alarm_limit 1, this will show you all the LUNs on the cluster, both online and offline. We have systems that contain over 200 LUNs, so Nagios shows a very long list of LUNs, which is confusing for some users."
Read full post

Bug in Ontap 9.8 blocks check_netapp_shelfenv

The plugin check_netapp_shelfenv does not work in Ontap 9.8. After much back and forth, we have now received confirmation that in Ontap 9.8 the endpoint /api/private/cli/storage/shelf of the RESTful API is missing or not working. On the command line it looks like this:
curl -X GET -u nagios:**** -k https://some.filer.com/api/private/cli/storage/shelf
{ "error": { "message": "entry doesn't exist", "code": "4", "target": "shelf" } }
This is of course unfortunate, because it means that our plugin check_netapp_shelfenv can no longer function.
Read full post
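A monitoring script can recognise this failure from the response body alone. The following Python sketch parses the error object quoted in the teaser. The function name and the shape of the healthy response are assumptions made for illustration, not part of check_netapp_shelfenv.

```python
import json


def endpoint_error(body):
    """Return (code, message) if the response carries an error object, else None."""
    payload = json.loads(body)
    error = payload.get("error")
    if error is None:
        return None
    return error.get("code"), error.get("message")


# The response quoted in the post for Ontap 9.8.
broken = """
{ "error": { "message": "entry doesn't exist", "code": "4", "target": "shelf" } }
"""

# Assumed shape of a response without an error, for comparison only.
healthy = '{"records": [], "num_records": 0}'

for body in (broken, healthy):
    result = endpoint_error(body)
    if result is None:
        print("endpoint answered without an error object")
    else:
        code, message = result
        print(f"endpoint failed with code {code}: {message}")
```

A check built on this would probably report UNKNOWN rather than CRITICAL in the error case, since the shelf state itself cannot be determined.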

EMS Log-Monitoring

How to integrate the EMS-Log into an existing system-monitoring solution. In the article "ONTAP REST APIs: Automate Notification of High-Severity Events", Mahalakshmi describes how to use messages to get notified about system events, depending on their type and severity. It’s a really flexible and comprehensive way to monitor NetApp's ONTAP. What if you already have a system-monitoring solution like Nagios, Icinga, op5 Monitor or Shinken in place? In that case the destinations (the recipients of the notifications) are already defined in the monitoring system.
Read full post

RESTful Disk Check for NetApp's Ontap

The family of Check NetApp REST monitoring plugins has grown. With the check_netapp_disk container-type plugin the storage admin has a constant eye on disks that have been moved into unwanted containers. Let’s look at an example:
$ check_netapp_disk container-type -H sim96
NETAPP DISK CONTAINER TYPE OK - 28 disks checked
sim96cluster-01.NET-1.28: spare
sim96cluster-01.NET-1.27: spare
sim96cluster-01.NET-1.26: aggregate
sim96cluster-01.NET-1.25: aggregate
sim96cluster-01.NET-1.24: aggregate
sim96cluster-01.NET-1.23: aggregate
sim96cluster-01.NET-1.22: aggregate
sim96cluster-01.NET-1.21: aggregate
sim96cluster-01.NET-1.20: aggregate
sim96cluster-01.NET-1.19: spare
sim96cluster-01.
Read full post

check_site_simple checks media-sources as well

The latest release of check_site_simple (v1.1.0.beta.1 at the time of writing) will find broken links to image and media sources as well. Like its predecessors, the integrated crawler of check_site_simple flags any broken link found anywhere on the checked site, but now links to images and media files (e.g. audio or video files linked from your site) are checked too. To give users full control, we have also added a new switch, --disable-media-checks.
Read full post
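To show what checking media sources adds to a link crawler, here is a minimal Python sketch that collects link targets from a page. It is an illustration only and not how check_site_simple is implemented. The class name, the set of tags treated as media, and the check_media flag (standing in for the effect of --disable-media-checks) are assumptions.

```python
from html.parser import HTMLParser

# Tags whose src attribute is treated as a media link in this sketch.
MEDIA_TAGS = {"img", "audio", "video", "source"}


class LinkCollector(HTMLParser):
    """Collect href targets of anchors and, optionally, src targets of media tags."""

    def __init__(self, check_media=True):
        super().__init__()
        self.check_media = check_media
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif self.check_media and tag in MEDIA_TAGS and attributes.get("src"):
            self.links.append(attributes["src"])


def collect_links(html, check_media=True):
    """Return every link target found in the given HTML, in document order."""
    collector = LinkCollector(check_media)
    collector.feed(html)
    return collector.links


page = '<a href="/about">About</a><img src="/logo.png"><video src="/clip.mp4"></video>'
print(collect_links(page))
print(collect_links(page, check_media=False))
```

A crawler would then request each collected target and flag those that do not answer successfully. With check_media set to False, only the anchor targets are collected.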

Check NetApp's E-Series

The first stable release (v1.1.0) of our brand new Check E-Series product brings an automated health check for E-Series nodes. check_eseries_health will run the health checks regularly and automatically report any errors found. Example:
$ ./check_eseries_health --host=10.1.1.125 --system-id 1
[...]
NETAPP ESERIES HEALTH CRITICAL - 16 health checks checked, 1 CRITICAL
StorageGRID-XG-102.missingVolumes: notCompleted (CRITICAL)
StorageGRID-XG-102.integratedHealthCheck: ok
StorageGRID-XG-102.dbSubRecordsValidation: ok
StorageGRID-XG-102.melEventCheck: ok
StorageGRID-XG-102.validPassword: ok
StorageGRID-XG-102.failedDrivesPresent: ok
StorageGRID-XG-102.exclusiveOperations: ok
StorageGRID-XG-102.driveCheck: ok
StorageGRID-XG-102.nvsramDisableCfwDownloads: ok
StorageGRID-XG-102.
Read full post

v6.x Customers: Missing --stm could fill up disk

We have discovered an issue in the new 6.x versions of check_netapp_pro that occurs when a getter contacts the new RESTful API of NetApp. These getters require an explicit --stm (e.g. --stm=1h). Otherwise the collected storefiles are not deleted and could fill up the monitoring server's disk. Again, this affects installations only if both of the following conditions are met: the plugin's version is 6.x, and the plugin's getter contacts the RESTful API of the filer (Ontap 9.
Read full post
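The effect of a retention time on collected files can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the logic of check_netapp_pro: it assumes that a value such as 1h is a maximum age and that storefiles are plain files in one directory. The function names are hypothetical.

```python
import os
import re
import time

SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration such as '1h' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * SECONDS[match.group(2)]


def prune_storefiles(directory, max_age, now=None):
    """Delete files older than max_age and return their names, sorted."""
    now = time.time() if now is None else now
    limit = parse_duration(max_age)
    removed = []
    # Materialise the listing first so deletion does not disturb iteration.
    for entry in list(os.scandir(directory)):
        if entry.is_file() and now - entry.stat().st_mtime > limit:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Without such a limit nothing is ever removed, and the directory grows with every check run.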