The family of Check NetApp REST monitoring plugins has grown. With the check_netapp_disk container-type plugin the storage admin has a constant eye on disks that have been moved into unwanted containers.
Let’s look at an example:
$ check_netapp_disk container-type -H sim96
NETAPP DISK CONTAINER TYPE OK - 28 disks checked
sim96cluster-01.NET-1.28: spare
sim96cluster-01.NET-1.27: spare
sim96cluster-01.NET-1.26: aggregate
sim96cluster-01.NET-1.25: aggregate
sim96cluster-01.NET-1.24: aggregate
sim96cluster-01.NET-1.23: aggregate
sim96cluster-01.NET-1.22: aggregate
sim96cluster-01.NET-1.21: aggregate
sim96cluster-01.NET-1.20: aggregate
sim96cluster-01.NET-1.19: spare
sim96cluster-01.
Input/output operations per second (IOPS), or simply operations per second (OPS), can be monitored at various levels:
System Operations (total_ops)
This is why the counter total_ops in the PerfSys check exists. Unfortunately, this counter is deprecated and will no longer be supported by NetApp in new versions of Data ONTAP. As an alternative, we would implement a check on a per-workload basis. The getter object needed for this check has already been implemented in current versions.
After a client reported that the cluster mode collector get_netapp-cm.pl needs 45 seconds or more to collect 4500 snapshots, we looked into ways to speed up the getter. Previous versions relied on the default setting of the NetApp API, which collects 20 instances at a time on the filer. Our first round of tests on a simulator showed that raising this value from 20 to 40 significantly reduces the runtime.
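The effect of the batch size on the number of API round trips can be estimated with simple arithmetic. The sketch below is only an illustration: it assumes that each request returns at most `batch_size` instances and that the number of requests is what drives the runtime. Neither assumption is confirmed by the tests described above, and `requests_needed` is a hypothetical helper, not part of the collector.

```python
import math


def requests_needed(instances: int, batch_size: int) -> int:
    """Return how many requests are needed when each request
    delivers at most batch_size instances (assumed model)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(instances / batch_size)


if __name__ == "__main__":
    # 4500 snapshots, as in the client report above.
    for batch_size in (20, 40):
        print(batch_size, requests_needed(4500, batch_size))
```

Under this model, doubling the batch size roughly halves the number of round trips: 225 requests at 20 instances versus 113 at 40.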
As of the next release, DiskPaths will skip unassigned disks (e.g. disks left over after an aggregate has been destroyed), since identifying them is not its job. Skipped disks can be listed with -verbose. From now on, it is therefore important to also configure the Disk check with --what=unassigned. As a best practice, no disks should be left unassigned (thanks to Walter RING of NetApp Austria for his insight!).
Our Disk check is getting a new feature. With the switch --what=non-zeroed-spare, an alarm is raised as soon as non-zeroed spare disks are found: a WARNING (including progress) if the zeroing process is currently running, otherwise a CRITICAL.
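The state decision described above can be sketched as follows. This is a minimal illustration, not the plugin's code: the `SpareDisk` class, its fields and the message wording are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpareDisk:
    # Hypothetical data model for one spare disk.
    name: str
    zeroed: bool
    # Progress of a running zeroing process in percent;
    # None means no zeroing process is running.
    zeroing_percent: Optional[int] = None


def spare_state(disk: SpareDisk) -> Tuple[str, str]:
    """Map a spare disk to a monitoring state and a message."""
    if disk.zeroed:
        return "OK", f"{disk.name}: zeroed spare"
    if disk.zeroing_percent is not None:
        # Zeroing is running: WARNING, including progress.
        return "WARNING", f"{disk.name}: zeroing ({disk.zeroing_percent}%)"
    # Non-zeroed and nothing is zeroing it: CRITICAL.
    return "CRITICAL", f"{disk.name}: non-zeroed spare"


if __name__ == "__main__":
    for d in (SpareDisk("d1", True),
              SpareDisk("d2", False, 40),
              SpareDisk("d3", False)):
        print(*spare_state(d))
```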
To make it easier to replace a broken hard disk, the Disk check will soon be able to display model, type, manufacturer and rpm using the switch --show_inventory_info=always|non-ok|never
Example from a simulator:
$ ./check_netapp_pro.pl Disk -H sim821 --show_inventory_info=always
NETAPP_PRO DISK OK - 14 disks checked
v5.16: not failed (**inventory-info: FCAL, NETAPP, VD-1000MB-FZ-520, 15000rpm**)
...
If the switch is set to --show_inventory_info=**non-ok**, the inventory info is only displayed if the disk is broken or failed.
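How the three modes always, non-ok and never affect a single output line can be sketched like this. The function name, its parameters and the rule that only "not failed" counts as OK are assumptions made for the example; the actual plugin may decide differently.

```python
from typing import Sequence

MODES = ("always", "non-ok", "never")


def format_disk_line(name: str, status: str,
                     inventory: Sequence[str], mode: str = "never") -> str:
    """Build one output line, appending inventory info per mode."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    line = f"{name}: {status}"
    # Assumption: every status other than "not failed" is non-OK.
    is_ok = status == "not failed"
    if mode == "always" or (mode == "non-ok" and not is_ok):
        line += f" (inventory-info: {', '.join(inventory)})"
    return line


if __name__ == "__main__":
    info = ["FCAL", "NETAPP", "VD-1000MB-FZ-520", "15000rpm"]
    print(format_disk_line("v5.16", "not failed", info, "always"))
    print(format_disk_line("v5.16", "not failed", info, "non-ok"))
```

With "non-ok", the healthy disk from the simulator example is printed without inventory info, which keeps the output short until a disk actually needs replacing.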
Until now, the Disk check inspected the failed (or offline) attribute of disks. Unfortunately, in cluster mode this meant that disks not belonging to an aggregate were skipped and therefore never checked. We have therefore extended the check as follows:
Failed disks will always be reported as such, irrespective of the cause.
In addition, cluster mode filers can be checked for _broken_ disks.
This means that broken disks of a cluster mode filer will be shown both in the check for _failed_ disks and in the check for _broken_ disks.
Unassigned disks currently cause the Disk Getter to crash. We have already developed a fix for this problem. At the same time, the **Disk** check can be used to check for unassigned disks:
$ **./check_netapp_pro.pl Disk -H fasdc_cluster --what=unassigned**
NETAPP_PRO DISK WARNING - 192 disks checked, 0 critical and 1 warning
fasdc_cluster 1.13.13: unassigned (WARNING)
fasdc_cluster-01 1.10.0: assigned to fasdc_cluster-01
fasdc_cluster-01 1.10.1: assigned to fasdc_cluster-01
fasdc_cluster-01 1.10.10: assigned to fasdc_cluster-01
(…)
The fix and extension will be available in version 3.
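As a rough illustration of how such a result could be assembled, the sketch below turns a list of disks into a summary line, detail lines and an exit code. It is not the plugin's implementation: the tuple layout and the function name are hypothetical, the wording is merely modelled on the sample output above, and the exit codes follow the usual monitoring plugin convention (0 = OK, 1 = WARNING).

```python
from typing import List, Optional, Tuple

# Hypothetical input: (node, disk name, owner or None if unassigned)
Disk = Tuple[str, str, Optional[str]]


def check_unassigned(disks: List[Disk]) -> Tuple[int, str]:
    """Return (exit_code, output) for an unassigned-disk check."""
    problems, normal = [], []
    for node, name, owner in disks:
        if owner is None:
            problems.append(f"{node} {name}: unassigned (WARNING)")
        else:
            normal.append(f"{node} {name}: assigned to {owner}")
    if problems:
        # Pluralisation of "warning" is not handled here.
        summary = (f"NETAPP_PRO DISK WARNING - {len(disks)} disks checked, "
                   f"0 critical and {len(problems)} warning")
        code = 1
    else:
        summary = f"NETAPP_PRO DISK OK - {len(disks)} disks checked"
        code = 0
    # Problem lines first, so they are visible without scrolling.
    return code, "\n".join([summary] + problems + normal)


if __name__ == "__main__":
    code, text = check_unassigned([
        ("fasdc_cluster-01", "1.10.0", "fasdc_cluster-01"),
        ("fasdc_cluster", "1.13.13", None),
    ])
    print(text)
    raise SystemExit(code)
```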