Identifying Overcommitment in Aggregates

New check displays a warning message when thin-provisioned volumes grow too quickly.

The check OvercommitAggr can already detect when the disk space actually used by thin-provisioned volumes leaves an aggregate short of space. Thresholds can currently be set as absolute values (in bytes) or as percentages (relative to the aggregate size). The newly included metric _growth_ adds a new dimension: it extrapolates past growth rates into the future and raises warnings based on the forecast. For example, it can warn that an aggregate's disk space usage would reach 80% within the next two days if its volumes continue to grow at the current rate. The key point is that the relationship between the sum of all volumes and the disk space actually available on an aggregate changes over time. In other words: should the trend of the last x days continue, the aggregate would run out of disk space within the next y days.
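The "x days back, y days ahead" reasoning can be sketched in a few lines of Python. This is only a minimal illustration that assumes linear growth between two usage samples; the function `days_until` and its parameters are hypothetical names chosen for this sketch and are not part of the check.

```python
from typing import Optional


def days_until(threshold_pct: float, usage_then: float, usage_now: float,
               lookbehind_days: float) -> Optional[float]:
    """Estimate the days until usage reaches threshold_pct (linear growth assumed).

    usage_then is the usage (in % of aggregate size) lookbehind_days ago,
    usage_now is the current usage. Returns 0.0 if the threshold is already
    reached, and None if usage is flat or shrinking, because the threshold
    would then never be reached at the current trend.
    """
    if usage_now >= threshold_pct:
        return 0.0
    rate_per_day = (usage_now - usage_then) / lookbehind_days
    if rate_per_day <= 0:
        return None
    return (threshold_pct - usage_now) / rate_per_day


# Usage grew from 25% to 45% within one day (+20% per day),
# so the remaining 55% would be consumed in 2.75 days.
print(days_until(100.0, 25.0, 45.0, 1.0))  # 2.75
```

How the real check samples usage and fits the trend may differ; the sketch only shows the arithmetic behind the idea.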

An example using the new metric growth

Let's look at an example in which we set the following parameters:

  • Retrospective period:  --lookbehind=1d (1 day)
  • Forecast period: --lookahead=2d (2 days)
  • Warning / critical threshold for the forecast period (in % of aggregate size): --warning=80 --critical=95

The following example shows how a series of events could unfold:

Day 1: 20% Usage → OK

Day 2: 15% Usage → OK (if the present trend continues, 5% usage is expected in 2 days)

Day 3: 20% Usage → OK (30% expected in 2 days)

Day 4: 25% Usage → OK (35% expected in 2 days)

Day 5: 45% Usage → WARNING (at the current trend, 85% usage is expected in 2 days, which exceeds the 80% warning threshold)

Day 6: 50% Usage → OK (60% expected in 2 days)

Day 7: 80% Usage → CRITICAL (140% expected in 2 days)
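The numbers in this series can be reproduced with a short Python sketch. It assumes a simple linear extrapolation from the sample one lookbehind period ago, and it treats a forecast strictly above a threshold as a violation; both are assumptions made for illustration, as are the function names `forecast` and `state`. The check's actual implementation may work differently.

```python
def forecast(usage_then: float, usage_now: float,
             lookbehind_days: float, lookahead_days: float) -> float:
    """Linearly extrapolate usage (in % of aggregate size) lookahead_days ahead."""
    rate_per_day = (usage_now - usage_then) / lookbehind_days
    return usage_now + rate_per_day * lookahead_days


def state(predicted: float, warning: float = 80.0, critical: float = 95.0) -> str:
    """Map a forecast to a check state (boundary handling is an assumption)."""
    if predicted > critical:
        return "CRITICAL"
    if predicted > warning:
        return "WARNING"
    return "OK"


# Daily usage samples from the example above (day 1 to day 7).
daily_usage = [20, 15, 20, 25, 45, 50, 80]

# lookbehind=1d, lookahead=2d; day 1 has no earlier sample, so start at day 2.
results = []
for i in range(1, len(daily_usage)):
    predicted = forecast(daily_usage[i - 1], daily_usage[i], 1, 2)
    results.append((i + 1, predicted, state(predicted)))

for day, predicted, label in results:
    print(f"Day {day}: {predicted:.0f}% expected in 2 days -> {label}")
```

Note that day 6 returns to OK even though usage is higher than on day 5: the state depends on the forecast, not on the current usage alone.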

