The following is a two-part solution. You can use either part independently, but I have found that the combination of the two works best for me.
After implementing the solution below you will be able to:
Part #1 - MONITORING:
1. Monitor any number of servers with any number of disks with just one application component per server
2. Monitor effectively both large disks (measured in terabytes) and small disks (measured in gigabytes)
3. Use just one component per server for all of its logical disks (saving you on licenses)
Part #2 - ALERTING:
1. Alert effectively on both large disks (measured in terabytes) and small disks (measured in gigabytes)
2. Quickly and easily alter threshold values for certain disks based on specific requirements
=======================================================
MONITORING
(1)
Let's first create a WMI script to monitor disk space across all logical disks on the server. With the arguments used in step (2) below (4 and 20000), it applies the following threshold:
IF FREE SPACE <= 4% AND FREE SPACE <= 20GB (20000 MB), the disk counts towards the alert statistic. This generally works well for both small and large disks.
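To see why combining the two conditions suits both small and large disks, here is a small illustrative sketch of the rule in Python. It is not part of the SAM setup: the function name `breaches_threshold` and the sample disk sizes are made up for this example. It mirrors the rounding and the `<=` comparisons that the VBScript uses.

```python
def breaches_threshold(free_bytes, size_bytes, pct_threshold=4, mb_threshold=20000):
    """Return True when a disk is at or below BOTH thresholds."""
    free_pct = round(free_bytes * 100 / size_bytes)   # whole percent, like Round(x, 0)
    free_mb = round(free_bytes / 1_000_000)           # decimal megabytes
    return free_pct <= pct_threshold and free_mb <= mb_threshold


GB = 1_000_000_000
TB = 1000 * GB

# 100 GB disk with 3 GB free: 3% and 3000 MB, both conditions met
print(breaches_threshold(3 * GB, 100 * GB))    # True
# 4 TB disk with 120 GB free: 3%, but 120000 MB is still plenty
print(breaches_threshold(120 * GB, 4 * TB))    # False
# 4 TB disk with 16 GB free: 0% (rounded) and 16000 MB, both conditions met
print(breaches_threshold(16 * GB, 4 * TB))     # True
# 100 GB disk with 10 GB free: under 20 GB, but 10% is above the percentage threshold
print(breaches_threshold(10 * GB, 100 * GB))   # False
```

A percentage-only rule would flag the 4 TB disk that still has 120 GB free, and a size-only rule would flag the 100 GB disk that still has 10% free. Requiring both avoids each of those false alarms.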
'Script will scan through all fixed disks on the node
'IF FREE SPACE IS AT OR BELOW 4% AND FREE SPACE IS AT OR BELOW 20GB - Statistic counter will be increased, which can be tracked in SAM
'This rule generally works very well for both large disks (measured in terabytes) and small disks (measured in gigabytes)
'--------------------------
'POSSIBLE IMPROVEMENT:
'This script checks all disks per node and applies the same threshold values (as described above)
'However, certain disks might need a specific threshold.
'An improvement could be made to use the "Script Arguments" field in SAM to configure overrides for certain logical disks
'--------------------------
On Error Resume Next
Const FAIL = 1, SUCCESS = 0
Const wbemFlagReturnImmediately = &h10
Const wbemFlagForwardOnly = &h20
Dim MonitoredServer
Dim arrComputers
Dim strComputer
Dim strFreePercentThreshold
Dim strFreePercentCurrent
Dim strFreeMBThreshold
Dim strFreeMBCurrent
Dim objItem, objWMIService, colItems
Dim Statistic
Dim Message
'SCRIPT INPUT: [IP] [FreePercentThreshold] [FreeMBThreshold]
'Example of input: ${IP} 4 20000
Set lstArgs = WScript.Arguments
If lstArgs.Count = 3 Then
    MonitoredServer = Trim(lstArgs(0))
    strFreePercentThreshold = Trim(lstArgs(1))
    strFreeMBThreshold = Trim(lstArgs(2))
Else
    WScript.Echo "Message: Usage: Script arguments should be [IP] [FreePercentThreshold] [FreeMBThreshold]"
    WScript.Echo "Statistic: 0"
    WScript.Quit( FAIL )
End If
'WScript.Echo "Your Input: " & MonitoredServer & " " & strFreePercentThreshold & " " & strFreeMBThreshold
arrComputers = Array(MonitoredServer)
For Each strComputer In arrComputers
    Statistic = 0
    Message = "DISKS USAGE"
    Err.Clear
    Set objWMIService = GetObject("winmgmts:" _
        & "{impersonationLevel=impersonate}!\\" & strComputer _
        & "\root\cimv2")
    If Err.Number <> 0 Then
        'Report why the component is down instead of exiting silently
        WScript.Echo "Message: Unable to connect to WMI on " & strComputer & ": " & Err.Description
        WScript.Echo "Statistic: 0"
        WScript.Quit( FAIL )
    End If
    Set colItems = objWMIService.ExecQuery("SELECT * FROM Win32_LogicalDisk WHERE Description LIKE '%Fixed%'", "WQL", _
        wbemFlagReturnImmediately + wbemFlagForwardOnly)
    For Each objItem In colItems
        'Skip disks that report no size; On Error Resume Next would otherwise hide the
        'division error and reuse the values of the previous disk
        If Not IsNull(objItem.Size) Then
            If CDbl(objItem.Size) > 0 Then
                strFreePercentCurrent = Round( ( (objItem.FreeSpace * 100) / objItem.Size),0 )
                'Decimal megabytes, so a threshold of 20000 means 20GB
                strFreeMBCurrent = Round((objItem.FreeSpace / 1000000),0)
                Message = Message & " | " & objItem.Caption & "(" & objItem.VolumeName & "), " & strFreePercentCurrent & "% Free, " & strFreeMBCurrent & "MB Free"
                If (CInt(strFreePercentCurrent) <= CInt(strFreePercentThreshold)) Then
                    'WScript.Echo "Free Percent Threshold Breached"
                    If (CLng(strFreeMBCurrent) <= CLng(strFreeMBThreshold)) Then
                        'WScript.Echo "AND Bytes Threshold Breached"
                        Statistic = Statistic + 1
                    End If
                End If
            End If
        End If
    Next
Next
'SAM reads these two lines: Statistic is the number of disks that breached both thresholds
WScript.Echo "Message: " & Message
WScript.Echo "Statistic: " & CInt(Statistic)
WScript.Quit( SUCCESS )
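The "possible improvement" mentioned in the script comments could look roughly like the sketch below. It is written in Python only to show the parsing logic. The extra `<drive>=<pct>,<mb>` argument syntax and the names `parse_args` and `thresholds_for` are assumptions invented for this example, not something SAM or the script above defines.

```python
def parse_args(args):
    """Split script arguments into the target, default thresholds and per-disk overrides.

    Hypothetical syntax: <ip> <pct> <mb> [<drive>=<pct>,<mb> ...]
    """
    if len(args) < 3:
        raise ValueError("expected at least: <ip> <pct> <mb>")
    ip = args[0]
    default = (int(args[1]), int(args[2]))
    overrides = {}
    for item in args[3:]:
        drive, _, values = item.partition("=")
        pct, mb = values.split(",")
        overrides[drive.upper()] = (int(pct), int(mb))
    return ip, default, overrides


def thresholds_for(drive, default, overrides):
    """Return the (percent, MB) thresholds for one drive letter such as 'C:'."""
    return overrides.get(drive.upper(), default)


ip, default, overrides = parse_args("10.0.0.5 4 20000 D:=10,50000 e:=2,5000".split())
print(thresholds_for("C:", default, overrides))   # (4, 20000)
print(thresholds_for("D:", default, overrides))   # (10, 50000)
print(thresholds_for("E:", default, overrides))   # (2, 5000)
```

In the VBScript, the same lookup would be keyed on objItem.Caption (for example "C:") just before the two threshold comparisons, and the check for exactly three arguments would need to be relaxed to "at least three".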
(2)
Create a SAM template containing a Windows Script Monitor component that runs the above script, and assign the application to the servers you want to monitor
- Supply ${IP} 4 20000 as the script arguments in the SAM template
- Here is what you should get in the front end
(3)
Create a group with a dynamic query that includes all those applications from all servers. This tells you at any point in time whether all your volumes are healthy. I use this group on my dashboard as well.
=======================================================
ALERTING
(4)
Create three additional custom properties for "Volumes".
We will use them as follows:
v_ovrd_prcnt: to override the disk percentage threshold
v_ovrd_bytes: to override the bytes threshold
v_ovrd_desc: to record a note for other engineers/users explaining why the default value has been overridden
(5)
Use the same principle as above to configure your alert rule, and include the possibility of overriding the default rule if necessary
(it looks a bit lengthy and complicated, but works perfectly well)
The result is:
* by default you will receive an alert IF FREE SPACE < 5% AND FREE SPACE < 20GB on any disk on any server
* to override this rule, simply configure the custom property value for either the percentage override or the bytes override (or both if you need to) in the SAM front end for the volume. If you configure just one override, for example v_ovrd_prcnt = 30%, then the alert will trigger at 30% free space, regardless of how much space in bytes you have left.
* Note that I have used a 5% and 20GB rule in alerting, as opposed to 4% and 20GB in the above monitoring script. This is because I would like to receive an email alert slightly before the problem ends up on the dashboard for everyone to see (most of the time I am able to fix it quietly, before it becomes visible to the whole business). For you this might be the other way round.
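The alert rule itself is configured in the SolarWinds alert editor, but the decision it makes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `should_alert` is made up, a bytes-only override is assumed to behave like the percentage-only case described above, and when both overrides are set the sketch assumes both must be breached. The strict `<` comparison follows the wording of the default rule.

```python
from typing import Optional

DEFAULT_PCT = 5                    # default: less than 5% free ...
DEFAULT_BYTES = 20 * 10**9         # ... AND less than 20GB free


def should_alert(free_pct: float, free_bytes: float,
                 ovrd_pct: Optional[float] = None,
                 ovrd_bytes: Optional[float] = None) -> bool:
    """Decide whether a volume should alert, honouring the optional overrides."""
    if ovrd_pct is None and ovrd_bytes is None:
        # No override: the default rule needs both conditions
        return free_pct < DEFAULT_PCT and free_bytes < DEFAULT_BYTES
    checks = []
    if ovrd_pct is not None:
        checks.append(free_pct < ovrd_pct)       # stands in for v_ovrd_prcnt
    if ovrd_bytes is not None:
        checks.append(free_bytes < ovrd_bytes)   # stands in for v_ovrd_bytes
    # Assumption: every override that is set must be breached
    return all(checks)


GB = 10**9
print(should_alert(4, 15 * GB))                         # True: default rule, both breached
print(should_alert(4, 120 * GB))                        # False: default rule, plenty of bytes left
print(should_alert(25, 500 * GB, ovrd_pct=30))          # True: percentage override only
print(should_alert(35, 1 * GB, ovrd_pct=30))            # False: bytes are ignored once only v_ovrd_prcnt is set
print(should_alert(50, 40 * GB, ovrd_bytes=50 * GB))    # True: bytes override only
```

If your own rule should treat two overrides as "either one triggers", replace `all(checks)` with `any(checks)`.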
--
Thank you,
Alex