Channel: THWACK: All Content - Server & Application Monitor

Best Practices for Alerting & Alarm Management


We’re just getting started with SAM & LEM and are looking for suggestions on how to handle alerting and alarm management. We don’t have a true 24x7 operations team; instead we use an on-call rotation to respond to critical issues after business hours. I’m assuming this is typical of many shops: the business wants systems up and running 24x7 but doesn’t want to fund a 24x7 Operations Center. So I’m curious how others are using SolarWinds to manage this.

Do most of you rely on the built-in alerts to page your teams 24x7, after spending a few weeks or months tweaking the thresholds and minimizing the “noise”? Or do you have different paging rules for after hours (maybe up/down only) and then respond to other health alarms during business hours? Do your teams keep SAM/LEM open at all times to monitor systems, or only when they’re troubleshooting an alert? Also, I haven’t seen this feature yet, but do you have an acknowledgement process where someone must respond to an alert within a specific amount of time before it escalates to a different on-call person?
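To make the two ideas above concrete (up/down-only paging after hours, and escalation when nobody acknowledges in time), here is a minimal Python sketch of the logic. It does not use any SolarWinds API or reflect how SAM/LEM implement alerting. The function names, the 08:00-17:00 weekday business hours, the "down" severity label, and the 15-minute acknowledgement window are all assumptions chosen purely for illustration.

```python
from datetime import datetime, time, timedelta
from typing import List, Optional

# Assumed values for illustration only; these are not SolarWinds settings.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(17, 0)
ACK_WINDOW = timedelta(minutes=15)


def is_business_hours(now: datetime) -> bool:
    """True on Monday-Friday between BUSINESS_START and BUSINESS_END."""
    return now.weekday() < 5 and BUSINESS_START <= now.time() < BUSINESS_END


def should_page(severity: str, now: datetime) -> bool:
    """Page for every alert during business hours, but only for 'down' after hours."""
    if is_business_hours(now):
        return True
    return severity == "down"


def current_responder(
    raised_at: datetime, now: datetime, acked: bool, rotation: List[str]
) -> Optional[str]:
    """Return who should be paged right now, or None once the alert is acknowledged.

    Each ACK_WINDOW that passes without an acknowledgement moves the page one
    step down the rotation, stopping at the last person in the list.
    """
    if acked:
        return None
    steps = max(0, int((now - raised_at) / ACK_WINDOW))
    return rotation[min(steps, len(rotation) - 1)]
```

For example, with a rotation of `["primary", "secondary"]`, an alert that is still unacknowledged 20 minutes after it was raised would go to `"secondary"`, while a threshold warning raised at 23:00 would not page anyone and would wait for business hours.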

