Each rule has a set of configurable settings. The available configuration options depend on the type of rule.
- For all rules you can configure the actions taken when a rule violation occurs. This includes specifying whether or not notifications will be sent out and whether or not the state of the object will change.
- Event Count Rules - For most rules you can specify whether events generated by the rule are “Error” Events or “Warning” Events. For “Warning” Events you can specify two additional settings. The “Event Count Escalation” determines the number of Events that cause the rule to escalate to an “Error” Event. The “Escalation/De-escalation Timing” determines the time window (in minutes) that is considered for escalating and de-escalating the status of the Event.
- Threshold Rules - For rules that relate to thresholds, there are two types of thresholds:
- Value thresholds - These thresholds determine the specific values at which an object escalates to a "Warning" or an "Error" state. The rule defines the conditions for escalating to the "Warning" and "Error" states and also the conditions for de-escalating from the "Error" state to the "Warning" state and to the "OK" state. For example, a CPU Usage rule can determine that if CPU utilization reaches or exceeds the 50% threshold, the object escalates to a "Warning" state, and if the value is above the 80% threshold, it escalates to an "Error" state. To protect against "flapping", the rule also includes escalation and de-escalation delays that require the object to spend the configured time (in minutes) in the higher/lower state before the state actually escalates/de-escalates. For example, CPU utilization has to stay under 80% for a period of 15 minutes in order to de-escalate from the "Error" state to the "Warning" state.
- State-Over-Time Thresholds - These thresholds apply to alerts that are based on a "Boolean" state, either because the state is inherently Boolean (e.g., Source Offline) or because the exact value is not important (e.g., CC errors or not recovered packets). The threshold is defined by a time window and the percentage of that window spent in the relevant state. For example, if the error threshold is 50% out of 6 minutes and a source reports a blank picture for 3 minutes out of the 6-minute window, ZM will put it into an error state. By contrast, if the warning threshold is 5% out of 60 minutes and a source reports a blank picture for 3 minutes out of the 60-minute window, ZM will put it into a warning state. To protect against "flapping", the rule also includes escalation and de-escalation delays that require the object to spend the configured time (in minutes) in the higher/lower state before the state actually escalates/de-escalates.
- The conditions are based on the percentage of time that the object spent in the relevant state within the configured escalation/de-escalation timing window. Warnings and Errors use separate escalation and de-escalation windows, allowing users to refine the escalation more precisely. For example, a source reporting a blank picture for 5% of a 60-minute window might constitute a warning, while a source reporting a blank picture for 10% of a 15-minute window can be configured to escalate to an error.
The following is an example of the settings that can be configured for a rule that involves warning and error thresholds.
[Image: example rule settings with warning and error thresholds]
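
The state-over-time evaluation described above can be illustrated with a short sketch. This is not ZM's implementation or API: the names `StateOverTimeThreshold`, `percent_in_state`, and `evaluate` are hypothetical, and the sketch assumes one Boolean sample per minute (`True` means the object was in the relevant state, such as blank picture). It ignores the escalation/de-escalation delays and only shows how separate warning and error windows can be checked.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StateOverTimeThreshold:
    """Hypothetical container: share of a trailing window spent in the bad state."""
    percent: float       # e.g. 50.0 means 50% of the window
    window_minutes: int  # e.g. 6 means the last 6 minutes


def percent_in_state(samples: Sequence[bool], window_minutes: int) -> float:
    """Percentage of the last `window_minutes` minutes spent in the bad state.

    Assumes one sample per minute, oldest first. Minutes with no sample
    count as time outside the bad state.
    """
    if window_minutes <= 0:
        return 0.0
    window = samples[-window_minutes:]
    return 100.0 * sum(window) / window_minutes


def evaluate(samples: Sequence[bool],
             warning: StateOverTimeThreshold,
             error: StateOverTimeThreshold) -> str:
    """Check the error window first, then the warning window."""
    if percent_in_state(samples, error.window_minutes) >= error.percent:
        return "Error"
    if percent_in_state(samples, warning.window_minutes) >= warning.percent:
        return "Warning"
    return "OK"


# Thresholds taken from the example in the text above.
WARNING = StateOverTimeThreshold(percent=5.0, window_minutes=60)
ERROR = StateOverTimeThreshold(percent=50.0, window_minutes=6)

# 3 blank-picture minutes at the end of the hour fall inside the 6-minute window.
recent_blank = [False] * 57 + [True] * 3
# 3 blank-picture minutes at the start of the hour only count for the 60-minute window.
old_blank = [True] * 3 + [False] * 57

print(evaluate(recent_blank, WARNING, ERROR))
print(evaluate(old_blank, WARNING, ERROR))
```

Checking the error window before the warning window means the more severe state takes precedence when both conditions are met, which is a design choice of this sketch rather than a documented ZM behavior.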
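
The value-threshold behavior with escalation and de-escalation delays can be sketched as a small state tracker. This is a minimal illustration under stated assumptions, not ZM's actual logic: `Level`, `raw_level`, and `track_state` are hypothetical names, samples are assumed to arrive once per minute, and the 1-minute escalation delay is an assumed default (the 50%/80% thresholds and the 15-minute de-escalation delay come from the CPU example above).

```python
from enum import IntEnum
from typing import List, Sequence


class Level(IntEnum):
    OK = 0
    WARNING = 1
    ERROR = 2


def raw_level(value: float, warning_at: float, error_above: float) -> Level:
    """Level implied by a single sample, ignoring any delays."""
    if value > error_above:
        return Level.ERROR
    if value >= warning_at:
        return Level.WARNING
    return Level.OK


def track_state(values: Sequence[float],
                warning_at: float = 50.0,
                error_above: float = 80.0,
                escalate_after: int = 1,      # assumed delay, in minutes
                deescalate_after: int = 15) -> List[Level]:
    """Return the reported state after each one-minute sample.

    The state only changes after the samples have stayed above (or below)
    the current state for the configured number of consecutive minutes.
    """
    state = Level.OK
    streak_dir = 0               # +1 escalating, -1 de-escalating, 0 none
    streak: List[Level] = []     # raw levels seen during the current streak
    history: List[Level] = []
    for value in values:
        raw = raw_level(value, warning_at, error_above)
        direction = (raw > state) - (raw < state)
        if direction == 0 or direction != streak_dir:
            streak = []          # streak broken: the delay starts over
        if direction != 0:
            streak.append(raw)
            needed = escalate_after if direction > 0 else deescalate_after
            if len(streak) >= needed:
                recent = streak[-needed:]
                # Move only as far as every sample in the streak supports.
                state = min(recent) if direction > 0 else max(recent)
                streak = []
        streak_dir = direction
        history.append(state)
    return history


# CPU at 85% triggers "Error"; 15 minutes at 70% are needed to drop to "Warning".
cpu = [30.0, 85.0] + [70.0] * 15
states = track_state(cpu)
print(states[1].name, states[15].name, states[16].name)
```

Resetting the streak whenever a sample returns to the current state is what suppresses flapping in this sketch: a brief dip below 80% does not count toward de-escalation unless it lasts the full delay.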