Escalation Rule Example

Consider a healthy Zixi Source that suddenly encounters intermittent packet loss over the course of several minutes, which eventually recovers. Notifications for packet loss, resulting in CC Errors, are controlled by the “Source – Analysis – TR101 P1 Transport” Rule of the Event Profile, which is configured as follows:

The escalation window is set to 10 minutes, with an escalation count of 4 events. According to the rule, notification emails are sent for both Error and OK states.

The resulting behavior of the monitoring system is as follows:

  1. ZEN Master checks the source. The source is actively streaming, and packets are delivered. For each confirmation during the rolling escalation window, the response is good.

  2. A network error causes the source to lose packets. The following state check informs ZEN Master of CC Errors due to packet loss. As this is the first non-positive event during the window, ZEN Master places the source into a warning state.

  3. The source recovers, and the next check indicates that the source is good. ZEN Master returns the source to a good state.

  4. Further packet loss occurs, and the following checks return non-positive results indicating CC Errors. ZEN Master returns the source to a warning state. Note that the cumulative event total is now 2, and that consecutive non-positive results are considered a single ongoing event.

  5. Monitoring of the source continues, returning several more CC Error results between two sets of consecutive good responses. The event count has reached 3 when the following check returns another non-positive result. The event count within the window is now 4, meeting the escalation requirement. ZEN Master now escalates the state of the source to an Error. Due to the notification settings, an Alert email is sent to users with notification privileges on the source.

  6. The next and subsequent confirmations return positive results, but the Source state has already met the Error criteria within the window and remains in Error.

  7. Once the Error criteria are met, further responses indicating CC Errors within the window cause the Error state on the Source to persist.

  8. The source recovers, and subsequent checks return a positive state. After a full de-escalation window of good responses, ZEN Master returns the source to a Good status. Due to the notification settings, an Alert email is sent to all users with notification privileges on the source.
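
The event-count behavior in the steps above can be sketched in a few lines of Python. This is only an illustrative model, not ZEN Master's implementation: the class name EscalationTracker, its fields, and the 60-second check interval are assumptions made for the example. The 10-minute window and the count of 4 events come from the rule described above.

```python
from dataclasses import dataclass, field


@dataclass
class EscalationTracker:
    """Hypothetical model of the event-count rule (not ZEN Master's code)."""
    window_s: int = 600   # escalation window: 10 minutes, as in the example
    count: int = 4        # escalation count: 4 events, as in the example
    state: str = "good"
    event_starts: list = field(default_factory=list)
    prev_ok: bool = True
    last_bad: float = float("-inf")

    def check(self, t: float, ok: bool) -> str:
        """Record one check result at time t (seconds) and return the state."""
        if not ok:
            if self.prev_ok:
                # A run of consecutive non-positive results is a single event.
                self.event_starts.append(t)
            self.last_bad = t
        self.prev_ok = ok

        # Keep only events that started inside the rolling window.
        self.event_starts = [
            s for s in self.event_starts if t - s < self.window_s
        ]

        if self.state == "error":
            # Error persists until a full window of good responses has passed.
            if ok and t - self.last_bad >= self.window_s:
                self.state = "good"
                self.event_starts.clear()
        elif len(self.event_starts) >= self.count:
            self.state = "error"
        else:
            self.state = "good" if ok else "warning"
        return self.state


# Replay steps 1 to 8, assuming one check every 60 seconds (an assumption).
results = [True, False, True, False, False, True, False, True, False]
results += [True] * 10
tracker = EscalationTracker()
states = [tracker.check(i * 60, ok) for i, ok in enumerate(results)]
for i, s in enumerate(states):
    print(f"t={i * 60:>4}s  ok={results[i]!s:<5}  state={s}")
```

In this sketch the two consecutive non-positive results in step 4 are counted once, the fourth event inside the window triggers the Error state, and the Error state is held until 10 minutes have passed since the last non-positive result.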

 

In a second scenario, the healthy source encounters a sustained interruption before recovery.

  1. The healthy source begins to return non-positive responses indicating CC Errors, and ZEN Master places the source in a warning state. The cumulative event count is 1. 

  2. Packet loss is ongoing for almost the full duration of the escalation window, and the confirmations continue to return non-positive responses indicating CC Errors. The event count remains 1 and the warning state persists.

  3. When another non-positive response arrives and the full duration of the escalation window has been reached, ZEN Master escalates the state of the source to an Error. Note that in this scenario the event count did not exceed one, but continual non-positive results for the duration of the window result in an Error state. Due to the notification settings, an Alert email is sent to all users with notification privileges on the source.

  4. The source recovers, and subsequent confirmations return a good state. After a full de-escalation window of good responses (i.e., 10 minutes and 4 event counts), ZEN Master finally returns the source to an OK state. ZEN Master generates an Alert email to all users with notification privileges on the source.
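
The sustained-interruption case can be sketched the same way. The function below is a hypothetical illustration that models only the duration rule from this scenario: a single ongoing event escalates once it spans the full window. The function name, the 60-second check interval, and the way recovery is timed from the last non-positive result are assumptions, not documented ZEN Master behavior.

```python
def sustained_outage_states(results, interval_s=60, window_s=600):
    """Return the state after each check under the duration rule only.

    results: list of booleans, True for a good response.
    interval_s: assumed time between checks, in seconds.
    window_s: escalation window (10 minutes in the example).
    """
    states = []
    state = "good"
    bad_since = None   # start time of the current run of non-positive results
    last_bad = None    # time of the most recent non-positive result
    for i, ok in enumerate(results):
        t = i * interval_s
        if not ok:
            if bad_since is None:
                bad_since = t
            last_bad = t
            if state != "error":
                # One ongoing event escalates once it spans the whole window.
                state = "error" if t - bad_since >= window_s else "warning"
        else:
            bad_since = None
            if state != "error":
                state = "good"
            elif t - last_bad >= window_s:
                # A full window of good responses clears the Error state.
                state = "good"
        states.append(state)
    return states


# One good check, eleven non-positive checks in a row, then recovery.
results = [True] + [False] * 11 + [True] * 10
states = sustained_outage_states(results)
for i, s in enumerate(states):
    print(f"t={i * 60:>4}s  ok={results[i]!s:<5}  state={s}")
```

Here the event count never exceeds 1, yet the source still reaches the Error state once the run of non-positive results has lasted the full 10 minutes.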