Sudden Change is a time-series metric used to analyze trends in historical data and make predictions based on past behavior.
This process finds an expected average of activity by generating data points every 5 minutes in a sliding, one-hour window of time. It then uses this data to predict the next interval.
When the actual value for that interval arrives and falls outside the expected prediction range, the interval is defined as a sudden change event. You can configure the scope of this metric’s prediction range by adjusting the acceptable percentage change of the future interval; doing so increases or decreases the sensitivity of your policy and affects the frequency of any associated alerts.
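The following Python sketch illustrates the idea of projecting the next 5-minute interval from a one-hour window and deriving a prediction range from an acceptable percentage change. It assumes an ordinary least-squares line as the model, which may differ from what the product actually uses. The names `WINDOW_POINTS`, `project_next`, and `prediction_range` are hypothetical and are not part of Metricly.

```python
from typing import Sequence, Tuple

# Assumption: one hour of 5-minute data points gives 12 points per window.
WINDOW_POINTS = 12


def project_next(window: Sequence[float]) -> float:
    """Fit a least-squares line through the window and extrapolate one step ahead."""
    n = len(window)
    if n < 2:
        raise ValueError("need at least two data points to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(window) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(window))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The next interval sits at index n, one step past the window.
    return intercept + slope * n


def prediction_range(window: Sequence[float], pct_change: float) -> Tuple[float, float]:
    """Return (low, high) bounds around the projected value.

    pct_change is the acceptable percentage change, for example 50 for 50%.
    """
    projected = project_next(window)
    margin = abs(projected) * pct_change / 100
    return projected - margin, projected + margin


# A steadily rising metric: 100, 101, ..., 111 projects to 112.
window = [100.0 + i for i in range(WINDOW_POINTS)]
low, high = prediction_range(window, pct_change=50)
```

A larger `pct_change` widens the range, so fewer observed values fall outside it and fewer events are raised.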
Sudden Change is useful for trend analysis on hourly min/max rollup data, like disk space usage or number of page hits. As a best practice, understand the average behavior of a metric before you set up a Sudden Change condition. If the behavior is naturally prone to fluctuate, use a higher percentage. This widens the prediction range, avoiding unnecessary policy activity and alerts.
For example, suppose you are monitoring the response time for a transaction. The policy's condition looks for increases in response time of more than 50%, with a duration of 2 hours. An event is generated once the response time has exceeded the expected average by more than 50% for that duration.
Complete the following steps to configure a new policy using the Sudden Change condition:
The following graph depicts a policy with a sudden change deviation condition on a certain metric. In this example, an alert triggers due to a sudden change event (in red) on the 25th data point.
This graph only demonstrates the functionality and does not appear in the product. In Metricly, a red dot appears on the Metric Explorer graph where the sudden change event occurred.
A percent drop, or step change, is computed as:
| (projected value) - (observed value) | / | (projected value) |
The sudden change algorithm returns this value for use by a condition in a policy. If the value exceeds the threshold in the policy condition, then the condition is true. If all the other conditions in the policy (if any) are also true, then an event is emitted.
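As a rough illustration of the formula and the threshold comparison described above, the following Python sketch computes the change and tests it against a policy threshold. The function names `percent_change` and `condition_is_true` are hypothetical, and expressing the threshold as a percentage is an assumption about how the condition is configured.

```python
def percent_change(projected: float, observed: float) -> float:
    """Return |projected - observed| / |projected| as a fraction."""
    if projected == 0:
        # The ratio is undefined when the projected value is zero.
        raise ValueError("projected value must be non-zero")
    return abs(projected - observed) / abs(projected)


def condition_is_true(projected: float, observed: float, threshold_pct: float) -> bool:
    """True when the change, expressed as a percentage, exceeds the threshold."""
    return percent_change(projected, observed) * 100 > threshold_pct


# Projected 200, observed 80: a 60% change, which exceeds a 50% threshold.
fires = condition_is_true(projected=200.0, observed=80.0, threshold_pct=50.0)
```

Because the formula uses absolute values, it measures the size of the change and does not distinguish an increase from a decrease on its own.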
Before reporting back the above value as a potential change, the algorithm performs several checks.
One of these checks is designed to determine if the regression model is a good enough fit for us to have any confidence in its projected value. Another check is used to add confidence that the observed value is sufficiently different from the projected value to be truly “anomalous”. Additional checks deal with detecting the trend in values leading up to the observed value. For example, if the trend was already negative and the actual observed value is just a continuation of that trend, then no drop will be reported. The algorithm also requires data points to be consistently available and not sparse.
In the above example, some of these checks may have failed. In that case, the algorithm reports back that there was no drop.
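The sketch below shows one way such gating checks could be arranged in Python. It is not the product's implementation: the thresholds `MIN_R_SQUARED`, `MAX_MISSING_FRACTION`, and `MIN_SIGMAS`, the use of R-squared and residual standard deviation, and the function names are all assumptions chosen for illustration. The trend check is omitted because the text does not describe it in enough detail to reproduce.

```python
import math
from typing import Optional, Sequence, Tuple

# All thresholds are illustrative assumptions, not product values.
MIN_R_SQUARED = 0.8          # minimum fit quality needed to trust the projection
MAX_MISSING_FRACTION = 0.25  # tolerated share of missing points in the window
MIN_SIGMAS = 3.0             # observed must be this many residual std devs away


def fit_line(points: Sequence[Tuple[int, float]]) -> Tuple[float, float, float, float]:
    """Least-squares fit; returns (slope, intercept, r_squared, residual_std)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared, math.sqrt(ss_res / n)


def reported_change(window: Sequence[Optional[float]], observed: float) -> Optional[float]:
    """Return the fractional change, or None when any check fails."""
    points = [(i, y) for i, y in enumerate(window) if y is not None]
    # Check: data points must be consistently available, not sparse.
    if len(points) < 3 or 1 - len(points) / len(window) > MAX_MISSING_FRACTION:
        return None
    slope, intercept, r_squared, resid_std = fit_line(points)
    # Check: the regression must fit well enough to trust its projection.
    if r_squared < MIN_R_SQUARED:
        return None
    projected = intercept + slope * len(window)
    if projected == 0:
        return None
    # Check: the observed value must differ enough from the projection.
    if abs(projected - observed) <= MIN_SIGMAS * resid_std:
        return None
    return abs(projected - observed) / abs(projected)
```

Returning `None` rather than `0.0` keeps "no change reported" distinct from a computed change of zero, so a caller cannot mistake a failed check for a measured value.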
When configuring a condition for sudden change deviation, we recommend setting a duration of no longer than 5 minutes. This is due to the nature of the event you are trying to capture: a single, sudden change in activity. Requiring a second sudden change over a longer duration may cause your policy to never activate, meaning you could miss otherwise genuine alerts.
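To see why a long duration can keep a policy from activating, consider the following Python sketch. It assumes that a duration requires every 5-minute data point within that span to exceed the threshold; this is an interpretation for illustration, and `policy_fires` and `INTERVAL_MINUTES` are hypothetical names.

```python
from typing import Sequence

# Assumption: data points arrive every 5 minutes.
INTERVAL_MINUTES = 5


def policy_fires(changes_pct: Sequence[float], threshold_pct: float, duration_minutes: int) -> bool:
    """True if the most recent points spanning the duration all exceed the threshold."""
    needed = max(1, duration_minutes // INTERVAL_MINUTES)
    if len(changes_pct) < needed:
        return False
    return all(change > threshold_pct for change in changes_pct[-needed:])


# Percentage change per interval; only the latest interval shows a sudden change.
changes = [2.0, 3.0, 1.5, 60.0]
fires_short = policy_fires(changes, threshold_pct=50.0, duration_minutes=5)
fires_long = policy_fires(changes, threshold_pct=50.0, duration_minutes=10)
```

Under this assumption, the 5-minute duration catches the single spike, while the 10-minute duration needs two consecutive spikes and therefore stays silent.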