# Alerts

AWS ASG Policies

Policy names are prefixed with "AWS ASG –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | (and) Condition 3 | Cat. | Description |
|---|---|---|---|---|---|---|
| Elevated CPU Activity (Normal Network Activity) | 30 min | aws.ec2.cpuutilization has an upper baseline + upper contextual deviation | metricly.aws.ec2.bytesinpersec does not have an upper baseline + upper contextual deviation | metricly.aws.ec2.bytesoutpersec does not have an upper baseline + upper contextual deviation | INFO | This policy is designed to catch cases where CPU activity is higher than normal and cannot be explained by a corresponding increase in network traffic. |

AWS DynamoDB Policies

Policy names are prefixed with "AWS DynamoDB –".

| Policy name | Duration | Condition 1 | Cat. | Description |
|---|---|---|---|---|
| Elevated Read Capacity Utilization | 30 min | metricly.aws.dynamodb.readcapacityutilization has an upper baseline deviation + an upper contextual deviation + a static threshold ≥ 50 | WARNING | Read Capacity Utilization has been higher than expected for over 30 minutes; also, the actual value has been above 50% for that time. |
| Elevated Write Capacity Utilization | 30 min | metricly. | | |

AWS EBS Policies

Before reading about the EBS default policy, it is important to understand the following Metricly-computed metrics.

Average Latency: The average amount of time that it takes for a disk operation to complete.

Queue Length Differential: The difference between the actual disk queue length and the "ideal" disk queue length. The ideal queue length is based on Amazon's rule of thumb that for every 200 IOPS you should have a queue length of 1.
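The rule of thumb above can be turned into a small calculation. The following is a minimal sketch, not Metricly's implementation: the function names, the `actual - ideal` sign convention, and the sample values are assumptions made for illustration.

```python
IOPS_PER_QUEUE_SLOT = 200  # Amazon's rule of thumb: queue length of 1 per 200 IOPS


def ideal_queue_length(iops: float) -> float:
    """Return the 'ideal' disk queue length for a given IOPS figure."""
    return iops / IOPS_PER_QUEUE_SLOT


def queue_length_differential(actual_queue_length: float, iops: float) -> float:
    """Difference between the actual and ideal queue lengths.

    Assumption: positive values mean the queue is longer than ideal.
    """
    return actual_queue_length - ideal_queue_length(iops)


if __name__ == "__main__":
    # Hypothetical volume: 1,000 IOPS with an observed queue length of 8.
    print(queue_length_differential(actual_queue_length=8, iops=1000))  # 3.0
```

Under these assumptions, a volume doing 1,000 IOPS has an ideal queue length of 5, so an observed queue length of 8 gives a differential of 3.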

AWS EC2 Policies

Policy names are prefixed with "AWS EC2 –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | (and) Condition 3 | Cat. | Description |
|---|---|---|---|---|---|---|
| Elevated CPU Activity (Normal Network Activity) | 30 min | aws.ec2.cpuutilization has an upper baseline deviation + an upper contextual deviation | metricly.aws.ec2.bytesinpersec does not have an upper baseline deviation + does not have an upper contextual deviation | metricly.aws.ec2.bytesoutpersec does not have an upper baseline deviation + does not have an upper contextual deviation | INFO | Increases in CPU activity are not uncommon when there is a rise in network activity. |
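A policy of this shape fires only when all of its conditions hold for the full duration. The sketch below illustrates that AND-over-a-window logic. It is not Metricly's policy engine: the `Sample` type, the collapsing of baseline and contextual deviations into one boolean flag per metric, and the 5-minute evaluation interval are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One evaluation interval; each flag says whether that metric deviated upward."""
    cpu_elevated: bool
    bytes_in_elevated: bool
    bytes_out_elevated: bool


def conditions_met(s: Sample) -> bool:
    # CPU is elevated AND neither network metric is elevated.
    return s.cpu_elevated and not s.bytes_in_elevated and not s.bytes_out_elevated


def policy_fires(samples: List[Sample], duration_min: int = 30, interval_min: int = 5) -> bool:
    """True if every sample in the most recent `duration_min` minutes meets the conditions."""
    needed = duration_min // interval_min
    if len(samples) < needed:
        return False  # not enough history to cover the duration
    return all(conditions_met(s) for s in samples[-needed:])
```

With these assumptions, six consecutive 5-minute samples showing elevated CPU and normal network traffic would trigger the policy, while a single sample with elevated inbound traffic inside the window would not.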

AWS EFS Policies

Policy names are prefixed with "AWS EFS –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | (and) Condition 3 | Cat. | Description |
|---|---|---|---|---|---|---|
| AWS EFS – Depleted Burst Credit Balance | 15 minutes | aws.efs.burstcreditbalance = 0 | | | Critical | There are no burst credits left. The number of burst credits that a file system has is zero. |
| AWS EFS – IO Percentage Critical | 15 minutes | aws. | | | | |

AWS ELB Policies

Policy names are prefixed with "AWS ELB –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | Category | Description |
|---|---|---|---|---|---|
| Elevated BackendError Rate (Low Volume) | 15 min | metricly.aws.elb.httpcodebackenderrorpercent has an upper baseline deviation + an upper contextual deviation | metricly.aws.elb.requestcount has a static threshold < 1,000 | WARNING | This is the first of three policies that look at elevated backend error rates. This policy looks specifically at low traffic volume cases. |

AWS Elasticache Policies

Policy names are prefixed with "AWS Elasticache –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | Category | Description |
|---|---|---|---|---|---|
| Memcached – CPU Threshold Exceeded | 5 min | aws.elasticache.cpuutilization has a static threshold > 90% | | CRITICAL | The Memcached node has exceeded the CPU threshold of 90%. The cache cluster may need to be scaled, either by using a larger node type or by adding more nodes. |

AWS Lambda Policies

Policy names are prefixed with "AWS Lambda –".

| Policy name | Duration | Conditions | Category | Description |
|---|---|---|---|---|
| Elevated Invocation Count | 30 min | aws.lambda.invocations has an upper baseline deviation + an upper contextual deviation | WARNING | The number of calls to the function (invocations) has been greater than expected for at least the last 30 minutes. |
| Depressed Invocation Count | 10 min | aws.lambda.invocations has a lower baseline deviation + a lower contextual deviation | WARNING | The number of calls to the function (invocations) has been lower than expected for at least the last 10 minutes. |

AWS RDS Policies

Policy names are prefixed with "AWS RDS –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | (and) Condition 3 | Cat. | Description |
|---|---|---|---|---|---|---|
| Elevated RDS CPU Activity (Normal Network Activity) | 30 min | metricly.aws.rds.cpuutilization has an upper baseline deviation + an upper contextual deviation + a static threshold > 20 | metricly.aws.rds.networkreceivethroughput does not have an upper baseline deviation + does not have an upper contextual deviation | metricly. | | |

AWS Route 53 Policies

Connection Time: The connection time is higher than usual. ConnectionTime is the average time, in milliseconds, that it took Route 53 health checkers to establish a TCP connection with the endpoint.

Health Check Status: The health check failed. HealthCheckStatus is the status of the health check endpoint that CloudWatch is checking. 1 indicates healthy, and 0 indicates unhealthy.

Time of First Byte: The time to first byte is higher than usual.
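The HealthCheckStatus encoding described above (1 for healthy, 0 for unhealthy) can be interpreted with a few lines of code. This is only an illustrative sketch: the function names are hypothetical, and treating any value other than 1 as unhealthy is an assumption, not documented behavior.

```python
from typing import Iterable


def is_healthy(health_check_status: float) -> bool:
    """HealthCheckStatus: 1 indicates healthy, 0 indicates unhealthy."""
    # Assumption: anything other than exactly 1 is treated as unhealthy.
    return health_check_status == 1


def any_failed(statuses: Iterable[float]) -> bool:
    """True if at least one reported status in the window is unhealthy."""
    return any(not is_healthy(s) for s in statuses)
```

For example, `any_failed([1, 1, 0, 1])` reports a failure because one data point in the window is 0.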