Policy names are prefixed with Kafka –
Policy name | Duration | Condition 1 | (and) Condition 2 | Category | Description |
---|---|---|---|---|---|
Depressed Number of Zookeeper Connections | 30 min | kafka.zookeeper.zk_num_alive_connections has a lower baseline deviation | | WARNING | The number of active connections to Zookeeper has been lower than expected for at least the past 30 minutes. |
Elevated Consumer Lag | 15 min | kafka.zookeeper.consumer_groups.*.consumer_lag has an upper baseline deviation | | WARNING | Consumer lag has been higher than expected for at least 15 minutes. |
Elevated Consumer Purgatory Size | 15 min | kafka.server.DelayedOperationPurgatory.Fetch.PurgatorySize has an upper baseline deviation | | WARNING | The purgatory size for consumer fetch requests is higher than expected. This may be causing increases in consumer request latency. |
Elevated Consumer Servicing Time | 15 min | kafka.network.RequestMetrics.FetchConsumer.TotalTimeMs.Mean has an upper baseline deviation | | WARNING | The broker is taking longer than usual to service consumer requests. |
Elevated Number of Outstanding Zookeeper Requests | 15 min | kafka.zookeeper.zk_outstanding_requests has an upper baseline deviation | | WARNING | The number of outstanding Zookeeper requests has been higher than expected for at least the past 15 minutes. This could be resulting in performance issues. |
Elevated Producer Purgatory Size | 15 min | kafka.server.DelayedOperationPurgatory.Produce.PurgatorySize has an upper baseline deviation | | WARNING | The purgatory size for producer requests is higher than expected. This may be causing increases in producer request latency. |
Elevated Producer Servicing Time | 15 min | kafka.network.RequestMetrics.Produce.TotalTimeMs.Mean has an upper baseline deviation | | WARNING | The broker is taking longer than usual to service producer requests. |
Elevated Topic Activity | 30 min | BrokerTopicMetrics._all.BytesInPerSec.Count has an upper baseline deviation | BrokerTopicMetrics._all.BytesOutPerSec.Count has an upper baseline deviation | WARNING | Topic activity has been higher than expected for at least the past 30 minutes. |
Elevated Zookeeper Latency | 15 min | kafka.zookeeper.zk_avg_latency has an upper baseline deviation | | WARNING | The average latency for Zookeeper requests has been higher than expected for at least the past 15 minutes. |
Extended Period of Consumer Lag | 1 hour and 15 min | kafka.zookeeper.consumer_groups.*.consumer_lag has an upper baseline deviation | | CRITICAL | Consumer lag has been higher than expected for over an hour. |
No Active Controllers | 5 min | kafka.controller.ActiveControllerCount has a static threshold < 1 | | CRITICAL | There are no active controllers in the Kafka cluster. |
Unclean Leader Election Rate Greater Than 0 | 5 min | kafka.controller.UncleanLeaderElectionsPerSec.Count has a static threshold > 0 | | CRITICAL | An out-of-sync replica was chosen as leader because none of the available replicas were in sync. Some data loss has occurred as a result. |
Under Replicated Partition Count Greater Than 0 | 30 min | kafka.server.ReplicaManager.UnderReplicatedPartitions has a static threshold > 0 | | CRITICAL | The number of partitions which are under-replicated has been greater than 0 for at least 30 minutes. |
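
The static-threshold rows above pair a condition with a duration: the policy applies only when the condition has held for the whole period. The following Python sketch illustrates that reading; it is not the monitoring product's implementation. The `Policy` class, the `fires` function, and the one-sample-per-minute assumption are all hypothetical. Baseline-deviation conditions are not modeled, because the expected range is learned by the monitoring product.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Policy:
    """Hypothetical model of one static-threshold row of the table."""
    name: str
    duration_min: int                    # the Duration column, in minutes
    breaches: Callable[[float], bool]    # the static threshold from Condition 1
    category: str                        # WARNING or CRITICAL


def fires(policy: Policy, per_minute_values: Sequence[float]) -> bool:
    """Return True if every sample in the most recent window breaches.

    Assumes one sample per minute, oldest first (an assumption made for
    this sketch only). With fewer samples than the duration, the policy
    does not fire.
    """
    recent = per_minute_values[-policy.duration_min:]
    if len(recent) < policy.duration_min:
        return False
    return all(policy.breaches(v) for v in recent)


# Two rows from the table, expressed with the hypothetical Policy class.
no_active_controllers = Policy(
    name="No Active Controllers",
    duration_min=5,
    breaches=lambda v: v < 1,   # ActiveControllerCount < 1
    category="CRITICAL",
)
under_replicated = Policy(
    name="Under Replicated Partition Count Greater Than 0",
    duration_min=30,
    breaches=lambda v: v > 0,   # UnderReplicatedPartitions > 0
    category="CRITICAL",
)

if __name__ == "__main__":
    # Controller count drops to 0 and stays there for the last 5 minutes.
    print(fires(no_active_controllers, [1, 0, 0, 0, 0, 0]))
    # A single recovery inside the window resets the condition.
    print(fires(no_active_controllers, [0, 0, 0, 1, 0, 0]))
```

Under this reading, a brief recovery inside the window prevents the policy from firing, which is why the 30-minute policies tolerate short spikes that the 5-minute policies do not.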