Kafka Policies

Policy names are prefixed with "Kafka –".

| Policy name | Duration | Condition 1 | (and) Condition 2 | Category | Description |
|---|---|---|---|---|---|
| Depressed Number of Zookeeper Connections | 30 min | kafka.zookeeper.zk_num_alive_connections has a lower baseline deviation | | WARNING | The number of active connections to Zookeeper has been lower than expected for at least the past 30 minutes. |
| Elevated Consumer Lag | 15 min | kafka.zookeeper.consumer_groups.*.consumer_lag has an upper baseline deviation | | WARNING | Consumer lag has been higher than expected for at least 15 minutes. |
| Elevated Consumer Purgatory Size | 15 min | kafka.server.DelayedOperationPurgatory.Fetch.PurgatorySize has an upper baseline deviation | | WARNING | The purgatory size for consumer fetch requests is higher than expected. This may be increasing consumer request latency. |
| Elevated Consumer Servicing Time | 15 min | kafka.network.RequestMetrics.FetchConsumer.TotalTimeMs.Mean has an upper baseline deviation | | WARNING | The broker is taking longer than usual to service consumer requests. |
| Elevated Number of Outstanding Zookeeper Requests | 15 min | kafka.zookeeper.zk_outstanding_requests has an upper baseline deviation | | WARNING | The number of outstanding Zookeeper requests has been higher than expected for at least the past 15 minutes. This could be causing performance issues. |
| Elevated Producer Purgatory Size | 15 min | kafka.server.DelayedOperationPurgatory.Produce.PurgatorySize has an upper baseline deviation | | WARNING | The purgatory size for producer requests is higher than expected. This may be increasing producer request latency. |
| Elevated Producer Servicing Time | 15 min | kafka.network.RequestMetrics.Produce.TotalTimeMs.Mean has an upper baseline deviation | | WARNING | The broker is taking longer than usual to service producer requests. |
| Elevated Topic Activity | 30 min | BrokerTopicMetrics._all.BytesInPerSec.Count has an upper baseline deviation | BrokerTopicMetrics._all.BytesOutPerSec.Count has an upper baseline deviation | WARNING | Topic activity has been higher than expected for at least the past 30 minutes. |
| Elevated Zookeeper Latency | 15 min | kafka.zookeeper.zk_avg_latency has an upper baseline deviation | | WARNING | The average latency for Zookeeper requests has been higher than expected for at least the past 15 minutes. |
| Extended Period of Consumer Lag | 1 hour and 15 min | kafka.zookeeper.consumer_groups.*.consumer_lag has an upper baseline deviation | | CRITICAL | Consumer lag has been higher than expected for over an hour. |
| No Active Controllers | 5 min | kafka.controller.ActiveControllerCount has a static threshold < 1 | | CRITICAL | There are no active controllers in the Kafka cluster. |
| Unclean Leader Election Rate Greater Than 0 | 5 min | kafka.controller.UncleanLeaderElectionsPerSec.Count has a static threshold > 0 | | CRITICAL | An out-of-sync replica was chosen as leader because none of the available replicas were in sync. Some data loss has occurred as a result. |
| Under Replicated Partition Count Greater Than 0 | 30 min | kafka.server.ReplicaManager.UnderReplicatedPartitions has a static threshold > 0 | | CRITICAL | The number of under-replicated partitions has been greater than 0 for at least 30 minutes. |
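
The static-threshold rows in the table above pair a condition with a duration. As a rough illustration only, the Python sketch below shows one way such a rule could be evaluated. It assumes that a policy fires when every sample collected during the duration window breaches the threshold, and that an empty window never fires. The class `StaticThresholdPolicy`, the function `is_violated`, and the sample layout are hypothetical names introduced for this example; they are not part of any product API, and the actual evaluation logic may differ. Baseline-deviation conditions are not covered, because the table does not say how the baseline is computed.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical sample layout: (timestamp in seconds, metric value).
Sample = Tuple[float, float]


@dataclass(frozen=True)
class StaticThresholdPolicy:
    """Illustrative model of one static-threshold row from the table."""
    name: str
    metric: str
    duration_s: float
    breaches: Callable[[float], bool]  # True when a value violates the threshold
    category: str


def is_violated(policy: StaticThresholdPolicy,
                samples: Sequence[Sample],
                now_s: float) -> bool:
    """Return True if every sample in the last `duration_s` seconds breaches.

    Assumption: an empty window is treated as "not violated".
    """
    window_start = now_s - policy.duration_s
    window = [value for ts, value in samples if window_start <= ts <= now_s]
    return bool(window) and all(policy.breaches(value) for value in window)


# Two rows from the table, expressed with the hypothetical model above.
NO_ACTIVE_CONTROLLERS = StaticThresholdPolicy(
    name="No Active Controllers",
    metric="kafka.controller.ActiveControllerCount",
    duration_s=5 * 60,
    breaches=lambda value: value < 1,
    category="CRITICAL",
)

UNDER_REPLICATED = StaticThresholdPolicy(
    name="Under Replicated Partition Count Greater Than 0",
    metric="kafka.server.ReplicaManager.UnderReplicatedPartitions",
    duration_s=30 * 60,
    breaches=lambda value: value > 0,
    category="CRITICAL",
)


if __name__ == "__main__":
    # One sample per minute over five minutes, all reporting zero controllers.
    samples = [(float(t), 0.0) for t in range(0, 301, 60)]
    if is_violated(NO_ACTIVE_CONTROLLERS, samples, now_s=300.0):
        print(f"{NO_ACTIVE_CONTROLLERS.category}: {NO_ACTIVE_CONTROLLERS.name}")
```

Requiring every sample in the window to breach keeps a single healthy reading from being ignored; a real evaluator might instead use an average over the window or tolerate gaps in the data.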