Chapter 7. Monitoring your cluster using JMX

Zookeeper, the Kafka broker, Kafka Connect, and the Kafka clients all expose management information using Java Management Extensions (JMX). Most management information is in the form of metrics that are useful for monitoring the condition and performance of your Kafka cluster. Like other Java applications, Kafka provides this management information through managed beans or MBeans.

JMX works at the level of the JVM (Java Virtual Machine). To obtain management information, external tools can connect to the JVM that is running Zookeeper, the Kafka broker, and so on. By default, only tools on the same machine and running as the same user as the JVM are able to connect.

Note

Management information for Zookeeper is not documented here. You can view Zookeeper metrics in JConsole. For more information, see Monitoring using JConsole.

7.1. JMX configuration options

You configure JMX using JVM system properties. The scripts provided with AMQ Streams (bin/kafka-server-start.sh and bin/connect-distributed.sh, and so on) use the KAFKA_JMX_OPTS environment variable to set these system properties. The system properties for configuring JMX are the same, even though Kafka producer, consumer, and streams applications typically start the JVM in different ways.

7.2. Disabling the JMX agent

You can prevent local JMX tools from connecting to the JVM (for example, for compliance reasons) by disabling the JMX agent for an AMQ Streams component. The following procedure explains how to disable the JMX agent for a Kafka broker.

Procedure

  1. Use the KAFKA_JMX_OPTS environment variable to set com.sun.management.jmxremote to false.

    export KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote=false
    bin/kafka-server-start.sh
  2. Start the JVM.

7.3. Connecting to the JVM from a different machine

You can connect to the JVM from a different machine by configuring the port that the JMX agent listens on. This is insecure because it allows JMX tools to connect from anywhere, with no authentication.

Procedure

  1. Use the KAFKA_JMX_OPTS environment variable to set -Dcom.sun.management.jmxremote.port=<port>. For <port>, enter the name of the port on which you want the Kafka broker to listen for JMX connections.

    export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
      -Dcom.sun.management.jmxremote.port=<port>
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false"
    bin/kafka-server-start.sh
  2. Start the JVM.
Important

It is recommended that you configure authentication and SSL to ensure that the remote JMX connection is secure. For more information about the system properties needed to do this, see the JMX documentation.

7.4. Monitoring using JConsole

The JConsole tool is distributed with the Java Development Kit (JDK). You can use JConsole to connect to a local or remote JVM and discover and display management information from Java applications. If using JConsole to connect to a local JVM, the names of the JVM processes corresponding to the different components of AMQ Streams are as follows:

AMQ Streams componentJVM process

Zookeeper

org.apache.zookeeper.server.quorum.QuorumPeerMain

Kafka broker

kafka.Kafka

Kafka Connect standalone

org.apache.kafka.connect.cli.ConnectStandalone

Kafka Connect distributed

org.apache.kafka.connect.cli.ConnectDistributed

A Kafka producer, consumer, or Streams application

The name of the class containing the main method for the application.

When using JConsole to connect to a remote JVM, use the appropriate host name and JMX port.

Many other tools and monitoring products can be used to fetch the metrics using JMX and provide monitoring and alerting based on those metrics. Refer to the product documentation for those tools.

7.5. Important Kafka broker metrics

Kafka provides many MBeans for monitoring the performance of the brokers in your Kafka cluster. These apply to an individual broker rather than the entire cluster.

The following tables present a selection of these broker-level MBeans organized into server, network, logging, and controller metrics.

7.5.1. Kafka server metrics

The following table shows a selection of metrics that report information about the Kafka server.

MetricMBeanDescriptionExpected value

Messages in per second

kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec

The rate at which individual messages are consumed by the broker.

Approximately the same as the other brokers in the cluster.

Bytes in per second

kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec

The rate at which data sent from producers is consumed by the broker.

Approximately the same as the other brokers in the cluster.

Replication bytes in per second

kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesInPerSec

The rate at which data sent from other brokers is consumed by the follower broker.

N/A

Bytes out per second

kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec

The rate at which data is fetched and read from the broker by consumers.

N/A

Replication bytes out per second

kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesOutPerSec

The rate at which data is sent from the broker to other brokers. This metric is useful to monitor if the broker is a leader for a group of partitions.

N/A

Under-replicated partitions

kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions

The number of partitions that have not been fully replicated in the follower replicas.

Zero

Under minimum ISR partition count

kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount

The number of partitions under the minimum In-Sync Replica (ISR) count. The ISR count indicates the set of replicas that are up-to-date with the leader.

Zero

Partition count

kafka.server:type=ReplicaManager,name=PartitionCount

The number of partitions in the broker.

Approximately even when compared with the other brokers.

Leader count

kafka.server:type=ReplicaManager,name=LeaderCount

The number of replicas for which this broker is the leader.

Approximately the same as the other brokers in the cluster.

ISR shrinks per second

kafka.server:type=ReplicaManager,name=IsrShrinksPerSec

The rate at which the number of ISRs in the broker decreases

Zero

ISR expands per second

kafka.server:type=ReplicaManager,name=IsrExpandsPerSec

The rate at which the number of ISRs in the broker increases.

Zero

Maximum lag

kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica

The maximum lag between the time that messages are received by the leader replica and by the follower replicas.

Proportional to the maximum batch size of a produce request.

Requests in producer purgatory

kafka.server:type=DelayedOperationPurgatory,name=PurgatorySize,delayedOperation=Produce

The number of send requests in the producer purgatory.

N/A

Requests in fetch purgatory

kafka.server:type=DelayedOperationPurgatory,name=PurgatorySize,delayedOperation=Fetch

The number of fetch requests in the fetch purgatory.

N/A

Request handler average idle percent

kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent

Indicates the percentage of time that the request handler (IO) threads are not in use.

A lower value indicates that the workload of the broker is high.

Request (Requests exempt from throttling)

kafka.server:type=Request

The number of requests that are exempt from throttling.

N/A

Zookeeper request latency in milliseconds

kafka.server:type=ZooKeeperClientMetrics,name=ZooKeeperRequestLatencyMs

The latency for ZooKeeper requests from the broker, in milliseconds.

N/A

Zookeeper session state

kafka.server:type=SessionExpireListener,name=SessionState

The status of the broker’s connection to Zookeeper.

CONNECTED

7.5.2. Kafka network metrics

The following table shows a selection of metrics that report information about requests.

MetricMBeanDescriptionExpected value

Requests per second

kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower}

The total number of requests made for the request type per second. The Produce, FetchConsumer, and FetchFollower request types each have their own MBeans.

N/A

Request bytes (request size in bytes)

kafka.network:type=RequestMetrics,name=RequestBytes,request=([-.\w]+)

The size of requests, in bytes, made for the request type identified by the request property of the MBean name. Separate MBeans for all available request types are listed under the RequestBytes node.

N/A

Temporary memory size in bytes

kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request={Produce|Fetch}

The amount of temporary memory used for converting message formats and decompressing messages.

N/A

Message conversions time

kafka.network:type=RequestMetrics,name=MessageConversionsTimeMs,request={Produce|Fetch}

Time, in milliseconds, spent on converting message formats.

N/A

Total request time in milliseconds

kafka.network:type=RequestMetrics,name=TotalTimeMs,request={Produce|FetchConsumer|FetchFollower}

Total time, in milliseconds, spent processing requests.

N/A

Request queue time in milliseconds

kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request={Produce|FetchConsumer|FetchFollower}

The time, in milliseconds, that a request currently spends in the queue for the request type given in the request property.

N/A

Local time (leader local processing time) in milliseconds

kafka.network:type=RequestMetrics,name=LocalTimeMs,request={Produce|FetchConsumer|FetchFollower}

The time taken, in milliseconds, for the leader to process the request.

N/A

Remote time (leader remote processing time) in milliseconds

kafka.network:type=RequestMetrics,name=RemoteTimeMs,request={Produce|FetchConsumer|FetchFollower}

The length of time, in milliseconds, that the request waits for the follower. Separate MBeans for all available request types are listed under the RemoteTimeMs node.

N/A

Response queue time in milliseconds

kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request={Produce|FetchConsumer|FetchFollower}

The length of time, in milliseconds, that the request waits in the response queue.

N/A

Response send time in milliseconds

kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request={Produce|FetchConsumer|FetchFollower}

The time taken, in milliseconds, to send the response.

N/A

Network processor average idle percent

kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent

The average percentage of time that the network processors are idle.

Between zero and one.

7.5.3. Kafka log metrics

The following table shows a selection of metrics that report information about logging.

MetricMBeanDescriptionExpected Value

Log flush rate and time in milliseconds

kafka.log:type=LogFlushStats,name=LogFlushRateAndTimeMs

The rate at which log data is written to disk, in milliseconds.

N/A

Offline log directory count

kafka.log:type=LogManager,name=OfflineLogDirectoryCount

The number of offline log directories (for example, after a hardware failure).

Zero

7.5.4. Kafka controller metrics

The following table shows a selection of metrics that report information about the controller of the cluster.

MetricMBeanDescriptionExpected Value

Active controller count

kafka.controller:type=KafkaController,name=ActiveControllerCount

The number of brokers designated as controllers.

One indicates that the broker is the controller for the cluster.

Leader election rate and time in milliseconds

kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs

The rate at which new leader replicas are elected.

Zero

7.5.5. Yammer metrics

Metrics that express a rate or unit of time are provided as Yammer metrics. The class name of an MBean that uses Yammer metrics is prefixed with com.yammer.metrics.

Yammer rate metrics have the following attributes for monitoring requests:

  • Count
  • EventType (Bytes)
  • FifteenMinuteRate
  • RateUnit (Seconds)
  • MeanRate
  • OneMinuteRate
  • FiveMinuteRate

Yammer time metrics have the following attributes for monitoring requests:

  • Max
  • Min
  • Mean
  • StdDev
  • 75/95/98/99/99.9th Percentile

7.6. Producer MBeans

The following MBeans will exist in Kafka producer applications, including Kafka Streams applications and Kafka Connect with source connectors.

7.6.1. MBeans matching kafka.producer:type=producer-metrics,client-id=*

These are metrics at the producer level.

AttributeDescription

batch-size-avg

The average number of bytes sent per partition per-request.

batch-size-max

The max number of bytes sent per partition per-request.

batch-split-rate

The average number of batch splits per second.

batch-split-total

The total number of batch splits.

buffer-available-bytes

The total amount of buffer memory that is not being used (either unallocated or in the free list).

buffer-total-bytes

The maximum amount of buffer memory the client can use (whether or not it is currently used).

bufferpool-wait-time

The fraction of time an appender waits for space allocation.

compression-rate-avg

The average compression rate of record batches.

connection-close-rate

Connections closed per second in the window.

connection-count

The current number of active connections.

connection-creation-rate

New connections established per second in the window.

failed-authentication-rate

Connections that failed authentication.

incoming-byte-rate

Bytes/second read off all sockets.

io-ratio

The fraction of time the I/O thread spent doing I/O.

io-time-ns-avg

The average length of time for I/O per select call in nanoseconds.

io-wait-ratio

The fraction of time the I/O thread spent waiting.

io-wait-time-ns-avg

The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.

metadata-age

The age in seconds of the current producer metadata being used.

network-io-rate

The average number of network operations (reads or writes) on all connections per second.

outgoing-byte-rate

The average number of outgoing bytes sent per second to all servers.

produce-throttle-time-avg

The average time in ms a request was throttled by a broker.

produce-throttle-time-max

The maximum time in ms a request was throttled by a broker.

record-error-rate

The average per-second number of record sends that resulted in errors.

record-error-total

The total number of record sends that resulted in errors.

record-queue-time-avg

The average time in ms record batches spent in the send buffer.

record-queue-time-max

The maximum time in ms record batches spent in the send buffer.

record-retry-rate

The average per-second number of retried record sends.

record-retry-total

The total number of retried record sends.

record-send-rate

The average number of records sent per second.

record-send-total

The total number of records sent.

record-size-avg

The average record size.

record-size-max

The maximum record size.

records-per-request-avg

The average number of records per request.

request-latency-avg

The average request latency in ms.

request-latency-max

The maximum request latency in ms.

request-rate

The average number of requests sent per second.

request-size-avg

The average size of all requests in the window.

request-size-max

The maximum size of any request sent in the window.

requests-in-flight

The current number of in-flight requests awaiting a response.

response-rate

Responses received sent per second.

select-rate

Number of times the I/O layer checked for new I/O to perform per second.

successful-authentication-rate

Connections that were successfully authenticated using SASL or SSL.

waiting-threads

The number of user threads blocked waiting for buffer memory to enqueue their records.

7.6.2. MBeans matching kafka.producer:type=producer-metrics,client-id=*,node-id=*

These are metrics at the producer level about connection to each broker.

AttributeDescription

incoming-byte-rate

The average number of responses received per second for a node.

outgoing-byte-rate

The average number of outgoing bytes sent per second for a node.

request-latency-avg

The average request latency in ms for a node.

request-latency-max

The maximum request latency in ms for a node.

request-rate

The average number of requests sent per second for a node.

request-size-avg

The average size of all requests in the window for a node.

request-size-max

The maximum size of any request sent in the window for a node.

response-rate

Responses received sent per second for a node.

7.6.3. MBeans matching kafka.producer:type=producer-topic-metrics,client-id=*,topic=*

These are metrics at the topic level about topics the producer is sending messages to.

AttributeDescription

byte-rate

The average number of bytes sent per second for a topic.

byte-total

The total number of bytes sent for a topic.

compression-rate

The average compression rate of record batches for a topic.

record-error-rate

The average per-second number of record sends that resulted in errors for a topic.

record-error-total

The total number of record sends that resulted in errors for a topic.

record-retry-rate

The average per-second number of retried record sends for a topic.

record-retry-total

The total number of retried record sends for a topic.

record-send-rate

The average number of records sent per second for a topic.

record-send-total

The total number of records sent for a topic.

7.7. Consumer MBeans

The following MBeans will exist in Kafka consumer applications, including Kafka Streams applications and Kafka Connect with sink connectors.

7.7.1. MBeans matching kafka.consumer:type=consumer-metrics,client-id=*

These are metrics at the consumer level.

AttributeDescription

connection-close-rate

Connections closed per second in the window.

connection-count

The current number of active connections.

connection-creation-rate

New connections established per second in the window.

failed-authentication-rate

Connections that failed authentication.

incoming-byte-rate

Bytes/second read off all sockets.

io-ratio

The fraction of time the I/O thread spent doing I/O.

io-time-ns-avg

The average length of time for I/O per select call in nanoseconds.

io-wait-ratio

The fraction of time the I/O thread spent waiting.

io-wait-time-ns-avg

The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.

network-io-rate

The average number of network operations (reads or writes) on all connections per second.

outgoing-byte-rate

The average number of outgoing bytes sent per second to all servers.

request-rate

The average number of requests sent per second.

request-size-avg

The average size of all requests in the window.

request-size-max

The maximum size of any request sent in the window.

response-rate

Responses received sent per second.

select-rate

Number of times the I/O layer checked for new I/O to perform per second.

successful-authentication-rate

Connections that were successfully authenticated using SASL or SSL.

7.7.2. MBeans matching kafka.consumer:type=consumer-metrics,client-id=*,node-id=*

These are metrics at the consumer level about connection to each broker.

AttributeDescription

incoming-byte-rate

The average number of responses received per second for a node.

outgoing-byte-rate

The average number of outgoing bytes sent per second for a node.

request-latency-avg

The average request latency in ms for a node.

request-latency-max

The maximum request latency in ms for a node.

request-rate

The average number of requests sent per second for a node.

request-size-avg

The average size of all requests in the window for a node.

request-size-max

The maximum size of any request sent in the window for a node.

response-rate

Responses received sent per second for a node.

7.7.3. MBeans matching kafka.consumer:type=consumer-coordinator-metrics,client-id=*

These are metrics at the consumer level about the consumer group.

AttributeDescription

assigned-partitions

The number of partitions currently assigned to this consumer.

commit-latency-avg

The average time taken for a commit request.

commit-latency-max

The max time taken for a commit request.

commit-rate

The number of commit calls per second.

heartbeat-rate

The average number of heartbeats per second.

heartbeat-response-time-max

The max time taken to receive a response to a heartbeat request.

join-rate

The number of group joins per second.

join-time-avg

The average time taken for a group rejoin.

join-time-max

The max time taken for a group rejoin.

last-heartbeat-seconds-ago

The number of seconds since the last controller heartbeat.

sync-rate

The number of group syncs per second.

sync-time-avg

The average time taken for a group sync.

sync-time-max

The max time taken for a group sync.

7.7.4. MBeans matching kafka.consumer:type=consumer-fetch-manager-metrics,client-id=*

These are metrics at the consumer level about the consumer's fetcher.

AttributeDescription

bytes-consumed-rate

The average number of bytes consumed per second.

bytes-consumed-total

The total number of bytes consumed.

fetch-latency-avg

The average time taken for a fetch request.

fetch-latency-max

The max time taken for any fetch request.

fetch-rate

The number of fetch requests per second.

fetch-size-avg

The average number of bytes fetched per request.

fetch-size-max

The maximum number of bytes fetched per request.

fetch-throttle-time-avg

The average throttle time in ms.

fetch-throttle-time-max

The maximum throttle time in ms.

fetch-total

The total number of fetch requests.

records-consumed-rate

The average number of records consumed per second.

records-consumed-total

The total number of records consumed.

records-lag-max

The maximum lag in terms of number of records for any partition in this window.

records-lead-min

The minimum lead in terms of number of records for any partition in this window.

records-per-request-avg

The average number of records in each request.

7.7.5. MBeans matching kafka.consumer:type=consumer-fetch-manager-metrics,client-id=*,topic=*

These are metrics at the topic level about the consumer's fetcher.

AttributeDescription

bytes-consumed-rate

The average number of bytes consumed per second for a topic.

bytes-consumed-total

The total number of bytes consumed for a topic.

fetch-size-avg

The average number of bytes fetched per request for a topic.

fetch-size-max

The maximum number of bytes fetched per request for a topic.

records-consumed-rate

The average number of records consumed per second for a topic.

records-consumed-total

The total number of records consumed for a topic.

records-per-request-avg

The average number of records in each request for a topic.

7.7.6. MBeans matching kafka.consumer:type=consumer-fetch-manager-metrics,client-id=*,topic=*,partition=*

These are metrics at the partition level about the consumer's fetcher.

AttributeDescription

records-lag

The latest lag of the partition.

records-lag-avg

The average lag of the partition.

records-lag-max

The max lag of the partition.

records-lead

The latest lead of the partition.

records-lead-avg

The average lead of the partition.

records-lead-min

The min lead of the partition.

7.8. Kafka Connect MBeans

Note

Kafka Connect will contain the producer MBeans for source connectors and consumer MBeans for sink connectors in addition to those documented here.

7.8.1. MBeans matching kafka.connect:type=connect-metrics,client-id=*

These are metrics at the connect level.

AttributeDescription

connection-close-rate

Connections closed per second in the window.

connection-count

The current number of active connections.

connection-creation-rate

New connections established per second in the window.

failed-authentication-rate

Connections that failed authentication.

incoming-byte-rate

Bytes/second read off all sockets.

io-ratio

The fraction of time the I/O thread spent doing I/O.

io-time-ns-avg

The average length of time for I/O per select call in nanoseconds.

io-wait-ratio

The fraction of time the I/O thread spent waiting.

io-wait-time-ns-avg

The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.

network-io-rate

The average number of network operations (reads or writes) on all connections per second.

outgoing-byte-rate

The average number of outgoing bytes sent per second to all servers.

request-rate

The average number of requests sent per second.

request-size-avg

The average size of all requests in the window.

request-size-max

The maximum size of any request sent in the window.

response-rate

Responses received sent per second.

select-rate

Number of times the I/O layer checked for new I/O to perform per second.

successful-authentication-rate

Connections that were successfully authenticated using SASL or SSL.

7.8.2. MBeans matching kafka.connect:type=connect-metrics,client-id=*,node-id=*

These are metrics at the connect level about connection to each broker.

AttributeDescription

incoming-byte-rate

The average number of responses received per second for a node.

outgoing-byte-rate

The average number of outgoing bytes sent per second for a node.

request-latency-avg

The average request latency in ms for a node.

request-latency-max

The maximum request latency in ms for a node.

request-rate

The average number of requests sent per second for a node.

request-size-avg

The average size of all requests in the window for a node.

request-size-max

The maximum size of any request sent in the window for a node.

response-rate

Responses received sent per second for a node.

7.8.3. MBeans matching kafka.connect:type=connect-worker-metrics

These are metrics at the connect level.

AttributeDescription

connector-count

The number of connectors run in this worker.

connector-startup-attempts-total

The total number of connector startups that this worker has attempted.

connector-startup-failure-percentage

The average percentage of this worker’s connectors starts that failed.

connector-startup-failure-total

The total number of connector starts that failed.

connector-startup-success-percentage

The average percentage of this worker’s connectors starts that succeeded.

connector-startup-success-total

The total number of connector starts that succeeded.

task-count

The number of tasks run in this worker.

task-startup-attempts-total

The total number of task startups that this worker has attempted.

task-startup-failure-percentage

The average percentage of this worker’s tasks starts that failed.

task-startup-failure-total

The total number of task starts that failed.

task-startup-success-percentage

The average percentage of this worker’s tasks starts that succeeded.

task-startup-success-total

The total number of task starts that succeeded.

7.8.4. MBeans matching kafka.connect:type=connect-worker-rebalance-metrics

AttributeDescription

completed-rebalances-total

The total number of rebalances completed by this worker.

epoch

The epoch or generation number of this worker.

leader-name

The name of the group leader.

rebalance-avg-time-ms

The average time in milliseconds spent by this worker to rebalance.

rebalance-max-time-ms

The maximum time in milliseconds spent by this worker to rebalance.

rebalancing

Whether this worker is currently rebalancing.

time-since-last-rebalance-ms

The time in milliseconds since this worker completed the most recent rebalance.

7.8.5. MBeans matching kafka.connect:type=connector-metrics,connector=*

AttributeDescription

connector-class

The name of the connector class.

connector-type

The type of the connector. One of 'source' or 'sink'.

connector-version

The version of the connector class, as reported by the connector.

status

The status of the connector. One of 'unassigned', 'running', 'paused', 'failed', or 'destroyed'.

7.8.6. MBeans matching kafka.connect:type=connector-task-metrics,connector=*,task=*

AttributeDescription

batch-size-avg

The average size of the batches processed by the connector.

batch-size-max

The maximum size of the batches processed by the connector.

offset-commit-avg-time-ms

The average time in milliseconds taken by this task to commit offsets.

offset-commit-failure-percentage

The average percentage of this task’s offset commit attempts that failed.

offset-commit-max-time-ms

The maximum time in milliseconds taken by this task to commit offsets.

offset-commit-success-percentage

The average percentage of this task’s offset commit attempts that succeeded.

pause-ratio

The fraction of time this task has spent in the pause state.

running-ratio

The fraction of time this task has spent in the running state.

status

The status of the connector task. One of 'unassigned', 'running', 'paused', 'failed', or 'destroyed'.

7.8.7. MBeans matching kafka.connect:type=sink-task-metrics,connector=*,task=*

AttributeDescription

offset-commit-completion-rate

The average per-second number of offset commit completions that were completed successfully.

offset-commit-completion-total

The total number of offset commit completions that were completed successfully.

offset-commit-seq-no

The current sequence number for offset commits.

offset-commit-skip-rate

The average per-second number of offset commit completions that were received too late and skipped/ignored.

offset-commit-skip-total

The total number of offset commit completions that were received too late and skipped/ignored.

partition-count

The number of topic partitions assigned to this task belonging to the named sink connector in this worker.

put-batch-avg-time-ms

The average time taken by this task to put a batch of sinks records.

put-batch-max-time-ms

The maximum time taken by this task to put a batch of sinks records.

sink-record-active-count

The number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.

sink-record-active-count-avg

The average number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.

sink-record-active-count-max

The maximum number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.

sink-record-lag-max

The maximum lag in terms of number of records that the sink task is behind the consumer’s position for any topic partitions.

sink-record-read-rate

The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied.

sink-record-read-total

The total number of records read from Kafka by this task belonging to the named sink connector in this worker, since the task was last restarted.

sink-record-send-rate

The average per-second number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations.

sink-record-send-total

The total number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker, since the task was last restarted.

7.8.8. MBeans matching kafka.connect:type=source-task-metrics,connector=*,task=*

AttributeDescription

poll-batch-avg-time-ms

The average time in milliseconds taken by this task to poll for a batch of source records.

poll-batch-max-time-ms

The maximum time in milliseconds taken by this task to poll for a batch of source records.

source-record-active-count

The number of records that have been produced by this task but not yet completely written to Kafka.

source-record-active-count-avg

The average number of records that have been produced by this task but not yet completely written to Kafka.

source-record-active-count-max

The maximum number of records that have been produced by this task but not yet completely written to Kafka.

source-record-poll-rate

The average per-second number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker.

source-record-poll-total

The total number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker.

source-record-write-rate

The average per-second number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations.

source-record-write-total

The number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker, since the task was last restarted.

7.8.9. MBeans matching kafka.connect:type=task-error-metrics,connector=*,task=*

AttributeDescription

deadletterqueue-produce-failures

The number of failed writes to the dead letter queue.

deadletterqueue-produce-requests

The number of attempted writes to the dead letter queue.

last-error-timestamp

The epoch timestamp when this task last encountered an error.

total-errors-logged

The number of errors that were logged.

total-record-errors

The number of record processing errors in this task.

total-record-failures

The number of record processing failures in this task.

total-records-skipped

The number of records skipped due to errors.

total-retries

The number of operations retried.

7.9. Kafka Streams MBeans

Note

A Streams application will contain the producer and consumer MBeans in addition to those documented here.

7.9.1. MBeans matching kafka.streams:type=stream-metrics,client-id=*

These metrics are collected when the metrics.recording.level configuration parameter is info or debug.

AttributeDescription

commit-latency-avg

The average execution time in ms for committing, across all running tasks of this thread.

commit-latency-max

The maximum execution time in ms for committing across all running tasks of this thread.

commit-rate

The average number of commits per second.

commit-total

The total number of commit calls across all tasks.

poll-latency-avg

The average execution time in ms for polling, across all running tasks of this thread.

poll-latency-max

The maximum execution time in ms for polling across all running tasks of this thread.

poll-rate

The average number of polls per second.

poll-total

The total number of poll calls across all tasks.

process-latency-avg

The average execution time in ms for processing, across all running tasks of this thread.

process-latency-max

The maximum execution time in ms for processing across all running tasks of this thread.

process-rate

The average number of process calls per second.

process-total

The total number of process calls across all tasks.

punctuate-latency-avg

The average execution time in ms for punctuating, across all running tasks of this thread.

punctuate-latency-max

The maximum execution time in ms for punctuating across all running tasks of this thread.

punctuate-rate

The average number of punctuates per second.

punctuate-total

The total number of punctuate calls across all tasks.

skipped-records-rate

The average number of skipped records per second.

skipped-records-total

The total number of skipped records.

task-closed-rate

The average number of tasks closed per second.

task-closed-total

The total number of tasks closed.

task-created-rate

The average number of newly created tasks per second.

task-created-total

The total number of tasks created.

7.9.2. MBeans matching kafka.streams:type=stream-task-metrics,client-id=*,task-id=*

Task metrics.

These metrics are collected when the metrics.recording.level configuration parameter is debug.

AttributeDescription

commit-latency-avg

The average commit time in ns for this task.

commit-latency-max

The maximum commit time in ns for this task.

commit-rate

The average number of commit calls per second.

commit-total

The total number of commit calls.

7.9.3. MBeans matching kafka.streams:type=stream-processor-node-metrics,client-id=*,task-id=*,processor-node-id=*

Processor node metrics.

These metrics are collected when the metrics.recording.level configuration parameter is debug.

AttributeDescription

create-latency-avg

The average create execution time in ns.

create-latency-max

The maximum create execution time in ns.

create-rate

The average number of create operations per second.

create-total

The total number of create operations called.

destroy-latency-avg

The average destroy execution time in ns.

destroy-latency-max

The maximum destroy execution time in ns.

destroy-rate

The average number of destroy operations per second.

destroy-total

The total number of destroy operations called.

forward-rate

The average rate of records being forwarded downstream, from source nodes only, per second.

forward-total

The total number of of records being forwarded downstream, from source nodes only.

process-latency-avg

The average process execution time in ns.

process-latency-max

The maximum process execution time in ns.

process-rate

The average number of process operations per second.

process-total

The total number of process operations called.

punctuate-latency-avg

The average punctuate execution time in ns.

punctuate-latency-max

The maximum punctuate execution time in ns.

punctuate-rate

The average number of punctuate operations per second.

punctuate-total

The total number of punctuate operations called.

7.9.4. MBeans matching kafka.streams:type=stream-[store-scope]-metrics,client-id=*,task-id=*,[store-scope]-id=*

State store metrics.

These metrics are collected when the metrics.recording.level configuration parameter is debug.

AttributeDescription

all-latency-avg

The average all operation execution time in ns.

all-latency-max

The maximum all operation execution time in ns.

all-rate

The average all operation rate for this store.

all-total

The total number of all operation calls for this store.

delete-latency-avg

The average delete execution time in ns.

delete-latency-max

The maximum delete execution time in ns.

delete-rate

The average delete rate for this store.

delete-total

The total number of delete calls for this store.

flush-latency-avg

The average flush execution time in ns.

flush-latency-max

The maximum flush execution time in ns.

flush-rate

The average flush rate for this store.

flush-total

The total number of flush calls for this store.

get-latency-avg

The average get execution time in ns.

get-latency-max

The maximum get execution time in ns.

get-rate

The average get rate for this store.

get-total

The total number of get calls for this store.

put-all-latency-avg

The average put-all execution time in ns.

put-all-latency-max

The maximum put-all execution time in ns.

put-all-rate

The average put-all rate for this store.

put-all-total

The total number of put-all calls for this store.

put-if-absent-latency-avg

The average put-if-absent execution time in ns.

put-if-absent-latency-max

The maximum put-if-absent execution time in ns.

put-if-absent-rate

The average put-if-absent rate for this store.

put-if-absent-total

The total number of put-if-absent calls for this store.

put-latency-avg

The average put execution time in ns.

put-latency-max

The maximum put execution time in ns.

put-rate

The average put rate for this store.

put-total

The total number of put calls for this store.

range-latency-avg

The average range execution time in ns.

range-latency-max

The maximum range execution time in ns.

range-rate

The average range rate for this store.

range-total

The total number of range calls for this store.

restore-latency-avg

The average restore execution time in ns.

restore-latency-max

The maximum restore execution time in ns.

restore-rate

The average restore rate for this store.

restore-total

The total number of restore calls for this store.

7.9.5. MBeans matching kafka.streams:type=stream-record-cache-metrics,client-id=*,task-id=*,record-cache-id=*

Record cache metrics.

These metrics are collected when the metrics.recording.level configuration parameter is debug.

AttributeDescription

hitRatio-avg

The average cache hit ratio defined as the ratio of cache read hits over the total cache read requests.

hitRatio-max

The maximum cache hit ratio.

hitRatio-min

The mininum cache hit ratio.