Chapter 10. Debezium logging

Debezium has extensive logging built into its connectors, and you can change the logging configuration to control which of these log statements appear in the logs and where those logs are sent. Debezium (as well as Kafka, Kafka Connect, and Zookeeper) use the Log4j logging framework for Java.

By default, the connectors produce a fair amount of useful information when they start up, but then produce very few logs when the connector is keeping up with the source databases. This is often sufficient when the connector is operating normally, but may not be enough when the connector is behaving unexpectedly. In such cases, you can change the logging level so that the connector generates much more verbose log messages describing what the connector is doing and what it is not doing.

10.1. Debezium logging concepts

Before configuring logging, you should understand what Log4J loggers, log levels, and appenders are.

Loggers

Each log message produced by the application is sent to a specific logger (for example, io.debezium.connector.mysql). Loggers are arranged in hierarchies. For example, the io.debezium.connector.mysql logger is the child of the io.debezium.connector logger, which is the child of the io.debezium logger. At the top of the hierarchy, the root logger defines the default logger configuration for all of the loggers beneath it.

Log levels

Every log message produced by the application also has a specific log level:

  1. ERROR - errors, exceptions, and other significant problems
  2. WARN - potential problems and issues
  3. INFO - status and general activity (usually low-volume)
  4. DEBUG - more detailed activity that would be useful in diagnosing unexpected behavior
  5. TRACE - very verbose and detailed activity (usually very high-volume)

Appenders

An appender is essentially a destination where log messages are written. Each appender controls the format of its log messages, giving you even more control over what the log messages look like.

To configure logging, you specify the desired level for each logger and the appender(s) where those log messages should be written. Since loggers are hierarchical, the configuration for the root logger serves as a default for all of the loggers below it, although you can override any child (or descendant) logger.

10.2. The default Debezium logging configuration

If you are running Debezium connectors in a Kafka Connect process, then Kafka Connect uses the Log4j configuration file (for example, /opt/kafka/config/connect-log4j.properties) in the Kafka installation. By default, this file contains the following configuration:

connect-log4j.properties

log4j.rootLogger=INFO, stdout  1

log4j.appender.stdout=org.apache.log4j.ConsoleAppender  2
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout  3
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n  4
...

1 1
The root logger, which defines the default logger configuration. By default, loggers include INFO, WARN, and ERROR messages. These log messages are written to the stdout appender.
2 2
The stdout appender writes log messages to the console (as opposed to a file).
3 3
The stdout appender uses a pattern matching algorithm to format the log messages.
4 4
The pattern for the stdout appender (see the Log4j documentation for details).

Unless you configure other loggers, all of the loggers that Debezium uses inherit the rootLogger configuration.

10.3. Configuring Debezium logging

By default, Debezium connectors write all INFO, WARN, and ERROR messages to the console. However, you can change this configuration in the following ways:

Note

There are other methods that you can use to configure Debezium logging with Log4j. For more information, search for tutorials about setting up and using appenders to send log messages to specific destinations.

10.3.1. Changing the Debezium logging level

The default Debezium logging level provides sufficient information to show whether a connector is healthy or not. However, if a connector is not healthy, you can change its logging level to troubleshoot the issue.

In general, Debezium connectors send their log messages to loggers with names that match the fully-qualified name of the Java class that is generating the log message. Debezium uses packages to organize code with similar or related functions. This means that you can control all of the log messages for a specific class or for all of the classes within or under a specific package.

Procedure

  1. Open the log4j.properties file.
  2. Configure a logger for the connector.

    This example configures loggers for the MySQL connector and the database history implementation used by the connector, and sets them to log DEBUG level messages:

    log4j.properties

    ...
    log4j.logger.io.debezium.connector.mysql=DEBUG, stdout  1
    log4j.logger.io.debezium.relational.history=DEBUG, stdout  2
    
    log4j.additivity.io.debezium.connector.mysql=false  3
    log4j.additivity.io.debezium.relational.history=false  4
    ...

    1
    Configures the logger named io.debezium.connector.mysql to send DEBUG, INFO, WARN, and ERROR messages to the stdout appender.
    2
    Configures the logger named io.debezium.relational.history to send DEBUG, INFO, WARN, and ERROR messages to the stdout appender.
    3 4
    Turns off additivity, which results in log messages not being sent to the appenders of parent loggers (this can prevent seeing duplicate log messages when using multiple appenders).
  3. If necessary, change the logging level for a specific subset of the classes within the connector.

    Increasing the logging level for the entire connector increases the log verbosity, which can make it difficult to understand what is happening. In these cases, you can change the logging level just for the subset of classes that are related to the issue that you are troubleshooting.

    1. Set the connector’s logging level to either DEBUG or TRACE.
    2. Review the connector’s log messages.

      Find the log messages that are related to the issue that you are troubleshooting. The end of each log message shows the name of the Java class that produced the message.

    3. Set the connector’s logging level back to INFO.
    4. Configure a logger for each Java class that you identified.

      For example, consider a scenario in which you are unsure why the MySQL connector is skipping some events when it is processing the binlog. Rather than turn on DEBUG or TRACE logging for the entire connector, you can keep the connector’s logging level at INFO and then configure DEBUG or TRACE on just the class that is reading the binlog:

      log4j.properties

      ...
      log4j.logger.io.debezium.connector.mysql=INFO, stdout
      log4j.logger.io.debezium.connector.mysql.BinlogReader=DEBUG, stdout
      log4j.logger.io.debezium.relational.history=INFO, stdout
      
      log4j.additivity.io.debezium.connector.mysql=false
      log4j.additivity.io.debezium.relational.history=false
      log4j.additivity.io.debezium.connector.mysql.BinlogReader=false
      ...

10.3.2. Adding Debezium mapped diagnostic contexts

Most Debezium connectors (and the Kafka Connect workers) use multiple threads to perform different activities. This can make it difficult to look at a log file and find only those log messages for a particular logical activity. To make the log messages easier to find, Debezium provides several mapped diagnostic contexts (MDC) that provide additional information for each thread.

Debezium provides the following MDC properties:

dbz.connectorType
A short alias for the type of connector. For example, MySql, Mongo, Postgres, and so on. All threads associated with the same type of connector use the same value, so you can use this to find all log messages produced by a given type of connector.
dbz.connectorName
The name of the connector or database server as defined in the connector’s configuration. For example products, serverA, and so on. All threads associated with a specific connector instance use the same value, so you can find all of the log messages produced by a specific connector instance.
dbz.connectorContext
A short name for an activity running as a separate thread running within the connector’s task. For example, main, binlog, snapshot, and so on. In some cases, when a connector assigns threads to specific resources (such as a table or collection), the name of that resource could be used instead. Each thread associated with a connector would use a distinct value, so you can find all of the log messages associated with this particular activity.

To enable MDC for a connector, you configure an appender in the log4j.properties file.

Procedure

  1. Open the log4j.properties file.
  2. Configure an appender to use any of the supported Debezium MDC properties.

    In the following example, the stdout appender is configured to use these MDC properties:

    log4j.properties

    ...
    log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p  %X{dbz.connectorType}|%X{dbz.connectorName}|%X{dbz.connectorContext}  %m   [%c]%n
    ...

    The configuration in the preceding example produces log messages similar to the ones in the following output:

    ...
    2017-02-07 20:49:37,692 INFO   MySQL|dbserver1|snapshot  Starting snapshot for jdbc:mysql://mysql:3306/?useInformationSchema=true&nullCatalogMeansCurrent=false&useSSL=false&useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8&zeroDateTimeBehavior=convertToNull with user 'debezium'   [io.debezium.connector.mysql.SnapshotReader]
    2017-02-07 20:49:37,696 INFO   MySQL|dbserver1|snapshot  Snapshot is using user 'debezium' with these MySQL grants:   [io.debezium.connector.mysql.SnapshotReader]
    2017-02-07 20:49:37,697 INFO   MySQL|dbserver1|snapshot  	GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'debezium'@'%'   [io.debezium.connector.mysql.SnapshotReader]
    ...

    Each line in the log includes the connector type (for example, MySQL), the name of the connector (for example, dbserver1), and the activity of the thread (for example, snapshot).

10.4. Debezium logging on OpenShift

If you are using Debezium on OpenShift, you can use the Kafka Connect loggers to configure the Debezium loggers and logging levels. For more information about configuring logging properties in a Kafka Connect schema, see Using AMQ Streams on OpenShift