Chapter 5. Debezium

Red Hat Integration 2020-Q2 includes a General Availability release of Debezium on OpenShift based on the Debezium open source project. Debezium is a distributed change data capture platform that tracks database operations and streams change data events. Debezium is built on Apache Kafka and is deployed and integrated with AMQ Streams.

Debezium captures row-level changes to database tables and passes corresponding change event records to AMQ Streams. Applications can read these change event streams and access the change events in the order in which they occurred.
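As a rough illustration of what a consuming application works with, the following sketch parses a single change event value and summarizes it. The sample event is hypothetical and abbreviated. The field names (payload, before, after, source, op, ts_ms) follow the Debezium change event envelope as serialized by a JSON converter with schemas enabled, and the describe_change helper is an assumption made for this example, not part of Debezium.

```python
import json

# Operation codes used in the Debezium change event envelope.
OPERATIONS = {"c": "create", "u": "update", "d": "delete", "r": "read (snapshot)"}

# Hypothetical change event value, abbreviated for illustration.
SAMPLE_EVENT = json.dumps({
    "payload": {
        "before": {"id": 1001, "email": "old@example.com"},
        "after": {"id": 1001, "email": "new@example.com"},
        "source": {"connector": "postgresql", "table": "customers"},
        "op": "u",
        "ts_ms": 1593000000000,
    }
})


def describe_change(raw_value: str) -> dict:
    """Summarize one change event: operation, table, and changed columns."""
    payload = json.loads(raw_value)["payload"]
    # "before" is null for creates and "after" is null for deletes.
    before = payload.get("before") or {}
    after = payload.get("after") or {}
    changed = sorted(
        column
        for column in set(before) | set(after)
        if before.get(column) != after.get(column)
    )
    return {
        "operation": OPERATIONS.get(payload["op"], "unknown"),
        "table": payload["source"]["table"],
        "changed_columns": changed,
    }


if __name__ == "__main__":
    print(describe_change(SAMPLE_EVENT))
```

In a real deployment, the raw value would come from a Kafka consumer subscribed to the connector's topics rather than from an inline sample.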

The following topics provide release details:

5.1. Debezium database connectors

Debezium provides connectors based on Kafka Connect for the following common databases:

  • MySQL
  • PostgreSQL

    The Debezium 1.1.2 version of the PostgreSQL connector, which Red Hat Integration 2020-Q2 originally included, has a security vulnerability. Debezium 1.1.3, which is now part of Red Hat Integration 2020-Q2, updates the PostgreSQL connector to fix this vulnerability. For details, see Debezium 1.1.3 release.

  • MongoDB - Supported in this release. It was a Technology Preview feature in the previous release.
  • SQL Server - Supported in this release. It was a Technology Preview feature in the previous release.

5.2. Supported database versions for Debezium

This release supports the following database versions for use with the Debezium connectors:

Database      Versions

MySQL         5.7, 8.0
PostgreSQL    10, 11, 12
MongoDB       3.6, 4.0, 4.2
SQL Server    2017, 2019

Note

For PostgreSQL deployments, use the pgoutput logical decoding output plug-in, which is included by default in PostgreSQL 10 and later.
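A minimal sketch of a PostgreSQL connector configuration that selects pgoutput through the connector's plugin.name property. The connector name, host, credentials, database name, and logical server name are placeholders assumed for this example. The plug-in is set explicitly here because the connector's own default for plugin.name might differ from pgoutput.

```python
import json


def postgres_connector_config(name: str, hostname: str, dbname: str) -> dict:
    """Build a Kafka Connect registration payload for the PostgreSQL connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # Select the pgoutput logical decoding plug-in explicitly.
            "plugin.name": "pgoutput",
            "database.hostname": hostname,
            "database.port": "5432",
            "database.user": "debezium",       # placeholder credentials
            "database.password": "change-me",  # placeholder credentials
            "database.dbname": dbname,
            # Logical server name (placeholder); it prefixes the topic names.
            "database.server.name": "dbserver1",
        },
    }


if __name__ == "__main__":
    payload = postgres_connector_config(
        "inventory-connector", "postgres.example.com", "inventory"
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON has the shape that the Kafka Connect REST API accepts when you register a connector. In an AMQ Streams deployment on OpenShift, the same properties would typically go into the connector's custom resource instead.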

5.3. Debezium installation options

You can install Debezium with AMQ Streams on OpenShift or RHEL.

Important

Technology Preview features are not supported with Red Hat production service-level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend implementing any Technology Preview features in production environments. This Technology Preview feature provides early access to upcoming product innovations, enabling you to test functionality and provide feedback during the development process. For more information about support scope, see Technology Preview Features Support Scope.

5.4. New Debezium features

This release provides the following new Debezium features:

  • ByLogicalTableRouter single message transformation for re-routing data change event records to topics that you specify.
  • ExtractNewRecordState single message transformation for flattening the complex structure of a data change event record into a simplified format that some Kafka Connect consumers might require.
  • CloudEventsConverter for emitting change event records that conform to the CloudEvents specification. This is a Technology Preview feature.
  • The MongoDB, PostgreSQL, and SQL Server connectors support transaction metadata. When using these connector types, Debezium can generate events that represent transaction metadata boundaries. This lets downstream consumers group and aggregate change data event records that originate from an individual transaction.

    The Debezium connector for MongoDB has a limitation: with MongoDB 4.2, you cannot use the connector’s transaction metadata feature. This limitation is expected to be removed in a future release.

  • Avro serialization - You can configure Debezium connectors to use Avro to serialize message keys and values. This is a Technology Preview feature.
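The following sketch shows how some of these features might be switched on in a connector configuration. It is an illustration under assumptions, not a recommended setup: the transform aliases (route, unwrap), the sharded customers topic pattern, and the helper names are hypothetical, while the class names and property keys are those used by the Debezium transformations and connectors.

```python
def smt_settings() -> dict:
    """Properties that chain the two single message transformations."""
    return {
        # Aliases are arbitrary labels; transformations run in the listed order.
        "transforms": "route,unwrap",
        "transforms.route.type": "io.debezium.transforms.ByLogicalTableRouter",
        # Hypothetical pattern: send every customers_shard* topic to one topic.
        "transforms.route.topic.regex": "(.*)customers_shard(.*)",
        "transforms.route.topic.replacement": "$1customers_all_shards",
        # Flatten the change event envelope into a simple row-like record.
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    }


def transaction_metadata_settings() -> dict:
    """Property that enables transaction metadata events.

    Applies to the MongoDB, PostgreSQL, and SQL Server connectors.
    """
    return {"provide.transaction.metadata": "true"}


def apply_settings(base_config: dict, *extras: dict) -> dict:
    """Return a copy of base_config with the extra properties merged in."""
    merged = dict(base_config)
    for extra in extras:
        merged.update(extra)
    return merged


if __name__ == "__main__":
    base = {"connector.class": "io.debezium.connector.sqlserver.SqlServerConnector"}
    for key, value in sorted(apply_settings(base, smt_settings()).items()):
        print(f"{key}={value}")
```

The two groups of settings are kept separate because flattening a record with ExtractNewRecordState removes the envelope, so whether to combine it with transaction metadata depends on what downstream consumers need.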

5.5. Debezium 1.1.3 release

Debezium 1.1.3 provides an updated PostgreSQL connector. There are no other updates in Debezium 1.1.3.

Red Hat Integration 2020-Q2 originally included the Debezium 1.1.2 release and now includes the Debezium 1.1.3 release. The updated PostgreSQL connector uses version 42.2.14 of the PostgreSQL JDBC driver, which fixes a Common Vulnerabilities and Exposures (CVE) issue, CVE-2020-13692. If you already use any release of the Debezium PostgreSQL connector, you should upgrade to the PostgreSQL connector provided in Debezium 1.1.3. To do this, go to the Red Hat Integration 2020-Q2 Software Downloads page, open the Security Advisories tab, and download the Debezium 1.1.3 PostgreSQL Connector.
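To check an existing deployment, you could compare the version of the bundled JDBC driver against 42.2.14. The sketch below assumes that the driver archive is named in the form postgresql-<major>.<minor>.<patch>, possibly followed by a suffix. That naming pattern and the helper names are assumptions for this example, so verify them against the files in your connector directory.

```python
import re

# First PostgreSQL JDBC driver version that contains the CVE-2020-13692 fix.
FIXED_DRIVER_VERSION = (42, 2, 14)


def driver_version(jar_name: str):
    """Extract (major, minor, patch) from a driver file name, or None."""
    match = re.search(r"postgresql-(\d+)\.(\d+)\.(\d+)", jar_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_cve_fix(jar_name: str) -> bool:
    """Return True if the driver version is 42.2.14 or later."""
    version = driver_version(jar_name)
    # Tuples compare numerically, so 42.10.0 correctly ranks above 42.2.14.
    return version is not None and version >= FIXED_DRIVER_VERSION


if __name__ == "__main__":
    for jar in ("postgresql-42.2.12.jar", "postgresql-42.2.14.jar"):
        print(jar, "fixed" if has_cve_fix(jar) else "needs upgrade")
```

A file name that does not match the assumed pattern is reported as not fixed, which errs on the side of prompting a manual check.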

Additional details are available in the Debezium 1.1x Resolved Issues knowledge article.

The use of the newer JDBC driver changes PostgreSQL connector snapshot behavior for data change event records for partitioned tables. Previously, the connector sent snapshot change event records to a topic that corresponded to the parent table. In this release, the connector routes snapshot change event records for a partitioned table to a different topic for each partition. If you need to change this behavior, you can configure the ByLogicalTableRouter single message transformation (SMT), which is a new feature in Red Hat Integration 2020-Q2.
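One way to restore a single topic per parent table is sketched below. The server name dbserver1, the schema public, the parent table orders, and the partition naming (orders_ plus a suffix) are all hypothetical, so adapt the regular expression to your own topic names. The preview_route helper only approximates the router's behavior in Python so that you can try a pattern before deploying it. It is not part of Debezium.

```python
import re

# Hypothetical setup: partitions of public.orders are named orders_<suffix>.
PARTITION_ROUTER = {
    "transforms": "route",
    "transforms.route.type": "io.debezium.transforms.ByLogicalTableRouter",
    "transforms.route.topic.regex": r"(.*)\.public\.orders_.+",
    "transforms.route.topic.replacement": "$1.public.orders",
}


def preview_route(topic: str, settings: dict = PARTITION_ROUTER) -> str:
    """Approximate the destination topic for a given source topic."""
    pattern = settings["transforms.route.topic.regex"]
    # Java writes group references as $1; Python's re module expects \g<1>.
    replacement = re.sub(
        r"\$(\d+)", r"\\g<\1>", settings["transforms.route.topic.replacement"]
    )
    # Assumption: only topics that match the whole pattern are re-routed.
    if re.fullmatch(pattern, topic) is None:
        return topic
    return re.sub(pattern, replacement, topic, count=1)


if __name__ == "__main__":
    for source in ("dbserver1.public.orders_2020_q1", "dbserver1.public.customers"):
        print(source, "->", preview_route(source))
```

With these assumed names, records from every orders partition would land in dbserver1.public.orders, while topics for other tables are left unchanged.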

The CVE that was fixed permitted an XML external entity attack (see https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing). When processing XML contents from untrusted sources with the older version of the PostgreSQL JDBC driver, an attack could lead to "the disclosure of confidential data, denial of service, server side request forgery, port scanning from the perspective of the machine where the parser is located, and other system impacts".