Overview and Features of Oracle XStream CDC Source Connector for Confluent Cloud

The fully-managed Oracle XStream CDC Source connector for Confluent Cloud captures all changes made to rows in an Oracle database and represents the changes as change event records in Apache Kafka® topics. The connector uses Oracle’s XStream API to read changes from the database redo log.

Oracle XStream is a set of components and APIs in an Oracle database that enables client applications, such as the connector, to receive changes from the database.

The connector leverages XStream Out to capture both Data Manipulation Language (DML) and Data Definition Language (DDL) changes from the database redo log. When XStream Out is used, a capture process captures changes made to an Oracle database, converts the changes into Logical Change Records (LCRs), and sends the LCRs to an outbound server. The outbound server then sends the LCRs to the connector.

Note

The connector is built using the Debezium and Kafka Connect frameworks.

Features

The Oracle XStream CDC Source connector provides the following features:

Snapshots

When you start the connector for the first time, it takes a snapshot of the schema for each captured table and, optionally, captures a consistent snapshot of the current state of the rows in these tables. As part of this snapshot process, the connector acquires a lock (in ROW SHARE MODE) on each of the captured tables. Because the lock is needed only to capture the table schema, not the row data, it is held only for a short duration. The connector uses an Oracle Flashback query to capture the state of the existing rows. You can customize the snapshot behavior by using the snapshot.mode configuration property.
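
For example, the snapshot behavior might be configured as in the following sketch. The settings are expressed as a Python dict purely for illustration; "initial" is an example value, and the complete set of supported snapshot.mode values is listed in the connector's configuration reference.

    # Illustrative sketch only: snapshot-related settings expressed as a Python
    # dict. "initial" is an example value; consult the connector's configuration
    # reference for the supported snapshot.mode values.
    snapshot_settings = {
        # Capture the table schemas and the current state of the existing rows.
        "snapshot.mode": "initial",
    }
    print(snapshot_settings)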

Note

If the connector is interrupted, stopped, or fails while a snapshot is in progress, it restarts all snapshots from the beginning upon recovery or restart. It is currently not possible to resume a partial snapshot of a table that is still changing while guaranteeing that all changes to that table are captured.

The connector supports parallel snapshots, allowing it to process multiple tables at the same time by distributing the tables across available threads. However, it does not split a single table across multiple threads. Each thread uses a separate database connection.

Streaming

After the initial snapshot is completed, the connector starts streaming changes for the specified tables. The connector streams changes from the Oracle database using Oracle’s XStream Out API. During this phase of operation:

  • The connector starts by attaching to the XStream outbound server specified in the database.out.server.name configuration property.
  • After successfully attaching to the outbound server, the connector receives changes made to the captured tables, and writes these changes as records to the appropriate change event topics in Kafka. Each change includes the full state of the row.

The connector receives changes from the database in transaction commit order. It ensures that events for each table are written to the change event topic in the same order as they occurred in the database.
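
Downstream applications read these change event records like any other Kafka records. The following sketch uses the confluent-kafka Python client to consume from a single change event topic; the topic name, bootstrap servers, and consumer group are placeholders, not values defined by the connector.

    # Illustrative sketch only: consuming change event records for one table
    # with the confluent-kafka Python client. Connection details and the topic
    # name are placeholders.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "<bootstrap-servers>",
        "group.id": "change-event-reader",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["<change-event-topic-for-a-table>"])

    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            # Within this topic, records appear in the same order as the
            # corresponding changes were committed in the database.
            print(msg.key(), msg.value())
    finally:
        consumer.close()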

Note

An XStream outbound server can support only one active client session at a time. This means multiple connectors cannot be attached to the same outbound server simultaneously. As a result, separate outbound servers must be configured for each connector.

Change event topics

The connector writes all change events for a table to a single Apache Kafka® topic dedicated to that table.

The connector uses two configuration properties to identify which tables to capture from the database, as illustrated in the sketch after this list:

  • The table.include.list configuration specifies a comma-separated list of regular expressions that match fully-qualified table identifiers for the tables whose changes should be captured.
  • The table.exclude.list configuration specifies a comma-separated list of regular expressions that match fully-qualified table identifiers for the tables whose changes should not be captured.
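
For illustration, the sketch below expresses a hypothetical table selection as a Python dict; the schema and table names, and the regular expressions, are examples only.

    # Illustrative sketch only: table selection expressed as a Python dict.
    # Schema/table names and regular expressions are examples.
    table_selection = {
        # Capture changes for all tables in the SALES schema whose names
        # start with ORDER (for example, SALES.ORDERS, SALES.ORDER_ITEMS).
        "table.include.list": r"SALES\.ORDER.*",
    }
    # table.exclude.list takes the same comma-separated regular expression
    # format and names the tables whose changes should not be captured.
    print(table_selection)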

Note

The tables to be captured from the database must be specified both in the connector configuration (for example, using the table.include.list configuration property) and in the rule sets of the capture process and the outbound server to which the connector is attached.

The connector can capture changes from tables across different schemas within the same database. A separate change event topic is created for each table being captured, ensuring that changes are streamed to distinct topics per table.

Schema changes

The connector stores the schema of captured tables over time in a dedicated topic, known as the schema history topic.

  • This topic is initially populated with the table schema during the initial snapshot.
  • It is subsequently updated as the connector processes DDL statements (like CREATE, ALTER) during the streaming phase.

Upon a connector restart, the connector reads from this topic to rebuild the schema of each captured table as it existed at the point in time when streaming resumes. This ensures that the connector can correctly interpret the change events based on the schema at the time the changes were made.

Note

The database schema history topic is intended for internal connector use only.

At-least-once delivery

The connector guarantees that records are delivered at least once to the Kafka topic.
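
Because delivery is at least once, a downstream consumer may occasionally receive the same change event more than once and should process records idempotently. The following generic sketch is one possible approach, assuming the application can derive a stable identifier for each record; it is not part of the connector itself.

    # Illustrative sketch only: consumer-side deduplication for at-least-once
    # delivery. Deriving a stable identifier from a record is application-specific,
    # and the in-memory set below is unbounded (a real application would cap it).
    def process_once(records, identity, handle, seen=None):
        """Call handle(record) once per unique identity(record)."""
        seen = set() if seen is None else seen
        for record in records:
            record_id = identity(record)
            if record_id in seen:
                continue  # duplicate delivery; skip
            handle(record)
            seen.add(record_id)
        return seen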

Before and after state for change events

For update operations, the connector emits the following, as shown in the sketch after this list:

  • The state of the row before the update, with the original values.
  • The state of the row after the update, with the modified values.
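
As a simplified illustration only (not the exact payload schema of the connector), an update event carrying both states might look like the following Python dict; the field and column names are examples.

    # Illustrative sketch only: a simplified update event with the row state
    # before and after the change. Column names and values are examples.
    update_event = {
        "before": {"ID": 1001, "STATUS": "PENDING"},  # original values
        "after":  {"ID": 1001, "STATUS": "SHIPPED"},  # modified values
    }
    print(update_event)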

Oracle multi-tenant architecture support

Each instance of the connector can capture tables from a single Pluggable Database (PDB). Use the database.pdb.name configuration property to specify the name of the PDB that contains the tables.

Note

If you need to read from tables in the Container Database (CDB), do not specify a value for the database.pdb.name configuration property.
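
For illustration, the multi-tenant settings might look like the following sketch, expressed as a Python dict; the PDB name is an example.

    # Illustrative sketch only: multi-tenant settings expressed as a Python dict.
    # The PDB name is an example.
    multitenant_settings = {
        # Capture tables located in this pluggable database (PDB).
        "database.pdb.name": "XEPDB1",
    }
    # To read from tables in the container database (CDB), omit
    # database.pdb.name from the configuration entirely.
    print(multitenant_settings)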

Customizable data type handling

For certain data types, such as numeric and temporal types, you can customize how the connector maps them to Kafka Connect data types by modifying configuration properties. This provides flexibility in how these values are represented, so change events can match the format your downstream applications expect.
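
As a hedged illustration, the sketch below shows two property names, decimal.handling.mode and time.precision.mode, that are typical of Debezium-based connectors; treat both the names and the values as assumptions and confirm them against the connector's configuration reference.

    # Illustrative sketch only: data type mapping settings expressed as a Python
    # dict. The property names and values below are assumptions based on Debezium
    # conventions; confirm them in the connector's configuration reference.
    type_mapping_settings = {
        "decimal.handling.mode": "string",  # e.g., represent numeric values as strings
        "time.precision.mode": "connect",   # e.g., use Kafka Connect temporal types
    }
    print(type_mapping_settings)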

Tombstone events

When a row is deleted in the source table, a delete change event is generated and sent to the Kafka topic. Subsequently, the connector emits a tombstone event with the same key as the original record, but with a null value. Tombstone records are used in Kafka’s log compaction process to ensure that only the most recent state of a record is retained in the log.

You can modify this behavior using the tombstones.on.delete configuration property.
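
When processing a change event topic, a consumer can recognize a tombstone by its null value, as in the following sketch; the handler logic is hypothetical.

    # Illustrative sketch only: distinguishing tombstone records (null value)
    # from regular change events when processing a change event topic.
    from typing import Optional

    def handle_record(key: bytes, value: Optional[bytes]) -> None:
        if value is None:
            # Tombstone: same key as the deleted row's records, null value.
            # Log compaction uses it to remove earlier records for this key.
            print("tombstone for key", key)
        else:
            # Regular change event (insert, update, or delete).
            print("change event for key", key)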

Heartbeats

The connector periodically updates the outbound server with the position of the latest change it has processed, enabling the database to purge archived redo logs containing already processed transactions. However, if the database is inactive or no changes are being made to the captured tables, the connector cannot advance the position and update the outbound server.

Heartbeats are a mechanism that allows the connector to continue advancing the position even when the database is inactive or no changes are occurring to the captured tables. When enabled, the connector:

  • Creates a dedicated heartbeat topic.
  • Emits a simple event to this topic at regular intervals.

You can configure this interval using the heartbeat.interval.ms configuration property. Set heartbeat.interval.ms to a value on the order of minutes to hours. The default value is 0, which disables emission of heartbeat records from the connector.
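
For example, a ten-minute heartbeat interval might be configured as in the following sketch, expressed as a Python dict for illustration.

    # Illustrative sketch only: heartbeat setting expressed as a Python dict.
    heartbeat_settings = {
        # Emit a heartbeat record roughly every 10 minutes (600,000 ms).
        # The default of 0 disables heartbeat records.
        "heartbeat.interval.ms": "600000",
    }
    print(heartbeat_settings)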

Note

The heartbeat topic is intended for internal connector use only.

Automated error recovery

The connector automatically retries various retriable errors. When a retriable error occurs, the connector restarts in an attempt to recover, retrying up to three times before it stops and enters a failed state that requires user intervention to resolve.

The list of retriable errors is fixed and cannot be configured by the user.

Oracle Real Application Clusters (RAC) support

The connector fully supports Oracle RAC, enabling integration with Oracle's clustered databases for high availability and fault tolerance.

Oracle End User Terms

In addition to the terms of your applicable agreement with Confluent, your use of the Oracle XStream CDC Source connector for Confluent Cloud is subject to the following flow down terms from Oracle:

  • You must provide Confluent with prior notice if you transfer, assign, or grant any rights or interests to another individual or entity with respect to your use of the Oracle XStream CDC Source connector for Confluent Cloud.
  • You agree, to the extent permitted by applicable law, that Oracle has no liability for (a) any damages, whether direct, indirect, incidental, special, punitive or consequential, and (b) any loss of profits, revenue, data or data use, arising from the use of the programs with respect to your use of the Oracle XStream CDC Source connector for Confluent Cloud.
  • You agree that Oracle is not required to perform any obligations to you as part of your use of the Oracle XStream CDC Source connector for Confluent Cloud.
  • Only applicable if you are an end user at any government level. If Oracle suspends any authorization or licenses in connection with the Oracle XStream CDC Source connector for Confluent Cloud, Confluent may immediately suspend your access to the Oracle XStream CDC Source connector for Confluent Cloud until Confluent resolves the issue with Oracle.