Logical data replication is only supported in CockroachDB self-hosted clusters.
Once you have logical data replication (LDR) running, you will need to track and manage certain parts of the job:
- Conflict resolution: As changes to a table replicate from the source to the destination cluster, there can be conflicts during some operations that the job will handle with conflict resolution. When LDR is started, the job creates a dead letter queue (DLQ) table with each replicating table. LDR will send any unresolved conflicts to the DLQ, which you should monitor as LDR continues to replicate changes between the source and destination clusters.
- Schema changes: The tables that are part of the LDR job may require schema changes, which need to be coordinated manually. There are some schema changes that are supported while LDR jobs are actively running.
- Jobs: Changefeeds and backups can operate on clusters running LDR jobs. You may want to consider where you start and how you manage backups, changefeeds, row-level TTL, and so on when you're running LDR.
Conflict resolution
In LDR, conflicts are detected at both the KV and the SQL level, which determines how LDR resolves the conflict.
KV level conflicts
LDR uses last write wins (LWW) conflict resolution based on the MVCC timestamp of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp.
Conflicts at the KV level are detected when there is either:
- An UPDATEoperation replicated to the destination cluster.
- A cross-cluster write occurs, which means both clusters are writing to the same key. For example, if the LDR stream attempts to apply a row to the destination cluster where the existing row on the destination was not written by the LDR stream.
SQL level conflicts
When a conflict cannot apply due to violating constraints, for example, a schema constraint, LDR will send the row to the DLQ.
Dead letter queue (DLQ)
When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period of a minute, which could occur if there is a unique index on the destination table (for more details, refer to Unique seconday indexes).
LDR will not pause when the writes are sent to the DLQ, you must manage the DLQ manually.
To manage the DLQ, you can evaluate entries in the incoming_row column and apply the row manually to another table with SQL statements.
As an example, for an LDR stream created on the movr.public.promo_codes table:
SHOW TABLES;
The table will have a random number within its name to ensure it is unique.
    schema_name    |         table_name         | type  | owner | estimated_row_count | locality
-------------------+----------------------------+-------+-------+---------------------+-----------
  crdb_replication | dlq_113_public_promo_codes | table | node  |                 186 | NULL
  public           | promo_codes                | table | root  |                1047 | NULL
  public           | rides                      | table | root  |                 976 | NULL
  public           | user_promo_codes           | table | root  |                 134 | NULL
  public           | users                      | table | root  |                 424 | NULL
  public           | vehicle_location_histories | table | root  |               13012 | NULL
  public           | vehicles                   | table | root  |                 153 | NULL
(7 rows)
The schema for the movr.crdb_replication.dlq_113_public_promo_codes DLQ:
CREATE TABLE crdb_replication.dlq_113_public_promo_codes (
    id INT8 NOT NULL DEFAULT unique_rowid(),
    ingestion_job_id INT8 NOT NULL,
    table_id INT8 NOT NULL,
    dlq_timestamp TIMESTAMPTZ NOT NULL DEFAULT now():::TIMESTAMPTZ,
    dlq_reason STRING NOT NULL,
    mutation_type crdb_replication.mutation_type NULL,
    key_value_bytes BYTES NOT VISIBLE NOT NULL,
    incoming_row JSONB NULL,
    crdb_internal_dlq_timestamp_id_ingestion_job_id_shard_16 INT8 NOT VISIBLE NOT NULL AS (mod(fnv32(md5(crdb_internal.datums_to_bytes(dlq_timestamp, id, ingestion_job_id))), 16:::INT8)) VIRTUAL,
CONSTRAINT dlq_113_public_promo_codes_pkey PRIMARY KEY (ingestion_job_id ASC, dlq_timestamp ASC, id ASC) USING HASH WITH (bucket_count=16)
)
Schema changes
When you start LDR on a table, the job will lock the schema, which will prevent any accidental schema changes that would cause issues for LDR. There are some supported schema changes that you can perform on a replicating table, otherwise it is necessary to stop LDR in order to coordinate the schema change.
Supported schema changes
There are some supported schema changes, which you can perform during LDR without restarting the job:
| Allowlist schema change | Exceptions | 
|---|---|
| CREATE INDEX | |
| New in v25.2: ALTER INDEX ... RENAME | N/A | 
| New in v25.2: ALTER INDEX ... NOT VISIBLE | N/A | 
| DROP INDEX | N/A | 
| New in v25.2: ALTER TABLE ... ALTER COLUMN ... SET DEFAULT | N/A | 
| New in v25.2: ALTER TABLE ... ALTER COLUMN ... DROP DEFAULT | N/A | 
| New in v25.2: ALTER TABLE ... ALTER COLUMN ... SET VISIBLE | N/A | 
| Zone configuration changes | N/A | 
| ALTER TABLE ... CONFIGURE ZONE | N/A | 
| ALTER TABLE ... SET/RESET {TTL storage parameters} | 
 | 
LDR will not replicate the allowlist schema changes to the destination table. Therefore, you must perform the schema change carefully on both the source and destination cluster.
Coordinate other schema changes
To perform any other schema change on the table, use the following approach to redirect application traffic to one cluster. You'll need to drop the existing LDR jobs, perform the schema change, and start new LDR jobs, which will require a full initial scan.
If you are running LDR in a unidirectional setup, follow Coordinate schema changes for unidirectional LDR.
Redirect application traffic to one cluster
You have a bidirectional LDR setup with a stream between cluster A to cluster B, and a stream between cluster B to cluster A.
- Redirect your application traffic to one cluster, for example, cluster A.
- Wait for all traffic from cluster B to replicate to cluster A. Check this is complete with: - SHOW LOGICAL REPLICATION JOBS WITH DETAILS;- This is complete when - replicated_timeon cluster B surpasses the time at which you redirected application traffic, which indicates that all traffic from cluster B has been replicated to cluster A.- job_id | status | targets | replicated_time | replication_start_time | conflict_resolution_type | description ----------------------+----------+--------------------------------+-------------------------------+-------------------------------+--------------------------+----------------------------------------------------------------------------------------- 1010959260799270913 | running | {movr.public.promo_codes} | 2024-10-24 17:50:05+00 | 2024-10-10 20:04:42.196982+00 | LWW | LOGICAL REPLICATION STREAM into movr.public.promo_codes from external://cluster_a 1014047902397333505 | canceled | {defaultdb.public.office_dogs} | 2024-10-24 17:30:25+00 | 2024-10-21 17:54:20.797643+00 | LWW | LOGICAL REPLICATION STREAM into defaultdb.public.office_dogs from external://cluster_a
- Drop the LDR job on both clusters. Canceling the LDR streams will remove the history retention job, which will cause the data to be garbage collected according to the - gc.ttlsecondssetting. Use- CANCEL JOB:- CANCEL JOB {ldr_job_id};
- Perform the schema change on cluster A. 
- Drop the existing table from cluster B. 
- Recreate the table and its new schema on cluster B. 
- Create new LDR streams for the table on both clusters A and B. Run - CREATE LOGICAL REPLICATION STREAMfrom the destination cluster for each stream:- CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.table_name};Note:- If your table does not contain any user-defined types or foreign key dependencies, use the - CREATE LOGICALLY REPLICATEDsyntax to start the stream for a fast, offline initial scan and automatic destination table setup.
Coordinate schema changes for unidirectional LDR
If you have a unidirectional LDR setup, you should cancel the running LDR stream and redirect all application traffic to the source cluster.
- Drop the LDR job on the destination cluster. Canceling the LDR job will remove the history retention job, which will cause the data to be garbage collected according to the - gc.ttlsecondssetting. Use- CANCEL JOB:- CANCEL JOB {ldr_job_id};
- Once the job has - canceled, perform the required schema change on both the source and destination tables.
- Start a new LDR job from the destination cluster: - CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.table_name};Note:- If your table does not contain any user-defined types or foreign key dependencies, use the - CREATE LOGICALLY REPLICATEDsyntax to start the stream for a fast, offline initial scan and automatic destination table setup.
Jobs and LDR
You can run changefeed and backup jobs on any cluster that is involved in an LDR job. Both source and destination clusters in LDR are active, which means they can both serve production reads and writes as well as run backups and changefeeds.
You may want to run jobs like changefeeds from one cluster to isolate these jobs from the cluster receiving the principal application traffic.
Changefeeds
Changefeeds will emit messages for the writes that occur to the watched table. If the watched table is also the destination to which LDR is streaming, the changefeed will additionally emit messages for the writes from the LDR job. For example:
- You create a changefeed watching the test_tableon cluster A.
- You start LDR from cluster A test_tablereplicating to cluster B'stest_table. There are writes totest_tablehappening on both clusters. At this point, the changefeed is only emitting messages for cluster A (the source of the LDR job).
- You start another LDR job from cluster B to cluster A to create bidirectional LDR. This second LDR job sends writes occurring on cluster B test_tableinto cluster Atest_table. The changefeed on cluster A will now start emitting messages for both the writes occuring from application traffic in cluster A and also the writes incoming from LDR running from cluster B to cluster A.
Backups
Backups can run on either cluster in an LDR stream. If you're backing up a table that is the destination table to which an LDR job is streaming, the backup will include writes that occur to the table from the LDR job.
TTL
If you're running row-level TTL jobs, you may not want to include these deletes in LDR. You can ignore row-level TTL deletes in LDR with the ttl_disable_changefeed_replication storage parameter set on the table in the source cluster. If you would like to ignore TTL deletes in LDR, you can use the discard = ttl-deletes option in the CREATE LOGICAL REPLICATION STREAM statement.