CockroachDB has support for Time to Live ("TTL") expiration on table rows, also known as Row-Level TTL. Row-Level TTL is a mechanism whereby rows from a table are considered "expired" and can be automatically deleted once those rows have been stored longer than a specified expiration time.
By using Row-Level TTL, you can avoid the complexity of writing and managing scheduled jobs from the application layer to mark rows as expired and perform the necessary deletions. Doing it yourself can become complicated due to the need to balance the timeliness of the deletions vs. the potentially negative performance impact of those deletions on foreground traffic from your application.
Use cases for Row-Level TTL include:
- Delete inactive data events to manage data size and performance: For example, you may want to delete order records from an online store after 90 days. 
- Delete data no longer needed for compliance: For example, a banking application may need to keep some subset of data for a period of time due to financial regulations. Row-Level TTL can be used to remove data older than that period on a rolling, continuous basis. 
- Outbox pattern: When events are written to an outbox table and published to an external system like Kafka using CockroachDB's Change Data Capture (CDC) feature (also known as "changefeeds"), those events must be deleted to prevent unbounded growth in the size of the outbox table. 
How it works
At a high level, Row-Level TTL works by:
- Issuing a selection query at a historical timestamp, yielding a set of rows that are eligible for deletion (also known as "expired").
- Issuing batched DELETEstatements for the expired rows.
- As part of the above process, deciding how many rows to SELECTandDELETEat once in each of the above queries.
- Running the SQL queries described above in parallel as background jobs.
- To minimize the performance impact on foreground application queries, the background deletion queries are rate limited; they are also submitted at a lower priority level using the admission control system. When foreground traffic increases, CockroachDB will reduce the resources allocated to TTL deletes to handle the foreground traffic. When foreground traffic decreases, CockroachDB will increase the resources allocated to TTL deletes.
The process above is conceptually similar to the process described by Batch delete on an indexed column, except that Row-Level TTL is built into CockroachDB, so it saves you from having to write code to manage the process from your application and/or external job processing framework, including tuning the rate and performance of your background queries so they don't affect foreground application query performance.
When are rows deleted?
Once rows are expired (that is, are older than the specified TTL interval), they are eligible to be deleted. However, eligible rows may not be deleted right away. Instead, they are scheduled for deletion using a background job that is run at the interval defined by the ttl_job_cron storage parameter.
Syntax overview
TTLs are defined using either the ttl_expiration_expression or ttl_expire_after storage parameters.
- Using ttl_expiration_expressionis useful for customizing expiration logic by providing an expression. For example, you could get the same behavior asttl_expire_afterby creating aTIMESTAMPTZcolumn with a default value and having thettl_expiration_expressionreference that column.
- Using ttl_expire_afteris a convenient way of setting rows to expire a fixed amount of time after they are created or updated.
Starting with CockroachDB v22.2, we recommend that most users use ttl_expiration_expression over ttl_expire_after for the following reasons:
- If you use ttl_expiration_expression, you can use an existingTIMESTAMPTZcolumn called e.g.updated_at.
- If you use ttl_expire_after, it will cause a full table rewrite, which can affect performance. Also, you can't use it with an existingTIMESTAMPTZcolumn.
Using ttl_expiration_expression
Use ttl_expiration_expression for customizing the expiration logic by providing a SQL expression. For example, you could get the same behavior as ttl_expire_after by creating a column with a default value and having the ttl_expiration_expression reference that column.
To add custom expiration logic using ttl_expiration_expression, issue the following SQL statement that uses the ttl_expiration_expression parameter, which defines a TIMESTAMPTZ after which the row is considered expired:
CREATE TABLE ttl_test_per_row (
  id UUID PRIMARY KEY default gen_random_uuid(),
  description TEXT,
  expired_at TIMESTAMPTZ
) WITH (ttl_expiration_expression = 'expired_at', ttl_job_cron = '@daily');
The statement has the following effects:
- Creates a repeating scheduled job for the table and sets it to run once per day.
- Implicitly adds the ttlandttl_cronstorage parameters.
To see the storage parameters, enter the SHOW CREATE TABLE statement:
SHOW CREATE TABLE ttl_test_per_row;
     table_name    |                                    create_statement
-------------------+------------------------------------------------------------------------------------------
  ttl_test_per_row | CREATE TABLE public.ttl_test_per_row (
                   |     id UUID NOT NULL DEFAULT gen_random_uuid(),
                   |     description STRING NULL,
                   |     expired_at TIMESTAMPTZ NULL,
                   |     CONSTRAINT ttl_test_per_row_pkey PRIMARY KEY (id ASC)
                   | ) WITH (ttl = 'on', ttl_expiration_expression = 'expired_at', ttl_job_cron = '@daily')
(1 row)
Starting with CockroachDB v22.2, we recommend that most users use ttl_expiration_expression over ttl_expire_after for the following reasons:
- If you use ttl_expiration_expression, you can use an existingTIMESTAMPTZcolumn called e.g.updated_at.
- If you use ttl_expire_after, it will cause a full table rewrite, which can affect performance. Also, you can't use it with an existingTIMESTAMPTZcolumn.
Using ttl_expire_after
To set rows to expire a fixed amount of time after they are created or updated, issue the following SQL statement using the ttl_expire_after storage parameter:
CREATE TABLE ttl_test_per_table (
  id UUID PRIMARY KEY default gen_random_uuid(),
  description TEXT,
  inserted_at TIMESTAMPTZ default current_timestamp()
) WITH (ttl_expire_after = '3 months', ttl_job_cron = '@daily');
The statement has the following effects:
- Creates a repeating scheduled job for the table and sets it to run once per day.
- Adds a NOT VISIBLEcolumn calledcrdb_internal_expirationof typeTIMESTAMPTZto represent the TTL.
- Implicitly adds the ttlandttl_cronstorage parameters.
To see the hidden column and the storage parameters, enter the SHOW CREATE TABLE statement:
SHOW CREATE TABLE ttl_test_per_table;
      table_name     |                                                                                         create_statement
---------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  ttl_test_per_table | CREATE TABLE public.ttl_test_per_table (
                     |     id UUID NOT NULL DEFAULT gen_random_uuid(),
                     |     description STRING NULL,
                     |     inserted_at TIMESTAMPTZ NULL DEFAULT current_timestamp():::TIMESTAMPTZ,
                     |     crdb_internal_expiration TIMESTAMPTZ NOT VISIBLE NOT NULL DEFAULT current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL ON UPDATE current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL,
                     |     CONSTRAINT ttl_test_per_table_pkey PRIMARY KEY (id ASC)
                     | ) WITH (ttl = 'on', ttl_expire_after = '3 mons':::INTERVAL, ttl_job_cron = '@daily')
(1 row)
Starting with CockroachDB v22.2, we recommend that most users use ttl_expiration_expression over ttl_expire_after for the following reasons:
- If you use ttl_expiration_expression, you can use an existingTIMESTAMPTZcolumn called e.g.updated_at.
- If you use ttl_expire_after, it will cause a full table rewrite, which can affect performance. Also, you can't use it with an existingTIMESTAMPTZcolumn.
TTL storage parameters
The settings that control the behavior of Row-Level TTL are provided using storage parameters. These parameters can be set during table creation using CREATE TABLE, added to an existing table using the ALTER TABLE statement, or reset to default values.
| Description | Option | Associated cluster setting | 
|---|---|---|
| ttl_expiration_expression | Recommended in v22.2+. SQL expression that defines the TTL expiration. Must evaluate to a TIMESTAMPTZ. This and/orttl_expire_afterare required to enable TTL. This parameter is useful when you want to set the TTL for individual rows in the table. For an example, see Create a table with attl_expiration_expression. | N/A | 
| ttl_expire_after | The interval when a TTL will expire. This and/or ttl_expiration_expressionare required to enable TTL. Minimum value:'1 microsecond'. | N/A | 
| ttl | Signifies if a TTL is active. Automatically set. | N/A | 
| ttl_select_batch_size | How many rows to select at one time during the row expiration check. Default: 500. Minimum: 1. | sql.ttl.default_select_batch_size | 
| ttl_delete_batch_size | How many rows to delete at a time. Default: 100. Minimum: 1. | sql.ttl.default_delete_batch_size | 
| ttl_delete_rate_limit | Maximum number of rows to be deleted per second (rate limit). Default: 0 (no limit). Note: The rate limit is applied per leaseholder per table. In practice, it will vary based on the number of ranges and leaseholders. For example, if this variable is set to 500, and a table public.foohas its ranges spread across 3 leaseholders, then the cluster-wide rate limit forpublic.foois 500 x 3 = 1500. | sql.ttl.default_delete_rate_limit | 
| ttl_row_stats_poll_interval | If set, counts rows and expired rows on the table to report as Prometheus metrics while the TTL job is running. Unset by default, meaning no stats are fetched and reported. | N/A | 
| ttl_pause | If set, stops the TTL job from executing. | N/A | 
| ttl_job_cron | Frequency at which the TTL job runs, specified using CRON syntax. Default: '@hourly'. | N/A | 
For more information about TTL-related cluster settings, see View TTL-related cluster settings.
TTL metrics
The table below lists the metrics you can use to monitor the effectiveness of your TTL settings. These metrics are visible on the Advanced Debug Page, as well as at the _status/vars endpoint which can be scraped by Prometheus.
| Name | Description | Measurement | Type | 
|---|---|---|---|
| jobs.row_level_ttl.range_total_duration | Duration for processing a range during row level TTL. | nanoseconds | Histogram | 
| jobs.row_level_ttl.select_duration | Duration for select requests during row level TTL. | nanoseconds | Histogram | 
| jobs.row_level_ttl.delete_duration | Duration for delete requests during row level TTL. | nanoseconds | Histogram | 
| jobs.row_level_ttl.rows_selected | Number of rows selected for deletion by the row level TTL job. | num_rows | Counter | 
| jobs.row_level_ttl.rows_deleted | Number of rows deleted by the row level TTL job. | num_rows | Counter | 
| jobs.row_level_ttl.num_active_ranges | Number of active workers attempting to delete for row level TTL. | num_active_workers | Count | 
| jobs.row_level_ttl.total_rows | Approximate number of rows on the TTL table. | total_rows | Count | 
| jobs.row_level_ttl.total_expired_rows | Approximate number of expired rows on the TTL table. | total_expired_rows | Count | 
By default, these metrics are aggregated, meaning that all TTL tables will report the metrics under the same label. If you want to have metrics labelled by table name (at the risk of added cardinality), you must take the following steps:
- Set the server.child_metrics.enabledcluster setting totrue.
- Set the ttl_label_metricsstorage parameter totrue.
For more information about the issues (including negative performance impacts) that can arise when you add cardinality, see the considerations listed in Using changefeed metrics labels.
Examples
Create a table with a ttl_expiration_expression
Use the SQL syntax shown below, which uses the ttl_expiration_expression parameter to refer to an expire_at column that determines each row's expiration:
CREATE TABLE ttl_test_ttl_expiration_expression (
  id INT PRIMARY KEY,
  expire_at TIMESTAMPTZ
) WITH (ttl_expiration_expression = 'expire_at');
The ttl_expiration_expression" parameter takes a SQL expression (often a column name) that defines the TTL expiration. It is used when you want to set the TTL for individual rows in a table.
The ttl_expiration_expression parameter has the following requirements:
- It must evaluate to a TIMESTAMPTZ.
- It must not reference any columns outside the table to which it is applied.
- Any column it references cannot be dropped or have its type altered.
- Finally, if the column is renamed, the value of ttl_expiration_expressionis automatically updated.
Create a table with ttl_expire_after
Use the SQL syntax shown below to create a new table with rows that expire after a 3 month interval, execute a statement like the following:
CREATE TABLE events (
  id UUID PRIMARY KEY default gen_random_uuid(),
  description TEXT,
  inserted_at TIMESTAMP default current_timestamp()
) WITH (ttl_expire_after = '3 months', ttl_job_cron = '@daily');
CREATE TABLE
Insert some data; it should work as expected:
INSERT INTO events (description) VALUES ('a thing'), ('another thing'), ('yet another thing');
INSERT 3
To see the rows and their expirations, enter the following query:
SELECT *, crdb_internal_expiration FROM events;
                   id                  |    description    |        inserted_at         |   crdb_internal_expiration
---------------------------------------+-------------------+----------------------------+--------------------------------
  117c35fe-97f6-43bc-919f-fcd2ea13779e | a thing           | 2022-04-19 18:29:53.846697 | 2022-07-19 18:29:53.846697+00
  c294890f-2f14-4e18-8001-5f806ed9bfd1 | yet another thing | 2022-04-19 18:29:53.846697 | 2022-07-19 18:29:53.846697+00
  ea72189c-2f17-4a8e-b479-6b050a87e3bb | another thing     | 2022-04-19 18:29:53.846697 | 2022-07-19 18:29:53.846697+00
(3 rows)
Add or update the row-level TTL for an existing table
To add or change the row-level TTL expiration for an existing table, use the SQL syntax shown below.
ALTER TABLE events SET (ttl_expire_after = '1 year');
ALTER TABLE
Adding or changing the Row-Level TTL settings for an existing table with a table-wide TTL will result in a schema change that performs the following changes:
- Creates a new crdb_internal_expirationcolumn for all rows.
- Backfills the value of the new crdb_internal_expirationcolumn tonow()+ttl_expire_after.
Depending on the table size, this can negatively affect performance.
View scheduled TTL jobs
You can use SHOW SCHEDULES to view all TTL-related scheduled jobs by executing the following query:
SHOW SCHEDULES;
          id         |        label         | schedule_status |        next_run        |  state  | recurrence | jobsrunning | owner |            created            |     command
---------------------+----------------------+-----------------+------------------------+---------+------------+-------------+-------+-------------------------------+-------------------
  747608117920104449 | sql-stats-compaction | ACTIVE          | 2022-03-25 16:00:00+00 | pending | @hourly    |           0 | node  | 2022-03-25 15:31:31.444067+00 | {}
  747609229470433281 | row-level-ttl-112    | ACTIVE          | 2022-03-25 16:00:00+00 | NULL    | @daily    |           0 | root  | 2022-03-25 15:37:10.613056+00 | {"tableId": 112}
(2 rows)
View running TTL jobs
You can use SHOW JOBS to see any running TTL jobs by executing the following query:
WITH x AS (SHOW JOBS) SELECT * from x WHERE job_type = 'ROW LEVEL TTL';
        job_id       |   job_type    | description | statement | user_name | status | running_status |          created           |          started           |          finished          |          modified          | fraction_completed |                                    error                                    | coordinator_id |      trace_id       |          last_run          |          next_run          | num_runs | execution_errors
---------------------+---------------+-------------+-----------+-----------+--------+----------------+----------------------------+----------------------------+----------------------------+----------------------------+--------------------+-----------------------------------------------------------------------------+----------------+---------------------+----------------------------+----------------------------+----------+-------------------
  751553547665211401 | ROW LEVEL TTL | ttl         |           | node      | failed | NULL           | 2022-04-08 13:59:00.008994 | 2022-04-08 13:59:02.730252 | 2022-04-08 13:59:03.367008 | 2022-04-08 13:59:02.587079 |                  0 | found a recent schema change on the table at 2022-04-08T13:58:35Z, aborting |              5 | 6643876482632317647 | 2022-04-08 13:59:03.224766 | 2022-04-08 13:59:33.224766 |        1 | {}
(1 row)
You can also view running TTL jobs using the Jobs page in the DB Console
Reset a storage parameter to its default value
To reset a TTL storage parameter to its default value, use the ALTER TABLE statement:
ALTER TABLE events RESET (ttl_job_cron);
ALTER TABLE
View TTL storage parameters on a table
To view TTL storage parameters on a table, you can use SHOW CREATE TABLE:
SHOW CREATE TABLE events;
  table_name |                                                                                         create_statement
-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  events     | CREATE TABLE public.events (
             |     id UUID NOT NULL DEFAULT gen_random_uuid(),
             |     description STRING NULL,
             |     inserted_at TIMESTAMP NULL DEFAULT current_timestamp():::TIMESTAMP,
             |     crdb_internal_expiration TIMESTAMPTZ NOT VISIBLE NOT NULL DEFAULT current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL ON UPDATE current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL,
             |     CONSTRAINT events_pkey PRIMARY KEY (id ASC)
             | ) WITH (ttl = 'on', ttl_expire_after = '3 mons':::INTERVAL, ttl_job_cron = '@daily')
(1 row)
You can also use the following query:
SELECT relname, reloptions FROM pg_class WHERE relname = 'events';
  relname |                                reloptions
----------+---------------------------------------------------------------------------
  events  | NULL
  events  | {ttl='on',"ttl_expire_after='3 mons':::INTERVAL",ttl_job_cron='@daily'}
(2 rows)
Control how often the TTL job runs
Setting a TTL on a table controls when the rows therein are considered expired, but it only says that such rows may be deleted at any time after the expiration. To control how often the TTL deletion job runs, use the ttl_job_cron storage parameter, which supports CRON syntax. Cockroach Labs recommends setting ttl_job_cron to be equal to or longer than the gc.ttlseconds setting, which is the garbage collection interval for the cluster. The default value of gc.ttlseconds is 14400, or 4 hours. The CRON pattern for every four hours is '0 */4 * * *'.
To control the job interval at CREATE TABLE time, add the storage parameter as shown below:
CREATE TABLE tbl (
  id UUID PRIMARY KEY default gen_random_uuid(),
  value TEXT
) WITH (ttl_expire_after = '3 weeks', ttl_job_cron = '0 */4 * * *');
CREATE TABLE
To set the ttl_job_cron storage parameter when creating a table with Row-Level TTL, you must also set either the ttl_expire_after parameter or the ttl_expiration_expression parameter.
To update the TTL deletion job interval on a table that already has Row-Level TTL enabled, use ALTER TABLE:
ALTER TABLE tbl SET (ttl_job_cron = '@weekly');
ALTER TABLE
Pause the TTL job from running
To pause the TTL job from running on a table, use the ttl_pause storage parameter:
ALTER TABLE events SET (ttl_pause = 'on');
ALTER TABLE
If you run the TTL pausing statement above against a table that does not have TTL enabled, you will get the following error:
ERROR: "ttl_expire_after" must be set
SQLSTATE: 22023
Filter out expired rows from a selection query
To fetch only those rows from a table with table-wide TTL that have not yet expired their TTL, use the hidden crdb_internal_expiration column:
SELECT * FROM events WHERE crdb_internal_expiration > now();
                   id                  |    description    |        inserted_at
---------------------------------------+-------------------+-----------------------------
  6d25862e-2e48-4993-ac3a-a2abbebebf32 | yet another thing | 2022-03-17 20:01:56.138216
  a9404386-c4da-415f-b0b0-0dfad0f13c80 | a thing           | 2022-03-17 20:01:56.138216
  d4ebf8cd-e482-4abb-8968-2ba39c9197d9 | another thing     | 2022-03-17 20:01:56.138216
(3 rows)
To fetch only those rows from a table with a ttl_expiration_expression that have not yet expired their TTL, use the expired_at column you created earlier:
SELECT * FROM ttl_test_per_row WHERE expired_at > now();
Remove Row-Level TTL from a table
To drop the TTL on an existing table, reset the ttl storage parameter.
ALTER TABLE events RESET (ttl);
If both ttl_expire_after and ttl_expiration_expression are set, and you want to remove one or the other, you can use either of:
ALTER TABLE events RESET (ttl_expire_after);
ALTER TABLE events RESET (ttl_expiration_expression);
Disable TTL jobs for the whole cluster
To disable TTL jobs for the whole cluster, set the sql.ttl.job.enabled cluster setting to false:
SET CLUSTER SETTING sql.ttl.job.enabled = false;
SET CLUSTER SETTING
View TTL-related cluster settings
To view the cluster settings that control how Row-Level TTL works, issue the following query:
WITH x AS (SHOW CLUSTER SETTINGS) SELECT * FROM x WHERE variable LIKE 'sql.ttl.%';
              variable              | value | setting_type |                                 description
------------------------------------+-------+--------------+------------------------------------------------------------------------------
  sql.ttl.default_delete_batch_size | 100   | i            | default amount of rows to delete in a single query during a TTL job
  sql.ttl.default_delete_rate_limit | 0     | i            | default delete rate limit for all TTL jobs. Use 0 to signify no rate limit.
  sql.ttl.default_select_batch_size | 500   | i            | default amount of rows to select in a single query during a TTL job
  sql.ttl.job.enabled               | false | b            | whether the TTL job is enabled
(5 rows)
Common errors
If you attempt to update a TTL storage parameter on a table that does not have TTL enabled, you will get an error as shown below:
ALTER TABLE events SET (ttl_job_cron = '@weekly');
ERROR: "ttl_expire_after" and/or "ttl_expiration_expression" must be set
SQLSTATE: 22023
If you try to reset a TTL storage parameter but resetting that parameter would result in an invalid state of the TTL subsystem, CockroachDB will signal an error. For example, there is only one way to remove Row-Level TTL from a table. If you try to remove the TTL from a table by resetting the ttl_expire_after storage parameter you set earlier, you will get the following error:
ALTER TABLE tbl RESET (ttl_expire_after);
ERROR: "ttl_expire_after" and/or "ttl_expiration_expression" must be set
SQLSTATE: 22023
Changefeeds
Row-level TTL interacts with changefeeds in the following ways:
- When expired rows are deleted, a changefeed delete message is emitted.
Backup and restore
Row-level TTL interacts with backup and restore in the following ways:
- When you run a - BACKUP, all row-level TTL information associated with the tables being backed up (including TTL expiration times) is also backed up.
- When you - RESTOREfrom a backup, all row-level TTL information associated with the tables being restored (including TTL expiration times) is also restored. Any expired rows in the restored tables are eligible to be deleted by the TTL job.
Required Privileges
To add or update Row-Level TTL settings on a table, you must have one of the following:
- Membership to the owner role for the database where the table is located.
- The CREATEorALTERprivilege on the database where the table is located.
Migrating TTL usage from earlier versions of CockroachDB
If you are migrating your TTL usage from an earlier version of CockroachDB, the ttl_expire_after and ttl_expiration_expression storage parameters can co-exist where the ttl_expire_after creates the crdb_internal_expiration column and ttl_expiration_expression overrides the default value of crdb_internal_expiration.
Limitations
- You cannot use foreign keys to create references to or from a table that uses Row-Level TTL. cockroachdb/cockroach#76407
- Any queries you run against tables with Row-Level TTL enabled do not filter out expired rows from the result set (this includes UPDATEs andDELETEs). This feature may be added in a future release. For now, follow the instructions in Filter out expired rows from a selection query.
- Enabling Row-Level TTL on a table with multiple secondary indexes can have negative performance impacts on a cluster, including increased latency and contention. This is particularly true for large tables with terabytes of data and billions of rows that are split up into multiple ranges across separate nodes.
- Increased latency may occur because secondary indexes aren't necessarily stored on the same underlying ranges as a table's primary indexes. Further, the secondary indexes' ranges may have leaseholders located on different nodes than the primary index.
- Increased contention may occur because intents must be written as part of performing the deletions.
- Finally, secondary indexes can also have a negative impact on the overall performance of TTL jobs. According to internal testing, the TTL job processing rate is worse on tables with secondary indexes. If you encounter this situation, decreasing the ttl_delete_batch_sizestorage parameter may help by decreasing the number of ranges that need to be accessed by the job.