PostgreSQL 19, summarized by DeepSeek: Add UPDATE/DELETE FOR PORTION OF

Original article: https://www.depesz.com/2026/04/02/waiting-for-postgresql-19-add-update-delete-for-portion-of/

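Waiting for PostgreSQL 19: Add UPDATE/DELETE FOR PORTION OF

On 1st of April 2026, Peter Eisentraut committed patch:

Add UPDATE/DELETE FOR PORTION OF

This is an extension of the UPDATE and DELETE commands to do a "temporal update/delete" based on a range or multirange column. The user can say: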

UPDATE t FOR PORTION OF valid_at FROM '2001-01-01' TO '2002-01-01' SET ...

(or likewise with DELETE), where valid_at is a range or multirange column.

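The command is automatically limited to rows overlapping the targeted portion, and only history within those bounds is changed. If a row represents history partly inside and partly outside the bounds, the command truncates the row's application time to fit within the targeted portion, then inserts one or more "temporal leftovers": new rows containing all the original values, except with the application-time column changed to represent only the untouched part of history.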

To compute the temporal leftovers that are required, we use the *_minus_multi set-returning functions defined in 5eed8ce50c.

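  • Added bison support for the FOR PORTION OF syntax. The bounds must be constant, so column references, subqueries, etc. are forbidden. Functions like NOW() are accepted.
  • Added logic to the executor to insert new rows for the "temporal leftover" part of a record touched by a FOR PORTION OF query.
  • Documented FOR PORTION OF.
  • Added tests.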

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>

Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com


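In PostgreSQL 18 we got temporal tables. In short, they let a row keep the history of its changes over time, so that you can query its state as of a specific point in time.

This new commit significantly simplifies how we handle updates and deletes.

Let me first show how it had to be done before. First, some sample data: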

=$ CREATE extension btree_gist;
CREATE EXTENSION

=$ create table test_table (
    id int8 generated by default as identity,
    valid_range tstzrange not null default tstzrange(now(), 'infinity', '[)'),
    the_value TEXT,
    primary key (id, valid_range WITHOUT OVERLAPS)
);
CREATE TABLE

=$ INSERT INTO test_table (valid_range, the_value) VALUES (tstzrange(now() - '1 year'::INTERVAL, 'infinity', '[)'), 'initial');
INSERT 0 1

=$ INSERT INTO test_table (valid_range, the_value) VALUES (tstzrange(now() - '1 year'::INTERVAL, 'infinity', '[)'), 'second initial');
INSERT 0 1

=$ SELECT * FROM test_table;
 id |                valid_range                 |   the_value
----+--------------------------------------------+----------------
  1 | ["2025-04-02 12:29:42.375018+02",infinity) | initial
  2 | ["2025-04-02 12:29:42.378174+02",infinity) | second initial
(2 rows)

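Now, let's say we want to change the value in the row with id = 1. I have to mark the old version as no longer valid, then insert the new version, all in a single transaction to keep the data consistent: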

=$ BEGIN;
BEGIN

=$ UPDATE test_table
    SET valid_range = tstzrange( lower( valid_range ), now(), '[)')
WHERE id = 1 AND valid_range @> now();
UPDATE 1

=$ INSERT INTO test_table (id, the_value) VALUES (1, 'updated');
INSERT 0 1

=$ commit;
COMMIT

Now the table contains three rows:

=$ SELECT * FROM test_table;
 id |                            valid_range                            |   the_value
----+-------------------------------------------------------------------+----------------
  2 | ["2025-04-02 12:29:42.378174+02",infinity)                        | second initial
  1 | ["2025-04-02 12:29:42.375018+02","2026-04-02 12:29:42.380359+02") | initial
  1 | ["2026-04-02 12:29:42.380359+02",infinity)                        | updated
(3 rows)

Of course, we can query only the currently visible rows:

=$ SELECT * FROM test_table WHERE valid_range @> now();
 id |                valid_range                 |   the_value
----+--------------------------------------------+----------------
  2 | ["2025-04-02 12:29:42.378174+02",infinity) | second initial
  1 | ["2026-04-02 12:29:42.380359+02",infinity) | updated
(2 rows)
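
The same containment operator works for any moment, not just now(). The following is a minimal, self-contained sketch: the inline history rows and their rounded timestamps are made up for illustration and merely stand in for test_table.

```sql
-- Hypothetical stand-in for test_table: two versions of the row with id = 1.
WITH history(id, valid_range, the_value) AS (
    VALUES
        (1, tstzrange('2025-04-02 12:00+02', '2026-04-02 12:00+02', '[)'), 'initial'),
        (1, tstzrange('2026-04-02 12:00+02', 'infinity', '[)'), 'updated')
)
-- "range @> element" is true when the range contains the given moment,
-- so this returns the version that was valid in mid-October 2025.
SELECT *
FROM history
WHERE valid_range @> '2025-10-15 00:00+02'::timestamptz;
```

Against the real table, the same WHERE clause would be applied to test_table directly.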

Deleting a row is simpler; I just need to update the current version of the row so that its validity ends now:

=$ UPDATE test_table
    SET valid_range = tstzrange( lower( valid_range ), now(), '[)')
WHERE id = 1 AND valid_range @> now();
UPDATE 1

=$ SELECT * FROM test_table;
 id |                            valid_range                            |   the_value
----+-------------------------------------------------------------------+----------------
  2 | ["2025-04-02 12:29:42.378174+02",infinity)                        | second initial
  1 | ["2025-04-02 12:29:42.375018+02","2026-04-02 12:29:42.380359+02") | initial
  1 | ["2026-04-02 12:29:42.380359+02","2026-04-02 12:29:42.382341+02") | updated
(3 rows)

=$ SELECT * FROM test_table WHERE valid_range @> now();
 id |                valid_range                 |   the_value
----+--------------------------------------------+----------------
  2 | ["2025-04-02 12:29:42.378174+02",infinity) | second initial
(1 row)

That is how it is done in PostgreSQL 18. But now I can simply:

=$ update test_table for portion of valid_range from now() to 'infinity' set the_value = 'new value' where id = 2;
UPDATE 1

=$ select * from test_table;
 id │                            valid_range                            │   the_value
────┼───────────────────────────────────────────────────────────────────┼────────────────
  1 │ ["2025-04-02 12:29:42.375018+02","2026-04-02 12:29:42.380359+02") │ initial
  1 │ ["2026-04-02 12:29:42.380359+02","2026-04-02 12:29:42.382341+02") │ updated
  2 │ ["2026-04-02 12:33:39.740173+02",infinity)                        │ new value
  2 │ ["2025-04-02 12:29:42.378174+02","2026-04-02 12:33:39.740173+02") │ second initial
(4 rows)

=$ update test_table for portion of valid_range from now() to 'infinity' set the_value = 'yet another value' where id = 2;
UPDATE 1

=$ select * from test_table;
 id │                            valid_range                            │     the_value
────┼───────────────────────────────────────────────────────────────────┼───────────────────
  1 │ ["2025-04-02 12:29:42.375018+02","2026-04-02 12:29:42.380359+02") │ initial
  1 │ ["2026-04-02 12:29:42.380359+02","2026-04-02 12:29:42.382341+02") │ updated
  2 │ ["2025-04-02 12:29:42.378174+02","2026-04-02 12:33:39.740173+02") │ second initial
  2 │ ["2026-04-02 12:33:51.701421+02",infinity)                        │ yet another value
  2 │ ["2026-04-02 12:33:39.740173+02","2026-04-02 12:33:51.701421+02") │ new value
(5 rows)

Even cooler, I can easily change data in the past. For example:

=$ update test_table for portion of valid_range from '2025-12-01' to '2026-01-01' set the_value = 'december thing' where id = 2;
UPDATE 1

=$ select * from test_table order by id, valid_range;
 id │                            valid_range                            │     the_value
────┼───────────────────────────────────────────────────────────────────┼───────────────────
  1 │ ["2025-04-02 12:29:42.375018+02","2026-04-02 12:29:42.380359+02") │ initial
  1 │ ["2026-04-02 12:29:42.380359+02","2026-04-02 12:29:42.382341+02") │ updated
  2 │ ["2025-04-02 12:29:42.378174+02","2025-12-01 00:00:00+01")        │ second initial
  ... -- intermediate rows omitted
(7 rows)
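
To see where the new boundaries come from, the split can be reproduced by hand with ordinary range arithmetic. This is only a sketch of the idea: according to the commit message, PostgreSQL itself uses the *_minus_multi set-returning functions, whereas the query below substitutes the long-standing multirange difference operator and unnest(), and uses simplified date ranges instead of the timestamps above.

```sql
-- Simplified stand-ins: the row's original validity and the targeted portion.
WITH input AS (
    SELECT daterange('2025-04-02', '2026-04-02') AS row_range,
           daterange('2025-12-01', '2026-01-01') AS target
)
-- The part of the row that falls inside the targeted portion gets the new values.
SELECT 'updated part' AS kind, row_range * target AS valid_range
FROM input
UNION ALL
-- Whatever lies outside the target is kept with the old values ("temporal leftovers").
-- Multirange subtraction is used because plain range "-" raises an error
-- when the result would not be contiguous.
SELECT 'leftover', unnest(datemultirange(row_range) - datemultirange(target))
FROM input
ORDER BY valid_range;
```

The query yields three ranges: a leftover before December 2025, the updated December portion, and a leftover from January 2026 onward, which mirrors how one "second initial" row became three rows.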

Similarly, I can delete data:

=$ delete from test_table for portion of valid_range from now() to 'infinity' where id = 2;
DELETE 1

=$ select * from test_table order by id, valid_range;
 id │                            valid_range                            │     the_value
────┼───────────────────────────────────────────────────────────────────┼───────────────────
  1 │ ["2025-04-02 12:29:42.375018+02","2026-04-02 12:29:42.380359+02") │ initial
  1 │ ["2026-04-02 12:29:42.380359+02","2026-04-02 12:29:42.382341+02") │ updated
  2 │ ["2025-04-02 12:29:42.378174+02","2025-12-01 00:00:00+01")        │ second initial
  ... -- remaining rows
(7 rows)

Of course, I can also delete rows from a stretch of history:

=$ delete from test_table for portion of valid_range from '2025-10-01' to '2025-11-01' where id = 2;
DELETE 1

=$ select * from test_table where id = 2 order by valid_range;
 id │                            valid_range                            │     the_value
────┼───────────────────────────────────────────────────────────────────┼───────────────────
  2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02")        │ second initial
  2 │ ["2025-11-01 00:00:00+01","2025-12-01 00:00:00+01")               │ second initial
  ... -- remaining rows
(6 rows)

This may not be very intuitive, so let's look at the state of the id = 2 record at different points in time:

=$ select p, d.* from generate_series( '2025-04-01'::date, '2026-05-01'::date, '1 month'::interval) p left join lateral (select * from test_table where id = 2 and valid_range @> p ) d on (true);
           p            │   id   │                        valid_range                         │   the_value
────────────────────────┼────────┼────────────────────────────────────────────────────────────┼────────────────
 2025-04-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
 2025-05-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-06-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-07-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-08-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-09-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-10-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
 2025-11-01 00:00:00+01 │      2 │ ["2025-11-01 00:00:00+01","2025-12-01 00:00:00+01")        │ second initial
 2025-12-01 00:00:00+01 │ [null] │ [null]                                                     │ [null]
 2026-01-01 00:00:00+01 │ [null] │ [null]                                                     │ [null]
 2026-02-01 00:00:00+01 │ [null] │ [null]                                                     │ [null]
 2026-03-01 00:00:00+01 │ [null] │ [null]                                                     │ [null]
 2026-04-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
 2026-05-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
(14 rows)

A NULL in the id column simply means that no valid row with id = 2 existed at that time.
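
A more compact way to spot such holes, offered here only as a sketch, is to merge all ranges of one id with range_agg() (available since PostgreSQL 14). The inline rows below are simplified, made-up stand-ins for the id = 2 history; against the real table the aggregate would run over test_table grouped by id.

```sql
-- Hypothetical, simplified history for id = 2 with October 2025 deleted.
WITH history(id, valid_range) AS (
    VALUES
        (2, daterange('2025-04-02', '2025-10-01')),
        (2, daterange('2025-11-01', '2025-12-01')),
        (2, daterange('2025-12-01', '2026-01-01'))
)
-- range_agg() returns a multirange; adjacent ranges are merged,
-- so every break between its elements is a period with no valid row.
SELECT id, range_agg(valid_range) AS covered
FROM history
GROUP BY id;
```

Here the result has two elements, one ending at 2025-10-01 and one starting at 2025-11-01, so the deleted month shows up as the gap between them.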

Very cool. Thanks a lot to everyone involved.

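The last output above is wrong. I gave DeepSeek only one row of it, hoping to save it some tokens, and that backfired: instead of sticking strictly to the part it was given, it filled in the rest on its own based on the article's content. The table in the original article is: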

           p            │   id   │                        valid_range                         │   the_value
────────────────────────┼────────┼────────────────────────────────────────────────────────────┼────────────────
 2025-04-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
 2025-05-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-06-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-07-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-08-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-09-01 00:00:00+02 │      2 │ ["2025-04-02 12:29:42.378174+02","2025-10-01 00:00:00+02") │ second initial
 2025-10-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
 2025-11-01 00:00:00+01 │      2 │ ["2025-11-01 00:00:00+01","2025-12-01 00:00:00+01")        │ second initial
 2025-12-01 00:00:00+01 │      2 │ ["2025-12-01 00:00:00+01","2026-01-01 00:00:00+01")        │ december thing
 2026-01-01 00:00:00+01 │      2 │ ["2026-01-01 00:00:00+01","2026-04-02 12:33:39.740173+02") │ second initial
 2026-02-01 00:00:00+01 │      2 │ ["2026-01-01 00:00:00+01","2026-04-02 12:33:39.740173+02") │ second initial
 2026-03-01 00:00:00+01 │      2 │ ["2026-01-01 00:00:00+01","2026-04-02 12:33:39.740173+02") │ second initial
 2026-04-01 00:00:00+02 │      2 │ ["2026-01-01 00:00:00+01","2026-04-02 12:33:39.740173+02") │ second initial
 2026-05-01 00:00:00+02 │ [null] │ [null]                                                     │ [null]
(14 rows)

It got half of it right, which is already no small feat.
