Repository: https://github.com/relytcloud/pg_ducklake
Pull the Docker image
sudo docker pull docker.1ms.run/pgducklake/pgducklake:18-main
(enter the sudo password when prompted)
18-main: Pulling from pgducklake/pgducklake
d997cc310c98: Pull complete
b5ed69009603: Pull complete
cff374c7356c: Pull complete
cf8420628f40: Pull complete
159f3aaadd71: Pull complete
ec50f454fdf0: Pull complete
c5204b920d75: Pull complete
b3de9652abb2: Pull complete
f3779ed79afa: Pull complete
eaad140a72db: Pull complete
5be10d048583: Pull complete
1a1f20eb8102: Pull complete
c89a454fbafa: Pull complete
d45fcd79ff10: Pull complete
e80c313dbf75: Pull complete
9a5e596e136b: Pull complete
5bfc9ec5cce9: Pull complete
d72962811c59: Download complete
Digest: sha256:c7dcfa1bafa8e262fa4a6328f0e936f5b5eb3495d707df3defb9b8231d8b42fc
Status: Downloaded newer image for docker.1ms.run/pgducklake/pgducklake:18-main
docker.1ms.run/pgducklake/pgducklake:18-main
Run the container
sudo docker run -d -e POSTGRES_PASSWORD=duckdb -v /home/aaa/par:/par --network host --name pgducklake docker.1ms.run/pgducklake/pgducklake:18-main
b9dfe45ca98849164babacbf644877c7f8692265a991c16c1d56bf484c3b462f
sudo docker exec -it pgducklake psql
psql (18.3 (Debian 18.3-1.pgdg12+1))
Type "help" for help.
Test a column-store table in PostgreSQL
postgres=# CREATE TABLE row_store_table AS
SELECT i AS id, 'hello pg_ducklake' AS msg
FROM generate_series(1, 10000) AS i;
SELECT 10000
postgres=# CREATE TABLE col_store_table USING ducklake AS
SELECT *
FROM row_store_table;
SELECT 10000
postgres=# SELECT max(id) FROM col_store_table;
max
-------
10000
(1 row)
postgres=# CREATE TABLE titanic USING ducklake AS
SELECT * FROM read_csv('/par/lineitem.csv');
SELECT 59986052
postgres=# \timing on
Timing is on.
postgres=# select sum(L_QUANTITY) from titanic;
sum
------------
1529738036
(1 row)
Time: 87.255 ms
postgres=#
Inspect the physical storage files
postgres=# select * FROM ducklake.list_files('public', 'titanic');
Time: 4.600 ms
data_file | data_file_size_bytes | data_file_footer_size | data_file_encryption_key | delete_file | delete_file_size_bytes | delete_file_footer_size | delete_file_encryption_key
----------------------------------------------------------------------------------------------------------------+----------------------+-----------------------+--------------------------+-------------+------------------------+-------------------------+----------------------------
/var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-27a5-7b6a-a086-ed394fc5a8e9.parquet | 527428314 | 182108 | | | | |
/var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-3fb0-77ac-9702-0f8a51b83c8e.parquet | 524592064 | 181908 | | | | |
/var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-56f4-7e9d-a78f-cd792a917766.parquet | 538558467 | 186965 | | | | |
/var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-6fa9-73fc-9fa3-83c14927643c.parquet | 529799189 | 183423 | | | | |
/var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-885d-7000-be71-6bf3ecb992b1.parquet | 139023964 | 52960 | | | | |
(5 rows)
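As a rough size check, the five data_file_size_bytes values in the listing above can be added up with shell arithmetic. The sizes are copied from that listing and the row count is the one reported when the table was created. The variable names are only for illustration.

```shell
# data_file_size_bytes of the five Parquet files listed above
total=$((527428314 + 524592064 + 538558467 + 529799189 + 139023964))
# Row count reported by CREATE TABLE titanic ... AS SELECT
rows=59986052

echo "total bytes:   $total"
echo "total MiB:     $((total / 1024 / 1024))"
echo "bytes per row: $((total / rows))"
```

This comes to 2259401998 bytes (about 2154 MiB), or roughly 37 bytes per row with integer division.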
postgres=# delete from col_store_table where id=2000;
DELETE 1
Time: 14.703 ms
postgres=# SELECT sum(id) FROM col_store_table;
sum
----------
50003000
(1 row)
Time: 6.045 ms
postgres=# select * FROM ducklake.list_files('public', 'col_store_table');
Time: 6.112 ms
data_file | data_file_size_bytes | data_file_footer_size | data_file_encryption_key | delete_file | delete_file_size_bytes | delete_file_footer_size | delete_file_encryption_key
------------------------------------------------------------------------------------------------------------------------+----------------------+-----------------------+--------------------------+-------------+------------------------+-------------------------+----------------------------
/var/lib/postgresql/18/docker/pg_ducklake/public/col_store_table/ducklake-019d3c25-7e58-7cfd-a1ea-d83222a57b7a.parquet | 40468 | 302 | | | | |
(1 row)
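The delete works by marking the deleted rows in a separate file instead of rewriting the data file. However, the delete_file column in the listing above is empty, and I could not find where the delete file is stored.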
postgres=# \q
sudo docker exec -it pgducklake bash
(enter the sudo password when prompted)
postgres@kylin-pc:/$ ls /var/lib/postgresql/18/docker/pg_ducklake/public/col_store_table/
ducklake-019d3c25-7e58-7cfd-a1ea-d83222a57b7a.parquet
postgres@kylin-pc:/$ /par/duckdb
DuckDB v1.5.1 (Variegata)
Enter ".help" for usage hints.
memory D SELECT sum(id) FROM '/var/lib/postgresql/18/docker/pg_ducklake/public/col_store_table/ducklake-019d3c25-7e58-7cfd-a1ea-d83222a57b7a.parquet';
┌─────────────────┐
│     sum(id)     │
│     int128      │
├─────────────────┤
│        50005000 │
│ (50.01 million) │
└─────────────────┘
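Querying the Parquet file directly still returns the pre-delete state: the sum is 50005000 rather than 50003000, so the row with id 2000 is still physically present in the file.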
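The two sums are consistent with a merge-on-read delete: the data file is left untouched and readers skip the rows that are marked as deleted. Below is a minimal shell sketch of that idea, using the numbers from this session (ids 1 to 10000, id 2000 deleted). The variable names are made up for illustration, and this is only a model of the behavior, not how pg_ducklake is implemented.

```shell
# Model of a merge-on-read delete (illustrative only).
deleted_id=2000   # the row marked as deleted; the data itself is never rewritten
raw_sum=0         # what a reader sees if it ignores the delete marker
live_sum=0        # what a reader sees if it applies the delete marker

i=1
while [ "$i" -le 10000 ]; do
  raw_sum=$((raw_sum + i))
  # Skip the marked row when computing the "live" view
  if [ "$i" -ne "$deleted_id" ]; then
    live_sum=$((live_sum + i))
  fi
  i=$((i + 1))
done

echo "raw_sum=$raw_sum live_sum=$live_sum"
```

Here raw_sum corresponds to what DuckDB returns when it reads the Parquet file on its own, and live_sum corresponds to what PostgreSQL returns through the ducklake table. The difference between them is exactly the deleted id.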