This extension lets PostgreSQL access DuckLake data lakes.

Repository: https://github.com/relytcloud/pg_ducklake

  1. Pull the Docker image

    sudo docker pull docker.1ms.run/pgducklake/pgducklake:18-main
    (enter the sudo password when prompted)
    18-main: Pulling from pgducklake/pgducklake
    d997cc310c98: Pull complete
    b5ed69009603: Pull complete
    cff374c7356c: Pull complete
    cf8420628f40: Pull complete
    159f3aaadd71: Pull complete
    ec50f454fdf0: Pull complete
    c5204b920d75: Pull complete
    b3de9652abb2: Pull complete
    f3779ed79afa: Pull complete
    eaad140a72db: Pull complete
    5be10d048583: Pull complete
    1a1f20eb8102: Pull complete
    c89a454fbafa: Pull complete
    d45fcd79ff10: Pull complete
    e80c313dbf75: Pull complete
    9a5e596e136b: Pull complete
    5bfc9ec5cce9: Pull complete
    d72962811c59: Download complete
    Digest: sha256:c7dcfa1bafa8e262fa4a6328f0e936f5b5eb3495d707df3defb9b8231d8b42fc
    Status: Downloaded newer image for docker.1ms.run/pgducklake/pgducklake:18-main
    docker.1ms.run/pgducklake/pgducklake:18-main

  2. Run the container

    sudo docker run -d -e POSTGRES_PASSWORD=duckdb -v /home/aaa/par:/par --network host --name pgducklake docker.1ms.run/pgducklake/pgducklake:18-main
    b9dfe45ca98849164babacbf644877c7f8692265a991c16c1d56bf484c3b462f
    sudo docker exec -it pgducklake psql
    psql (18.3 (Debian 18.3-1.pgdg12+1))
    Type "help" for help.
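
Because the container was started with `--network host`, the server should also be reachable from the host itself, not only through `docker exec`. The sketch below builds a libpq-style connection URI that could be passed to `psql` or a driver. It assumes the image keeps the usual defaults (superuser `postgres`, database `postgres`, port 5432), which the transcript above does not confirm; only the password `duckdb` comes from the `docker run` command. The helper name `build_pg_uri` is made up for this example.

```python
from urllib.parse import quote


def build_pg_uri(user="postgres", password="duckdb", host="localhost",
                 port=5432, dbname="postgres"):
    """Build a postgresql:// URI, percent-encoding user, password and dbname.

    The defaults other than the password are assumptions about the image.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


uri = build_pg_uri()
print(uri)
```

The resulting string can be handed to the client directly, for example `psql "<uri>"`. Percent-encoding matters once the password contains characters such as `@` or `/`.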

  3. Test column-store tables in PostgreSQL

    postgres=# CREATE TABLE row_store_table AS
    SELECT i AS id, 'hello pg_ducklake' AS msg
    FROM generate_series(1, 10000) AS i;
    SELECT 10000
    postgres=# CREATE TABLE col_store_table USING ducklake AS
    SELECT *
    FROM row_store_table;
    SELECT 10000
    postgres=# SELECT max(id) FROM col_store_table;
      max
    -------
     10000
    (1 row)

    postgres=# CREATE TABLE titanic USING ducklake AS
    SELECT * FROM read_csv('/par/lineitem.csv');
    SELECT 59986052
    postgres=# \timing on
    Timing is on.
    postgres=# select sum(L_QUANTITY) from titanic;
        sum
    ------------
     1529738036
    (1 row)

    Time: 87.255 ms
    postgres=#

Inspecting the physical storage files

postgres=# select * FROM ducklake.list_files('public', 'titanic');
Time: 4.600 ms

                                                   data_file                                                    | data_file_size_bytes | data_file_footer_size | data_file_encryption_key | delete_file | delete_file_size_bytes | delete_file_footer_size | delete_file_encryption_key 
----------------------------------------------------------------------------------------------------------------+----------------------+-----------------------+--------------------------+-------------+------------------------+-------------------------+----------------------------
 /var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-27a5-7b6a-a086-ed394fc5a8e9.parquet |            527428314 |                182108 |                          |             |                        |                         | 
 /var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-3fb0-77ac-9702-0f8a51b83c8e.parquet |            524592064 |                181908 |                          |             |                        |                         | 
 /var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-56f4-7e9d-a78f-cd792a917766.parquet |            538558467 |                186965 |                          |             |                        |                         | 
 /var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-6fa9-73fc-9fa3-83c14927643c.parquet |            529799189 |                183423 |                          |             |                        |                         | 
 /var/lib/postgresql/18/docker/pg_ducklake/public/titanic/ducklake-019d3c29-885d-7000-be71-6bf3ecb992b1.parquet |            139023964 |                 52960 |                          |             |                        |                         | 
(5 rows)
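
To put these numbers in perspective, here is a small back-of-the-envelope calculation using only figures from the transcripts above: the five file sizes, the row count reported by `CREATE TABLE ... AS`, and the time of the `sum(L_QUANTITY)` query. The "scan rate" is a nominal figure from a single run; the query reads only one column, so it says little about how fast the full 2 GB could be read.

```python
# Sizes in bytes, copied from the data_file_size_bytes column above.
file_sizes = [527_428_314, 524_592_064, 538_558_467, 529_799_189, 139_023_964]
row_count = 59_986_052   # rows reported when the table was created
scan_ms = 87.255         # time reported for select sum(L_QUANTITY)

total_bytes = sum(file_sizes)
bytes_per_row = total_bytes / row_count
rows_per_second = row_count / (scan_ms / 1000)

print(f"total size: {total_bytes} bytes ({total_bytes / 2**30:.2f} GiB)")
print(f"bytes/row : {bytes_per_row:.1f}")
print(f"scan rate : {rows_per_second / 1e6:.0f} million rows/s (single column)")
```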



postgres=# delete from col_store_table where id=2000;
DELETE 1
Time: 14.703 ms

postgres=# SELECT sum(id) FROM col_store_table;
   sum    
----------
 50003000
(1 row)

Time: 6.045 ms

postgres=# select * FROM ducklake.list_files('public', 'col_store_table');
Time: 6.112 ms

                                                       data_file                                                        | data_file_size_bytes | data_file_footer_size | data_file_encryption_key | delete_file | delete_file_size_bytes | delete_file_footer_size | delete_file_encryption_key 
------------------------------------------------------------------------------------------------------------------------+----------------------+-----------------------+--------------------------+-------------+------------------------+-------------------------+----------------------------
 /var/lib/postgresql/18/docker/pg_ducklake/public/col_store_table/ducklake-019d3c25-7e58-7cfd-a1ea-d83222a57b7a.parquet |                40468 |                   302 |                          |             |                        |                         | 
(1 row)

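A delete does not rewrite the data file; the deleted rows are marked in a separate file instead. However, the delete_file column above is empty, and I could not find where that delete file is stored.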

postgres=# \q
sudo docker exec -it pgducklake bash
(enter the sudo password when prompted)
postgres@kylin-pc:/$ ls /var/lib/postgresql/18/docker/pg_ducklake/public/col_store_table/
ducklake-019d3c25-7e58-7cfd-a1ea-d83222a57b7a.parquet
postgres@kylin-pc:/$ /par/duckdb
DuckDB v1.5.1 (Variegata)
Enter ".help" for usage hints.
memory D SELECT sum(id) FROM '/var/lib/postgresql/18/docker/pg_ducklake/public/col_store_table/ducklake-019d3c25-7e58-7cfd-a1ea-d83222a57b7a.parquet';
┌─────────────────┐
│     sum(id)     │
│     int128      │
├─────────────────┤
│    50005000     │
│ (50.01 million) │
└─────────────────┘

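Querying the Parquet file directly still shows the pre-delete state: the sum is 50005000, so the row with id = 2000 is still physically present in the data file, while the table itself returns 50003000.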
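
The two sums can be reproduced with a toy model of this "mark, don't rewrite" approach. This is only an illustration of the idea and not DuckLake's actual on-disk format; in particular, where pg_ducklake keeps the delete marker is exactly what I could not determine above. The names `data_file`, `delete_file`, `read_raw` and `read_table` are invented for the sketch.

```python
# Toy model of merge-on-read deletes (NOT DuckLake's real file format).
data_file = list(range(1, 10_001))      # ids 1..10000; never rewritten

# The "delete file" records the positions of deleted rows in the data file.
delete_file = {data_file.index(2000)}   # DELETE ... WHERE id = 2000


def read_raw(data):
    """What a reader sees when it opens the data file directly."""
    return list(data)


def read_table(data, deleted_positions):
    """What the table reader sees: data rows minus the marked positions."""
    return [v for pos, v in enumerate(data) if pos not in deleted_positions]


raw_sum = sum(read_raw(data_file))
visible_sum = sum(read_table(data_file, delete_file))
print(raw_sum)      # 50005000, matches the direct Parquet query
print(visible_sum)  # 50003000, matches the query through PostgreSQL
```

The difference between the two results is exactly 2000, the id of the deleted row.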
