Installing DuckDB 1.5 with the official install script and testing the loading of various extensions

The previous attempt failed because the machine's clock was wrong.

aaa@kylin-pc:~$ curl https://install.duckdb.org | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (60) SSL certificate problem: certificate is not yet valid
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
aaa@kylin-pc:~$ curl -k https://install.duckdb.org | sh 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4054  100  4054    0     0   4791      0 --:--:-- --:--:-- --:--:--  4786

*** DuckDB Linux/MacOS installation script, version  ***


         .;odxdl,            
       .xXXXXXXXXKc          
       0XXXXXXXXXXXd  cooo:  
      ,XXXXXXXXXXXXK  OXXXXd 
       0XXXXXXXXXXXo  cooo:  
       .xXXXXXXXXKc          
         .;odxdl,  


curl: (60) SSL certificate problem: certificate is not yet valid              
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

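Once the system clock was set to the correct time, the installation went through. Note that curl -k in the failed attempt only skipped certificate verification when fetching the script itself; the curl call inside the script still verified the certificate and failed the same way.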

aaa@kylin-pc:~$ export https_proxy=http://proxy.aaa:8080/
aaa@kylin-pc:~$ curl https://install.duckdb.org | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4054  100  4054    0     0   4907      0 --:--:-- --:--:-- --:--:--  4907

*** DuckDB Linux/MacOS installation script, version 1.5.0 ***


         .;odxdl,            
       .xXXXXXXXXKc          
       0XXXXXXXXXXXd  cooo:  
      ,XXXXXXXXXXXXK  OXXXXd 
       0XXXXXXXXXXXo  cooo:  
       .xXXXXXXXXKc          
         .;odxdl,  


######################################################################## 100.0%

Successfully installed DuckDB 1.5.0 to /home/aaa/.duckdb/cli/1.5.0/duckdb
Updated symlink from /home/aaa/.duckdb/cli/latest/duckdb to
                     /home/aaa/.duckdb/cli/1.5.0/duckdb

Hint: Append the following line to your shell profile:
export PATH='/home/aaa/.duckdb/cli/latest':$PATH

To launch DuckDB 1.5.0 now, type
/home/aaa/.duckdb/cli/latest/duckdb

The install script does not add DuckDB's location to the search path; instead it prints the export command for you to run yourself.
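
The export line printed by the installer only lasts for the current shell session. A minimal sketch of making it persistent follows, assuming a bash login whose profile is ~/.bashrc; the helper name add_duckdb_to_path is hypothetical and not part of the installer.

```shell
# Append the installer's suggested PATH line to a shell profile, at most once.
# add_duckdb_to_path is a hypothetical helper; ~/.bashrc as the default is an assumption.
add_duckdb_to_path() {
    profile="${1:-$HOME/.bashrc}"
    # The single quotes mirror the installer's hint; \$PATH stays literal so that it is
    # expanded when the profile is sourced, not when this function runs.
    line="export PATH='$HOME/.duckdb/cli/latest':\$PATH"
    # -F: fixed string, -x: whole-line match, -q: quiet. A missing profile counts as "not found".
    if ! grep -Fxq "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

Calling add_duckdb_to_path with no argument targets ~/.bashrc; the change takes effect in new terminals or after sourcing the profile. Running it a second time adds nothing, because the exact line is already present.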

aaa@kylin-pc:~$ duckdb tpch10.db
Could not find command-not-found database. Run 'sudo apt update' to populate it.
duckdb: command not found
aaa@kylin-pc:~$ export PATH='/home/aaa/.duckdb/cli/latest':$PATH
aaa@kylin-pc:~$ duckdb tpch10.db
DuckDB v1.5.0 (Variegata)
Enter ".help" for usage hints.
tpch10 D .tables
 ────────────────────────────────────────────────────────────────────── tpch10 ─────────────────────────────────────────────────────────────────────── 
 ─────────────────────────────────────────────────────────────────────── main ──────────────────────────────────────────────────────────────────────── 
┌─────────────────────────┐┌─────────────────────┐┌───────────────────────┐┌───────────────────────┐┌─────────────────────────┐┌──────────────────────┐
│        lineitem         ││      supplier       ││       partsupp        ││         part          ││         orders          ││       customer       │
│                         ││                     ││                       ││                       ││                         ││                      │
│ l_orderkey      bigint  ││ s_suppkey   bigint  ││ ps_partkey    bigint  ││ p_partkey     bigint  ││ o_orderkey      bigint  ││ c_custkey    bigint  │
│ l_partkey       bigint  ││ s_name      varchar ││ ps_suppkey    bigint  ││ p_name        varchar ││ o_custkey       bigint  ││ c_name       varchar │
│ l_suppkey       bigint  ││ s_address   varchar ││ ps_availqty   bigint  ││ p_mfgr        varchar ││ o_orderstatus   varchar ││ c_address    varchar │
│ l_linenumber    bigint  ││ s_nationkey integer ││ ps_supplycost decimal ││ p_brand       varchar ││ o_totalprice    decimal ││ c_nationkey  integer │
│ l_quantity      decimal ││ s_phone     varchar ││ ps_comment    varchar ││ p_type        varchar ││ o_orderdate     date    ││ c_phone      varchar │
│ l_extendedprice decimal ││ s_acctbal   decimal ││                       ││ p_size        integer ││ o_orderpriority varchar ││ c_acctbal    decimal │
│ l_discount      decimal ││ s_comment   varchar ││   8.00 million rows   ││ p_container   varchar ││ o_clerk         varchar ││ c_mktsegment varchar │
│ l_tax           decimal ││                     │└───────────────────────┘│ p_retailprice decimal ││ o_shippriority  integer ││ c_comment    varchar │
│ l_returnflag    varchar ││     100000 rows     │┌───────────────────────┐│ p_comment     varchar ││ o_comment       varchar ││                      │
│ l_linestatus    varchar │└─────────────────────┘│        nation         ││                       ││                         ││  1.50 million rows   │
│ l_shipdate      date    │┌─────────────────────┐│                       ││   2.00 million rows   ││   15.00 million rows    │└──────────────────────┘
│ l_commitdate    date    ││       region        ││ n_nationkey   integer │└───────────────────────┘└─────────────────────────┘
│ l_receiptdate   date    ││                     ││ n_name        varchar │                                                    
│ l_shipinstruct  varchar ││ r_regionkey integer ││ n_regionkey   integer │                                                    
│ l_shipmode      varchar ││ r_name      varchar ││ n_comment     varchar │                                                    
│ l_comment       varchar ││ r_comment   varchar ││                       │                                                    
│                         ││                     ││        25 rows        │                                                    
│   59.99 million rows    ││       5 rows        │└───────────────────────┘                                                    
└─────────────────────────┘└─────────────────────┘                                                                             
tpch10 D 

tpch10 D copy lineitem to 'lineitem.csv'(header);
tpch10 D copy lineitem to 'lineitem.parquet';
tpch10 D .system ls -l lineitem*
ls: cannot access 'lineitem*': No such file or directory
System command returns 512
tpch10 D .system bash


aaa@kylin-pc:~$ ls -l lineitem.*
-rw-rw-r-- 1 aaa aaa 7907002656 Mar 12 08:26 lineitem.csv
-rw-rw-r-- 1 aaa aaa 2223320375 Mar 12 08:26 lineitem.parquet


aaa@kylin-pc:~$ exit
exit
System command returns 32512
tpch10 D .timer on
tpch10 D select sum(l_quantity) from 'lineitem.csv';
┌─────────────────┐
│ sum(l_quantity) │
│     double      │
├─────────────────┤
│  1529738036.0   │
│ (1.53 billion)  │
└─────────────────┘
Run Time (s): real 7.303 user 49.767565 sys 5.863118
tpch10 D select sum(l_quantity) from 'lineitem.parquet';
┌─────────────────┐
│ sum(l_quantity) │
│  decimal(38,2)  │
├─────────────────┤
│  1529738036.00  │
│ (1.53 billion)  │
└─────────────────┘
Run Time (s): real 0.121 user 0.339830 sys 0.069226
tpch10 D select sum(l_quantity) from lineitem;
┌─────────────────┐
│ sum(l_quantity) │
│  decimal(38,2)  │
├─────────────────┤
│  1529738036.00  │
│ (1.53 billion)  │
└─────────────────┘
Run Time (s): real 0.023 user 0.178912 sys 0.000488
tpch10 D 

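I opened a database file holding the TPC-H SF10 data and compared the speed of the same aggregate query against the CSV file, the Parquet file, and the database table (already loaded into memory).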
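
To put the figures above in proportion, the ratios can be worked out with awk. This is only a rough sketch: each timing comes from a single run, and ratio is a hypothetical helper name.

```shell
# Divide two numbers and print the result with one decimal place.
# awk does the floating-point arithmetic; ratio is a hypothetical helper name.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'; }

ratio 7.303 0.121             # CSV vs. Parquet wall time      -> prints 60.4
ratio 7.303 0.023             # CSV vs. database table time    -> prints 317.5
ratio 7907002656 2223320375   # CSV vs. Parquet file size      -> prints 3.6
```

So in this one run the Parquet file was about 3.6 times smaller than the CSV and roughly 60 times faster to aggregate, and the database table was faster again.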

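To load other extensions, the httpfs extension has to be downloaded manually first; only then can DuckDB use the proxy server internally. Because the download URL below uses plain http, curl also needs http_proxy set: with only https_proxy exported, the first attempt hung and had to be interrupted.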

aaa@kylin-pc:~/testcc$ export https_proxy=http://proxy.aaa:8080/
aaa@kylin-pc:~/testcc$ curl -LO http://extensions.duckdb.org/v1.5.0/linux_arm64/httpfs.duckdb_extension.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:08 --:--:--     0^C
aaa@kylin-pc:~/testcc$ export http_proxy=http://proxy.aaa:8080/
aaa@kylin-pc:~/testcc$ curl -LO http://extensions.duckdb.org/v1.5.0/linux_arm64/httpfs.duckdb_extension.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6753k  100 6753k    0     0   305k      0  0:00:22  0:00:22 --:--:--  333k
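
The download URL above follows a fixed pattern of version, platform and extension name. A small sketch that assembles it, so the same command can be reused for other core extensions; duckdb_extension_url is a hypothetical helper, and the pattern is simply copied from the curl command in this session. The platform string for a given machine can be checked with PRAGMA platform in the DuckDB shell.

```shell
# Build the download URL for a DuckDB extension from its three variable parts.
# duckdb_extension_url is a hypothetical helper; the URL pattern is taken from the
# curl command used in this session, not from a specification.
duckdb_extension_url() {
    version="$1"    # e.g. 1.5.0
    platform="$2"   # e.g. linux_arm64
    name="$3"       # e.g. httpfs
    printf 'http://extensions.duckdb.org/v%s/%s/%s.duckdb_extension.gz\n' \
        "$version" "$platform" "$name"
}
```

With that in place, the manual download becomes curl -LO "$(duckdb_extension_url 1.5.0 linux_arm64 httpfs)".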

The first extension is rusty_sheet, written by Mr. Zhang Zepeng (张泽鹏). Reading a local and a remote xlsx file looks like this.

aaa@kylin-pc:~$ duckdb
DuckDB v1.5.0 (Variegata)
Enter ".help" for usage hints.
memory D install 'testcc/httpfs.duckdb_extension.gz';
memory D load httpfs;
memory D install rusty_sheet from community;


memory D load rusty_sheet;
memory D select * from read_sheet('./car.xlsx');
┌─────────────────────────────┬─────────┬──────────────────────┬──────────────────┬───┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ 2022年度新能源汽车推广应用... │    B    │          C           │        D         │ ... │          I           │          J           │          K           │          L           │
│           varchar           │ varchar │       varchar        │     varchar      │ ... │       varchar        │       varchar        │       varchar        │       varchar        │
├─────────────────────────────┼─────────┼──────────────────────┼──────────────────┼───┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 地区                        │ 序号    │ 车辆生产企业         │ 车辆型号         │ ... │ 核定补助标准\n(万...  │ 应清算补助资金\n(...  │ 按整车企业取整后补...  │ 核减原因             │
│ NULL                        │ 1       │ 兰州广通新能源汽车...  │ 小计             │ ... │ NULL                 │ 0                    │ 0                    │ NULL                 │
│ NULL                        │ NULL    │ NULL                 │ LZG6105BEVBTI    │ ... │ 0                    │ 0                    │ NULL                 │ 核减4辆,原因为:不... │
└─────────────────────────────┴─────────┴──────────────────────┴──────────────────┴───┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
  907 rows (40 shown)                                                    use .last to show entire result                                                     12 columns (8 shown)
memory D select * from read_sheet('https://www.miit.gov.cn/cms_files/filemanager/1226211233/attach/20262/38cc5e2f54684007a4dedf52618399d3.xlsx');
┌────────────────────────────┬─────────┬──────────────────────┬───────────────────┬───┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ 2021年度新能源汽车推广应...  │    B    │          C           │         D         │ ... │          I           │          J           │          K           │          L           │
│          varchar           │ varchar │       varchar        │      varchar      │ ... │       varchar        │       varchar        │       varchar        │       varchar        │
├────────────────────────────┼─────────┼──────────────────────┼───────────────────┼───┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 地区                       │ 序号    │ 车辆生产企业         │ 车辆型号          │ ... │ 核定补助标准\n(万...  │ 应清算补助资金\n(...  │ 按整车企业取整后补...  │ 核减原因             │
│ NULL                       │ 1       │ 兰州广通新能源汽车...  │ 小计              │ ... │ NULL                 │ 4.68                 │ 5                    │ NULL                 │
│ NULL                       │ NULL    │ NULL                 │ LZG6119BEVH1      │ ... │ 0                    │ 0                    │ NULL                 │ 核减1辆,原因为:现... │
│ NULL                       │ NULL    │ NULL                 │ LZG6121BEVBT1     │ ... │ 4.68                 │ 4.68                 │ NULL                 │ 核减3辆,原因为:不... │
└────────────────────────────┴─────────┴──────────────────────┴───────────────────┴───┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
  531 rows (40 shown)                                                    use .last to show entire result                                                     12 columns (8 shown)

The second extension is read_dbf. Reading a local FoxBASE DBF file looks like this.

memory D install read_dbf from community;
memory D load read_dbf;
memory D SELECT * FROM read_dbf('sample.dbf');
┌─────────┬─────────────────────────────┬────────────────────────┬─────────────────┬──────────────────┬────────────┬─────────────────────┬───────────────┬────────────┬─────────┐
│ CUST_NO │            NAME             │         STREET         │      CITY       │    STATE_PROV    │ ZIP_PST_CD │       COUNTRY       │     PHONE     │ FRST_CNTCT │   ASD   │
│  int64  │           varchar           │        varchar         │     varchar     │     varchar      │  varchar   │       varchar       │    varchar    │    date    │ varchar │
├─────────┼─────────────────────────────┼────────────────────────┼─────────────────┼──────────────────┼────────────┼─────────────────────┼───────────────┼────────────┼─────────┤
│    1221 │ Kauai Dive Shoppe           │ 4-976 Sugarloaf Hwy    │ Kapaa Kauai     │ Hi               │ 94766      │ U.s.a.              │ 808-555-0269  │ 1990-04-03 │ NULL    │
│    1231 │ Unisco                      │ Po Box Z-547           │ Freeport        │ NULL             │ NULL       │ Bahamas             │ 809-555-3915  │ 1981-02-28 │ NULL    │
│    1351 │ Sight Diver                 │ 1 Neptune Lane         │ Kato Paphos     │ Dghdfghdfgh      │ NULL       │ Cyprus              │ 357-6-876708  │ 1990-04-12 │ NULL    │
│    1551 │ Marmot Divers Club          │ 872 Queen St.          │ Kitchener       │ Ontario          │ G3n 2e1    │ Canada              │ 519-555-5520  │ 1990-05-11 │ NULL    │
│    1560 │ The Depth Charge            │ 15243 Underwater Fwy.  │ Marathon        │ Fl               │ 35003      │ U.s.a.              │ 800-555-3798  │ 1990-05-18 │ NULL    │

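The third extension is postgres. For some reason the download kept failing inside DuckDB, so I had to download and install it manually. Reading from a local PostgreSQL container service looks like this.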

memory D install postgres;
IO Error:
Failed to download extension "postgres_scanner" at URL "https://extensions.duckdb.org/v1.5.0/linux_arm64/postgres_scanner.duckdb_extension.gz"
Extension "postgres_scanner" is an existing extension.

For more info, visit https://duckdb.org/docs/stable/extensions/troubleshooting?version=v1.5.0&platform=linux_arm64&extension=postgres_scanner (ERROR Timeout was reached)
memory D 

aaa@kylin-pc:~$ cd testcc
aaa@kylin-pc:~/testcc$ curl -LO https://extensions.duckdb.org/v1.5.0/linux_arm64/postgres_scanner.duckdb_extension.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12.4M  100 12.4M    0     0   656k      0  0:00:19  0:00:19 --:--:--  939k
aaa@kylin-pc:~/testcc$ cd ..
aaa@kylin-pc:~$ duckdb
DuckDB v1.5.0 (Variegata)
Enter ".help" for usage hints.
memory D install 'testcc/postgres_scanner.duckdb_extension.gz';
memory D install postgres;
memory D load postgres;
memory D ATTACH '' AS postgres_db (TYPE postgres);
IO Error:
Unable to connect to Postgres at "": connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such file or directory
	Is the server running locally and accepting connections on that socket?

memory D ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS db (TYPE postgres, READ_ONLY);


memory D SELECT * FROM postgres_query('db', 'SELECT * FROM version()');
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                         version                                                          │
│                                                         varchar                                                          │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PostgreSQL 18.3 (Debian 18.3-1.pgdg13+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 14.2.0-19) 14.2.0, 64-bit │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
memory D 

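ATTACH with an empty connection string failed; with connection parameters it worked. As the error message shows, the empty string falls back to the default Unix socket /tmp/.s.PGSQL.5432, which does not exist on the host because the server runs inside a container; host=127.0.0.1 makes the connection go over TCP instead.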
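
libpq, the client library behind the error message above, also reads connection settings from environment variables such as PGHOST, PGUSER and PGDATABASE. The sketch below assumes the postgres extension passes an empty connection string through to libpq unchanged, which was not verified in this session. The values mirror the working ATTACH command.

```shell
# Connection defaults for libpq, exported before starting duckdb.
# Assumption: the postgres extension hands an empty connection string to libpq,
# which then picks these up instead of the default Unix socket.
export PGHOST=127.0.0.1      # force TCP to the container's published port
export PGUSER=postgres
export PGDATABASE=postgres
```

If that assumption holds, ATTACH '' AS postgres_db (TYPE postgres); started from the same shell should then connect the same way as the explicit connection string.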

The corresponding table was created inside the container as follows:

aaa@kylin-pc:~$ sudo docker start pg18
Enter password
pg18
aaa@kylin-pc:~$ sudo docker exec -it pg18 bash
root@kylin-pc:/# su - postgres
postgres@kylin-pc:~$ psql
psql (18.3 (Debian 18.3-1.pgdg13+1))
Type "help" for help.


postgres=# CREATE TABLE t1 (
            id INTEGER PRIMARY KEY,
            percentage INTEGER CHECK (0 <= percentage AND percentage <= 100)
        );
CREATE TABLE
postgres=# INSERT INTO t1 VALUES (1, 5);
INSERT 0 1

The two ways to query it from DuckDB are:

memory D SELECT * FROM postgres_query('db', 'SELECT * FROM t1');
┌───────┬────────────┐
│  id   │ percentage │
│ int32 │   int32    │
├───────┼────────────┤
│     1 │          5 │
└───────┴────────────┘

memory D SELECT * FROM db.t1;
┌───────┬────────────┐
│  id   │ percentage │
│ int32 │   int32    │
├───────┼────────────┤
│     1 │          5 │
└───────┴────────────┘