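The previous attempt failed because the machine's clock was wrong, so the server certificate was rejected as not yet valid.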
aaa@kylin-pc:~$ curl https://install.duckdb.org | sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (60) SSL certificate problem: certificate is not yet valid
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
aaa@kylin-pc:~$ curl -k https://install.duckdb.org | sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4054 100 4054 0 0 4791 0 --:--:-- --:--:-- --:--:-- 4786
*** DuckDB Linux/MacOS installation script, version ***
.;odxdl,
.xXXXXXXXXKc
0XXXXXXXXXXXd cooo:
,XXXXXXXXXXXXK OXXXXd
0XXXXXXXXXXXo cooo:
.xXXXXXXXXKc
.;odxdl,
curl: (60) SSL certificate problem: certificate is not yet valid
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
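The transcript above also shows that `-k` only skipped verification for the first download: the installer script runs its own curl, which hit the same certificate error. The underlying check is a comparison between the local clock and the start of the certificate's validity period. A minimal sketch of that comparison follows; the helper name and the timestamps are made up for illustration and are not taken from the real certificate.

```shell
# Sketch: decide whether a certificate would be rejected as "not yet valid"
# given the local clock. Both arguments are Unix epoch seconds.
# cert_not_yet_valid is a hypothetical helper, not part of curl or openssl.
cert_not_yet_valid() {
    # $1 = local clock, $2 = start of the certificate validity period
    [ "$1" -lt "$2" ]
}

# Made-up example: a clock stuck at 2020-01-01 UTC versus a certificate
# whose validity starts on 2025-01-01 UTC.
if cert_not_yet_valid 1577836800 1735689600; then
    echo "clock is behind the certificate start date; fix the clock first"
fi
```

On a real machine, assuming openssl is installed, the start date could be read with `openssl s_client -connect install.duckdb.org:443 </dev/null 2>/dev/null | openssl x509 -noout -startdate` and compared with the output of `date`.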
Once the system clock was set to the correct time (and with https_proxy exported), the installer ran successfully.
aaa@kylin-pc:~$ export https_proxy=http://proxy.aaa:8080/
aaa@kylin-pc:~$ curl https://install.duckdb.org | sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4054 100 4054 0 0 4907 0 --:--:-- --:--:-- --:--:-- 4907
*** DuckDB Linux/MacOS installation script, version 1.5.0 ***
.;odxdl,
.xXXXXXXXXKc
0XXXXXXXXXXXd cooo:
,XXXXXXXXXXXXK OXXXXd
0XXXXXXXXXXXo cooo:
.xXXXXXXXXKc
.;odxdl,
######################################################################## 100.0%
Successfully installed DuckDB 1.5.0 to /home/aaa/.duckdb/cli/1.5.0/duckdb
Updated symlink from /home/aaa/.duckdb/cli/latest/duckdb to
/home/aaa/.duckdb/cli/1.5.0/duckdb
Hint: Append the following line to your shell profile:
export PATH='/home/aaa/.duckdb/cli/latest':$PATH
To launch DuckDB 1.5.0 now, type
/home/aaa/.duckdb/cli/latest/duckdb
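The hint printed by the installer only takes effect for the current shell unless it is written to a profile file. One possible way to do that without creating duplicate entries on repeated runs is sketched below. The helper name `append_once` is hypothetical, and the demonstration writes to a scratch file rather than a real profile.

```shell
# Sketch: append a line to a profile file only if it is not already there.
# append_once is a hypothetical helper name; $1 = file, $2 = exact line.
append_once() {
    touch "$1"
    # -x matches whole lines, -F treats the pattern as a fixed string
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Demonstrated on a scratch file so that nothing real is modified.
# For actual use, replace "$profile" with e.g. ~/.bashrc (assuming bash).
profile=$(mktemp)
path_line="export PATH='/home/aaa/.duckdb/cli/latest':\$PATH"
append_once "$profile" "$path_line"
append_once "$profile" "$path_line"   # second call is a no-op
cat "$profile"
```

The line takes effect in new shells, or immediately after sourcing the profile file.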
The install script does not add the duckdb location to the search path; it only prints the command and leaves it to the user to run.
aaa@kylin-pc:~$ duckdb tpch10.db
Could not find command-not-found database. Run 'sudo apt update' to populate it.
duckdb: command not found
aaa@kylin-pc:~$ export PATH='/home/aaa/.duckdb/cli/latest':$PATH
aaa@kylin-pc:~$ duckdb tpch10.db
DuckDB v1.5.0 (Variegata)
Enter ".help" for usage hints.
tpch10 D .tables
────────────────────────────────────────────────────────────────────── tpch10 ───────────────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────────────── main ────────────────────────────────────────────────────────────────────────
┌─────────────────────────┐┌─────────────────────┐┌───────────────────────┐┌───────────────────────┐┌─────────────────────────┐┌──────────────────────┐
│ lineitem ││ supplier ││ partsupp ││ part ││ orders ││ customer │
│ ││ ││ ││ ││ ││ │
│ l_orderkey bigint ││ s_suppkey bigint ││ ps_partkey bigint ││ p_partkey bigint ││ o_orderkey bigint ││ c_custkey bigint │
│ l_partkey bigint ││ s_name varchar ││ ps_suppkey bigint ││ p_name varchar ││ o_custkey bigint ││ c_name varchar │
│ l_suppkey bigint ││ s_address varchar ││ ps_availqty bigint ││ p_mfgr varchar ││ o_orderstatus varchar ││ c_address varchar │
│ l_linenumber bigint ││ s_nationkey integer ││ ps_supplycost decimal ││ p_brand varchar ││ o_totalprice decimal ││ c_nationkey integer │
│ l_quantity decimal ││ s_phone varchar ││ ps_comment varchar ││ p_type varchar ││ o_orderdate date ││ c_phone varchar │
│ l_extendedprice decimal ││ s_acctbal decimal ││ ││ p_size integer ││ o_orderpriority varchar ││ c_acctbal decimal │
│ l_discount decimal ││ s_comment varchar ││ 8.00 million rows ││ p_container varchar ││ o_clerk varchar ││ c_mktsegment varchar │
│ l_tax decimal ││ │└───────────────────────┘│ p_retailprice decimal ││ o_shippriority integer ││ c_comment varchar │
│ l_returnflag varchar ││ 100000 rows │┌───────────────────────┐│ p_comment varchar ││ o_comment varchar ││ │
│ l_linestatus varchar │└─────────────────────┘│ nation ││ ││ ││ 1.50 million rows │
│ l_shipdate date │┌─────────────────────┐│ ││ 2.00 million rows ││ 15.00 million rows │└──────────────────────┘
│ l_commitdate date ││ region ││ n_nationkey integer │└───────────────────────┘└─────────────────────────┘
│ l_receiptdate date ││ ││ n_name varchar │
│ l_shipinstruct varchar ││ r_regionkey integer ││ n_regionkey integer │
│ l_shipmode varchar ││ r_name varchar ││ n_comment varchar │
│ l_comment varchar ││ r_comment varchar ││ │
│ ││ ││ 25 rows │
│ 59.99 million rows ││ 5 rows │└───────────────────────┘
└─────────────────────────┘└─────────────────────┘
tpch10 D
tpch10 D copy lineitem to 'lineitem.csv'(header);
tpch10 D copy lineitem to 'lineitem.parquet';
tpch10 D .system ls -l lineitem*
ls: cannot access 'lineitem*': No such file or directory
System command returns 512
tpch10 D .system bash
aaa@kylin-pc:~$ ls -l lineitem.*
-rw-rw-r-- 1 aaa aaa 7907002656 Mar 12 08:26 lineitem.csv
-rw-rw-r-- 1 aaa aaa 2223320375 Mar 12 08:26 lineitem.parquet
aaa@kylin-pc:~$ exit
exit
System command returns 32512
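The numbers reported after `.system` (512 and 32512) do not look like plain exit codes. They are consistent with a raw wait status as returned by the C `system()` call, although that is an assumption about the CLI and not something the session confirms. Under that assumption the exit code sits in the second byte, as this sketch shows; `exit_code_of` is a hypothetical helper name.

```shell
# Sketch: extract the exit code from a raw wait status, assuming the usual
# Unix encoding where a normal exit stores the code in bits 8..15.
# exit_code_of is a hypothetical helper name.
exit_code_of() {
    echo $(( ($1 >> 8) & 255 ))
}

exit_code_of 512     # prints 2
exit_code_of 32512   # prints 127
```

Exit code 2 is what GNU ls uses when it cannot access a command-line argument, which matches the error message above. Exit code 127 is the value shells conventionally use for "command not found".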
tpch10 D .timer on
tpch10 D select sum(l_quantity) from 'lineitem.csv';
┌─────────────────┐
│ sum(l_quantity) │
│ double │
├─────────────────┤
│ 1529738036.0 │
│ (1.53 billion) │
└─────────────────┘
Run Time (s): real 7.303 user 49.767565 sys 5.863118
tpch10 D select sum(l_quantity) from 'lineitem.parquet';
┌─────────────────┐
│ sum(l_quantity) │
│ decimal(38,2) │
├─────────────────┤
│ 1529738036.00 │
│ (1.53 billion) │
└─────────────────┘
Run Time (s): real 0.121 user 0.339830 sys 0.069226
tpch10 D select sum(l_quantity) from lineitem;
┌─────────────────┐
│ sum(l_quantity) │
│ decimal(38,2) │
├─────────────────┤
│ 1529738036.00 │
│ (1.53 billion) │
└─────────────────┘
Run Time (s): real 0.023 user 0.178912 sys 0.000488
tpch10 D
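This session opened a database file holding the TPC-H SF10 data and timed the same aggregate over a CSV file, a Parquet file, and the database file itself (already loaded into memory).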
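The ratios between these measurements can be worked out from the "real" times and file sizes shown in the session. The sketch below only redoes that arithmetic; `ratio` is a hypothetical helper, and a single run per format is not a rigorous benchmark.

```shell
# Sketch: ratio of two measurements, rounded to one decimal place.
# ratio is a hypothetical helper; $1 = larger value, $2 = smaller value.
ratio() {
    awk -v big="$1" -v small="$2" 'BEGIN { printf "%.1f\n", big / small }'
}

ratio 7.303 0.121             # CSV vs. Parquet query time: 60.4
ratio 7.303 0.023             # CSV vs. database table query time: 317.5
ratio 0.121 0.023             # Parquet vs. database table query time: 5.3
ratio 7907002656 2223320375   # CSV vs. Parquet file size in bytes: 3.6
```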
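To load other extensions, the httpfs extension first has to be downloaded by hand; only then can DuckDB use the proxy server internally. The download URL below uses plain http, so curl needs http_proxy rather than https_proxy: the first attempt stalled and was interrupted, and the second succeeded after http_proxy was exported.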
aaa@kylin-pc:~/testcc$ export https_proxy=http://proxy.aaa:8080/
aaa@kylin-pc:~/testcc$ curl -LO http://extensions.duckdb.org/v1.5.0/linux_arm64/httpfs.duckdb_extension.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:08 --:--:-- 0^C
aaa@kylin-pc:~/testcc$ export http_proxy=http://proxy.aaa:8080/
aaa@kylin-pc:~/testcc$ curl -LO http://extensions.duckdb.org/v1.5.0/linux_arm64/httpfs.duckdb_extension.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 6753k 100 6753k 0 0 305k 0 0:00:22 0:00:22 --:--:-- 333k
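The stalled first attempt and the successful second one differ only in which proxy variable was set. curl picks the variable from the URL scheme, which the following sketch mirrors for the two schemes used here; `proxy_var_for` is a hypothetical helper, and other schemes are deliberately left out.

```shell
# Sketch: name the environment variable curl consults for a given URL,
# based only on the URL scheme. proxy_var_for is a hypothetical helper.
proxy_var_for() {
    case "$1" in
        https://*) echo https_proxy ;;
        http://*)  echo http_proxy ;;
        *)         return 1 ;;   # other schemes are out of scope here
    esac
}

proxy_var_for http://extensions.duckdb.org/v1.5.0/linux_arm64/httpfs.duckdb_extension.gz
proxy_var_for https://install.duckdb.org
```

Exporting both variables at the start of the session avoids having to think about the scheme of each URL.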
The first extension is rusty_sheet, written by 张泽鹏 (Zhang Zepeng). Reading a local and a remote xlsx file looks like this.
aaa@kylin-pc:~$ duckdb
DuckDB v1.5.0 (Variegata)
Enter ".help" for usage hints.
memory D install 'testcc/httpfs.duckdb_extension.gz';
memory D load httpfs;
memory D install rusty_sheet from community;
memory D load rusty_sheet;
memory D select * from read_sheet('./car.xlsx');
┌─────────────────────────────┬─────────┬──────────────────────┬──────────────────┬───┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ 2022年度新能源汽车推广应用... │ B │ C │ D │ ... │ I │ J │ K │ L │
│ varchar │ varchar │ varchar │ varchar │ ... │ varchar │ varchar │ varchar │ varchar │
├─────────────────────────────┼─────────┼──────────────────────┼──────────────────┼───┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 地区 │ 序号 │ 车辆生产企业 │ 车辆型号 │ ... │ 核定补助标准\n(万... │ 应清算补助资金\n(... │ 按整车企业取整后补... │ 核减原因 │
│ NULL │ 1 │ 兰州广通新能源汽车... │ 小计 │ ... │ NULL │ 0 │ 0 │ NULL │
│ NULL │ NULL │ NULL │ LZG6105BEVBTI │ ... │ 0 │ 0 │ NULL │ 核减4辆,原因为:不... │
└─────────────────────────────┴─────────┴──────────────────────┴──────────────────┴───┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
907 rows (40 shown) use .last to show entire result 12 columns (8 shown)
memory D select * from read_sheet('https://www.miit.gov.cn/cms_files/filemanager/1226211233/attach/20262/38cc5e2f54684007a4dedf52618399d3.xlsx');
┌────────────────────────────┬─────────┬──────────────────────┬───────────────────┬───┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ 2021年度新能源汽车推广应... │ B │ C │ D │ ... │ I │ J │ K │ L │
│ varchar │ varchar │ varchar │ varchar │ ... │ varchar │ varchar │ varchar │ varchar │
├────────────────────────────┼─────────┼──────────────────────┼───────────────────┼───┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 地区 │ 序号 │ 车辆生产企业 │ 车辆型号 │ ... │ 核定补助标准\n(万... │ 应清算补助资金\n(... │ 按整车企业取整后补... │ 核减原因 │
│ NULL │ 1 │ 兰州广通新能源汽车... │ 小计 │ ... │ NULL │ 4.68 │ 5 │ NULL │
│ NULL │ NULL │ NULL │ LZG6119BEVH1 │ ... │ 0 │ 0 │ NULL │ 核减1辆,原因为:现... │
│ NULL │ NULL │ NULL │ LZG6121BEVBT1 │ ... │ 4.68 │ 4.68 │ NULL │ 核减3辆,原因为:不... │
└────────────────────────────┴─────────┴──────────────────────┴───────────────────┴───┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
531 rows (40 shown) use .last to show entire result 12 columns (8 shown)
The second extension is read_dbf. Reading a local FoxBASE DBF file looks like this.
memory D install read_dbf from community;
memory D load read_dbf;
memory D SELECT * FROM read_dbf('sample.dbf');
┌─────────┬─────────────────────────────┬────────────────────────┬─────────────────┬──────────────────┬────────────┬─────────────────────┬───────────────┬────────────┬─────────┐
│ CUST_NO │ NAME │ STREET │ CITY │ STATE_PROV │ ZIP_PST_CD │ COUNTRY │ PHONE │ FRST_CNTCT │ ASD │
│ int64 │ varchar │ varchar │ varchar │ varchar │ varchar │ varchar │ varchar │ date │ varchar │
├─────────┼─────────────────────────────┼────────────────────────┼─────────────────┼──────────────────┼────────────┼─────────────────────┼───────────────┼────────────┼─────────┤
│ 1221 │ Kauai Dive Shoppe │ 4-976 Sugarloaf Hwy │ Kapaa Kauai │ Hi │ 94766 │ U.s.a. │ 808-555-0269 │ 1990-04-03 │ NULL │
│ 1231 │ Unisco │ Po Box Z-547 │ Freeport │ NULL │ NULL │ Bahamas │ 809-555-3915 │ 1981-02-28 │ NULL │
│ 1351 │ Sight Diver │ 1 Neptune Lane │ Kato Paphos │ Dghdfghdfgh │ NULL │ Cyprus │ 357-6-876708 │ 1990-04-12 │ NULL │
│ 1551 │ Marmot Divers Club │ 872 Queen St. │ Kitchener │ Ontario │ G3n 2e1 │ Canada │ 519-555-5520 │ 1990-05-11 │ NULL │
│ 1560 │ The Depth Charge │ 15243 Underwater Fwy. │ Marathon │ Fl │ 35003 │ U.s.a. │ 800-555-3798 │ 1990-05-18 │ NULL │
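The third extension is postgres. For reasons unknown, downloading it from within DuckDB failed repeatedly, so it had to be downloaded and installed by hand. Querying the local PostgreSQL container service looks like this.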
memory D install postgres;
IO Error:
Failed to download extension "postgres_scanner" at URL "https://extensions.duckdb.org/v1.5.0/linux_arm64/postgres_scanner.duckdb_extension.gz"
Extension "postgres_scanner" is an existing extension.
For more info, visit https://duckdb.org/docs/stable/extensions/troubleshooting?version=v1.5.0&platform=linux_arm64&extension=postgres_scanner (ERROR Timeout was reached)
memory D
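The URL in the error message has the same shape as the httpfs download earlier: version, platform, then the extension file name. Note that the file is named postgres_scanner even though the extension is installed as postgres. A small sketch that assembles such a URL is given below; `extension_url` is a hypothetical helper, and the pattern is inferred only from the URLs that appear in this session.

```shell
# Sketch: build a download URL for a core extension from its parts.
# extension_url is a hypothetical helper; the URL layout is inferred from
# the URLs shown in this session, not from a documented contract.
extension_url() {
    # $1 = DuckDB version, $2 = platform, $3 = extension file name
    printf 'https://extensions.duckdb.org/%s/%s/%s.duckdb_extension.gz\n' "$1" "$2" "$3"
}

extension_url v1.5.0 linux_arm64 postgres_scanner
```

The resulting file can then be fetched with `curl -LO` and installed from its local path, as the session goes on to do.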
aaa@kylin-pc:~$ cd testcc
aaa@kylin-pc:~/testcc$ curl -LO https://extensions.duckdb.org/v1.5.0/linux_arm64/postgres_scanner.duckdb_extension.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 12.4M 100 12.4M 0 0 656k 0 0:00:19 0:00:19 --:--:-- 939k
aaa@kylin-pc:~/testcc$ cd ..
aaa@kylin-pc:~$ duckdb
DuckDB v1.5.0 (Variegata)
Enter ".help" for usage hints.
memory D install 'testcc/postgres_scanner.duckdb_extension.gz';
memory D install postgres;
memory D load postgres;
memory D ATTACH '' AS postgres_db (TYPE postgres);
IO Error:
Unable to connect to Postgres at "": connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such file or directory
Is the server running locally and accepting connections on that socket?
memory D ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS db (TYPE postgres, READ_ONLY);
memory D SELECT * FROM postgres_query('db', 'SELECT * FROM version()');
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ version │
│ varchar │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PostgreSQL 18.3 (Debian 18.3-1.pgdg13+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 14.2.0-19) 14.2.0, 64-bit │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
memory D
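ATTACH with an empty connection string failed; with explicit connection parameters it succeeded. As the error message shows, the empty string falls back to the Unix socket /tmp/.s.PGSQL.5432, which does not exist on the host, presumably because the server runs inside a container. Specifying host=127.0.0.1 forces a TCP connection instead.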
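An alternative that was not tried in this session is to supply the connection defaults through the standard libpq environment variables before starting duckdb. This assumes the postgres extension hands an empty connection string to libpq, which then reads these variables; the values simply repeat the ones from the working ATTACH statement.

```shell
# Sketch: provide connection defaults through standard libpq variables so
# that an empty connection string no longer falls back to a Unix socket.
# Untested here; the values repeat those of the working ATTACH statement.
export PGHOST=127.0.0.1
export PGUSER=postgres
export PGDATABASE=postgres
# Start duckdb from this same shell afterwards so it inherits the variables.
```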
The corresponding statements for creating the table inside the container are as follows:
aaa@kylin-pc:~$ sudo docker start pg18
(enter password)
pg18
aaa@kylin-pc:~$ sudo docker exec -it pg18 bash
root@kylin-pc:/# su - postgres
postgres@kylin-pc:~$ psql
psql (18.3 (Debian 18.3-1.pgdg13+1))
Type "help" for help.
postgres=# CREATE TABLE t1 (
id INTEGER PRIMARY KEY,
percentage INTEGER CHECK (0 <= percentage AND percentage <= 100)
);
CREATE TABLE
postgres=# INSERT INTO t1 VALUES (1, 5);
INSERT 0 1
In DuckDB, the table can be queried in two ways:
memory D SELECT * FROM postgres_query('db', 'SELECT * FROM t1');
┌───────┬────────────┐
│ id │ percentage │
│ int32 │ int32 │
├───────┼────────────┤
│ 1 │ 5 │
└───────┴────────────┘
memory D SELECT * FROM db.t1;
┌───────┬────────────┐
│ id │ percentage │
│ int32 │ int32 │
├───────┼────────────┤
│ 1 │ 5 │
└───────┴────────────┘