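Recent DuckDB 1.6 dev builds page execution-plan output by default; the pager is less on Linux and more on Windows.
Some Linux environments, however, such as the Docker debian 13 image, do not ship less.
Displaying a long execution plan there fails with an error: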
root@kylin:/par# ./duckdb0701
DuckDB v1.6.0-dev10007 (Development Version, 2daa4fc9a4)
explain SELECT d.bucket, count(*) AS cnt, avg(b.measure) AS avg_measure
FROM fact_a a
JOIN dim d USING (keyid)
JOIN fact_b b USING (keyid)
GROUP BY d.bucket;
sh: 1: less: not found
memory D
Feeding a script to DuckDB directly from the Linux command line, by contrast, does produce output, because the pager is not invoked:
root@kylin:/par# ./duckdb0701 <a.sql
╭─ Order By ────────────────────────────────╮
│ Order By: d.bucket ASC │
│ ~2,000 rows │
╰─────────────────────┬─────────────────────╯
╭─ Projection ────────┴─────────────────────╮
│ Projections: #0, #1, │
│ "/"(#2, CAST(#3 AS DOUBLE)) │
│ ~2,000 rows │
╰─────────────────────┬─────────────────────╯
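Since the redirected run above does not go through the pager, one way to read a long plan is to save it to a file first. The following is a minimal sketch; save_plan is a hypothetical helper name, and the binary and file names in the usage comment are simply the ones from the transcript.

```shell
# save_plan is a hypothetical helper, not part of DuckDB.
#   $1 = path to the DuckDB binary
#   $2 = SQL script to feed on standard input
#   $3 = file that receives the output
save_plan() {
  "$1" < "$2" > "$3"
}

# Usage, with the names from the transcript above:
#   save_plan ./duckdb0701 a.sql plan.txt
```

The saved file can then be opened with whatever viewer or editor the environment has.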
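I tried copying the less binary and the shared library it needs from another system and adding their directory to the search paths, but the output came out garbled: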
kylin@kylin:/data/i$ whereis less
less: /usr/bin/less /bin/less /usr/share/man/man1/less.1.gz
kylin@kylin:/data/i$ cp /usr/bin/less .
kylin@kylin:/data/i$ ldd less
linux-vdso.so.1 => (0x0000007f8eb29000)
/usr/lib/libzfh.so (0x0000007f8e89e000)
libtinfo.so.5 => /lib/aarch64-linux-gnu/libtinfo.so.5 (0x0000007f8e848000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007f8e701000)
/lib/ld-linux-aarch64.so.1 (0x0000007f8eafe000)
libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007f8e6d5000)
libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000007f8e6c2000)
kylin@kylin:/data/i$ cp /lib/aarch64-linux-gnu/libtinfo.so.5 .
kylin@kylin:/data/i$
root@kylin:/par# export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
root@kylin:/par# ./less
Missing filename ("less --help" for help)
root@kylin:/par# export PATH=./:$PATH
root@kylin:/par# ./duckdb0701
DuckDB v1.6.0-dev10007 (Development Version, 2daa4fc9a4)
Enter ".help" for usage hints.
memory D .read a.sql
<E2><95><AD><E2><94><80> Order By <E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94>
<E2><94><82> Order By: d.bucket ASC <E2><94><82>
<E2><94><82> ~2,000 rows <E2><94><82>
<E2><95><B0><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94>
<E2><95><AD><E2><94><80> Projection <E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><80><E2><94><B4><E2><94><80><E2><94><80><E2><94><80><E2><94>
<E2><94><82> Projections: #0, #1, <E2><94><82>
<E2><94><82> "/"(#2, CAST(#3 AS DOUBLE)) <E2><94><82>
<E2><94><82> ~2,000 rows <E2><94><82>
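The `<E2><95><AD>` tokens above are how less shows bytes it does not treat as printable characters. The bytes are the UTF-8 encoding of the box-drawing characters, which the short sketch below confirms for the first two tokens. The LESSCHARSET setting at the end is an assumption that was not tested in the original environment: less documents this variable, and it also looks for "UTF-8" in LC_ALL, LC_CTYPE, or LANG, which a minimal container image may leave unset.

```shell
# <E2><95><AD> and <E2><94><80> are the first two tokens of the garbled line.
# Octal escapes are used because every POSIX printf understands them.
decoded=$(printf '\342\225\255\342\224\200')
printf '%s\n' "$decoded"

# Assumption, untested here: telling less that its input is UTF-8 may let it
# render these bytes as characters instead of <XX> tokens.
export LESSCHARSET=utf-8
```

If the variable makes no difference in a given environment, the cat workaround described next still applies.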
Setting the pager to cat (which effectively turns paging off) gives normal output:
root@kylin:/par# export PAGER=cat
root@kylin:/par# ./duckdb0701
DuckDB v1.6.0-dev10007 (Development Version, 2daa4fc9a4)
Enter ".help" for usage hints.
memory D .read a.sql
╭─ Order By ────────────────────────────────╮
│ Order By: d.bucket ASC │
│ ~2,000 rows │
╰─────────────────────┬─────────────────────╯
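The workaround above can be made conditional, so that cat is used only when no preferred pager is installed. This is a minimal sketch: choose_pager is a hypothetical helper, and the candidate list is up to the reader (given the garbled output seen earlier, less may not be a good candidate everywhere).

```shell
# choose_pager is a hypothetical helper, not part of DuckDB.
# It prints the first candidate found on PATH, or "cat" if none is found.
choose_pager() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  printf 'cat\n'
}

# Export the result so that a DuckDB shell started afterwards picks it up.
PAGER=$(choose_pager less more)
export PAGER
```

Placing these lines in a shell startup file or a container entrypoint would apply the choice to every session.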
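On Windows, less is not available by default, so DuckDB pages through more, which breaks the layout of the plan (box borders go missing and lines run together):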
memory D .read a2.sql
╭─ Summary ───────────╮
│ Total Time: 0.0385s │
╭─ Order By ───────────────────────╮
│ Order By: final_agg.bucket ASC │
│ 100 rows 10.0ms │
╭─ Projection ────┴────────────────╮
│ 100 rows 0µs │
╭─ Hash Group By ─┴────────────────╮
│ Groups: #0 │
│ Aggregates: sum(#1), sum(#2) │
│ 100 rows 40.0ms │
╭─ Projection ────┴────────────────╮
│ 2,000 rows 0µs │
╭─ Hash Join ─────┴────────────────╮
│ 2,000 rows 0µs │
╭─ Hash Join ─────┴────────────────╮ ╭─────────────────┴────────────────╮
│ 2,000 rows 0µs │ │ Hash Group By 2,000 rows · 0µs │
╰─────────────────┬────────────────╯ │ Projection 150,000 rows · 0µs │
-- More --
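The Windows type command prints plain text, so I tried set PAGER=type, but that fails too, possibly because type is a cmd built-in rather than an executable file and so cannot be found. Another plausible cause is that type requires a file name argument and does not read standard input: the error shown is the message type prints when it is run with no arguments.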
C:\d>set PAGER=type
C:\d>duckdb0701
DuckDB v1.6.0-dev10027 (Development Version, 3cb65aa794)
Enter ".help" for usage hints.
memory D .read a2.sql
The syntax of the command is incorrect.
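How can Linux commands be used on Windows? One option is the MSYS2 toolset. Download the installer from the Tsinghua mirror (https://mirrors.tuna.tsinghua.edu.cn/msys2/distrib/x86_64/), install it, and add the usr\bin directory of the installation to PATH.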
C:\d>set path=C:\d\msys64\usr\bin;%path%
C:\d>set PAGER=cat
C:\d>duckdb0701
DuckDB v1.6.0-dev10027 (Development Version, 3cb65aa794)
Enter ".help" for usage hints.
memory D .read a2.sql
╭─ Summary ───────────╮
│ Total Time: 0.0154s │
╰─────────────────────╯
╭──────────────────────────────────╮
│ Order By 100 rows · 0µs │
│ Projection 100 rows · 0µs │
╰─────────────────┬────────────────╯
╭─ Hash Group By ─┴────────────────╮
│ Groups: #0 │
│ Aggregates: sum(#1), sum(#2) │
│ 100 rows 20.0ms │
╰─────────────────┬────────────────╯
╭─ Projection ────┴────────────────╮
│ 2,000 rows 0µs │
╰─────────────────┬────────────────╯
Note that MSYS2 also ships less, but it produces the same garbled output, so do not use it as the pager.
The script used for testing is:
explain analyze WITH
fact_a AS(
SELECT LEAST(1999, FLOOR(2000 * POW(i::DOUBLE / 150000, 3.74)))::INTEGER AS keyid
FROM range(150000) t(i)),
fact_b AS(
SELECT
LEAST(1999, FLOOR(2000 * POW(i::DOUBLE / 150000, 3.74)))::INTEGER AS keyid,
(((i * 13) % 1000)::DOUBLE / 7.0) AS measure
FROM range(150000) t(i)),
dim AS(
SELECT i::INTEGER AS keyid, (i % 100)::INTEGER AS bucket
FROM range(2000) t(i)),
-- group fact_a by keyid and count the rows per keyid
agg_a AS (
SELECT
keyid,
COUNT(*) AS cnt_per_keyid
FROM fact_a
GROUP BY keyid
),
-- group fact_b by keyid; compute the row count and the sum of measure per keyid
agg_b AS (
SELECT
keyid,COUNT(*) AS cnt_per_keyid2,
SUM(measure) AS sum_measure_per_keyid
FROM fact_b
GROUP BY keyid
),
-- join agg_a, agg_b and dim, then aggregate by bucket
final_agg AS (
SELECT
d.bucket,
SUM(a.cnt_per_keyid * b.cnt_per_keyid2) AS cnt,
-- in the un-aggregated join, each fact_b row repeats once per matching fact_a row
SUM(b.sum_measure_per_keyid * a.cnt_per_keyid) / cnt AS avg_measure
FROM agg_a a
JOIN agg_b b ON a.keyid = b.keyid
JOIN dim d ON a.keyid = d.keyid
GROUP BY d.bucket
)
SELECT
bucket,
cnt,
avg_measure
FROM final_agg
ORDER BY bucket;