Compiling QuackML, a machine-learning extension for DuckDB

Download the source code from the repository and extract it to /par.

My first attempt compiled against the DuckDB 1.3 source tree and failed because a header file does not exist:

export LD_LIBRARY_PATH=/par/duck/build/src

g++ -fPIC -shared -o libtest2.so *.cpp -I /par/duck/src/include -lssl -lcrypto -I include -lduckdb -L /par/duck/build/src
In file included from quackml_extension.cpp:12:
include/functions/sum_count.hpp:11:10: fatal error: duckdb/core_functions/aggregate/nested_functions.hpp: No such file or directory
   11 | #include "duckdb/core_functions/aggregate/nested_functions.hpp"

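QuackML was released in April 2024, so I looked for the DuckDB version from around that time (0.10.3 here) and downloaded both its source code and the libduckdb library. I extracted the headers to /par/duckdb-0.10.3/include and the library files to /par/duckdb-0.10.3/lib, then compiled again: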

 g++ -fPIC -shared -o libtest2.so *.cpp -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib
In file included from /par/duckdb-0.10.3/include/duckdb/common/multi_file_reader_options.hpp:13,
                 from /par/duckdb-0.10.3/include/duckdb/execution/operator/csv_scanner/csv_reader_options.hpp:19,
                 from /par/duckdb-0.10.3/include/duckdb/common/serializer/deserializer.hpp:18,
                 from /par/duckdb-0.10.3/include/duckdb/main/secret/secret.hpp:13,
                 from /par/duckdb-0.10.3/include/duckdb/main/extension_util.hpp:14,
                 from quackml_extension.cpp:8:
/par/duckdb-0.10.3/include/duckdb/common/hive_partitioning.hpp:28:82: error: 'duckdb_re2' has not been declared
   28 |         DUCKDB_API static std::map<string, string> Parse(const string &filename, duckdb_re2::RE2 &regex);
      |                                                                                  ^~~~~~~~~~

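A quick search showed that duckdb_re2 is defined in re2 under DuckDB's third-party directory. I extracted that directory to /par/re2 and built it on its own. CMake first failed on the unrecognized command disable_target_warnings; after I deleted that call from CMakeLists.txt, CMake generated the Makefile and the library built successfully.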

cd /par/re2
root@6ae32a5ffcde:/par/re2# mkdir build
root@6ae32a5ffcde:/par/re2# cd build
root@6ae32a5ffcde:/par/re2/build# cmake ..
-- The CXX compiler identification is GNU 14.2.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/local/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at CMakeLists.txt:104 (disable_target_warnings):
  Unknown CMake command "disable_target_warnings".


-- Configuring incomplete, errors occurred!
See also "/par/re2/build/CMakeFiles/CMakeOutput.log".
root@6ae32a5ffcde:/par/re2/build# cmake ..
-- Configuring done
-- Generating done
-- Build files have been written to: /par/re2/build
root@6ae32a5ffcde:/par/re2/build# make
[  4%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/bitmap256.cc.o
[  8%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/compile.cc.o
[ 12%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/bitstate.cc.o
[ 16%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/dfa.cc.o
[ 20%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/filtered_re2.cc.o
[ 25%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/mimics_pcre.cc.o
[ 29%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/nfa.cc.o
[ 33%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/onepass.cc.o
[ 37%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/parse.cc.o
[ 41%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/perl_groups.cc.o
[ 45%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/prefilter.cc.o
[ 50%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/prefilter_tree.cc.o
[ 54%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/prog.cc.o
[ 58%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/re2.cc.o
[ 62%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/regexp.cc.o
[ 66%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/set.cc.o
[ 70%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/simplify.cc.o
[ 75%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/stringpiece.cc.o
[ 79%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/tostring.cc.o
[ 83%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/unicode_casefold.cc.o
[ 87%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/unicode_groups.cc.o
[ 91%] Building CXX object CMakeFiles/duckdb_re2.dir/util/rune.cc.o
[ 95%] Building CXX object CMakeFiles/duckdb_re2.dir/util/strutil.cc.o
[100%] Linking CXX static library libduckdb_re2.a
[100%] Built target duckdb_re2
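
The CMakeLists.txt edit between the two cmake runs above could be scripted roughly as follows. This is only a sketch on a throwaway file: the two sample lines, including the argument passed to disable_target_warnings, are placeholders rather than the real contents of /par/re2/CMakeLists.txt.

```shell
# Work on a scratch copy so the sketch runs anywhere.
work=$(mktemp -d)

# Placeholder CMakeLists.txt; the real file is longer and these lines are assumed.
cat > "$work/CMakeLists.txt" <<'EOF'
add_library(duckdb_re2 STATIC re2/re2.cc)
disable_target_warnings(duckdb_re2)
EOF

# Keep a backup, then drop every line that calls the unknown command.
cp "$work/CMakeLists.txt" "$work/CMakeLists.txt.orig"
grep -v 'disable_target_warnings' "$work/CMakeLists.txt.orig" > "$work/CMakeLists.txt"

# Count what is left; grep -c exits non-zero on zero matches, hence "|| true".
left=$(grep -c 'disable_target_warnings' "$work/CMakeLists.txt" || true)
echo "remaining disable_target_warnings calls: $left"
```

Against the real tree, the same grep -v step would be pointed at /par/re2/CMakeLists.txt before rerunning cmake.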

I added the re2 directory with -I (as /par/re2/re2) and added #include "re2.h" to quackml_extension.cpp, but the build still failed:

 g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib -I /par/re2/re2
In file included from quackml_extension.cpp:5:
/par/re2/re2/re2.h:279:13: error: 'StringPiece' does not name a type
...
/usr/include/re2/stringpiece.h:34:7: note: 're2::StringPiece' declared here
   34 | class StringPiece {
      |       ^~~~~~~~~~~

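Why does a header under /usr/include/re2/ show up here? Presumably re2 is also installed system-wide, and the compiler found that copy of stringpiece.h on its default search path instead of the one bundled with DuckDB.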

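Looking at /par/re2/re2/re2.h, it does include re2/stringpiece.h, so the -I directory was wrong: relative to /par/re2/re2 that path does not exist, and the lookup falls through to the system copy. After changing #include "re2.h" to #include "re2/re2.h" and the -I path to /par/re2, the re2 errors disappeared. One error remained: a function call with the wrong number of arguments.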
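
The include-path mix-up can be reproduced with a simplified model of header lookup: try each search directory in order and take the first one that contains the requested relative path. The sketch below is not how g++ is implemented, and the directory tree is a scratch stand-in for /par/re2 and /usr/include; resolve_include is a hypothetical helper written for this illustration.

```shell
# Print the first "<dir>/<name>" that exists among the given search directories.
resolve_include() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Scratch tree mimicking the bundled re2 and a system-wide re2.
root=$(mktemp -d)
mkdir -p "$root/par/re2/re2" "$root/usr/include/re2"
touch "$root/par/re2/re2/re2.h" "$root/par/re2/re2/stringpiece.h"
touch "$root/usr/include/re2/stringpiece.h"

# With -I /par/re2/re2 the bundled header is not at <dir>/re2/stringpiece.h,
# so the system directory is the first match.
wrong=$(resolve_include re2/stringpiece.h "$root/par/re2/re2" "$root/usr/include")

# With -I /par/re2 the bundled header is found before the system one.
right=$(resolve_include re2/stringpiece.h "$root/par/re2" "$root/usr/include")

echo "-I /par/re2/re2 resolves to ${wrong#"$root"}"
echo "-I /par/re2     resolves to ${right#"$root"}"
```

The first lookup lands on the system header, which declares StringPiece in namespace re2, while the bundled re2.h expects the duckdb_re2 namespace. That matches the "does not name a type" error in the log.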

g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib -I /par/re2/
functions/linear_reg.cpp: In function 'void quackml::SlowLinearRegressionFinalize(duckdb::Vector&, duckdb::AggregateInputData&, duckdb::Vector&, idx_t, idx_t)':
functions/linear_reg.cpp:273:42: error: too many arguments to function 'std::vector<std::vector<double> > quackml::getGradientND(std::vector<std::vector<double> >&, std::vector<std::vector<double> >&, std::vector<std::vector<double> >&, double)'
  273 |             auto gradient = getGradientND(*state.sigma, *state.c, *state.theta, state.lambda, state.count);
      |                             ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from functions/linear_reg.cpp:5:
include/functions/linear_reg_utils.hpp:21:38: note: declared here
   21 |     std::vector<std::vector<double>> getGradientND(std::vector<std::vector<double>> &sigma, std::vector<std::vector<double>> &c, std::vector<std::vector<double>> &theta, double lambda);

After I deleted the last argument (state.count) from that call in linear_reg.cpp, the build succeeded.
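
That one-argument change could be applied with sed, as sketched here on a scratch file holding only the offending call copied from the compiler output. Note that this merely makes the call match the declared four-parameter signature; whether dropping state.count matches what the QuackML authors intended is not verified here.

```shell
work=$(mktemp -d)

# The call exactly as reported at functions/linear_reg.cpp:273.
cat > "$work/linear_reg.cpp" <<'EOF'
            auto gradient = getGradientND(*state.sigma, *state.c, *state.theta, state.lambda, state.count);
EOF

# Drop the trailing ", state.count" so the call matches the declaration in linear_reg_utils.hpp.
sed 's/state\.lambda, state\.count)/state.lambda)/' \
    "$work/linear_reg.cpp" > "$work/linear_reg.fixed.cpp"

fixed=$(cat "$work/linear_reg.fixed.cpp")
echo "$fixed"
```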

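Although compilation succeeded, after converting the library into an extension with python3 ./appendmetadata.py -l libtest2.so -n quackml -dv v1.3.0 --duckdb-platform linux_amd64 --extension-version 0.1 --abi-type "", loading it always fails with an undefined-symbol error: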

/par/duckdb130 -unsigned
DuckDB v1.3.0 (Ossivalis) 71c5c07cdd
Enter ".help" for usage hints.
D load '/par/QuackML-main/src/quackml.duckdb_extension';
IO Error:
Extension "/par/QuackML-main/src/quackml.duckdb_extension" could not be loaded: /par/QuackML-main/src/quackml.duckdb_extension: undefined symbol: _ZN6duckdb18BaseScalarFunctionC2ENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_6vectorINS_11LogicalTypeELb1EEES8_NS_17FunctionStabilityES8_NS_20FunctionNullHandlingE

Rebuilding with the static libduckdb_re2.a linked in only changes which symbol is reported missing:

export LD_LIBRARY_PATH=/par/duckdb-0.10.3/lib
root@6ae32a5ffcde:/par/QuackML-main/src# g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp /par/re2/build/libduckdb_re2.a -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib -I /par/re2/
D load '/par/QuackML-main/src/quackml.duckdb_extension';
IO Error:
Extension "/par/QuackML-main/src/quackml.duckdb_extension" could not be loaded: /par/QuackML-main/src/quackml.duckdb_extension: undefined symbol: _ZNK6duckdb18BaseScalarFunction8ToStringB5cxx11Ev
D .exit

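I tried various versions of the library files and the CLI, including loading the extension from the 0.10.3 DuckDB CLI, but none of them resolved the problem. This needs further investigation.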
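
One way to narrow this down, offered as a diagnostic sketch rather than a confirmed fix, is to demangle the missing symbol and check whether the binary doing the loading exports it. The B5cxx11 fragment in the second symbol is how GCC mangles the [abi:cxx11] tag. A mismatch between the headers used for compilation (0.10.3) and the DuckDB binary that loads the extension is one plausible cause, but that is an assumption the logs do not confirm. The host path is the CLI from the session above; c++filt and nm come from binutils, and each step is skipped if the tool or file is missing.

```shell
# Second missing symbol from the load error, and the CLI that tried to load the extension.
sym='_ZNK6duckdb18BaseScalarFunction8ToStringB5cxx11Ev'
host='/par/duckdb130'

# Check for the mangled [abi:cxx11] tag without needing any external tool.
case "$sym" in
    *B5cxx11*) abi_tagged=yes ;;
    *)         abi_tagged=no ;;
esac
echo "abi:cxx11 tag present: $abi_tagged"

# Demangle to a readable C++ signature when c++filt is available.
demangled=''
if command -v c++filt >/dev/null 2>&1; then
    demangled=$(c++filt "$sym")
    echo "demangled: $demangled"
fi

# Ask whether the host binary defines the symbol in its dynamic symbol table.
if command -v nm >/dev/null 2>&1 && [ -f "$host" ]; then
    if nm -D --defined-only "$host" 2>/dev/null | grep -qF "$sym"; then
        echo "symbol is exported by $host"
    else
        echo "symbol is NOT exported by $host"
    fi
fi
```

If the symbol is missing from every candidate host binary, comparing the demangled signature with the function declaration in that version's headers would show whether the signature changed between releases.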
