Follow-up tests of CinderX built and installed on aarch64 Linux

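At the end of the previous post I said that CinderX reported an error and could not be used. That was wrong. I filed an issue about it in the project's GitHub repository, and alexmalyshev replied: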

I think that's actually just a warning that you're getting but things should be working after that? Right, this is just a logged warning that CinderX tried to enable huge pages but wasn't able to. It falls back to using normal pages.

(In other words, the message is only a logged warning: CinderX tried to enable huge pages, could not, and fell back to normal pages.)
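
The "try huge pages, fall back to normal pages" behaviour described in the reply can be illustrated with a minimal Python sketch. This is not CinderX's code (the warning comes from its C++ code_allocator.cpp); the function name map_with_huge_page_hint and the 2 MiB size are assumptions made for illustration. On Linux, errno 22 is EINVAL, which is the value shown in the warning.

```python
import errno
import mmap


def map_with_huge_page_hint(size=2 * 1024 * 1024):
    """Map anonymous memory and try to hint the kernel to use huge pages.

    Returns (mapping, hinted); hinted is True only if the hint was accepted.
    The mapping is usable either way, backed by normal pages on failure.
    """
    region = mmap.mmap(-1, size)  # anonymous, page-aligned mapping
    advice = getattr(mmap, "MADV_HUGEPAGE", None)  # constant exists only on Linux
    hinted = False
    if advice is not None and hasattr(region, "madvise"):
        try:
            region.madvise(advice)
            hinted = True
        except OSError as exc:
            # Log and carry on; the hint is optional.
            name = errno.errorcode.get(exc.errno, "?")
            print(f"warning: MADV_HUGEPAGE failed, errno={exc.errno} ({name}); "
                  "using normal pages")
    return region, hinted


region, hinted = map_with_huge_page_hint()
region[:5] = b"hello"  # the memory works whether or not the hint succeeded
print(hinted, region[:5])
region.close()
```

Whether the hint succeeds depends on the kernel and container configuration, so the printed flag will differ between machines.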

Here are the test results.

nbs@kylin-pc:~/par$ sudo docker start gcc142
(enter password)
gcc142
nbs@kylin-pc:~/par$ sudo docker exec -it gcc142 bash
root@kylin-pc:/# cd /par/uv314
root@kylin-pc:/par/uv314# source .venv/bin/activate

-- cinderx.jit.compile_after_n_calls(1)
(uv314) root@kylin-pc:/par/uv314# time python ../pe932e.py
JIT: /par/cinderx-2026.3.30.0/cinderx/Jit/code_allocator.cpp:62 -- Failed to madvise [0x7fa1ffc000, 0x7fa21fc000) with MADV_HUGEPAGE, errno=22
T(16) = 72673459417881349.time=18.62825632095337
T(16) = 72673459417881349.time=9.188195705413818

real	0m27.987s
user	0m27.694s
sys	0m0.047s
-- cinderx.jit.compile_after_n_calls(0)
(uv314) root@kylin-pc:/par/uv314# time python ../pe932e.py
JIT: /par/cinderx-2026.3.30.0/cinderx/Jit/code_allocator.cpp:62 -- Failed to madvise [0x7fa0cdc000, 0x7fa0edc000) with MADV_HUGEPAGE, errno=22
T(16) = 72673459417881349.time=11.874915599822998
T(16) = 72673459417881349.time=12.66346549987793

real	0m24.587s
user	0m24.465s
sys	0m0.012s

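As the output shows, with either cinderx.jit.compile_after_n_calls(1) or cinderx.jit.compile_after_n_calls(0), the JIT-compiled code runs faster than the original version, although the gain with compile_after_n_calls(0) is smaller. I filed an issue asking about this as well, and alexmalyshev replied: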

There could be a number of reasons for this, but most likely it's because letting functions run through the interpreter once allows the CPython adaptive interpreter to emit specialized opcodes (e.g. BINARY_OP_ADD_INT instead of BINARY_OP) and CinderX can make use of that to generate faster code. Feel free to run with PYTHONJITDUMPFINALHIR=1 in your environment to compare CinderX's internal representation for each function between the two runs.

In general we don't recommend compile_after_n_calls(0) for general usage as it's not meant to be performant. It's primarily helpful to test CinderX compatibility for your code.

(In short: the most likely reason is that one pass through the interpreter lets CPython's adaptive interpreter emit specialized opcodes, which CinderX can then use to generate faster code. compile_after_n_calls(0) is not meant to be performant and is mainly useful for testing whether your code is compatible with CinderX.)
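
The opcode specialization mentioned in the reply can be observed with the standard dis module. The sketch below is independent of CinderX; the function add_ints and the warm-up count of 100 calls are arbitrary choices for illustration. The adaptive=True argument needs CPython 3.11 or later, and the exact specialized opcode names vary between CPython versions, so no particular output is promised.

```python
import dis
import sys


def add_ints(n):
    """Small integer loop that gives the adaptive interpreter something to specialize."""
    total = 0
    for i in range(n):
        total = total + i
    return total


def opnames(func):
    """Return the opcode names currently stored in func's bytecode."""
    if sys.version_info >= (3, 11):
        # adaptive=True shows specialized instructions if any have been installed
        instrs = dis.get_instructions(func, adaptive=True)
    else:
        instrs = dis.get_instructions(func)
    return [ins.opname for ins in instrs]


cold = opnames(add_ints)      # before the function has ever run
for _ in range(100):          # let the interpreter observe int + int
    add_ints(1000)
warm = opnames(add_ints)      # after warm-up

# Opcode names that appear only after warm-up (may be empty on some builds).
new_ops = sorted(set(warm) - set(cold))
print(new_ops)
```

On a build with the specializing interpreter enabled, the warm list usually contains an integer-specific variant of BINARY_OP, which is the kind of information a JIT that compiles before the first call does not have yet.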

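I also ran the test with the PYTHONJITDUMPFINALHIR=1 environment variable he suggested, and the internal representations from the two runs do differ considerably. The dump for compile_after_n_calls(0) is very verbose and does not look carefully optimized.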
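
One way to make that comparison less manual is to capture both dumps and count how much they differ. The sketch below is an assumption about workflow, not something from the issue thread: the helper names run_with_hir_dump and summarize_diff are hypothetical, and since the post does not say which stream the dump goes to, stdout and stderr are captured together. The demo at the bottom uses stand-in commands; in practice the two commands would be the two variants of the test script.

```python
import difflib
import os
import subprocess
import sys


def run_with_hir_dump(cmd):
    """Run cmd with PYTHONJITDUMPFINALHIR=1 set and return its combined output."""
    env = dict(os.environ, PYTHONJITDUMPFINALHIR="1")
    proc = subprocess.run(cmd, env=env, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.stdout


def summarize_diff(dump_a, dump_b):
    """Return (lines in a, lines in b, number of added or removed lines)."""
    a, b = dump_a.splitlines(), dump_b.splitlines()
    changed = sum(1 for line in difflib.ndiff(a, b)
                  if line.startswith(("- ", "+ ")))
    return len(a), len(b), changed


# Stand-in commands so the sketch runs anywhere; replace them with e.g.
# [sys.executable, "script_n_calls_1.py"] and [sys.executable, "script_n_calls_0.py"]
# (hypothetical file names).
out_a = run_with_hir_dump([sys.executable, "-c", "print('a'); print('b')"])
out_b = run_with_hir_dump([sys.executable, "-c", "print('a'); print('c'); print('d')"])
print(summarize_diff(out_a, out_b))
```

A large line count for one dump relative to the other is a quick signal of the verbosity difference described above, before reading the HIR itself.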

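There are two more functions, cinderx.jit.force_compile(fun) and cinderx.jit.lazy_compile(fun), which compile immediately and lazily, respectively. In the runs below the two behave about the same (8.87 s and 9.29 s), and both match the speedup obtained with compile_after_n_calls(1) (9.19 s).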
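
The test script itself is not shown in this post, so the following is only a sketch of the pattern its output suggests: time the function once uncompiled, ask CinderX to compile it, then time it again. The function names force_compile and lazy_compile are taken from the text above; their exact signatures and any required initialization are not verified here. The work function and the compare helper are hypothetical stand-ins, and the code falls back to plain timing when CinderX is not installed.

```python
import time

try:
    import cinderx.jit as cinderx_jit  # present only where CinderX is installed
except ImportError:
    cinderx_jit = None


def work(n):
    """Stand-in workload; the post's real function is solve_T16."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def compare(func, *args, lazy=False):
    """Time func before and after requesting JIT compilation."""
    result_before, secs_before = timed(func, *args)
    if cinderx_jit is not None:
        if lazy:
            cinderx_jit.lazy_compile(func)   # compile on the next call
        else:
            cinderx_jit.force_compile(func)  # compile right away
    result_after, secs_after = timed(func, *args)
    if result_before != result_after:
        raise RuntimeError("result changed after compilation")
    return secs_before, secs_after


before, after = compare(work, 200_000)
print(f"before={before:.4f}s after={after:.4f}s")
```

Running the function once before compiling also gives the interpreter a chance to specialize its bytecode first, which fits the explanation quoted earlier.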

(uv314) root@kylin-pc:/par/uv314# time python ../pe932d.py

N=2499500025000000, a=24995000, b=25000000, k=8, s=49995000
T(16) = 72673459417881349,time=14.928523063659668

real	0m14.948s
user	0m14.871s
sys	0m0.004s
(uv314) root@kylin-pc:/par/uv314# 


(uv314) root@kylin-pc:/par/uv314# time python ../pe932e.py
JIT: /par/cinderx-2026.3.30.0/cinderx/Jit/code_allocator.cpp:62 -- Failed to madvise [0x7fbbb7c000, 0x7fbbd7c000) with MADV_HUGEPAGE, errno=22
T(16) = 72673459417881349.time=14.70127558708191
cinderx.jit.force_compile(solve_T16)
T(16) = 72673459417881349.time=8.867673873901367

real	0m23.639s
user	0m23.529s
sys	0m0.012s



(uv314) root@kylin-pc:/par/uv314# time python ../pe932e.py
JIT: /par/cinderx-2026.3.30.0/cinderx/Jit/code_allocator.cpp:62 -- Failed to madvise [0x7fa909c000, 0x7fa929c000) with MADV_HUGEPAGE, errno=22
T(16) = 72673459417881349.time=14.878132820129395
cinderx.jit.lazy_compile(solve_T16)
T(16) = 72673459417881349.time=9.286654710769653

real	0m24.211s
user	0m24.080s
sys	0m0.024s
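
The timings logged in this post can be turned into speedup ratios with a few lines of Python. Two assumptions are made here: pe932d.py is taken as the non-JIT baseline, and for the runs that print two timings the choice of which one to compare is mine (the first timing of the compile_after_n_calls(1) run is left out because the post does not say what it measures).

```python
# Timings in seconds, copied from the logs in this post.
baseline = 14.928523063659668  # pe932d.py, assumed to be the non-JIT original

jit_times = {
    "compile_after_n_calls(1), 2nd timing": 9.188195705413818,
    "compile_after_n_calls(0), 1st timing": 11.874915599822998,
    "compile_after_n_calls(0), 2nd timing": 12.66346549987793,
    "force_compile, 2nd timing": 8.867673873901367,
    "lazy_compile, 2nd timing": 9.286654710769653,
}


def speedup(baseline_s, jit_s):
    """Ratio of baseline time to JIT time; above 1.0 means the JIT run was faster."""
    return baseline_s / jit_s


for label, seconds in jit_times.items():
    print(f"{label}: {speedup(baseline, seconds):.2f}x")
```

These are single measurements on one machine, so the ratios are indicative only.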