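Installing the Ollama service on Windows 10
The installation itself is skipped here, since it takes a single command. The key step is adding the environment variable OLLAMA_HOST with the value 0.0.0.0:11434, so that the service listens on all network interfaces and accepts connections from other machines.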
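Once the variable is set and Ollama has been restarted, the service should be reachable from other machines on the network. A minimal Python sketch for checking this from the Ubuntu side follows. It assumes the standard Ollama `/api/tags` endpoint, which lists the installed models; the helper names `tags_url` and `list_models` are made up for this illustration.

```python
import json
import urllib.request


def tags_url(host: str, port: int = 11434) -> str:
    """Build the URL of Ollama's model-listing endpoint for a given host."""
    return f"http://{host}:{port}/api/tags"


def list_models(host: str, port: int = 11434, timeout: float = 5.0) -> list[str]:
    """Return the model names reported by the Ollama server at host:port."""
    with urllib.request.urlopen(tags_url(host, port), timeout=timeout) as resp:
        payload = json.load(resp)
    # The response is expected to look like {"models": [{"name": ...}, ...]}.
    return [m["name"] for m in payload.get("models", [])]
```

Calling `list_models("192.168.89.1")` from the Ubuntu machine should return the installed model names if the service is listening on all interfaces. A connection error usually means the variable has not taken effect or a firewall is blocking port 11434.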
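Installing on Ubuntu
sudo apt install python3-venv python3-full python3-pip # install the Python venv, full, and pip packages
cd /app/ # change to the working directory
python3 -m venv ollamaenv # create a virtual environment named ollamaenv; this creates a directory in the current location
source ollamaenv/bin/activate # activate the environment
pip3 install ollama-profiler -i https://mirrors.aliyun.com/pypi/simple/ # install the benchmarking tool from the Aliyun PyPI mirror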
Test command
ollama-profiler gpt-oss:20b gemma3:1b --url http://192.168.89.1:11434 -n 3 --warmup --num-predict 204800 --prompt-file /app/ollamaenv/prompt/lx-yc.txt
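- ollama-profiler # the command
- gpt-oss:20b gemma3:1b # model names; separate multiple models with spaces
- --url http://192.168.89.1:11434 # address of the Ollama service under test
- -n 3 # run the test 3 times
- --warmup # run one warmup pass each time a new model is started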
- --num-predict 204800 # maximum number of tokens to generate (reported as "Max tokens" in the output)
- --prompt-file /app/ollamaenv/prompt/lx-yc.txt # path to the prompt file
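The flags listed above can also be assembled programmatically, which helps when the same benchmark is repeated for different sets of models. The sketch below only rebuilds the command line shown in these notes. `build_command` is a hypothetical helper, and it uses no flags beyond the ones already listed.

```python
import shlex


def build_command(models, url, runs, num_predict, prompt_file, warmup=True):
    """Assemble the ollama-profiler argument list from the flags in these notes."""
    cmd = ["ollama-profiler", *models, "--url", url, "-n", str(runs)]
    if warmup:
        cmd.append("--warmup")
    cmd += ["--num-predict", str(num_predict), "--prompt-file", prompt_file]
    return cmd


cmd = build_command(
    models=["gpt-oss:20b", "gemma3:1b"],
    url="http://192.168.89.1:11434",
    runs=3,
    num_predict=204800,
    prompt_file="/app/ollamaenv/prompt/lx-yc.txt",
)
# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside the activated ollamaenv environment, so that the `ollama-profiler` executable is on the PATH.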
Test results
(ollamaenv) ubuntu@ubuntu-VMware-Virtual-Platform:/app$ ollama-profiler gpt-oss:20b --url http://192.168.89.1:11434 -n 3 --warmup --num-predict 204800 --prompt-file /app/ollamaenv/prompt/lx-yc.txt
────────────────────────────────────────────────────────────────────────────────
Ollama Model Benchmark
────────────────────────────────────────────────────────────────────────────────
Models: gpt-oss:20b
Server: http://192.168.89.1:11434
Mode: sequential
Runs: 3/model/round x 1 round(s) = 3 total/model
Warmup: enabled
Max tokens: 204800 tokens
Seed: 42
Prompt: # 你是优秀的小说品鉴人,请通读如下小说,用1000字评价一...
gpt-oss:20b
Run   TTFT       Gen TPS   Prompt TPS   Total      Load      Eval Time   Tokens
────────────────────────────────────────────────────────────────────────────────────────
1     851.0ms    100.9     1767133.7    38.44s     564.6ms   34.37s      3467
2     823.0ms    101.3     1313976.2    15.13s     532.6ms   13.09s      1326
3     842.0ms    100.5     1667156.4    15.25s     562.9ms   13.20s      1326
────────────────────────────────────────────────────────────────────────────────────────
avg   838.7ms    100.9     1582755.4    22.94s     553.4ms   20.22s      2039
Comparison Summary
Metric             gpt-oss:20b
──────────────────────────────────────
TTFT               838.7ms ±14.3ms
Generation TPS     100.9 ±0.4
Prompt Eval TPS    1582755.4 ±238076.8
Total Duration     22.94s ±13.43s
Load Duration      553.4ms ±18.0ms
Prompt Eval Time   16.0ms ±2.6ms
Eval Time          20.22s ±12.25s
Prompt Tokens      24962
Generated Tokens   2039 ±1236
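The throughput columns in this output appear to be plain ratios of token counts to durations. The sketch below checks that reading against the numbers printed above. The formula is inferred from the output, not taken from the tool's source, and `tokens_per_second` is a hypothetical helper.

```python
def tokens_per_second(tokens: int, duration_ms: float) -> float:
    """Tokens divided by duration, with the duration given in milliseconds."""
    return tokens * 1000.0 / duration_ms


# Generation TPS per run: generated tokens / eval time (34.37s, 13.09s, 13.20s).
runs = [(3467, 34370.0), (1326, 13090.0), (1326, 13200.0)]
for tokens, eval_ms in runs:
    print(round(tokens_per_second(tokens, eval_ms), 1))  # 100.9, 101.3, 100.5

# Prompt Eval TPS from the summary: 24962 prompt tokens in about 16.0 ms.
print(tokens_per_second(24962, 16.0))  # 1560125.0
```

The per-run generation figures match the Gen TPS column. The prompt figure of about 1.56 million tokens per second is close to the reported average of 1582755.4; the small gap is plausible because the summary seems to average per-run values and the displayed time is rounded.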
(ollamaenv) ubuntu@ubuntu-VMware-Virtual-Platform:/app$ ollama-profiler gpt-oss:20b gemma3:1b --url http://192.168.89.1:11434 -n 3 --warmup --num-predict 204800 --prompt-file /app/ollamaenv/prompt/lx-yc.txt
────────────────────────────────────────────────────────────────────────────────
Ollama Model Benchmark
────────────────────────────────────────────────────────────────────────────────
Models: gpt-oss:20b, gemma3:1b
Server: http://192.168.89.1:11434
Mode: sequential
Runs: 3/model/round x 1 round(s) = 3 total/model
Warmup: enabled
Max tokens: 204800 tokens
Seed: 42
Prompt: # 你是优秀的小说品鉴人,请通读如下小说,用1000字评价一...
gpt-oss:20b
Run   TTFT       Gen TPS   Prompt TPS   Total      Load      Eval Time   Tokens
────────────────────────────────────────────────────────────────────────────────────────
1     813.0ms    101.5     1728646.4    15.05s     534.3ms   13.06s      1326
2     860.0ms    101.2     1667000.6    15.17s     575.4ms   13.11s      1326
3     849.0ms    100.8     1724824.1    15.20s     530.3ms   13.15s      1326
────────────────────────────────────────────────────────────────────────────────────────
avg   840.7ms    101.2     1706823.7    15.14s     546.7ms   13.11s      1326
gemma3:1b
Run   TTFT       Gen TPS   Prompt TPS   Total      Load      Eval Time   Tokens
────────────────────────────────────────────────────────────────────────────────────────
1     5362.0ms   138.3     6126.3       7182.2ms   650.7ms   1504.0ms    208
2     5254.0ms   140.3     6195.5       7046.5ms   604.8ms   1482.0ms    208
3     5227.0ms   138.1     6177.3       7086.8ms   572.1ms   1506.6ms    208
────────────────────────────────────────────────────────────────────────────────────────
avg   5281.0ms   138.9     6166.4       7105.2ms   609.2ms   1497.6ms    208
Comparison Summary
Metric             gpt-oss:20b            gemma3:1b
──────────────────────────────────────────────────────────
TTFT               840.7ms ±24.6ms        5281.0ms ±71.4ms
Generation TPS     101.2 ±0.3             138.9 ±1.3
Prompt Eval TPS    1706823.7 ±34540.8     6166.4 ±35.8
Total Duration     15.14s ±79.0ms         7105.2ms ±69.7ms
Load Duration      546.7ms ±25.0ms        609.2ms ±39.5ms
Prompt Eval Time   14.6ms ±0.3ms          4607.8ms ±26.9ms
Eval Time          13.11s ±43.9ms         1497.6ms ±13.5ms
Prompt Tokens      24962                  28413
Generated Tokens   1326                   208
Relative Performance (vs best)
Metric             gpt-oss:20b            gemma3:1b
──────────────────────────────────────────────────────────
TTFT               ★ best                 +528.2%
Generation TPS     +27.2%                 ★ best
Prompt Eval TPS    ★ best                 +99.6%
Total Duration     +113.1%                ★ best
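The percentages in the relative performance table are consistent with measuring each model's gap from the best value, divided by the best value. The sketch below reproduces two rows under that assumption. The formula is inferred from the printed numbers, and `relative_to_best` is a hypothetical helper, not part of ollama-profiler.

```python
def relative_to_best(values: dict[str, float], higher_is_better: bool) -> dict[str, float]:
    """Percentage gap of each value from the best one (0.0 marks the best)."""
    best = max(values.values()) if higher_is_better else min(values.values())
    return {name: abs(v - best) / best * 100.0 for name, v in values.items()}


# Averages copied from the comparison summary.
ttft_ms = {"gpt-oss:20b": 840.7, "gemma3:1b": 5281.0}
gen_tps = {"gpt-oss:20b": 101.2, "gemma3:1b": 138.9}

print(relative_to_best(ttft_ms, higher_is_better=False))  # lower TTFT is better
print(relative_to_best(gen_tps, higher_is_better=True))   # higher TPS is better
```

With the rounded averages from the summary this gives about 528.2% for the gemma3:1b TTFT and about 27.1% for the gpt-oss:20b generation speed. The tool prints 27.2% for the latter, presumably because it works from unrounded values.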