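Table of Contents
- I. Download version 0.3.13 from the official site
- II. Upload the package to the Ubuntu server
- III. Download the install script
- IV. Remove the GPU-related downloads (ROCm, etc.) for a CPU-only install script
- V. Common ollama commands
- VI. Remote testing
- VII. Integrating with Spring AI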
I. Download version 0.3.13 from the official site
II. Upload the package to the Ubuntu server
III. Download the install script
bash
curl -fsSL https://ollama.com/install.sh -o install.sh
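Change the part of the script that pulls Ollama from the network so that it extracts the local archive instead.
The original code that needs to be changed is: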
bash
if curl -I --silent --fail --location "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" >/dev/null ; then
    status "Downloading Linux ${ARCH} bundle"
    curl --fail --show-error --location --progress-bar \
        "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" | \
        $SUDO tar -xzf - -C "$OLLAMA_INSTALL_DIR"
    BUNDLE=1
    if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
        status "Making ollama accessible in the PATH in $BINDIR"
        $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
    fi
else
    status "Downloading Linux ${ARCH} CLI"
    curl --fail --show-error --location --progress-bar -o "$TEMP_DIR/ollama" \
        "https://ollama.com/download/ollama-linux-${ARCH}${VER_PARAM}"
    $SUDO install -o0 -g0 -m755 $TEMP_DIR/ollama $OLLAMA_INSTALL_DIR/ollama
    BUNDLE=0
    if [ "$OLLAMA_INSTALL_DIR/ollama" != "$BINDIR/ollama" ] ; then
        status "Making ollama accessible in the PATH in $BINDIR"
        $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
    fi
fi
The modified code:
bash
# Extract the locally uploaded archive instead of downloading it.
# Run the script from the directory that contains ollama-linux-amd64.tgz
# (adjust the file name if ARCH is not amd64).
status "Extracting local Linux ${ARCH} bundle"
# curl --fail --show-error --location --progress-bar \
#     "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" | \
$SUDO tar -xzf ./ollama-linux-amd64.tgz -C "$OLLAMA_INSTALL_DIR"
BUNDLE=1
if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
    status "Making ollama accessible in the PATH in $BINDIR"
    $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
fi
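IV. Remove the GPU-related downloads (ROCm, etc.) for a CPU-only install script
Building on section 3, the GPU part is removed as well: everything from the WSL2 comment to the end of the original script is deleted.
The complete offline, CPU-only install script: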
bash
#!/bin/sh
# This script installs Ollama on Linux.
# It detects the current operating system architecture and installs the appropriate version of Ollama.

set -eu

status() { echo ">>> $*" >&2; }
error() { echo "ERROR $*"; exit 1; }
warning() { echo "WARNING: $*"; }

TEMP_DIR=$(mktemp -d)
cleanup() { rm -rf $TEMP_DIR; }
trap cleanup EXIT

available() { command -v $1 >/dev/null; }
require() {
    local MISSING=''
    for TOOL in $*; do
        if ! available $TOOL; then
            MISSING="$MISSING $TOOL"
        fi
    done

    echo $MISSING
}

[ "$(uname -s)" = "Linux" ] || error 'This script is intended to run on Linux only.'

ARCH=$(uname -m)
case "$ARCH" in
    x86_64) ARCH="amd64" ;;
    aarch64|arm64) ARCH="arm64" ;;
    *) error "Unsupported architecture: $ARCH" ;;
esac

IS_WSL2=false

KERN=$(uname -r)
case "$KERN" in
    *icrosoft*WSL2 | *icrosoft*wsl2) IS_WSL2=true;;
    *icrosoft) error "Microsoft WSL1 is not currently supported. Please use WSL2 with 'wsl --set-version <distro> 2'" ;;
    *) ;;
esac

VER_PARAM="${OLLAMA_VERSION:+?version=$OLLAMA_VERSION}"

SUDO=
if [ "$(id -u)" -ne 0 ]; then
    # Not running as root, so sudo is required
    if ! available sudo; then
        error "This script requires superuser permissions. Please re-run as root."
    fi

    SUDO="sudo"
fi

NEEDS=$(require curl awk grep sed tee xargs)
if [ -n "$NEEDS" ]; then
    status "ERROR: The following tools are required but missing:"
    for NEED in $NEEDS; do
        echo "  - $NEED"
    done
    exit 1
fi

for BINDIR in /usr/local/bin /usr/bin /bin; do
    echo $PATH | grep -q $BINDIR && break || continue
done
OLLAMA_INSTALL_DIR=$(dirname ${BINDIR})

status "Installing ollama to $OLLAMA_INSTALL_DIR"
$SUDO install -o0 -g0 -m755 -d $BINDIR
$SUDO install -o0 -g0 -m755 -d "$OLLAMA_INSTALL_DIR"

# Offline change: extract the locally uploaded archive instead of downloading it.
# Run the script from the directory that contains ollama-linux-amd64.tgz
# (adjust the file name if ARCH is not amd64).
status "Extracting local Linux ${ARCH} bundle"
# curl --fail --show-error --location --progress-bar \
#     "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" | \
$SUDO tar -xzf ./ollama-linux-amd64.tgz -C "$OLLAMA_INSTALL_DIR"
BUNDLE=1
if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
    status "Making ollama accessible in the PATH in $BINDIR"
    $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
fi

install_success() {
    status 'The Ollama API is now available at 127.0.0.1:11434.'
    status 'Install complete. Run "ollama" from the command line.'
}
trap install_success EXIT

# Everything from this point onwards is optional.

configure_systemd() {
    if ! id ollama >/dev/null 2>&1; then
        status "Creating ollama user..."
        $SUDO useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
    fi
    if getent group render >/dev/null 2>&1; then
        status "Adding ollama user to render group..."
        $SUDO usermod -a -G render ollama
    fi
    if getent group video >/dev/null 2>&1; then
        status "Adding ollama user to video group..."
        $SUDO usermod -a -G video ollama
    fi

    status "Adding current user to ollama group..."
    $SUDO usermod -a -G ollama $(whoami)

    status "Creating ollama systemd service..."
    cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=$BINDIR/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"
[Install]
WantedBy=default.target
EOF
    SYSTEMCTL_RUNNING="$(systemctl is-system-running || true)"
    case $SYSTEMCTL_RUNNING in
        running|degraded)
            status "Enabling and starting ollama service..."
            $SUDO systemctl daemon-reload
            $SUDO systemctl enable ollama

            start_service() { $SUDO systemctl restart ollama; }
            trap start_service EXIT
            ;;
    esac
}

if available systemctl; then
    configure_systemd
fi

install_success
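V. Common ollama commands
bash
# Stop the ollama service
service ollama stop
ollama serve # start ollama
ollama create # create a model from a Modelfile
ollama show # show model information
ollama run qwen2.5:3b-instruct-q4_K_M # run a model; it is downloaded first if not present
ollama pull # pull a model from a registry
ollama push # push a model to a registry
ollama list # list downloaded models
ollama ps # list running models
ollama cp # copy a model
ollama rm # remove a model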
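VI. Remote testing
Enabling remote access is not recommended in production: Ollama has no token or other access restriction, so pay close attention to the security of API calls.
1. First, stop the ollama service: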
bash
systemctl stop ollama
2. Edit the ollama service file:
bash
vim /etc/systemd/system/ollama.service
3. Add Environment="OLLAMA_HOST=0.0.0.0:11434" to the [Service] section:
ini
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
Environment="OLLAMA_HOST=0.0.0.0:11434"
[Install]
WantedBy=default.target
4. Start ollama:
bash
systemctl daemon-reload
systemctl start ollama
# If the service fails to start, run "ollama serve" in the foreground to test
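Once the service listens on 0.0.0.0, it can be checked from another machine. The following is a minimal sketch rather than part of the original steps: it uses the JDK's built-in HttpClient (Java 11+) to call Ollama's /api/generate endpoint with "stream": false. The host 192.168.200.94 is the one used in the Spring AI configuration later in this article; the class and method names are hypothetical and chosen only for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class OllamaRemoteCheck {

    // Escapes the characters that must not appear raw inside a JSON string.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Body for POST /api/generate; "stream": false asks for a single JSON reply.
    public static String buildBody(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model) + "\",\"prompt\":\""
                + jsonEscape(prompt) + "\",\"stream\":false}";
    }

    public static HttpRequest buildRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .timeout(Duration.ofMinutes(2)) // CPU-only inference can be slow
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: pass your own server address as the first argument.
        String baseUrl = args.length > 0 ? args[0] : "http://192.168.200.94:11434";
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpResponse<String> response = client.send(
                buildRequest(baseUrl, "qwen2.5:3b-instruct-q4_K_M", "Hello"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP " + response.statusCode());
        System.out.println(response.body()); // JSON; the generated text is in the "response" field
    }
}
```

A connection timeout here usually points at a firewall or at OLLAMA_HOST still being bound to 127.0.0.1, while an HTTP error status means the server was reached and the request itself (for example the model name) should be checked.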
VII. Integrating with Spring AI
xml
<dependencyManagement>
    <dependencies>
        <!-- Spring Boot dependency BOM -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>${spring.boot.version}</version>
            <type>pom</type>
            <!-- a BOM must be imported with scope "import" for its versions to apply -->
            <scope>import</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0-SNAPSHOT</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.springframework.ai/spring-ai-ollama-spring-boot-starter -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
</dependencies>
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
</repositories>
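If the dependencies above cannot be resolved, the requests may be intercepted by a global mirror configured in settings.xml. At the time of writing, this Spring AI version had not yet been published to the Maven Central repository.
In that case, refer to a Maven multi-repository (private repository) configuration template.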
yaml
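spring:
  application:
    name: spring-ai-ollama
  ai:
    ollama:
      base-url: http://192.168.200.94:11434
      chat:
        # To make the model output more diverse or random, raise temperature.
        # With a non-zero temperature, start from a top-p of about 0.95 (or a top-k of about 250) and lower temperature as needed.
        # If the output contains too much nonsense, junk, or hallucination, lower temperature and lower top-p/top-k.
        # If temperature is high but the output still shows little diversity, raise top-p/top-k.
        # For more diverse topics, raise the presence penalty.
        # For more diverse and less repetitive output, raise the frequency penalty.
        options:
          # A model specified in code takes precedence; if the code specifies none, the model configured here is used.
          # model: qwen:0.5b-chat
          model: qwen2.5:3b-instruct-q4_K_M
          # Maximum number of tokens to generate (tokens, not characters).
          # Ollama's native name for this option is num_predict, so check that your Spring AI version maps this key.
          max_tokens: 2048
          # Higher temperature makes the output more random and less accurate; lower temperature makes it more deterministic.
          # If each prompt needs only a single answer: zero.
          # If each prompt needs several different answers: non-zero.
          temperature: 0.4
          # Nucleus (top-p) sampling: the larger the value, the more random the output.
          # With temperature at zero: the output is unaffected.
          # With temperature non-zero: use a non-zero value.
          top_p: 0.2
          # Top-k sampling (k = 1 is greedy decoding): the larger the value, the more random the output.
          top-k: 40
          # Frequency penalty: penalizes a token every time it appears in the text. This discourages reuse of the same tokens/words/phrases and also makes the model cover more diverse topics and switch topics more often.
          # When the question has only one correct answer: zero.
          # When the question has several correct answers: choose freely.
          frequency-penalty: 0
          # Presence penalty: penalizes a token once it has appeared in the text at all. Topics become more diverse and change more often, without noticeably suppressing the repetition of common words.
          presence-penalty: 0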
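To make the interaction of temperature, top-k, and top-p in the comments above more concrete, here is a small self-contained sketch on a toy three-token distribution. It is an illustration only and makes several assumptions: the class and method names are hypothetical, the order of operations (temperature, then top-k, then top-p) is one common convention and may differ from what Ollama's backend actually does, and a temperature of zero is treated as greedy decoding.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SamplingSketch {

    /**
     * Returns the renormalised distribution a sampler would draw from,
     * ordered from most to least likely token. Expects a non-empty map.
     */
    public static Map<String, Double> filter(Map<String, Double> logits,
                                             double temperature, int topK, double topP) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(logits.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        Map<String, Double> kept = new LinkedHashMap<>();
        if (temperature <= 0.0) {
            // Assumption: temperature zero means "always pick the top token".
            kept.put(sorted.get(0).getKey(), 1.0);
            return kept;
        }

        // Softmax with temperature: a higher T flattens, a lower T sharpens.
        double maxLogit = sorted.get(0).getValue();
        double[] probs = new double[sorted.size()];
        double total = 0.0;
        for (int i = 0; i < probs.length; i++) {
            probs[i] = Math.exp((sorted.get(i).getValue() - maxLogit) / temperature);
            total += probs[i];
        }

        // Top-k keeps at most k tokens; top-p stops once the cumulative mass reaches p.
        int limit = Math.max(1, Math.min(topK, probs.length));
        double mass = 0.0;
        int count = 0;
        while (count < limit) {
            mass += probs[count] / total;
            count++;
            if (mass >= topP) {
                break;
            }
        }

        // Renormalise the surviving tokens so that they sum to 1.
        for (int i = 0; i < count; i++) {
            kept.put(sorted.get(i).getKey(), probs[i] / total / mass);
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> logits = new LinkedHashMap<>();
        logits.put("a", 2.0);
        logits.put("b", 1.0);
        logits.put("c", 0.0);
        // The values used in the YAML above, then a looser setting for comparison.
        System.out.println(filter(logits, 0.4, 40, 0.2));
        System.out.println(filter(logits, 1.0, 40, 0.95));
    }
}
```

Under these assumptions, a low top-p combined with a low temperature leaves very few candidate tokens, which suits single-answer prompts but works against the diversity that a higher temperature is meant to provide.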
java
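// Package names below match the Spring AI 1.0.0 milestone line; adjust them for other versions.
import java.util.HashMap;
import java.util.Map;

import jakarta.annotation.Resource;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class QianWenController {

    @Resource
    private OllamaChatModel ollamaChatModel;

    // Simplest call: uses the model and options from the configuration file
    @RequestMapping(value = "/ai/ollama")
    public Object ollama(@RequestParam(value = "msg") String msg) {
        String called = ollamaChatModel.call(msg);
        System.out.println(called);
        return called;
    }

    // Overrides the model and temperature per request and reports the elapsed time
    @RequestMapping(value = "/ai/ollama2")
    public Map<String, Object> ollama2(@RequestParam(value = "msg") String msg) {
        Map<String, Object> map = new HashMap<String, Object>();
        long start = System.currentTimeMillis();
        ChatResponse chatResponse = ollamaChatModel.call(new Prompt(msg, OllamaOptions.create()
                .withModel("qwen2.5:3b-instruct-q4_K_M") // which model to use
                .withTemperature(0.4D))); // higher temperature: more random; lower: more deterministic
        String content = chatResponse.getResult().getOutput().getContent();
        long end = System.currentTimeMillis();
        map.put("content", content);
        map.put("time", (end - start) / 1000); // elapsed time in whole seconds
        return map;
    }

    // Streams the answer chunk by chunk as server-sent events
    @RequestMapping(value = "/ai/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> stream(@RequestParam(value = "msg") String msg) {
        return ollamaChatModel
                .stream(new Prompt(msg))
                .map(chunk -> chunk.getResult().getOutput().getContent());
    }
}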
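To try the /ai/stream endpoint without a browser, a small client can read the event stream line by line. This is a minimal sketch under stated assumptions: the application is assumed to run on http://localhost:8080 (the Spring Boot default port, not stated in the article), and the class and method names are hypothetical. Stripping one leading space after "data:" follows the SSE specification; whether that matches how the server encodes chunks that begin with a space is worth checking against real output.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class StreamClientSketch {

    // Builds the URL of the /ai/stream endpoint with the message URL-encoded.
    public static URI streamUri(String appBaseUrl, String msg) {
        return URI.create(appBaseUrl + "/ai/stream?msg="
                + URLEncoder.encode(msg, StandardCharsets.UTF_8));
    }

    // Extracts the payload of one SSE "data:" line; any other line yields empty.
    public static Optional<String> dataOf(String line) {
        if (!line.startsWith("data:")) {
            return Optional.empty();
        }
        String payload = line.substring("data:".length());
        if (payload.startsWith(" ")) {
            payload = payload.substring(1);
        }
        return Optional.of(payload);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the Spring Boot application listens on localhost:8080.
        String appBaseUrl = args.length > 0 ? args[0] : "http://localhost:8080";
        HttpRequest request = HttpRequest.newBuilder(streamUri(appBaseUrl, "Hello"))
                .header("Accept", "text/event-stream")
                .GET()
                .build();
        // ofLines() exposes the body as a stream of lines as they arrive.
        HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofLines())
                .body()
                .forEach(line -> dataOf(line).ifPresent(System.out::print));
        System.out.println();
    }
}
```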