【xinference】(15):在compshare上,使用docker-compose运行xinference和chatgpt-web项目,配置成功!!!

视频演示

【xinference】(15):在compshare上,使用docker-compose运行xinference和chatgpt-web项目,配置成功!!!

1,安装docker方法:

bash 复制代码
#!/bin/sh

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2

echo "install docker finish ."

sudo curl -L "https://github.com/docker/compose/releases/download/v2.28.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod 755 /usr/local/bin/docker-compose 

echo "install docker-compose finish ."

# 把当前用户加入到 docker 组;
sudo gpasswd -a $USER docker
# 更新docker组
newgrp docker
# 增加自动启动
sudo systemctl enable docker
sudo systemctl restart docker

echo "add docker user finish ."

2,然后就可以启动docker-compose了

yaml 复制代码
version: '3.5'

services:

##################### 使用xinference部署大模型 #####################

# docker 文档
# https://inference.readthedocs.io/zh-cn/latest/getting_started/using_docker_image.html#docker-image
# 使用qwen2大模型
# https://inference.readthedocs.io/zh-cn/latest/models/builtin/llm/qwen2-instruct.html
# 启动 7b模型
# xinference launch --model-engine vllm --model-name qwen2-instruct --size-in-billions 7 --model-format awq --quantization Int4

  xinf:
    restart: always
    container_name: xinf
    image: xprobe/xinference:latest
    # 使用 GPU 资源。
    deploy:
        resources:
            reservations:
                devices:
                  - driver: "nvidia"
                    count: "all"
                    capabilities: ["gpu"]
    ports:
      - 9997:9997
    environment:
      - XINFERENCE_MODEL_SRC=modelscope
    volumes:
      - ./xinf-cache/:/root/.cache
    # 命令启动 xinference 
    entrypoint: xinference-local -H 0.0.0.0 --log-level debug

## https://github.com/Chanzhaoyu/chatgpt-web
  app:
    image: chenzhaoyu94/chatgpt-web 
    container_name: app
    ports:
      - 8188:3002
    environment:
      # choose one
      OPENAI_API_KEY: sk-xxx
      # choose one
      OPENAI_ACCESS_TOKEN: xxx
      # API interface address, optional, available when OPENAI_API_KEY is set
      OPENAI_API_BASE_URL: http://xinf:9997/v1
      # API model, optional, available when OPENAI_API_KEY is set, https://platform.openai.com/docs/models
      # gpt-4, gpt-4-turbo-preview, gpt-4-0125-preview, gpt-4-1106-preview, gpt-4-0314, gpt-4-0613, gpt-4-32k, gpt-4-32k-0314, gpt-4-32k-0613, gpt-3.5-turbo-16k, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo, gpt-3.5-turbo-0301, gpt-3.5-turbo-0613, text-davinci-003, text-davinci-002, code-davinci-002
      OPENAI_API_MODEL: qwen2-instruct
      # reverse proxy, optional
      AUTH_SECRET_KEY:
      # maximum number of requests per hour, optional, unlimited by default
      MAX_REQUEST_PER_HOUR: 0
      # timeout, unit milliseconds, optional
      TIMEOUT_MS: 60000

networks:
  default:
    name: xinf-network

然后需要登陆到xinf 启动模型:

bash 复制代码
xinference launch --model-engine vllm --model-name qwen2-instruct --size-in-billions 7 --model-format awq --quantization Int4

3,启动成功之后就访问了

效果还不错!

相关推荐
小满zs27 分钟前
React-router v7 第三章(路由)
前端·react.js
Kx…………4 小时前
Day2:前端项目uniapp壁纸实战
前端·学习·uni-app·实战·项目
gqkmiss4 小时前
Git Cherry-pick:核心命令、实践详解
前端·git·前端框架·commit·cherry-pick
不想上班只想要钱5 小时前
vue面试题
前端·javascript·vue.js
拉不动的猪5 小时前
简单回顾下es6增数组方法
前端·javascript·面试
Alkaid:5 小时前
解决Long类型前端精度丢失和正常传回后端问题
java·前端·javascript·vue.js
冯浩(grow up)5 小时前
macOS可视化桌面配置docker加速器
macos·docker·容器
Code额5 小时前
Vue 框架组件间通信方式
前端·vue.js·前端框架
玖玖passion5 小时前
即时响应之道:深入浅出Server-Sent Events
前端
Micheal_Dad6 小时前
在macOS的docker中如何安装及运行ROS2
macos·docker