Deploying RAGFlow with a local LLM

CPU >= 4 cores

RAM >= 16 GB

Disk >= 50 GB

Docker >= 24.0.0 & Docker Compose >= v2.26.1
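The hardware minimums above can be checked with a short script before installing anything. This is a minimal sketch, assuming a Linux host (it reads /proc/meminfo) and assuming that the disk requirement refers to free space on the root filesystem. The function names are illustrative and not part of RAGFlow.

```python
import os
import shutil

# Minimums taken from the requirements list above.
MIN_CPU_CORES = 4
MIN_RAM_GB = 16
MIN_DISK_GB = 50


def total_ram_gb(meminfo_path="/proc/meminfo"):
    """Return total RAM in GiB, parsed from the MemTotal line (reported in kB)."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) / (1024 ** 2)
    raise RuntimeError("MemTotal not found in " + meminfo_path)


def check_requirements(cpu_cores, ram_gb, disk_gb):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores, need >= {MIN_CPU_CORES}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB, need >= {MIN_RAM_GB}")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {disk_gb:.1f} GB free, need >= {MIN_DISK_GB}")
    return problems


def main():
    # Assumption: Docker stores its data on the root filesystem.
    free_disk_gb = shutil.disk_usage("/").free / (1024 ** 3)
    problems = check_requirements(os.cpu_count() or 0, total_ram_gb(), free_disk_gb)
    print("\n".join(problems) if problems else "Hardware requirements met.")


if __name__ == "__main__":
    main()
```

MemTotal is usually slightly below the nominal memory size, so a machine sold as 16 GB may report a little less than 16; treat a near miss as a warning, not a hard failure.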

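Install Docker:

Official installation guide: https://docs.docker.com/desktop/install/ubuntu/

The DEB package must be downloaded manually and transferred to the server.

Installation guide for users in mainland China: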

https://blog.csdn.net/u011278722/article/details/137673353

Ensure vm.max_map_count >= 262144:

check:

$ sysctl vm.max_map_count

Reset vm.max_map_count to a value at least 262144 if it is not:

$ sudo sysctl -w vm.max_map_count=262144

This change is lost after a system reboot. To make it permanent, add or update the vm.max_map_count entry in /etc/sysctl.conf:

vm.max_map_count=262144
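Both the live value and the persisted one can be verified in a single pass. The following is a minimal sketch, assuming a Linux host where /proc/sys/vm/max_map_count mirrors what sysctl reports; the helper names are hypothetical.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Read the live kernel value (the same number `sysctl vm.max_map_count` prints)."""
    return int(Path(path).read_text().strip())


def persisted_max_map_count(conf_text):
    """Return the last vm.max_map_count value in sysctl.conf-style text, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments ('#' or ';' in sysctl.conf).
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "vm.max_map_count":
            value = int(val.strip())
    return value


if __name__ == "__main__":
    live = Path("/proc/sys/vm/max_map_count")
    conf = Path("/etc/sysctl.conf")
    if live.exists():
        current = read_max_map_count()
        status = "OK" if current >= REQUIRED_MAX_MAP_COUNT else "too low"
        print(f"live value: {current} ({status})")
    if conf.exists():
        saved = persisted_max_map_count(conf.read_text())
        if saved is None or saved < REQUIRED_MAX_MAP_COUNT:
            print("not persisted in /etc/sysctl.conf; the setting will be lost on reboot")
        else:
            print(f"persisted value: {saved}")
```

The sketch only inspects /etc/sysctl.conf; a value set in a file under /etc/sysctl.d/ would not be detected by it.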

Clone the repo:

$ git clone https://github.com/infiniflow/ragflow.git

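The repository cannot be downloaded from within mainland China, so for this step download it elsewhere and transfer it to the server manually.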

Pull the pre-built Docker images and start up the server:

$ cd ragflow/docker

$ chmod +x ./entrypoint.sh

$ docker compose up -d

This step also requires transferring the images manually; alternatively, build them directly from source (see "Rebuild").

Check the server status after having the server up and running:

$ docker logs -f ragflow-server

The following output confirms a successful launch of the system:


    ____                 ______ __
   / __ \ ____ _ ____ _ / ____// /____  _      __
  / /_/ // __ `// __ `// /_   / // __ \| | /| / /
 / _, _// /_/ // /_/ // __/  / // /_/ /| |/ |/ /
/_/ |_| \__,_/ \__, //_/    /_/ \____/ |__/|__/
              /____/

In your web browser, enter the IP address of your server and log in to RAGFlow.

With the default settings, you only need to enter http://IP_OF_YOUR_MACHINE (sans port number) as the default HTTP serving port 80 can be omitted when using the default configurations.
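Before opening the browser, it can be useful to confirm from another machine that the server answers on port 80. This is a minimal sketch using only the standard library; is_reachable is a hypothetical helper, and it treats any HTTP response, including an error status, as proof that the server is up.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=5.0):
    """Return True if any HTTP response comes back from `url` within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

For example, is_reachable("http://IP_OF_YOUR_MACHINE") should return True once the log shows the banner; replace IP_OF_YOUR_MACHINE with the real address.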

In service_conf.yaml, select the desired LLM factory in user_default_llm and update the API_KEY field with the corresponding API key.

See llm_api_key_setup for more information.

Rebuild:

To build the Docker images from source:

$ git clone https://github.com/infiniflow/ragflow.git

$ cd ragflow/

$ docker build -t infiniflow/ragflow:dev .

$ cd docker

$ chmod +x ./entrypoint.sh

$ docker compose up -d

Uninstall the existing CUDA toolkit and NVIDIA driver:

https://blog.alumik.cn/posts/90/

Install CUDA and the NVIDIA driver:

https://blog.hellowood.dev/posts/ubuntu-22-安装-nvdia-显卡驱动和-cuda/

Install vLLM:

https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html

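To download the Qwen2-7B-Instruct model from within mainland China, use ModelScope:

$ pip install modelscope

from modelscope import snapshot_download

model_dir = snapshot_download('qwen/Qwen2-7B-Instruct', cache_dir='/home/llmlocal/qwen/qwen/')
# snapshot_download returns the local directory that holds the model files.
print(model_dir)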

Run the LLM server:

$ python -m vllm.entrypoints.openai.api_server --model /home/llmlocal/qwen/qwen/Qwen2-7B-Instruct --host 0.0.0.0 --port 8000

Test:

$ curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me something about large language models."}
  ],
  "temperature": 0.7,
  "top_p": 0.8,
  "repetition_penalty": 1.05,
  "max_tokens": 512
}'
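The same test can be issued from Python, which is convenient when the check should run inside a script. The following is a minimal sketch using only the standard library. It assumes the server from the previous step is listening on localhost:8000 and that the reply follows the OpenAI chat-completions layout (choices[0].message.content); build_chat_request and chat are hypothetical helper names.

```python
import json
import urllib.request

# Assumptions: same host, port, and model path as the commands above.
BASE_URL = "http://localhost:8000/v1"
MODEL = "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct"


def build_chat_request(user_message, model=MODEL, base_url=BASE_URL):
    """Build the POST request that the curl command above sends."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "top_p": 0.8,
        "repetition_penalty": 1.05,
        "max_tokens": 512,
    }
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_message, timeout=120):
    """Send one chat message and return the text of the assistant's reply."""
    request = build_chat_request(user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, print(chat("Tell me something about large language models.")) should print the model's answer. The "model" value must match the name the server was started with, which here is the path passed to --model.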

Finally, set RAGFlow's MODEL_NAME = "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct". The setting is located in chat_model inside rag.
