deploy local llm ragflow

CPU >= 4 cores

RAM >= 16 GB

Disk >= 50 GB

Docker >= 24.0.0 & Docker Compose >= v2.26.1

下载docker:

官方下载方式:https://docs.docker.com/desktop/install/ubuntu/

其中 DEB package需要手动下载并传输到服务器

国内下载方式:

https://blog.csdn.net/u011278722/article/details/137673353

Ensure vm.max_map_count >= 262144:

check:

$ sysctl vm.max_map_count

Reset vm.max_map_count to a value at least 262144 if it is not:

$ sudo sysctl -w vm.max_map_count=262144

复制代码
This change will be reset after a system reboot. To ensure your change remains permanent, add or 
        update the vm.max_map_count value in /etc/sysctl.conf accordingly:
$ vm.max_map_count=262144

Clone the repo:

$ git clone https://github.com/infiniflow/ragflow.git

该步骤需要手动下载并传输,国内无法下载

Build the pre-built Docker images and start up the server:

$ cd ragflow/docker

$ chmod +x ./entrypoint.sh

$ docker compose up -d

这一步也需要手动传输或直接用用源代码build(见最后)

Check the server status after having the server up and running:

$ docker logs -f ragflow-server

The following output confirms a successful launch of the system:


/ __ \ ____ _ ____ _ / // / _ __

/ // // __ // __ // / / // __ | | /| / /

/ , // // // / / // / / // // /| |/ |/ /
/
/ || _ ,/ _ , /// // _
/ | /| _/

/____/

In your web browser, enter the IP address of your server and log in to RAGFlow.

With the default settings, you only need to enter http://IP_OF_YOUR_MACHINE (sans port number) as the default HTTP serving port 80 can be omitted when using the default configurations.

In service_conf.yaml, select the desired LLM factory in user_default_llm and update the API_KEY field with the corresponding API key.

See llm_api_key_setup for more information.

Rebuild:

To build the Docker images from source:

$ git clone https://github.com/infiniflow/ragflow.git

$ cd ragflow/

$ docker build -t infiniflow/ragflow:dev .

$ cd ragflow/docker

$ chmod +x ./entrypoint.sh

$ docker compose up -d

卸载原有cuda和驱动

https://blog.alumik.cn/posts/90/#:\~:text=Use the following command to uninstall a Toolkit,remove --purge '^nvidia-.*' sudo apt-get remove --purge '^libnvidia-.*'

CUDA 和 Nvdia driver安装:

https://blog.hellowood.dev/posts/ubuntu-22-安装-nvdia-显卡驱动和-cuda/

下载Vllm

https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html

国内下载model: /Qwen2-7B-Instruct方法:

pip install modelscope

from modelscope import snapshot_download

model_dir = snapshot_download('qwen/Qwen2-7B-Instruct', cache_dir='/home/llmlocal/qwen/qwen/')

运行llm服务器

python -m vllm.entrypoints.openai.api_server --model /home/llmlocal/qwen/qwen/Qwen2-7B-Instruct --host 0.0.0.0 --port 8000

测试:

curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{

"model": "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct",

"messages": [

{"role": "system", "content": "You are a helpful assistant."},

{"role": "user", "content": "Tell me something about large language models."}

],

"temperature": 0.7,

"top_p": 0.8,

"repetition_penalty": 1.05,

"max_tokens": 512

}'

更改ragflow的MODEL_NAME = "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct" 路径在rag里的chat_model

相关推荐
海鸥-w7 分钟前
Python(FastAPI)中ORM框架Sqlalchemy的安装及建表
python
Wonderful U1 小时前
Python+Django实战|个人博客内容管理系统:搭建轻量化、高自由度的个人动态博客CMS系统
人工智能·python·django
高洁011 小时前
智能体:你的私人数字助理
人工智能·python·数据挖掘·virtualenv·知识图谱
海鸥-w1 小时前
python(fastapi) 实现更新,新增,删除接口
android·python·fastapi
淘矿人1 小时前
DeepSeek V4对决Claude 4.8:AI模型终极横评
java·开发语言·人工智能·python·sql·php·pygame
showgea1 小时前
Python httpx封装和使用
python·httpx
Asize2 小时前
重生之我在 Vibe Coding 时代当程序员:第十二课,Prompt 不是咒语,是可以沉淀的业务接口
前端·人工智能·python
abigale032 小时前
字典 与 Python 对象 的总结
python·dict·object
星河漫步Lu2 小时前
Pycharm中部署Anaconda环境
ide·python·pycharm