deploy local llm ragflow

CPU >= 4 cores

RAM >= 16 GB

Disk >= 50 GB

Docker >= 24.0.0 & Docker Compose >= v2.26.1

下载docker:

官方下载方式:https://docs.docker.com/desktop/install/ubuntu/

其中 DEB package需要手动下载并传输到服务器

国内下载方式:

https://blog.csdn.net/u011278722/article/details/137673353

Ensure vm.max_map_count >= 262144:

check:

$ sysctl vm.max_map_count

Reset vm.max_map_count to a value at least 262144 if it is not:

$ sudo sysctl -w vm.max_map_count=262144

复制代码
This change will be reset after a system reboot. To ensure your change remains permanent, add or 
        update the vm.max_map_count value in /etc/sysctl.conf accordingly:
$ vm.max_map_count=262144

Clone the repo:

$ git clone https://github.com/infiniflow/ragflow.git

该步骤需要手动下载并传输,国内无法下载

Build the pre-built Docker images and start up the server:

$ cd ragflow/docker

$ chmod +x ./entrypoint.sh

$ docker compose up -d

这一步也需要手动传输或直接用用源代码build(见最后)

Check the server status after having the server up and running:

$ docker logs -f ragflow-server

The following output confirms a successful launch of the system:


/ __ \ ____ _ ____ _ / // / _ __

/ // // __ // __ // / / // __ | | /| / /

/ , // // // / / // / / // // /| |/ |/ /
/
/ || _ ,/ _ , /// // _
/ | /| _/

/____/

In your web browser, enter the IP address of your server and log in to RAGFlow.

With the default settings, you only need to enter http://IP_OF_YOUR_MACHINE (sans port number) as the default HTTP serving port 80 can be omitted when using the default configurations.

In service_conf.yaml, select the desired LLM factory in user_default_llm and update the API_KEY field with the corresponding API key.

See llm_api_key_setup for more information.

Rebuild:

To build the Docker images from source:

$ git clone https://github.com/infiniflow/ragflow.git

$ cd ragflow/

$ docker build -t infiniflow/ragflow:dev .

$ cd ragflow/docker

$ chmod +x ./entrypoint.sh

$ docker compose up -d

卸载原有cuda和驱动

https://blog.alumik.cn/posts/90/#:\~:text=Use the following command to uninstall a Toolkit,remove --purge '^nvidia-.*' sudo apt-get remove --purge '^libnvidia-.*'

CUDA 和 Nvdia driver安装:

https://blog.hellowood.dev/posts/ubuntu-22-安装-nvdia-显卡驱动和-cuda/

下载Vllm

https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html

国内下载model: /Qwen2-7B-Instruct方法:

pip install modelscope

from modelscope import snapshot_download

model_dir = snapshot_download('qwen/Qwen2-7B-Instruct', cache_dir='/home/llmlocal/qwen/qwen/')

运行llm服务器

python -m vllm.entrypoints.openai.api_server --model /home/llmlocal/qwen/qwen/Qwen2-7B-Instruct --host 0.0.0.0 --port 8000

测试:

curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{

"model": "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct",

"messages": [

{"role": "system", "content": "You are a helpful assistant."},

{"role": "user", "content": "Tell me something about large language models."}

],

"temperature": 0.7,

"top_p": 0.8,

"repetition_penalty": 1.05,

"max_tokens": 512

}'

更改ragflow的MODEL_NAME = "/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct" 路径在rag里的chat_model

相关推荐
F_D_Z5 分钟前
哈希表解Two Sum问题
python·算法·leetcode·哈希表
智算菩萨9 分钟前
【实战】使用讯飞星火API和Python构建一套文本摘要UI程序
开发语言·python·ui
Groundwork Explorer13 分钟前
异步框架+POLL混合方案应对ESP32 MPY多任务+TCP多连接
python·单片机
梦帮科技22 分钟前
Scikit-learn特征工程实战:从数据清洗到提升模型20%准确率
人工智能·python·机器学习·数据挖掘·开源·极限编程
xqqxqxxq31 分钟前
Java 集合框架之线性表(List)实现技术笔记
java·笔记·python
verbannung36 分钟前
Python进阶: 元类与属性查找理解
python
想用offer打牌1 小时前
LLM参数: Temperature 与 Top-p解析
人工智能·python·llm
小智RE0-走在路上1 小时前
Python学习笔记(6)--列表,元组,字符串,序列切片
笔记·python·学习
feeday1 小时前
Python 删除重复图片 优化版
开发语言·python
ss2731 小时前
Java线程池全解:工作原理、参数调优
java·linux·python