Building a Local LLM-Based Document Assistant (LangChain + llama + Streamlit)

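Overview

The document assistant in this post works like this: you upload a document, type a question into the chat box, and the LLM returns the answer.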
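To make the upload-then-ask flow concrete, here is a toy sketch of the general idea: split the document into chunks, pick the chunk most relevant to the question, and build a prompt from it. This is not DocQA's code. Plain word overlap stands in for the embedding-based search a real pipeline would use, and every function name and the prompt wording are made up for illustration.

```python
import re
from collections import Counter


def split_into_chunks(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase the text and keep only ASCII letter/digit tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(question, chunk):
    """Count question tokens that also occur in the chunk."""
    q = Counter(tokenize(question))
    c = Counter(tokenize(chunk))
    return sum(min(count, c[token]) for token, count in q.items())


def most_relevant_chunk(question, document, max_words=40):
    """Return the chunk with the highest overlap score, or "" for an empty document."""
    chunks = split_into_chunks(document, max_words)
    if not chunks:
        return ""
    return max(chunks, key=lambda chunk: overlap_score(question, chunk))


def build_prompt(question, context):
    """Assemble a prompt (hypothetical wording) that would be sent to the LLM."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

In the real app the scoring step is handled by model embeddings rather than word counts, which is why app.py loads the model twice (once as the LLM, once for embeddings).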

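Installation steps

  1. Download the code to your machine

Sample code for calling a llama model from LangChain: https://github.com/afaqueumer/DocQA (the code is not mine; credit goes to the original author)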

git clone https://github.com/afaqueumer/DocQA.git
  2. Set up the environment
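Double-click setup_env.bat
  • If nothing happens, a dependency is probably missing. Open a console and run the script manually; if Python or pip is missing, install whichever one the error message names.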
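If Python itself is already installed, a quick way to see which command-line tools are missing before running the batch file is a small check like the one below. It is only a sketch: the tool list (python, pip, git) is my assumption about what the setup needs, not something taken from setup_env.bat.

```python
import shutil


def find_missing_tools(tools):
    """Return the tools from the given list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed tool list; adjust it to whatever the setup script reports as missing.
    missing = find_missing_tools(["python", "pip", "git"])
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```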

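If installing llama-cpp-python fails

(1) Download Visual Studio

(2) Open Visual Studio and go to Tools > Get Tools and Features to add the C++ build tools that llama-cpp-python needs in order to compile

(3) Wait for the download to finish, then run setup_env.bat again

If you still get an error such as [error C2061: syntax error: ], your Visual Studio version may be too old. I started with 2019 and later switched to 2022.

After updating to 2022, repeat the steps above.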

  3. Download a reliable model
    https://huggingface.co/TheBloke/Llama-2-7B-GGUF
    This post uses the smallest file in TheBloke/Llama-2-7B-GGUF
  4. Go into the DocQA directory and edit app.py

Original

llm = LlamaCpp(model_path="./models/llama-7b.ggmlv3.q4_0.bin")
embeddings = LlamaCppEmbeddings(model_path="models/llama-7b.ggmlv3.q4_0.bin")

Change both paths to the location of the model you downloaded, for example llama-2-7b.Q2_K.gguf

llm = LlamaCpp(model_path="../llama.cpp/models/7B/llama-2-7b.Q2_K.gguf")
embeddings = LlamaCppEmbeddings(model_path="../llama.cpp/models/7B/llama-2-7b.Q2_K.gguf")
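Because the model path is relative, it is easy to point app.py at a file that is not there, or at an older GGML .bin file instead of a GGUF one. The sketch below resolves the path and checks the first four bytes, which are the ASCII characters GGUF in every GGUF file. The function name and the status strings are hypothetical.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def check_model_path(model_path):
    """Return (absolute_path, status); status is 'missing', 'not-gguf', or 'ok'."""
    path = Path(model_path).resolve()  # relative paths resolve against the current directory
    if not path.is_file():
        return path, "missing"
    with path.open("rb") as handle:
        magic = handle.read(4)
    return path, ("ok" if magic == GGUF_MAGIC else "not-gguf")


if __name__ == "__main__":
    resolved, status = check_model_path("../llama.cpp/models/7B/llama-2-7b.Q2_K.gguf")
    print(resolved, status)
```

Running it from inside the DocQA directory shows the absolute path the app would actually try to open.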
  5. Run
Double-click run_app.bat
  6. Test

Prepare a txt document

As of October this year, there were nearly 2,500 geographical indication products in China
The reporter learned from the State Intellectual Property Office that in recent years, the quantity and quality of China's geographical indication products have risen rapidly. As of October this year, China has approved a total of 2,495 geographical indication products, and approved 7,013 geographical indications to be registered as collective trademarks and certification trademarks. In 2021, the direct output value of GI products exceeded 700 billion yuan.
In recent years, the State Intellectual Property Office has conscientiously implemented the decisions and arrangements of the CPC Central Committee and the State Council, actively and steadily promoted the unified acceptance channels, unified special signs, unified announcements, unified protection and supervision, unified foreign cooperation and other work, and further improved the system of protection, management and application of geographical indications.
In terms of institutional construction, the State Intellectual Property Office issued the "14th Five-Year Plan for the Protection and Use of Geographical Indications", formulated and issued a unified special indication for geographical indications, revised and issued the "Measures for the Protection of Foreign Geographical Indication Products", and launched the legislative work on geographical indications; In the year, a total of 1,416 cases of infringement of geographical indications were investigated and dealt with across the country, involving an amount of 9.28 million yuan and a fine of 13.023 million yuan

Upload it on the page

Without a GPU this is painful: it runs very slowly
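On a CPU-only machine, two settings that may help a little are the thread count and the maximum answer length. The sketch below builds keyword arguments for LangChain's LlamaCpp wrapper using its n_threads and max_tokens parameters. The helper name and the default values (128 tokens, one core held back) are my assumptions, and I have not measured how much they help with this model.

```python
import os


def cpu_llm_settings(model_path, max_tokens=128, reserve_cores=1):
    """Build keyword arguments for LlamaCpp aimed at a CPU-only machine."""
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    return {
        "model_path": model_path,
        "n_threads": max(1, cores - reserve_cores),  # keep a core free for the web UI
        "max_tokens": max_tokens,  # shorter answers finish sooner
    }


# In app.py, where LlamaCpp is already imported, this could be used as:
# llm = LlamaCpp(**cpu_llm_settings("../llama.cpp/models/7B/llama-2-7b.Q2_K.gguf"))
```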

Note: do not ask questions in Chinese. This model does not seem to support Chinese; switching to a model that does should fix it.
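One way to catch this early is to check the question for Chinese characters before it reaches the model. This is a minimal sketch, not part of DocQA: the function name is hypothetical, and it covers only the basic CJK Unified Ideographs range.

```python
import re

# Basic CJK Unified Ideographs block; rarer extension blocks are not covered.
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def contains_chinese(text):
    """Return True if the text contains at least one common Chinese character."""
    return CJK_PATTERN.search(text) is not None
```

In the Streamlit app, a check like this could run on the question text and show a message with st.warning instead of sending the question to the model.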
