Notes on setting up a local Langchain-Chatchat knowledge base on Windows 10

1. Clone the source

git clone https://github.com/chatchat-space/Langchain-Chatchat.git

2. Prepare the environment

conda create -n Chatchat python=3.10
conda activate Chatchat

3. Configure the model

In model_config.py:

# Name of the embedding model to use
EMBEDDING_MODEL = "m3e-base"

LLM_MODELS = ["zhipu-api"]

ONLINE_LLM_MODEL = {
    # To register and obtain an API key, go to http://open.bigmodel.cn
    "zhipu-api": {
        "api_key": "your own Zhipu API key",
        "version": "chatglm_turbo",  # options include "chatglm_turbo"
        "provider": "ChatGLMWorker",
    },
}

MODEL_PATH = {
    "embed_model": {
        "zhipu-api": "lucidrains/GLM-130B",
        # Raw string so the backslashes in the Windows path stay literal
        "m3e-base": r"G:\AIGC\Langchain\m3e-base-main",
    },

    "llm_model": {
        "zhipu-api": "lucidrains/GLM-130B",
    },
}
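
Windows paths written as ordinary Python string literals depend on which backslash sequences happen to be escapes. The m3e-base path above works because \A, \L and \m are not recognized escapes, but newer Python versions warn about such sequences, and a folder name starting with t or n would silently change the string. A minimal sketch with hypothetical paths (not the author's) showing the difference a raw string makes:

```python
from pathlib import PureWindowsPath

# Hypothetical paths, chosen only to illustrate the escape problem.
plain = "G:\tools\new-model"   # "\t" and "\n" are turned into tab and newline
raw = r"G:\tools\new-model"    # raw string: every backslash stays literal

print(repr(plain))                  # shows the embedded tab and newline
print(PureWindowsPath(raw).parts)   # ('G:\\', 'tools', 'new-model')
```

Writing local model paths as raw strings (or with forward slashes) avoids this class of problem.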

4. Errors encountered

Running python init_database.py --recreate-vs failed to initialize the database:

github.com/chatchat-sp...

github.com/chatchat-sp...

github.com/chatchat-sp...

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>
(langchain) G:\AIGC\Langchain\Langchain-Chatchat>python init_database.py --recreate-vs
recreating all vector stores
2023-12-19 17:02:47,732 - faiss_cache.py[line:80] - INFO: loading vector store in 'samples/vector_store/bge-large-zh' from disk.
2023-12-19 17:02:51,277 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: BAAI/bge-large-zh
2023-12-19 17:03:33,432 - embeddings_api.py[line:39] - ERROR: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/BAAI/bge-large-zh (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001F06C868BE0>, 'Connection to huggingface.co timed out. (connect timeout=None)'))"), '(Request ID: 149213c1-2ec8-4340-90cd-f6d60fdde1da)')
AttributeError: 'NoneType' object has no attribute 'conjugate'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "G:\AIGC\Langchain\Langchain-Chatchat\init_database.py", line 108, in <module>
    folder2db(kb_names=args.kb_name, mode="recreate_vs", embed_model=args.embed_model)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\migrate.py", line 121, in folder2db
    kb.create_kb()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 81, in create_kb
    self.do_create_kb()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\faiss_kb_service.py", line 47, in do_create_kb
    self.load_vector_store()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\faiss_kb_service.py", line 28, in load_vector_store
    return kb_faiss_pool.load_vector_store(kb_name=self.kb_name,
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_cache\faiss_cache.py", line 90, in load_vector_store
    vector_store = self.new_vector_store(embed_model=embed_model, embed_device=embed_device)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_cache\faiss_cache.py", line 48, in new_vector_store
    vector_store = FAISS.from_documents([doc], embeddings, normalize_L2=True)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\vectorstores.py", line 510, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\vectorstores\faiss.py", line 911, in from_texts
    embeddings = embedding.embed_documents(texts)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 399, in embed_documents
    return normalize(embeddings).tolist()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 38, in normalize
    norm = np.linalg.norm(embeddings, axis=1)
  File "<__array_function__ internals>", line 200, in norm
  File "F:\Anaconda3\envs\langchain\lib\site-packages\numpy\linalg\linalg.py", line 2541, in norm
    s = (x.conj() * x).real
TypeError: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method

Fix:

Change EMBEDDING_MODEL to bge-large-zh.

Then clear the knowledge_base directory and reinitialize the vector store.
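
In the traceback above, the request to huggingface.co timed out, so the embedding step appears to have produced None, and the later norm computation failed with a confusing TypeError. A minimal sketch, in plain Python rather than the project's NumPy code, of an L2 normalization that fails early with a clearer message. safe_normalize is a hypothetical helper, not a function in Langchain-Chatchat:

```python
import math
from typing import List, Optional, Sequence


def safe_normalize(
    embeddings: Optional[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """L2-normalize each row; raise early if the embedding step returned nothing."""
    if embeddings is None:
        raise ValueError(
            "embedding model returned None; check that the model actually loaded"
        )
    result = []
    for row in embeddings:
        norm = math.sqrt(sum(x * x for x in row))
        # Leave all-zero rows unchanged to avoid dividing by zero.
        result.append([x / norm for x in row] if norm else list(row))
    return result


print(safe_normalize([[3.0, 4.0]]))  # [[0.6, 0.8]]
```

With a guard like this, a failed model download surfaces as a message about the embedding model instead of a NumPy ufunc error.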

Start startup.py:

python startup.py -a

github.com/chatchat-sp...

pip install zhipuai -i https://pypi.tuna.tsinghua.edu.cn/simple

2023-12-19 15:44:46,117 - utils.py[line:24] - ERROR: object of type 'NoneType' has no len()
Traceback (most recent call last):
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\utils.py", line 22, in wrap_done
    await fn
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\base.py", line 381, in acall
    raise e
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\base.py", line 375, in acall
    await self._acall(inputs, run_manager=run_manager)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\llm.py", line 275, in _acall
    response = await self.agenerate([inputs], run_manager=run_manager)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\llm.py", line 142, in agenerate
    return await self.llm.agenerate_prompt(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 501, in agenerate_prompt
    return await self.agenerate(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 461, in agenerate
    raise exceptions[0]
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 564, in _agenerate_with_cache
    return await self._agenerate(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_community\chat_models\openai.py", line 518, in _agenerate
    return await agenerate_from_stream(stream_iter)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 81, in agenerate_from_stream
    async for chunk in stream:
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_community\chat_models\openai.py", line 489, in _astream
    if len(chunk["choices"]) == 0:
TypeError: object of type 'NoneType' has no len()
2023-12-19 15:44:46,122 - utils.py[line:27] - ERROR: TypeError: Caught exception: object of type 'NoneType' has no len()

Start the web UI:

streamlit run webui.py

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>
(langchain) G:\AIGC\Langchain\Langchain-Chatchat>streamlit run webui.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.43.195:8501

2023-12-19 14:21:27,722 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:29,726 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:31,032 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:31,729 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:33,035 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:33,838 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,041 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,503 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,843 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:36,099 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37,519 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37,857 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37.859 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
2023-12-19 14:21:38,116 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:39,526 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:40,131 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:41,635 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:42,240 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:43,641 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:44,248 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:45,647 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:45.647 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
2023-12-19 14:21:46,262 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:46.262 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
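
WinError 10061 means nothing was listening at the address the web UI tried to reach. One plausible reading, given that python startup.py -a starts the Chatchat API server on 127.0.0.1:7861 together with the web UI, is that streamlit run webui.py was launched on its own while the API server was not running. A minimal sketch, using only the standard library, of checking the port first. is_port_open is a hypothetical helper, and 7861 is the port shown in the startup output of these notes:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open("127.0.0.1", 7861):
        print("API server is reachable; the web UI can be started")
    else:
        print("API server is not reachable; start it first (python startup.py -a)")
```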

Creating a knowledge base failed:

2023-12-20 10:43:16,728 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: G:\AIGC\Langchain\m3e-base-main
2023-12-20 10:43:21,466 - embeddings_api.py[line:39] - ERROR: Error while deserializing header: HeaderTooLarge
2023-12-20 10:43:21,483 - kb_api.py[line:34] - ERROR: TypeError: 创建知识库出错: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method

Fix:

Change EMBEDDING_MODEL to bge-large-zh and download the model locally:

$ git lfs install
$ git clone https://huggingface.co/BAAI/bge-large-zh

Then clear the knowledge_base directory and run python init_database.py --recreate-vs to reinitialize the vector store. This resolved all of the problems above.
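
The HeaderTooLarge error is commonly seen when a weights file is not the real binary but a small Git LFS pointer stub, which is what a clone made without git lfs leaves behind. That would explain why git lfs install is part of the fix, although the log above does not confirm it. A minimal sketch that scans a model directory for such stubs. find_lfs_pointers is a hypothetical helper, and the path is the m3e-base directory from the configuration in section 3:

```python
from pathlib import Path
from typing import List

# Git LFS pointer files are small text files that begin with this line.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir: str) -> List[str]:
    """Return the names of files under model_dir that are LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).rglob("*")):
        if path.is_file():
            with path.open("rb") as fh:
                if fh.read(len(LFS_PREFIX)) == LFS_PREFIX:
                    stubs.append(path.name)
    return stubs


if __name__ == "__main__":
    model_dir = r"G:\AIGC\Langchain\m3e-base-main"
    if Path(model_dir).is_dir():
        print(find_lfs_pointers(model_dir))
```

A non-empty result suggests the weights were never downloaded; running git lfs pull inside the cloned model directory would normally fetch them.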

5. Startup output

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>python startup.py -a

==============================Langchain-Chatchat Configuration==============================
操作系统:Windows-10-10.0.18363-SP0.
python版本:3.10.12 | packaged by Anaconda, Inc. | (main, Jul  5 2023, 19:01:18) [MSC v.1916 64 bit (AMD64)]
项目版本:v0.2.8
langchain版本:0.0.344. fastchat版本:0.2.34

当前使用的分词器:ChineseRecursiveTextSplitter
当前启动的LLM模型:['zhipu-api'] @ cpu
{'api_key': '你自己的apikey',
 'device': 'cpu',
 'host': '127.0.0.1',
 'infer_turbo': False,
 'model_path': 'lucidrains/GLM-130B',
 'online_api': True,
 'port': 21001,
 'provider': 'ChatGLMWorker',
 'version': 'chatglm_turbo',
 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
当前Embbedings模型: m3e-base @ cpu
==============================Langchain-Chatchat Configuration==============================

2023-12-20 10:09:39,873 - startup.py[line:650] - INFO: 正在启动服务:
2023-12-20 10:09:39,873 - startup.py[line:651] - INFO: 如需查看 llm_api 日志,请前往 G:\AIGC\Langchain\Langchain-Chatchat\logs
2023-12-20 10:09:52 | INFO | model_worker | Register to controller
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Started server process [27468]
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Waiting for application startup.
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Application startup complete.
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Uvicorn running on http://127.0.0.1:20000 (Press CTRL+C to quit)
INFO:     Started server process [25024]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:7861 (Press CTRL+C to quit)

==============================Langchain-Chatchat Configuration==============================
操作系统:Windows-10-10.0.18363-SP0.
python版本:3.10.12 | packaged by Anaconda, Inc. | (main, Jul  5 2023, 19:01:18) [MSC v.1916 64 bit (AMD64)]
项目版本:v0.2.8
langchain版本:0.0.344. fastchat版本:0.2.34

当前使用的分词器:ChineseRecursiveTextSplitter
当前启动的LLM模型:['zhipu-api'] @ cpu
{'api_key': '你自己的apikey',
 'device': 'cpu',
 'host': '127.0.0.1',
 'infer_turbo': False,
 'model_path': 'lucidrains/GLM-130B',
 'online_api': True,
 'port': 21001,
 'provider': 'ChatGLMWorker',
 'version': 'chatglm_turbo',
 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
当前Embbedings模型: m3e-base @ cpu

服务端运行信息:
    OpenAI API Server: http://127.0.0.1:20000/v1
    Chatchat  API  Server: http://127.0.0.1:7861
    Chatchat WEBUI Server: http://127.0.0.1:8501
==============================Langchain-Chatchat Configuration==============================

  You can now view your Streamlit app in your browser.

  URL: http://127.0.0.1:8501

The start page looks like this:

6. Notes

New knowledge base names cannot contain Chinese characters, and parsing imported PDFs is slow.
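
A minimal sketch of rejecting such names before creating a knowledge base. It assumes that restricting names to ASCII letters, digits, underscores and hyphens is a safe subset; the project's actual rule is not stated in these notes, and is_safe_kb_name is a hypothetical helper:

```python
import re

# Assumed safe subset: ASCII letters, digits, underscore and hyphen.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_kb_name(name: str) -> bool:
    """Return True if the whole name consists of characters in the safe subset."""
    return _SAFE_NAME.fullmatch(name) is not None


print(is_safe_kb_name("my_kb"))   # True
print(is_safe_kb_name("知识库"))  # False
```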

