Setting Up a Local Langchain-Chatchat Knowledge Base on Windows 10: Notes

1. Clone the source code

git clone https://github.com/chatchat-space/Langchain-Chatchat.git

2. Prepare the environment

conda create -n Chatchat python=3.10
conda activate Chatchat

3. Configure the models

In model_config.py:

# Name of the embedding model to use
EMBEDDING_MODEL = "m3e-base"

LLM_MODELS = ["zhipu-api"]

ONLINE_LLM_MODEL = {
    # Register and obtain an API key at http://open.bigmodel.cn
    "zhipu-api": {
        "api_key": "your own Zhipu API key",
        "version": "chatglm_turbo",  # available options include "chatglm_turbo"
        "provider": "ChatGLMWorker",
    },
}

MODEL_PATH = {
    "embed_model": {
        "zhipu-api": "lucidrains/GLM-130B",
        # Raw string, so the backslashes in the Windows path are not read as escapes
        "m3e-base": r"G:\AIGC\Langchain\m3e-base-main",
    },
    "llm_model": {
        "zhipu-api": "lucidrains/GLM-130B",
    },
}
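The logs later in this post show the embedding model being loaded once from a local directory and once by a name that is looked up on huggingface.co. A quick check of which case a configured value falls into can save a failed run. This is only a minimal sketch: `classify_model_path` is a hypothetical helper, not part of Langchain-Chatchat, and it assumes that any value that is not an existing directory will be resolved remotely.

```python
import os


def classify_model_path(value: str) -> str:
    """Return "local" if value is an existing directory, otherwise "remote"."""
    # Assumption: an existing directory is loaded from disk, while anything
    # else is treated as a repo id and may need network access.
    return "local" if os.path.isdir(value) else "remote"


# Hypothetical usage with the two values that appear in this post.
for name, value in {
    "m3e-base": r"G:\AIGC\Langchain\m3e-base-main",
    "bge-large-zh": "BAAI/bge-large-zh",
}.items():
    print(name, "->", classify_model_path(value))
```

If a model that is meant to be local is reported as "remote", the path in MODEL_PATH is probably wrong or the download is incomplete.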

4. Errors encountered

Running python init_database.py --recreate-vs failed to initialize the database. Related links, followed by the full log:

github.com/chatchat-sp...

github.com/chatchat-sp...

github.com/chatchat-sp...

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>
(langchain) G:\AIGC\Langchain\Langchain-Chatchat>python init_database.py --recreate-vs
recreating all vector stores
2023-12-19 17:02:47,732 - faiss_cache.py[line:80] - INFO: loading vector store in 'samples/vector_store/bge-large-zh' from disk.
2023-12-19 17:02:51,277 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: BAAI/bge-large-zh
2023-12-19 17:03:33,432 - embeddings_api.py[line:39] - ERROR: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/BAAI/bge-large-zh (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001F06C868BE0>, 'Connection to huggingface.co timed out. (connect timeout=None)'))"), '(Request ID: 149213c1-2ec8-4340-90cd-f6d60fdde1da)')
AttributeError: 'NoneType' object has no attribute 'conjugate'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "G:\AIGC\Langchain\Langchain-Chatchat\init_database.py", line 108, in <module>
    folder2db(kb_names=args.kb_name, mode="recreate_vs", embed_model=args.embed_model)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\migrate.py", line 121, in folder2db
    kb.create_kb()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 81, in create_kb
    self.do_create_kb()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\faiss_kb_service.py", line 47, in do_create_kb
    self.load_vector_store()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\faiss_kb_service.py", line 28, in load_vector_store
    return kb_faiss_pool.load_vector_store(kb_name=self.kb_name,
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_cache\faiss_cache.py", line 90, in load_vector_store
    vector_store = self.new_vector_store(embed_model=embed_model, embed_device=embed_device)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_cache\faiss_cache.py", line 48, in new_vector_store
    vector_store = FAISS.from_documents([doc], embeddings, normalize_L2=True)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\vectorstores.py", line 510, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\vectorstores\faiss.py", line 911, in from_texts
    embeddings = embedding.embed_documents(texts)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 399, in embed_documents
    return normalize(embeddings).tolist()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 38, in normalize
    norm = np.linalg.norm(embeddings, axis=1)
  File "<__array_function__ internals>", line 200, in norm
  File "F:\Anaconda3\envs\langchain\lib\site-packages\numpy\linalg\linalg.py", line 2541, in norm
    s = (x.conj() * x).real
TypeError: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method

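Cause: as the log shows, the embedding model BAAI/bge-large-zh could not be fetched from huggingface.co (the connection timed out), so the embedding step returned None and the normalization call failed.

Solution:

Change EMBEDDING_MODEL to bge-large-zh.

Then empty the knowledge_base directory and re-initialize the vector store.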
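The final TypeError in the traceback above is only a symptom: None reached the normalization step because the embedding call had already failed. A guard in front of the normalization would surface the real problem earlier. The following is a minimal sketch, not the project's code: `normalize_rows` is a hypothetical pure-Python stand-in for the NumPy-based normalize in the traceback, written without NumPy so that it runs on its own.

```python
import math
from typing import List, Optional


def normalize_rows(embeddings: Optional[List[List[float]]]) -> List[List[float]]:
    """L2-normalize each row, failing early with a clear message on None."""
    if embeddings is None:
        # This is the situation from the log: the model never loaded,
        # so there are no vectors to normalize.
        raise ValueError("no embeddings returned; check that the embedding model loaded")
    result = []
    for row in embeddings:
        norm = math.sqrt(sum(x * x for x in row))
        # Leave all-zero rows unchanged to avoid dividing by zero.
        result.append([x / norm for x in row] if norm else list(row))
    return result
```

With such a check, the failure message points at model loading rather than at a NumPy internals error.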

Start the services with startup.py:

python startup.py -a

github.com/chatchat-sp...

Install the zhipuai package (here via the Tsinghua PyPI mirror):

pip install zhipuai -i https://pypi.tuna.tsinghua.edu.cn/simple
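To confirm that the package is visible to the Python interpreter of the active conda environment, a small check can be run first. This is a minimal sketch; `has_module` is a hypothetical helper built on the standard library's importlib.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(name) is not None


if not has_module("zhipuai"):
    print("zhipuai is not installed in this environment; run the pip command above.")
```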

2023-12-19 15:44:46,117 - utils.py[line:24] - ERROR: object of type 'NoneType' has no len()
Traceback (most recent call last):
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\utils.py", line 22, in wrap_done
    await fn
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\base.py", line 381, in acall
    raise e
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\base.py", line 375, in acall
    await self._acall(inputs, run_manager=run_manager)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\llm.py", line 275, in _acall
    response = await self.agenerate([inputs], run_manager=run_manager)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\llm.py", line 142, in agenerate
    return await self.llm.agenerate_prompt(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 501, in agenerate_prompt
    return await self.agenerate(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 461, in agenerate
    raise exceptions[0]
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 564, in _agenerate_with_cache
    return await self._agenerate(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_community\chat_models\openai.py", line 518, in _agenerate
    return await agenerate_from_stream(stream_iter)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 81, in agenerate_from_stream
    async for chunk in stream:
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_community\chat_models\openai.py", line 489, in _astream
    if len(chunk["choices"]) == 0:
TypeError: object of type 'NoneType' has no len()
2023-12-19 15:44:46,122 - utils.py[line:27] - ERROR: TypeError: Caught exception: object of type 'NoneType' has no len()

Start the web UI:

streamlit run webui.py

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>
(langchain) G:\AIGC\Langchain\Langchain-Chatchat>streamlit run webui.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.43.195:8501

2023-12-19 14:21:27,722 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:29,726 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:31,032 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:31,729 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:33,035 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:33,838 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,041 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,503 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,843 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:36,099 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37,519 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37,857 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37.859 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
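WinError 10061 means the connection was refused: nothing was listening on the API server's port when the web UI called /llm_model/list_running_models. A quick reachability check before launching the web UI makes this visible. This is a minimal sketch: the address 127.0.0.1:7861 is taken from the startup output shown later in this post, and `is_port_open` is a hypothetical helper.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open("127.0.0.1", 7861):
        print("API server is reachable; the web UI can be started.")
    else:
        print("API server is not reachable; start it first, e.g. python startup.py -a")
```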

Creating a knowledge base fails:

2023-12-20 10:43:16,728 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: G:\AIGC\Langchain\m3e-base-main
2023-12-20 10:43:21,466 - embeddings_api.py[line:39] - ERROR: Error while deserializing header: HeaderTooLarge
2023-12-20 10:43:21,483 - kb_api.py[line:34] - ERROR: TypeError: 创建知识库出错: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method

Solution:

Change EMBEDDING_MODEL to bge-large-zh and download the model locally with Git LFS:

$ git lfs install
$ git clone https://huggingface.co/BAAI/bge-large-zh

Then empty knowledge_base and run python init_database.py --recreate-vs to re-initialize the vector store. This resolved all of the problems above.
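A HeaderTooLarge error while loading weights often means the weight file is not a real model file, for example a Git LFS pointer stub left behind when a repository is downloaded without git lfs. The log does not confirm that this was the cause here, so treat the following as a diagnostic sketch under that assumption. `find_lfs_pointers` is a hypothetical helper that looks for the standard first line of an LFS pointer file.

```python
from pathlib import Path
from typing import List

# Every Git LFS pointer file begins with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir: str) -> List[str]:
    """Return relative paths of files under model_dir that are LFS pointer stubs."""
    pointers = []
    for path in sorted(Path(model_dir).rglob("*")):
        # Skip Git's own metadata and anything that is not a regular file.
        if ".git" in path.parts or not path.is_file():
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                pointers.append(str(path.relative_to(model_dir)))
    return pointers
```

An empty result for the model directory suggests the weights were actually downloaded; any listed file still needs to be fetched with Git LFS.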

5. Startup output

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>python startup.py -a

==============================Langchain-Chatchat Configuration==============================
操作系统:Windows-10-10.0.18363-SP0.
python版本:3.10.12 | packaged by Anaconda, Inc. | (main, Jul  5 2023, 19:01:18) [MSC v.1916 64 bit (AMD64)]
项目版本:v0.2.8
langchain版本:0.0.344. fastchat版本:0.2.34

当前使用的分词器:ChineseRecursiveTextSplitter
当前启动的LLM模型:['zhipu-api'] @ cpu
{'api_key': 'your own API key',
 'device': 'cpu',
 'host': '127.0.0.1',
 'infer_turbo': False,
 'model_path': 'lucidrains/GLM-130B',
 'online_api': True,
 'port': 21001,
 'provider': 'ChatGLMWorker',
 'version': 'chatglm_turbo',
 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
当前Embbedings模型: m3e-base @ cpu
==============================Langchain-Chatchat Configuration==============================

2023-12-20 10:09:39,873 - startup.py[line:650] - INFO: 正在启动服务:
2023-12-20 10:09:39,873 - startup.py[line:651] - INFO: 如需查看 llm_api 日志,请前往 G:\AIGC\Langchain\Langchain-Chatchat\logs
2023-12-20 10:09:52 | INFO | model_worker | Register to controller
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Started server process [27468]
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Waiting for application startup.
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Application startup complete.
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Uvicorn running on http://127.0.0.1:20000 (Press CTRL+C to quit)
INFO:     Started server process [25024]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:7861 (Press CTRL+C to quit)

==============================Langchain-Chatchat Configuration==============================
操作系统:Windows-10-10.0.18363-SP0.
python版本:3.10.12 | packaged by Anaconda, Inc. | (main, Jul  5 2023, 19:01:18) [MSC v.1916 64 bit (AMD64)]
项目版本:v0.2.8
langchain版本:0.0.344. fastchat版本:0.2.34

当前使用的分词器:ChineseRecursiveTextSplitter
当前启动的LLM模型:['zhipu-api'] @ cpu
{'api_key': 'your own API key',
 'device': 'cpu',
 'host': '127.0.0.1',
 'infer_turbo': False,
 'model_path': 'lucidrains/GLM-130B',
 'online_api': True,
 'port': 21001,
 'provider': 'ChatGLMWorker',
 'version': 'chatglm_turbo',
 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
当前Embbedings模型: m3e-base @ cpu

服务端运行信息:
    OpenAI API Server: http://127.0.0.1:20000/v1
    Chatchat  API  Server: http://127.0.0.1:7861
    Chatchat WEBUI Server: http://127.0.0.1:8501
==============================Langchain-Chatchat Configuration==============================

  You can now view your Streamlit app in your browser.

  URL: http://127.0.0.1:8501

After startup, the web UI start page opens at the URL above (screenshot omitted).

6. Notes

New knowledge base names cannot contain Chinese characters, and parsing imported PDFs is slow.
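Since Chinese names are not accepted, a name can be checked before a knowledge base is created. This is a minimal sketch: `is_safe_kb_name` is a hypothetical helper, and the allowed character set (ASCII letters, digits, underscore, hyphen) is a conservative assumption rather than the project's documented rule.

```python
import re

# Assumed safe set: ASCII letters, digits, underscore and hyphen.
_KB_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_kb_name(name: str) -> bool:
    """Return True if name is non-empty and uses only the assumed safe characters."""
    return _KB_NAME_RE.fullmatch(name) is not None
```

For example, `is_safe_kb_name("samples")` passes, while a name written in Chinese characters is rejected.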

