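Background

Pain points we run into at work:

Visit records are too long: how do we extract a summary from them?
When writing a work summary, the source material is scattered across image assets, meeting recordings, documents, and so on: how do we pull out the information we actually need?

Today these problems usually require employees to spend a lot of time and effort consolidating information before they get anything useful out of it. Could AI do the initial filtering and summarizing, and so improve employee productivity?

With these pain points in mind, I started exploring. An AI office assistant would need to meet the following conditions:

The AI model must run locally, with no network connection required, to keep data secure
The AI model can read internal company documents, so that its answers lean toward the company's own business
It runs reasonably fast on a consumer-grade computer, keeping setup and running costs low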

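A local private model: GPT4All

GPT4All is an open-source large language model project. Users can choose among different AI models and run them on consumer-grade computers without a network connection, so a deployment can stay fully private and business data is protected. The project has 59.2k stars and a healthy developer ecosystem. GPT4All can be installed as a desktop client or used from code.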

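Installing the client

1. Download the client: gpt4all.io/index.html

2. Download and run an AI model

I downloaded two models, one 7B and one 13B: Mistral OpenOrca (trained by Mistral AI, 7 billion parameters, size: 3.83GB) and Orca 2 (trained by Microsoft, 13 billion parameters, size: 6.86GB)

What model size means: 7B and 13B refer to the number of trainable parameters in the model. The "B" stands for billion.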
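As a rough illustration of how parameter count relates to the download sizes listed above, the sketch below estimates file size from the number of parameters and the bits stored per weight. The 4.5 bits-per-weight default is an assumption for Q4_0-style 4-bit quantization (4-bit values plus a per-block scale). Real files also carry metadata and may keep some tensors at other precisions, so treat the result as a ballpark figure only.

```python
def estimate_size_gib(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Approximate on-disk size in GiB of a quantized model.

    bits_per_weight=4.5 is an assumed average for Q4_0-style quantization.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal parameter counts for the two model sizes discussed in the article.
for name, n_params in [("7B", 7e9), ("13B", 13e9)]:
    print(f"{name}: ~{estimate_size_gib(n_params):.2f} GiB")
```

Under these assumptions the estimates fall in the same ballpark as the 3.83GB and 6.86GB files mentioned above, which is why a 7B model fits comfortably in the memory of an ordinary laptop.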

I asked both models the same question: If you had your own consciousness and thoughts, would you choose to become human?

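The 7B model's answer:

The 13B model's answer:

The two models approach the question from different angles. Personally I prefer the 7B model's reply: it answers the question directly and then raises a question of its own.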

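Hardware and runtime environment

Responses are somewhat slow, which is probably down to the machine's performance.

CPU usage before asking a question:

CPU usage while the AI model is answering:

The machine is a Mac with a 2.9GHz six-core i9 and 32GB of RAM. A comparable configuration on Alibaba Cloud costs roughly 1024 RMB per month.

Of course, you could also buy a GPU and run Meta's recently popular open-source model, Llama2.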

Running GPT4All from Python

A simple demo to get started

Documentation: docs.gpt4all.io/

The demo uses a model with 7B parameters: mistral-7b-openorca.Q4_0.gguf

The model file is 3.83GB. The Python code that calls it:


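from gpt4all import GPT4All

# Directory that holds the downloaded model file
model_path = './models'

# Load the local model; allow_download=False avoids any network download
model = GPT4All(model_name='mistral-7b-openorca.Q4_0.gguf', model_path=model_path, allow_download=False)

# Generate a completion of up to 2000 tokens and print it
response = model.generate("The capital of China is ", max_tokens=2000)
print(response)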

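The prompt given to the model: "The capital of China is "

The model's answer:

Generation took about 10s, which can be optimized later. The local machine's state while running: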
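To put a number on the response time rather than eyeballing it, a small timing wrapper is enough. The sketch below assumes only that the model object exposes a generate(prompt, max_tokens=...) method, as the GPT4All object in the demo does. EchoModel is a hypothetical stand-in so the example runs without a model file.

```python
import time
from typing import Any, Tuple


def timed_generate(model: Any, prompt: str, max_tokens: int = 200) -> Tuple[str, float]:
    """Call model.generate(...) and return (text, elapsed_seconds)."""
    start = time.perf_counter()
    text = model.generate(prompt, max_tokens=max_tokens)
    elapsed = time.perf_counter() - start
    return text, elapsed


class EchoModel:
    """Hypothetical stand-in for a real model, used only for illustration."""

    def generate(self, prompt: str, max_tokens: int = 200) -> str:
        return prompt + "Beijing."


text, seconds = timed_generate(EchoModel(), "The capital of China is ", max_tokens=50)
print(f"{text!r} generated in {seconds:.4f}s")
```

Passing the real model object from the demo instead of EchoModel would measure the actual generation time, and lowering max_tokens is one simple way to see how output length affects latency.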

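2. Using LangChain to have the AI extract content from specified documents

LangChain documentation: python.langchain.com/docs/get_st...

LangChain is used here to extend the model's capabilities: after "reading" a given PDF file, the model answers questions based on that file's content. This steers the model's answers toward the company's own data. Note that the approach retrieves relevant passages from the PDF and supplies them in the prompt; the model's weights are not changed.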
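The retrieval idea can be sketched without any libraries. The toy example below splits text into sentences, "embeds" each one as word counts, and ranks them by cosine similarity to the question. The word-count embedding and the sample text are illustrative assumptions; the actual setup in this article uses a neural embedding model and a FAISS index.

```python
import math
import re
from collections import Counter
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split text into sentence chunks on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a neural embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical document text, for illustration only.
doc = "The office is in Hangzhou. Lili travels to Paris next week. Lunch is served at noon."
print(top_k("Where does Lili travel?", split_sentences(doc), k=1))
```

The selected chunks are then pasted into the prompt as context, which is what lets a general-purpose model answer questions about a private document.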

Create a Python virtual environment to pin the Python version


python3 -m venv .venv

Activate the environment


source .venv/bin/activate

Install the dependencies


pip install pygpt4all
pip install langchain
pip install unstructured
pip install pdf2image
pip install pytesseract
pip install pypdf
pip install faiss-cpu
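After installing, it can be handy to confirm that the packages are importable from the active virtual environment before running the longer scripts. The sketch below uses only the standard library. The import names in the list are assumptions based on the packages installed above (for example, faiss-cpu is imported as faiss).

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names for the pip packages listed above.
required = ["pygpt4all", "langchain", "unstructured", "pdf2image", "pytesseract", "pypdf", "faiss"]
print("missing:", missing_modules(required))
```

An empty list means every module was found. Anything listed is most likely a sign that pip ran outside the activated .venv.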

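Download the model from Hugging Face

Place it in the project

Create a test PDF under docs with the content: Maimaiti had $10000 last year and lost 2000 this year, so now

Use LangChain to build a vector index for this PDF. Building it takes about 20s and the index is 34KB in size. Details: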
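The index-building script itself is not shown in the article, so the following is only a sketch of one way it could look, using LangChain's PyPDFLoader, RecursiveCharacterTextSplitter, and FAISS vector store. The function names, the chunk_size and chunk_overlap values, and the directory layout are assumptions for illustration.

```python
from pathlib import Path
from typing import List


def find_pdfs(docs_dir: str) -> List[Path]:
    """Return all PDF files directly under docs_dir, sorted by name."""
    return sorted(p for p in Path(docs_dir).iterdir() if p.suffix.lower() == ".pdf")


def build_index(docs_dir: str, index_dir: str, embeddings) -> None:
    """Load PDFs, split them into chunks, embed the chunks, and save a FAISS index."""
    # Imported lazily so this file can be loaded even without LangChain installed.
    from langchain.document_loaders import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.vectorstores.faiss import FAISS

    documents = []
    for pdf in find_pdfs(docs_dir):
        documents.extend(PyPDFLoader(str(pdf)).load())

    # Illustrative chunking parameters, not taken from the article.
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(documents)

    index = FAISS.from_documents(chunks, embeddings)
    index.save_local(index_dir)
```

Calling build_index("./docs", "my_faiss_index", embeddings) with a LangChain embeddings object would write the index to the my_faiss_index directory that the query script loads.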

Write a script that lets the AI read the PDF vector index


from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.manager import AsyncCallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# function for loading only TXT files
from langchain.document_loaders import TextLoader
# text splitter for creating chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
# to be able to load the pdf files
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.document_loaders import PyPDFLoader
from langchain.document_loaders import DirectoryLoader
# Vector Store Index to create our database about our knowledge
from langchain.indexes import VectorstoreIndexCreator
# LLamaCpp embeddings from the Alpaca model
from langchain.embeddings import LlamaCppEmbeddings
# FAISS library for similarity search
from langchain.vectorstores.faiss import FAISS
import os  # for interaction with the files
import datetime


# TEST FOR SIMILARITY SEARCH


# assign the path for the 2 models GPT4All and Alpaca for the embeddings 
gpt4all_path = './models/mistral-7b-openorca.Q4_0.gguf' 
llama_path = './models/ggml-model-q4_0.bin' 
# Callback manager for handling the calls with the model
callback_manager = AsyncCallbackManager([StreamingStdOutCallbackHandler()])


# create the embedding object
embeddings = LlamaCppEmbeddings(model_path=llama_path)
# create the GPT4All llm object
llm = GPT4All(model=gpt4all_path, callback_manager=callback_manager, verbose=True)




def similarity_search(query, index):
    # k is the number of most similar chunks to return for the query
    # (the default is 4)
    matched_docs = index.similarity_search(query, k=3) 
    sources = []
    for doc in matched_docs:
        sources.append(
            {
                "page_content": doc.page_content,
                "metadata": doc.metadata,
            }
        )


    return matched_docs, sources


# Load our local index vector db
# index = FAISS.load_local("my_faiss_index", embeddings)
# # Hardcoded question
# query = "where is Lili destination?"
# docs = index.similarity_search(query)
# # Get the matches best 3 results - defined in the function k=3
# print(f"The question is: {query}")
# print("Here the result of the semantic search on the index, without GPT4All..")
# print(docs[0])




# Load our local index vector db
index = FAISS.load_local("my_faiss_index", embeddings)


# create the prompt template
template = """
Please use the following context to answer questions.
Context: {context}
---
Question: {question}
Answer: Let's think step by step."""


# Read the question from the user
question = input("Your question: ")
matched_docs, sources = similarity_search(question, index)
# Creating the context
context = "\n".join([doc.page_content for doc in matched_docs])
# instantiating the prompt template and the GPT4All chain
prompt = PromptTemplate(template=template, input_variables=["context", "question"]).partial(context=context)
llm_chain = LLMChain(prompt=prompt, llm=llm)
# Run the chain on the question and print the result
print(llm_chain.run(question))

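Run the script and ask the AI: How much money does Maimaiti have left now?

The AI answers that $8000 is left, which matches expectations. The answer is rather long; the quality and style of the answers can be tuned later.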
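One low-cost way to nudge the answer length is to change the prompt rather than the model. The sketch below builds a variant of the script's prompt that asks for a short answer. The template wording and the max_sentences parameter are assumptions, and a local model is not guaranteed to follow the instruction.

```python
# Hypothetical variant of the article's prompt template that asks for brevity.
CONCISE_TEMPLATE = """Please use the following context to answer the question.
Answer in at most {max_sentences} sentence(s) and do not repeat the context.
Context: {context}
---
Question: {question}
Answer:"""


def build_prompt(context: str, question: str, max_sentences: int = 2) -> str:
    """Fill the concise template with retrieved context and the user's question."""
    return CONCISE_TEMPLATE.format(
        context=context, question=question, max_sentences=max_sentences
    )


print(build_prompt("(retrieved PDF text goes here)", "How much money is left?"))
```

Dropping the "Let's think step by step" suffix used in the script is a deliberate trade-off here: that phrase encourages longer, reasoning-style output, which is the opposite of what a short summary needs.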
