LangChain + local Ollama + llama3:8b: calling a local large language model (LLM) to build a local chatbot with a web interface

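Installing Ollama locally and running a model

**Download and install from the official site:** https://ollama.com/

**Pull the model, then start it interactively:** ollama pull llama3:8b, then ollama run llama3:8b

**Run the model with a one-off prompt:** ollama run llama3:8b "What is the current time?"

**List the running models:** ollama ps

**Stop the model:** ollama stop llama3:8b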
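
Besides the CLI, the Ollama server also answers HTTP requests on port 11434 (the same address used as base_url later in this article). The sketch below is one hedged, standard-library-only way to check from Python that the server is up and that llama3:8b has been pulled. It assumes the `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` field; the helper names `parse_model_names` and `list_local_models` are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # default local Ollama address


def parse_model_names(payload: dict) -> list:
    """Extract model names from an /api/tags style response (assumed shape)."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 3.0):
    """Return the local model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply
        return None


if __name__ == "__main__":
    models = list_local_models()
    if models is None:
        print("Ollama is not reachable; is the server running?")
    elif "llama3:8b" in models:
        print("llama3:8b is available")
    else:
        print("Server is up, but llama3:8b has not been pulled:", models)
```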

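Llama 3 is the open-source large language model (LLM) family that Meta released in April 2024. It comes in two sizes, 8B (8 billion parameters) and 70B (70 billion parameters), each available as a base pre-trained model (Base) and an instruction-tuned model (Instruct).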

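LangChain versions

Preferred (latest stable at the time of writing): langchain==1.2.17. For compatibility with older projects: langchain==0.3.18.

Upgrade pip first (avoids dependency-resolution problems): python -m pip install --upgrade pip

List the installed packages (findstr is Windows-only; on Linux/macOS use grep instead): pip list | findstr "langchain"

Uninstall the packages that are already installed: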

pip uninstall -y langchain langchain-ollama langchain-core langchain-community langchain-protocol
pip cache purge

Install langchain 1.2.17 first (this automatically pulls in a matching langchain-core): pip install langchain==1.2.17 --force-reinstall

Then install a compatible langchain-ollama. A normal install would force its own pinned langchain-core version, so pass --no-deps: pip install langchain-ollama==0.3.1 --no-deps

Install langchain-text-splitters: pip install langchain-text-splitters==0.2.4 --no-deps

Install langchain-community: pip install langchain-community==0.2.12 --no-deps

Install a stable faiss-cpu release: pip install faiss-cpu==1.8.0

Install the official ollama package: pip install ollama==0.5.1

After installation, check the result: pip list | findstr "langchain"

langchain                1.2.17
langchain-community      0.2.12
langchain-core           1.3.3
langchain-ollama         0.3.1
langchain-protocol       0.0.15
langchain-text-splitters 0.2.4
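
Since `findstr` only exists on Windows, a cross-platform way to run the same check is to ask Python itself. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` and the package list are chosen here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGES = [
    "langchain",
    "langchain-core",
    "langchain-ollama",
    "langchain-community",
    "langchain-text-splitters",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name:<26}{ver or 'not installed'}")
```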

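**LangChain's core features include:**
1. Chains: combine different LLM components into a workflow to carry out complex tasks.
2. Retrieval-augmented generation (RAG): combine the model with an external knowledge base so the AI can answer questions from specific documents.
3. Agents: let the AI call tools and perform tasks, such as querying data or running code.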

Calling the local LLM directly

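from langchain_ollama.llms import OllamaLLM

# Initialize the local model
llm = OllamaLLM(
    model="llama3:8b",
    temperature=0.7,  # creativity: 0 = strict, 1 = more varied
    base_url="http://127.0.0.1:11434"  # set the Ollama address explicitly to avoid network issues
)

# Call the model with a plain string prompt
response = llm.invoke("What is a large language model?")
print(response)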
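
To make it clearer what travels over base_url, here is a hedged sketch of a comparable request sent straight to Ollama's REST API with the standard library. It assumes the `/api/generate` endpoint accepts `model`, `prompt`, `stream`, and `options.temperature` and returns the text under a `response` key. It is not claimed to be the exact request OllamaLLM sends, and the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama3:8b", temperature=0.7,
                           base_url="http://127.0.0.1:11434"):
    """Build a non-streaming POST request for the /api/generate endpoint."""
    body = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON reply instead of a token stream
        "options": {"temperature": temperature},
    }
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=120.0, **kwargs):
    """Send the request and return the generated text, or None on failure."""
    request = build_generate_request(prompt, **kwargs)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return json.load(resp).get("response")
    except (OSError, ValueError):
        # Server not running, timeout, HTTP error, or a non-JSON reply
        return None


if __name__ == "__main__":
    print(generate("What is a large language model?"))
```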

Prompt templates (answering according to rules)

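from langchain_ollama.llms import OllamaLLM
from langchain_core.prompts import PromptTemplate

# Model
llm = OllamaLLM(
    model="llama3:8b",
    temperature=0.7,  # creativity: 0 = strict, 1 = more varied
    base_url="http://127.0.0.1:11434"  # set the Ollama address explicitly to avoid network issues
)

# Custom prompt template; {question} is filled in at call time
prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer professionally and concisely in English, in no more than 30 words: {question}"
)

# Compose the chain with the pipe operator (the standard LangChain 1.x style)
chain = prompt | llm

# Run: the dict keys must match the template's input variables
answer = chain.invoke({"question": "What is a large language model?"})
print(answer)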
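
The `prompt | llm` expression works because the left-hand object overloads the `|` operator to build a new object that feeds one step's output into the next. The toy class below sketches that idea in plain Python. It is not LangChain's implementation (the real runnables also support batching, streaming, and more), and `Step` and `fake_llm` are stand-ins invented for this example.

```python
class Step:
    """A wrapper around a function whose `|` operator chains steps left to right."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # (a | b).invoke(x) is the same as b.invoke(a.invoke(x))
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for the real components (no model is involved here)
prompt = Step(lambda d: "Answer in at most 30 words: {question}".format(**d))
fake_llm = Step(lambda text: f"[model reply to: {text}]")

chain = prompt | fake_llm
print(chain.invoke({"question": "What is a large language model?"}))
```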

A chatbot with memory (remembers the conversation context)

from langchain_ollama.llms import OllamaLLM
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory

# 1. Model
llm = OllamaLLM(
    model="llama3:8b",
    temperature=0.7,  # creativity: 0 = strict, 1 = more varied
    base_url="http://127.0.0.1:11434"  # set the Ollama address explicitly to avoid network issues
)

# 2. Conversation template; earlier turns are injected at the "history" placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a friendly AI assistant who remembers the conversation history and answers questions."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

# 3. Base chain
chain = prompt | llm

# 4. In-memory store of conversation histories, keyed by session id
store = {}

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# 5. Chain with memory
chat_chain = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# 6. Test the conversation; session_id must be passed under the "configurable" key
config = {"configurable": {"session_id": "test_session"}}
print(chat_chain.invoke({"input": "Hello, my name is Xiao Ming!"}, config=config))
print(chat_chain.invoke({"input": "What did I say my name was?"}, config=config))
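
Conceptually, the memory wrapper looks up the history for the given session, places it between the system message and the new input, and saves both sides of the turn afterwards. The plain-Python sketch below imitates that flow with a fake model so it runs without Ollama. It is a simplification under those assumptions, not the library's code, and `histories`, `fake_model`, and `chat` are names invented here.

```python
from collections import defaultdict

# session_id -> list of (role, text) pairs; plays the role of `store` above
histories = defaultdict(list)


def fake_model(messages):
    """Stand-in for the LLM: reports how many messages it was shown."""
    return f"I can see {len(messages)} message(s)."


def chat(session_id, user_input, system="You are a friendly AI assistant."):
    history = histories[session_id]
    # Prompt = system message + all earlier turns + the new human message
    messages = [("system", system), *history, ("human", user_input)]
    reply = fake_model(messages)
    # Persist both sides of the turn so the next call sees them
    history.append(("human", user_input))
    history.append(("ai", reply))
    return reply


print(chat("a", "Hello, my name is Xiao Ming!"))  # system + 1 human message
print(chat("a", "What did I say my name was?"))   # system + 2 earlier + 1 human
print(chat("b", "Hi"))                            # separate session, empty history
```

Because each session id maps to its own list, two users (or two browser tabs) do not see each other's context.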

RAG over a local knowledge base

from langchain_ollama.llms import OllamaLLM
from langchain_ollama.embeddings import OllamaEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.vectorstores import InMemoryVectorStore

# 1. Load the file
loader = TextLoader("test.txt", encoding="utf-8")
documents = loader.load()

# 2. Split the text into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
texts = splitter.split_documents(documents)

# 3. Vector store (the chat model is reused here to compute embeddings)
embeddings = OllamaEmbeddings(model="llama3:8b")
vector_store = InMemoryVectorStore(embeddings)
vector_store.add_documents(texts)

# 4. Question answering
llm = OllamaLLM(model="llama3:8b")


def ask(question):
    # Retrieve the 2 chunks most similar to the question
    docs = vector_store.similarity_search(question, k=2)
    context = "\n".join([doc.page_content for doc in docs])

    prompt = f"""Answer the question based on the following context:
Context: {context}
Question: {question}
Answer:"""
    return llm.invoke(prompt)


# Test
print(ask("What are LangChain's core features?"))
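
The retrieval step is the heart of this script: every chunk and the question are turned into vectors, and the chunks whose vectors are closest to the question are pasted into the prompt. The sketch below shows that ranking with a deliberately crude word-count "embedding" and cosine similarity, so it runs without any model. A real embedding model captures meaning far better, and the functions here are illustrative stand-ins rather than the vector store's actual code.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_search(chunks, question, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:k]


chunks = [
    "LangChain is a framework for building LLM applications.",
    "Ollama is a tool for running LLMs locally.",
    "Chains combine LLM components into workflows.",
]
context = "\n".join(similarity_search(chunks, "What is Ollama?", k=1))
print(context)
```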

Contents of test.txt:

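LangChain is a development framework for building large language model (LLM) applications.
Its core features include:
1.  Chains: combine different LLM components into a workflow to carry out complex tasks.
2.  Retrieval-augmented generation (RAG): combine the model with an external knowledge base so the AI can answer questions from specific documents.
3.  Agents: let the AI call tools and perform tasks, such as querying data or running code.

Ollama is a tool for running LLMs locally, with one-command deployment of open-source models such as Llama 3 and Mistral.
With LangChain + Ollama you can quickly build a fully local AI application that does not depend on the cloud and keeps your data private.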
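
The RAG script splits this file with chunk_size=500 and chunk_overlap=50. To show what those two numbers mean, here is a simplified fixed-window splitter. It is only an approximation for illustration: RecursiveCharacterTextSplitter additionally tries to break on separators such as paragraph and line boundaries, which this sketch ignores, and `split_with_overlap` is a name invented here.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows; each window repeats the tail of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
for chunk in split_with_overlap(sample, chunk_size=12, chunk_overlap=4):
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.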

Structured JSON output

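from langchain_ollama.llms import OllamaLLM
from langchain_core.prompts import PromptTemplate
import json
import re

# Model
llm = OllamaLLM(model="llama3:8b")

# Prompt (asks for JSON only; no input file is needed)
# Double braces {{ }} are literal braces; {text} is the template variable
prompt = PromptTemplate(
    template="""
Extract the information from the text below. Output JSON only, with no extra text.
Output format:
{{"name":"", "age":"", "city":""}}

Text: {text}
""",
    input_variables=["text"]
)

# Pull the JSON object out of the model's reply
def extract_json(output):
    # Match from the first "{" to the last "}" (DOTALL lets "." span newlines)
    match = re.search(r"\{.*\}", output, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return {"error": "failed to parse JSON"}
    # No braces found: return the raw reply unchanged
    return output

# Compose the chain; the plain function is applied to the model's output
chain = prompt | llm | extract_json

# Run
text = "My name is Xiao Ming, I am 25 years old, and I live in Beijing."
result = chain.invoke({"text": text})
print(result)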
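
The greedy pattern `\{.*\}` spans from the first `{` to the last `}`, so it fails when the model adds trailing text that happens to contain braces. One possible alternative, sketched below with the standard library only, is to try `json.JSONDecoder.raw_decode` at each opening brace and keep the first object that parses. The function name `first_json_object` and the sample reply are invented for this example.

```python
import json


def first_json_object(text):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON from this brace; try the next one
        start = text.find("{", start + 1)
    return None


reply = 'Sure! {"name": "Xiao Ming", "age": "25", "city": "Beijing"} Hope {this} helps.'
print(first_json_object(reply))
```

A function like this could take the place of extract_json at the end of the chain, since any plain callable can be piped after the model.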

A local chatbot with a web interface

Install Streamlit, a lightweight web UI library: pip install streamlit==1.57.0

from langchain_ollama.llms import OllamaLLM
import streamlit as st

# Title
st.title("💬 Local Llama3 Chatbot")
st.caption("LangChain + Ollama, running locally")

# Initialize the model
llm = OllamaLLM(model="llama3:8b")

# Keep the chat history in session state so it survives Streamlit reruns
if "messages" not in st.session_state:
    st.session_state.messages = []

# Render the earlier messages
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

# Input box
user_input = st.chat_input("Type your question...")

if user_input:
    # Show the user's message
    with st.chat_message("user"):
        st.markdown(user_input)
    st.session_state.messages.append({"role": "user", "content": user_input})

    # Model reply (only the latest input is sent to the model)
    with st.chat_message("assistant"):
        response = llm.invoke(user_input)
        st.markdown(response)
        st.session_state.messages.append({"role": "assistant", "content": response})

Run: streamlit run chatbot.py

**Chat page URL:** http://localhost:8501/
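
In the Streamlit script, the stored messages are used to redraw the page, while the model itself receives only the newest input. If the web chatbot should also remember context, one simple option is to flatten the recent history into the prompt. The helper below is a hedged sketch of that idea; `build_prompt`, its `max_messages` parameter, and the "User:/Assistant:" layout are assumptions made for illustration, not part of the original script.

```python
def build_prompt(messages, user_input, max_messages=6):
    """Flatten recent chat history plus the new question into one prompt string."""
    lines = []
    for msg in messages[-max_messages:]:  # keep only the most recent messages
        speaker = "User" if msg["role"] == "user" else "Assistant"
        lines.append(f"{speaker}: {msg['content']}")
    lines.append(f"User: {user_input}")
    lines.append("Assistant:")
    return "\n".join(lines)


history = [
    {"role": "user", "content": "Hello, my name is Xiao Ming!"},
    {"role": "assistant", "content": "Nice to meet you, Xiao Ming."},
]
print(build_prompt(history, "What did I say my name was?"))
```

The prompt would need to be built from the messages stored before the new user message is appended, otherwise the latest question appears twice. Capping the history keeps the prompt from growing without bound in long conversations.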