Functional Testing of Phi-3-mini-4k-instruct

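Model Card Introduction

Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained on the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family. The Mini version comes in two variants, 4K and 128K, which is the context length (in tokens) that it can support.

The model has undergone a post-training process that combines supervised fine-tuning and direct preference optimization for instruction following and safety measures. When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3 Mini-4K-Instruct showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.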

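Resources and technical documentation:

Intended Uses

Primary use cases

The model is intended for commercial and research use in English. It is suited to applications that require:

Memory/compute-constrained environments
Latency-bound scenarios
Strong reasoning (especially code, math, and logic)

Our model is designed to accelerate research on language and multimodal models, for use as a building block for generative-AI-powered features.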

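Use case considerations

Our models are not specifically designed or evaluated for all downstream purposes. Developers should consider common limitations of language models as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness before using the model within a specific downstream use case, particularly for high-risk scenarios. Developers should be aware of and adhere to applicable laws or regulations (including privacy and trade compliance laws) that are relevant to their use case.

Nothing contained in this model card should be interpreted as or deemed a restriction or modification to the license the model is released under.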

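How to Use

Phi-3 Mini-4K-Instruct has been integrated into the development version (4.40.0) of transformers. Until the official version is released through pip, ensure that you are doing one of the following:

When loading the model, ensure that trust_remote_code=True is passed as an argument of the from_pretrained() function.
Update your local transformers to the development version: pip uninstall -y transformers && pip install git+https://github.com/huggingface/transformers. This command is an alternative to cloning and installing from source.

The current transformers version can be verified with: pip list | grep transformers.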
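
The same check can be made from Python. This is a minimal sketch, assuming the package is installed under the distribution name `transformers`. The helper names `parse_version` and `meets_minimum` are illustrative, and the parser only compares the leading numeric components, so a suffix such as `.dev0` is ignored.

```python
import re
from importlib import metadata

MINIMUM = "4.40.0"  # development version named in the text above


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum=MINIMUM):
    """True if `installed` is at least `minimum`, comparing numeric parts only."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so "4.40" compares equal to "4.40.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        if meets_minimum(installed):
            print(f"transformers {installed}: ok")
        else:
            print(f"transformers {installed}: older than {MINIMUM}, "
                  "upgrade or pass trust_remote_code=True")
```

If the installed version is older than the minimum, either of the two options listed above applies.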

Phi-3 Mini-4K-Instruct is also available in HuggingChat.

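Tokenizer

Phi-3 Mini-4K-Instruct supports a vocabulary size of up to 32064 tokens. The tokenizer files already provide placeholder tokens that can be used for downstream fine-tuning, and they can also be extended up to the model's vocabulary size.

Chat Format

Given the nature of the training data, the Phi-3 Mini-4K-Instruct model is best suited to prompts that use the chat format below. You can provide the prompt as a question with a generic template as follows: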

<|user|>\nQuestion <|end|>\n<|assistant|>

For example:

<|system|>
You are a helpful AI assistant.<|end|>
<|user|>
How to explain Internet for a medieval knight?<|end|>
<|assistant|>

where the model generates the text after <|assistant|>. For a few-shot prompt, the prompt can be formatted as follows:

<|system|>
You are a helpful AI assistant.<|end|>
<|user|>
I am going to Paris, what should I see?<|end|>
<|assistant|>
Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\nThese are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.<|end|>
<|user|>
What is so great about #1?<|end|>
<|assistant|>
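
The chat format above can be assembled programmatically. The following is a minimal sketch: `build_phi3_prompt` is a hypothetical helper, not part of any library, and it assumes each turn is written as the role tag, a newline, the content, and `<|end|>`, as in the examples. In practice the tokenizer's own chat template (for example via `tokenizer.apply_chat_template` in transformers) is the authoritative source for the exact layout.

```python
def build_phi3_prompt(messages):
    """Render a list of {"role", "content"} dicts in the chat format shown above."""
    allowed = {"system", "user", "assistant"}
    parts = []
    for message in messages:
        role = message["role"]
        if role not in allowed:
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|{role}|>\n{message['content']}<|end|>\n")
    # End with an open assistant turn so the model generates the reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How to explain Internet for a medieval knight?"},
]
prompt = build_phi3_prompt(messages)
print(prompt)
```

For a few-shot prompt, earlier user and assistant turns are simply added to the list before the final user message.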

Sample inference code

This code snippet shows how to quickly get started with running the model on a GPU:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", 
    device_map="cuda", 
    torch_dtype="auto", 
    trust_remote_code=True, 
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

messages = [
    {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

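Some applications/frameworks might not include a BOS token (<s>) at the start of the conversation. Please ensure that it is included, since it provides more reliable results.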
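
For frameworks that pass a raw prompt string to the model, a small guard can add the BOS token when it is missing. This is a sketch only: `ensure_bos` is a hypothetical helper, and it is only appropriate when the downstream tokenizer does not already add `<s>` itself, since otherwise the token would appear twice.

```python
BOS_TOKEN = "<s>"  # BOS token named in the text above


def ensure_bos(prompt, bos_token=BOS_TOKEN):
    """Prepend the BOS token unless the prompt already starts with it."""
    if prompt.startswith(bos_token):
        return prompt
    return bos_token + prompt


print(ensure_bos("<|user|>\nHello<|end|>\n<|assistant|>\n"))
```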

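Responsible AI Considerations

Like other language models, the Phi series models can potentially behave in ways that are unfair, unreliable, or offensive. Some of the limiting behaviors to be aware of include:

Quality of Service: The Phi models are trained primarily on English text. Languages other than English will experience worse performance. English language varieties with less representation in the training data might experience worse performance than standard American English.
Representation of Harms & Perpetuation of Stereotypes: These models can over- or under-represent groups of people, erase representation of some groups, or reinforce demeaning or negative stereotypes. Despite safety post-training, these limitations may still be present due to differing levels of representation of different groups, or the prevalence of examples of negative stereotypes in training data that reflect real-world patterns and societal biases.
Inappropriate or Offensive Content: These models may produce other types of inappropriate or offensive content, which may make them inappropriate to deploy in sensitive contexts without additional mitigations specific to the use case.
Information Reliability: Language models can generate nonsensical content or fabricate content that might sound reasonable but is inaccurate or outdated.
Limited Scope for Code: The majority of Phi-3 training data is based on Python and uses common packages such as "typing, math, random, collections, datetime, itertools". If the model generates Python scripts that use other packages, or scripts in other languages, we strongly recommend that users manually verify all API uses.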

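Developers should apply responsible AI best practices and are responsible for ensuring that a specific use case complies with relevant laws and regulations (e.g. privacy, trade). Important areas for consideration include:

Allocation: Models may not be suitable for scenarios that could have a consequential impact on legal status or the allocation of resources or life opportunities (e.g. housing, employment, credit) without further assessments and additional debiasing techniques.
High-Risk Scenarios: Developers should assess the suitability of using models in high-risk scenarios where unfair, unreliable, or offensive outputs might be extremely costly or lead to harm. This includes providing advice in sensitive or expert domains where accuracy and reliability are critical (e.g. legal or health advice). Additional safeguards should be implemented at the application level according to the deployment context.
Misinformation: Models may produce inaccurate information. Developers should follow transparency best practices and inform end users that they are interacting with an AI system. At the application level, developers can build feedback mechanisms and pipelines to ground responses in use-case-specific, contextual information, a technique known as Retrieval Augmented Generation (RAG).
Generation of Harmful Content: Developers should assess outputs for their context and use available safety classifiers or custom solutions appropriate for their use case.
Misuse: Other forms of misuse such as fraud, spam, or malware production may be possible, and developers should ensure that their applications do not violate applicable laws and regulations.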
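
The grounding idea behind RAG mentioned in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not a production retriever: the word-overlap scoring, the helper names (`tokenize`, `retrieve`, `grounded_prompt`), the system instruction, and the sample passages are all made up for the example. A real pipeline would typically use embeddings and a vector index.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, top_k=2):
    """Rank passages by word overlap with the query and keep the best top_k."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(p)), p) for p in passages]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


def grounded_prompt(query, passages, top_k=2):
    """Build a chat prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, top_k))
    return (
        "<|system|>\nAnswer using only the context below. "
        "If the context is insufficient, say so.<|end|>\n"
        f"<|user|>\nContext:\n{context}\n\nQuestion: {query}<|end|>\n"
        "<|assistant|>\n"
    )


# Illustrative passages, not real retrieved data.
passages = [
    "Phi-3 Mini-4K-Instruct supports a context length of 4K tokens.",
    "The Eiffel Tower is a landmark in Paris.",
    "Phi-3 Mini has 3.8B parameters.",
]
print(grounded_prompt("What context length does Phi-3 Mini support?", passages))
```

Because the prompt carries the retrieved passages, the answer can be checked against them, which is what makes feedback and auditing at the application level practical.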

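Model Training

Architecture: Phi-3 Mini-4K-Instruct has 3.8B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with supervised fine-tuning (SFT) and direct preference optimization (DPO) to ensure alignment with human preferences and safety guidelines.
Inputs: Text. It is best suited to prompts using the chat format.
Context length: 4K tokens
GPUs: 512 H100-80G
Training time: 7 days
Training data: 3.3T tokens
Outputs: Generated text in response to the input
Dates: Our models were trained between February and April 2024
Status: This is a static model trained on an offline dataset with a cutoff date of October 2023. Future versions of the tuned models may be released as we improve the models.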

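Datasets

Our training data includes a wide variety of sources, totaling 3.3 trillion tokens, and is a combination of:

Publicly available documents filtered rigorously for quality, with selected high-quality educational data and code;
Newly created synthetic, "textbook-like" data for teaching math, coding, common sense reasoning, and general knowledge of the world (science, daily activities, theory of mind, etc.);
High-quality chat-format supervised data covering various topics to reflect human preferences on different aspects such as instruction following, truthfulness, honesty, and helpfulness.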

Fine-tuning

A basic example of multi-GPU supervised fine-tuning (SFT) with the TRL and Accelerate modules is provided here.

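Benchmarks

We report the results for Phi-3-Mini-4K-Instruct on standard open-source benchmarks measuring the model's reasoning ability (both common sense reasoning and logical reasoning). We compare to Phi-2, Mistral-7b-v0.1, Mixtral-8x7b, Gemma 7B, Llama-3-8B-Instruct, and GPT-3.5.

All the reported numbers are produced with the exact same pipeline to ensure that they are comparable. These numbers might differ from other published numbers due to slightly different choices in the evaluation.

As is now standard, we use few-shot prompts to evaluate the models, at temperature 0. The prompts and number of shots are part of a Microsoft internal tool for evaluating language models, and in particular we did no optimization of the pipeline for Phi-3. More specifically, we do not change prompts, pick different few-shot examples, change the prompt format, or do any other form of optimization for the model.

The number of k-shot examples is listed per benchmark.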

| Benchmark | Phi-3-Mini-4K-In 3.8b | Phi-3-Small 7b (preview) | Phi-3-Medium 14b (preview) | Phi-2 2.7b | Mistral 7b | Gemma 7b | Llama-3-In 8b | Mixtral 8x7b | GPT-3.5 version 1106 |
|---|---|---|---|---|---|---|---|---|---|
| MMLU 5-Shot | 68.8 | 75.3 | 78.2 | 56.3 | 61.7 | 63.6 | 66.5 | 68.4 | 71.4 |
| HellaSwag 5-Shot | 76.7 | 78.7 | 83.2 | 53.6 | 58.5 | 49.8 | 71.1 | 70.4 | 78.8 |
| ANLI 7-Shot | 52.8 | 55.0 | 58.7 | 42.5 | 47.1 | 48.7 | 57.3 | 55.2 | 58.1 |
| GSM-8K 0-Shot; CoT | 82.5 | 86.4 | 90.8 | 61.1 | 46.4 | 59.8 | 77.4 | 64.7 | 78.1 |
| MedQA 2-Shot | 53.8 | 58.2 | 69.8 | 40.9 | 49.6 | 50.0 | 60.5 | 62.2 | 63.4 |
| AGIEval 0-Shot | 37.5 | 45.0 | 49.7 | 29.8 | 35.1 | 42.1 | 42.0 | 45.2 | 48.4 |
| TriviaQA 5-Shot | 64.0 | 59.1 | 73.3 | 45.2 | 72.3 | 75.2 | 67.7 | 82.2 | 85.8 |
| Arc-C 10-Shot | 84.9 | 90.7 | 91.9 | 75.9 | 78.6 | 78.3 | 82.8 | 87.3 | 87.4 |
| Arc-E 10-Shot | 94.6 | 97.1 | 98.0 | 88.5 | 90.6 | 91.4 | 93.4 | 95.6 | 96.3 |
| PIQA 5-Shot | 84.2 | 87.8 | 88.2 | 60.2 | 77.7 | 78.1 | 75.7 | 86.0 | 86.6 |
| SociQA 5-Shot | 76.6 | 79.0 | 79.4 | 68.3 | 74.6 | 65.5 | 73.9 | 75.9 | 68.3 |
| BigBench-Hard 0-Shot | 71.7 | 75.0 | 82.5 | 59.4 | 57.3 | 59.6 | 51.5 | 69.7 | 68.32 |
| WinoGrande 5-Shot | 70.8 | 82.5 | 81.2 | 54.7 | 54.2 | 55.6 | 65 | 62.0 | 68.8 |
| OpenBookQA 10-Shot | 83.2 | 88.4 | 86.6 | 73.6 | 79.8 | 78.6 | 82.6 | 85.8 | 86.0 |
| BoolQ 0-Shot | 77.6 | 82.9 | 86.5 | -- | 72.2 | 66.0 | 80.9 | 77.6 | 79.1 |
| CommonSenseQA 10-Shot | 80.2 | 80.3 | 82.6 | 69.3 | 72.6 | 76.2 | 79 | 78.1 | 79.6 |
| TruthfulQA 10-Shot | 65.0 | 68.1 | 74.8 | -- | 52.1 | 53.0 | 63.2 | 60.1 | 85.8 |
| HumanEval 0-Shot | 59.1 | 59.1 | 54.7 | 47.0 | 28.0 | 34.1 | 60.4 | 37.8 | 62.2 |
| MBPP 3-Shot | 53.8 | 71.4 | 73.7 | 60.6 | 50.8 | 51.5 | 67.7 | 60.2 | 77.8 |

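Software

Hardware

Note that by default, the Phi-3-mini model uses flash attention, which requires certain types of GPU hardware to run. We have tested on the following GPU types:

NVIDIA A100
NVIDIA A6000
NVIDIA H100

If you want to run the model on:

NVIDIA V100 or earlier generation GPUs: call AutoModelForCausalLM.from_pretrained() with attn_implementation="eager"
CPU: use the GGUF quantized model 4K
Optimized inference on GPU, CPU, and mobile: use the ONNX model 4K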
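
The hardware guidance above can be turned into a small selection helper. This is a sketch under one stated assumption: flash attention is taken to need CUDA compute capability 8.0 or higher (Ampere or newer), which is consistent with the tested A100, A6000, and H100 and excludes the V100 (7.0). `pick_attn_implementation` is a hypothetical helper name.

```python
def pick_attn_implementation(compute_capability):
    """Choose an attention backend from a (major, minor) CUDA compute capability.

    Assumption: flash attention needs major >= 8 (Ampere or newer).
    Pass None when no CUDA device is available.
    """
    if compute_capability is None:
        return "eager"
    major, _minor = compute_capability
    return "flash_attention_2" if major >= 8 else "eager"


for name, capability in [("V100", (7, 0)), ("A100", (8, 0)), ("H100", (9, 0))]:
    print(name, pick_attn_implementation(capability))
```

With PyTorch installed, `torch.cuda.get_device_capability()` returns such a tuple, and the chosen string can be passed as `attn_implementation=` to `AutoModelForCausalLM.from_pretrained()`. The `flash_attention_2` option also requires the `flash-attn` package to be installed.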

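Cross Platform Support

The ONNX Runtime ecosystem now supports Phi-3 Mini models across platforms and hardware. You can find the optimized Phi-3 Mini-4K-Instruct ONNX model here.

Optimized Phi-3 models are also published in ONNX format, to run with ONNX Runtime on CPU and GPU across devices, including server platforms, Windows, Linux, and Mac desktops, and mobile CPUs, with the precision best suited to each of these targets. DirectML support lets developers bring hardware acceleration to Windows devices at scale across AMD, Intel, and NVIDIA GPUs.

Along with DirectML, ONNX Runtime provides cross-platform support for Phi-3 across a range of devices: CPU, GPU, and mobile.

Here are some of the optimized configurations we have added:

ONNX model for int4 DML: quantized to int4 via AWQ
ONNX model for fp16 CUDA
ONNX model for int4 CUDA: quantized to int4 via RTN
ONNX model for int4 CPU and mobile: quantized to int4 via RTN

Here is the technical report.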

!pip install llama-index llama-index-llms-huggingface llama-index-embeddings-huggingface transformers accelerate bitsandbytes llama-index-readers-web matplotlib flash-attn

hf_token = "hf_"

Setup Data

from llama_index.readers.web import BeautifulSoupWebReader

url = "https://www.theverge.com/2023/9/29/23895675/ai-bot-social-network-openai-meta-chatbots"

documents = BeautifulSoupWebReader().load_data([url])

LLM

from llama_index.llms.huggingface import HuggingFaceLLM


def messages_to_prompt(messages):
    prompt = ""
    system_found = False
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}<|end|>\n"
            system_found = True
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}<|end|>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}<|end|>\n"
        else:
            prompt += f"<|user|>\n{message.content}<|end|>\n"

    # trailing prompt
    prompt += "<|assistant|>\n"

    if not system_found:
        prompt = (
            "<|system|>\nYou are a helpful AI assistant.<|end|>\n" + prompt
        )

    return prompt


llm = HuggingFaceLLM(
    model_name="microsoft/Phi-3-mini-4k-instruct",
    model_kwargs={
        "trust_remote_code": True,
    },
    generate_kwargs={"do_sample": True, "temperature": 0.1},
    tokenizer_name="microsoft/Phi-3-mini-4k-instruct",
    query_wrapper_prompt=(
        "<|system|>\n"
        "You are a helpful AI assistant.<|end|>\n"
        "<|user|>\n"
        "{query_str}<|end|>\n"
        "<|assistant|>\n"
    ),
    messages_to_prompt=messages_to_prompt,
    is_chat_model=True,
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.llm = llm
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

Index Setup

from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex.from_documents(documents)

from llama_index.core import SummaryIndex

summary_index = SummaryIndex.from_documents(documents)

Useful Imports / Logging

from llama_index.core.response.notebook_utils import display_response

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

Basic Query Engine

Compact (default)

query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

WARNING:transformers_modules.microsoft.Phi-3-mini-4k-instruct.240d36176caf025230489b7a56e895d9e5b845f7.modeling_phi3:You are not running the flash-attention implementation, expect numerical differences.
You are not running the flash-attention implementation, expect numerical differences.

Final Response:

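OpenAI and Meta differ in their approach to AI tools. OpenAI tends to present its products as productivity tools, focused on simple utilities for getting things done. Meta, on the other hand, which is in the entertainment business, has developed its own uses for generative AI and voices, creating 28 personality-driven chatbots for its messaging apps. These chatbots are voiced by celebrities such as Charli D'Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton. While OpenAI's ChatGPT is primarily a language model for generating text, Meta's AI tools focus more on creating engaging and entertaining interactions through personality-driven chatbots.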

Refine

query_engine = vector_index.as_query_engine(response_mode="refine")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Final Response:

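OpenAI and Meta take different views of AI tools. OpenAI focuses primarily on creating AI tools that serve as productivity aids, helping users complete tasks more efficiently. For example, its latest update to ChatGPT introduced a voice feature, making the tool more accessible and versatile. Meta, on the other hand, known for its entertainment business, has developed AI tools in a distinctive way. It has launched 28 personality-driven chatbots for use in its messaging apps, featuring the voices of celebrities such as Charli D'Amelio and Tom Brady. These chatbots are designed to offer a unique and engaging user experience.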

Tree Summarize

query_engine = vector_index.as_query_engine(response_mode="tree_summarize")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Final Response:

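OpenAI and Meta differ in their approach to AI tools. OpenAI tends to present its products as productivity tools, focused on simple utilities for getting things done. Meta, on the other hand, which is in the entertainment business, has developed its own uses for generative AI and voices, creating 28 personality-driven chatbots for its messaging apps. These chatbots are voiced by celebrities such as Charli D'Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton. While OpenAI's ChatGPT is primarily a language model for a variety of tasks, Meta's AI tools focus more on providing entertainment and personalized experiences through its messaging apps.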
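
The response modes used above differ in how retrieved chunks reach the LLM. The sketch below is a conceptual illustration, not LlamaIndex's implementation: `CountingLLM` is a stand-in that only counts calls, and the function names and prompt wording are hypothetical. `refine` walks the chunks one at a time and updates a running answer, while `tree_summarize` answers over groups of chunks and then recursively combines those answers. `compact` follows the same idea as `refine` but first packs as many chunks as fit into each call.

```python
class CountingLLM:
    """Stand-in for a model call: counts calls and returns a numbered tag."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        return f"answer-{self.calls}"


def refine(query, chunks, llm):
    """Sequential refinement: one call per chunk, carrying the answer forward."""
    answer = None
    for chunk in chunks:
        if answer is None:
            answer = llm(f"Context: {chunk}\nQuestion: {query}")
        else:
            answer = llm(
                f"Existing answer: {answer}\nNew context: {chunk}\nRefine for: {query}"
            )
    return answer


def tree_summarize(query, chunks, llm, fan_in=2):
    """Bottom-up summarization: answer over groups of fan_in, then merge recursively."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    if not level:
        return None
    while True:
        groups = [level[i:i + fan_in] for i in range(0, len(level), fan_in)]
        level = [llm(f"Context: {' | '.join(group)}\nQuestion: {query}") for group in groups]
        if len(level) == 1:
            return level[0]


chunks = ["chunk 1", "chunk 2", "chunk 3", "chunk 4"]
refine_llm, tree_llm = CountingLLM(), CountingLLM()
refine_answer = refine("How do OpenAI and Meta differ?", chunks, refine_llm)
tree_answer = tree_summarize("How do OpenAI and Meta differ?", chunks, tree_llm)
print("refine calls:", refine_llm.calls)
print("tree_summarize calls:", tree_llm.calls)
```

The call counts show the trade-off: refinement is strictly sequential, whereas the calls within one level of the tree are independent of each other.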

Router Query Engine

from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts.",
    ),
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document.",
    ),
)

Single Selector

from llama_index.core.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool], select_multi=False
)

response = query_engine.query("What was mentioned about Meta?")

display_response(response)

INFO:llama_index.core.query_engine.router_query_engine:Selecting query engine 0: Useful for searching for specific facts..
Selecting query engine 0: Useful for searching for specific facts..

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

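Final Response:

Meta, a company primarily in the entertainment business, is also building LLMs (large language models) and has found its own uses for generative AI and voices. It has launched 28 personality-driven chatbots for use in its messaging apps, voiced by celebrities such as Charli D'Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton.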

Multi Selector

from llama_index.core.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    select_multi=True,
)

response = query_engine.query(
    "What was mentioned about Meta? Summarize with any other companies mentioned in the entire document."
)

display_response(response)

INFO:llama_index.core.query_engine.router_query_engine:Selecting query engine 1: Useful for summarizing an entire document, which is needed to provide a summary about Meta and any other companies mentioned..
Selecting query engine 1: Useful for summarizing an entire document, which is needed to provide a summary about Meta and any other companies mentioned..

Final Response:

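According to what was revealed on Wednesday, Meta, a company in the entertainment business, is developing its own uses for generative AI and voices. It has launched 28 personality-driven chatbots for Meta's messaging apps, with celebrities such as Charli D'Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton lending their voices to the effort. These chatbots come with brief and often cringeworthy descriptions, and Meta plans to place its AI characters on every major surface of its products, potentially turning social feeds into a partially synthetic social network.

OpenAI, the other company mentioned in the document, tends to position its products as productivity tools, whereas Meta is in the entertainment business. OpenAI's ChatGPT has grown into a more useful tool, and its voice feature could lead to a more empathetic and engaging social network. The document also mentions the potential of AI-generated images, with Meta's messaging apps introducing new stickers.

The document also briefly touches on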
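
The routing step in the two selector examples can be pictured as "pick tool(s) by description, then run them". The sketch below is a toy illustration and not how LlamaIndex selects: the real router asks the LLM to choose, whereas here a hypothetical keyword match stands in for that choice. `Tool`, `select`, and `route` are made-up names for the example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Tool:
    name: str
    keywords: Tuple[str, ...]   # toy stand-in for the tool description
    run: Callable[[str], str]   # stand-in for a query engine


def select(query, tools, select_multi=False):
    """Return indices of matching tools; at most one unless select_multi is True."""
    lowered = query.lower()
    hits = [i for i, tool in enumerate(tools) if any(k in lowered for k in tool.keywords)]
    if not hits:
        hits = [0]  # fall back to the first tool
    return hits if select_multi else hits[:1]


def route(query, tools, select_multi=False):
    """Run the selected tool(s) and join their answers."""
    chosen = select(query, tools, select_multi)
    return "\n".join(tools[i].run(query) for i in chosen)


tools = [
    Tool("vector_search", ("mentioned", "fact"), lambda q: f"[vector_search] {q}"),
    Tool("summary", ("summarize", "entire"), lambda q: f"[summary] {q}"),
]

print(route("What was mentioned about Meta?", tools))
print(route("Summarize the entire document and what was mentioned about Meta.", tools, select_multi=True))
```

With `select_multi=False` exactly one engine answers. With `select_multi=True` several may be chosen and their outputs combined, although, as the log above shows, the selector is free to pick only one.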

Sub Question Query Engine

from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts.",
    ),
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document.",
    ),
)

import nest_asyncio

nest_asyncio.apply()

from llama_index.core.query_engine import SubQuestionQueryEngine

query_engine = SubQuestionQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    verbose=True,
)

response = query_engine.query(
    "What was mentioned about Meta? How Does it differ from how OpenAI is talked about?"
)

display_response(response)

Generated 3 sub questions.
[vector_search] Q: What are the key points mentioned about Meta in documents?

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

[vector_search] A: 1. Meta is building large language models (LLMs) and generative AI, similar to OpenAI.

2. Meta has developed 28 personality-driven chatbots for its messaging apps, featuring voices of celebrities like Charli D'Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton.

3. Meta's chatbots are designed to have brief and often cringe-worthy character descriptions, with MrBeast's Zach being described as "MrBeast, the guy who will roast you because he cares."

4. Meta's chatbots are intended to provide users with a taste of interacting with AI, allowing them to get a feel for AI Snoop Dogg before any potential issues are ironed out.

5. Meta's chatbots are seen as a step towards a synthetic social network, where AI characters will be present on every major surface of the company's products, including Facebook pages, Instagram accounts, and messaging inboxes.

6.
[vector_search] Q: What are the key points mentioned about OpenAI in documents?

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

[vector_search] A: 1. OpenAI announced the latest updates for ChatGPT, including a feature that allows users to interact with its large language model via voice.

2. The addition of a voice to ChatGPT gives it a hint of personality, making it feel more powerful as a mobile app and potentially more empathetic and helpful.

3. OpenAI's products are typically presented as productivity tools, but the company is also exploring uses for generative AI and voices in the entertainment industry.

4. OpenAI has developed 28 personality-driven chatbots for use in Meta's messaging apps, with celebrity voices lending their personalities to the bots.

5. The voice feature for ChatGPT is currently rolling out to ChatGPT Plus subscribers, with free users expected to gain access in the future.
[summary] Q: How does Meta differ from OpenAI in terms of mentioned facts?
[summary] A: Meta and OpenAI differ in their approach and applications of artificial intelligence (AI) based on the mentioned facts. OpenAI primarily presents its products as productivity tools, focusing on simple utilities for getting things done, such as the ChatGPT AI that can now provide voice responses and offer pep talks. On the other hand, Meta, which is in the entertainment business, is building its own uses for generative AI and voices, creating personality-driven chatbots for its messaging apps. These chatbots are designed to mimic celebrities and offer unique interactions, such as AI Taylor Swift or MrBeast. While OpenAI's ChatGPT is more focused on productivity and utility, Meta's AI characters aim to provide entertainment and novelty, potentially transforming social networking into a partially synthetic experience.

Final Response:

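Meta is involved in creating large language models and generative AI, similar to OpenAI, but it has taken a distinctive approach by developing personality-driven chatbots for its messaging apps. These chatbots feature celebrity voices and are designed to offer unique interactions, such as AI versions of popular figures. OpenAI, by contrast, focuses on productivity tools, with its ChatGPT AI offering voice responses and utility-based interactions. Meta's efforts lean toward entertainment and the potential of a synthetic social network, while OpenAI emphasizes practical applications for productivity and efficiency.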

SQL Query Engine

import locale

locale.getpreferredencoding = lambda: "UTF-8"

!curl "https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip" -o "./chinook.zip"
!unzip "./chinook.zip"

from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    column,
)

engine = create_engine("sqlite:///chinook.db")

from llama_index.core import SQLDatabase

sql_database = SQLDatabase(engine)

from llama_index.core.indices.struct_store import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks", "artists"],
)

response = query_engine.query("What are some albums? Limit to 5.")

display_response(response)

INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'albums' has columns: AlbumId (INTEGER), Title (NVARCHAR(160)), ArtistId (INTEGER), and foreign keys: ['ArtistId'] -> artists.['ArtistId'].

Table 'tracks' has columns: TrackId (INTEGER), Name (NVARCHAR(200)), AlbumId (INTEGER), MediaTypeId (INTEGER), GenreId (INTEGER), Composer (NVARCHAR(220)), Milliseconds (INTEGER), Bytes (INTEGER), UnitPrice (NUMERIC(10, 2)), and foreign keys: ['MediaTypeId'] -> media_types.['MediaTypeId'], ['GenreId'] -> genres.['GenreId'], ['AlbumId'] -> albums.['AlbumId'].

Table 'artists' has columns: ArtistId (INTEGER), Name (NVARCHAR(120)), and foreign keys: .
> Table desc str: Table 'albums' has columns: AlbumId (INTEGER), Title (NVARCHAR(160)), ArtistId (INTEGER), and foreign keys: ['ArtistId'] -> artists.['ArtistId'].

Table 'tracks' has columns: TrackId (INTEGER), Name (NVARCHAR(200)), AlbumId (INTEGER), MediaTypeId (INTEGER), GenreId (INTEGER), Composer (NVARCHAR(220)), Milliseconds (INTEGER), Bytes (INTEGER), UnitPrice (NUMERIC(10, 2)), and foreign keys: ['MediaTypeId'] -> media_types.['MediaTypeId'], ['GenreId'] -> genres.['GenreId'], ['AlbumId'] -> albums.['AlbumId'].

Table 'artists' has columns: ArtistId (INTEGER), Name (NVARCHAR(120)), and foreign keys: .

Final Response:

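Here are five popular albums:

"For Those About To Rock We Salute You"
"Balls to the Wall"
"Restless and Wild"
"Let There Be Rock"
"Big Ones"

These albums have made a significant impact on the music industry and are highly regarded by fans and critics alike.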

response = query_engine.query("What are some artists? Limit it to 5.")

display_response(response)
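
To see what the text-to-SQL step amounts to without downloading the database, the sketch below rebuilds two of the tables in memory with the standard `sqlite3` module and runs SQL of the kind such an engine might generate. The column names follow the table descriptions logged above. The rows are placeholder data, and the SQL statements are illustrative assumptions, not captured output from the query engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name NVARCHAR(120));
    CREATE TABLE albums (
        AlbumId INTEGER PRIMARY KEY,
        Title NVARCHAR(160),
        ArtistId INTEGER REFERENCES artists(ArtistId)
    );
    -- Placeholder rows for illustration only.
    INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B');
    INSERT INTO albums VALUES (1, 'Album One', 1), (2, 'Album Two', 1), (3, 'Album Three', 2);
    """
)

# Plausible SQL for "What are some albums? Limit to 5."
rows = conn.execute("SELECT Title FROM albums ORDER BY AlbumId LIMIT 5").fetchall()
titles = [title for (title,) in rows]
print(titles)

# A join across the foreign key, as a follow-up question about artists might need.
artist_rows = conn.execute(
    "SELECT artists.Name, COUNT(*) FROM albums "
    "JOIN artists ON artists.ArtistId = albums.ArtistId "
    "GROUP BY artists.ArtistId ORDER BY artists.ArtistId"
).fetchall()
print(artist_rows)
conn.close()
```

The query engine's job is to produce statements like these from the natural-language question and the table schema, run them, and then phrase the rows as an answer.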
