Generative AI: RAG, AI Agents & Deployment

Table of Contents

Useful links

Types and Application of Gen AI

Marketing

AI Hierarchy

Tokenization

Prompt Engineering

LLM (Large Language Model)

Transformer

RAG (Retrieval Augmented Generation)

MCP (Model Context Protocol)

AI Agent

LangChain&LangGraph, LlamaIndex, CrewAI, AutoGen, PydanticAI

Chatbot

Deployment


Useful links

| Category | Resources |
|---------------------------|---------------------------------------------------------------------------------------------------------------------|
| Book | 《AI Agents in Action》 - O'Reilly |
| Latest News & Resources | MIT AI News (Artificial intelligence, MIT News, Massachusetts Institute of Technology); AIGC Weekly |
| API (GPT, Gemini, etc.) | GPT (Generative Pre-trained Transformer): https://developers.openai.com/api/docs; Gemini: https://ai.google.dev/api |

Types and Application of Gen AI

Types: Text, Image, Video, Audio, 3D, Code, Task, Music, etc.

Application:

  • Text-to-Text
  • Text-to-Image
  • Text-to-Video
  • Text-to-Audio
  • Text-to-3D
  • Text-to-Code
  • Text-to-Task
  • Image-to-Text
  • Image-to-Image
  • Video-to-Text
  • Audio-to-Text
  • Image-to-Video
  • Text-to-Music
  • Music-to-Text
  • Audio-to-Audio

Image Generation:

  • MidJourney
  • Stable Diffusion
  • DALL-E 3

Marketing

AI Hierarchy

AIGC: Artificial Intelligence Generated Content

AIGC emerged largely because of breakthroughs in the parameter scale of large language models (LLMs).

Tokenization

Tokenization breaks text into smaller units (words, subwords, or characters) and maps each unit to a numeric ID.
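To make the text-to-ID mapping concrete, here is a minimal word-level sketch. It is only an illustration: the function names (`tokenize`, `build_vocab`, `encode`) and the `<unk>` placeholder are assumptions made for this example, and production LLMs typically use subword schemes such as byte-pair encoding rather than whole words.

```python
import re


def tokenize(text):
    """Split lowercased text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_vocab(corpus):
    """Give every distinct token an integer ID; ID 0 is reserved for unknown tokens."""
    vocab = {"<unk>": 0}
    for text in corpus:
        for token in tokenize(text):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(text, vocab):
    """Map each token to its ID, falling back to the <unk> ID for unseen tokens."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)]


vocab = build_vocab(["Hello, world!"])
ids = encode("Hello, brave world!", vocab)
print(ids)  # [1, 2, 0, 3, 4]  ("brave" is not in the vocabulary, so it maps to 0)
```

The unknown-token fallback is the main weakness of word-level vocabularies, and it is one reason subword tokenizers are preferred in practice.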

Prompt Engineering

Methods of Prompt Engineering:

  • Domain-specific Knowledge
  • Effective Keywords
  • Role Prompting, Shot Prompting (zero-shot, one-shot, few-shot)
  • Chain-of-Thought (CoT)
  • Chain-of-Thought Self-Consistency
  • Tree-of-Thoughts (ToT)
  • Graph-of-Thoughts (GoT)
  • Algorithm-of-Thoughts
  • Skeleton-of-Thought
  • Program-of-Thoughts
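Several of the methods above are simply ways of assembling the prompt string. The sketch below combines role prompting, few-shot examples, and a chain-of-thought cue. The `build_prompt` helper and the Q/A layout are assumptions made for illustration, not a format required by any particular model or API.

```python
def build_prompt(role, examples, question, chain_of_thought=True):
    """Assemble a prompt from a role line, worked examples, and the new question."""
    lines = [f"You are {role}.", ""]
    # Few-shot prompting: show the model solved question/answer pairs first.
    for example_question, example_answer in examples:
        lines += [f"Q: {example_question}", f"A: {example_answer}", ""]
    lines.append(f"Q: {question}")
    # Chain-of-thought prompting: cue the model to reason before answering.
    lines.append("A: Let's think step by step." if chain_of_thought else "A:")
    return "\n".join(lines)


prompt = build_prompt(
    role="a careful math tutor",
    examples=[("What is 2 + 3?", "2 + 3 = 5. The answer is 5.")],
    question="What is 7 + 8?",
)
print(prompt)
```

The resulting string would then be sent to whichever LLM is in use. The tree-, graph-, and program-based variants build on the same idea but issue several prompts and combine the results.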


LLM (Large Language Model)

Popular Proprietary LLMs

  • OpenAI - GPT
  • Google - Gemini
  • Anthropic - Claude

Open-Source LLMs

  • Meta - Llama
  • DeepSeek
  • OpenAI - gpt-oss
  • Google - Gemma

Applications

  • Chatbots
  • Doc QA
  • Coding
  • Agents

Transformer

My 2020 blog post on AI music composition used an LSTM:

https://blog.csdn.net/Beth_Chan/article/details/111351195

In the new GenAI era, however, generation is no longer built on LSTMs; they have been replaced by the Transformer.

Newer generative LLMs such as GPT and Gemini are built on transformers.

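An analysis of the generational shift from LSTM to Transformer in AI music generation:

I. Characteristics of LSTM in AI music generation

As a variant of the recurrent neural network (RNN), the LSTM was once the mainstream model for AI music generation. Its strengths were:

  1. Sequence modeling: it is good at capturing temporal dependencies in music, such as the direction of a melody and the logic of chord progressions.

  2. Low resource requirements: in 2020, when compute was comparatively limited, LSTMs were cheaper to train and could more easily produce usable music on small and medium-sized datasets.

  3. Focus on local features: it imitates the style of short fragments well, which suits passage-level melody generation.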

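II. The generational improvement brought by the Transformer architecture

The Transformer architecture used by large language models such as GPT and Gemini brought a qualitative leap to AI music generation:

  1. Global attention: self-attention can capture long-range dependencies in music all at once, such as thematic callbacks and structural symmetry across a whole piece.

  2. Multimodal fusion: beyond note sequences, it can combine lyrics, emotion tags, instrument timbre, and other dimensions of information to generate richer music.

  3. General-purpose model adaptability: LLM-based Transformers can process text descriptions directly, enabling cross-modal "text-to-music" generation.

  4. Style transfer: it is better at learning the global stylistic features of different genres, so the generated music is more complete and artistically stronger.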
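The self-attention mechanism mentioned in point 1 can be sketched in a few lines. This is a simplified illustration in plain Python: it feeds the input vectors directly as queries, keys, and values, whereas a real Transformer first applies learned projection matrices and uses multiple heads. The two-dimensional "note" vectors are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: every position attends to every position."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three hypothetical note embeddings; the first and third are identical.
notes = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(notes, notes, notes)
print(weights[0])  # position 0 weights position 2 as heavily as itself
```

Because position 0 scores position 2 directly, the distance between them does not matter. This is the property that lets attention relate a theme to its return much later in a piece, where an LSTM would have to carry that information through every intermediate step.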

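III. The core logic behind the technical iteration

  1. Compute-driven: training Transformers at scale demands massive compute, which was not yet widely available in 2020.

  2. Data growth: the opening up of licensed music data and the expansion of digital music libraries provided ample material for training large models.

  3. Rising demand: expectations moved from simply "generating a melody" to "creating complete musical works that fit a particular scene, emotion, and style".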

Encoder & Decoder

Why do transformers use positional encoding?

To provide token-order information. Self-attention looks at all tokens simultaneously and is otherwise blind to their order, so position has to be added to the input explicitly.
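One common scheme is the sinusoidal encoding from the original Transformer paper, sketched below in plain Python. Many current models use learned or rotary position embeddings instead, so treat this as one illustrative option. The function name and the small `seq_len` and `d_model` values are chosen just for the example.

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position vectors."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each sin/cos pair of dimensions shares one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = positional_encoding(seq_len=4, d_model=8)
print(pe[0])  # position 0: sin(0) = 0.0 and cos(0) = 1.0 alternate
```

Each row is added to the token embedding at that position, so two identical tokens at different positions reach the attention layers as different vectors.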

RAG (Retrieval Augmented Generation)

MCP (Model Context Protocol)

AI Agent

Build AI Agents with LLMs

What are AI agents?

What exactly are AI agents, and why should you want to learn about them in the first place? AI agents are tools designed to allow users to interact with LLMs to achieve a more productive or creative workflow as seamlessly as possible. Before AI agents, users would be forced to build their own statistical language models, a time-consuming, technical, and expensive endeavor! Now, with AI agents, users who want to interact with AI simply get to log in to an interface and conduct business ranging from asking questions of their documents to getting help with their homework.

At a more granular level, you might think of AI agents as UI "wrappers" around the models that power them. That is to say, AI agents are often user-friendly "frontends" that make using the models that fuel them easier, often by focusing and limiting just how users interact with the model. Take ChatGPT, for instance. The models fueling ChatGPT (GPT-3.5 Turbo or GPT-4) are massively complex, powerful, and difficult to use and operate on their own. As an AI agent, ChatGPT abstracts away these models' technical features and allows users to interact with them simply via text.
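The "wrapper" idea can be sketched as a small class that hides prompt assembly and conversation history behind a single `ask` method. Everything here is hypothetical: `ChatAgent` and `fake_model` are names invented for this example, and `fake_model` is a stand-in for a call to a real LLM API.

```python
from typing import Callable, List


class ChatAgent:
    """A thin text interface around any callable that maps a prompt to a reply."""

    def __init__(self, model: Callable[[str], str], system_prompt: str):
        self.model = model
        self.system_prompt = system_prompt
        self.history: List[str] = []

    def ask(self, user_message: str) -> str:
        # The user sends plain text; the agent builds the full prompt for them.
        self.history.append(f"User: {user_message}")
        prompt = "\n".join([self.system_prompt, *self.history, "Assistant:"])
        reply = self.model(prompt)
        self.history.append(f"Assistant: {reply}")
        return reply


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM: reports how many user turns the prompt contains."""
    return f"(reply to turn {prompt.count('User:')})"


agent = ChatAgent(fake_model, "You are a helpful homework assistant.")
first = agent.ask("What is a transformer?")
second = agent.ask("And an LSTM?")
print(first, second)  # (reply to turn 1) (reply to turn 2)
```

Swapping `fake_model` for a function that calls a hosted model would leave the rest of the class unchanged, which is the sense in which the agent abstracts away the model's technical details.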

Use cases

Structure

Design Patterns

Copilot Pattern

Research Pattern

LangChain&LangGraph, LlamaIndex, CrewAI, AutoGen, PydanticAI

Build LLM-based applications

Chatbot

Deployment

FastAPI
