关于Transformer的理解

关于Transformer, QKV的意义表示其更像是一个可学习的查询系统,或许以前搜索引擎的算法就与此有关或者某个分支的搜索算法与此类似。


Can anyone help me to understand this image? - #2 by J_Johnson - nlp - PyTorch Forums

Embeddings - these are learnable weights where each token(token could be a word, sentence piece, subword, character, etc) are converted into a vector, say, with 500 values between 0 and 1 that are trainable.

Positional Encoding - for each token, we want to inform the model where it's located, orderwise. This is because linear layers are not ideal for handling sequential information. So we manually pass this in by adding a vector of sine and cosine values on the first 2 elements in the embedding vector.

This sequence of vectors goes through an attention layer, which basically is like a learnable digitized database search function with keys, queries and values. In this case, we are "searching" for the most likely next token.

The Feed Forward is just a basic linear layer, but is applied across each embedding in the sequence separately(i.e. 3 dim tensor instead of 2 dim).

Then the final Linear layer is where we want to get out our predicted next token in the form of a vector of probabilities, which we apply a softmax to put the values in the range of 0 to 1.

There are two sides because when that diagram was developed, it was being used in language translations. But generative language models for next token prediction just use the Transformer decoder and not the encoder.

Here is a PyTorch tutorial that might help you go through how it works.

Language Modeling with nn.Transformer and torchtext --- PyTorch Tutorials 2.0.1+cu117 documentation


相关推荐
科士威传动7 小时前
丝杆支撑座同轴度如何安装?
人工智能·科技·机器学习·自动化
2401_841495648 小时前
【自然语言处理】中文 n-gram 词模型
人工智能·python·算法·自然语言处理·n-gram·中文文本生成模型·kneser-ney平滑
百***24378 小时前
GPT5.1 vs Claude-Opus-4.5 全维度对比及快速接入实战
大数据·人工智能·gpt
腾讯云开发者8 小时前
与 AI 共生,腾讯云携手行业专家共话数智驱动新质生长
人工智能
WLJT1231231238 小时前
AI懂你,家更暖:重塑生活温度的智能家电新范式
人工智能·生活
roman_日积跬步-终至千里8 小时前
【计算机视觉(16)】语义理解-训练神经网络1_激活_预处理_初始化_BN
人工智能·神经网络·计算机视觉
AI营销实验室8 小时前
原圈科技AI CRM系统引领2025文旅行业智能升级新趋势
人工智能·科技
AI营销前沿8 小时前
私域AI首倡者韩剑,原圈科技领航AI营销
大数据·人工智能
咚咚王者8 小时前
人工智能之数学基础 概率论与统计:第一章 基础概念
人工智能·概率论
_Li.8 小时前
机器学习-集成学习
人工智能·机器学习·集成学习