
Table of Contents

- [1. Context Extension](#1-context-extension)
  - [1.1 Rotary Position Embedding (RoPE)](#11-rotary-position-embedding-rope)
  - [1.2 LongLoRA](#12-longlora)
- [2. Evaluation of Long-Context LLMs](#2-evaluation-of-long-context-llms)
  - [2.1 The Lost in the Middle Phenomenon](#21-the-lost-in-the-middle-phenomenon)
  - [2.2 Long-Context Benchmarks: NIAH, LongBench](#22-long-context-benchmarks-niah-longbench)
- [3. Efficient Attention Mechanisms](#3-efficient-attention-mechanisms)
  - [3.1 KV Cache](#31-kv-cache)
  - [3.2 StreamingLLM and Attention Sinks (key topic)](#32-streamingllm-and-attention-sinks-key-topic)
  - [3.3 DuoAttention: Retrieval Heads and Streaming Heads (key topic)](#33-duoattention-retrieval-heads-and-streaming-heads-key-topic)
  - [3.4 Quest: Query-Aware Sparsity (key topic)](#34-quest-query-aware-sparsity-key-topic)
- [4. Beyond Transformers](#4-beyond-transformers)
  - [4.1 State-Space Models (SSMs): Mamba](#41-state-space-models-ssms-mamba)
  - [4.2 Hybrid Models: Jamba](#42-hybrid-models-jamba)
## 1. Context Extension
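Pre-trained LLMs come with a fixed context window (for example, 4K tokens for LLaMA-2). Context-extension techniques adapt an already-trained model to much longer inputs without pre-training from scratch.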
### 1.1 Rotary Position Embedding (RoPE)
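RoPE (Su et al., 2021) encodes position by rotating each (even, odd) pair of query/key dimensions by an angle m·θ_i, where m is the token position and θ_i = base^(−2i/d) with base = 10000. The attention dot product between two rotated vectors then depends only on the relative offset m − n, so relative position falls out of ordinary attention. For context extension, Position Interpolation (Chen et al., 2023) rescales positions by L_train/L_target so that unseen long positions map back into the rotation angles covered during training; NTK-aware variants rescale the base instead. A minimal sketch, with my own function name and shapes:

```python
import torch

def rope_rotate(x, positions, base=10000.0, scale=1.0):
    """Rotate interleaved (even, odd) feature pairs of x by position-dependent
    angles. x: (seq, dim) with dim even; positions: (seq,) token positions.
    scale < 1.0 gives Position Interpolation: positions are shrunk so a longer
    sequence maps back into the position range seen during training."""
    seq, dim = x.shape
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    angles = (positions.to(torch.float32) * scale)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Queries and keys are both rotated before attention; q_m . k_n then depends
# only on the offset (m - n), which is RoPE's relative-position property.
q = rope_rotate(torch.randn(8, 64), torch.arange(8))
```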


### 1.2 LongLoRA
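LongLoRA (Chen et al., 2023) makes long-context fine-tuning cheap in two ways: it trains LoRA adapters (plus the embedding and normalization layers, which turn out to matter) instead of all weights, and it replaces full attention during training with Shifted Sparse Attention (S²-Attn). Tokens are split into groups, attention runs within each group, and half of the heads have their groups shifted by half a group size so information still crosses group borders. At inference the model falls back to standard full attention, so deployment needs no custom kernel. A sketch of the shift-and-group step, assuming my own tensor layout:

```python
import torch

def s2_attn_groups(qkv, group_size):
    """Minimal sketch of LongLoRA's S2-Attn (function name and shapes are
    mine). qkv: (batch, seq, heads, dim), seq divisible by group_size.
    Half the heads are rolled by half a group so that group-local attention
    still mixes information across group boundaries."""
    b, s, h, d = qkv.shape
    shifted = qkv.clone()
    shifted[:, :, h // 2:] = torch.roll(
        qkv[:, :, h // 2:], shifts=-(group_size // 2), dims=1)
    # Fold the sequence into groups; attention is then computed per group,
    # which is s/group_size times cheaper than attending over the full sequence.
    return shifted.reshape(b, s // group_size, group_size, h, d)
```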

## 2. Evaluation of Long-Context LLMs
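A longer window is only useful if the model can actually find and use information anywhere inside it, so long-context models need dedicated evaluation.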
### 2.1 The Lost in the Middle Phenomenon
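Liu et al. (2023) measured multi-document QA and key-value retrieval accuracy as a function of where the relevant information sits in the context and found a U-shaped curve: models use information at the very beginning (primacy) and the very end (recency) of the input far better than information in the middle. The effect persists even in models marketed as long-context, so a large claimed window does not guarantee uniform access to everything in it.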

### 2.2 Long-Context Benchmarks: NIAH, LongBench
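Needle in a Haystack (NIAH) inserts a short out-of-place fact (the needle) at a controlled depth inside long filler text and asks the model to retrieve it; sweeping context length against insertion depth produces the familiar retrieval heatmap. LongBench (Bai et al., 2023) is a bilingual, multi-task suite (QA, summarization, few-shot learning, code) over long inputs. A toy NIAH test-case builder (the helper and its inputs are my own illustration):

```python
def build_niah_prompt(haystack, needle, question, depth):
    """Insert `needle` at relative `depth` (0.0 = start, 1.0 = end) of the
    filler text, then ask a question only the needle can answer."""
    cut = int(len(haystack) * depth)
    context = haystack[:cut] + " " + needle + " " + haystack[cut:]
    return f"{context}\n\nQuestion: {question}\nAnswer:"

prompt = build_niah_prompt(
    haystack=" ".join(["The grass is green."] * 2000),
    needle="The magic number is 7481.",
    question="What is the magic number?",
    depth=0.5,  # mid-context, where lost-in-the-middle models do worst
)
```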


## 3. Efficient Attention Mechanisms
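Even once a model accepts long inputs, attention compute and KV-cache memory grow with sequence length; the techniques below attack the inference cost.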
### 3.1 KV Cache
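During autoregressive decoding, the keys and values of all previous tokens are cached so that each new token computes attention against stored K/V instead of re-running the whole prefix. The cache is why long contexts are memory-bound: its size per token is 2 (K and V) × n_layers × n_kv_heads × head_dim × bytes per element. For LLaMA-2-7B in FP16 that is 2 × 32 × 32 × 128 × 2 B = 512 KB per token, roughly 2 GB for a single 4K-token sequence. A single-head sketch of one decode step:

```python
import torch

def decode_step(q, k_new, v_new, cache):
    """One decode step for a single head (a sketch; real engines batch this
    and preallocate the cache). q, k_new, v_new: (dim,). The new token's K/V
    are appended, then the single query attends over every cached position."""
    cache["k"] = torch.cat([cache["k"], k_new[None, :]], dim=0)  # (t, dim)
    cache["v"] = torch.cat([cache["v"], v_new[None, :]], dim=0)
    scores = cache["k"] @ q / q.shape[-1] ** 0.5                 # (t,)
    return torch.softmax(scores, dim=0) @ cache["v"]             # (dim,)

cache = {"k": torch.zeros(0, 64), "v": torch.zeros(0, 64)}
out = decode_step(torch.randn(64), torch.randn(64), torch.randn(64), cache)
```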


### 3.2 StreamingLLM and Attention Sinks (key topic)
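Xiao et al. (2023) observed that autoregressive LLMs place a surprisingly large share of attention mass on the first few tokens regardless of their content: softmax has to put its probability somewhere, and the initial tokens, visible to every later query, become "attention sinks". A plain sliding window that evicts the oldest KV entries therefore collapses in perplexity as soon as the sink tokens fall out of the cache. StreamingLLM keeps a handful of sink tokens (for example, 4) plus a rolling window of recent tokens, and assigns positions by index within the cache rather than by original absolute position, which gives stable perplexity over millions of streamed tokens at constant memory. Note that it does not extend the effective context: evicted middle tokens are gone for good. A sketch of the eviction policy:

```python
def keep_indices(cache_len, n_sink=4, window=1020):
    """StreamingLLM-style eviction policy (a sketch): always keep the first
    n_sink tokens (the attention sinks) plus the most recent `window` tokens,
    so the KV cache stays the same size no matter how long the stream runs."""
    if cache_len <= n_sink + window:
        return list(range(cache_len))
    return list(range(n_sink)) + list(range(cache_len - window, cache_len))
```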

### 3.3 DuoAttention: Retrieval Heads and Streaming Heads (key topic)
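DuoAttention (Xiao et al., 2024) splits attention heads into two kinds: retrieval heads, which genuinely attend to distant tokens and need the full KV cache, and streaming heads, which mostly look at attention sinks plus recent tokens. The split is identified by training a per-head gate on synthetic passkey-retrieval data; at inference, only the retrieval heads keep full KV, which cuts cache memory and decoding latency with little loss in long-context accuracy. A sketch of the two cache policies (identifiers are mine):

```python
import torch

def per_head_kv(keys, values, is_retrieval_head, n_sink=4, window=256):
    """Sketch of DuoAttention's per-head KV policy. keys/values:
    (heads, seq, dim) with seq > n_sink + window. Retrieval heads keep the
    full cache; streaming heads keep only sinks plus a recent window."""
    h, seq, _ = keys.shape
    idx = torch.cat([torch.arange(n_sink), torch.arange(seq - window, seq)])
    kept = []
    for head in range(h):
        if is_retrieval_head[head]:
            kept.append((keys[head], values[head]))            # full cache
        else:
            kept.append((keys[head, idx], values[head, idx]))  # sink + recent
    return kept
```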

### 3.4 Quest: Query-Aware Sparsity (key topic)
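Quest (Tang et al., 2024) starts from the observation that at any decoding step only a small fraction of KV-cache pages is critical, and which pages matter depends on the current query. It keeps per-page metadata, the element-wise min and max of the keys in each page (for example, 16 tokens per page), uses the query to compute an upper bound on each page's attention logits, and runs attention only over the top-K pages. A sketch of the page-selection step (variable names are mine):

```python
import torch

def quest_select_pages(q, key_pages, top_k):
    """Query-aware page selection in the style of Quest. q: (dim,);
    key_pages: (num_pages, page_size, dim). For each page, bound the largest
    possible attention logit using the page's per-dimension key min/max, then
    keep the indices of the top_k highest-bound pages."""
    k_min = key_pages.min(dim=1).values   # (num_pages, dim)
    k_max = key_pages.max(dim=1).values
    # Upper bound of q @ k for any key in the page: per-dimension best case.
    bound = torch.maximum(q * k_min, q * k_max).sum(dim=-1)  # (num_pages,)
    return bound.topk(top_k).indices
```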

## 4. Beyond Transformers
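Attention is quadratic in sequence length at training time and its cache grows linearly at inference, which motivates architectures whose per-token cost does not grow with context.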
### 4.1 State-Space Models (SSMs): Mamba
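State-space models keep a fixed-size recurrent state, h_t = Ā h_{t−1} + B̄ x_t with read-out y_t = C h_t, so decoding costs O(1) time and memory per token and needs no KV cache. Mamba (Gu and Dao, 2023) makes B, C, and the step size Δ functions of the input ("selection"), letting the model decide per token what to store or forget, and trains efficiently with a hardware-aware parallel scan. A heavily simplified sketch of one recurrent step (Euler-style discretization; the real kernels use zero-order hold and fusion):

```python
import torch

def selective_ssm_step(h, x, A, B_t, C_t, dt):
    """One decode step of a selective SSM. h: (d, n) state; x: (d,) input;
    A: (d, n) negative state matrix. B_t, C_t: (n,) and dt: (d,) are computed
    FROM x, which is the 'selective' part: the model chooses per token what
    to write into, and read out of, the fixed-size state."""
    A_bar = torch.exp(dt[:, None] * A)      # input-dependent decay
    B_bar = dt[:, None] * B_t[None, :]      # input-dependent write strength
    h = A_bar * h + B_bar * x[:, None]      # state update, constant memory
    y = h @ C_t                             # read-out, (d,)
    return h, y
```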

### 4.2 Hybrid Models: Jamba
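Jamba (AI21, 2024) interleaves a small number of Transformer attention layers with Mamba layers and mixture-of-experts (MoE) MLPs: each 8-layer block reportedly uses 1 attention layer to 7 Mamba layers, with MoE in every other MLP. The few attention layers recover in-context abilities that pure SSMs struggle with, while the Mamba majority keeps the KV cache far smaller than a comparable pure Transformer, supporting a 256K context. An illustrative layout following those reported ratios (my own sketch, not AI21's code):

```python
def jamba_layout(n_blocks=4, layers_per_block=8, attn_period=8, moe_period=2):
    """Build a (mixer, mlp) label per layer: one attention layer per 8-layer
    block, Mamba elsewhere, and MoE in every other MLP."""
    layout = []
    for i in range(n_blocks * layers_per_block):
        mixer = "attention" if i % attn_period == 0 else "mamba"
        mlp = "moe" if i % moe_period == 1 else "dense"
        layout.append((mixer, mlp))
    return layout
```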
