Advanced CoT: Self Consistency, Least-To-Most

Advanced CoT

- [1: Self Consistency](#一：Self Consistency)
  - [1.1 Method overview](#1.1 方法简介)
  - [1.2 Experiments](#1.2 实验)
  - [1.3 Results](#1.3 结果)
- 2: Least-to-most
  - [2.1 Method overview](#2.1 方法简介)
  - [2.2 Examples](#2.2 示例)
  - [2.3 Results](#2.3 结果)

1: Self Consistency

Title: SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS

Affiliation: Google Brain, ICLR 2023

Paper: https://arxiv.org/pdf/2203.11171.pdf

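Task: A complex problem can often be solved along several different reasoning paths that all arrive at the same final answer. The paper therefore replaces the greedy decoding used in the original CoT with a decoding algorithm called Self Consistency.

Features: sample-and-marginalize with voting over the final answers. It avoids the local optima and repetitive output of greedy CoT decoding and can be viewed as a "self-ensemble". It needs no extra training, annotation, or fine-tuning, and it works plug-and-play with existing sampling algorithms such as temperature sampling, top-k sampling, and nucleus sampling.

Prerequisite related work: CoT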

1.1 Method overview

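Prompt the large model with CoT prompting.
Replace the greedy decoding in CoT with sampling, so that the model generates a set of reasoning paths.
Vote on the final answers and keep the most consistent one.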
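
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `sample_reasoning_path` is a hypothetical stand-in for one sampled CoT completion from a language model, its canned outputs are made up for the demo, and the `The answer is N.` output format is assumed so that the final answer can be parsed.

```python
import random
import re
from collections import Counter


def sample_reasoning_path(question, rng):
    """Hypothetical stand-in for one sampled CoT completion from an LLM.

    The canned strings and their weights are invented for illustration only.
    """
    candidates = [
        "16 - 3 - 4 = 9 eggs are left, and 9 * 2 = 18. The answer is 18.",
        "She keeps 9 eggs to sell at $2 each, so 18. The answer is 18.",
        "She sells 16 - 3 = 13 eggs, and 13 * 2 = 26. The answer is 26.",
    ]
    return rng.choices(candidates, weights=[0.4, 0.4, 0.2], k=1)[0]


def extract_answer(path):
    """Parse the final answer, assuming the 'The answer is N.' format."""
    match = re.search(r"The answer is (-?\d+(?:\.\d+)?)", path)
    return match.group(1) if match else None


def self_consistency(question, num_paths=40, seed=0, sampler=sample_reasoning_path):
    """Sample several reasoning paths and majority-vote on their final answers."""
    rng = random.Random(seed)
    answers = []
    for _ in range(num_paths):
        answer = extract_answer(sampler(question, rng))
        if answer is not None:  # skip paths whose answer cannot be parsed
            answers.append(answer)
    if not answers:
        return None, Counter()
    votes = Counter(answers)
    return votes.most_common(1)[0][0], votes


if __name__ == "__main__":
    best, votes = self_consistency("How much does she make per day?")
    print(best, dict(votes))
```

Only the final answers are compared, so two paths with different intermediate wording still count as agreeing. In a real setup the `sampler` argument would wrap a model call with sampling enabled.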

The sampling algorithms commonly used in NLG include Greedy Search (Maximization), Beam Search, Temperature Sampling, Top-K Sampling, and Top-P Sampling (Nucleus sampling).
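
As a rough illustration of how these strategies differ, the sketch below applies them to a single next-token distribution. It is a simplified pure-Python toy, not any library's API: all function names are made up for this example, the logits are arbitrary, and beam search is left out because it operates over whole sequences rather than one step.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs, keep):
    """Zero out every index not in `keep` and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)


def greedy(probs):
    """Greedy search: always pick the single most probable token."""
    return max(range(len(probs)), key=lambda i: probs[i])


def sample(probs, rng):
    """Draw one token index at random according to the probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 1.0, 0.0]  # arbitrary toy logits for a 3-token vocabulary
    probs = softmax(logits, temperature=0.7)
    print("greedy:", greedy(probs))
    print("top-k sample:", sample(top_k_filter(probs, 2), rng))
    print("top-p sample:", sample(top_p_filter(probs, 0.9), rng))
```

Greedy search always returns the same token, while the sampling variants can return different tokens on each call. That diversity is what lets Self Consistency collect multiple distinct reasoning paths.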

1.2 Experiments

Arithmetic Reasoning
Commonsense and Symbolic Reasoning
Self-Consistency Helps When Chain-of-Thought Hurts Performance
Comparison to Sample-and-Rank
Comparison to Beam Search
Comparison to Ensemble-based Approaches
Self-Consistency is Robust to Sampling Strategies and Scaling
Self-Consistency Improves Robustness to Imperfect Prompts
Self-Consistency Works for Non-Natural-Language Reasoning Paths and Zero-shot CoT