[Paper Reading] Takeaways from several multi-turn dialogue papers, ACL 2023

Preface

None of these papers targets general tasks; they all lean on some extra information to handle a specific task.

【1】Proposes generating the attribute prompt (i.e. the user's persona or the dialogue intent) at the instance level rather than the task level
- trains a lightweight prompt module that takes a control attribute as input (in a shallow and a deep version)
- instead of training static soft tokens for the dialogue task
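
To make the instance-level versus task-level contrast concrete, here is a minimal sketch in plain Python. It is not the architecture from 【1】: the class names, the single linear projection per prompt token, and the random untrained weights are all assumptions made for illustration, and the shallow/deep distinction is not modeled.

```python
import random


def make_linear(in_dim, out_dim, seed):
    """Random (untrained) weight matrix of shape out_dim x in_dim."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


class StaticPrompt:
    """Task-level: one set of soft tokens shared by every instance."""

    def __init__(self, n_tokens, dim, seed=0):
        rng = random.Random(seed)
        self.tokens = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                       for _ in range(n_tokens)]

    def __call__(self, attribute_vec):
        # The control attribute is ignored: the prompt never changes.
        return self.tokens


class InstancePrompt:
    """Instance-level: prompt tokens are computed from the control attribute."""

    def __init__(self, n_tokens, attr_dim, dim, seed=0):
        # Hypothetical design: one linear projection per prompt token.
        self.projections = [make_linear(attr_dim, dim, seed + i)
                            for i in range(n_tokens)]

    def __call__(self, attribute_vec):
        return [matvec(w, attribute_vec) for w in self.projections]


# Two different control attributes, e.g. one-hot dialogue intents.
intent_a, intent_b = [1.0, 0.0], [0.0, 1.0]
static = StaticPrompt(n_tokens=3, dim=4)
instance = InstancePrompt(n_tokens=3, attr_dim=2, dim=4)
```

With this setup `static(intent_a)` and `static(intent_b)` are identical, while `instance(intent_a)` and `instance(intent_b)` differ, which is the property the notes attribute to the instance-level approach.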
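【2】At inference time, predicts persona information from the dialogue history to customize the dialogue agent, instead of relying on an explicit persona description
- Proposes two kinds of persona detection model:
  - given the dialogue history, the model is trained so that its output vector approximates the persona vector (obtained by encoding the persona description)
  - given the dialogue history, the model is trained to generate the persona description directly
- Multi-task training: the persona detection model is trained jointly with the dialogue context encoder
  - the two share their first-layer parameters, which can be seen as a general-purpose encoder of dialogue information
  - the persona detection model and the dialogue model are trained together to maximize the probability of the ground-truth response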
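
The first detection variant and the joint objective above can be sketched as a single loss. This is only an illustration under assumptions: the notes say the predicted vector should "approximate" the persona vector without naming a distance, so cosine distance is assumed here, and the `weight` parameter balancing the two terms is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def persona_alignment_loss(predicted_vec, persona_vec):
    """Assumed distance: 1 - cosine similarity (0 when the vectors align)."""
    return 1.0 - cosine(predicted_vec, persona_vec)


def response_nll(token_log_probs):
    """Mean negative log-likelihood of the ground-truth response tokens."""
    return -sum(token_log_probs) / len(token_log_probs)


def joint_loss(predicted_vec, persona_vec, token_log_probs, weight=1.0):
    """Response likelihood term plus a weighted persona-alignment term.

    `weight` is a hypothetical hyperparameter, not taken from the paper.
    """
    return (response_nll(token_log_probs)
            + weight * persona_alignment_loss(predicted_vec, persona_vec))
```

Minimizing `response_nll` is the same as maximizing the probability of the ground-truth response; the alignment term only applies to the vector-matching variant, not to the variant that generates the persona description as text.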
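【3】Generates responses that contain specific grammatical items (e.g. the present perfect, the subjunctive mood, relative clauses). It tries both reinforcement learning on DialoGPT and in-context learning with GPT-3, and finds that both work.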
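
For the in-context learning route, the idea is to put a few demonstrations that use the target grammatical item in front of the new dialogue context. The prompt layout below is a hypothetical format for illustration only; the notes do not record the actual prompt used in 【3】, and no model call is shown.

```python
def build_prompt(grammar_item, examples, context):
    """Assemble a few-shot prompt (hypothetical format).

    examples: list of (dialogue_context, response) pairs in which the
    response uses the target grammatical item.
    """
    lines = [f"Reply to the dialogue with a response that uses: {grammar_item}.", ""]
    for example_context, example_response in examples:
        lines.append(f"Dialogue: {example_context}")
        lines.append(f"Response: {example_response}")
        lines.append("")
    # The final block is left open for the language model to complete.
    lines.append(f"Dialogue: {context}")
    lines.append("Response:")
    return "\n".join(lines)


prompt = build_prompt(
    "present perfect",
    [("Do you know Paris well?", "Yes, I have visited it three times.")],
    "Is dinner ready?",
)
```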

【1】DailyDialog for label control and FoCus for document control
- DailyDialog: every utterance is annotated with a dialogue act (the figure shows the emotion labels; the act labels are in a separate file), with four classes in total (inform, question, directive, commissive)
- FoCus: includes the user's persona; the goal is to build a dialogue agent
- Response evaluation
  - controllability for customizing responses
  - n-gram based: BLEU, NIST, ROUGE-L, METEOR for fluency and adequacy
  - distinct n-gram: Dist and Entropy for diversity
  - human evaluation for consistency between the dialogue context and the response, and for attribute controllability
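
The two diversity metrics in the list above are simple to compute. The sketch below uses the common definitions (Dist-n as unique n-grams over total n-grams, Entropy as the Shannon entropy of the n-gram distribution); whitespace tokenization and pooling all responses into one corpus-level count are assumptions, and the papers may tokenize or aggregate differently.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(responses, n):
    """Corpus-level n-gram counts over whitespace-tokenized responses."""
    return Counter(g for r in responses for g in ngrams(r.split(), n))


def distinct_n(responses, n):
    """Dist-n: number of unique n-grams divided by total n-grams."""
    counts = ngram_counts(responses, n)
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


def entropy_n(responses, n):
    """Shannon entropy (natural log) of the n-gram frequency distribution."""
    counts = ngram_counts(responses, n)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


responses = ["a b a b", "a c"]
dist1 = distinct_n(responses, 1)  # 3 unique unigrams out of 6
dist2 = distinct_n(responses, 2)  # 3 unique bigrams out of 4
```

Higher values of either metric indicate less repetitive output; Dist-n ignores how often each n-gram repeats, while Entropy also rewards a more even distribution.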
【2】PersonaChat and DailyDialog
- PersonaChat (arXiv 2018)
- To verify generalization, the method is also tested on DailyDialog
- Evaluation:
  - PPL (perplexity) for fluency
  - Dist for diversity
  - P-Cover for covering persona information
  - human evaluation (20 annotators)
  - etc.
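
As a reminder of what the fluency metric measures, perplexity is the exponential of the mean negative log-likelihood the model assigns to the reference tokens. The sketch below assumes the per-token log-probabilities (natural log) have already been obtained from a model; how they are produced is outside its scope.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-probability over the reference tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# A model that gives every token probability 0.25 is as uncertain
# as a uniform choice among four options.
uniform_four = perplexity([math.log(0.25)] * 4)
```

Lower perplexity means the model finds the reference responses more predictable, which is why it is used as a proxy for fluency.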
【3】DailyDialog (SCoRE is used to train the classifier)
- Evaluation:
  - Dist for diversity
  - G-Ratio for containing the item
  - GOAL for fluency
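
Reading the item-containment metric literally, it is the fraction of generated responses in which the target grammatical item is detected. In the sketch below the function names are hypothetical and the regex detector for the present perfect is a deliberately crude stand-in: per the notes, the actual detector is a classifier trained on SCoRE, and the paper's exact metric definition may differ.

```python
import re


def contains_present_perfect(text):
    """Toy detector: 'have/has' followed by a word ending in -ed or -en.

    Misses irregular participles (e.g. 'has gone') and can over-match;
    it only stands in for a trained grammatical-item classifier.
    """
    return re.search(r"\b(have|has)\s+\w+(ed|en)\b", text.lower()) is not None


def item_ratio(responses, detector):
    """Fraction of responses for which the detector fires."""
    if not responses:
        return 0.0
    return sum(1 for r in responses if detector(r)) / len(responses)


generated = ["I have visited Paris.", "I went home.", "She has eaten."]
ratio = item_ratio(generated, contains_present_perfect)
```

Passing the detector in as a function keeps the ratio computation independent of how the item is recognized, so the toy regex can be swapped for a real classifier.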