Skill Discovery | A Summary of Classic Work on Unsupervised Skill Discovery

[🐱 Unsupervised](#🐱 Unsupervised)
- [Diversity is All You Need: Learning Skills without a Reward Function (diayn)](#Diversity is All You Need: Learning Skills without a Reward Function (diayn))
- [Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills (EDL)](#Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills (EDL))
- [CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery](#CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery)
- [Lipschitz-constrained Unsupervised Skill Discovery (LSD)](#Lipschitz-constrained Unsupervised Skill Discovery (LSD))
- [Controllability-Aware Unsupervised Skill Discovery (CSD)](#Controllability-Aware Unsupervised Skill Discovery (CSD))
- [METRA: Scalable Unsupervised RL with Metric-Aware Abstraction](#METRA: Scalable Unsupervised RL with Metric-Aware Abstraction)
- [Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning (csf)](#Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning (csf))
- [Foundation Policies with Hilbert Representations (HILP, offline metra)](#Foundation Policies with Hilbert Representations (HILP, offline metra))
- [Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning (DUSDi)](#Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning (DUSDi))
- [SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions](#SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions)
- [Efficient Skill Discovery via Regret-Aware Optimization](#Efficient Skill Discovery via Regret-Aware Optimization)
[🦜 Guided](#🦜 Guided)
- [Safety-Aware Unsupervised Skill Discovery](#Safety-Aware Unsupervised Skill Discovery)
- [Do's and Don'ts: Learning Desirable Skills with Instruction Videos (dodont)](#Do's and Don'ts: Learning Desirable Skills with Instruction Videos (dodont))
- [Language Guided Skill Discovery (LGSD)](#Language Guided Skill Discovery (LGSD))
- [Reference Guided Skill Discovery (RGSD)](#Reference Guided Skill Discovery (RGSD))
- [Controlled Diversity with Preference: Towards Learning a Diverse Set of Desired Skills (CDP)](#Controlled Diversity with Preference: Towards Learning a Diverse Set of Desired Skills (CDP))
- [Human-Aligned Skill Discovery Balancing Behaviour Exploration and Alignment (HaSD)](#Human-Aligned Skill Discovery Balancing Behaviour Exploration and Alignment (HaSD))
- [Guiding Skill Discovery with Foundation Models (fog)](#Guiding Skill Discovery with Foundation Models (fog))

🐱 Unsupervised

Diversity is All You Need: Learning Skills without a Reward Function (diayn)

ICLR 2019.
arXiv: https://arxiv.org/abs/1802.06070
pdf: https://arxiv.org/pdf/1802.06070
html: https://ar5iv.labs.arxiv.org/html/1802.06070
website: https://sites.google.com/view/diayn
Blog: Quick Paper Reading Notes | 2025.01

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills (EDL)

ICML 2020.
arXiv: https://arxiv.org/abs/2002.03647
GitHub: https://github.com/victorcampos7/edl
Blog: Quick Paper Reading Notes | 2026.02

Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning (csf)

ICLR 2025 oral.
arXiv: https://arxiv.org/abs/2412.08021
OpenReview: https://openreview.net/forum?id=xoIeVdFO7U
GitHub: https://github.com/Princeton-RL/contrastive-successor-features
Blog: Quick Paper Reading Notes | 2025.06

Foundation Policies with Hilbert Representations (HILP, offline metra)

ICML 2024.
arXiv: https://arxiv.org/abs/2402.15567
website: https://seohong.me/projects/hilp/
Blog: Quick Paper Reading Notes | 2025.12 (2)

Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning (DUSDi)

NeurIPS 2024, reviewer scores 7 5 5 4, poster.
arXiv: https://arxiv.org/abs/2410.11251
OpenReview: https://openreview.net/forum?id=ePOBcWfNFC
website: https://jiahenghu.github.io/DUSDi-site/
Blog: Quick Paper Reading Notes | 2025.09

SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions

NeurIPS 2024, reviewer scores 8 6 5 5, poster; a 5 is borderline accept.
arXiv: https://arxiv.org/abs/2410.18416
OpenReview: https://openreview.net/forum?id=i816TeqgVh
website: https://wangzizhao.github.io/SkiLD/
Blog: Quick Paper Reading Notes | 2025.09

Efficient Skill Discovery via Regret-Aware Optimization

ICML 2025, reviewer scores 3 3 2 1, poster.
arXiv: https://arxiv.org/abs/2506.21044
OpenReview: https://openreview.net/forum?id=4qMJ8Ignmp
GitHub: https://github.com/ZhHe11/RSD
Blog: Quick Paper Reading Notes | 2025.10

🦜 Guided

Safety-Aware Unsupervised Skill Discovery

ICRA 2023.
paper: https://safe-skill.github.io/static/pdfs/safe-skill.pdf
website: https://safe-skill.github.io/
Blog: CSDN | [ICRA 2023] SASD paper reading notes: a safety-aware unsupervised skill discovery method

Do's and Don'ts: Learning Desirable Skills with Instruction Videos (dodont)

NeurIPS 2024 poster.
arXiv: https://arxiv.org/abs/2406.00324
pdf: https://arxiv.org/pdf/2406.00324
html: https://arxiv.org/html/2406.00324
website: https://mynsng.github.io/dodont/
OpenReview: https://openreview.net/forum?id=7X5zu6GIuW
Blog: Skill Discovery | DoDont: using do + don't example videos to guide the agent toward the skills humans want

Language Guided Skill Discovery (LGSD)

ICLR 2025, reviewer scores 8 8 6 6, poster.
arXiv: https://arxiv.org/abs/2406.06615
pdf: https://arxiv.org/pdf/2406.06615
html: https://arxiv.org/html/2406.06615v2
OpenReview: https://openreview.net/forum?id=i3e92uSZCp
Blog: Skill Discovery | LGSD: using the distance between language embeddings of state descriptions as the d(x,y) distance constraint in METRA

Reference Guided Skill Discovery (RGSD)

ICLR 2026.
arXiv: https://arxiv.org/abs/2510.06203
pdf: https://arxiv.org/pdf/2510.06203
html: https://arxiv.org/html/2510.06203
OpenReview: https://openreview.net/forum?id=IaGf8Eh5Uo
Blog: Skill Discovery | RGSD: pretraining the skill space on high-quality reference trajectories

Controlled Diversity with Preference: Towards Learning a Diverse Set of Desired Skills (CDP)

AAMAS 2023.
arXiv: https://arxiv.org/abs/2303.04592
GitHub: https://github.com/HussonnoisMaxence/CDP
Journal version: Human-informed skill discovery: Controlled diversity with preference in reinforcement learning, ScienceDirect.
Blog: Quick Paper Reading Notes | 2025.11

Human-Aligned Skill Discovery Balancing Behaviour Exploration and Alignment (HaSD)

AAMAS 2025.
arXiv: https://arxiv.org/abs/2501.17431
GitHub: https://github.com/HussonnoisMaxence/HaSD-AAMAS
Blog: Quick Paper Reading Notes | 2025.11

Guiding Skill Discovery with Foundation Models (fog)

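Latest paper link: https://liacs.leidenuniv.nl/~plaata1/papers/4848.pdf
OpenReview paper link for the ICLR 2025 version: https://openreview.net/pdf?id=nZBUtzJhf8
Latest website: https://sites.google.com/view/submission-fog (unfortunately, some of the visualizations appear to be broken)
Blog: Skill Discovery | FoG: using an LLM / CLIP to produce dodont weights that guide the agent toward safe exploration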
