Reward Function Design

datamonday
1 year ago
Artificial Intelligence · Robotics · LLM · Reinforcement Learning · GPT-4 · Embodied AI · Reward Function Design
[EAI 019] Eureka: Human-Level Reward Design via Coding LLM
Paper title: Eureka: Human-Level Reward Design via Coding Large Language Models
Authors: Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar
Affiliations: NVIDIA; UPenn; Caltech; UT Austin
Original paper: ht