RT-1: ROBOTICS TRANSFORMER FOR REAL-WORLD CONTROL AT SCALE

Abstract

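By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. (Note: by turning large, diverse, task-general datasets into knowledge, today's large models can solve specific downstream tasks at a high level, whether zero-shot or with a small amount of task-specific data. Large language models have already borne this out: combined with reinforcement learning, a single large model can handle many kinds of tasks.)

While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. (Note: the approach is established in computer vision, NLP, and speech recognition, but still has to be validated in robotics. Because real robot data is hard to collect, the model's ability to generalize matters all the more.)

We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. (Note: the authors believe a high-capacity architecture can make good use of diverse robot data, and that the key is open-ended, task-agnostic training. In other words, training is not aimed at one particular task but is general across all tasks, much like today's large language models.)

In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. (Note: the paper introduces the Robotics Transformer, a model class that shows promising scaling behavior.)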

Introduction

End-to-end robotic learning, with either imitation or reinforcement, typically involves collecting task-specific data in either single-task (Kalashnikov et al., 2018; Zhang et al., 2018) or multitask (Kalashnikov et al., 2021b; Jang et al., 2021) settings that are narrowly tailored to the tasks that the robot should perform. (Note: current end-to-end robot learning, mainly reinforcement learning and imitation learning, relies on collecting training data for designated tasks, or data set up specifically for a fixed set of robot tasks.)

This workflow mirrors the classic approach to supervised learning in other domains, such as computer vision and NLP, where task-specific datasets would be collected, labeled, and deployed to solve individual tasks, with little interplay between the tasks themselves. (Note: this is the same workflow once used in computer vision and NLP: collect data for a given task, label it, and deploy a model for that task alone, with very little interaction between tasks.)

Recent years have seen a transformation in vision, NLP, and other domains, away from siloed, small scale datasets and models and towards large, general models pre-trained on broad, large datasets. (Note: in recent years, vision and NLP have moved from isolated, small-data, single-task training toward large, general models trained on large amounts of data.)

The keys to the success of such models lie with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the knowledge present in large-scale datasets. (Note: the key factors are open-ended, task-independent training and an architecture with enough capacity to learn a great deal of knowledge from very large datasets.)

If a model can "sponge up" experience to learn general patterns in language or perception, then it can bring them to bear on individual tasks more efficiently. (Note: if a model can absorb a large amount of knowledge and learn general patterns, it can be applied to individual tasks more efficiently.)

While removing the need for large task-specific datasets is appealing generally in supervised learning, it is even more critical in robotics, where datasets might require engineering-heavy autonomous operation or expensive human demonstrations. (Note: removing the need for large task-specific datasets is attractive in supervised learning, and it matters even more in robotics, where collecting data requires engineering-heavy autonomous operation or costly human demonstrations.)

We therefore ask: can we train a single, capable, large multi-task backbone model on data consisting of a wide variety of robotic tasks? (Note: can we train a capable backbone model that supports many tasks and that adapts zero-shot to new tasks, environments, or objects?)
