Open-source datasets for robot and robotic-arm tasks

Open X-Embodiment (one of the largest)

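  • Over 1M real-robot trajectories from 21+ institutions and 22 different robots (single-arm, dual-arm, quadruped, and others), stored in a unified format (RLDS). It includes images, actions, language instructions, and more; because of its diversity it is sometimes described as the "ImageNet of robotics".
  • Use: training generalist policies across embodiments (different robot bodies), i.e. X-robot learning, as well as behavior cloning, world models, and so on. It is intended to support zero-shot and few-shot transfer.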

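Data in Open X-Embodiment is organized as follows:

  • Dataset → made up of many Episodes (each a complete demonstration trajectory)
  • Episode → made up of many Steps (time steps, one per frame)
  • Each Step contains the core fields shown below

A single step looks like this:

python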
('action', {'open_gripper': <tf.Tensor: shape=(), dtype=bool, numpy=True>, 'rotation_delta': <tf.Tensor: shape=(3,), dtype=float32, numpy=array([ 6.077167e-07, -1.193009e-07,  1.308389e-07], dtype=float32)>, 'terminate_episode': <tf.Tensor: shape=(), dtype=float32, numpy=0.0>, 'world_vector': <tf.Tensor: shape=(3,), dtype=float32, numpy=array([1.9514002e-10, 8.0674190e-11, 2.9859176e-10], dtype=float32)>})
('is_first', <tf.Tensor: shape=(), dtype=bool, numpy=True>)
('is_last', <tf.Tensor: shape=(), dtype=bool, numpy=False>)
('is_terminal', <tf.Tensor: shape=(), dtype=bool, numpy=False>)
('observation', {'image': <tf.Tensor: shape=(480, 640, 3), dtype=uint8, numpy=
array([[[ 83,  88,  91],
        [ 92,  97, 100],
        [ 95, 100, 103],
        ...,
        [ 12,  16,  19],
        [ 11,  15,  18],
        [ 11,  15,  18]],

       [[ 99, 104, 107],
        [112, 117, 120],
        [ 77,  82,  85],
        ...,
        [ 12,  16,  19],
        [ 12,  16,  19],
        [ 11,  15,  18]],

       [[109, 114, 117],
        [102, 107, 110],
        [ 73,  78,  81],
        ...,
        [ 13,  17,  20],
        [ 12,  15,  20],
        [ 12,  15,  20]],

       ...,

       [[209, 199, 189],
        [212, 202, 192],
        [210, 200, 190],
        ...,
        [159, 120,  89],
        [159, 120,  89],
        [160, 121,  88]],

       [[214, 197, 187],
        [211, 197, 186],
        [212, 199, 190],
        ...,
        [156, 120,  88],
        [156, 120,  88],
        [157, 121,  89]],

       [[214, 196, 186],
        [217, 200, 190],
        [215, 201, 192],
        ...,
        [155, 118,  89],
        [155, 119,  87],
        [156, 120,  88]]], dtype=uint8)>, 'natural_language_embedding': <tf.Tensor: shape=(512,), dtype=float32, numpy=
array([ 2.30294447e-02,  6.05173744e-02, -5.20481169e-02, -2.47768629e-02,
        1.08427750e-02,  9.11148265e-03, -5.82255470e-03, -5.61384659e-04,
       -6.19610846e-02,  2.86639649e-02, -5.38054272e-04,  3.64868939e-02,
        5.83116375e-02, -6.58905273e-03, -8.61448571e-02, -4.80751926e-03,
       -1.01660937e-02,  1.18107673e-04, -2.88839266e-02,  3.15110236e-02,
        2.88712289e-02, -3.36049236e-02, -3.32037471e-02,  8.23714286e-02,
       -4.78581823e-02,  1.89173147e-02, -1.17043108e-02,  2.34112926e-02,
        7.34530063e-03, -1.72176119e-02,  2.04275567e-02,  3.81773487e-02,
        4.12922464e-02,  4.79167514e-02, -3.39735448e-02,  3.30767594e-02,
       -4.11522724e-02, -1.53829195e-02, -5.84172420e-02,  3.31984945e-02,
        5.99250644e-02,  6.65826425e-02,  8.60797986e-03, -5.16836308e-02,
       -1.56743526e-02,  1.52221220e-02, -1.06724147e-02,  4.62612659e-02,
       -6.90690577e-02, -1.53337680e-02, -4.61382791e-03, -5.53948805e-02,
       -5.04228892e-03, -8.98943655e-03, -6.67139981e-03,  3.24231014e-02,
        2.31685992e-02,  6.53641894e-02, -8.42219740e-02, -1.27285672e-03,
       -8.77480879e-02,  1.38156675e-02, -1.46904064e-03,  1.15498211e-02,
        5.21797240e-02, -5.19705354e-04, -2.24918984e-02,  6.42215684e-02,
       -4.64617684e-02,  2.62698326e-02,  3.75501104e-02, -3.29499803e-02,
       -7.24997371e-02, -8.21422189e-02, -1.41956005e-02,  3.98824103e-02,
        1.40000479e-02,  8.12648833e-02, -2.12381762e-02, -2.07397826e-02,
       -5.43150529e-02,  3.39643732e-02, -1.70414969e-02, -2.75766440e-02,
        2.39402782e-02,  5.05681075e-02,  2.82037798e-02,  4.60241176e-02,
        2.80120317e-03, -5.19922487e-02,  6.25411570e-02,  4.05185223e-02,
       -1.27093755e-02, -1.11003146e-02,  6.20534718e-02,  1.30008266e-03,
        5.72689734e-02,  5.93838990e-02,  1.60258804e-02, -5.33626489e-02,
       -2.78736539e-02,  5.59482542e-05,  6.45361319e-02,  3.67150083e-02,
        4.29617465e-02,  8.05545747e-02,  1.70597266e-02,  1.83145087e-02,
       -2.10689139e-02,  1.55236656e-02,  6.38596108e-03,  9.61762015e-03,
       -3.68559025e-02, -5.65365329e-02, -8.20576102e-02,  4.36546106e-04,
        1.85061377e-02,  1.79954898e-02, -4.18148264e-02, -1.34398844e-02,
       -5.50495237e-02, -5.81721216e-03, -3.58555908e-03,  8.61355010e-03,
       -1.84907299e-02,  4.18909490e-02, -7.99810067e-02,  4.61478308e-02,
       -1.94716249e-02,  1.08241506e-01,  4.07698080e-02,  8.40406120e-02,
        1.12400670e-02, -1.70313604e-02,  6.71370700e-02,  3.07028089e-03,
       -7.81752318e-02, -3.61328013e-02,  9.47262254e-03, -9.52577963e-03,
        5.29839359e-02,  1.38539104e-02, -2.02670484e-03, -4.78461385e-02,
       -2.62118988e-02, -4.16636690e-02,  5.06601445e-02, -9.39371157e-03,
       -4.18891832e-02,  1.33636931e-03, -3.27013247e-02, -4.38975617e-02,
        3.90706435e-02, -3.39448899e-02, -1.37101451e-03,  1.01938769e-01,
        5.05746491e-02,  2.22387109e-02, -9.20815207e-03, -5.59622720e-02,
        2.36338340e-02,  3.57812233e-02,  3.67419459e-02,  4.05951738e-02,
       -3.91846336e-02, -3.26995626e-02,  6.80791363e-02,  7.62440078e-03,
        3.71609740e-02,  1.11309215e-02,  2.43490692e-02, -2.58231871e-02,
       -4.75365296e-03, -1.32513493e-02,  4.95327450e-02, -2.94215400e-02,
        1.48535492e-02,  4.26678807e-02,  5.19396178e-02,  2.48162784e-02,
       -3.35754268e-02,  1.53736938e-02, -5.17860018e-02, -1.55840889e-02,
       -1.37480170e-01,  1.45085938e-02,  4.23489325e-02, -3.09551191e-02,
       -6.17873259e-02,  1.18616760e-01,  1.93447457e-03, -3.51321213e-02,
        3.47951464e-02,  1.00223430e-01, -5.11648208e-02, -3.18400338e-02,
       -3.42703797e-02,  1.03139855e-01,  3.10028344e-02,  6.91802651e-02,
        6.74038455e-02, -2.69975960e-02,  9.42844898e-03,  2.41422597e-02,
       -1.10973073e-02, -8.41486901e-02,  3.67963091e-02,  3.02007329e-02,
        5.21205217e-02, -1.60790365e-02,  4.82646599e-02,  5.85531294e-02,
        5.83571829e-02,  2.33832877e-02,  1.15526440e-02,  3.20059396e-02,
        1.97310932e-02,  2.08420889e-03, -3.63293551e-02, -3.13506350e-02,
        8.75868872e-02,  2.91291904e-02,  9.95325111e-03,  8.91545340e-02,
        4.88411672e-02,  2.94628982e-02,  5.47678173e-02, -2.45558042e-02,
       -4.52916101e-02,  7.55258650e-02,  1.61027964e-02, -1.17763421e-02,
       -4.52294350e-02,  4.45150882e-02,  3.60127762e-02, -8.18681996e-03,
        7.46363848e-02, -1.79726221e-02,  4.47655469e-02,  8.11863169e-02,
        3.53159979e-02, -1.39076794e-02,  3.07287946e-02, -2.10502259e-02,
       -8.53858422e-03,  2.70146951e-02, -4.04026592e-03,  6.43077912e-03,
        1.19745508e-02,  2.06797030e-02, -8.06324333e-02, -8.93529970e-04,
        3.63447075e-03,  2.07282994e-02, -5.89092001e-02,  6.57288581e-02,
       -6.81186020e-02,  9.78687033e-03,  2.74124667e-02,  7.90238082e-02,
       -1.72475800e-02, -7.87314959e-03, -7.10541308e-02, -1.38236294e-02,
        7.51328692e-02,  4.54298928e-02,  1.40015250e-02,  2.35893158e-03,
       -5.71069727e-03, -7.58356750e-02,  1.15318755e-02, -9.97513458e-02,
       -7.44394362e-02,  1.94973976e-03, -5.39304875e-02, -3.12894508e-02,
        5.76846413e-02, -1.61580462e-02,  1.20309796e-02,  2.00632177e-02,
       -4.42018919e-02, -9.56523046e-03, -1.34884613e-02,  4.89868447e-02,
       -5.09263668e-03, -4.25534621e-02,  2.55978890e-02, -5.81441559e-02,
        9.62353572e-02,  2.02117823e-02, -1.71392057e-02, -8.66092928e-03,
       -7.12014129e-03,  2.09330823e-02,  5.37754819e-02, -2.28657983e-02,
       -3.00242249e-02, -2.37361230e-02,  5.92173003e-02,  9.49474890e-03,
       -4.23376933e-02,  8.67886376e-03, -2.29124147e-02,  4.76709800e-03,
       -6.53485348e-03,  3.07827536e-02, -1.12981182e-02, -6.45409599e-02,
       -4.22426425e-02, -4.35791314e-02, -3.22807357e-02,  6.40643388e-02,
        2.14385260e-02,  6.49540052e-02, -2.70295776e-02,  7.27179274e-02,
        7.42838776e-04, -1.51439067e-02, -7.70827755e-02,  5.93844475e-03,
       -5.38883060e-02, -8.49346817e-02,  6.47232234e-02,  2.66256160e-03,
       -6.04295097e-02,  3.50991786e-02,  6.68003969e-03,  5.09417169e-02,
        5.67321070e-02, -6.97394609e-02, -2.94399057e-02, -2.46931631e-02,
       -1.35134598e-02, -1.35464333e-02, -8.00555572e-02, -4.59090434e-02,
        2.26265122e-03, -2.62019224e-02, -5.06492890e-02,  1.75190102e-02,
       -6.88601807e-02, -1.22997751e-02,  7.56917149e-02,  6.47063181e-02,
       -5.43286018e-02,  2.50871237e-02, -3.73206660e-02, -2.14841217e-02,
        6.80395355e-03,  2.31319037e-03, -5.60809672e-02,  1.24006355e-02,
       -5.31012006e-02, -3.00859232e-02, -4.90449194e-04,  7.92119950e-02,
       -5.58438292e-03, -9.47208703e-02,  3.93651873e-02, -9.36861802e-03,
       -2.10053977e-02,  4.35312726e-02, -4.43299953e-03,  8.10990185e-02,
       -2.13196576e-02,  2.98558734e-02, -8.99196975e-03,  9.83189717e-02,
        3.06732170e-02, -7.63913393e-02, -4.46917377e-02,  3.47564444e-02,
        5.54822870e-02, -3.05903926e-02,  2.38518044e-03, -5.33119589e-02,
        5.86052053e-02,  1.50059548e-03, -2.03027464e-02,  3.82378735e-02,
        3.28032067e-03, -2.67034769e-02,  3.17935310e-02, -5.46296015e-02,
        1.97696351e-02, -2.22448520e-02,  4.38505001e-02,  2.44970415e-02,
       -1.09113343e-02,  2.73622875e-03,  8.32284316e-02, -1.77767519e-02,
        4.07689884e-02, -1.45015372e-02,  7.53508434e-02,  4.51910906e-02,
       -8.25277641e-02, -3.41877192e-02,  1.77418068e-02, -2.11642925e-02,
        7.21889213e-02,  1.94768235e-02, -2.11453009e-02,  8.44602436e-02,
        2.48519853e-02,  6.11021928e-02, -1.98512268e-03, -1.13665080e-02,
        2.05941293e-02,  3.30619141e-02,  2.97438656e-03, -1.60303004e-02,
       -4.40949984e-02, -9.18218791e-02, -5.54856025e-02,  1.34288361e-02,
        5.06847687e-02, -1.10165784e-04, -5.90740182e-02,  2.76382081e-02,
       -7.46401250e-02, -2.08996880e-05,  9.32212397e-02, -1.14457374e-02,
        2.73100566e-02,  5.75445183e-02, -5.61816094e-04, -4.73934077e-02,
       -4.35944721e-02, -3.52028124e-02,  1.54333077e-02, -9.50482860e-02,
       -2.23925672e-02, -5.45034707e-02,  3.05453204e-02, -1.12274047e-02,
        3.73750143e-02, -2.16984842e-02, -1.08364470e-01,  7.26937056e-02,
        1.29395060e-03, -7.43590072e-02, -3.03321686e-02, -7.87093714e-02,
       -1.97824650e-02, -3.12689431e-02,  1.91068836e-02, -2.56705619e-02,
       -5.90012893e-02,  4.39764038e-02, -3.96308303e-02,  5.48811145e-02,
        2.21167058e-02, -7.17390552e-02,  1.08197704e-01, -3.33064198e-02,
        1.45169953e-02,  6.05951026e-02, -2.87664887e-02, -1.29939001e-02,
       -9.37548559e-03, -8.96087009e-03, -1.75709426e-02, -4.91941124e-02,
       -7.48750120e-02, -3.38227525e-02,  1.34311384e-02,  4.94770631e-02,
       -2.06464231e-02, -2.02238802e-02,  6.20684102e-02,  2.73675490e-02,
        3.78895476e-02, -2.29312032e-02,  2.29282468e-03, -2.35780124e-02,
       -3.83791104e-02,  4.52002697e-03, -1.50561752e-02, -3.13255116e-02,
       -1.01998383e-02, -7.87638277e-02, -3.32716666e-02, -3.95182334e-02,
       -9.74203497e-02,  2.18809545e-02, -7.25470260e-02,  1.37321148e-02,
        2.50995047e-02,  5.10979109e-02,  4.24298123e-02,  2.75686290e-02,
       -4.90514887e-03, -5.27277738e-02,  7.37626553e-02, -5.29145412e-02,
        3.27520235e-03, -4.49925736e-02, -2.53231432e-02, -2.29947641e-02,
       -1.70712546e-02, -2.77240891e-02, -3.56300585e-02, -5.66384979e-02,
       -6.26312867e-02, -4.54002097e-02, -1.07850032e-02, -5.15301190e-02,
        5.28749228e-02,  1.83964465e-02, -2.11781319e-02, -8.78388137e-02],
      dtype=float32)>, 'natural_language_instruction': <tf.Tensor: shape=(), dtype=string, numpy=b'Place the can to the left of the pot.'>, 'state': <tf.Tensor: shape=(7,), dtype=float32, numpy=
array([ 0.29806843, -0.11465699,  0.10782038,  0.04275148, -0.14888743,
       -0.31455365,  1.0001532 ], dtype=float32)>})
('reward', <tf.Tensor: shape=(), dtype=float32, numpy=0.0>)

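Field-by-field explanation:

  • 'action': the action (command) the robot actually executed at this step.

    • open_gripper: whether to open the gripper (bool).
    • rotation_delta: rotation increment (3-D: roll/pitch/yaw).
    • world_vector: translation vector in the world frame (3-D: x/y/z).
    • terminate_episode: whether to end the current episode.

    Purpose: this is the "output" the model learns to predict, i.e. which action to execute given the current image.
  • 'is_first', 'is_last', 'is_terminal'

    • is_first=True: this is the first step of the episode (the initial state).
    • is_last: this is the last step of the episode.
    • is_terminal: the last step is a true terminal state rather than a truncation (for example, a time limit).

    Purpose: these flags tell the code where each trajectory begins and ends.
  • 'observation' (the most important part): what the robot currently "sees" of its environment (the input).

    • image: RGB image from the main camera (480×640×3). This is raw pixel data, the same kind of input that LeWorldModel learns from pixels.
    • natural_language_embedding: vector embedding of the natural-language instruction (512-D). The corresponding instruction here is "Place the can to the left of the pot."
    • natural_language_instruction: the raw text instruction.
    • state: the robot's own state (7-D), typically proprioceptive information such as joint angles, end-effector position, and gripper state.
  • 'reward': the reward for the current step. Here it is 0.0 (in most demonstration data the reward is 0 or sparse, becoming 1 only on success). Most Open X-Embodiment data consists of reward-free or sparse-reward demonstrations, aimed mainly at imitation learning.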
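
As a rough illustration of how these fields are usually consumed, here is a minimal sketch in plain Python, with lists standing in for the tensors. The helper names `split_episodes`, `flatten_action`, and `make_step` are hypothetical, and the 7-D layout (translation, rotation, gripper) is an assumption made for this example rather than something the dataset prescribes.

```python
def split_episodes(steps):
    """Group a flat stream of step dicts into episodes using is_first / is_last."""
    episodes, current = [], []
    for step in steps:
        if step["is_first"]:
            current = []          # a new episode starts here
        current.append(step)
        if step["is_last"]:
            episodes.append(current)
            current = []
    return episodes


def flatten_action(action):
    """Concatenate the action dict into one 7-D vector (assumed layout)."""
    gripper = 1.0 if action["open_gripper"] else 0.0
    return list(action["world_vector"]) + list(action["rotation_delta"]) + [gripper]


def make_step(first=False, last=False):
    # Toy step with the same keys as the dump above; the values are made up.
    return {
        "is_first": first,
        "is_last": last,
        "action": {
            "world_vector": [0.0, 0.0, 0.0],
            "rotation_delta": [0.0, 0.0, 0.0],
            "open_gripper": True,
        },
    }


stream = [make_step(first=True), make_step(), make_step(last=True),
          make_step(first=True), make_step(last=True)]
episodes = split_episodes(stream)
print([len(ep) for ep in episodes])          # episode lengths
print(flatten_action(stream[0]["action"]))   # one flat action vector
```

The boundary flags are all that is needed to recover episodes from a flattened stream, which matters once steps from several episodes are mixed in one pipeline.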

python
import tensorflow as tf
import tensorflow_datasets as tfds

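# load raw dataset --> replace this with tfds.load() on your
# local machine!
def dataset2path(dataset_name, version='0.1.0'):
  # Assumption: datasets live under gs://gresearch/robotics/<name>/<version>;
  # the version differs for some datasets, so check it before running.
  return f'gs://gresearch/robotics/{dataset_name}/{version}'

dataset = 'kuka'
b = tfds.builder_from_directory(builder_dir=dataset2path(dataset))
ds = b.as_dataset(split='train[:10]')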

def episode2steps(episode):
  return episode['steps']

def step_map_fn(step):
  return {
      'observation': {
          'image': tf.image.resize(step['observation']['image'], (128, 128)),
      },
      'action': tf.concat([
          step['action']['world_vector'],
          step['action']['rotation_delta'],
          step['action']['gripper_closedness_action'],
      ], axis=-1)
  }

# convert RLDS episode dataset to individual steps & reformat
ds = ds.map(
    episode2steps, num_parallel_calls=tf.data.AUTOTUNE).flat_map(lambda x: x)
ds = ds.map(step_map_fn, num_parallel_calls=tf.data.AUTOTUNE)

# cache, shuffle, repeat (add .batch() / .prefetch() as needed for training)
ds = ds.cache()         # optionally keep full dataset in memory
ds = ds.shuffle(100)    # set shuffle buffer size
ds = ds.repeat()        # ensure that data never runs out
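
To see why the buffer size passed to `shuffle` matters, the following is a simplified pure-Python model of a fixed-size shuffle buffer. It is only a sketch of the general idea (fill a buffer, emit a random element, refill the slot); `buffered_shuffle` is a hypothetical name, and the real tf.data implementation may differ in detail.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in the order a fixed-size shuffle buffer would emit them."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)   # fill the buffer first
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]         # emit a random buffered element...
        buffer[idx] = item        # ...and put the incoming element in its slot
    rng.shuffle(buffer)           # drain what is left once the input ends
    yield from buffer


steps = list(range(10))
print(list(buffered_shuffle(steps, buffer_size=1)))  # buffer of 1: order unchanged
print(list(buffered_shuffle(steps, buffer_size=4)))  # only local mixing
```

With a small buffer, an element can only be emitted after it has entered the buffer, so steps from the same episode tend to stay close together. For well-mixed batches the buffer usually needs to be much larger than a single episode.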

DROID(Distributed Robot Interaction Dataset)

bash
# Full DROID dataset in RLDS (1.7TB)
gsutil -m cp -r gs://gresearch/robotics/droid <path_to_your_target_dir>

# Example 100 episodes from the DROID dataset in RLDS for debugging (2GB)
gsutil -m cp -r gs://gresearch/robotics/droid_100 <path_to_your_target_dir>

# Raw DROID dataset in stereo HD, stored as MP4 videos (8.7TB)
gsutil -m cp -r gs://gresearch/robotics/droid_raw <path_to_your_target_dir>

# Raw DROID dataset, non-stereo HD video only (5.6TB, excluding stereo video & raw SVO cam files)
gsutil -m rsync -r -x ".*SVO.*|.*stereo.*\.mp4$" "gs://gresearch/robotics/droid_raw" <path_to_your_target_dir>

Dataset schema (a single episode):

python
DROID = {
    "episode_metadata": {
        "recording_folderpath": tf.Text,  # path to the folder of recordings
        "file_path": tf.Text,  # path to the original data file
    },
    "steps": {
        "is_first": tf.Scalar(dtype=bool),  # true on first step of the episode
        "is_last": tf.Scalar(dtype=bool),  # true on last step of the episode
        "is_terminal": tf.Scalar(dtype=bool),  # true on last step of the episode if it is a terminal step, True for demos

        "language_instruction": tf.Text,  # language instruction
        "language_instruction_2": tf.Text,  # alternative language instruction
        "language_instruction_3": tf.Text,  # alternative language instruction
        "observation": {
            "gripper_position": tf.Tensor(1, dtype=float64),  # gripper position state
            "cartesian_position": tf.Tensor(6, dtype=float64),  # robot Cartesian state
            "joint_position": tf.Tensor(7, dtype=float64),  # joint position state
            "wrist_image_left": tf.Image(180, 320, 3, dtype=uint8),  # wrist camera RGB left viewpoint
            "exterior_image_1_left": tf.Image(180, 320, 3, dtype=uint8),  # exterior camera 1 left viewpoint
            "exterior_image_2_left": tf.Image(180, 320, 3, dtype=uint8),  # exterior camera 2 left viewpoint
        },
        "action_dict": {
            "gripper_position": tf.Tensor(1, dtype=float64),  # commanded gripper position
            "gripper_velocity": tf.Tensor(1, dtype=float64),  # commanded gripper velocity
            "cartesian_position": tf.Tensor(6, dtype=float64),  # commanded Cartesian position
            "cartesian_velocity": tf.Tensor(6, dtype=float64),  # commanded Cartesian velocity
            "joint_position": tf.Tensor(7, dtype=float64),  # commanded joint position
            "joint_velocity": tf.Tensor(7, dtype=float64),  # commanded joint velocity
        },
        "discount": tf.Scalar(dtype=float32),  # discount if provided, default to 1
        "reward": tf.Scalar(dtype=float32),  # reward if provided, 1 on final step for demos
        "action": tf.Tensor(7, dtype=float64),  # robot action, consists of [6x joint velocities, 1x gripper position]
    },
}

episode_metadata (metadata)
  • recording_folderpath: path to the original recording folder
  • file_path: path to the original data file

steps (one per time step / frame)

Each episode consists of many steps. Every step contains:

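  • is_first: whether this is the first frame of the trajectory
  • is_last: whether this is the last frame of the trajectory
  • is_terminal: whether this step ends the task
  • Language instructions (very important):
    • language_instruction: the primary instruction (English)
    • language_instruction_2 / 3: alternative instructions (the same task may be phrased in different ways)
  • observation (the input)
    • gripper_position: gripper position
    • cartesian_position: pose of the robot's end effector in space (6-D)
    • joint_position: angles of the 7 joints
    • Images (3 camera views, left view of each):
      • wrist_image_left: wrist camera image (180×320)
      • exterior_image_1_left: exterior camera 1
      • exterior_image_2_left: exterior camera 2
  • action_dict (commanded actions)
    • position and velocity commands for the gripper, Cartesian space, and joint space
  • action (the action actually used):
    • 7-D vector, described in the schema comment as 6 joint velocities + 1 gripper position. Since the arm has 7 joints (see joint_position), the 6 values may in fact be Cartesian velocities; compare against action_dict before relying on either reading.
  • reward: reward value (in demonstration data, usually 1 on the last step and 0 elsewhere)
  • discount: discount factor (usually 1)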
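
A small sketch of how a step with this schema might be turned into policy inputs, using plain Python lists in place of tensors. The helper names `pick_instruction` and `proprio_vector` are hypothetical, the toy values are placeholders rather than real data, and the assumption that unused alternative instructions are empty strings should be checked against the actual dataset.

```python
import random


def pick_instruction(step, rng):
    """Return one non-empty instruction among the up-to-three variants."""
    keys = ("language_instruction", "language_instruction_2", "language_instruction_3")
    candidates = [step[k] for k in keys if step.get(k)]
    return rng.choice(candidates) if candidates else ""


def proprio_vector(observation):
    """Concatenate Cartesian pose (6), joint positions (7) and gripper (1) into 14-D."""
    return (list(observation["cartesian_position"])
            + list(observation["joint_position"])
            + list(observation["gripper_position"]))


# Toy step mirroring the schema; all values are placeholders.
step = {
    "language_instruction": "put the marker in the cup",
    "language_instruction_2": "",
    "language_instruction_3": "place the marker inside the cup",
    "observation": {
        "cartesian_position": [0.0] * 6,
        "joint_position": [0.0] * 7,
        "gripper_position": [0.0],
    },
    "action": [0.0] * 7,
}

rng = random.Random(0)
print(pick_instruction(step, rng))
print(len(proprio_vector(step["observation"])))
```

Sampling among the alternative phrasings is one simple way to get language variation for free; whether to do so is a training-design choice, not something the dataset requires.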

BridgeData V2

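  • What it is: a large-scale dataset of robot manipulation demonstrations (many pick-and-place and similar tasks), with both teleoperated and scripted data.
  • Use: large-scale robot learning, policy training, data augmentation.
  • Download: BridgeData V2 (JPEG and TFDS formats)
  • Notable projects: Berkeley's Octo model, along with much other scalable robot learning work.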

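ALOHA (especially Mobile ALOHA)

  • What it is: low-cost bimanual teleoperation hardware plus datasets, focused on dexterous bimanual manipulation, including a version with a mobile base.
  • Use: bimanual tasks, dexterous manipulation, handling deformable objects.
  • Download: the Mobile ALOHA project page or Hugging Face (search for ALOHA datasets); GitHub https://github.com/tonyzhaozh/aloha
  • Notable projects: the Mobile ALOHA project; also used in many Diffusion Policy papers.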

LIBERO

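  • What it is: a Lifelong Robot Learning benchmark with several task suites (Spatial, Object, Goal, and others) and high-quality demonstrations.
  • Use: lifelong learning, knowledge transfer, benchmarking.
  • Download: the "LIBERO Datasets" page on the LIBERO site (Box link), or Hugging Face (HuggingFaceVLA/libero or physical-intelligence/libero)
  • Notable projects: lifelong learning research; also used as a benchmark for models such as Pi0.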

Code Structures

text
libero/
    assets/ # Where all the object mesh files are stored
    bddl_files/   # BDDL (PDDL-style) task definition files
        libero_goal/*.bddl      # 10 tasks of LIBERO-Goal suite
        libero_spatial/*.bddl   # 10 tasks of LIBERO-Spatial suite
        libero_object/*.bddl    # 10 tasks of LIBERO-Object suite
        libero_10/*.bddl        # 10 tasks of LIBERO-100 for evaluation (aka LIBERO-LONG)
        libero_90/*.bddl        # 90 tasks of LIBERO-100 for pretraining

    benchmark/    # Task orders for evaluation of all benchmarks
    envs/         # Environment definitions for LIBERO tasks
    init_files/   # Fixed initializations for benchmark evaluation
    utils/        # Miscellaneous utility functions

Working with tasks:

python
import os

from libero.libero import benchmark, get_libero_path
from libero.libero.envs import OffScreenRenderEnv


benchmark_dict = benchmark.get_benchmark_dict()
task_suite_name = "libero_10" # can also choose libero_spatial, libero_object, etc.
task_suite = benchmark_dict[task_suite_name]()

# retrieve a specific task
task_id = 0
task = task_suite.get_task(task_id)
task_name = task.name
task_description = task.language
task_bddl_file = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file)
print(f"[info] retrieving task {task_id} from suite {task_suite_name}, the " + \
      f"language instruction is {task_description}, and the bddl file is {task_bddl_file}")

# step over the environment
env_args = {
    "bddl_file_name": task_bddl_file,
    "camera_heights": 128,
    "camera_widths": 128
}
env = OffScreenRenderEnv(**env_args)
env.seed(0)
env.reset()
init_states = task_suite.get_task_init_states(task_id) # for benchmarking purposes, we fix a set of initial states
init_state_id = 0
env.set_init_state(init_states[init_state_id])

dummy_action = [0.] * 7
for step in range(10):
    obs, reward, done, info = env.step(dummy_action)
env.close()

Contents of the BDDL file:

File found; actual path: /content/LIBERO/libero/libero/bddl_files/libero_10/LIVING_ROOM_SCENE2_put_both_the_alphabet_soup_and_the_tomato_sauce_in_the_basket.bddl

--- BDDL file contents below ---
(define (problem LIBERO_Living_Room_Tabletop_Manipulation)
  (:domain robosuite)
  (:language put both the alphabet soup and the tomato sauce in the basket)
    (:regions
      (basket_init_region
          (:target living_room_table)
          (:ranges (
              (-0.01 0.25 0.01 0.27)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (milk_init_region
          (:target living_room_table)
          (:ranges (
              (0.025 -0.125 0.07500000000000001 -0.07500000000000001)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (cream_cheese_init_region
          (:target living_room_table)
          (:ranges (
              (0.07500000000000001 -0.225 0.125 -0.17500000000000002)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (orange_juice_init_region
          (:target living_room_table)
          (:ranges (
              (-0.025 -0.275 0.025 -0.225)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (tomato_sauce_init_region
          (:target living_room_table)
          (:ranges (
              (-0.125 0.025 -0.07500000000000001 0.07500000000000001)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (alphabet_soup_init_region
          (:target living_room_table)
          (:ranges (
              (-0.125 -0.175 -0.07500000000000001 -0.125)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (butter_init_region
          (:target living_room_table)
          (:ranges (
              (0.025 0.025 0.07500000000000001 0.07500000000000001)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (ketchup_init_region
          (:target living_room_table)
          (:ranges (
              (-0.275 -0.175 -0.225 -0.125)
            )
          )
          (:yaw_rotation (
              (0.0 0.0)
            )
          )
      )
      (contain_region
          (:target basket_1)
      )
    )

  (:fixtures
    living_room_table - living_room_table
  )

  (:objects
    alphabet_soup_1 - alphabet_soup
    cream_cheese_1 - cream_cheese
    tomato_sauce_1 - tomato_sauce
    ketchup_1 - ketchup
    orange_juice_1 - orange_juice
    milk_1 - milk
    butter_1 - butter
    basket_1 - basket
  )

  (:obj_of_interest
    alphabet_soup_1
    tomato_sauce_1
    basket_1
  )

  (:init
    (On alphabet_soup_1 living_room_table_alphabet_soup_init_region)
    (On cream_cheese_1 living_room_table_cream_cheese_init_region)
    (On tomato_sauce_1 living_room_table_tomato_sauce_init_region)
    (On ketchup_1 living_room_table_ketchup_init_region)
    (On milk_1 living_room_table_milk_init_region)
    (On orange_juice_1 living_room_table_orange_juice_init_region)
    (On butter_1 living_room_table_butter_init_region)
    (On basket_1 living_room_table_basket_init_region)
  )

  (:goal
    (And (In alphabet_soup_1 basket_1_contain_region) (In tomato_sauce_1 basket_1_contain_region))
  )

)

Task summary:

Put both the alphabet soup (a can) and the tomato sauce into the basket.

Section-by-section explanation

lisp
(define (problem LIBERO_Living_Room_Tabletop_Manipulation)
  (:domain robosuite)
  (:language put both the alphabet soup and the tomato sauce in the basket)
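
  • problem: the name of the task.
  • domain: based on robosuite (a robot simulation framework built on MuJoCo).
  • :language: the natural-language instruction, i.e. the task description that the model or a human sees.

(:regions ...): region definitions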

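This section defines the initial placement regions (ranges) for the items on the table:

  • basket_init_region: where on the table the basket is placed.
  • alphabet_soup_init_region, tomato_sauce_init_region, etc.: the initial placement range of each item (intervals of x and y coordinates).
  • contain_region: the "containable" region inside the basket.

These regions keep the task repeatable: every initialization places the objects within a fixed range.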
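
To make the `:ranges` entries concrete, here is a minimal sketch that parses one entry and samples a position inside it. It assumes the four numbers are ordered x_min, y_min, x_max, y_max, which is inferred from the values in the file above and is not confirmed by it; `parse_range` and `sample_xy` are hypothetical helpers, and how LIBERO itself samples placements is not shown here.

```python
import random


def parse_range(text):
    """Parse '(x_min y_min x_max y_max)' into a tuple of four floats."""
    values = tuple(float(tok) for tok in text.strip("() \n").split())
    if len(values) != 4:
        raise ValueError(f"expected 4 numbers, got {len(values)}")
    return values


def sample_xy(box, rng):
    """Draw a point uniformly inside the axis-aligned box."""
    x_min, y_min, x_max, y_max = box
    return rng.uniform(x_min, x_max), rng.uniform(y_min, y_max)


# The basket_init_region entry from the BDDL file above.
basket_box = parse_range("(-0.01 0.25 0.01 0.27)")
x, y = sample_xy(basket_box, random.Random(0))
print(basket_box, (x, y))
```

Under this reading the basket region is only 2 cm wide in each direction, so the basket starts in almost the same place every time, while the other items get 5 cm squares.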

(:fixtures ...)

lisp
(:fixtures
  living_room_table - living_room_table
)

Defines the fixed objects in the scene (here just one living_room_table).

(:objects ...)

lisp
(:objects
  alphabet_soup_1 - alphabet_soup
  tomato_sauce_1 - tomato_sauce
  ...
  basket_1 - basket
)

Lists every object that appears in the scene and declares its type.

(:obj_of_interest ...)

lisp
(:obj_of_interest
  alphabet_soup_1
  tomato_sauce_1
  basket_1
)

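Objects of interest: only these three objects are actually needed to complete the task. The others (milk, orange juice, etc.) are distractors that make the task harder.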
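
The distractors can be derived mechanically as the objects that are not of interest. A minimal sketch, with the object list copied from the BDDL file above and `parse_typed_names` as a hypothetical helper:

```python
def parse_typed_names(block):
    """Map 'name - type' lines to {name: type}."""
    names = {}
    for line in block.strip().splitlines():
        name, _, type_name = line.partition(" - ")
        names[name.strip()] = type_name.strip()
    return names


objects_block = """
alphabet_soup_1 - alphabet_soup
cream_cheese_1 - cream_cheese
tomato_sauce_1 - tomato_sauce
ketchup_1 - ketchup
orange_juice_1 - orange_juice
milk_1 - milk
butter_1 - butter
basket_1 - basket
"""
of_interest = {"alphabet_soup_1", "tomato_sauce_1", "basket_1"}

objects = parse_typed_names(objects_block)
distractors = sorted(set(objects) - of_interest)
print(distractors)
```

Five of the eight objects turn out to be distractors, which gives a sense of how cluttered this scene is relative to the task.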

(:init ...): initial state

lisp
(:init
  (On alphabet_soup_1 living_room_table_alphabet_soup_init_region)
  (On tomato_sauce_1 living_room_table_tomato_sauce_init_region)
  ...
)

Defines where each object is placed when the task starts.

(:goal ...): goal condition (the most important part)

lisp
(:goal
  (And (In alphabet_soup_1 basket_1_contain_region) 
       (In tomato_sauce_1 basket_1_contain_region))
)

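Success condition: both the alphabet soup and the tomato sauce must be inside the basket.

Only when this condition holds does the task count as complete and yield a reward of +1.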
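
The goal is just a nested boolean expression, so it can be checked with a tiny S-expression evaluator. The sketch below is illustrative only: `tokenize`, `parse`, and `holds` are hypothetical helpers, and `facts` is a hand-written set of true predicates, whereas in the real simulator a predicate such as `In` would be computed from the physics state.

```python
def tokenize(text):
    """Split an S-expression into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested Python lists (one S-expression)."""
    token = tokens.pop(0)
    if token != "(":
        return token
    expr = []
    while tokens[0] != ")":
        expr.append(parse(tokens))
    tokens.pop(0)  # drop the closing ")"
    return expr


def holds(expr, facts):
    """Evaluate And(...) recursively; any other predicate is looked up in facts."""
    head, *args = expr
    if head.lower() == "and":
        return all(holds(arg, facts) for arg in args)
    return (head, *args) in facts


goal = parse(tokenize(
    "(And (In alphabet_soup_1 basket_1_contain_region) "
    "(In tomato_sauce_1 basket_1_contain_region))"
))

facts = {("In", "alphabet_soup_1", "basket_1_contain_region")}
print(holds(goal, facts))   # only the soup is in the basket
facts.add(("In", "tomato_sauce_1", "basket_1_contain_region"))
print(holds(goal, facts))   # both items are in the basket
```

Placing only one of the two items does not satisfy the `And`, which is why this task belongs to the long-horizon suite: the policy has to complete both sub-goals in a single episode.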

LeRobotDataset

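  • What it is: a standardized robot dataset format from Hugging Face (v3.0) that supports streaming and multimodal data and is easy to visualize. It is not a single dataset but a common entry point to many datasets.
  • Use: loading and processing a wide range of robot data, with direct integration into the LeRobot library.
  • Download/access: https://huggingface.co/lerobot (search for datasets), or load through the LeRobot library.
  • Notable projects: the Hugging Face LeRobot framework; many open-source robotics projects adopt this format.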