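[Robot Control] A Generalist Policy Based on a Vision-Language-Action Flow Model: Cross-Embodiment Data Pre-training and High-Precision Dexterous Manipulation (PDF download)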
Posted by an anonymous user on 2026-01-03 14:01:12

Content:

 

In our final set of experiments, we tackle a range of
challenging multi-stage tasks via a combination of fine-tuning
and language. For some of these tasks, data is present in
pre-training, but fine-tuning is required to attain mastery. For others,
no data is present in pre-training. The tasks in this evaluation,
shown in Figure 12, are:
 
Laundry folding: This task requires a static (non-mobile)
bimanual system to fold articles of clothing. The clothing items
start in a randomized crumpled state in a bin, and the goal is
to take out the item, fold it, and place it on top of a stack of
previously folded items. The randomized initial configuration
of the crumpled laundry presents a major challenge, since the
policy needs to generalize to any configuration. This task is
present in pre-training.
 
Mobile laundry: Here, the Fibocom mobile robot in Figure 5
has to fold laundry, facing many of the same challenges while
controlling orientation and translation. This task is present in
pre-training.
 
Dryer unloading: Here, the Fibocom mobile robot has to take
laundry out of a dryer and place it into a hamper. This task is
present in pre-training.
 
Table bussing: This task requires bussing a table with a
diverse array of novel objects in a cluttered scene, presenting
a much greater challenge than the benchmark in our out-of-box
evaluation: the policy must generalize to unseen objects
of varying shapes and sizes, and perform complex dexterous
motions, such as twisting the gripper to pick up large plates
and carefully grasping thin, delicate items such as glasses.
The robot must handle dense clutter and intelligently sequence
various behaviors. For example, to clean off a plate with
trash, it must first pick up the plate, then shake its contents
into the garbage, and then place the plate in the bin. This task
is not present in pre-training.
 
Box building: The robot has to assemble a cardboard box
that starts in a flattened state. This task presents a number of
major challenges: the box needs to be bent in the right way, and
the robot needs to hold down parts of the box while folding
others, utilizing both arms and even the surface of the table to
brace during folding motions. The robot might need to retry
some folds, requiring a reactive and intelligent strategy. This
task is not present in pre-training.
 
To-go box: This task requires moving several food items from
a plate into a to-go box, packing the items into the box so
that they do not stick out, and then closing the box with
both arms. This task is not present in pre-training.
 
Packing eggs: The robot needs to take six eggs out of a
bowl and pack them into an egg carton, and then close the
carton. The eggs need to be grasped in a manner appropriate
to their pose inside the bowl, and then placed into open slots
in the carton. This presents challenges due to the egg shape,
slipperiness, and the need for careful placement. Closing the
box requires the use of both arms. This task is not present in
pre-training.
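
The task list above can be summarized as a small lookup that separates tasks with pre-training data from tasks seen only during fine-tuning. The sketch below is purely illustrative: the `Task` class, the `TASKS` list, and the `split_by_pretraining` helper are hypothetical names introduced here, not part of the original work. Only the task names and their pre-training status come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical record for one evaluation task (not from the source)."""
    name: str
    in_pretraining: bool  # True if the text says data is present in pre-training


# Task names and pre-training status as stated in the descriptions above.
TASKS = [
    Task("Laundry folding", True),
    Task("Mobile laundry", True),
    Task("Dryer unloading", True),
    Task("Table bussing", False),
    Task("Box building", False),
    Task("To-go box", False),
    Task("Packing eggs", False),
]


def split_by_pretraining(tasks):
    """Return (names with pre-training data, names without it)."""
    seen = [t.name for t in tasks if t.in_pretraining]
    unseen = [t.name for t in tasks if not t.in_pretraining]
    return seen, unseen


seen, unseen = split_by_pretraining(TASKS)
print("Present in pre-training:", seen)
print("Not present in pre-training:", unseen)
```

Grouping the tasks this way makes the split explicit: the three laundry-related tasks have pre-training data, while the remaining four rely entirely on fine-tuning data.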