DAIR: Disentangled Attention as Intrinsic Regularization for Collaborative Bimanual Manipulation
Summary
We address the problem of solving complex bimanual robot manipulation tasks with Reinforcement Learning. Such challenging tasks can be decomposed into sub-tasks that different robots can accomplish concurrently or sequentially for better efficiency. While previous Reinforcement Learning approaches have focused on modeling the compositionality of sub-tasks, two major challenges remain unsolved when learning cooperative strategies for two robots: (i) domination, i.e., one robot may try to solve a task by itself and leave the other idle; (ii) conflict, i.e., one robot may intrude into the other's workspace while they execute different sub-tasks simultaneously, which leads to unsafe collisions. To tackle these two challenges, we propose a novel technique called disentangled attention, which provides an intrinsic regularization that encourages the two robots to focus on different aspects of the task. We evaluate our method on eight bimanual manipulation tasks. Experimental results show that the proposed intrinsic regularization successfully avoids domination and reduces conflicts between the policies, leading to significantly more efficient and effective cooperative strategies than all baselines.
Short Description Video
Method
Our goal is to design a model and introduce a novel intrinsic regularization to better train policies for bimanual manipulation tasks involving many objects. The agents should automatically learn to allocate the workload between themselves while avoiding the problems of domination and conflict. We use a self-attention architecture to combine the embedded representations of all agents and objects. On top of this architecture, an intrinsic loss is computed from the attention probabilities; it encourages the two agents to attend to different sub-tasks.
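To make the idea concrete, below is a minimal PyTorch sketch, not the paper's exact implementation: a single-head attention layer in which each arm queries a shared set of agent/object embeddings, plus one plausible form of the intrinsic loss that penalizes the overlap (element-wise product) of the two arms' attention distributions. The class name, argument names, and the specific overlap penalty are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DisentangledAttention(nn.Module):
    """Hypothetical sketch: each robot arm attends over shared entity
    embeddings (agents + objects); an intrinsic loss penalizes overlap
    between the two arms' attention distributions."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Query projection for the agents; shared key/value projections
        # over all entities.
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)
        self.value = nn.Linear(embed_dim, embed_dim)
        self.scale = embed_dim ** -0.5

    def forward(self, agent_emb, entity_emb):
        # agent_emb:  (B, 2, d) -- one embedding per robot arm
        # entity_emb: (B, N, d) -- embeddings of agents and objects
        q = self.query(agent_emb)                  # (B, 2, d)
        k = self.key(entity_emb)                   # (B, N, d)
        v = self.value(entity_emb)                 # (B, N, d)
        # Scaled dot-product attention: one distribution per arm.
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        out = attn @ v                             # (B, 2, d)

        # Intrinsic disentanglement loss (one plausible form): the summed
        # element-wise product of the two arms' attention distributions is
        # small when they attend to different entities / sub-tasks.
        overlap = (attn[:, 0] * attn[:, 1]).sum(dim=-1)   # (B,)
        intrinsic_loss = overlap.mean()
        return out, intrinsic_loss

if __name__ == "__main__":
    attn_module = DisentangledAttention(embed_dim=64)
    agents = torch.randn(4, 2, 64)     # two robot arms
    entities = torch.randn(4, 10, 64)  # e.g., 2 agents + 8 blocks
    fused, loss = attn_module(agents, entities)
    print(fused.shape, loss.item())    # torch.Size([4, 2, 64])
```

During training, such an intrinsic loss would be added to the RL objective with a small weighting coefficient, so minimizing it pushes the two attention distributions apart without overriding the task reward.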
Results
Three Blocks Rearrangement
Attention Baseline
Disentangled Attention (Ours)
Eight Blocks Rearrangement
Attention Baseline
Disentangled Attention (Ours)
Two Blocks Stacking
Attention Baseline
Disentangled Attention (Ours)
Three Blocks Stacking
Attention Baseline
Disentangled Attention (Ours)
Two Tower Stacking
Attention Baseline
Disentangled Attention (Ours)
Open Box and Place
Attention Baseline
Disentangled Attention (Ours)
Push with Door
Attention Baseline
Disentangled Attention (Ours)
Lift Bar
This task showcases the synergistic skills learned by our method. Although the disentangled attention mechanism encourages the agents to attend to different sub-tasks, the agents can still discover synergistic behaviors when the task, such as lifting a bar together, requires both arms to act on the same object.