Imagine a robot trying to clean up a messy dinner table. Two main manipulation skills are required: grasping, which lets the robot pick up objects, and planar pushing, which lets the robot isolate objects in dense clutter to expose a good grasp pose. Grasps must be identified within the full 6D space because top-down grasping is insufficient for objects with diverse shapes, e.g., a plate or a filled cup. Pushing is also essential because, in real-world scenarios, the robot's workspace can contain many objects and a collision-free direct grasp may not exist; pushes can singulate objects in clutter so that the isolated objects can be grasped later. We explore learning joint planar pushing and 6-degree-of-freedom (6-DoF) grasping policies in a cluttered environment.
In a Q-learning framework, we jointly train two separate neural networks with reinforcement learning to maximize a reward function. The reward function credits only successful grasps; we do not directly reward pushing actions, because such intermediate rewards often lead to undesired behavior. We address the limitation of a top-down-only grasping action space by integrating a 6-DoF grasp pose sampler, rather than densely sampling pixel-wise candidates from visual inputs and considering only hard-coded top-down grasps.
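For illustration, here is a minimal sketch of how such a joint push/grasp Q-learning setup could be wired together. The network architectures, feature dimensions, and function names below are assumptions made for the example, not our actual implementation.

```python
import torch
import torch.nn as nn

class PushQNet(nn.Module):
    """Hypothetical head that scores a discretized set of planar pushes."""
    def __init__(self, feat_dim=128, num_push_actions=16):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, num_push_actions)
        )

    def forward(self, scene_feat):
        # scene_feat: (B, feat_dim) -> Q-values (B, num_push_actions)
        return self.head(scene_feat)

class GraspQNet(nn.Module):
    """Hypothetical head that scores sampled 6-DoF grasp poses (xyz + quaternion)."""
    def __init__(self, feat_dim=128, pose_dim=7):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim + pose_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, scene_feat, grasp_poses):
        # scene_feat: (B, feat_dim), grasp_poses: (B, K, pose_dim) from a 6-DoF sampler
        B, K, _ = grasp_poses.shape
        feat = scene_feat.unsqueeze(1).expand(B, K, -1)
        return self.head(torch.cat([feat, grasp_poses], dim=-1)).squeeze(-1)  # (B, K)

def sparse_reward(action_was_grasp: bool, grasp_succeeded: bool) -> float:
    """Only successful grasps are rewarded; pushes receive no intermediate reward."""
    return 1.0 if (action_was_grasp and grasp_succeeded) else 0.0

def select_action(push_q, grasp_q, scene_feat, sampled_poses):
    """Greedy selection over the joint action space: best push vs. best sampled grasp."""
    with torch.no_grad():
        push_scores = push_q(scene_feat)                    # (1, num_push_actions)
        grasp_scores = grasp_q(scene_feat, sampled_poses)   # (1, K)
    if push_scores.max() > grasp_scores.max():
        return ("push", push_scores.argmax().item())
    return ("grasp", grasp_scores.argmax().item())
```

Because the reward is sparse, pushes are only reinforced indirectly: a push earns credit through the discounted value of the grasp it enables.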
We evaluate our approach by task completion rate, action efficiency, and grasp accuracy in simulation, and demonstrate it on a real robot. Our system achieves 10% higher action efficiency and a 20% higher grasp success rate than VPG, the current state of the art, indicating both more accurate predictions and better grasp pose selection.
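As a rough guide, these metrics could be computed along the following lines; the exact definitions below are assumptions for illustration, not the paper's formulas.

```python
def completion_rate(episodes):
    """Fraction of test runs in which all objects were eventually cleared (assumed definition)."""
    return sum(ep["all_objects_cleared"] for ep in episodes) / len(episodes)

def action_efficiency(episodes):
    """Objects grasped per executed action; pushes count against efficiency (assumed definition)."""
    grasped = sum(ep["objects_grasped"] for ep in episodes)
    actions = sum(ep["num_actions"] for ep in episodes)
    return grasped / max(actions, 1)

def grasp_success_rate(episodes):
    """Successful grasps over all attempted grasps (assumed definition)."""
    successes = sum(ep["successful_grasps"] for ep in episodes)
    attempts = sum(ep["attempted_grasps"] for ep in episodes)
    return successes / max(attempts, 1)
```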
This work was published at ICRA 2021. The code is here, and you can see a video here!