On Friday we visited Rethink Robotics to install our software stack on their robots. They have agreed to help out with our scanning project. We calibrated three of the four arms; the wrist camera on the fourth had a problem (perhaps hardware) that made its image too noisy to be useful. We had never gotten to use two robots at once before, and it was exciting to see them both moving and scanning objects. Our scanning stack is rapidly maturing, and we are on our way to scanning one million objects! More information about the research project is available in our Blue Sky Ideas paper.
Since we have quite a few Standard ICRA Duckies, we thought it was important to get all of our ducks in a row:
We were fortunate enough to receive several Standard ICRA Duckies at RSS. We used Baxter to pick them up and squeak them!
We like bringing kids and robots together, and we recently shot this video of Stefanie’s son Jay interacting with Baxter:
It’s Jay’s new favorite robot video!
We received a set of the YCB objects last week and decided to complete the Protocol and Benchmark for Table Setting. We attained a score of 10/24! See the video:
Our approach was to use Baxter to autonomously collect visual models for the objects, annotate grasps, and then program the robot to move the objects to predefined positions on the table. Placement was challenging because the table setting doesn’t fit entirely within the robot’s kinematic space, so the robot drops some objects from a height. We could probably improve placement with more careful destination annotations, or by using vision to recognize the colors in the target region. The plate was especially hard for Baxter because it barely fits within the robot’s kinematic space. Planning grasps on the plate proved very difficult, so we left it out, but with fancier motion planning the robot could probably pick it up. We were pleased to be able to recognize and manipulate five out of six of the objects in very little time using our software stack!
To run this benchmark, we had to create a new target template, because the one provided was too small to contain the YCB objects. You can get our template here:
Baxter picking a snap circuit part. Acquiring the RGB and IR data for this model took about 15 minutes on our Baxter (the slow part is the IR scan), and we also had to annotate the default grasp point. The pick itself is automatic: the robot uses its cameras to localize the object and then goes in to grasp it.
We also successfully teleoperated Baxter to pick up one of the parts and snap it into place. It was possible, but it took two separate manoeuvres to get both ends to engage. We had to move the arm very slowly and practice a few times first. Video is here:
We worked with Bianca Homberg and Mehmet Dogar from Daniela Rus’s group to install our pick and place stack on their Baxter! We were all surprised at the differences between our robots, even though they are the “same” robot: the camera location, the calibration parameters, and the gripper masks all needed to change. But once we had recalibrated everything, the robot was able to pick up the brush! Next we will try to get it to work with their soft hand.
If an object is not visible in IR, sensors such as the Kinect or an IR range finder cannot see it. To address this problem we have developed a technique for applying a temporary contrast agent to image the object in IR. We scan the object with the contrast agent to obtain a high-quality depth map. After the contrast agent is removed, we localize the object with vision and incorporate the high-quality depth information based on the visual pose estimate. This video shows our preferred method for applying the contrast agent.
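The last step, carrying the stored depth data over to wherever vision says the object now is, amounts to a rigid transform. Here is a minimal sketch of that idea, assuming the object sits flat on the table so its pose can be written as a planar offset plus a rotation about the vertical axis. The function name and the point and pose representations are hypothetical and chosen for illustration; this is not the code in our stack.

```python
import math


def place_depth_points(model_points, pose):
    """Move depth points from the object frame into the table frame.

    model_points: list of (x, y, z) tuples captured with the contrast agent,
                  expressed in the object's own frame (hypothetical layout).
    pose: (tx, ty, theta) planar pose estimated by vision, where theta is a
          rotation in radians about the vertical axis (an assumption).
    """
    tx, ty, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    # Rotate each point about the vertical axis, then translate it.
    # Heights (z) are unchanged because the object is assumed to lie flat.
    return [(c * x - s * y + tx, s * x + c * y + ty, z)
            for x, y, z in model_points]


# Example: one stored depth point, with the object found rotated a quarter turn.
scene_points = place_depth_points([(1.0, 0.0, 0.05)], (0.5, 0.2, math.pi / 2))
```

A full six-degree-of-freedom pose would need a 3D rotation instead, but the planar case covers objects resting on a tabletop.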
We have integrated Baxter with the Kinect 2 using iai_kinect2. So far it seems to have higher latency than the Kinect 1, but also higher resolution (so a larger workspace) and less calibration error. Overall, picking is more accurate! The video shows four successful picks in a row.
We made a video of Baxter interpreting multimodal referring expressions using our multimodal Bayes filter. Our system interprets referring expressions in real time, outputting a distribution over objects at 14 Hz.
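For readers unfamiliar with Bayes filters, the sketch below shows the generic idea of maintaining a distribution over candidate objects and updating it with evidence from more than one modality. It is a textbook discrete Bayes filter, not our actual model: the function names, the "stay or switch uniformly" transition, and the example likelihood values for speech and gesture are all assumptions made for illustration.

```python
def predict(belief, p_stay):
    """Transition step: the referent stays the same with probability p_stay,
    otherwise it switches uniformly to one of the other objects (assumed model)."""
    n = len(belief)
    if n == 1:
        return dict(belief)
    return {obj: p_stay * p + (1.0 - p_stay) * (1.0 - p) / (n - 1)
            for obj, p in belief.items()}


def update(belief, likelihoods):
    """Measurement step: multiply the prior by P(observation | object)
    for each object, then renormalize so the result sums to one."""
    unnormalized = {obj: belief[obj] * likelihoods[obj] for obj in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation rules out every object; keep the prior unchanged.
        return dict(belief)
    return {obj: p / total for obj, p in unnormalized.items()}


# Example: start uniform, then fold in one speech and one gesture observation.
# The likelihood values are made up for this illustration.
belief = {"brush": 1 / 3, "duck": 1 / 3, "bowl": 1 / 3}
speech = {"brush": 0.7, "duck": 0.2, "bowl": 0.1}
gesture = {"brush": 0.6, "duck": 0.3, "bowl": 0.1}

belief = predict(belief, p_stay=0.9)
belief = update(belief, speech)
belief = update(belief, gesture)
most_likely = max(belief, key=belief.get)
```

Treating speech and gesture as separate updates assumes the two observations are conditionally independent given the referent, which keeps each step cheap enough to run many times per second.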