Robots engage in high-level planning when executing real-world tasks. We queried Tell Me Dave for a high-level plan describing how to distribute three mugs among three tables. Since Baxter is not mobile, we simulated three tables by constructing three pedestals that rest inside Baxter’s workspace.
Here is a video showing Baxter distributing three mugs on three pedestals of different heights in response to the natural language input “Distribute the mugs on the tables”:
We like bringing kids and robots together, and we recently shot this video of Stefanie’s son Jay interacting with Baxter:
It’s Jay’s new favorite robot video!
We received a set of the YCB objects last week and decided to complete the Protocol and Benchmark for Table Setting. We attained a score of 10/24! See the video:
Our approach was to use Baxter to autonomously collect visual models for the objects, annotate grasps, and then program the robot to move the objects to predefined positions on the table. Placement was challenging because the table setting doesn’t fit entirely within the robot’s kinematic space, so the robot drops some objects from a height. We could probably improve placement with more careful destination annotations, or by using vision to recognize the colors in the target region. The plate was challenging for Baxter because it barely fits within the robot’s kinematic space. It was very difficult for us to plan grasps on the plate, so we left it out, but with fancier motion planning the robot could probably pick it up. We were pleased to be able to recognize and manipulate five of the six objects in very little time using our software stack!
To run this benchmark, we had to create a new target template, because the one provided was too small to contain the YCB objects. You can get our template here:
Picking a snap circuit part. Acquiring the RGB and IR data for this model took about 15 minutes using our Baxter. (The slow part is the IR scan.) We also had to annotate the default grasp point. The pick is automatic: the robot uses its cameras to localize the object and then moves in to grasp.
We also successfully teleoperated Baxter to pick up one of the parts and snap it into place. It was possible, but took two separate manoeuvres to get both ends to engage. We had to move the arm very slowly and practice a few times first. Video is here:
We worked with Bianca Homberg and Mehmet Dogar from Daniela Rus’s group to install our pick and place stack on their Baxter! We were all surprised at the differences between our robots, even though they are the “same” robot: the camera location, calibration parameters, and gripper masks all needed to change. But once we had recalibrated everything, the robot was able to pick up the brush! Next we will try to get it to work with their soft hand.
If an object is not visible in IR, sensors such as the Kinect or an IR range finder cannot see it. To address this problem we have developed a technique for applying a temporary contrast agent to image the object in IR. We scan the object with the contrast agent to obtain a high-quality depth map. After the contrast agent is removed, we localize the object with vision and incorporate the high-quality depth information based on the visual pose estimate. This video shows our preferred method for applying the contrast agent.
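To make the last step concrete, here is a minimal sketch of how a stored depth model could be placed into the scene once vision has produced a pose estimate. It is not our actual implementation: the function name `place_depth_model` is hypothetical, and it assumes the object sits on the table so that its pose reduces to a planar position `(x, y)` and a rotation `theta` about the vertical axis.

```python
import math

def place_depth_model(model_points, x, y, theta):
    """Transform points from the object's scan frame into the table frame.

    model_points: iterable of (px, py, pz) tuples from the contrast-agent scan.
    x, y, theta:  planar pose estimated by vision (an assumed pose format).
    """
    c, s = math.cos(theta), math.sin(theta)
    placed = []
    for px, py, pz in model_points:
        # Rotate about the vertical axis, then translate; height is unchanged.
        placed.append((x + c * px - s * py, y + s * px + c * py, pz))
    return placed

# A tiny made-up model: two points 5 cm above the table surface.
scan = [(0.0, 0.0, 0.05), (1.0, 0.0, 0.05)]
placed = place_depth_model(scan, x=0.5, y=0.2, theta=math.pi / 2)
print(placed)
```

The point of the sketch is that the depth values themselves come from the earlier scan; at pick time only the pose has to be estimated from the camera image.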
We have integrated Baxter with the Kinect 2 using iai_kinect2. So far it seems to have higher latency than the Kinect 1, but higher resolution (so a larger workspace) and less calibration error. Overall, picking is more accurate! The video shows four successful picks in a row.
We made a video of Baxter interpreting multimodal referring expressions using our multimodal Bayes filter. Our system interprets referring expressions in real time, outputting a distribution over objects at 14 Hz.
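For readers unfamiliar with Bayes filters, the following is a generic discrete Bayes filter over candidate objects, written as a rough sketch rather than as our system's code. The function names, the `p_stay` transition parameter, and the speech and gesture likelihood values are all invented for illustration, and the sketch assumes the two modalities are conditionally independent given the intended object.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to one."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("all hypotheses have zero weight")
    return {obj: w / total for obj, w in weights.items()}

def predict(belief, p_stay=0.9):
    """Transition step: the referent stays the same with probability p_stay,
    otherwise it switches uniformly to one of the other objects."""
    n = len(belief)
    if n == 1:
        return dict(belief)
    p_switch = (1.0 - p_stay) / (n - 1)
    return {obj: p_stay * b + p_switch * (1.0 - b) for obj, b in belief.items()}

def update(belief, *likelihoods):
    """Measurement step: multiply in one likelihood table per modality."""
    posterior = dict(belief)
    for likelihood in likelihoods:
        for obj in posterior:
            posterior[obj] *= likelihood[obj]
    return normalize(posterior)

# Made-up example: three objects, one time step of speech plus gesture.
belief = normalize({"mug": 1.0, "bowl": 1.0, "spoon": 1.0})
speech = {"mug": 0.8, "bowl": 0.1, "spoon": 0.1}   # hypothetical P(words | object)
gesture = {"mug": 0.6, "bowl": 0.3, "spoon": 0.1}  # hypothetical P(pointing | object)
belief = update(predict(belief), speech, gesture)
print(belief)
```

Running `predict` and `update` once per incoming observation yields a fresh distribution over objects at every step, which is the kind of output described above.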
Our paper on inverse semantics has won Best Paper at RSS 2015!
We created an autonomous helicopter that accepts natural language commands. The helicopter infers appropriate actions and navigates to a requested location.