Smarter robot with better grip

Problem being addressed

One of the great promises of robot learning systems is that they will learn from their mistakes and continuously adapt to ever-changing environments. Despite this potential, most robot learning systems today are deployed as fixed policies and are not adapted after deployment.

Solution

The researchers adapt an image-based grasping policy to changes in background, object shape and appearance, lighting conditions, and robot morphology and kinematics. They deliberately modify the robot and its environment to mimic the persistent change of the real world, then investigate how well the policy adapts. Rather than proposing a new adaptation algorithm, with its own complexity and caveats, they show that robotic policies can be adapted to substantial changes using only the most basic components of existing off-policy reinforcement learning algorithms.
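The idea of fine-tuning with ordinary off-policy updates can be illustrated with a toy sketch. This is not the researchers' image-based system: it assumes a hypothetical one-step grasp task with eight discrete grasp choices and a tabular Q-function, and every name and number in it (`make_env`, `collect`, `q_update`, the success probabilities, the data budgets) is invented for illustration. It only shows the pattern of pre-training on a base task, then reusing the same update rule on a much smaller batch of data from a changed task.

```python
import random

N_ACTIONS = 8  # hypothetical number of discretized grasp choices


def make_env(best_action, rng):
    """Toy one-step grasp task: reward 1.0 on success, else 0.0."""
    def step(action):
        p_success = 0.9 if action == best_action else 0.1
        return 1.0 if rng.random() < p_success else 0.0
    return step


def greedy(q):
    """Index of the action with the highest Q-value."""
    return max(range(N_ACTIONS), key=lambda i: q[i])


def collect(env, q, n, epsilon, rng):
    """Gather (action, reward) pairs with an epsilon-greedy behavior policy."""
    data = []
    for _ in range(n):
        a = rng.randrange(N_ACTIONS) if rng.random() < epsilon else greedy(q)
        data.append((a, env(a)))
    return data


def q_update(q, buffer, n_updates, lr, rng):
    """Off-policy updates: move Q toward rewards sampled from stored data."""
    for _ in range(n_updates):
        a, r = rng.choice(buffer)
        q[a] += lr * (r - q[a])
    return q


rng = random.Random(0)

# Pre-train on the base task with a large data budget (assumed values).
base_env = make_env(best_action=2, rng=rng)
q = [0.0] * N_ACTIONS
base_data = collect(base_env, q, n=20000, epsilon=1.0, rng=rng)
q = q_update(q, base_data, n_updates=20000, lr=0.05, rng=rng)
base_choice = greedy(q)

# The task changes: fine-tune the SAME Q-values with far less new data.
new_env = make_env(best_action=5, rng=rng)
new_data = collect(new_env, q, n=400, epsilon=0.5, rng=rng)
q = q_update(q, new_data, n_updates=4000, lr=0.05, rng=rng)
new_choice = greedy(q)

print("greedy grasp before fine-tuning:", base_choice)
print("greedy grasp after fine-tuning:", new_choice)
```

In this sketch the fine-tuning phase introduces no new algorithmic component: it calls the same `q_update` routine on data gathered in the changed task, starting from the pre-trained values rather than from zero.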

Advantages of this solution

This work is the first to demonstrate that simple fine-tuning with off-policy reinforcement learning can successfully adapt to substantial task, robot, and environment variations that were not present in the original training distribution. The adaptation requires less than 0.2% of the data needed to learn the task from scratch and yields substantial performance gains over the course of fine-tuning. These positive results also hold in a limited continual learning setting, in which a single lineage of policies is repeatedly fine-tuned on data from a succession of new tasks.

Solution originally applied in these industries

Engineering

Possible New Application of the Work



Technology Companies

The base model achieves an 86% success rate on the baseline grasping task and picks up labeled objects with a 98% success rate. This could make it a strong automation solution for warehouses with constantly changing environments, helping to minimize operational errors.

Author of original research described in this blitzcard:

Source URL: #############

