Helpful robot making mistakes? Just nudge it in the right direction

by Brenden Burgess


Imagine that a robot is helping you wash the dishes. You ask it to grab a soapy bowl out of the sink, but its gripper slightly misses the mark.

Using a new framework developed by MIT and NVIDIA researchers, you could correct that robot's behavior with simple interactions. The method lets you point at the bowl, trace a trajectory to it on a screen, or simply nudge the robot's arm in the right direction.

Unlike other methods for correcting robot behavior, this technique does not require users to collect new data and retrain the machine-learning model that powers the robot's brain. It lets a robot use intuitive, real-time human feedback to choose a feasible sequence of actions that comes as close as possible to satisfying the user's intent.

When the researchers tested their framework, its success rate was 21 percent higher than that of an alternative method that did not leverage human interventions.

In the long run, this framework could enable a user to more easily guide a factory-trained robot through a wide variety of household chores, even if the robot has never seen their home or the objects in it.

“We can't expect laypeople to collect data and fine-tune a neural network model,” says Wang, lead author of a paper on this method.

His co-authors include Lirui Wang PhD '24 and Yilun Du PhD '24; senior author Julie Shah, professor of aeronautics and astronautics and director of the Interactive Robotics Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL); as well as Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Perez-D'Arpino PhD '19, and Dieter Fox of NVIDIA. The research will be presented at the International Conference on Robotics and Automation.

Mitigating misalignment

Recently, researchers have begun using pretrained generative AI models to learn a “policy,” or a set of rules, that a robot follows to complete an action. Generative models can solve multiple complex tasks.

During training, the model sees only feasible robot motions, so it learns to generate valid trajectories for the robot to follow.

While these trajectories are valid, that doesn't mean they always align with a user's intent in the real world. The robot may have been trained to grab boxes off a shelf without knocking them over, but it could fail to reach the box on top of someone's bookcase if the shelf is oriented differently from those it saw in training.

To overcome such failures, engineers typically collect data demonstrating the new task and retrain the generative model, a costly, time-consuming process that requires machine-learning expertise.

Instead, the MIT researchers wanted to let users steer the robot's behavior during deployment, when it makes a mistake.

But if a human interacts with the robot to correct its behavior, that could inadvertently cause the generative model to choose an infeasible action. It might reach the box the user wants, but knock books off the shelf in the process.

“We want to allow the user to interact with the robot without introducing those kinds of mistakes, so we get behavior that is much more aligned with the user's intent during deployment, but that is also valid and feasible,” says Wang.

Their framework accomplishes this by giving the user three intuitive ways to correct the robot's behavior, each of which offers certain advantages.

First, the user can point to the object they want the robot to manipulate in an interface that shows its camera view. Second, they can trace a trajectory in that interface, allowing them to specify how they want the robot to reach the object. Third, they can physically move the robot's arm in the direction they want it to follow.

“When you are mapping a 2D image of the environment to actions in a 3D space, some information is lost. Physically nudging the robot is the most direct way to specify user intent without losing any of the information,” says Wang.

Sampling for success

To ensure these interactions don't cause the robot to choose an invalid action, such as colliding with other objects, the researchers use a specific sampling procedure. This technique lets the model choose, from the set of all valid actions, the one that most closely aligns with the user's goal.

“Rather than just imposing the user's will, we give the robot an idea of what the user intends but let the sampling procedure oscillate around its own set of learned behaviors,” explains Wang.
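The idea described above can be illustrated with a minimal sketch, assuming a simple rejection-sampling scheme: draw many candidate trajectories from the learned policy, discard infeasible ones, and return the survivor whose endpoint lands closest to the user's indicated goal. The function names, the validity check, and the endpoint-distance criterion are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def select_aligned_action(sample_policy, is_valid, user_goal, n_candidates=64):
    """Sample candidate trajectories from a learned generative policy and
    return the valid one whose final pose is closest to the user's goal.

    sample_policy -- callable returning one candidate trajectory,
                     an array of shape (steps, dims) (hypothetical interface)
    is_valid      -- callable rejecting infeasible trajectories, e.g. collisions
    user_goal     -- array of shape (dims,), e.g. a pointed-at object location
    """
    best, best_dist = None, np.inf
    for _ in range(n_candidates):
        traj = sample_policy()              # one draw from learned behaviors
        if not is_valid(traj):              # keep only feasible trajectories
            continue
        dist = np.linalg.norm(traj[-1] - user_goal)  # endpoint-to-goal distance
        if dist < best_dist:
            best, best_dist = traj, dist
    return best

# Toy usage: random 5-step 2D trajectories, goal at the origin, no constraints.
rng = np.random.default_rng(0)
toy_policy = lambda: rng.normal(size=(5, 2))
chosen = select_aligned_action(toy_policy, lambda t: True, np.zeros(2))
```

Because candidates come only from the policy's own learned distribution, the selected trajectory stays feasible while bending toward the user's intent, which mirrors the trade-off Wang describes.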

This sampling method enabled the researchers' framework to outperform the other methods they compared it to, in simulations and in experiments with a real robot arm in a toy kitchen.

While their method might not always complete the task right away, it offers users the advantage of being able to immediately correct the robot if they see it doing something wrong, rather than waiting for it to finish and then giving it new instructions.

Moreover, after a user has nudged the robot a few times until it picks up the correct bowl, it could log that corrective action and incorporate it into its behavior through future training. Then, the next day, the robot could pick up the correct bowl without needing a nudge.

“But the key to that continuous improvement is having a way for the user to interact with the robot, which is what we have shown here,” says Wang.

In the future, the researchers want to boost the speed of the sampling procedure while maintaining or improving its performance. They also want to experiment with generating robot policies in new environments.
