
Researchers from MIT and Technion, the Israel Institute of Technology, have developed an innovative algorithm that could change how machines are trained to cope with uncertain real-world situations. Inspired by the human learning process, the algorithm dynamically determines when a machine should imitate a “teacher” (known as imitation learning) and when it should explore and learn through trial and error (known as reinforcement learning).
The key idea behind the algorithm is to strike a balance between the two learning methods. Instead of using brute-force trial and error to find the right combination of imitation and reinforcement learning, the researchers trained two student machines simultaneously. One student used a weighted combination of the two learning methods, while the other relied solely on reinforcement learning.
The algorithm continually compared the performance of the two students. If the student using the teacher's guidance achieved better results, the algorithm increased the weight given to imitation learning during training. Conversely, if the student relying on trial and error showed promising progress, the algorithm shifted its focus toward reinforcement learning. By dynamically adjusting the learning approach based on performance, the algorithm proved to be adaptive and more effective at teaching complex tasks.
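The comparison between the two students can be illustrated with a minimal Python sketch. The article does not give the actual update rule, so everything here is an assumption made for illustration: the function names `combined_loss` and `update_imitation_weight`, the fixed step size, the clamping range, and the sample return values are all hypothetical and are not taken from the researchers' implementation.

```python
def combined_loss(rl_loss: float, imitation_loss: float, imitation_weight: float) -> float:
    """Loss for the guided student: a weighted mix of the two objectives.

    Assumption: the imitation term is simply scaled by a scalar weight.
    """
    return rl_loss + imitation_weight * imitation_loss


def update_imitation_weight(
    weight: float,
    guided_return: float,
    rl_only_return: float,
    step: float = 0.1,        # hypothetical fixed step size
    max_weight: float = 1.0,  # hypothetical upper bound on the weight
) -> float:
    """Nudge the imitation weight toward whichever student is doing better."""
    if guided_return > rl_only_return:
        # The student that follows the teacher is ahead: imitate more.
        weight += step
    elif rl_only_return > guided_return:
        # The trial-and-error student is ahead: rely more on reinforcement learning.
        weight -= step
    # Keep the weight inside [0, max_weight].
    return min(max(weight, 0.0), max_weight)


if __name__ == "__main__":
    weight = 0.5
    # Made-up (guided_return, rl_only_return) pairs for three evaluation rounds.
    evaluations = [(0.8, 0.5), (0.9, 0.6), (0.7, 0.9)]
    for guided, rl_only in evaluations:
        weight = update_imitation_weight(weight, guided, rl_only)
        print(f"guided={guided:.1f} rl_only={rl_only:.1f} -> weight={weight:.2f}")
```

In this sketch the weight only moves by a fixed step after each comparison. A real system would more likely scale the adjustment by the size of the performance gap, but the direction of the change follows the description above.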
In simulated experiments, the researchers tested their approach by training machines to navigate mazes and manipulate objects. The algorithm achieved near-perfect success rates and outperformed methods that used only imitation learning or only reinforcement learning. The results were promising and showed the algorithm's potential for training machines to handle challenging real-world scenarios, such as robot navigation in unfamiliar environments.
Pulkit Agrawal, director of the Improbable AI Lab and an assistant professor in the Computer Science and Artificial Intelligence Laboratory, emphasized the algorithm's ability to solve difficult tasks that previous methods struggled with. The researchers believe this approach could lead to superior robots capable of complex object manipulation and locomotion.
The algorithm's applications also extend beyond robotics. It has the potential to improve performance in a range of fields that use imitation learning or reinforcement learning. For example, it could be used to train smaller language models by drawing on the knowledge of larger models for specific tasks. The researchers are also interested in exploring the similarities and differences between machines and humans learning from their respective teachers, with the aim of improving the overall learning experience.
Experts not involved in the research expressed enthusiasm for the algorithm's robustness and its promising results across different domains. They highlighted its potential for applications involving memory, reasoning, and tactile sensing. The algorithm's ability to build on prior computational work and to simplify the balancing of learning objectives makes it an exciting advance in the field of reinforcement learning.
As research continues, this algorithm could pave the way for more efficient and adaptable machine learning systems, bringing us closer to the development of advanced AI technologies.
Learn more about the research in the paper.
