For all their impressive capabilities, large language models (LLMs) often fall short when given new tasks that require complex reasoning skills.
While an accounting firm's LLM might excel at summarizing financial reports, that same model could fail unexpectedly if tasked with predicting market trends or identifying fraudulent transactions.
To make LLMs more adaptable, MIT researchers investigated how a certain training technique can be strategically deployed to boost a model's performance on unfamiliar, difficult problems.
They show that test-time training, a method that involves temporarily updating some of a model's inner workings during deployment, can lead to a sixfold improvement in accuracy. The researchers developed a framework for implementing a test-time training strategy that uses examples of the new task to maximize these gains.
Their work could improve a model's flexibility, enabling an off-the-shelf LLM to adapt to complex tasks that require planning or abstraction. This could lead to LLMs that are more accurate in many applications requiring logical deduction, from medical diagnostics to supply chain management.
“Genuine learning – what we did here with test-time training – is something these models can't do on their own once they are shipped. They can't acquire new skills or get better at a task. But we have shown that if you push the model a little to carry out real learning, you see that huge improvements can happen,” says Akyürek.
Akyürek is joined on the paper by graduate students Mehul Damani, Linlu Qiu, Han Guo, and Jyothish Pari; undergraduate Adam Zweiger; and senior authors Yoon Kim, an assistant professor of Electrical Engineering and Computer Science (EECS) and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Jacob Andreas, an associate professor of EECS and a member of CSAIL. The research will be presented at the International Conference on Machine Learning.
Tackling hard problems
LLM users often try to improve their model's performance on a new task with a technique called in-context learning: they feed the model a few examples of the new task as text prompts that guide the model's outputs.
But in-context learning doesn't always work for problems that require logic and reasoning.
The MIT researchers investigated how test-time training can be used in conjunction with in-context learning to boost performance on these challenging tasks. Test-time training involves updating some of a model's parameters – the internal variables it uses to make predictions – using a small amount of new data specific to the task at hand.
They explored how test-time training interacts with in-context learning, and studied design choices that maximize the performance improvements one can coax out of a general-purpose LLM.
“We find that test-time training is a much stronger form of learning. While simply providing examples can modestly boost accuracy, actually updating the model with those examples can lead to significantly better performance, particularly in challenging domains,” explains Damani.
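To make the pattern concrete, here is a minimal sketch of test-time training in PyTorch, using a tiny toy network as a stand-in for an LLM; the function name, loss, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the test-time training pattern: temporarily fine-tune on a
# handful of task examples, answer the query, then restore the original weights.
# The toy network, loss, and hyperparameters below are illustrative only.
import copy

import torch
import torch.nn as nn

def answer_with_test_time_training(model, example_x, example_y, query,
                                   steps=20, lr=1e-3):
    original_state = copy.deepcopy(model.state_dict())   # snapshot before updating
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()

    model.train()
    for _ in range(steps):                                # brief task-specific training
        optimizer.zero_grad()
        loss_fn(model(example_x), example_y).backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        prediction = model(query)                         # answer the actual query

    model.load_state_dict(original_state)                 # updates are only temporary
    return prediction

# Toy usage: eight "task examples" and one query for a small regression network.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
xs = torch.randn(8, 4)
ys = xs.sum(dim=1, keepdim=True)
print(answer_with_test_time_training(net, xs, ys, torch.randn(1, 4)))
```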
In-context learning relies on a small set of task examples, including problems and their solutions. The researchers use these examples to create the task-specific dataset needed for test-time training.
To expand the size of this dataset, they create new inputs by slightly modifying the problems and solutions in the examples, such as by horizontally flipping some input data. They find that training the model on the outputs of this new dataset leads to the best performance.
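As a rough illustration of that kind of augmentation on grid-style puzzle inputs (the specific transformations used in the study may differ), a flip-based expansion of a handful of examples might look like this:

```python
# Rough illustration of expanding a tiny test-time dataset by adding
# transformed copies (here, horizontal flips) of grid-style problem/solution
# pairs. The exact augmentations used in the study may differ.
import numpy as np

def augment_with_flips(examples):
    """Given (input_grid, output_grid) pairs, append horizontally flipped copies."""
    augmented = list(examples)
    for inp, out in examples:
        augmented.append((np.fliplr(inp), np.fliplr(out)))  # flip both sides consistently
    return augmented

examples = [
    (np.array([[1, 0], [0, 1]]), np.array([[0, 1], [1, 0]])),
    (np.array([[2, 2], [0, 0]]), np.array([[0, 0], [2, 2]])),
]
print(len(augment_with_flips(examples)))  # 2 examples become 4 training pairs
```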
In addition, the researchers update only a small number of model parameters using a technique called low-rank adaptation, which improves the efficiency of the test-time training process.
“This is important because our method needs to be efficient if it is going to be deployed in the real world. We find that you can get huge improvements in accuracy with a very small amount of parameter training,” explains Akyürek.
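Low-rank adaptation is available off the shelf, for example through Hugging Face's `peft` library. The sketch below shows how a small model might be wrapped so that only low-rank adapter weights are trainable; the base model, rank, and target modules are illustrative choices, not necessarily those used in the study.

```python
# Sketch of low-rank adaptation (LoRA) using the Hugging Face `peft` library.
# The base model, rank, and target modules are illustrative choices only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```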
Developing new skills
Streamlining the process is key, since test-time training is applied on a per-instance basis, meaning a user would need to do it for each individual task. The model updates are only temporary, and the model reverts to its original form after making a prediction.
A model that usually takes less than a minute to answer a query might take five or 10 minutes to provide an answer with test-time training, Akyürek adds.
“We wouldn't want to do this for all user queries, but it is useful if you have a very hard task you want the model to solve well. There might also be tasks that are too challenging for an LLM to solve without this method,” he says.
The researchers tested their approach on two benchmark datasets of extremely complex problems, such as IQ puzzles. It boosted accuracy by as much as sixfold over techniques that use only in-context learning.
Tasks that involved structured patterns or completely unfamiliar types of data showed the largest performance improvements.
“For simpler tasks, in-context learning might be OK. But updating the parameters themselves might develop a new skill in the model,” explains Damani.
In the future, the researchers want to use these insights to develop models that learn continually.
The long-term goal is an LLM that, given a query, can automatically determine whether it needs to use test-time training to update its parameters or whether it can solve the task with in-context learning, and then implement the best test-time training strategy without the need for human intervention.
This work is supported, in part, by the MIT-IBM Watson AI Lab and the National Science Foundation.
