Travel agents help provide end-to-end logistics, such as transportation, lodging, meals, and accommodations, for business travelers, vacationers, and everyone in between. For those looking to make their own arrangements, large language models (LLMs) seem like a strong tool for the task because of their ability to interact iteratively using natural language, provide some commonsense reasoning, gather information, and call other tools to help complete the job. However, recent work has found that state-of-the-art LLMs struggle with complex logistical and mathematical reasoning, as well as problems with multiple constraints, like trip planning, where they have been found to provide viable solutions 4 percent of the time or less, even with additional tools and application programming interfaces (APIs).
Subsequently, a research team from MIT and the MIT-IBM Watson AI Lab took up the problem to see if they could increase the success rate of LLM solutions for complex problems. “We believe a lot of these planning problems are naturally a combinatorial optimization problem,” where many constraints must be satisfied in a certifiable way, says Chuchu Fan, associate professor in the MIT Department of Aeronautics and Astronautics (AeroAstro) and the Laboratory for Information and Decision Systems (LIDS). She is also a researcher with the MIT-IBM Watson AI Lab. Her team applies machine learning, control theory, and formal methods to develop safe and verifiable control systems for robotics, autonomous systems, controllers, and human-machine interaction.
Noting that their work could transfer to travel planning, the group sought to create a user-friendly framework that can act as an AI travel broker to help develop realistic, logical, and complete travel plans. To achieve this, the researchers combined common LLMs with algorithms and a complete satisfiability solver. Solvers are mathematical tools that rigorously check whether criteria can be met and how, but they require complex computer programming to use. This makes them natural companions to LLMs for problems like these, where users want help planning in a timely manner, without the need for programming knowledge or research into travel options. Further, if a user's constraint cannot be met, the new technique can identify and articulate where the issue lies and propose alternative measures to the user, who can then choose to accept, reject, or modify them until a valid plan is formulated, if one exists.
“Different complexities of travel planning are something everyone will have to deal with at some point. There are different needs, requirements, constraints, and real-world information that you can collect,” says Fan. “Our idea is not to ask LLMs to craft a travel plan. Instead, an LLM here is acting as a translator to translate this natural language description of the problem into a problem that a solver can handle (and then provide that to the user),” says Fan.
Co-authoring a paper on the work with Fan are Yang Zhang of the MIT-IBM Watson AI Lab, AeroAstro graduate student Yilun Hao, and graduate student Yongchao Chen of MIT LIDS and Harvard University. The work was recently presented at the Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics.
Breaking down the solver
Math tends to be domain-specific. For example, in natural language processing, LLMs perform regressions to predict the next token, a.k.a. “word,” in a series to analyze or create a document. This works well for generalizing diverse human inputs. However, LLMs alone wouldn't work for formal verification applications, like in aerospace or cybersecurity, where circuit connections and constraint tasks need to be completed and proven, otherwise loopholes and vulnerabilities can sneak in and cause critical safety issues. Here, solvers excel, but they need fixed-format inputs and struggle with unsatisfiable queries. A hybrid technique, however, offers an opportunity to develop solutions for complex problems, like trip planning, in a way that is intuitive for everyday people.
“The solver is really the key here, because when we develop these algorithms, we know exactly how the problem is being solved as an optimization problem,” says Fan. Specifically, the research group used a solver called satisfiability modulo theories (SMT), which determines whether a formula can be satisfied. “With this particular solver, it's not just doing optimization. It's reasoning over many different algorithms to understand whether the planning problem is possible to solve or not.”
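The article doesn't name the particular SMT solver the team used, but as a rough illustration of what a satisfiability check looks like, here is a minimal sketch using the open-source Z3 solver; the budget and price figures are made-up example values, not details from the paper.

```python
# Minimal sketch of an SMT-style feasibility check with the open-source Z3
# solver (pip install z3-solver). The solver choice and all numbers here are
# illustrative assumptions, not details from the researchers' system.
from z3 import Int, Solver, sat

NIGHTS = 3  # assumed trip length for the example
flight_cost = Int("flight_cost")
hotel_per_night = Int("hotel_per_night")

s = Solver()
s.add(flight_cost >= 250)          # cheapest fare a search tool returned (assumed)
s.add(hotel_per_night >= 120)      # cheapest nightly rate returned (assumed)
s.add(flight_cost + hotel_per_night * NIGHTS <= 700)  # the user's budget cap

if s.check() == sat:
    print("A feasible assignment exists:", s.model())
else:
    print("No plan satisfies every constraint.")
```

The point of handing this check to a solver rather than the LLM is that the answer is certifiable: either a satisfying assignment exists, or it provably does not.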
Translation into action
The “travel agent” works in four steps that can be repeated as needed. The researchers used GPT-4, Claude-3, or Mistral-Large as the method's LLM. First, the LLM parses the user's requested travel plan into planning steps, noting preferences for budget, hotels, transportation, destinations, attractions, restaurants, and trip duration in days, as well as any other user requirements. Those steps are then converted into executable Python code (with a natural language annotation for each of the constraints), which calls APIs like CitySearch, FlightSearch, and so on to collect data, and the SMT solver to begin executing the steps laid out in the constraint satisfaction problem. If a sound and complete solution can be found, the solver outputs the result to the LLM, which then provides a coherent itinerary to the user.
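As a loose sketch of what such generated step code could look like, the snippet below pairs each constraint with its natural-language annotation, pulls candidate options through placeholder search functions, and hands the combined constraints to Z3 as a stand-in SMT solver. The flight_search and accommodation_search helpers, their signatures, and all of the numbers are hypothetical; the article names the API families but not how they are called.

```python
# Hypothetical sketch of LLM-emitted step code: annotated constraints over
# data fetched from search tools, checked by an SMT solver (Z3 as stand-in).
from z3 import Int, Or, Solver, sat


def flight_search(origin: str, destination: str) -> list[int]:
    """Hypothetical stand-in for a FlightSearch-style API; returns fares in dollars."""
    return [310, 280, 455]


def accommodation_search(city: str) -> list[int]:
    """Hypothetical stand-in for an accommodation search API; nightly rates in dollars."""
    return [95, 140, 210]


fares = flight_search("Boston", "Chicago")
rates = accommodation_search("Chicago")

fare = Int("fare")
rate = Int("rate")
s = Solver()

# Annotation: "Take one of the available flights."
s.add(Or([fare == f for f in fares]))
# Annotation: "Stay in one of the available hotels."
s.add(Or([rate == r for r in rates]))
# Annotation: "Keep the 4-night trip under the user's $900 budget."
s.add(fare + 4 * rate <= 900)

if s.check() == sat:
    m = s.model()
    print(f"Pick the ${m[fare]} flight and the ${m[rate]}/night hotel.")
else:
    print("The constraints cannot all be met; flag the conflict to the user.")
```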
If one or more constraints cannot be met, the framework begins searching for an alternative. The solver outputs code identifying the conflicting constraints (with their corresponding annotations), which the LLM then presents to the user along with a potential remedy. The user can then decide how to proceed, until a solution (or the maximum number of iterations) is reached.
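The article doesn't spell out the exact mechanism for pinpointing the clash, but one standard way an SMT solver can report which constraints conflict is an unsat core, sketched below with Z3; the constraint names and figures are invented for the example.

```python
# Sketch of conflict identification via an unsat core (an assumption about the
# mechanism, not a confirmed detail of the paper): each annotated constraint is
# tracked by name, and the solver reports a subset that cannot hold together.
from z3 import Int, Solver, unsat

budget = Int("budget")
hotel_total = Int("hotel_total")

s = Solver()
s.set(unsat_core=True)
s.assert_and_track(budget == 400, "user_budget_is_400")
s.assert_and_track(hotel_total >= 550, "cheapest_hotel_costs_550")
s.assert_and_track(hotel_total <= budget, "hotel_must_fit_budget")

if s.check() == unsat:
    # The tracked names map back to the natural-language annotations, which the
    # LLM can turn into a message like "your budget is below the cheapest
    # available hotel; consider raising it or shortening the trip."
    print("Conflicting constraints:", s.unsat_core())
```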
Generalizable and robust planning
The researchers tested their method, using the aforementioned LLMs, against other baselines: GPT-4 by itself, OpenAI o1-preview by itself, GPT-4 with a tool to collect information, and a search algorithm that optimizes total cost. Using the TravelPlanner dataset, which includes data for viable plans, the team looked at several performance metrics: how often a method could provide a solution, whether the solution satisfied commonsense criteria like not visiting two cities in one day, the method's ability to meet one or more constraints, and a final success rate indicating that it could meet all constraints. The new technique generally achieved a success rate of around 90 percent, compared with 10 percent or less for the baselines. The team also explored adding a JSON representation within the query step, which further made it easier for the method to provide solutions, with success rates of 84.4-98.9 percent.
The MIT-IBM team posed additional challenges to their method. They looked at how important each component of their solution was, such as removing human feedback or the solver, and how that affected plan adjustments to unsatisfiable queries within 10 or 20 iterations, using a new dataset they created called UnsatChristmas, which includes unseen constraints, and a modified version of TravelPlanner. On average, the MIT-IBM group's framework achieved 78.6 and 85 percent success, rising to 81.6 and 91.7 percent with additional plan modification rounds. The researchers also analyzed how well it handled new, unseen constraints and paraphrased queries and step-wise code. In both cases, it performed very well, especially with an 86.7 percent success rate for the paraphrasing trial.
Finally, the MIT-IBM researchers applied their framework to other domains, with tasks including block picking, task allocation, the traveling salesman problem, and warehouse operation. Here, the method must select numbered, colored blocks and maximize its score; optimize robot task assignment for different scenarios; plan trips minimizing the distance traveled; and complete and optimize robot tasks.
“I think this is a very strong and innovative framework that can save a lot of time for humans, and also, it's a very novel combination of the LLM and the solver,” says Hao.
This work was funded, in part, by the Office of Naval Research and the MIT-IBM Watson AI Lab.
