Handling questions that involve both natural language and structured tables has become an essential task in building smarter, more useful AI systems. These systems must often process content that mixes data types, such as text interleaved with numerical tables, which is common in business documents, research papers, and public reports. Understanding such documents requires the AI to reason across both the textual explanations and the table-based details, an inherently more complicated process than traditional text-based question answering.
One of the major problems in this area is that current language models often fail to interpret documents accurately when tables are involved. Models tend to lose the relationships between rows and columns when tables are flattened into raw text. This distorts the underlying structure of the data and reduces the accuracy of answers, particularly when the task involves calculations, aggregations, or reasoning that connects several facts across the document. These limitations make it difficult to use standard systems for practical multi-hop question-answering tasks that require information from both text and tables.
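To make the flattening problem concrete, here is a minimal sketch using a made-up three-row table. The table, the column names, and both helper functions are hypothetical and serve only as illustration. Once the cells are joined into one string, nothing marks where a row ends or which column a number belongs to, whereas the structured form can total a column directly.

```python
# Hypothetical table, invented for illustration only.
header = ["region", "quarter", "revenue"]
rows = [
    ["North", "Q1", 120],
    ["North", "Q2", 135],
    ["South", "Q1", 90],
]

def flatten(header, rows):
    """Serialize the table as one space-separated string (row/column boundaries are lost)."""
    cells = header + [str(cell) for row in rows for cell in row]
    return " ".join(cells)

def column_total(header, rows, name):
    """Sum a column by using the preserved row/column structure."""
    idx = header.index(name)
    return sum(row[idx] for row in rows)

flat = flatten(header, rows)
total = column_total(header, rows, "revenue")
print(flat)
print(total)
```

A model that only sees the flattened string has to infer which tokens are revenue values before it can add them, which is where errors in aggregation tend to creep in.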
To address these problems, previous methods have applied Retrieval-Augmented Generation (RAG) techniques. These involve retrieving text segments and feeding them into a language model to generate answers. However, these techniques fall short on tasks that require compositional or global reasoning over large sets of tabular data. Tools like NaiveRAG and TableGPT2 try to approximate this process by converting tables into a flattened text format or by generating code for execution in Python. Even so, these methods still struggle with tasks where preserving the table's original structure is necessary for correct interpretation.
Researchers from Huawei Cloud BU proposed a method called TableRAG that directly addresses these limitations. TableRAG is a hybrid system that alternates between textual data retrieval and structured, SQL-based execution. The approach preserves the tabular layout and treats table-based queries as a unified reasoning unit. It not only keeps the table structure intact but also executes queries in a way that respects the relational nature of data organized in rows and columns. The researchers also created a dataset called HeteQA to benchmark their method across different domains and multi-step reasoning tasks.
TableRAG operates in two main phases. The offline phase parses heterogeneous documents into structured databases by extracting tables and textual content separately. These are stored in parallel corpora: a relational database for the tables and a knowledge base for the text. The online phase handles user questions through an iterative four-step process: query decomposition, text retrieval, SQL programming and execution, and intermediate answer generation. When a question is received, the system identifies whether it requires tabular or textual reasoning, dynamically chooses the appropriate strategy, and combines the outputs. SQL is used for precise symbolic execution, which yields better performance on numerical and logical computations.
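The two phases described above can be sketched roughly as follows. This is not the authors' implementation: the function names (build_stores, retrieve_text, answer), the word-overlap retriever, and the hand-written plan that stands in for the model's query decomposition and SQL generation are all assumptions made for illustration. SQLite is used here only because it ships with Python.

```python
import sqlite3

def build_stores(tables, passages):
    """Offline phase (sketch): load tables into SQLite, keep passages as a text store."""
    conn = sqlite3.connect(":memory:")
    for name, (columns, rows) in tables.items():
        conn.execute(f"CREATE TABLE {name} ({', '.join(columns)})")
        marks = ", ".join("?" for _ in columns)
        conn.executemany(f"INSERT INTO {name} VALUES ({marks})", rows)
    return conn, list(passages)

def retrieve_text(passages, query, k=1):
    """Toy retriever (assumption): rank passages by shared lowercase words."""
    q = set(query.lower().split())
    ranked = sorted(passages, key=lambda p: len(q & set(p.lower().split())), reverse=True)
    return ranked[:k]

def answer(plan, conn, passages, max_steps=5):
    """Online phase (sketch): run each sub-query through SQL or text retrieval.

    `plan` is a hypothetical stand-in for model-driven query decomposition:
    a list of (kind, payload) pairs where kind is "sql" or "text".
    """
    intermediate = []
    for kind, payload in plan[:max_steps]:
        if kind == "sql":
            result = conn.execute(payload).fetchall()  # symbolic execution on the table
        else:
            result = retrieve_text(passages, payload)  # retrieval over the text store
        intermediate.append((kind, result))
    return intermediate

# Made-up data for the usage example.
tables = {"sales": (["region", "revenue"], [("North", 255), ("South", 90)])}
passages = [
    "The North region opened two new stores.",
    "The South region closed one store.",
]
conn, store = build_stores(tables, passages)
plan = [
    ("sql", "SELECT region FROM sales ORDER BY revenue DESC LIMIT 1"),
    ("text", "North region stores"),
]
steps = answer(plan, conn, store)
print(steps)
```

In a real system a language model would produce the sub-queries and the SQL, and would turn the intermediate results into a final answer. The sketch only shows how the tabular and textual paths can share one loop.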
In experiments, TableRAG was tested on several benchmarks, including HybridQA, WikiTableQuestions, and the newly built HeteQA. HeteQA consists of 304 complex questions across nine diverse domains and includes 136 unique tables, as well as more than 5,300 Wikipedia-derived entities. The dataset challenges models with tasks such as filtering, aggregation, grouping, calculation, and sorting. TableRAG outperformed all baseline methods, including NaiveRAG, ReAct, and TableGPT2. It achieved consistently higher accuracy, with document-level reasoning driven by up to five iterative steps, and used models such as Claude-3.5-Sonnet and Qwen-2.5-72B to verify the results.
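The operation types listed above map naturally onto SQL. The sketch below shows one query per category against a small invented table. The table name, columns, and values are hypothetical and are not taken from HeteQA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (country TEXT, year INTEGER, gold INTEGER, silver INTEGER)")
conn.executemany(
    "INSERT INTO medals VALUES (?, ?, ?, ?)",
    [("A", 2016, 3, 1), ("A", 2020, 5, 2), ("B", 2016, 4, 4), ("B", 2020, 1, 0)],
)

# One illustrative query per operation type named in the text.
queries = {
    "filtering": "SELECT country FROM medals WHERE year = 2020 AND gold > 2",
    "aggregation": "SELECT SUM(gold) FROM medals",
    "grouping": "SELECT country, SUM(gold) FROM medals GROUP BY country ORDER BY country",
    "calculation": "SELECT country, gold + silver FROM medals WHERE year = 2016 ORDER BY country",
    "sorting": "SELECT country, year FROM medals ORDER BY gold DESC LIMIT 1",
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

Each query is evaluated by the database engine rather than estimated by the model, which is the sense in which SQL gives exact symbolic execution.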
The work presents a strong, well-structured solution to the challenge of reasoning over mixed-format documents. By maintaining structural integrity and adopting SQL for structured data operations, the researchers have demonstrated an effective alternative to existing retrieval-based systems. TableRAG represents a significant step forward for question-answering systems that handle documents containing both tables and text, offering a viable method for more accurate, scalable, and interpretable document understanding.
Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in materials science, he is exploring new advancements and creating opportunities to contribute.
