Hidden bias in large language models

by Brenden Burgess


Large language models (LLMs) such as GPT-4 and Claude have transformed AI with their ability to process and generate human-like text. But beneath their impressive capabilities lies a subtle and often overlooked problem: position bias. This is the tendency of these models to overweight information located at the beginning and end of a document while neglecting content in the middle. The bias can have significant real-world consequences, potentially leading to inaccurate or incomplete responses from AI systems.

A team of MIT researchers has now identified the underlying cause of this flaw. Their study reveals that position bias stems not only from the training data used to teach LLMs, but also from fundamental design choices in the model architecture itself, in particular how transformer-based models handle attention and the positions of words.

Transformers, the neural network architecture behind most LLMs, work by encoding sentences into tokens and learning how those tokens relate to one another. To make sense of long text sequences, the models use attention mechanisms, which allow each token to selectively "focus" on related tokens elsewhere in the sequence, helping the model understand context.
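As a rough illustration (not the researchers' code), a minimal scaled dot-product self-attention step might look like the sketch below; the shapes, names, and random embeddings are assumptions chosen purely for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: every token weighs every other token."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # weighted mix of value vectors

# Toy example: 5 tokens, 8-dimensional embeddings (random stand-ins).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
out, w = attention(X, X, X)              # self-attention: Q = K = V = X
print(w.round(2))                        # every position can still see every other
```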

However, because letting every token attend to every other token is computationally expensive, developers often use causal masks. These constraints limit each token to attending only to the tokens that precede it in the sequence. In addition, positional encodings are added to help the models keep track of word order.
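Continuing the toy sketch above (again an illustration, not the production code of any model), a causal mask simply blocks attention to later positions, and a sinusoidal positional encoding stamps order information onto the embeddings:

```python
import numpy as np

def causal_mask(seq_len):
    """Upper-triangular mask: position i may only attend to positions <= i."""
    return np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)

def masked_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores[causal_mask(len(Q))] = -np.inf   # future tokens get zero weight
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

def sinusoidal_positions(seq_len, d_model):
    """Fixed positional encoding in the style of the original Transformer paper."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

X = np.random.default_rng(1).normal(size=(5, 8)) + sinusoidal_positions(5, 8)
_, w = masked_attention(X, X, X)
print(w.round(2))   # strictly lower-triangular weights: later tokens are invisible
```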

The MIT team developed a graph-based theoretical framework to study how these architectural choices affect the flow of attention within models. Their analysis shows that causal masking intrinsically biases models toward the start of the input, regardless of how important the content actually is. Moreover, as more attention layers are added, a common strategy for boosting model performance, this bias grows stronger.
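A hedged way to see the compounding effect (a toy simulation, not the MIT framework itself) is to stack layers of perfectly uniform causal attention and track how much of the last position's representation traces back to the very first token:

```python
import numpy as np

def uniform_causal_attention(seq_len):
    """Each position attends equally to itself and to all earlier positions."""
    A = np.tril(np.ones((seq_len, seq_len)))
    return A / A.sum(axis=-1, keepdims=True)

seq_len = 64
A = uniform_causal_attention(seq_len)
M = np.eye(seq_len)
for layer in range(1, 9):
    M = A @ M                      # compose attention across layers
    first_token_share = M[-1, 0]   # how much the final position relies on token 0
    print(f"layers={layer}: weight on first token = {first_token_share:.3f}")

# The share assigned to the first token grows with depth, even though every
# individual layer treats all visible tokens equally.
```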

This discovery aligns with the real-world challenges faced by developers working on applied AI systems. Learn more about QuData's experience building a smarter retrieval-augmented generation system using graph databases. Our case study addresses some of the same architectural limitations and shows how to preserve structured relationships and contextual relevance in practice.

According to Xinyi Wu, an MIT doctoral student and the lead author of the study, their framework helped show that even when the data are neutral, the architecture itself can bias the model.

To test their theory, the team ran experiments in which the correct answer to a question was placed at different positions within a text. They found a clear U-shaped pattern: the models performed best when the answer was at the beginning, somewhat worse when it was at the end, and worst when it was in the middle, a phenomenon nicknamed "lost in the middle".
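One way to reproduce this kind of probe (an illustrative harness, not the study's protocol; `query_model` is a hypothetical stand-in for whatever LLM you call) is to slide a known fact through filler text and score the answers by position:

```python
from typing import Callable

FILLER = "The committee discussed routine scheduling matters. " * 40
FACT = "The access code for the archive room is 7391."
QUESTION = "What is the access code for the archive room?"

def build_document(fact: str, position: float) -> str:
    """Insert the fact at a relative position (0.0 = start, 1.0 = end) of the filler."""
    cut = int(len(FILLER) * position)
    return FILLER[:cut] + " " + fact + " " + FILLER[cut:]

def probe_position_bias(query_model: Callable[[str], str]) -> dict:
    """Ask the same question with the fact placed at different depths in the context."""
    results = {}
    for position in (0.0, 0.25, 0.5, 0.75, 1.0):
        prompt = build_document(FACT, position) + "\n\n" + QUESTION
        answer = query_model(prompt)           # hypothetical LLM call
        results[position] = "7391" in answer   # did the model retrieve the fact?
    return results

# Per the study, accuracy tends to be highest near position 0.0, dip around 0.5,
# and partially recover near 1.0 — the U-shaped "lost in the middle" curve.
```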

However, their work also revealed potential ways to mitigate this bias. Strategic use of positional encodings, which can be designed to tie tokens more strongly to nearby words, can considerably reduce position bias. Simplifying models by reducing the number of attention layers, or exploring alternative masking strategies, could also help. And although model architecture plays a major role, it is crucial to remember that biased training data can still reinforce the problem.
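One published example of a position encoding that emphasizes nearby tokens is a linear distance penalty in the style of ALiBi; the sketch below illustrates the general idea under that assumption and is not the specific remedy proposed in the study.

```python
import numpy as np

def distance_penalty(seq_len, slope=0.5):
    """ALiBi-style bias: subtract slope * distance from the attention scores,
    so nearby tokens are favored over distant ones."""
    pos = np.arange(seq_len)
    return -slope * np.abs(pos[:, None] - pos[None, :])

def biased_causal_attention(scores, slope=0.5):
    seq_len = scores.shape[0]
    scores = scores + distance_penalty(seq_len, slope)
    scores[np.triu(np.ones_like(scores, dtype=bool), k=1)] = -np.inf  # causal mask
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

# With a flat score matrix, the penalty alone shapes the attention pattern:
# each row now concentrates on its most recent tokens instead of piling
# weight onto the very first positions.
flat_scores = np.zeros((8, 8))
print(biased_causal_attention(flat_scores).round(2))
```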

This research provides valuable insight into the inner workings of AI systems that are increasingly used in high-stakes settings, from legal research and medical diagnostics to code generation.

As Ali Jadbabaie, professor and head of MIT's Department of Civil and Environmental Engineering, pointed out, these models are black boxes. Most users do not realize that the order of the input can affect the accuracy of the output. If they want to trust AI in critical applications, users must understand when and why it fails.
