The Allen Institute for AI (Ai2) Unveils AutoDS: A Bayesian Surprise-Driven Engine for Open-Ended Scientific Discovery

by Brenden Burgess



The Allen Institute for Artificial Intelligence (Ai2) has introduced AutoDS (Autonomous Discovery via Surprisal), a prototype engine for open-ended autonomous scientific discovery. Unlike conventional AI research assistants, which depend on human-defined objectives or queries, AutoDS independently generates, tests and iterates on hypotheses by quantifying and seeking out "Bayesian surprise", a principled measure of genuine discovery that can reach beyond what humans are specifically looking for.


From goal-driven inquiry to open-ended exploration

Traditional approaches to autonomous scientific discovery (ASD) generally revolve around answering pre-specified research questions: generate hypotheses relevant to a given problem, then validate them experimentally. AutoDS departs fundamentally from this paradigm. Inspired by the curiosity-driven exploration of human scientists, AutoDS operates in an open-ended way: it decides which questions to ask, which hypotheses to pursue, and how to build on previous results, all without predefined objectives.


Open-ended discovery is intrinsically difficult: it requires mechanisms for traversing vast hypothesis spaces and for prioritizing hypotheses on their merit. To meet these challenges, AutoDS formalizes the concept of "surprise" as a measurable change in belief about a hypothesis before and after empirical evidence is acquired.

Quantifying Bayesian surprise with large language models

At the heart of AutoDS is a new framework for estimating Bayesian surprise. For each generated hypothesis, large language models (LLMs) such as GPT-4o act as probabilistic observers: they are asked for their "belief" in the hypothesis (expressed as probabilities) before and after empirical testing. These belief distributions, built by sampling several LLM judgments, are modeled with Beta distributions.

To detect a significant discovery, AutoDS computes the Kullback-Leibler (KL) divergence between the posterior Beta distribution (after the evidence) and the prior Beta distribution (before the evidence), a formal measure of Bayesian surprise. Crucially, only belief shifts that cross an evidence threshold (for example, from probably true to probably false) are treated as truly surprising, which focuses the system on substantive discoveries rather than trivial updates in uncertainty.
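The surprise computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AutoDS's actual code: it assumes each sampled LLM judgment is a binary vote (1 means the hypothesis is believed true), fits a Beta distribution by adding the vote counts to a uniform Beta(1, 1) prior, and counts a shift as surprising only when the mean belief crosses a hypothetical boundary of 0.5. The function names and the boundary value are invented for the example.

```python
import math
from typing import Sequence, Tuple


def digamma(x: float) -> float:
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward so the asymptotic series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def kl_beta(a1: float, b1: float, a2: float, b2: float) -> float:
    """Closed-form KL( Beta(a1, b1) || Beta(a2, b2) )."""
    return (
        log_beta(a2, b2) - log_beta(a1, b1)
        + (a1 - a2) * digamma(a1)
        + (b1 - b2) * digamma(b1)
        + (a2 - a1 + b2 - b1) * digamma(a1 + b1)
    )


def fit_beta(votes: Sequence[int]) -> Tuple[float, float]:
    """Assumption: votes are 0/1 judgments added to a uniform Beta(1, 1)."""
    k = sum(votes)
    return 1.0 + k, 1.0 + len(votes) - k


def bayesian_surprise(prior_votes: Sequence[int],
                      posterior_votes: Sequence[int],
                      boundary: float = 0.5) -> float:
    """KL(posterior || prior) if the mean belief crosses `boundary`, else 0."""
    a0, b0 = fit_beta(prior_votes)
    a1, b1 = fit_beta(posterior_votes)
    prior_mean = a0 / (a0 + b0)
    post_mean = a1 / (a1 + b1)
    crossed = (prior_mean - boundary) * (post_mean - boundary) < 0
    return kl_beta(a1, b1, a0, b0) if crossed else 0.0
```

Under these assumptions, votes that are mostly "true" before the experiment and mostly "false" afterward yield a positive surprise, while a shift from leaning true to strongly true returns zero because the belief never changes sides.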

Efficient hypothesis search with MCTS

Exploring the vast hypothesis landscape efficiently requires more than naive sampling. AutoDS uses Monte Carlo Tree Search (MCTS) with progressive widening to guide its search for surprising discoveries. Each node of the search tree represents a hypothesis, and branches correspond to new hypotheses conditioned on previous results. This structure lets AutoDS balance exploring new avenues against following up on fruitful leads.

Unlike greedy or beam search methods, which may over-commit or prune prematurely, MCTS maintains high discovery efficiency under a fixed compute budget. Empirically, across 21 datasets from domains such as biology, economics and behavioral science, AutoDS outperforms repeated sampling, greedy search and beam search baselines, yielding 5 to 29% more hypotheses judged surprising by the LLM.
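To make the search procedure concrete, here is a minimal sketch of MCTS with progressive widening. It is an illustration only: the widening rule (a node may gain a child while its child count is below k * visits^alpha), the UCT constant, and the `propose` and `evaluate` callbacks are assumptions standing in for AutoDS's hypothesis generator and surprise reward, which the article does not specify.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    hypothesis: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total_reward: float = 0.0


def may_widen(node: Node, k: float = 1.0, alpha: float = 0.5) -> bool:
    """Progressive widening: allow a new child only while the
    child count is below k * visits**alpha (k, alpha are assumed)."""
    return len(node.children) < k * max(node.visits, 1) ** alpha


def uct_child(node: Node, c: float = 1.4) -> Node:
    """Pick the child with the best mean reward plus exploration bonus."""
    def score(child: Node) -> float:
        if child.visits == 0:
            return float("inf")
        mean = child.total_reward / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children, key=score)


def search(root: Node,
           propose: Callable[[Node], str],
           evaluate: Callable[[Node], float],
           budget: int) -> Node:
    """Run `budget` iterations; each one adds and scores one hypothesis."""
    for _ in range(budget):
        node = root
        # Selection: descend while the current node may not widen.
        while node.children and not may_widen(node):
            node = uct_child(node)
        # Expansion: a new hypothesis conditioned on the selected node.
        child = Node(propose(node), parent=node)
        node.children.append(child)
        # Evaluation: e.g. 1.0 if the hypothesis proved surprising.
        reward = evaluate(child)
        # Backpropagation: update statistics up to the root.
        cur: Optional[Node] = child
        while cur is not None:
            cur.visits += 1
            cur.total_reward += reward
            cur = cur.parent
    return root


if __name__ == "__main__":
    rng = random.Random(0)
    tree = search(Node("root"),
                  propose=lambda parent: parent.hypothesis + "/h",
                  evaluate=lambda node: float(rng.random() < 0.3),
                  budget=100)
    print(tree.visits, len(tree.children))
```

The widening rule is what keeps the tree from branching too eagerly: a node must accumulate visits before it earns another child, so the budget is split between deepening promising lines and opening new ones.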

Multi-agent architecture

AutoDS orchestrates a suite of specialized LLM agents, each responsible for a distinct part of the autonomous scientific workflow:

  • Generation of hypotheses
  • Experimental design
  • Programming and execution
  • Analysis and review of results
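The four stages listed above can be pictured as a chain of functions that each enrich a shared record. The sketch below is purely illustrative: the `Record` fields and the stub agents are hypothetical placeholders for LLM calls, and nothing here reflects how AutoDS actually wires its agents together.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Record:
    """Hypothetical state passed from one agent to the next."""
    hypothesis: str = ""
    plan: str = ""
    result: str = ""
    review: str = ""


Agent = Callable[[Record], Record]


# Stub agents: in a real system each would wrap an LLM call.
def generate_hypothesis(r: Record) -> Record:
    return replace(r, hypothesis="X correlates with Y")


def design_experiment(r: Record) -> Record:
    return replace(r, plan=f"regress Y on X to test: {r.hypothesis}")


def run_code(r: Record) -> Record:
    return replace(r, result=f"executed plan ({r.plan})")


def review_results(r: Record) -> Record:
    return replace(r, review="accepted" if r.result else "rejected")


def run_pipeline(agents: Iterable[Agent], record: Record = Record()) -> Record:
    """Apply each agent in order, threading the record through."""
    for agent in agents:
        record = agent(record)
    return record


final = run_pipeline(
    [generate_hypothesis, design_experiment, run_code, review_results]
)
```

Keeping each stage behind the same function signature makes it easy to swap one agent out, for example replacing the stubbed execution step with a sandboxed code runner.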

Deduplication of semantically similar hypotheses uses a hierarchical clustering pipeline: LLM-based text embeddings, combined with pairwise semantic-equivalence checks, ensure that the final output set contains only genuinely distinct discoveries.
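The deduplication idea can be sketched as follows. This is a simplified stand-in, not the pipeline from the paper: it merges hypotheses with a union-find structure whenever their embeddings are close in cosine similarity and a pairwise check agrees, which behaves like single-linkage clustering cut at a fixed threshold. The embeddings, the `same_claim` callback (a placeholder for an LLM equivalence judgment) and the 0.9 threshold are all assumptions.

```python
import math
from typing import Callable, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def deduplicate(hypotheses: Sequence[str],
                embeddings: Sequence[Sequence[float]],
                same_claim: Callable[[str, str], bool],
                threshold: float = 0.9) -> List[str]:
    """Keep the first hypothesis from each cluster of near-duplicates."""
    n = len(hypotheses)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            close = cosine(embeddings[i], embeddings[j]) >= threshold
            # The costlier pairwise check runs only for close embeddings.
            if close and same_claim(hypotheses[i], hypotheses[j]):
                parent[find(j)] = find(i)

    kept: List[str] = []
    seen = set()
    for i, text in enumerate(hypotheses):
        root = find(i)
        if root not in seen:
            seen.add(root)
            kept.append(text)
    return kept
```

Running the embedding filter first means the pairwise equivalence check, which would be an LLM call in practice, is only invoked for candidate pairs that already look similar.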

Human alignment and interpretability

Alignment with human scientific intuition is a key benchmark. In a structured human evaluation (with reviewers holding MS- or PhD-level STEM backgrounds), 67% of the hypotheses that AutoDS deemed surprising were also considered surprising by domain experts. In addition, AutoDS's Bayesian surprise metric aligned more closely with human judgment than proxy measures such as predicted "interestingness" or "utility".

Interestingly, the nature and direction of surprising belief shifts varied across scientific domains, highlighting, for example, that confirming claims often requires stronger evidence to be convincing than novel falsifications do.

Practical considerations and future perspectives

AutoDS shows high implementation and experimental validity, with human reviewers judging more than 98% of the evaluated discoveries to be correctly implemented. While the current pipeline depends on API-based LLMs and therefore faces latency constraints, the team has also explored a "programmatic search" implementation that delivers much faster, though less rich, results.

Although AutoDS is currently a research prototype (with an open-source release planned), its architecture and empirical success chart a compelling path for AI-driven science.

Conclusion

AutoDS represents a significant advance in autonomous scientific reasoning. By moving from goal-driven research to autonomous, curiosity-driven exploration, and by grounding its search in Bayesian surprise, it points the way toward future AI systems capable of complementing, accelerating or even independently leading scientific discovery.


Check out the Paper, GitHub page and Blog. All credit for this research goes to the researchers of this project.


