In a new study, researchers showed that AI, specifically large language models (LLMs), can predict the outcomes of scientific studies more accurately than human experts. The research, led by a team from University College London (UCL), highlights the potential of AI in scientific research: the models could not only synthesize knowledge from massive datasets but also predict research results more accurately than the human experts they were tested against.
The study, published in Nature Human Behaviour on November 27, 2024, tested LLMs against 171 human neuroscience experts to see which could better predict the results of neuroscience experiments. The team developed a tool called BrainBench that presents the AI and the human experts with pairs of abstracts from neuroscience studies: one real, the other altered to report plausible but incorrect results. The task is to pick the real abstract from each pair. The LLMs outperformed the human experts by a significant margin, averaging 81% accuracy compared to the experts’ 63%. Even when the comparison was restricted to the human experts with the highest domain expertise, they still lagged behind the AI, at 66% accuracy.
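The accuracy figures above come from a two-choice task, and the scoring can be illustrated with a short sketch. This is not the study's code: the names `AbstractPair`, `two_choice_accuracy`, and `toy_score` are hypothetical, the example pairs are invented placeholders, and the idea that a participant assigns each abstract a numeric plausibility score is an assumption made only to keep the example concrete.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class AbstractPair:
    """One benchmark item: a real abstract and its altered counterpart."""
    original: str
    altered: str


def two_choice_accuracy(
    pairs: Sequence[AbstractPair],
    score: Callable[[str], float],
) -> float:
    """Fraction of pairs where the scorer rates the real abstract
    strictly higher than the altered one (ties count as misses)."""
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for p in pairs if score(p.original) > score(p.altered))
    return correct / len(pairs)


# Invented placeholder items, not taken from BrainBench.
pairs = [
    AbstractPair("Stimulation increased firing.", "Stimulation decreased firing."),
    AbstractPair("Lesions impaired recall.", "Lesions improved recall."),
    AbstractPair("Drug A reduced anxiety.", "Drug A raised anxiety."),
    AbstractPair("Sleep loss slowed learning.", "Sleep loss sped up learning."),
]

# Assumed stand-in for a participant: a fixed plausibility table.
# A real evaluation would replace this with a model or a human judgment.
plausibility = {
    "Stimulation increased firing.": 0.9, "Stimulation decreased firing.": 0.2,
    "Lesions impaired recall.": 0.8, "Lesions improved recall.": 0.3,
    "Drug A reduced anxiety.": 0.4, "Drug A raised anxiety.": 0.6,  # a miss
    "Sleep loss slowed learning.": 0.7, "Sleep loss sped up learning.": 0.1,
}


def toy_score(text: str) -> float:
    return plausibility[text]


accuracy = two_choice_accuracy(pairs, toy_score)
print(f"accuracy = {accuracy:.2f}")
```

Because each item has exactly two options, a participant guessing at random would be expected to score about 50%, which is the baseline against which the 63%, 66%, and 81% figures should be read.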
What makes these findings even more impressive is that the AI models did not just retrieve past knowledge. Instead, they identified patterns across existing research to make accurate predictions about the outcomes of new studies. In fact, a version of Mistral, an open-source LLM, that had been further trained specifically on neuroscience literature reached an even higher accuracy of 86%.
According to lead author Dr. Ken Luo, this breakthrough reveals the true potential of AI in accelerating scientific discovery. “While previous research focused on LLMs’ ability to summarize and answer questions, we explored their capacity to predict scientific outcomes, which is a much more powerful application,” Luo explained. “These models can identify critical patterns in existing studies, offering predictions that could streamline the process of experimental design.”
The research opens the door to future collaboration between human experts and AI tools, in which AI systems could provide real-time predictions of the likely outcomes of different experimental approaches. This could enable faster, more efficient iteration in scientific research, saving both time and resources.
As AI evolves, its role in science is likely to expand, paving the way for AI-assisted research across various domains. The study is a powerful reminder of the increasing capabilities of generative AI and its potential to reshape the landscape of scientific inquiry.