The limits of AI in materials science

Researchers at Friedrich Schiller University Jena highlight strengths and weaknesses of language–image models on scientific tasks
Dr Kevin Jablonka, junior research group leader at the Institute of Organic Chemistry and Macromolecular Chemistry
Image: Nicole Nerger (Universität Jena)

Published: | By: Kevin Jablonka, Marco Körner

Current AI-based vision-language models can perceive content very well, but they reach their limits in more complex scientific processes. This is shown by a new study by researchers at Friedrich Schiller University Jena, conducted with international partners and published in the journal ‘Nature Computational Science’. In this work, the researchers systematically examined for the first time how well modern AI models can process visual and textual information in chemistry and materials science.

An innovative evaluation method for AI

“Our study addresses a key problem in AI research: How can we fairly evaluate multimodal systems when it is unclear which data the models have already seen during training?” says Dr Kevin Maik Jablonka, describing the methodological innovation. Jablonka heads a Carl Zeiss Foundation junior research group at Friedrich Schiller University Jena and at the Helmholtz Institute for Polymers in Energy Applications (HIPOLE) in Jena. The evaluation method the team developed enables, for the first time, a systematic analysis of the strengths and weaknesses of current AI systems in scientific applications.

“Multimodal AI systems that can understand both text and images are seen as the future of scientific assistant systems,” Jablonka adds. “We wanted to find out whether these models truly have the potential to support researchers in their daily work — from literature analysis to data evaluation.”

More than one thousand tasks from everyday scientific practice

To test the capabilities of multimodal AI, the international team developed the evaluation framework ‘MaCBench’ (https://macbench.lamalab.org), which comprises more than 1,100 realistic tasks from three central areas of scientific work: extracting data from the literature, understanding laboratory and simulation experiments, and interpreting measurement results. The tests ranged from analysing spectroscopy data and assessing laboratory safety to interpreting crystal structures.

The team examined leading AI models for their ability to understand and link scientific information. “In contrast to pure text models, these systems must be able to process visual and textual information simultaneously — a core capability for scientific work,” Jablonka explains.

Success on simple tasks, weaknesses in complex reasoning

The results present a nuanced picture: while the AI models reliably recognised laboratory equipment and extracted standardised data almost flawlessly, they showed fundamental weaknesses in spatial analyses and in integrating information from different sources. “It was particularly striking that the same information was processed markedly better when presented as text rather than as an image,” reports Jablonka. “This suggests that the integration of different data types is not yet functioning optimally.”

It was also notable that model performance correlated strongly with the frequency of the test materials on the internet. “This suggests that the models sometimes rely on pattern recognition from training data rather than developing genuine scientific understanding,” says the researcher.
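
For readers who want to see how such a relationship can be quantified, the following is a minimal sketch that computes a Spearman rank correlation between how often test material appears online and how well a model scores on it. The choice of Spearman correlation is an assumption made for illustration, not necessarily the measure used in the study, and the names `web_hits` and `accuracy` as well as all numbers are hypothetical toy values, not results from the publication.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data (NOT from the study): one entry per test item.
web_hits = [12, 450, 30, 9000, 75]          # assumed count of web occurrences
accuracy = [0.20, 0.70, 0.35, 0.95, 0.30]   # assumed model accuracy per item

rho = spearman(web_hits, accuracy)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho close to 1 would mean that items found more often online tend to be answered better, which is the pattern the researchers describe. On its own, a correlation of this kind does not show why the relationship holds.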

Foundations for better AI assistant systems

These insights can benefit the development of future scientific AI assistants: “Before such systems can be used reliably in research, their spatial perception and the integration of different types of information must be fundamentally improved,” Jablonka concludes. “Our work indicates concrete ways to tackle these challenges and to improve AI tools for the natural sciences.”

Information

Original publication:
Alampara et al.: “Probing the limitations of multimodal language models for chemistry and materials research”, Nature Computational Science (2025), DOI: 10.1038/s43588-025-00836-3

Contact:

Kevin Maik Jablonka, Dr
Institute of Organic Chemistry and Macromolecular Chemistry
Humboldtstraße 10
07743 Jena