Autonomous Microscopy Experiments through Large Language Model Agents
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2501.10385
The emergence of large language models (LLMs) has accelerated the development of self-driving laboratories (SDLs) for materials research. Despite their transformative potential, current SDL implementations rely on rigid, predefined protocols that limit their adaptability to dynamic experimental scenarios across different labs. A significant challenge persists in measuring how effectively AI agents can replicate the adaptive decision-making and experimental intuition of expert scientists. Here, we introduce AILA (Artificially Intelligent Lab Assistant), a framework that automates atomic force microscopy (AFM) through LLM-driven agents. Using AFM as an experimental testbed, we develop AFMBench, a comprehensive evaluation suite that challenges AI agents based on language models such as GPT-4o and GPT-3.5 to perform tasks spanning the scientific workflow, from experimental design to results analysis. Our systematic assessment shows that state-of-the-art language models struggle even with basic tasks such as documentation retrieval, leading to a significant decline in performance in multi-agent coordination scenarios. Further, we observe that LLMs tend not to adhere to instructions, or even to digress to additional tasks beyond the original request, raising serious concerns regarding the safety alignment of AI agents for SDLs. Finally, we demonstrate the application of AILA to increasingly complex, open-ended experiments: automated AFM calibration, high-resolution feature detection, and mechanical property measurement. Our findings emphasize the necessity for stringent benchmarking protocols before deploying AI agents as laboratory assistants across scientific disciplines.
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2501.10385, https://arxiv.org/pdf/2501.10385
- OA Status: green
- Cited By: 2
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406692223
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406692223 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2501.10385 (Digital Object Identifier)
- Title: Autonomous Microscopy Experiments through Large Language Model Agents (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-12-18 (Full publication date if available)
- Authors: Indrajeet Mandal, Jitendra Soni, Mohd Zaki, Morten M. Smedskjær, Katrin Wondraczek, Lothar Wondraczek, Nitya Nand Gosvami, N. M. Anoop Krishnan (List of authors in order)
- Landing page: https://arxiv.org/abs/2501.10385 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2501.10385 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2501.10385 (Direct OA link when available)
- Concepts: Microscopy, Computer science, Nanotechnology, Materials science, Physics, Optics (Top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 2 (Total citation count in OpenAlex)
- Citations by year (recent): 2025: 2 (Per-year citation counts, last 5 years)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4406692223 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2501.10385 |
| ids.doi | https://doi.org/10.48550/arxiv.2501.10385 |
| ids.openalex | https://openalex.org/W4406692223 |
| fwci | |
| type | preprint |
| title | Autonomous Microscopy Experiments through Large Language Model Agents |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11948 |
| topics[0].field.id | https://openalex.org/fields/25 |
| topics[0].field.display_name | Materials Science |
| topics[0].score | 0.9728999733924866 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2505 |
| topics[0].subfield.display_name | Materials Chemistry |
| topics[0].display_name | Machine Learning in Materials Science |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C147080431 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5026261806488037 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1074953 |
| concepts[0].display_name | Microscopy |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.4246293902397156 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C171250308 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3422449231147766 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11468 |
| concepts[2].display_name | Nanotechnology |
| concepts[3].id | https://openalex.org/C192562407 |
| concepts[3].level | 0 |
| concepts[3].score | 0.2553485631942749 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[3].display_name | Materials science |
| concepts[4].id | https://openalex.org/C121332964 |
| concepts[4].level | 0 |
| concepts[4].score | 0.18320712447166443 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[4].display_name | Physics |
| concepts[5].id | https://openalex.org/C120665830 |
| concepts[5].level | 1 |
| concepts[5].score | 0.15204590559005737 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q14620 |
| concepts[5].display_name | Optics |
| keywords[0].id | https://openalex.org/keywords/microscopy |
| keywords[0].score | 0.5026261806488037 |
| keywords[0].display_name | Microscopy |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.4246293902397156 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/nanotechnology |
| keywords[2].score | 0.3422449231147766 |
| keywords[2].display_name | Nanotechnology |
| keywords[3].id | https://openalex.org/keywords/materials-science |
| keywords[3].score | 0.2553485631942749 |
| keywords[3].display_name | Materials science |
| keywords[4].id | https://openalex.org/keywords/physics |
| keywords[4].score | 0.18320712447166443 |
| keywords[4].display_name | Physics |
| keywords[5].id | https://openalex.org/keywords/optics |
| keywords[5].score | 0.15204590559005737 |
| keywords[5].display_name | Optics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2501.10385 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2501.10385 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2501.10385 |
| locations[1].id | doi:10.48550/arxiv.2501.10385 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2501.10385 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5006744528 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8808-4602 |
| authorships[0].author.display_name | Indrajeet Mandal |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Mandal, Indrajeet |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5108382162 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-8427-3860 |
| authorships[1].author.display_name | Jitendra Soni |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Soni, Jitendra |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5046327692 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-4551-3470 |
| authorships[2].author.display_name | Mohd Zaki |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Zaki, Mohd |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5022182707 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-0476-2021 |
| authorships[3].author.display_name | Morten M. Smedskjær |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Smedskjaer, Morten M. |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5052110026 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-6268-9136 |
| authorships[4].author.display_name | Katrin Wondraczek |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wondraczek, Katrin |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5021241484 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-0747-3076 |
| authorships[5].author.display_name | Lothar Wondraczek |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Wondraczek, Lothar |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5028136506 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-4082-9887 |
| authorships[6].author.display_name | Nitya Nand Gosvami |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Gosvami, Nitya Nand |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5065129881 |
| authorships[7].author.orcid | https://orcid.org/0000-0003-1500-4947 |
| authorships[7].author.display_name | N. M. Anoop Krishnan |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Krishnan, N. M. Anoop |
| authorships[7].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2501.10385 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Autonomous Microscopy Experiments through Large Language Model Agents |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11948 |
| primary_topic.field.id | https://openalex.org/fields/25 |
| primary_topic.field.display_name | Materials Science |
| primary_topic.score | 0.9728999733924866 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2505 |
| primary_topic.subfield.display_name | Materials Chemistry |
| primary_topic.display_name | Machine Learning in Materials Science |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 2 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 2 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2501.10385 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2501.10385 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2501.10385 |
| primary_location.id | pmh:oai:arXiv.org:2501.10385 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2501.10385 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2501.10385 |
| publication_date | 2024-12-18 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.A | 41 |
| abstract_inverted_index.a | 70, 137, 152 |
| abstract_inverted_index.AI | 49, 95, 177, 215 |
| abstract_inverted_index.an | 84 |
| abstract_inverted_index.as | 83, 132, 217 |
| abstract_inverted_index.in | 45, 140, 142 |
| abstract_inverted_index.of | 2, 11, 59, 176, 186 |
| abstract_inverted_index.on | 26, 98, 188 |
| abstract_inverted_index.or | 159 |
| abstract_inverted_index.to | 34, 105, 115, 136, 154, 157, 162 |
| abstract_inverted_index.we | 63, 87, 147, 182 |
| abstract_inverted_index.AFM | 82, 195 |
| abstract_inverted_index.Lab | 68 |
| abstract_inverted_index.Our | 118, 204 |
| abstract_inverted_index.SDL | 23 |
| abstract_inverted_index.The | 0 |
| abstract_inverted_index.and | 56, 103, 200 |
| abstract_inverted_index.can | 51 |
| abstract_inverted_index.for | 15, 179, 209 |
| abstract_inverted_index.has | 7 |
| abstract_inverted_index.how | 47 |
| abstract_inverted_index.not | 155 |
| abstract_inverted_index.the | 9, 53, 109, 166, 184, 207 |
| abstract_inverted_index.AILA | 65, 187 |
| abstract_inverted_index.LLMs | 150 |
| abstract_inverted_index.even | 127, 160 |
| abstract_inverted_index.from | 112 |
| abstract_inverted_index.like | 101 |
| abstract_inverted_index.rely | 25 |
| abstract_inverted_index.such | 131 |
| abstract_inverted_index.that | 30, 72, 93, 122, 149 |
| abstract_inverted_index.with | 128 |
| abstract_inverted_index.(AFM) | 77 |
| abstract_inverted_index.Here, | 62 |
| abstract_inverted_index.SDLs. | 180 |
| abstract_inverted_index.Using | 81 |
| abstract_inverted_index.based | 97 |
| abstract_inverted_index.basic | 129 |
| abstract_inverted_index.force | 75 |
| abstract_inverted_index.labs. | 40 |
| abstract_inverted_index.large | 3 |
| abstract_inverted_index.limit | 31 |
| abstract_inverted_index.shows | 121 |
| abstract_inverted_index.suite | 92 |
| abstract_inverted_index.tasks | 107, 130, 164 |
| abstract_inverted_index.their | 19, 32 |
| abstract_inverted_index.(LLMs) | 6 |
| abstract_inverted_index.(SDLs) | 14 |
| abstract_inverted_index.GPT-4o | 102 |
| abstract_inverted_index.across | 38, 220 |
| abstract_inverted_index.adhere | 156 |
| abstract_inverted_index.agents | 50, 96, 178, 216 |
| abstract_inverted_index.atomic | 74 |
| abstract_inverted_index.before | 213 |
| abstract_inverted_index.beyond | 165 |
| abstract_inverted_index.design | 114 |
| abstract_inverted_index.expert | 60 |
| abstract_inverted_index.models | 5, 100, 125 |
| abstract_inverted_index.rigid, | 27 |
| abstract_inverted_index.safety | 173 |
| abstract_inverted_index.Despite | 18 |
| abstract_inverted_index.GPT-3.5 | 104 |
| abstract_inverted_index.agents. | 80 |
| abstract_inverted_index.aspects | 175 |
| abstract_inverted_index.complex | 190 |
| abstract_inverted_index.current | 22 |
| abstract_inverted_index.decline | 139 |
| abstract_inverted_index.develop | 88 |
| abstract_inverted_index.dynamic | 35 |
| abstract_inverted_index.exhibit | 151 |
| abstract_inverted_index.feature | 198 |
| abstract_inverted_index.leading | 135 |
| abstract_inverted_index.observe | 148 |
| abstract_inverted_index.perform | 106 |
| abstract_inverted_index.raising | 169 |
| abstract_inverted_index.results | 116 |
| abstract_inverted_index.serious | 170 |
| abstract_inverted_index.through | 78 |
| abstract_inverted_index.Finally, | 181 |
| abstract_inverted_index.Further, | 146 |
| abstract_inverted_index.adaptive | 54 |
| abstract_inverted_index.concerns | 171 |
| abstract_inverted_index.divagate | 161 |
| abstract_inverted_index.findings | 205 |
| abstract_inverted_index.language | 4, 99, 124 |
| abstract_inverted_index.original | 167 |
| abstract_inverted_index.persists | 44 |
| abstract_inverted_index.property | 202 |
| abstract_inverted_index.request, | 168 |
| abstract_inverted_index.spanning | 108 |
| abstract_inverted_index.struggle | 126 |
| abstract_inverted_index.tendency | 153 |
| abstract_inverted_index.testbed, | 86 |
| abstract_inverted_index.alignment | 174 |
| abstract_inverted_index.analysis. | 117 |
| abstract_inverted_index.automated | 194 |
| abstract_inverted_index.automates | 73 |
| abstract_inverted_index.challenge | 43 |
| abstract_inverted_index.deploying | 214 |
| abstract_inverted_index.different | 39 |
| abstract_inverted_index.emergence | 1 |
| abstract_inverted_index.emphasize | 206 |
| abstract_inverted_index.framework | 71 |
| abstract_inverted_index.introduce | 64 |
| abstract_inverted_index.intuition | 58 |
| abstract_inverted_index.materials | 16 |
| abstract_inverted_index.measuring | 46 |
| abstract_inverted_index.necessity | 208 |
| abstract_inverted_index.protocols | 29, 212 |
| abstract_inverted_index.regarding | 172 |
| abstract_inverted_index.replicate | 52 |
| abstract_inverted_index.research. | 17 |
| abstract_inverted_index.scenarios | 37 |
| abstract_inverted_index.stringent | 210 |
| abstract_inverted_index.workflow: | 111 |
| abstract_inverted_index.AFMBench-a | 89 |
| abstract_inverted_index.LLM-driven | 79 |
| abstract_inverted_index.additional | 163 |
| abstract_inverted_index.assessment | 120 |
| abstract_inverted_index.assistants | 219 |
| abstract_inverted_index.challenges | 94 |
| abstract_inverted_index.detection, | 199 |
| abstract_inverted_index.evaluation | 91 |
| abstract_inverted_index.laboratory | 218 |
| abstract_inverted_index.mechanical | 201 |
| abstract_inverted_index.microscopy | 76 |
| abstract_inverted_index.open-ended | 192 |
| abstract_inverted_index.potential, | 21 |
| abstract_inverted_index.predefined | 28 |
| abstract_inverted_index.retrieval, | 134 |
| abstract_inverted_index.scenarios. | 145 |
| abstract_inverted_index.scientific | 110, 221 |
| abstract_inverted_index.systematic | 119 |
| abstract_inverted_index.Assistant), | 69 |
| abstract_inverted_index.Intelligent | 67 |
| abstract_inverted_index.accelerated | 8 |
| abstract_inverted_index.application | 185 |
| abstract_inverted_index.demonstrate | 183 |
| abstract_inverted_index.development | 10 |
| abstract_inverted_index.effectively | 48 |
| abstract_inverted_index.experiments | 191 |
| abstract_inverted_index.multi-agent | 143 |
| abstract_inverted_index.performance | 141 |
| abstract_inverted_index.scientists. | 61 |
| abstract_inverted_index.significant | 42, 138 |
| abstract_inverted_index.adaptability | 33 |
| abstract_inverted_index.benchmarking | 211 |
| abstract_inverted_index.calibration, | 196 |
| abstract_inverted_index.coordination | 144 |
| abstract_inverted_index.disciplines. | 222 |
| abstract_inverted_index.experimental | 36, 57, 85, 113 |
| abstract_inverted_index.experiments: | 193 |
| abstract_inverted_index.increasingly | 189 |
| abstract_inverted_index.instructions | 158 |
| abstract_inverted_index.laboratories | 13 |
| abstract_inverted_index.measurement. | 203 |
| abstract_inverted_index.self-driving | 12 |
| abstract_inverted_index.(Artificially | 66 |
| abstract_inverted_index.comprehensive | 90 |
| abstract_inverted_index.documentation | 133 |
| abstract_inverted_index.transformative | 20 |
| abstract_inverted_index.decision-making | 55 |
| abstract_inverted_index.high-resolution | 197 |
| abstract_inverted_index.implementations | 24 |
| abstract_inverted_index.state-of-the-art | 123 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 8 |
| citation_normalized_percentile |
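
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch, the readable text can be recovered by placing every token at its positions and joining them in order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions; the sample copies only positions 0 to 13 from the rows above, so tokens such as `of` and `the` list fewer positions here than in the full payload.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild plain text from a token -> positions mapping."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Hypothetical excerpt: positions 0-13 only, taken from the table rows above.
sample = {
    "The": [0],
    "emergence": [1],
    "of": [2, 11],
    "large": [3],
    "language": [4],
    "models": [5],
    "(LLMs)": [6],
    "has": [7],
    "accelerated": [8],
    "the": [9],
    "development": [10],
    "self-driving": [12],
    "laboratories": [13],
}

opening = rebuild_abstract(sample)
print(opening)
```

Applied to the complete set of rows, the same function should reproduce the abstract shown at the top of the page, including its original punctuation, since punctuation is stored as part of each token.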