A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2508.21148
Scientific Large Language Models (Sci-LLMs) are transforming how knowledge is represented, integrated, and applied in scientific research, yet their progress is shaped by the complex nature of scientific data. This survey presents a comprehensive, data-centric synthesis that reframes the development of Sci-LLMs as a co-evolution between models and their underlying data substrate. We formulate a unified taxonomy of scientific data and a hierarchical model of scientific knowledge, emphasizing the multimodal, cross-scale, and domain-specific challenges that differentiate scientific corpora from general natural language processing datasets. We systematically review recent Sci-LLMs, from general-purpose foundations to specialized models across diverse scientific disciplines, alongside an extensive analysis of over 270 pre-/post-training datasets, showing why Sci-LLMs pose distinct demands -- heterogeneous, multi-scale, uncertainty-laden corpora that require representations preserving domain invariance and enabling cross-modal reasoning. On evaluation, we examine over 190 benchmark datasets and trace a shift from static exams toward process- and discovery-oriented assessments with advanced evaluation protocols. These data-centric analyses highlight persistent issues in scientific data development and discuss emerging solutions involving semi-automated annotation pipelines and expert validation. Finally, we outline a paradigm shift toward closed-loop systems where autonomous agents based on Sci-LLMs actively experiment, validate, and contribute to a living, evolving knowledge base. Collectively, this work provides a roadmap for building trustworthy, continually evolving artificial intelligence (AI) systems that function as a true partner in accelerating scientific discovery.
- Type: preprint
- Language: en
- Landing Page:
  - http://arxiv.org/abs/2508.21148
  - https://arxiv.org/pdf/2508.21148
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415987149
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415987149 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2508.21148 (Digital Object Identifier)
- Title: A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-08-28 (full publication date if available)
- Authors (in order): Mingjun Hu, Chenglong Ma, Wei Li, Wanghan Xu, Jiamin Wu, Jucheng Hu, Tianbin Li, Guohang Zhuang, Jiaqi Liu, Yingzhou Lu, Ying Chen, Chaoyang Zhang, Cheng Tan, Ying Jie, Guo–Cheng Wu, Shuo Gao, Pengcheng Chen, Jiashi Lin, Haitao Wu, Lulu Chen, Fengxiang Wang, Yuanyuan Zhang, Xiangyu Zhao, Feilong Tang, Encheng Su, Jing Ning, Xinyao Liu, Yu Du, Changkai Ji, Pengfei Jiang, Cheng Tang, Ziyan Huang, Jiyao Liu, Jiaqi Wei, Yue-Jin Yang, Xiang Zhang, Guangshuai Wang, Yang Yue, Huihui Xu, Ziyang Chen, Yizhou Wang, Chen Tang, Jianyu Wu, Yuchen Ren, Siyuan Yan, Zhonghua Wang, Zhongxing Xu, Shiyan Su, Shangquan Sun, Runkai Zhao, Zhisheng Zhang, Dingkang Yang, Jinjie Wei, Jiaqi Wang, Jin-Sen Xu, Jiangtao Yan, Wenhao Tang, Hongze Zhu, Yu Liu, Fudi Wang, Yiqing Shen, Yuanfeng Ji, Yanzhou Su, Tong Xie, Hongming Shan, Chun-Mei Feng, Zhi Bin Hou, Diping Song, Lihao Liu, Yanyan Huang, Lequan Yu, Bin Fu, Shujun Wang, X.L. Li, Xiaowei Hu, Yun Gu, Ben Fei, Benyou Wang, Y. Cao, Minjie Shen, Jie Xu, Haodong Duan, Yan Fang, Hongxia Hao, Jielan Li, Jiajun Du, Yanbo Wang, Imran Razzak, Zhongying Deng, Chi Zhang, Lijun Wu, Conghui He, Lu Zhaohui, Jinhai Huang, Wenqi Shao, Yihao Liu, Siqi Luo, Yi Xin, Xiaohong Liu, Fenghua Ling
- Landing page: https://arxiv.org/abs/2508.21148 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2508.21148 (direct link to full-text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2508.21148 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415987149 |
| doi | https://doi.org/10.48550/arxiv.2508.21148 |
| ids.doi | https://doi.org/10.48550/arxiv.2508.21148 |
| ids.openalex | https://openalex.org/W4415987149 |
| fwci | |
| type | preprint |
| title | A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2508.21148 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2508.21148 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2508.21148 |
| locations[1].id | doi:10.48550/arxiv.2508.21148 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2508.21148 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5007932694 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-5474-6022 |
| authorships[0].author.display_name | Mingjun Hu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Hu, Ming |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5013612900 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-6460-7819 |
| authorships[1].author.display_name | Chenglong Ma |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ma, Chenglong |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100318323 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-9438-9226 |
| authorships[2].author.display_name | Wei Li |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Li, Wei |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5102613788 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Wanghan Xu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Xu, Wanghan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5025651569 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-3479-1026 |
| authorships[4].author.display_name | Jiamin Wu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wu, Jiamin |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5103246163 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Jucheng Hu |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Hu, Jucheng |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5084901754 |
| authorships[6].author.orcid | https://orcid.org/0000-0001-8450-5767 |
| authorships[6].author.display_name | Tianbin Li |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Li, Tianbin |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5081656482 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-4744-7841 |
| authorships[7].author.display_name | Guohang Zhuang |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Zhuang, Guohang |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5100390372 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-9775-2342 |
| authorships[8].author.display_name | Jiaqi Liu |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Liu, Jiaqi |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5046890926 |
| authorships[9].author.orcid | https://orcid.org/0009-0008-7774-6018 |
| authorships[9].author.display_name | Yingzhou Lu |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Lu, Yingzhou |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5100383082 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-7322-2224 |
| authorships[10].author.display_name | Ying Chen |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Chen, Ying |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5100603442 |
| authorships[11].author.orcid | https://orcid.org/0000-0001-6816-3840 |
| authorships[11].author.display_name | Chaoyang Zhang |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Zhang, Chaoyang |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5100763145 |
| authorships[12].author.orcid | https://orcid.org/0000-0002-8764-2078 |
| authorships[12].author.display_name | Cheng Tan |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Tan, Cheng |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5100776906 |
| authorships[13].author.orcid | https://orcid.org/0000-0003-2649-8241 |
| authorships[13].author.display_name | Ying Jie |
| authorships[13].author_position | middle |
| authorships[13].raw_author_name | Ying, Jie |
| authorships[13].is_corresponding | False |
| authorships[14].author.id | https://openalex.org/A5083319940 |
| authorships[14].author.orcid | https://orcid.org/0000-0002-1946-6770 |
| authorships[14].author.display_name | Guo–Cheng Wu |
| authorships[14].author_position | middle |
| authorships[14].raw_author_name | Wu, Guocheng |
| authorships[14].is_corresponding | False |
| authorships[15].author.id | https://openalex.org/A5089002209 |
| authorships[15].author.orcid | https://orcid.org/0000-0003-3096-4700 |
| authorships[15].author.display_name | Shuo Gao |
| authorships[15].author_position | middle |
| authorships[15].raw_author_name | Gao, Shujian |
| authorships[15].is_corresponding | False |
| authorships[16].author.id | https://openalex.org/A5101979964 |
| authorships[16].author.orcid | https://orcid.org/0000-0002-7989-0194 |
| authorships[16].author.display_name | Pengcheng Chen |
| authorships[16].author_position | middle |
| authorships[16].raw_author_name | Chen, Pengcheng |
| authorships[16].is_corresponding | False |
| authorships[17].author.id | https://openalex.org/A5100568833 |
| authorships[17].author.orcid | https://orcid.org/0000-0002-2696-6204 |
| authorships[17].author.display_name | Jiashi Lin |
| authorships[17].author_position | middle |
| authorships[17].raw_author_name | Lin, Jiashi |
| authorships[17].is_corresponding | False |
| authorships[18].author.id | https://openalex.org/A5011649536 |
| authorships[18].author.orcid | https://orcid.org/0000-0003-3458-1459 |
| authorships[18].author.display_name | Haitao Wu |
| authorships[18].author_position | middle |
| authorships[18].raw_author_name | Wu, Haitao |
| authorships[18].is_corresponding | False |
| authorships[19].author.id | https://openalex.org/A5100420828 |
| authorships[19].author.orcid | https://orcid.org/0000-0002-4452-045X |
| authorships[19].author.display_name | Lulu Chen |
| authorships[19].author_position | middle |
| authorships[19].raw_author_name | Chen, Lulu |
| authorships[19].is_corresponding | False |
| authorships[20].author.id | https://openalex.org/A5075996633 |
| authorships[20].author.orcid | https://orcid.org/0009-0002-5267-232X |
| authorships[20].author.display_name | Fengxiang Wang |
| authorships[20].author_position | middle |
| authorships[20].raw_author_name | Wang, Fengxiang |
| authorships[20].is_corresponding | False |
| authorships[21].author.id | https://openalex.org/A5077518883 |
| authorships[21].author.orcid | https://orcid.org/0000-0001-7442-6846 |
| authorships[21].author.display_name | Yuanyuan Zhang |
| authorships[21].author_position | middle |
| authorships[21].raw_author_name | Zhang, Yuanyuan |
| authorships[21].is_corresponding | False |
| authorships[22].author.id | https://openalex.org/A5101577120 |
| authorships[22].author.orcid | https://orcid.org/0000-0003-2538-8657 |
| authorships[22].author.display_name | Xiangyu Zhao |
| authorships[22].author_position | middle |
| authorships[22].raw_author_name | Zhao, Xiangyu |
| authorships[22].is_corresponding | False |
| authorships[23].author.id | https://openalex.org/A5111230252 |
| authorships[23].author.orcid | |
| authorships[23].author.display_name | Feilong Tang |
| authorships[23].author_position | middle |
| authorships[23].raw_author_name | Tang, Feilong |
| authorships[23].is_corresponding | False |
| authorships[24].author.id | https://openalex.org/A5116867627 |
| authorships[24].author.orcid | |
| authorships[24].author.display_name | Encheng Su |
| authorships[24].author_position | middle |
| authorships[24].raw_author_name | Su, Encheng |
| authorships[24].is_corresponding | False |
| authorships[25].author.id | https://openalex.org/A5006858509 |
| authorships[25].author.orcid | https://orcid.org/0000-0002-5289-331X |
| authorships[25].author.display_name | Jing Ning |
| authorships[25].author_position | middle |
| authorships[25].raw_author_name | Ning, Junzhi |
| authorships[25].is_corresponding | False |
| authorships[26].author.id | https://openalex.org/A5101437216 |
| authorships[26].author.orcid | https://orcid.org/0000-0002-5948-1318 |
| authorships[26].author.display_name | Xinyao Liu |
| authorships[26].author_position | middle |
| authorships[26].raw_author_name | Liu, Xinyao |
| authorships[26].is_corresponding | False |
| authorships[27].author.id | https://openalex.org/A5023684585 |
| authorships[27].author.orcid | https://orcid.org/0000-0002-0868-4887 |
| authorships[27].author.display_name | Yu Du |
| authorships[27].author_position | middle |
| authorships[27].raw_author_name | Du, Ye |
| authorships[27].is_corresponding | False |
| authorships[28].author.id | https://openalex.org/A5111097611 |
| authorships[28].author.orcid | |
| authorships[28].author.display_name | Changkai Ji |
| authorships[28].author_position | middle |
| authorships[28].raw_author_name | Ji, Changkai |
| authorships[28].is_corresponding | False |
| authorships[29].author.id | https://openalex.org/A5070132561 |
| authorships[29].author.orcid | https://orcid.org/0000-0002-2975-2527 |
| authorships[29].author.display_name | Pengfei Jiang |
| authorships[29].author_position | middle |
| authorships[29].raw_author_name | Jiang, Pengfei |
| authorships[29].is_corresponding | False |
| authorships[30].author.id | https://openalex.org/A5048286830 |
| authorships[30].author.orcid | |
| authorships[30].author.display_name | Cheng Tang |
| authorships[30].author_position | middle |
| authorships[30].raw_author_name | Tang, Cheng |
| authorships[30].is_corresponding | False |
| authorships[31].author.id | https://openalex.org/A5115596571 |
| authorships[31].author.orcid | |
| authorships[31].author.display_name | Ziyan Huang |
| authorships[31].author_position | middle |
| authorships[31].raw_author_name | Huang, Ziyan |
| authorships[31].is_corresponding | False |
| authorships[32].author.id | https://openalex.org/A5110941234 |
| authorships[32].author.orcid | |
| authorships[32].author.display_name | Jiyao Liu |
| authorships[32].author_position | middle |
| authorships[32].raw_author_name | Liu, Jiyao |
| authorships[32].is_corresponding | False |
| authorships[33].author.id | https://openalex.org/A5006193910 |
| authorships[33].author.orcid | https://orcid.org/0000-0002-9927-6167 |
| authorships[33].author.display_name | Jiaqi Wei |
| authorships[33].author_position | middle |
| authorships[33].raw_author_name | Wei, Jiaqi |
| authorships[33].is_corresponding | False |
| authorships[34].author.id | https://openalex.org/A5103698774 |
| authorships[34].author.orcid | |
| authorships[34].author.display_name | Yue-Jin Yang |
| authorships[34].author_position | middle |
| authorships[34].raw_author_name | Yang, Yuejin |
| authorships[34].is_corresponding | False |
| authorships[35].author.id | https://openalex.org/A5102372950 |
| authorships[35].author.orcid | https://orcid.org/0000-0001-6706-2044 |
| authorships[35].author.display_name | Xiang Zhang |
| authorships[35].author_position | middle |
| authorships[35].raw_author_name | Zhang, Xiang |
| authorships[35].is_corresponding | False |
| authorships[36].author.id | https://openalex.org/A5115591477 |
| authorships[36].author.orcid | |
| authorships[36].author.display_name | Guangshuai Wang |
| authorships[36].author_position | middle |
| authorships[36].raw_author_name | Wang, Guangshuai |
| authorships[36].is_corresponding | False |
| authorships[37].author.id | https://openalex.org/A5100636019 |
| authorships[37].author.orcid | https://orcid.org/0000-0001-6934-2357 |
| authorships[37].author.display_name | Yang Yue |
| authorships[37].author_position | middle |
| authorships[37].raw_author_name | Yang, Yue |
| authorships[37].is_corresponding | False |
| authorships[38].author.id | https://openalex.org/A5010287267 |
| authorships[38].author.orcid | https://orcid.org/0000-0002-7389-4813 |
| authorships[38].author.display_name | Huihui Xu |
| authorships[38].author_position | middle |
| authorships[38].raw_author_name | Xu, Huihui |
| authorships[38].is_corresponding | False |
| authorships[39].author.id | https://openalex.org/A5100324487 |
| authorships[39].author.orcid | https://orcid.org/0000-0002-8564-9735 |
| authorships[39].author.display_name | Ziyang Chen |
| authorships[39].author_position | middle |
| authorships[39].raw_author_name | Chen, Ziyang |
| authorships[39].is_corresponding | False |
| authorships[40].author.id | https://openalex.org/A5100602395 |
| authorships[40].author.orcid | https://orcid.org/0000-0001-9888-6409 |
| authorships[40].author.display_name | Yizhou Wang |
| authorships[40].author_position | middle |
| authorships[40].raw_author_name | Wang, Yizhou |
| authorships[40].is_corresponding | False |
| authorships[41].author.id | https://openalex.org/A5101867082 |
| authorships[41].author.orcid | https://orcid.org/0000-0001-6830-4113 |
| authorships[41].author.display_name | Chen Tang |
| authorships[41].author_position | middle |
| authorships[41].raw_author_name | Tang, Chen |
| authorships[41].is_corresponding | False |
| authorships[42].author.id | https://openalex.org/A5070943245 |
| authorships[42].author.orcid | https://orcid.org/0000-0002-4590-5714 |
| authorships[42].author.display_name | Jianyu Wu |
| authorships[42].author_position | middle |
| authorships[42].raw_author_name | Wu, Jianyu |
| authorships[42].is_corresponding | False |
| authorships[43].author.id | https://openalex.org/A5078151985 |
| authorships[43].author.orcid | https://orcid.org/0000-0002-6009-6118 |
| authorships[43].author.display_name | Yuchen Ren |
| authorships[43].author_position | middle |
| authorships[43].raw_author_name | Ren, Yuchen |
| authorships[43].is_corresponding | False |
| authorships[44].author.id | https://openalex.org/A5108133368 |
| authorships[44].author.orcid | |
| authorships[44].author.display_name | Siyuan Yan |
| authorships[44].author_position | middle |
| authorships[44].raw_author_name | Yan, Siyuan |
| authorships[44].is_corresponding | False |
| authorships[45].author.id | https://openalex.org/A5103171298 |
| authorships[45].author.orcid | https://orcid.org/0000-0002-5550-8691 |
| authorships[45].author.display_name | Zhonghua Wang |
| authorships[45].author_position | middle |
| authorships[45].raw_author_name | Wang, Zhonghua |
| authorships[45].is_corresponding | False |
| authorships[46].author.id | https://openalex.org/A5015118469 |
| authorships[46].author.orcid | https://orcid.org/0000-0003-3815-7647 |
| authorships[46].author.display_name | Zhongxing Xu |
| authorships[46].author_position | middle |
| authorships[46].raw_author_name | Xu, Zhongxing |
| authorships[46].is_corresponding | False |
| authorships[47].author.id | https://openalex.org/A5037350701 |
| authorships[47].author.orcid | https://orcid.org/0009-0000-5486-8591 |
| authorships[47].author.display_name | Shiyan Su |
| authorships[47].author_position | middle |
| authorships[47].raw_author_name | Su, Shiyan |
| authorships[47].is_corresponding | False |
| authorships[48].author.id | https://openalex.org/A5103056828 |
| authorships[48].author.orcid | https://orcid.org/0000-0002-6292-2495 |
| authorships[48].author.display_name | Shangquan Sun |
| authorships[48].author_position | middle |
| authorships[48].raw_author_name | Sun, Shangquan |
| authorships[48].is_corresponding | False |
| authorships[49].author.id | https://openalex.org/A5040777365 |
| authorships[49].author.orcid | https://orcid.org/0000-0002-3649-9902 |
| authorships[49].author.display_name | Runkai Zhao |
| authorships[49].author_position | middle |
| authorships[49].raw_author_name | Zhao, Runkai |
| authorships[49].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2508.21148 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-08T23:21:52.890332 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2508.21148 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2508.21148 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2508.21148 |
| primary_location.id | pmh:oai:arXiv.org:2508.21148 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2508.21148 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2508.21148 |
| publication_date | 2025-08-28 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 32, 43, 54, 61, 139, 177, 195, 204, 218 |
| abstract_inverted_index.-- | 114 |
| abstract_inverted_index.On | 129 |
| abstract_inverted_index.We | 52, 84 |
| abstract_inverted_index.an | 100 |
| abstract_inverted_index.as | 42, 217 |
| abstract_inverted_index.by | 22 |
| abstract_inverted_index.in | 14, 159, 221 |
| abstract_inverted_index.is | 9, 20 |
| abstract_inverted_index.of | 26, 40, 57, 64, 103 |
| abstract_inverted_index.on | 187 |
| abstract_inverted_index.to | 92, 194 |
| abstract_inverted_index.we | 131, 175 |
| abstract_inverted_index.190 | 134 |
| abstract_inverted_index.270 | 105 |
| abstract_inverted_index.and | 12, 47, 60, 71, 125, 137, 146, 163, 171, 192 |
| abstract_inverted_index.are | 5 |
| abstract_inverted_index.for | 206 |
| abstract_inverted_index.how | 7 |
| abstract_inverted_index.the | 23, 38, 68 |
| abstract_inverted_index.why | 109 |
| abstract_inverted_index.yet | 17 |
| abstract_inverted_index.(AI) | 213 |
| abstract_inverted_index.This | 29 |
| abstract_inverted_index.data | 50, 59, 161 |
| abstract_inverted_index.from | 78, 89, 141 |
| abstract_inverted_index.over | 104, 133 |
| abstract_inverted_index.pose | 111 |
| abstract_inverted_index.that | 36, 74, 119, 215 |
| abstract_inverted_index.this | 201 |
| abstract_inverted_index.true | 219 |
| abstract_inverted_index.with | 149 |
| abstract_inverted_index.work | 202 |
| abstract_inverted_index.Large | 1 |
| abstract_inverted_index.These | 153 |
| abstract_inverted_index.base. | 199 |
| abstract_inverted_index.based | 186 |
| abstract_inverted_index.data. | 28 |
| abstract_inverted_index.exams | 143 |
| abstract_inverted_index.model | 63 |
| abstract_inverted_index.shift | 140, 179 |
| abstract_inverted_index.their | 18, 48 |
| abstract_inverted_index.trace | 138 |
| abstract_inverted_index.where | 183 |
| abstract_inverted_index.Models | 3 |
| abstract_inverted_index.across | 95 |
| abstract_inverted_index.agents | 185 |
| abstract_inverted_index.domain | 123 |
| abstract_inverted_index.expert | 172 |
| abstract_inverted_index.issues | 158 |
| abstract_inverted_index.models | 46, 94 |
| abstract_inverted_index.nature | 25 |
| abstract_inverted_index.recent | 87 |
| abstract_inverted_index.review | 86 |
| abstract_inverted_index.shaped | 21 |
| abstract_inverted_index.static | 142 |
| abstract_inverted_index.survey | 30 |
| abstract_inverted_index.toward | 144, 180 |
| abstract_inverted_index.applied | 13 |
| abstract_inverted_index.between | 45 |
| abstract_inverted_index.complex | 24 |
| abstract_inverted_index.corpora | 77, 118 |
| abstract_inverted_index.demands | 113 |
| abstract_inverted_index.discuss | 164 |
| abstract_inverted_index.diverse | 96 |
| abstract_inverted_index.examine | 132 |
| abstract_inverted_index.general | 79 |
| abstract_inverted_index.living, | 196 |
| abstract_inverted_index.natural | 80 |
| abstract_inverted_index.outline | 176 |
| abstract_inverted_index.partner | 220 |
| abstract_inverted_index.require | 120 |
| abstract_inverted_index.roadmap | 205 |
| abstract_inverted_index.showing | 108 |
| abstract_inverted_index.systems | 182, 214 |
| abstract_inverted_index.unified | 55 |
| abstract_inverted_index.Finally, | 174 |
| abstract_inverted_index.Language | 2 |
| abstract_inverted_index.Sci-LLMs | 41, 110, 188 |
| abstract_inverted_index.actively | 189 |
| abstract_inverted_index.advanced | 150 |
| abstract_inverted_index.analyses | 155 |
| abstract_inverted_index.analysis | 102 |
| abstract_inverted_index.building | 207 |
| abstract_inverted_index.datasets | 136 |
| abstract_inverted_index.distinct | 112 |
| abstract_inverted_index.emerging | 165 |
| abstract_inverted_index.enabling | 126 |
| abstract_inverted_index.evolving | 197, 210 |
| abstract_inverted_index.function | 216 |
| abstract_inverted_index.language | 81 |
| abstract_inverted_index.paradigm | 178 |
| abstract_inverted_index.presents | 31 |
| abstract_inverted_index.process- | 145 |
| abstract_inverted_index.progress | 19 |
| abstract_inverted_index.provides | 203 |
| abstract_inverted_index.reframes | 37 |
| abstract_inverted_index.taxonomy | 56 |
| abstract_inverted_index.Sci-LLMs, | 88 |
| abstract_inverted_index.alongside | 99 |
| abstract_inverted_index.benchmark | 135 |
| abstract_inverted_index.datasets, | 107 |
| abstract_inverted_index.datasets. | 83 |
| abstract_inverted_index.extensive | 101 |
| abstract_inverted_index.formulate | 53 |
| abstract_inverted_index.highlight | 156 |
| abstract_inverted_index.involving | 167 |
| abstract_inverted_index.knowledge | 8, 198 |
| abstract_inverted_index.pipelines | 170 |
| abstract_inverted_index.research, | 16 |
| abstract_inverted_index.solutions | 166 |
| abstract_inverted_index.synthesis | 35 |
| abstract_inverted_index.validate, | 191 |
| abstract_inverted_index.(Sci-LLMs) | 4 |
| abstract_inverted_index.Scientific | 0 |
| abstract_inverted_index.annotation | 169 |
| abstract_inverted_index.artificial | 211 |
| abstract_inverted_index.autonomous | 184 |
| abstract_inverted_index.challenges | 73 |
| abstract_inverted_index.contribute | 193 |
| abstract_inverted_index.discovery. | 224 |
| abstract_inverted_index.evaluation | 151 |
| abstract_inverted_index.invariance | 124 |
| abstract_inverted_index.knowledge, | 66 |
| abstract_inverted_index.persistent | 157 |
| abstract_inverted_index.preserving | 122 |
| abstract_inverted_index.processing | 82 |
| abstract_inverted_index.protocols. | 152 |
| abstract_inverted_index.reasoning. | 128 |
| abstract_inverted_index.scientific | 15, 27, 58, 65, 76, 97, 160, 223 |
| abstract_inverted_index.substrate. | 51 |
| abstract_inverted_index.underlying | 49 |
| abstract_inverted_index.assessments | 148 |
| abstract_inverted_index.closed-loop | 181 |
| abstract_inverted_index.continually | 209 |
| abstract_inverted_index.cross-modal | 127 |
| abstract_inverted_index.development | 39, 162 |
| abstract_inverted_index.emphasizing | 67 |
| abstract_inverted_index.evaluation, | 130 |
| abstract_inverted_index.experiment, | 190 |
| abstract_inverted_index.foundations | 91 |
| abstract_inverted_index.integrated, | 11 |
| abstract_inverted_index.multimodal, | 69 |
| abstract_inverted_index.specialized | 93 |
| abstract_inverted_index.validation. | 173 |
| abstract_inverted_index.accelerating | 222 |
| abstract_inverted_index.co-evolution | 44 |
| abstract_inverted_index.cross-scale, | 70 |
| abstract_inverted_index.data-centric | 34, 154 |
| abstract_inverted_index.disciplines, | 98 |
| abstract_inverted_index.hierarchical | 62 |
| abstract_inverted_index.intelligence | 212 |
| abstract_inverted_index.multi-scale, | 116 |
| abstract_inverted_index.represented, | 10 |
| abstract_inverted_index.transforming | 6 |
| abstract_inverted_index.trustworthy, | 208 |
| abstract_inverted_index.Collectively, | 200 |
| abstract_inverted_index.differentiate | 75 |
| abstract_inverted_index.comprehensive, | 33 |
| abstract_inverted_index.heterogeneous, | 115 |
| abstract_inverted_index.semi-automated | 168 |
| abstract_inverted_index.systematically | 85 |
| abstract_inverted_index.domain-specific | 72 |
| abstract_inverted_index.general-purpose | 90 |
| abstract_inverted_index.representations | 121 |
| abstract_inverted_index.uncertainty-laden | 117 |
| abstract_inverted_index.discovery-oriented | 147 |
| abstract_inverted_index.pre-/post-training | 106 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 120 |
| citation_normalized_percentile |
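
The `abstract_inverted_index.*` rows in the payload table appear to store the abstract as a token-to-positions map (for example, `Scientific` at position 0 and `knowledge` at positions 8 and 198). The following is a minimal sketch of how such a map could be turned back into running text, assuming zero-based word positions and space-separated tokens. The helper names `parse_positions` and `rebuild_abstract` are hypothetical and are not part of any OpenAlex client library.

```python
from typing import Dict, List


def parse_positions(cell: str) -> List[int]:
    """Parse a table cell such as "8, 198" into [8, 198]."""
    return [int(part) for part in cell.split(",") if part.strip()]


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place every token at each of its positions, then join in order."""
    by_position: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    return " ".join(by_position[i] for i in sorted(by_position))


# Rows copied from the payload table. Only the first nine tokens are included,
# so position 198 of "knowledge" is left out to keep the sample contiguous.
rows = {
    "Scientific": "0",
    "Large": "1",
    "Language": "2",
    "Models": "3",
    "(Sci-LLMs)": "4",
    "are": "5",
    "transforming": "6",
    "how": "7",
    "knowledge": "8",
}
sample = {token: parse_positions(cell) for token, cell in rows.items()}
print(rebuild_abstract(sample))
# prints: Scientific Large Language Models (Sci-LLMs) are transforming how knowledge
```

Applied to the full set of `abstract_inverted_index` rows, the same function should reproduce the abstract shown at the top of this record, provided no positions are missing from the table.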