Exploiting redundancy in large materials datasets for efficient machine learning with less data
2023 · Open Access · DOI: https://doi.org/10.1038/s41467-023-42992-y
Extensive efforts to gather materials data have largely overlooked potential data redundancy. In this study, we present evidence of a significant degree of redundancy across multiple large datasets for various material properties, by revealing that up to 95% of data can be safely removed from machine learning training with little impact on in-distribution prediction performance. The redundant data is related to over-represented material types and does not mitigate the severe performance degradation on out-of-distribution samples. In addition, we show that uncertainty-based active learning algorithms can construct much smaller but equally informative datasets. We discuss the effectiveness of informative data in improving prediction performance and robustness and provide insights into efficient data acquisition and machine learning training. This work challenges the “bigger is better” mentality and calls for attention to the information richness of materials data rather than a narrow emphasis on data volume.
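The abstract mentions uncertainty-based active learning as a way to build smaller but equally informative training sets. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a toy one-dimensional pool in which one region is heavily over-represented, and it swaps in a bootstrap committee of 1-nearest-neighbour regressors as a stand-in uncertainty estimate. All names (`make_pool`, `committee_std`, `active_learning`) and all parameter values are hypothetical.

```python
import math
import random
import statistics


def make_pool():
    """Toy 1-D pool: 180 points crowded into [0, 1) and 20 spread over [1, 5)."""
    dense = [i / 180 for i in range(180)]
    sparse = [1 + 0.2 * j for j in range(20)]
    return [(x, math.sin(2 * x)) for x in dense + sparse]


def nn_predict(train, x):
    """1-nearest-neighbour regression: label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def committee_std(labelled, x, rng, n_models=10):
    """Uncertainty proxy: spread of predictions from bootstrap-resampled models."""
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(labelled) for _ in labelled]
        preds.append(nn_predict(boot, x))
    return statistics.pstdev(preds)


def active_learning(pool, n_init=5, n_queries=20, seed=0):
    """Grow a labelled subset by repeatedly adding the most uncertain candidate."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    labelled, unlabelled = pool[:n_init], pool[n_init:]
    for _ in range(n_queries):
        if not unlabelled:
            break
        # Score every remaining candidate by committee disagreement.
        scores = [committee_std(labelled, x, rng) for x, _ in unlabelled]
        best = max(range(len(unlabelled)), key=scores.__getitem__)
        labelled.append(unlabelled.pop(best))
    return labelled


pool = make_pool()
subset = active_learning(pool)
print(len(subset), "of", len(pool), "points selected")
```

In a real study the committee would be replaced by whatever uncertainty estimate the chosen model provides, and the selected subset would be evaluated against the full training set on both in-distribution and out-of-distribution test data.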
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1038/s41467-023-42992-y (PDF: https://www.nature.com/articles/s41467-023-42992-y.pdf)
- OA Status: gold
- Cited By: 75
- References: 54
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4388567027
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4388567027 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1038/s41467-023-42992-y (Digital Object Identifier)
- Title: Exploiting redundancy in large materials datasets for efficient machine learning with less data (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-11-10 (full publication date if available)
- Authors: Kangming Li, Daniel Persaud, Kamal Choudhary, Brian DeCost, Michael T. Greenwood, Jason Hattrick‐Simpers (list of authors in order)
- Landing page: https://doi.org/10.1038/s41467-023-42992-y (publisher landing page)
- PDF URL: https://www.nature.com/articles/s41467-023-42992-y.pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://www.nature.com/articles/s41467-023-42992-y.pdf (direct OA link when available)
- Concepts: Computer science, Redundancy (engineering), Robustness (evolution), Machine learning, Training set, Artificial intelligence, Data mining, Biochemistry, Chemistry, Operating system, Gene (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 75 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 39, 2024: 33, 2023: 3 (per-year citation counts, last 5 years)
- References (count): 54 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4388567027 |
| doi | https://doi.org/10.1038/s41467-023-42992-y |
| ids.doi | https://doi.org/10.1038/s41467-023-42992-y |
| ids.pmid | https://pubmed.ncbi.nlm.nih.gov/37949845 |
| ids.openalex | https://openalex.org/W4388567027 |
| fwci | 10.05113212 |
| type | article |
| title | Exploiting redundancy in large materials datasets for efficient machine learning with less data |
| biblio.issue | 1 |
| biblio.volume | 14 |
| biblio.last_page | 7283 |
| biblio.first_page | 7283 |
| topics[0].id | https://openalex.org/T11948 |
| topics[0].field.id | https://openalex.org/fields/25 |
| topics[0].field.display_name | Materials Science |
| topics[0].score | 0.9998999834060669 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2505 |
| topics[0].subfield.display_name | Materials Chemistry |
| topics[0].display_name | Machine Learning in Materials Science |
| topics[1].id | https://openalex.org/T10211 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9918000102043152 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1703 |
| topics[1].subfield.display_name | Computational Theory and Mathematics |
| topics[1].display_name | Computational Drug Discovery Methods |
| topics[2].id | https://openalex.org/T12072 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9854999780654907 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Machine Learning and Algorithms |
| is_xpac | False |
| apc_list.value | 3920 |
| apc_list.currency | GBP |
| apc_list.value_usd | 4808 |
| apc_paid.value | 3920 |
| apc_paid.currency | GBP |
| apc_paid.value_usd | 4808 |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.8156693577766418 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C152124472 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7927512526512146 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1204361 |
| concepts[1].display_name | Redundancy (engineering) |
| concepts[2].id | https://openalex.org/C63479239 |
| concepts[2].level | 3 |
| concepts[2].score | 0.6734637022018433 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q7353546 |
| concepts[2].display_name | Robustness (evolution) |
| concepts[3].id | https://openalex.org/C119857082 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6439183950424194 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[3].display_name | Machine learning |
| concepts[4].id | https://openalex.org/C51632099 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6206527948379517 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q3985153 |
| concepts[4].display_name | Training set |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.49134865403175354 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C124101348 |
| concepts[6].level | 1 |
| concepts[6].score | 0.4617235064506531 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[6].display_name | Data mining |
| concepts[7].id | https://openalex.org/C55493867 |
| concepts[7].level | 1 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[7].display_name | Biochemistry |
| concepts[8].id | https://openalex.org/C185592680 |
| concepts[8].level | 0 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[8].display_name | Chemistry |
| concepts[9].id | https://openalex.org/C111919701 |
| concepts[9].level | 1 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[9].display_name | Operating system |
| concepts[10].id | https://openalex.org/C104317684 |
| concepts[10].level | 2 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q7187 |
| concepts[10].display_name | Gene |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.8156693577766418 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/redundancy |
| keywords[1].score | 0.7927512526512146 |
| keywords[1].display_name | Redundancy (engineering) |
| keywords[2].id | https://openalex.org/keywords/robustness |
| keywords[2].score | 0.6734637022018433 |
| keywords[2].display_name | Robustness (evolution) |
| keywords[3].id | https://openalex.org/keywords/machine-learning |
| keywords[3].score | 0.6439183950424194 |
| keywords[3].display_name | Machine learning |
| keywords[4].id | https://openalex.org/keywords/training-set |
| keywords[4].score | 0.6206527948379517 |
| keywords[4].display_name | Training set |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.49134865403175354 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/data-mining |
| keywords[6].score | 0.4617235064506531 |
| keywords[6].display_name | Data mining |
| language | en |
| locations[0].id | doi:10.1038/s41467-023-42992-y |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S64187185 |
| locations[0].source.issn | 2041-1723 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 2041-1723 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | Nature Communications |
| locations[0].source.host_organization | https://openalex.org/P4310319908 |
| locations[0].source.host_organization_name | Nature Portfolio |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310319908, https://openalex.org/P4310319965 |
| locations[0].source.host_organization_lineage_names | Nature Portfolio, Springer Nature |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.nature.com/articles/s41467-023-42992-y.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Nature Communications |
| locations[0].landing_page_url | https://doi.org/10.1038/s41467-023-42992-y |
| locations[1].id | pmid:37949845 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306525036 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | PubMed |
| locations[1].source.host_organization | https://openalex.org/I1299303238 |
| locations[1].source.host_organization_name | National Institutes of Health |
| locations[1].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | publishedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | True |
| locations[1].is_published | True |
| locations[1].raw_source_name | Nature communications |
| locations[1].landing_page_url | https://pubmed.ncbi.nlm.nih.gov/37949845 |
| locations[2].id | pmh:oai:pubmedcentral.nih.gov:10638383 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S2764455111 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | PubMed Central |
| locations[2].source.host_organization | https://openalex.org/I1299303238 |
| locations[2].source.host_organization_name | National Institutes of Health |
| locations[2].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[2].license | cc-by |
| locations[2].pdf_url | https://pmc.ncbi.nlm.nih.gov/articles/PMC10638383/pdf/41467_2023_Article_42992.pdf |
| locations[2].version | submittedVersion |
| locations[2].raw_type | Text |
| locations[2].license_id | https://openalex.org/licenses/cc-by |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | Nat Commun |
| locations[2].landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/10638383 |
| locations[3].id | pmh:oai:doaj.org/article:5aa92fdd64c04f02bcaafa9fbaf1283d |
| locations[3].is_oa | False |
| locations[3].source.id | https://openalex.org/S4306401280 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | False |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | DOAJ (DOAJ: Directory of Open Access Journals) |
| locations[3].source.host_organization | |
| locations[3].source.host_organization_name | |
| locations[3].license | |
| locations[3].pdf_url | |
| locations[3].version | submittedVersion |
| locations[3].raw_type | article |
| locations[3].license_id | |
| locations[3].is_accepted | False |
| locations[3].is_published | False |
| locations[3].raw_source_name | Nature Communications, Vol 14, Iss 1, Pp 1-10 (2023) |
| locations[3].landing_page_url | https://doaj.org/article/5aa92fdd64c04f02bcaafa9fbaf1283d |
| indexed_in | crossref, doaj, pubmed |
| authorships[0].author.id | https://openalex.org/A5055766485 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-4471-8527 |
| authorships[0].author.display_name | Kangming Li |
| authorships[0].countries | CA |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I106938459, https://openalex.org/I185261750 |
| authorships[0].affiliations[0].raw_affiliation_string | Department of Materials Science and Engineering, University of Toronto, 27 King's College Cir, Toronto, ON, Canada |
| authorships[0].institutions[0].id | https://openalex.org/I106938459 |
| authorships[0].institutions[0].ror | https://ror.org/05nkf0n29 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I106938459 |
| authorships[0].institutions[0].country_code | CA |
| authorships[0].institutions[0].display_name | University of New Brunswick |
| authorships[0].institutions[1].id | https://openalex.org/I185261750 |
| authorships[0].institutions[1].ror | https://ror.org/03dbr7087 |
| authorships[0].institutions[1].type | education |
| authorships[0].institutions[1].lineage | https://openalex.org/I185261750 |
| authorships[0].institutions[1].country_code | CA |
| authorships[0].institutions[1].display_name | University of Toronto |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Kangming Li |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Department of Materials Science and Engineering, University of Toronto, 27 King's College Cir, Toronto, ON, Canada |
| authorships[1].author.id | https://openalex.org/A5025711363 |
| authorships[1].author.orcid | https://orcid.org/0009-0004-9980-2704 |
| authorships[1].author.display_name | Daniel Persaud |
| authorships[1].countries | CA |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I106938459, https://openalex.org/I185261750 |
| authorships[1].affiliations[0].raw_affiliation_string | Department of Materials Science and Engineering, University of Toronto, 27 King's College Cir, Toronto, ON, Canada |
| authorships[1].institutions[0].id | https://openalex.org/I106938459 |
| authorships[1].institutions[0].ror | https://ror.org/05nkf0n29 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I106938459 |
| authorships[1].institutions[0].country_code | CA |
| authorships[1].institutions[0].display_name | University of New Brunswick |
| authorships[1].institutions[1].id | https://openalex.org/I185261750 |
| authorships[1].institutions[1].ror | https://ror.org/03dbr7087 |
| authorships[1].institutions[1].type | education |
| authorships[1].institutions[1].lineage | https://openalex.org/I185261750 |
| authorships[1].institutions[1].country_code | CA |
| authorships[1].institutions[1].display_name | University of Toronto |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Daniel Persaud |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Department of Materials Science and Engineering, University of Toronto, 27 King's College Cir, Toronto, ON, Canada |
| authorships[2].author.id | https://openalex.org/A5019215236 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-9737-8074 |
| authorships[2].author.display_name | Kamal Choudhary |
| authorships[2].countries | US |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I1321296531, https://openalex.org/I4210147263 |
| authorships[2].affiliations[0].raw_affiliation_string | Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA |
| authorships[2].institutions[0].id | https://openalex.org/I4210147263 |
| authorships[2].institutions[0].ror | https://ror.org/04a0y3b96 |
| authorships[2].institutions[0].type | government |
| authorships[2].institutions[0].lineage | https://openalex.org/I1321296531, https://openalex.org/I1343035065, https://openalex.org/I4210147263 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | Material Measurement Laboratory |
| authorships[2].institutions[1].id | https://openalex.org/I1321296531 |
| authorships[2].institutions[1].ror | https://ror.org/05xpvk416 |
| authorships[2].institutions[1].type | government |
| authorships[2].institutions[1].lineage | https://openalex.org/I1321296531, https://openalex.org/I1343035065 |
| authorships[2].institutions[1].country_code | US |
| authorships[2].institutions[1].display_name | National Institute of Standards and Technology |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Kamal Choudhary |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA |
| authorships[3].author.id | https://openalex.org/A5053080365 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-3459-5888 |
| authorships[3].author.display_name | Brian DeCost |
| authorships[3].countries | US |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I1321296531, https://openalex.org/I4210147263 |
| authorships[3].affiliations[0].raw_affiliation_string | Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA |
| authorships[3].institutions[0].id | https://openalex.org/I4210147263 |
| authorships[3].institutions[0].ror | https://ror.org/04a0y3b96 |
| authorships[3].institutions[0].type | government |
| authorships[3].institutions[0].lineage | https://openalex.org/I1321296531, https://openalex.org/I1343035065, https://openalex.org/I4210147263 |
| authorships[3].institutions[0].country_code | US |
| authorships[3].institutions[0].display_name | Material Measurement Laboratory |
| authorships[3].institutions[1].id | https://openalex.org/I1321296531 |
| authorships[3].institutions[1].ror | https://ror.org/05xpvk416 |
| authorships[3].institutions[1].type | government |
| authorships[3].institutions[1].lineage | https://openalex.org/I1321296531, https://openalex.org/I1343035065 |
| authorships[3].institutions[1].country_code | US |
| authorships[3].institutions[1].display_name | National Institute of Standards and Technology |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Brian DeCost |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA |
| authorships[4].author.id | https://openalex.org/A5069201034 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-0890-3434 |
| authorships[4].author.display_name | Michael T. Greenwood |
| authorships[4].countries | CA |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I1281735042 |
| authorships[4].affiliations[0].raw_affiliation_string | Canmet MATERIALS, Natural Resources Canada, 183 Longwood Road south, Hamilton, ON, Canada |
| authorships[4].institutions[0].id | https://openalex.org/I1281735042 |
| authorships[4].institutions[0].ror | https://ror.org/05hepy730 |
| authorships[4].institutions[0].type | government |
| authorships[4].institutions[0].lineage | https://openalex.org/I1281735042, https://openalex.org/I2802286613 |
| authorships[4].institutions[0].country_code | CA |
| authorships[4].institutions[0].display_name | Natural Resources Canada |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Michael Greenwood |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | Canmet MATERIALS, Natural Resources Canada, 183 Longwood Road south, Hamilton, ON, Canada |
| authorships[5].author.id | https://openalex.org/A5073635313 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-2937-3188 |
| authorships[5].author.display_name | Jason Hattrick‐Simpers |
| authorships[5].countries | CA |
| authorships[5].affiliations[0].institution_ids | https://openalex.org/I185261750 |
| authorships[5].affiliations[0].raw_affiliation_string | Acceleration Consortium, University of Toronto, 27 King's College Cir, Toronto, ON, Canada |
| authorships[5].institutions[0].id | https://openalex.org/I185261750 |
| authorships[5].institutions[0].ror | https://ror.org/03dbr7087 |
| authorships[5].institutions[0].type | education |
| authorships[5].institutions[0].lineage | https://openalex.org/I185261750 |
| authorships[5].institutions[0].country_code | CA |
| authorships[5].institutions[0].display_name | University of Toronto |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Jason Hattrick-Simpers |
| authorships[5].is_corresponding | True |
| authorships[5].raw_affiliation_strings | Acceleration Consortium, University of Toronto, 27 King's College Cir, Toronto, ON, Canada |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.nature.com/articles/s41467-023-42992-y.pdf |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Exploiting redundancy in large materials datasets for efficient machine learning with less data |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T11948 |
| primary_topic.field.id | https://openalex.org/fields/25 |
| primary_topic.field.display_name | Materials Science |
| primary_topic.score | 0.9998999834060669 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2505 |
| primary_topic.subfield.display_name | Materials Chemistry |
| primary_topic.display_name | Machine Learning in Materials Science |
| related_works | https://openalex.org/W2770593030, https://openalex.org/W1495042958, https://openalex.org/W2494338568, https://openalex.org/W4281727072, https://openalex.org/W4312219546, https://openalex.org/W2122678784, https://openalex.org/W3154990682, https://openalex.org/W2171975302, https://openalex.org/W2022352247, https://openalex.org/W2282510344 |
| cited_by_count | 75 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 39 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 33 |
| counts_by_year[2].year | 2023 |
| counts_by_year[2].cited_by_count | 3 |
| locations_count | 4 |
| best_oa_location.id | doi:10.1038/s41467-023-42992-y |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S64187185 |
| best_oa_location.source.issn | 2041-1723 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 2041-1723 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | Nature Communications |
| best_oa_location.source.host_organization | https://openalex.org/P4310319908 |
| best_oa_location.source.host_organization_name | Nature Portfolio |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310319908, https://openalex.org/P4310319965 |
| best_oa_location.source.host_organization_lineage_names | Nature Portfolio, Springer Nature |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.nature.com/articles/s41467-023-42992-y.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Nature Communications |
| best_oa_location.landing_page_url | https://doi.org/10.1038/s41467-023-42992-y |
| primary_location.id | doi:10.1038/s41467-023-42992-y |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S64187185 |
| primary_location.source.issn | 2041-1723 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 2041-1723 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | Nature Communications |
| primary_location.source.host_organization | https://openalex.org/P4310319908 |
| primary_location.source.host_organization_name | Nature Portfolio |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310319908, https://openalex.org/P4310319965 |
| primary_location.source.host_organization_lineage_names | Nature Portfolio, Springer Nature |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.nature.com/articles/s41467-023-42992-y.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Nature Communications |
| primary_location.landing_page_url | https://doi.org/10.1038/s41467-023-42992-y |
| publication_date | 2023-11-10 |
| publication_year | 2023 |
| referenced_works | https://openalex.org/W2884430236, https://openalex.org/W2963899413, https://openalex.org/W3022530152, https://openalex.org/W3010708275, https://openalex.org/W3183767639, https://openalex.org/W3183861986, https://openalex.org/W3208687975, https://openalex.org/W4298149167, https://openalex.org/W4214843102, https://openalex.org/W4304185479, https://openalex.org/W3119113486, https://openalex.org/W2964332384, https://openalex.org/W2117363206, https://openalex.org/W3039596418, https://openalex.org/W1992985800, https://openalex.org/W1976492731, https://openalex.org/W3093999435, https://openalex.org/W4321085451, https://openalex.org/W4283398621, https://openalex.org/W4298178320, https://openalex.org/W4319865391, https://openalex.org/W4281258849, https://openalex.org/W1583837637, https://openalex.org/W2609397020, https://openalex.org/W2785813126, https://openalex.org/W2916338083, https://openalex.org/W2972597827, https://openalex.org/W3025104221, https://openalex.org/W3034954837, https://openalex.org/W2999359202, https://openalex.org/W3215518215, https://openalex.org/W4224143931, https://openalex.org/W4295846579, https://openalex.org/W4362700002, https://openalex.org/W2295598076, https://openalex.org/W2911964244, https://openalex.org/W3212512279, https://openalex.org/W4363675884, https://openalex.org/W2997959813, https://openalex.org/W2278970271, https://openalex.org/W4282041110, https://openalex.org/W4310135808, https://openalex.org/W2734520197, https://openalex.org/W2804431384, https://openalex.org/W6675354045, https://openalex.org/W1981276685, https://openalex.org/W6931691375, https://openalex.org/W3105321997, https://openalex.org/W3033499467, https://openalex.org/W3100221827, https://openalex.org/W3102400420, https://openalex.org/W3103297471, https://openalex.org/W4388567027, https://openalex.org/W3107581532 |
| referenced_works_count | 54 |
| abstract_inverted_index.a | 20, 138 |
| abstract_inverted_index.In | 13, 76 |
| abstract_inverted_index.We | 93 |
| abstract_inverted_index.be | 42 |
| abstract_inverted_index.by | 33 |
| abstract_inverted_index.in | 100 |
| abstract_inverted_index.is | 59, 122 |
| abstract_inverted_index.of | 19, 23, 39, 97, 133 |
| abstract_inverted_index.on | 52, 73, 141 |
| abstract_inverted_index.to | 3, 37, 61, 129 |
| abstract_inverted_index.up | 36 |
| abstract_inverted_index.we | 16, 78 |
| abstract_inverted_index.95% | 38 |
| abstract_inverted_index.The | 56 |
| abstract_inverted_index.and | 65, 104, 106, 113, 125 |
| abstract_inverted_index.but | 89 |
| abstract_inverted_index.can | 41, 85 |
| abstract_inverted_index.for | 29, 127 |
| abstract_inverted_index.not | 67 |
| abstract_inverted_index.the | 69, 95, 120, 130 |
| abstract_inverted_index.This | 117 |
| abstract_inverted_index.data | 6, 11, 40, 58, 99, 111, 135, 142 |
| abstract_inverted_index.does | 66 |
| abstract_inverted_index.from | 45 |
| abstract_inverted_index.have | 7 |
| abstract_inverted_index.into | 109 |
| abstract_inverted_index.much | 87 |
| abstract_inverted_index.show | 79 |
| abstract_inverted_index.than | 137 |
| abstract_inverted_index.that | 35, 80 |
| abstract_inverted_index.this | 14 |
| abstract_inverted_index.with | 49 |
| abstract_inverted_index.work | 118 |
| abstract_inverted_index.calls | 126 |
| abstract_inverted_index.large | 27 |
| abstract_inverted_index.types | 64 |
| abstract_inverted_index.across | 25 |
| abstract_inverted_index.active | 82 |
| abstract_inverted_index.degree | 22 |
| abstract_inverted_index.gather | 4 |
| abstract_inverted_index.impact | 51 |
| abstract_inverted_index.little | 50 |
| abstract_inverted_index.narrow | 139 |
| abstract_inverted_index.rather | 136 |
| abstract_inverted_index.safely | 43 |
| abstract_inverted_index.severe | 70 |
| abstract_inverted_index.study, | 15 |
| abstract_inverted_index.discuss | 94 |
| abstract_inverted_index.efforts | 2 |
| abstract_inverted_index.equally | 90 |
| abstract_inverted_index.largely | 8 |
| abstract_inverted_index.machine | 46, 114 |
| abstract_inverted_index.present | 17 |
| abstract_inverted_index.provide | 107 |
| abstract_inverted_index.related | 60 |
| abstract_inverted_index.removed | 44 |
| abstract_inverted_index.smaller | 88 |
| abstract_inverted_index.various | 30 |
| abstract_inverted_index.volume. | 143 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.datasets | 28 |
| abstract_inverted_index.emphasis | 140 |
| abstract_inverted_index.evidence | 18 |
| abstract_inverted_index.insights | 108 |
| abstract_inverted_index.learning | 47, 83, 115 |
| abstract_inverted_index.material | 31, 63 |
| abstract_inverted_index.mitigate | 68 |
| abstract_inverted_index.multiple | 26 |
| abstract_inverted_index.richness | 132 |
| abstract_inverted_index.samples. | 75 |
| abstract_inverted_index.training | 48 |
| abstract_inverted_index.Extensive | 1 |
| abstract_inverted_index.addition, | 77 |
| abstract_inverted_index.attention | 128 |
| abstract_inverted_index.better” | 123 |
| abstract_inverted_index.construct | 86 |
| abstract_inverted_index.datasets. | 92 |
| abstract_inverted_index.efficient | 110 |
| abstract_inverted_index.improving | 101 |
| abstract_inverted_index.materials | 5, 134 |
| abstract_inverted_index.mentality | 124 |
| abstract_inverted_index.potential | 10 |
| abstract_inverted_index.redundant | 57 |
| abstract_inverted_index.revealing | 34 |
| abstract_inverted_index.training. | 116 |
| abstract_inverted_index.“bigger | 121 |
| abstract_inverted_index.algorithms | 84 |
| abstract_inverted_index.challenges | 119 |
| abstract_inverted_index.overlooked | 9 |
| abstract_inverted_index.prediction | 54, 102 |
| abstract_inverted_index.redundancy | 24 |
| abstract_inverted_index.robustness | 105 |
| abstract_inverted_index.acquisition | 112 |
| abstract_inverted_index.degradation | 72 |
| abstract_inverted_index.information | 131 |
| abstract_inverted_index.informative | 91, 98 |
| abstract_inverted_index.performance | 71, 103 |
| abstract_inverted_index.properties, | 32 |
| abstract_inverted_index.redundancy. | 12 |
| abstract_inverted_index.significant | 21 |
| abstract_inverted_index.performance. | 55 |
| abstract_inverted_index.effectiveness | 96 |
| abstract_inverted_index.in-distribution | 53 |
| abstract_inverted_index.over-represented | 62 |
| abstract_inverted_index.uncertainty-based | 81 |
| abstract_inverted_index.out-of-distribution | 74 |
| cited_by_percentile_year.max | 100 |
| cited_by_percentile_year.min | 96 |
| corresponding_author_ids | https://openalex.org/A5073635313 |
| countries_distinct_count | 2 |
| institutions_distinct_count | 6 |
| corresponding_institution_ids | https://openalex.org/I185261750 |
| citation_normalized_percentile.value | 0.98824392 |
| citation_normalized_percentile.is_in_top_1_percent | True |
| citation_normalized_percentile.is_in_top_10_percent | True |
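
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the positions where it occurs. A minimal sketch of turning such a map back into running text is given below. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is only an excerpt truncated to positions 0 through 12 of the rows above, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Invert a token -> positions map and join the tokens in position order."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload, restricted to positions 0..12 for brevity.
sample = {
    "Abstract": [0],
    "Extensive": [1],
    "efforts": [2],
    "to": [3],
    "gather": [4],
    "materials": [5],
    "data": [6, 11],
    "have": [7],
    "largely": [8],
    "overlooked": [9],
    "potential": [10],
    "redundancy.": [12],
}

print(rebuild_abstract(sample))
```

Position 0 holds the literal token "Abstract", which is why the reconstructed text carries that extra leading word compared with the abstract shown at the top of the page.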