Large language models outperform traditional natural language processing methods in extracting patient-reported outcomes in IBD
2024 · Open Access · DOI: https://doi.org/10.1101/2024.09.05.24313139
Background and Aims: Patient-reported outcomes (PROs) are vital in assessing disease activity and treatment outcomes in inflammatory bowel disease (IBD). However, manual extraction of these PROs from the free text of clinical notes is burdensome. We aimed to improve data curation from free-text information in the electronic health record, making it more available for research and quality improvement. This study aimed to compare traditional natural language processing (tNLP) and large language models (LLMs) in extracting three IBD PROs (abdominal pain, diarrhea, fecal blood) from clinical notes across two institutions.

Methods: Clinic notes were annotated for each PRO using preset protocols. Models were developed and internally tested at the University of California San Francisco (UCSF), and then externally validated at Stanford University. We compared tNLP and LLM-based models on accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. Additionally, we conducted fairness and error assessments.

Results: Inter-rater reliability between annotators was >90%. On the UCSF test set (n=50), the top-performing tNLP models achieved accuracies of 92% (abdominal pain), 82% (diarrhea), and 80% (fecal blood), comparable to GPT-4, which was 96%, 88%, and 90% accurate, respectively. On external validation at Stanford (n=250), tNLP models failed to generalize (61-62% accuracy), while GPT-4 maintained accuracies >90%. PaLM-2 and GPT-4 showed similar performance. No biases were detected based on demographics or diagnosis.

Conclusions: LLMs are accurate and generalizable methods for extracting PROs. They maintain excellent accuracy across institutions, despite heterogeneity in note templates and authors. Widespread adoption of such tools has the potential to enhance IBD research and patient care.
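The abstract compares models on accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. As a rough illustration of how these five metrics relate to one another, the sketch below derives them from binary labels for a single PRO (for example, "abdominal pain present"). It is not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are assumptions made up for this example and do not reproduce any result from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined
    # (for example, sensitivity when there are no positive notes).
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred) -> BinaryMetrics:
    """Compute the five metrics from 0/1 annotations and 0/1 predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


if __name__ == "__main__":
    # Hypothetical labels for ten notes (1 = symptom present, 0 = absent).
    annotated = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(annotated, predicted))
```

Reporting all five metrics rather than accuracy alone matters when a symptom is rare in the notes, since a model that always predicts "absent" can score high accuracy while having zero sensitivity.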
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.1101/2024.09.05.24313139
- OA Status: green
- References: 14
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4402335501
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4402335501 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1101/2024.09.05.24313139 (Digital Object Identifier)
- Title: Large language models outperform traditional natural language processing methods in extracting patient-reported outcomes in IBD (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-09-06 (full publication date if available)
- Authors: Perseus V. Patel, Conner Davis, Amariel Ralbovsky, Daniel Tinoco, Christopher Y. K. Williams, Shadera Slatter, Behzad Naderalvojoud, Michael Rosen, Tina Hernandez‐Boussard, Vivek A. Rudrapatna (list of authors in order)
- Landing page: https://doi.org/10.1101/2024.09.05.24313139 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://www.ncbi.nlm.nih.gov/pmc/articles/11398594 (direct OA link when available)
- Concepts: Diarrhea, Health records, Abdominal pain, Natural language processing, Computer science, Information extraction, Metric (unit), Natural history, Artificial intelligence, Electronic health record, Disease, Medicine, Text messaging, Health care, Pathology, World Wide Web, Internal medicine, Operations management, Economic growth, Economics (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- References (count): 14 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4402335501 |
|---|---|
| doi | https://doi.org/10.1101/2024.09.05.24313139 |
| ids.doi | https://doi.org/10.1101/2024.09.05.24313139 |
| ids.pmid | https://pubmed.ncbi.nlm.nih.gov/39281744 |
| ids.openalex | https://openalex.org/W4402335501 |
| fwci | |
| type | preprint |
| title | Large language models outperform traditional natural language processing methods in extracting patient-reported outcomes in IBD |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12578 |
| topics[0].field.id | https://openalex.org/fields/27 |
| topics[0].field.display_name | Medicine |
| topics[0].score | 0.9865999817848206 |
| topics[0].domain.id | https://openalex.org/domains/4 |
| topics[0].domain.display_name | Health Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2746 |
| topics[0].subfield.display_name | Surgery |
| topics[0].display_name | Diagnosis and treatment of tuberculosis |
| topics[1].id | https://openalex.org/T10134 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9800000190734863 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1311 |
| topics[1].subfield.display_name | Genetics |
| topics[1].display_name | Inflammatory Bowel Disease |
| topics[2].id | https://openalex.org/T10552 |
| topics[2].field.id | https://openalex.org/fields/27 |
| topics[2].field.display_name | Medicine |
| topics[2].score | 0.9771000146865845 |
| topics[2].domain.id | https://openalex.org/domains/4 |
| topics[2].domain.display_name | Health Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2730 |
| topics[2].subfield.display_name | Oncology |
| topics[2].display_name | Colorectal Cancer Screening and Detection |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2779802037 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5992506146430969 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q40878 |
| concepts[0].display_name | Diarrhea |
| concepts[1].id | https://openalex.org/C3019952477 |
| concepts[1].level | 3 |
| concepts[1].score | 0.5387265682220459 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1324077 |
| concepts[1].display_name | Health records |
| concepts[2].id | https://openalex.org/C2780955771 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5285802483558655 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q183425 |
| concepts[2].display_name | Abdominal pain |
| concepts[3].id | https://openalex.org/C204321447 |
| concepts[3].level | 1 |
| concepts[3].score | 0.5178885459899902 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[3].display_name | Natural language processing |
| concepts[4].id | https://openalex.org/C41008148 |
| concepts[4].level | 0 |
| concepts[4].score | 0.5025894641876221 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[4].display_name | Computer science |
| concepts[5].id | https://openalex.org/C195807954 |
| concepts[5].level | 2 |
| concepts[5].score | 0.4830361306667328 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q1662562 |
| concepts[5].display_name | Information extraction |
| concepts[6].id | https://openalex.org/C176217482 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4692765772342682 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q860554 |
| concepts[6].display_name | Metric (unit) |
| concepts[7].id | https://openalex.org/C163276114 |
| concepts[7].level | 2 |
| concepts[7].score | 0.468534380197525 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q484591 |
| concepts[7].display_name | Natural history |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.44253674149513245 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C3020144179 |
| concepts[9].level | 3 |
| concepts[9].score | 0.4412826597690582 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q10871684 |
| concepts[9].display_name | Electronic health record |
| concepts[10].id | https://openalex.org/C2779134260 |
| concepts[10].level | 2 |
| concepts[10].score | 0.4392460286617279 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q12136 |
| concepts[10].display_name | Disease |
| concepts[11].id | https://openalex.org/C71924100 |
| concepts[11].level | 0 |
| concepts[11].score | 0.42219728231430054 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q11190 |
| concepts[11].display_name | Medicine |
| concepts[12].id | https://openalex.org/C3018949938 |
| concepts[12].level | 2 |
| concepts[12].score | 0.41778337955474854 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q17166101 |
| concepts[12].display_name | Text messaging |
| concepts[13].id | https://openalex.org/C160735492 |
| concepts[13].level | 2 |
| concepts[13].score | 0.2824733257293701 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q31207 |
| concepts[13].display_name | Health care |
| concepts[14].id | https://openalex.org/C142724271 |
| concepts[14].level | 1 |
| concepts[14].score | 0.1968657374382019 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q7208 |
| concepts[14].display_name | Pathology |
| concepts[15].id | https://openalex.org/C136764020 |
| concepts[15].level | 1 |
| concepts[15].score | 0.13964173197746277 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q466 |
| concepts[15].display_name | World Wide Web |
| concepts[16].id | https://openalex.org/C126322002 |
| concepts[16].level | 1 |
| concepts[16].score | 0.13063183426856995 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q11180 |
| concepts[16].display_name | Internal medicine |
| concepts[17].id | https://openalex.org/C21547014 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q1423657 |
| concepts[17].display_name | Operations management |
| concepts[18].id | https://openalex.org/C50522688 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q189833 |
| concepts[18].display_name | Economic growth |
| concepts[19].id | https://openalex.org/C162324750 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[19].display_name | Economics |
| keywords[0].id | https://openalex.org/keywords/diarrhea |
| keywords[0].score | 0.5992506146430969 |
| keywords[0].display_name | Diarrhea |
| keywords[1].id | https://openalex.org/keywords/health-records |
| keywords[1].score | 0.5387265682220459 |
| keywords[1].display_name | Health records |
| keywords[2].id | https://openalex.org/keywords/abdominal-pain |
| keywords[2].score | 0.5285802483558655 |
| keywords[2].display_name | Abdominal pain |
| keywords[3].id | https://openalex.org/keywords/natural-language-processing |
| keywords[3].score | 0.5178885459899902 |
| keywords[3].display_name | Natural language processing |
| keywords[4].id | https://openalex.org/keywords/computer-science |
| keywords[4].score | 0.5025894641876221 |
| keywords[4].display_name | Computer science |
| keywords[5].id | https://openalex.org/keywords/information-extraction |
| keywords[5].score | 0.4830361306667328 |
| keywords[5].display_name | Information extraction |
| keywords[6].id | https://openalex.org/keywords/metric |
| keywords[6].score | 0.4692765772342682 |
| keywords[6].display_name | Metric (unit) |
| keywords[7].id | https://openalex.org/keywords/natural-history |
| keywords[7].score | 0.468534380197525 |
| keywords[7].display_name | Natural history |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.44253674149513245 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/electronic-health-record |
| keywords[9].score | 0.4412826597690582 |
| keywords[9].display_name | Electronic health record |
| keywords[10].id | https://openalex.org/keywords/disease |
| keywords[10].score | 0.4392460286617279 |
| keywords[10].display_name | Disease |
| keywords[11].id | https://openalex.org/keywords/medicine |
| keywords[11].score | 0.42219728231430054 |
| keywords[11].display_name | Medicine |
| keywords[12].id | https://openalex.org/keywords/text-messaging |
| keywords[12].score | 0.41778337955474854 |
| keywords[12].display_name | Text messaging |
| keywords[13].id | https://openalex.org/keywords/health-care |
| keywords[13].score | 0.2824733257293701 |
| keywords[13].display_name | Health care |
| keywords[14].id | https://openalex.org/keywords/pathology |
| keywords[14].score | 0.1968657374382019 |
| keywords[14].display_name | Pathology |
| keywords[15].id | https://openalex.org/keywords/world-wide-web |
| keywords[15].score | 0.13964173197746277 |
| keywords[15].display_name | World Wide Web |
| keywords[16].id | https://openalex.org/keywords/internal-medicine |
| keywords[16].score | 0.13063183426856995 |
| keywords[16].display_name | Internal medicine |
| language | en |
| locations[0].id | doi:10.1101/2024.09.05.24313139 |
| locations[0].is_oa | False |
| locations[0].source.id | https://openalex.org/S4306402567 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| locations[0].source.host_organization | https://openalex.org/I2750212522 |
| locations[0].source.host_organization_name | Cold Spring Harbor Laboratory |
| locations[0].source.host_organization_lineage | https://openalex.org/I2750212522 |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.1101/2024.09.05.24313139 |
| locations[1].id | pmid:39281744 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306525036 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | PubMed |
| locations[1].source.host_organization | https://openalex.org/I1299303238 |
| locations[1].source.host_organization_name | National Institutes of Health |
| locations[1].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | publishedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | True |
| locations[1].is_published | True |
| locations[1].raw_source_name | medRxiv : the preprint server for health sciences |
| locations[1].landing_page_url | https://pubmed.ncbi.nlm.nih.gov/39281744 |
| locations[2].id | pmh:oai:pubmedcentral.nih.gov:11398594 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S2764455111 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | PubMed Central |
| locations[2].source.host_organization | https://openalex.org/I1299303238 |
| locations[2].source.host_organization_name | National Institutes of Health |
| locations[2].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | Text |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | medRxiv |
| locations[2].landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11398594 |
| indexed_in | crossref, pubmed |
| authorships[0].author.id | https://openalex.org/A5010663592 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8287-3424 |
| authorships[0].author.display_name | Perseus V. Patel |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[0].affiliations[0].raw_affiliation_string | Division of Pediatric Gastroenterology, Stanford University School of Medicine, Palo Alto, CA. |
| authorships[0].affiliations[1].institution_ids | https://openalex.org/I180670191 |
| authorships[0].affiliations[1].raw_affiliation_string | Department of Pediatrics, University of California San Francisco, San Francisco, CA. |
| authorships[0].institutions[0].id | https://openalex.org/I97018004 |
| authorships[0].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Stanford University |
| authorships[0].institutions[1].id | https://openalex.org/I180670191 |
| authorships[0].institutions[1].ror | https://ror.org/043mz5j54 |
| authorships[0].institutions[1].type | education |
| authorships[0].institutions[1].lineage | https://openalex.org/I180670191 |
| authorships[0].institutions[1].country_code | US |
| authorships[0].institutions[1].display_name | University of California, San Francisco |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Perseus V Patel |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Department of Pediatrics, University of California San Francisco, San Francisco, CA., Division of Pediatric Gastroenterology, Stanford University School of Medicine, Palo Alto, CA. |
| authorships[1].author.id | https://openalex.org/A5111188339 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Conner Davis |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I180670191 |
| authorships[1].affiliations[0].raw_affiliation_string | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[1].institutions[0].id | https://openalex.org/I180670191 |
| authorships[1].institutions[0].ror | https://ror.org/043mz5j54 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I180670191 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | University of California, San Francisco |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Conner Davis |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[2].author.id | https://openalex.org/A5018929776 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Amariel Ralbovsky |
| authorships[2].countries | US |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I180670191 |
| authorships[2].affiliations[0].raw_affiliation_string | Department of Pediatrics, University of California San Francisco, San Francisco, CA. |
| authorships[2].institutions[0].id | https://openalex.org/I180670191 |
| authorships[2].institutions[0].ror | https://ror.org/043mz5j54 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I180670191 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | University of California, San Francisco |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Amariel Ralbovsky |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Department of Pediatrics, University of California San Francisco, San Francisco, CA. |
| authorships[3].author.id | https://openalex.org/A5113767672 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Daniel Tinoco |
| authorships[3].countries | US |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I180670191 |
| authorships[3].affiliations[0].raw_affiliation_string | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[3].institutions[0].id | https://openalex.org/I180670191 |
| authorships[3].institutions[0].ror | https://ror.org/043mz5j54 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I180670191 |
| authorships[3].institutions[0].country_code | US |
| authorships[3].institutions[0].display_name | University of California, San Francisco |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Daniel Tinoco |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[4].author.id | https://openalex.org/A5052287559 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-8867-1623 |
| authorships[4].author.display_name | Christopher Y. K. Williams |
| authorships[4].countries | US |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I180670191 |
| authorships[4].affiliations[0].raw_affiliation_string | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[4].institutions[0].id | https://openalex.org/I180670191 |
| authorships[4].institutions[0].ror | https://ror.org/043mz5j54 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I180670191 |
| authorships[4].institutions[0].country_code | US |
| authorships[4].institutions[0].display_name | University of California, San Francisco |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Christopher Y K Williams |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[5].author.id | https://openalex.org/A5033274345 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Shadera Slatter |
| authorships[5].countries | US |
| authorships[5].affiliations[0].institution_ids | https://openalex.org/I180670191 |
| authorships[5].affiliations[0].raw_affiliation_string | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[5].institutions[0].id | https://openalex.org/I180670191 |
| authorships[5].institutions[0].ror | https://ror.org/043mz5j54 |
| authorships[5].institutions[0].type | education |
| authorships[5].institutions[0].lineage | https://openalex.org/I180670191 |
| authorships[5].institutions[0].country_code | US |
| authorships[5].institutions[0].display_name | University of California, San Francisco |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Shadera Slatter |
| authorships[5].is_corresponding | False |
| authorships[5].raw_affiliation_strings | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[6].author.id | https://openalex.org/A5096260726 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Behzad Naderalvojoud |
| authorships[6].countries | US |
| authorships[6].affiliations[0].institution_ids | https://openalex.org/I4210137306 |
| authorships[6].affiliations[0].raw_affiliation_string | Stanford Center for Biomedical Informatics Research, Department of Medicine, StanfordUniversity, Palo Alto, CA. |
| authorships[6].institutions[0].id | https://openalex.org/I4210137306 |
| authorships[6].institutions[0].ror | https://ror.org/03mtd9a03 |
| authorships[6].institutions[0].type | healthcare |
| authorships[6].institutions[0].lineage | https://openalex.org/I4210137306, https://openalex.org/I97018004 |
| authorships[6].institutions[0].country_code | US |
| authorships[6].institutions[0].display_name | Stanford Medicine |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Behzad Naderalvojoud |
| authorships[6].is_corresponding | False |
| authorships[6].raw_affiliation_strings | Stanford Center for Biomedical Informatics Research, Department of Medicine, StanfordUniversity, Palo Alto, CA. |
| authorships[7].author.id | https://openalex.org/A5053271325 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-8842-6692 |
| authorships[7].author.display_name | Michael Rosen |
| authorships[7].countries | US |
| authorships[7].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[7].affiliations[0].raw_affiliation_string | Division of Pediatric Gastroenterology, Stanford University School of Medicine, Palo Alto, CA. |
| authorships[7].institutions[0].id | https://openalex.org/I97018004 |
| authorships[7].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[7].institutions[0].type | education |
| authorships[7].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[7].institutions[0].country_code | US |
| authorships[7].institutions[0].display_name | Stanford University |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Michael J Rosen |
| authorships[7].is_corresponding | False |
| authorships[7].raw_affiliation_strings | Division of Pediatric Gastroenterology, Stanford University School of Medicine, Palo Alto, CA. |
| authorships[8].author.id | https://openalex.org/A5023944464 |
| authorships[8].author.orcid | https://orcid.org/0000-0001-6553-3455 |
| authorships[8].author.display_name | Tina Hernandez‐Boussard |
| authorships[8].countries | US |
| authorships[8].affiliations[0].institution_ids | https://openalex.org/I4210137306 |
| authorships[8].affiliations[0].raw_affiliation_string | Stanford Center for Biomedical Informatics Research, Department of Medicine, StanfordUniversity, Palo Alto, CA. |
| authorships[8].institutions[0].id | https://openalex.org/I4210137306 |
| authorships[8].institutions[0].ror | https://ror.org/03mtd9a03 |
| authorships[8].institutions[0].type | healthcare |
| authorships[8].institutions[0].lineage | https://openalex.org/I4210137306, https://openalex.org/I97018004 |
| authorships[8].institutions[0].country_code | US |
| authorships[8].institutions[0].display_name | Stanford Medicine |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Tina Hernandez-Boussard |
| authorships[8].is_corresponding | False |
| authorships[8].raw_affiliation_strings | Stanford Center for Biomedical Informatics Research, Department of Medicine, StanfordUniversity, Palo Alto, CA. |
| authorships[9].author.id | https://openalex.org/A5073538344 |
| authorships[9].author.orcid | https://orcid.org/0000-0003-1789-3004 |
| authorships[9].author.display_name | Vivek A. Rudrapatna |
| authorships[9].countries | US |
| authorships[9].affiliations[0].institution_ids | https://openalex.org/I180670191 |
| authorships[9].affiliations[0].raw_affiliation_string | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA. |
| authorships[9].affiliations[1].institution_ids | https://openalex.org/I180670191 |
| authorships[9].affiliations[1].raw_affiliation_string | Division of Gastroenterology, Department of Medicine, University of California San Francisco,San Francisco, CA. |
| authorships[9].institutions[0].id | https://openalex.org/I180670191 |
| authorships[9].institutions[0].ror | https://ror.org/043mz5j54 |
| authorships[9].institutions[0].type | education |
| authorships[9].institutions[0].lineage | https://openalex.org/I180670191 |
| authorships[9].institutions[0].country_code | US |
| authorships[9].institutions[0].display_name | University of California, San Francisco |
| authorships[9].author_position | last |
| authorships[9].raw_author_name | Vivek Rudrapatna |
| authorships[9].is_corresponding | False |
| authorships[9].raw_affiliation_strings | Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA., Division of Gastroenterology, Department of Medicine, University of California San Francisco,San Francisco, CA. |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11398594 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Large language models outperform traditional natural language processing methods in extracting patient-reported outcomes in IBD |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T12578 |
| primary_topic.field.id | https://openalex.org/fields/27 |
| primary_topic.field.display_name | Medicine |
| primary_topic.score | 0.9865999817848206 |
| primary_topic.domain.id | https://openalex.org/domains/4 |
| primary_topic.domain.display_name | Health Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2746 |
| primary_topic.subfield.display_name | Surgery |
| primary_topic.display_name | Diagnosis and treatment of tuberculosis |
| related_works | https://openalex.org/W187932805, https://openalex.org/W4392490004, https://openalex.org/W1641026212, https://openalex.org/W4402738807, https://openalex.org/W2911982698, https://openalex.org/W2323588885, https://openalex.org/W3047677938, https://openalex.org/W2087134418, https://openalex.org/W2078646730, https://openalex.org/W4312053962 |
| cited_by_count | 0 |
| locations_count | 3 |
| best_oa_location.id | pmh:oai:pubmedcentral.nih.gov:11398594 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2764455111 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | PubMed Central |
| best_oa_location.source.host_organization | https://openalex.org/I1299303238 |
| best_oa_location.source.host_organization_name | National Institutes of Health |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I1299303238 |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | Text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | medRxiv |
| best_oa_location.landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11398594 |
| primary_location.id | doi:10.1101/2024.09.05.24313139 |
| primary_location.is_oa | False |
| primary_location.source.id | https://openalex.org/S4306402567 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| primary_location.source.host_organization | https://openalex.org/I2750212522 |
| primary_location.source.host_organization_name | Cold Spring Harbor Laboratory |
| primary_location.source.host_organization_lineage | https://openalex.org/I2750212522 |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.1101/2024.09.05.24313139 |
| publication_date | 2024-09-06 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W2019150265, https://openalex.org/W2239629203, https://openalex.org/W3115410759, https://openalex.org/W2033081617, https://openalex.org/W4280525579, https://openalex.org/W4392619666, https://openalex.org/W4390191984, https://openalex.org/W3169284847, https://openalex.org/W4390745503, https://openalex.org/W3022060579, https://openalex.org/W2109056977, https://openalex.org/W2803423890, https://openalex.org/W4282024093, https://openalex.org/W3086703480 |
| referenced_works_count | 14 |
| abstract_inverted_index.No | 207 |
| abstract_inverted_index.On | 150, 183 |
| abstract_inverted_index.We | 35, 121 |
| abstract_inverted_index.at | 106, 118, 186 |
| abstract_inverted_index.in | 9, 16, 44, 73, 234 |
| abstract_inverted_index.is | 33 |
| abstract_inverted_index.it | 50 |
| abstract_inverted_index.of | 24, 30, 109, 162, 241 |
| abstract_inverted_index.on | 127, 212 |
| abstract_inverted_index.or | 214 |
| abstract_inverted_index.to | 37, 61, 173, 192, 247 |
| abstract_inverted_index.we | 137 |
| abstract_inverted_index.80% | 169 |
| abstract_inverted_index.82% | 166 |
| abstract_inverted_index.90% | 180 |
| abstract_inverted_index.92% | 163 |
| abstract_inverted_index.IBD | 76, 249 |
| abstract_inverted_index.PRO | 96 |
| abstract_inverted_index.San | 111 |
| abstract_inverted_index.and | 2, 13, 55, 68, 103, 114, 124, 132, 140, 168, 179, 202, 220, 237, 251 |
| abstract_inverted_index.are | 7, 218 |
| abstract_inverted_index.for | 53, 94, 223 |
| abstract_inverted_index.has | 244 |
| abstract_inverted_index.set | 154 |
| abstract_inverted_index.the | 28, 45, 107, 151, 156, 245 |
| abstract_inverted_index.two | 87 |
| abstract_inverted_index.was | 148, 176 |
| abstract_inverted_index.88%, | 178 |
| abstract_inverted_index.96%, | 177 |
| abstract_inverted_index.Aims | 3 |
| abstract_inverted_index.LLMs | 217 |
| abstract_inverted_index.PROs | 26, 77 |
| abstract_inverted_index.They | 226 |
| abstract_inverted_index.This | 58 |
| abstract_inverted_index.UCSF | 152 |
| abstract_inverted_index.data | 39 |
| abstract_inverted_index.each | 95 |
| abstract_inverted_index.from | 27, 41, 83 |
| abstract_inverted_index.more | 51 |
| abstract_inverted_index.note | 235 |
| abstract_inverted_index.such | 242 |
| abstract_inverted_index.tNLP | 123, 158, 189 |
| abstract_inverted_index.test | 153 |
| abstract_inverted_index.then | 115 |
| abstract_inverted_index.were | 92, 101, 209 |
| abstract_inverted_index.GPT-4 | 197, 203 |
| abstract_inverted_index.PROs. | 225 |
| abstract_inverted_index.aimed | 36, 60 |
| abstract_inverted_index.based | 211 |
| abstract_inverted_index.bowel | 18 |
| abstract_inverted_index.care. | 253 |
| abstract_inverted_index.error | 141 |
| abstract_inverted_index.fecal | 81 |
| abstract_inverted_index.large | 69 |
| abstract_inverted_index.notes | 32, 85, 91 |
| abstract_inverted_index.pain, | 79 |
| abstract_inverted_index.study | 59 |
| abstract_inverted_index.these | 25 |
| abstract_inverted_index.three | 75 |
| abstract_inverted_index.tools | 243 |
| abstract_inverted_index.using | 97 |
| abstract_inverted_index.vital | 8 |
| abstract_inverted_index.which | 175 |
| abstract_inverted_index.while | 196 |
| abstract_inverted_index.(IBD). | 20 |
| abstract_inverted_index.(LLMs) | 72 |
| abstract_inverted_index.(PROs) | 6 |
| abstract_inverted_index.(fecal | 170 |
| abstract_inverted_index.(tNLP) | 67 |
| abstract_inverted_index.Clinic | 90 |
| abstract_inverted_index.GPT-4, | 174 |
| abstract_inverted_index.Models | 100 |
| abstract_inverted_index.PaLM-2 | 201 |
| abstract_inverted_index.across | 86, 230 |
| abstract_inverted_index.biases | 208 |
| abstract_inverted_index.blood) | 82 |
| abstract_inverted_index.failed | 191 |
| abstract_inverted_index.health | 47 |
| abstract_inverted_index.making | 49 |
| abstract_inverted_index.manual | 22 |
| abstract_inverted_index.models | 71, 126, 159, 190 |
| abstract_inverted_index.pain), | 165 |
| abstract_inverted_index.preset | 98 |
| abstract_inverted_index.showed | 204 |
| abstract_inverted_index.tested | 105 |
| abstract_inverted_index.value. | 135 |
| abstract_inverted_index.(61-62% | 194 |
| abstract_inverted_index.(UCSF), | 113 |
| abstract_inverted_index.(n=50), | 155 |
| abstract_inverted_index.Methods | 89 |
| abstract_inverted_index.Results | 143 |
| abstract_inverted_index.between | 146 |
| abstract_inverted_index.blood), | 171 |
| abstract_inverted_index.compare | 62 |
| abstract_inverted_index.despite | 232 |
| abstract_inverted_index.disease | 11, 19 |
| abstract_inverted_index.enhance | 248 |
| abstract_inverted_index.improve | 38 |
| abstract_inverted_index.methods | 222 |
| abstract_inverted_index.natural | 64 |
| abstract_inverted_index.patient | 252 |
| abstract_inverted_index.quality | 56 |
| abstract_inverted_index.record, | 48 |
| abstract_inverted_index.similar | 205 |
| abstract_inverted_index.>90%. | 149, 200 |
| abstract_inverted_index.(n=250), | 188 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.However, | 21 |
| abstract_inverted_index.Stanford | 119, 187 |
| abstract_inverted_index.accuracy | 229 |
| abstract_inverted_index.accurate | 219 |
| abstract_inverted_index.activity | 12 |
| abstract_inverted_index.adoption | 240 |
| abstract_inverted_index.authors. | 238 |
| abstract_inverted_index.clinical | 31, 84 |
| abstract_inverted_index.compared | 122 |
| abstract_inverted_index.curation | 40 |
| abstract_inverted_index.detected | 210 |
| abstract_inverted_index.external | 184 |
| abstract_inverted_index.fairness | 139 |
| abstract_inverted_index.language | 65, 70 |
| abstract_inverted_index.maintain | 227 |
| abstract_inverted_index.negative | 133 |
| abstract_inverted_index.outcomes | 5, 15 |
| abstract_inverted_index.positive | 131 |
| abstract_inverted_index.research | 54, 250 |
| abstract_inverted_index.Francisco | 112 |
| abstract_inverted_index.LLM-based | 125 |
| abstract_inverted_index.accuracy) | 195 |
| abstract_inverted_index.accuracy, | 128 |
| abstract_inverted_index.accurate, | 181 |
| abstract_inverted_index.annotated | 93 |
| abstract_inverted_index.assessing | 10 |
| abstract_inverted_index.available | 52 |
| abstract_inverted_index.conducted | 138 |
| abstract_inverted_index.developed | 102 |
| abstract_inverted_index.diarrhea, | 80 |
| abstract_inverted_index.excellent | 228 |
| abstract_inverted_index.free-text | 29, 42 |
| abstract_inverted_index.potential | 246 |
| abstract_inverted_index.showcased | 160 |
| abstract_inverted_index.templates | 236 |
| abstract_inverted_index.treatment | 14 |
| abstract_inverted_index.validated | 117 |
| abstract_inverted_index.(abdominal | 78, 164 |
| abstract_inverted_index.(diarrhea) | 167 |
| abstract_inverted_index.Background | 1 |
| abstract_inverted_index.California | 110 |
| abstract_inverted_index.University | 108 |
| abstract_inverted_index.Widespread | 239 |
| abstract_inverted_index.accuracies | 161, 199 |
| abstract_inverted_index.annotators | 147 |
| abstract_inverted_index.comparable | 172 |
| abstract_inverted_index.diagnosis. | 215 |
| abstract_inverted_index.electronic | 46 |
| abstract_inverted_index.externally | 116 |
| abstract_inverted_index.extracting | 74, 224 |
| abstract_inverted_index.extraction | 23 |
| abstract_inverted_index.generalize | 193 |
| abstract_inverted_index.internally | 104 |
| abstract_inverted_index.maintained | 198 |
| abstract_inverted_index.predictive | 134 |
| abstract_inverted_index.processing | 66 |
| abstract_inverted_index.protocols. | 99 |
| abstract_inverted_index.validation | 185 |
| abstract_inverted_index.Conclusions | 216 |
| abstract_inverted_index.Inter-rater | 144 |
| abstract_inverted_index.University. | 120 |
| abstract_inverted_index.burdensome. | 34 |
| abstract_inverted_index.information | 43 |
| abstract_inverted_index.reliability | 145 |
| abstract_inverted_index.traditional | 63 |
| abstract_inverted_index.assessments. | 142 |
| abstract_inverted_index.demographics | 213 |
| abstract_inverted_index.improvement. | 57 |
| abstract_inverted_index.inflammatory | 17 |
| abstract_inverted_index.performance. | 206 |
| abstract_inverted_index.sensitivity, | 129 |
| abstract_inverted_index.specificity, | 130 |
| abstract_inverted_index.Additionally, | 136 |
| abstract_inverted_index.generalizable | 221 |
| abstract_inverted_index.heterogeneity | 233 |
| abstract_inverted_index.institutions, | 231 |
| abstract_inverted_index.institutions. | 88 |
| abstract_inverted_index.respectively. | 182 |
| abstract_inverted_index.top-performing | 157 |
| abstract_inverted_index.Patient-reported | 4 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 10 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.8100000023841858 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile.value | 0.30701341 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
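
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of how such a mapping could be turned back into running text is shown below. The function names `parse_positions` and `rebuild_abstract` are hypothetical helpers, not part of any OpenAlex client, and the sample uses only the tokens at positions 0 to 9 from the table.

```python
from typing import Dict, List


def parse_positions(cell: str) -> List[int]:
    """Parse a table cell such as '150, 183' into a list of integer positions."""
    return [int(part) for part in cell.split(",") if part.strip()]


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions mapping by sorting on position."""
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    positioned.sort()  # order tokens by their position in the abstract
    return " ".join(token for _, token in positioned)


# Subset of the rows above, limited to positions 0-9 (assumption: later
# occurrences of "and", "outcomes" and "in" are dropped to keep the sample short).
sample = {
    "Abstract": parse_positions("0"),
    "Background": parse_positions("1"),
    "and": parse_positions("2"),
    "Aims": parse_positions("3"),
    "Patient-reported": parse_positions("4"),
    "outcomes": parse_positions("5"),
    "(PROs)": parse_positions("6"),
    "are": parse_positions("7"),
    "vital": parse_positions("8"),
    "in": parse_positions("9"),
}

print(rebuild_abstract(sample))
```

Applied to the full set of rows, the same approach should yield the abstract text, with punctuation preserved because it is stored as part of each token.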