Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task
2025 · Open Access · DOI: https://doi.org/10.17615/zf8m-w435
Recently, ChatGPT and GPT-4 have emerged and gained immense global attention due to their unparalleled performance in language processing. Despite demonstrating impressive capability in various open-domain tasks, their adequacy in highly specific fields like radiology remains untested. Radiology presents unique linguistic phenomena distinct from open-domain data due to its specificity and complexity. Assessing the performance of large language models (LLMs) in such specific domains is crucial not only for a thorough evaluation of their overall performance but also for providing valuable insights into future model design directions: whether model design should be generic or domain-specific. To this end, we evaluate the performance of ChatGPT/GPT-4 on a radiology natural language inference (NLI) task and compare it to other models fine-tuned specifically on task-related data samples. We also conduct a comprehensive investigation of ChatGPT/GPT-4's reasoning ability by introducing varying levels of inference difficulty. Our results show that 1) ChatGPT and GPT-4 outperform other LLMs in the radiology NLI task; 2) other specifically fine-tuned BERT-based models require substantial amounts of data to achieve performance comparable to ChatGPT/GPT-4. These findings not only demonstrate the feasibility and promise of constructing a generic model capable of addressing various tasks across different domains, but also highlight several key factors crucial for developing a unified model, particularly in a medical context, paving the way for future artificial general intelligence (AGI) systems. We release our code and data to the research community.
- Type
- article
- Language
- en
- Landing Page
- https://doi.org/10.17615/zf8m-w435
- OA Status
- green
- OpenAlex ID
- https://openalex.org/W4416023466
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4416023466
- DOI
- https://doi.org/10.17615/zf8m-w435
- Title
- Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task
- Type
- article
- Language
- en
- Publication year
- 2025
- Publication date
- 2025-02-20
- Authors
- Zihao Wu, Lu Zhang, Chao Cao, Xiaowei Yu, Zhengliang Liu, Lin Zhao, Yiwei Li, Haixing Dai, Chong Ma, Gang Li, Wei Liu, Quanzheng Li, Dinggang Shen, Xiang Li, Dajiang Zhu, Tianming Liu
- Landing page
- https://doi.org/10.17615/zf8m-w435
- Open access
- Yes
- OA status
- green
- OA URL
- https://doi.org/10.17615/zf8m-w435
- Cited by
- 0
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4416023466 |
| doi | https://doi.org/10.17615/zf8m-w435 |
| ids.doi | https://doi.org/10.17615/zf8m-w435 |
| ids.openalex | https://openalex.org/W4416023466 |
| fwci | |
| type | article |
| title | Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | doi:10.17615/zf8m-w435 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S7407051488 |
| locations[0].source.type | repository |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | UNC Libraries |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | article-journal |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.17615/zf8m-w435 |
| indexed_in | datacite |
| authorships[0].author.id | https://openalex.org/A5037769489 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4389-2980 |
| authorships[0].author.display_name | Zihao Wu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wu, Zihao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5035359666 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-8859-5453 |
| authorships[1].author.display_name | Lu Zhang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhang, Lu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101308694 |
| authorships[2].author.orcid | https://orcid.org/0009-0009-6844-519X |
| authorships[2].author.display_name | Chao Cao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Cao, Chao |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5062577958 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-5599-8424 |
| authorships[3].author.display_name | Xiaowei Yu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Yu, Xiaowei |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101505879 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-7061-6714 |
| authorships[4].author.display_name | Zhengliang Liu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Liu, Zhengliang |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5100351935 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-2567-627X |
| authorships[5].author.display_name | Lin Zhao |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Zhao, Lin |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5100633406 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-5203-0290 |
| authorships[6].author.display_name | Yiwei Li |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Li, Yiwei |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5041058074 |
| authorships[7].author.orcid | https://orcid.org/0000-0003-0409-6129 |
| authorships[7].author.display_name | Haixing Dai |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Dai, Haixing |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5100410928 |
| authorships[8].author.orcid | https://orcid.org/0009-0001-8192-9676 |
| authorships[8].author.display_name | Chong Ma |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Ma, Chong |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5100438653 |
| authorships[9].author.orcid | https://orcid.org/0000-0001-9585-1382 |
| authorships[9].author.display_name | Gang Li |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Li, Gang |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5100431708 |
| authorships[10].author.orcid | https://orcid.org/0000-0001-9475-6455 |
| authorships[10].author.display_name | Wei Liu |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Liu, Wei |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5058429770 |
| authorships[11].author.orcid | https://orcid.org/0000-0002-9651-5820 |
| authorships[11].author.display_name | Quanzheng Li |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Li, Quanzheng |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5000937401 |
| authorships[12].author.orcid | https://orcid.org/0000-0002-7934-5698 |
| authorships[12].author.display_name | Dinggang Shen |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Shen, Dinggang |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5100343095 |
| authorships[13].author.orcid | https://orcid.org/0000-0002-3423-5065 |
| authorships[13].author.display_name | Xiang Li |
| authorships[13].author_position | middle |
| authorships[13].raw_author_name | Li, Xiang |
| authorships[13].is_corresponding | False |
| authorships[14].author.id | https://openalex.org/A5038582366 |
| authorships[14].author.orcid | https://orcid.org/0000-0002-6940-3911 |
| authorships[14].author.display_name | Dajiang Zhu |
| authorships[14].author_position | middle |
| authorships[14].raw_author_name | Zhu, Dajiang |
| authorships[14].is_corresponding | False |
| authorships[15].author.id | https://openalex.org/A5100647159 |
| authorships[15].author.orcid | |
| authorships[15].author.display_name | Tianming Liu |
| authorships[15].author_position | last |
| authorships[15].raw_author_name | Liu, Tianming |
| authorships[15].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.17615/zf8m-w435 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-09T23:09:16.995542 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.17615/zf8m-w435 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S7407051488 |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | UNC Libraries |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | article-journal |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.17615/zf8m-w435 |
| primary_location.id | doi:10.17615/zf8m-w435 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S7407051488 |
| primary_location.source.type | repository |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | UNC Libraries |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | |
| primary_location.raw_type | article-journal |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.17615/zf8m-w435 |
| publication_date | 2025-02-20 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 69, 108, 130, 189, 209, 214 |
| abstract_inverted_index.1) | 148 |
| abstract_inverted_index.2) | 160 |
| abstract_inverted_index.To | 95 |
| abstract_inverted_index.We | 127, 227 |
| abstract_inverted_index.be | 91 |
| abstract_inverted_index.by | 137 |
| abstract_inverted_index.in | 16, 23, 29, 60, 98, 155, 213 |
| abstract_inverted_index.is | 64 |
| abstract_inverted_index.it | 117 |
| abstract_inverted_index.of | 55, 72, 105, 141, 169, 187, 193 |
| abstract_inverted_index.on | 107, 123, 133 |
| abstract_inverted_index.or | 93 |
| abstract_inverted_index.to | 12, 47, 118, 172, 176, 233 |
| abstract_inverted_index.we | 101 |
| abstract_inverted_index.NLI | 158 |
| abstract_inverted_index.Our | 144 |
| abstract_inverted_index.and | 2, 6, 50, 115, 150, 185, 231 |
| abstract_inverted_index.but | 76, 200 |
| abstract_inverted_index.due | 11, 46 |
| abstract_inverted_index.for | 68, 78, 207, 220 |
| abstract_inverted_index.its | 48 |
| abstract_inverted_index.key | 204 |
| abstract_inverted_index.not | 66, 180 |
| abstract_inverted_index.our | 229 |
| abstract_inverted_index.the | 53, 103, 156, 183, 218, 234 |
| abstract_inverted_index.way | 219 |
| abstract_inverted_index.LLMs | 154 |
| abstract_inverted_index.also | 77, 128, 201 |
| abstract_inverted_index.code | 230 |
| abstract_inverted_index.data | 45, 125, 170, 232 |
| abstract_inverted_index.end, | 97 |
| abstract_inverted_index.from | 43 |
| abstract_inverted_index.have | 4 |
| abstract_inverted_index.into | 82 |
| abstract_inverted_index.like | 33 |
| abstract_inverted_index.only | 67, 181 |
| abstract_inverted_index.show | 146 |
| abstract_inverted_index.such | 61 |
| abstract_inverted_index.task | 114 |
| abstract_inverted_index.that | 147 |
| abstract_inverted_index.this | 96, 99 |
| abstract_inverted_index.(AGI) | 225 |
| abstract_inverted_index.(NLI) | 113 |
| abstract_inverted_index.GPT-4 | 3, 151 |
| abstract_inverted_index.These | 178 |
| abstract_inverted_index.large | 56 |
| abstract_inverted_index.model | 84, 88, 191 |
| abstract_inverted_index.other | 119, 153, 161 |
| abstract_inverted_index.task; | 159 |
| abstract_inverted_index.tasks | 196 |
| abstract_inverted_index.their | 13, 27, 73 |
| abstract_inverted_index.(LLMs) | 59 |
| abstract_inverted_index.across | 197 |
| abstract_inverted_index.design | 85, 89 |
| abstract_inverted_index.fields | 32 |
| abstract_inverted_index.future | 83, 221 |
| abstract_inverted_index.gained | 7 |
| abstract_inverted_index.global | 9 |
| abstract_inverted_index.highly | 30 |
| abstract_inverted_index.levels | 140 |
| abstract_inverted_index.model, | 211 |
| abstract_inverted_index.models | 58, 120, 165 |
| abstract_inverted_index.paving | 217 |
| abstract_inverted_index.should | 90 |
| abstract_inverted_index.study, | 100 |
| abstract_inverted_index.tasks, | 26 |
| abstract_inverted_index.unique | 39 |
| abstract_inverted_index.ChatGPT | 1, 149 |
| abstract_inverted_index.Despite | 19 |
| abstract_inverted_index.ability | 136 |
| abstract_inverted_index.achieve | 173 |
| abstract_inverted_index.amounts | 168 |
| abstract_inverted_index.capable | 192 |
| abstract_inverted_index.compare | 116 |
| abstract_inverted_index.conduct | 129 |
| abstract_inverted_index.crucial | 65, 206 |
| abstract_inverted_index.domains | 63 |
| abstract_inverted_index.emerged | 5 |
| abstract_inverted_index.factors | 205 |
| abstract_inverted_index.general | 223 |
| abstract_inverted_index.generic | 92, 190 |
| abstract_inverted_index.immense | 8 |
| abstract_inverted_index.medical | 215 |
| abstract_inverted_index.natural | 110 |
| abstract_inverted_index.overall | 74 |
| abstract_inverted_index.promise | 186 |
| abstract_inverted_index.release | 228 |
| abstract_inverted_index.remains | 35 |
| abstract_inverted_index.require | 166 |
| abstract_inverted_index.results | 145 |
| abstract_inverted_index.samples | 171 |
| abstract_inverted_index.several | 203 |
| abstract_inverted_index.unified | 210 |
| abstract_inverted_index.various | 24, 195 |
| abstract_inverted_index.varying | 139 |
| abstract_inverted_index.whether | 87 |
| abstract_inverted_index.adequacy | 28 |
| abstract_inverted_index.context, | 216 |
| abstract_inverted_index.distinct | 42 |
| abstract_inverted_index.domains, | 199 |
| abstract_inverted_index.evaluate | 102 |
| abstract_inverted_index.findings | 179 |
| abstract_inverted_index.insights | 81 |
| abstract_inverted_index.language | 17, 57, 111 |
| abstract_inverted_index.presents | 38 |
| abstract_inverted_index.research | 235 |
| abstract_inverted_index.samples. | 126 |
| abstract_inverted_index.specific | 31, 62 |
| abstract_inverted_index.systems. | 226 |
| abstract_inverted_index.thorough | 70 |
| abstract_inverted_index.valuable | 80 |
| abstract_inverted_index.Assessing | 52 |
| abstract_inverted_index.Radiology | 37 |
| abstract_inverted_index.Recently, | 0 |
| abstract_inverted_index.attention | 10 |
| abstract_inverted_index.different | 198 |
| abstract_inverted_index.highlight | 202 |
| abstract_inverted_index.inference | 112, 142 |
| abstract_inverted_index.phenomena | 41 |
| abstract_inverted_index.providing | 79 |
| abstract_inverted_index.radiology | 34, 109, 157 |
| abstract_inverted_index.reasoning | 135 |
| abstract_inverted_index.untested. | 36 |
| abstract_inverted_index.Bert-based | 164 |
| abstract_inverted_index.addressing | 194 |
| abstract_inverted_index.artificial | 222 |
| abstract_inverted_index.capability | 22 |
| abstract_inverted_index.comparable | 174 |
| abstract_inverted_index.developing | 208 |
| abstract_inverted_index.evaluation | 71 |
| abstract_inverted_index.fine-tuned | 121, 163 |
| abstract_inverted_index.impressive | 21 |
| abstract_inverted_index.linguistic | 40 |
| abstract_inverted_index.outperform | 152 |
| abstract_inverted_index.complexity. | 51 |
| abstract_inverted_index.demonstrate | 182 |
| abstract_inverted_index.difficulty. | 143 |
| abstract_inverted_index.directions: | 86 |
| abstract_inverted_index.feasibility | 184 |
| abstract_inverted_index.introducing | 138 |
| abstract_inverted_index.open-domain | 25, 44 |
| abstract_inverted_index.performance | 15, 54, 75, 104, 175 |
| abstract_inverted_index.processing. | 18 |
| abstract_inverted_index.significant | 167 |
| abstract_inverted_index.specificity | 49 |
| abstract_inverted_index.constructing | 188 |
| abstract_inverted_index.intelligence | 224 |
| abstract_inverted_index.particularly | 212 |
| abstract_inverted_index.specifically | 122, 162 |
| abstract_inverted_index.task-related | 124 |
| abstract_inverted_index.unparalleled | 14 |
| abstract_inverted_index.ChatGPT/GPT-4 | 106 |
| abstract_inverted_index.comprehensive | 131 |
| abstract_inverted_index.demonstrating | 20 |
| abstract_inverted_index.investigation | 132 |
| abstract_inverted_index.ChatGPT/GPT-4. | 177 |
| abstract_inverted_index.ChatGPT/GPT-4's | 134 |
| abstract_inverted_index.domain-specific. | 94 |
| abstract_inverted_index.community$^\ddagger$. | 236 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 16 |
| citation_normalized_percentile | |
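
The `abstract_inverted_index.*` rows in the table map each token of the abstract to the word positions where it occurs. As a minimal sketch of how such an index can be turned back into running text, the Python below sorts every (position, token) pair and joins the tokens with spaces. The function names `parse_positions` and `rebuild_abstract` are hypothetical helpers introduced here for illustration, and the sample dictionary is a hand-copied subset of the rows, truncated to positions 0 through 10.

```python
from typing import Dict, List


def parse_positions(cell: str) -> List[int]:
    """Parse a table cell such as '69, 108, 130' into a list of integers."""
    return [int(part) for part in cell.split(",") if part.strip()]


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions mapping (hypothetical helper)."""
    # Flatten the mapping into (position, token) pairs.
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Subset of the table rows, limited to positions 0-10 (assumption: the
# remaining positions of each token are omitted for brevity).
sample = {
    "Recently,": parse_positions("0"),
    "ChatGPT": parse_positions("1"),
    "and": parse_positions("2, 6"),
    "GPT-4": parse_positions("3"),
    "have": parse_positions("4"),
    "emerged": parse_positions("5"),
    "gained": parse_positions("7"),
    "immense": parse_positions("8"),
    "global": parse_positions("9"),
    "attention": parse_positions("10"),
}

print(rebuild_abstract(sample))
```

Applied to the full set of rows, the same approach should reproduce the abstract as indexed, including the original punctuation attached to each token.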