A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2402.18649
Large Language Model (LLM) systems are inherently compositional: an individual LLM serves as the core foundation, with additional layers of objects such as plugins, a sandbox, and so on. Along with their great potential, there are increasing concerns over the security of such probabilistic intelligent systems. However, existing studies on LLM security often focus on the individual LLM, without examining the ecosystem through the lens of LLM systems with other objects (e.g., Frontend, Webtool, Sandbox, and so on). In this paper, we systematically analyze the security of LLM systems instead of focusing on individual LLMs. To do so, we build on information flow and formulate the security of LLM systems as constraints on the alignment of the information flow within the LLM and between the LLM and other objects. Based on this construction and the unique probabilistic nature of LLMs, the attack surface of an LLM system can be decomposed into three key components: (1) multi-layer security analysis, (2) analysis of the existence of constraints, and (3) analysis of the robustness of these constraints. To ground this new attack surface, we propose a multi-layer, multi-step approach and apply it to a state-of-the-art LLM system, OpenAI GPT4. Our investigation exposes several security issues, not just within the LLM itself but also in its integration with other components. We found that although OpenAI GPT4 has been designed with numerous safety constraints, these constraints are still vulnerable to attackers. To further demonstrate the real-world threats posed by the discovered vulnerabilities, we construct an end-to-end attack in which an adversary can illicitly acquire the user's chat history, all without needing to manipulate the user's input or gain direct access to OpenAI GPT4. Our demo is available at: https://fzwark.github.io/LLM-System-Attack-Demo/
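
The framing in the abstract (constraints on information flow between the LLM and other objects, checked for existence and for robustness) can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the authors' formalism: the `Flow` class, the `blocks_urls` rule, the `CONSTRAINTS` table, and `check_flow` are all hypothetical names introduced here, and the URL rule is deliberately naive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Object names taken from the abstract's examples; everything else is hypothetical.
LLM, FRONTEND, WEBTOOL, SANDBOX = "LLM", "Frontend", "Webtool", "Sandbox"


@dataclass(frozen=True)
class Flow:
    """One piece of information moving from one object to another."""
    source: str
    target: str
    payload: str


# A constraint is a predicate over a flow: True means the flow is permitted.
Constraint = Callable[[Flow], bool]


def blocks_urls(flow: Flow) -> bool:
    """Illustrative (and intentionally weak) rule: reject payloads with a URL scheme."""
    return "http://" not in flow.payload and "https://" not in flow.payload


# Assumed policy table: which constraints guard which (source, target) edge.
CONSTRAINTS: Dict[Tuple[str, str], List[Constraint]] = {
    (LLM, FRONTEND): [blocks_urls],
}


def check_flow(flow: Flow, constraints: Dict[Tuple[str, str], List[Constraint]]) -> str:
    """Return 'no-constraint', 'allowed', or 'blocked' for a single flow."""
    rules = constraints.get((flow.source, flow.target))
    if not rules:
        return "no-constraint"  # existence analysis: nothing guards this edge
    return "allowed" if all(rule(flow) for rule in rules) else "blocked"


if __name__ == "__main__":
    # The constraint exists and catches the plain case.
    print(check_flow(Flow(LLM, FRONTEND, "see https://example.com"), CONSTRAINTS))
    # Robustness analysis: a trivial case change slips past the same rule.
    print(check_flow(Flow(LLM, FRONTEND, "see HTTPS://example.com"), CONSTRAINTS))
    # Existence analysis: no rule is defined for this edge at all.
    print(check_flow(Flow(WEBTOOL, LLM, "page text"), CONSTRAINTS))
```

In this sketch, an edge with no entry in the table corresponds to a missing constraint, while the upper-case variant shows a constraint that exists but is not robust. Real systems are probabilistic, so a faithful model would need to treat a rule as holding only with some probability rather than deterministically.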
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2402.18649, https://arxiv.org/pdf/2402.18649
- OA Status: green
- Cited By: 17
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4401066072
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4401066072 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2402.18649 (Digital Object Identifier)
- Title: A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-02-28 (full publication date if available)
- Authors: Fangzhou Wu, Ning Zhang, Somesh Jha, Patrick McDaniel, Chaowei Xiao (list of authors in order)
- Landing page: https://arxiv.org/abs/2402.18649 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2402.18649 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2402.18649 (direct OA link when available)
- Concepts: Political science, Computer security, Computer science (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 17 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 9, 2024: 8 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4401066072 |
| doi | https://doi.org/10.48550/arxiv.2402.18649 |
| ids.doi | https://doi.org/10.48550/arxiv.2402.18649 |
| ids.openalex | https://openalex.org/W4401066072 |
| fwci | |
| type | preprint |
| title | A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10270 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9584000110626221 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1710 |
| topics[0].subfield.display_name | Information Systems |
| topics[0].display_name | Blockchain Technology Applications and Security |
| topics[1].id | https://openalex.org/T13999 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9182999730110168 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1710 |
| topics[1].subfield.display_name | Information Systems |
| topics[1].display_name | Digital Rights Management and Security |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C17744445 |
| concepts[0].level | 0 |
| concepts[0].score | 0.4369807541370392 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[0].display_name | Political science |
| concepts[1].id | https://openalex.org/C38652104 |
| concepts[1].level | 1 |
| concepts[1].score | 0.4173579812049866 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[1].display_name | Computer security |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.31095466017723083 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| keywords[0].id | https://openalex.org/keywords/political-science |
| keywords[0].score | 0.4369807541370392 |
| keywords[0].display_name | Political science |
| keywords[1].id | https://openalex.org/keywords/computer-security |
| keywords[1].score | 0.4173579812049866 |
| keywords[1].display_name | Computer security |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.31095466017723083 |
| keywords[2].display_name | Computer science |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2402.18649 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2402.18649 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2402.18649 |
| locations[1].id | doi:10.48550/arxiv.2402.18649 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2402.18649 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100593806 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Fangzhou Wu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wu, Fangzhou |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100404908 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2266-2956 |
| authorships[1].author.display_name | Ning Zhang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhang, Ning |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5103835847 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Somesh Jha |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Jha, Somesh |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5055368149 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-2091-7484 |
| authorships[3].author.display_name | Patrick McDaniel |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | McDaniel, Patrick |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5005843046 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-7043-4926 |
| authorships[4].author.display_name | Chaowei Xiao |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Xiao, Chaowei |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2402.18649 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-07-31T00:00:00 |
| display_name | A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10270 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9584000110626221 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1710 |
| primary_topic.subfield.display_name | Information Systems |
| primary_topic.display_name | Blockchain Technology Applications and Security |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052, https://openalex.org/W2382290278, https://openalex.org/W4395014643 |
| cited_by_count | 17 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 9 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 8 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2402.18649 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2402.18649 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2402.18649 |
| primary_location.id | pmh:oai:arXiv.org:2402.18649 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2402.18649 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2402.18649 |
| publication_date | 2024-02-28 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 184 |
| abstract_inverted_index.In | 78 |
| abstract_inverted_index.To | 96, 176, 245 |
| abstract_inverted_index.We | 220 |
| abstract_inverted_index.an | 257, 261 |
| abstract_inverted_index.as | 12, 22, 114 |
| abstract_inverted_index.be | 150 |
| abstract_inverted_index.do | 97 |
| abstract_inverted_index.in | 214, 289 |
| abstract_inverted_index.is | 288 |
| abstract_inverted_index.it | 191 |
| abstract_inverted_index.of | 19, 41, 65, 86, 90, 103, 111, 119, 140, 145, 162, 165, 170, 173, 251 |
| abstract_inverted_index.on | 49, 54, 92, 101, 116, 132 |
| abstract_inverted_index.or | 279 |
| abstract_inverted_index.so | 26, 76 |
| abstract_inverted_index.to | 192, 232, 243, 274, 283 |
| abstract_inverted_index.we | 81, 99, 182, 255 |
| abstract_inverted_index.(1) | 156 |
| abstract_inverted_index.(2) | 160 |
| abstract_inverted_index.(3) | 168 |
| abstract_inverted_index.LLM | 10, 50, 66, 87, 112, 124, 127, 147, 195, 209 |
| abstract_inverted_index.Our | 199, 286 |
| abstract_inverted_index.all | 270 |
| abstract_inverted_index.and | 25, 75, 107, 125, 128, 135, 167, 186, 189 |
| abstract_inverted_index.are | 5, 34, 240 |
| abstract_inverted_index.but | 57, 212 |
| abstract_inverted_index.can | 149, 263 |
| abstract_inverted_index.has | 227 |
| abstract_inverted_index.its | 215, 234 |
| abstract_inverted_index.key | 154 |
| abstract_inverted_index.new | 179 |
| abstract_inverted_index.not | 205 |
| abstract_inverted_index.on. | 27 |
| abstract_inverted_index.our | 252 |
| abstract_inverted_index.so, | 98 |
| abstract_inverted_index.the | 13, 30, 39, 60, 63, 84, 93, 104, 109, 117, 120, 136, 142, 146, 163, 171, 193, 208, 224, 248, 266, 272, 276, 290 |
| abstract_inverted_index.top | 102 |
| abstract_inverted_index.GPT4 | 226 |
| abstract_inverted_index.LLM, | 56, 141 |
| abstract_inverted_index.also | 35, 213 |
| abstract_inverted_index.chat | 268 |
| abstract_inverted_index.core | 14 |
| abstract_inverted_index.demo | 287 |
| abstract_inverted_index.flow | 106, 122 |
| abstract_inverted_index.gain | 280 |
| abstract_inverted_index.into | 152 |
| abstract_inverted_index.just | 206 |
| abstract_inverted_index.lens | 64 |
| abstract_inverted_index.need | 273 |
| abstract_inverted_index.on). | 77 |
| abstract_inverted_index.over | 38 |
| abstract_inverted_index.such | 21, 42 |
| abstract_inverted_index.that | 222 |
| abstract_inverted_index.this | 79, 133, 178 |
| abstract_inverted_index.with | 8, 16, 29, 68, 217 |
| abstract_inverted_index.(LLM) | 3 |
| abstract_inverted_index.Along | 28 |
| abstract_inverted_index.Based | 131 |
| abstract_inverted_index.GPT4. | 198, 285 |
| abstract_inverted_index.LLMs. | 95 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.Model | 2 |
| abstract_inverted_index.apply | 190 |
| abstract_inverted_index.build | 100 |
| abstract_inverted_index.focus | 53 |
| abstract_inverted_index.found | 221 |
| abstract_inverted_index.great | 31 |
| abstract_inverted_index.input | 278 |
| abstract_inverted_index.link: | 291 |
| abstract_inverted_index.model | 210 |
| abstract_inverted_index.often | 52 |
| abstract_inverted_index.other | 69, 129, 218 |
| abstract_inverted_index.still | 241 |
| abstract_inverted_index.there | 33 |
| abstract_inverted_index.these | 174, 237 |
| abstract_inverted_index.three | 153 |
| abstract_inverted_index.where | 260 |
| abstract_inverted_index.(e.g., | 71 |
| abstract_inverted_index.OpenAI | 197, 225, 284 |
| abstract_inverted_index.access | 282 |
| abstract_inverted_index.attack | 143, 180, 259 |
| abstract_inverted_index.direct | 281 |
| abstract_inverted_index.ground | 177 |
| abstract_inverted_index.itself | 211 |
| abstract_inverted_index.layers | 18 |
| abstract_inverted_index.nature | 139 |
| abstract_inverted_index.paper, | 80 |
| abstract_inverted_index.safety | 230, 235, 238 |
| abstract_inverted_index.system | 148 |
| abstract_inverted_index.unique | 137 |
| abstract_inverted_index.user's | 267, 277 |
| abstract_inverted_index.within | 123, 207 |
| abstract_inverted_index.acquire | 265 |
| abstract_inverted_index.analyze | 83 |
| abstract_inverted_index.between | 126 |
| abstract_inverted_index.exposes | 201 |
| abstract_inverted_index.further | 246 |
| abstract_inverted_index.improve | 233 |
| abstract_inverted_index.instead | 89 |
| abstract_inverted_index.issues, | 204 |
| abstract_inverted_index.objects | 20, 70 |
| abstract_inverted_index.propose | 183 |
| abstract_inverted_index.serving | 11 |
| abstract_inverted_index.several | 202 |
| abstract_inverted_index.studies | 48 |
| abstract_inverted_index.surface | 144 |
| abstract_inverted_index.system, | 196 |
| abstract_inverted_index.systems | 4, 67, 113 |
| abstract_inverted_index.threats | 250 |
| abstract_inverted_index.through | 62 |
| abstract_inverted_index.without | 58, 271 |
| abstract_inverted_index.However, | 46 |
| abstract_inverted_index.Language | 1 |
| abstract_inverted_index.Sandbox, | 74 |
| abstract_inverted_index.Webtool, | 73 |
| abstract_inverted_index.although | 223 |
| abstract_inverted_index.analysis | 161, 169 |
| abstract_inverted_index.approach | 188 |
| abstract_inverted_index.concerns | 37 |
| abstract_inverted_index.designed | 228 |
| abstract_inverted_index.existing | 47 |
| abstract_inverted_index.focusing | 91 |
| abstract_inverted_index.history, | 269 |
| abstract_inverted_index.numerous | 229 |
| abstract_inverted_index.objects. | 130 |
| abstract_inverted_index.plugins, | 23 |
| abstract_inverted_index.sandbox, | 24 |
| abstract_inverted_index.security | 40, 51, 85, 110, 158, 203 |
| abstract_inverted_index.surface, | 181 |
| abstract_inverted_index.systems, | 88 |
| abstract_inverted_index.systems. | 45 |
| abstract_inverted_index.Frontend, | 72 |
| abstract_inverted_index.adversary | 262 |
| abstract_inverted_index.alignment | 118 |
| abstract_inverted_index.analysis, | 159 |
| abstract_inverted_index.construct | 256 |
| abstract_inverted_index.ecosystem | 61 |
| abstract_inverted_index.examining | 59 |
| abstract_inverted_index.existence | 164 |
| abstract_inverted_index.features, | 236 |
| abstract_inverted_index.formulate | 108 |
| abstract_inverted_index.illicitly | 264 |
| abstract_inverted_index.additional | 17 |
| abstract_inverted_index.attackers. | 244 |
| abstract_inverted_index.decomposed | 151 |
| abstract_inverted_index.discovered | 253 |
| abstract_inverted_index.end-to-end | 258 |
| abstract_inverted_index.foundation | 15 |
| abstract_inverted_index.increasing | 36 |
| abstract_inverted_index.individual | 9, 55, 94 |
| abstract_inverted_index.inherently | 6 |
| abstract_inverted_index.manipulate | 275 |
| abstract_inverted_index.multi-step | 187 |
| abstract_inverted_index.potential, | 32 |
| abstract_inverted_index.real-world | 249 |
| abstract_inverted_index.robustness | 172 |
| abstract_inverted_index.vulnerable | 242 |
| abstract_inverted_index.components. | 219 |
| abstract_inverted_index.components: | 155 |
| abstract_inverted_index.constraints | 115, 231, 239 |
| abstract_inverted_index.demonstrate | 247 |
| abstract_inverted_index.information | 105, 121 |
| abstract_inverted_index.integration | 216 |
| abstract_inverted_index.intelligent | 44 |
| abstract_inverted_index.multi-layer | 157, 185 |
| abstract_inverted_index.constraints, | 166 |
| abstract_inverted_index.constraints. | 175 |
| abstract_inverted_index.construction | 134 |
| abstract_inverted_index.state-of-art | 194 |
| abstract_inverted_index.investigation | 200 |
| abstract_inverted_index.probabilistic | 43, 138 |
| abstract_inverted_index.compositional, | 7 |
| abstract_inverted_index.systematically | 82 |
| abstract_inverted_index.vulnerabilities, | 254 |
| abstract_inverted_index.https://fzwark.github.io/LLM-System-Attack-Demo/ | 292 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
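
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text follows; the function name `reconstruct_abstract` is hypothetical, and the sample keeps only positions 0 to 7 from the table rather than the full index.

```python
from typing import Dict, List


def reconstruct_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions map by placing each token at its positions."""
    by_position: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    # Join tokens in position order; gaps in the positions are simply skipped.
    return " ".join(by_position[i] for i in sorted(by_position))


# Subset of the rows in the table above, restricted to positions 0-7.
sample_index = {
    "compositional,": [7],
    "inherently": [6],
    "are": [5],
    "systems": [4],
    "(LLM)": [3],
    "Model": [2],
    "Language": [1],
    "Large": [0],
}

if __name__ == "__main__":
    print(reconstruct_abstract(sample_index))
    # prints: Large Language Model (LLM) systems are inherently compositional,
```

Because tokens keep their attached punctuation (for example `compositional,`), joining on single spaces recovers the original wording, though not any original line breaks.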