A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems Article Swipe

PDF

Fangzhou Wu , Ning Zhang , Somesh Jha , Patrick McDaniel , Chaowei Xiao ·

YOU? · · 2024 · Open Access · · DOI: https://doi.org/10.48550/arxiv.2402.18649

Large Language Model (LLM) systems are inherently compositional, with individual LLM serving as the core foundation with additional layers of objects such as plugins, sandbox, and so on. Along with the great potential, there are also increasing concerns over the security of such probabilistic intelligent systems. However, existing studies on LLM security often focus on individual LLM, but without examining the ecosystem through the lens of LLM systems with other objects (e.g., Frontend, Webtool, Sandbox, and so on). In this paper, we systematically analyze the security of LLM systems, instead of focusing on the individual LLMs. To do so, we build on top of the information flow and formulate the security of LLM systems as constraints on the alignment of the information flow within LLM and between LLM and other objects. Based on this construction and the unique probabilistic nature of LLM, the attack surface of the LLM system can be decomposed into three key components: (1) multi-layer security analysis, (2) analysis of the existence of constraints, and (3) analysis of the robustness of these constraints. To ground this new attack surface, we propose a multi-layer and multi-step approach and apply it to the state-of-art LLM system, OpenAI GPT4. Our investigation exposes several security issues, not just within the LLM model itself but also in its integration with other components. We found that although the OpenAI GPT4 has designed numerous safety constraints to improve its safety features, these safety constraints are still vulnerable to attackers. To further demonstrate the real-world threats of our discovered vulnerabilities, we construct an end-to-end attack where an adversary can illicitly acquire the user's chat history, all without the need to manipulate the user's input or gain direct access to OpenAI GPT4. Our demo is in the link: https://fzwark.github.io/LLM-System-Attack-Demo/

Related Topics

Computer Security

Computer Science

Concepts

Political science Computer security Computer science

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2402.18649
PDF: https://arxiv.org/pdf/2402.18649
OA Status: green
Cited By: 17
Related Works: 10
OpenAlex ID: https://openalex.org/W4401066072

All OpenAlex metadata

Raw OpenAlex JSON

OpenAlex ID: https://openalex.org/W4401066072

Canonical identifier for this work in OpenAlex
DOI: https://doi.org/10.48550/arxiv.2402.18649

Digital Object Identifier
Title: A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems

Work title
Type: preprint

OpenAlex work type
Language: en

Primary language
Publication year: 2024

Year of publication
Publication date: 2024-02-28

Full publication date if available
Authors: Fangzhou Wu, Ning Zhang, Somesh Jha, Patrick McDaniel, Chaowei Xiao

List of authors in order
Landing page: https://arxiv.org/abs/2402.18649

Publisher landing page
PDF URL: https://arxiv.org/pdf/2402.18649

Direct link to full text PDF
Open access: Yes

Whether a free full text is available
OA status: green

Open access status per OpenAlex
OA URL: https://arxiv.org/pdf/2402.18649

Direct OA link when available
Concepts: Political science, Computer security, Computer science

Top concepts (fields/topics) attached by OpenAlex
Cited by: 17

Total citation count in OpenAlex
Citations by year (recent): 2025: 9, 2024: 8

Per-year citation counts (last 5 years)
Related works (count): 10

Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4401066072
doi	https://doi.org/10.48550/arxiv.2402.18649
ids.doi	https://doi.org/10.48550/arxiv.2402.18649
ids.openalex	https://openalex.org/W4401066072
fwci
type	preprint
title	A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T10270
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.9584000110626221
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1710
topics[0].subfield.display_name	Information Systems
topics[0].display_name	Blockchain Technology Applications and Security
topics[1].id	https://openalex.org/T13999
topics[1].field.id	https://openalex.org/fields/17
topics[1].field.display_name	Computer Science
topics[1].score	0.9182999730110168
topics[1].domain.id	https://openalex.org/domains/3
topics[1].domain.display_name	Physical Sciences
topics[1].subfield.id	https://openalex.org/subfields/1710
topics[1].subfield.display_name	Information Systems
topics[1].display_name	Digital Rights Management and Security
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C17744445
concepts[0].level	0
concepts[0].score	0.4369807541370392
concepts[0].wikidata	https://www.wikidata.org/wiki/Q36442
concepts[0].display_name	Political science
concepts[1].id	https://openalex.org/C38652104
concepts[1].level	1
concepts[1].score	0.4173579812049866
concepts[1].wikidata	https://www.wikidata.org/wiki/Q3510521
concepts[1].display_name	Computer security
concepts[2].id	https://openalex.org/C41008148
concepts[2].level	0
concepts[2].score	0.31095466017723083
concepts[2].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[2].display_name	Computer science
keywords[0].id	https://openalex.org/keywords/political-science
keywords[0].score	0.4369807541370392
keywords[0].display_name	Political science
keywords[1].id	https://openalex.org/keywords/computer-security
keywords[1].score	0.4173579812049866
keywords[1].display_name	Computer security
keywords[2].id	https://openalex.org/keywords/computer-science
keywords[2].score	0.31095466017723083
keywords[2].display_name	Computer science
language	en
locations[0].id	pmh:oai:arXiv.org:2402.18649
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2402.18649
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2402.18649
locations[1].id	doi:10.48550/arxiv.2402.18649
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2402.18649
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5100593806
authorships[0].author.orcid
authorships[0].author.display_name	Fangzhou Wu
authorships[0].author_position	first
authorships[0].raw_author_name	Wu, Fangzhou
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5100404908
authorships[1].author.orcid	https://orcid.org/0000-0003-2266-2956
authorships[1].author.display_name	Ning Zhang
authorships[1].author_position	middle
authorships[1].raw_author_name	Zhang, Ning
authorships[1].is_corresponding	False
authorships[2].author.id	https://openalex.org/A5103835847
authorships[2].author.orcid
authorships[2].author.display_name	Somesh Jha
authorships[2].author_position	middle
authorships[2].raw_author_name	Jha, Somesh
authorships[2].is_corresponding	False
authorships[3].author.id	https://openalex.org/A5055368149
authorships[3].author.orcid	https://orcid.org/0000-0003-2091-7484
authorships[3].author.display_name	Patrick McDaniel
authorships[3].author_position	middle
authorships[3].raw_author_name	McDaniel, Patrick
authorships[3].is_corresponding	False
authorships[4].author.id	https://openalex.org/A5005843046
authorships[4].author.orcid	https://orcid.org/0000-0002-7043-4926
authorships[4].author.display_name	Chaowei Xiao
authorships[4].author_position	last
authorships[4].raw_author_name	Xiao, Chaowei
authorships[4].is_corresponding	False
has_content.pdf	False
has_content.grobid_xml	False
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2402.18649
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2024-07-31T00:00:00
display_name	A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems
has_fulltext	False
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T10270
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.9584000110626221
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1710
primary_topic.subfield.display_name	Information Systems
primary_topic.display_name	Blockchain Technology Applications and Security
related_works	https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052, https://openalex.org/W2382290278, https://openalex.org/W4395014643
cited_by_count	17
counts_by_year[0].year	2025
counts_by_year[0].cited_by_count	9
counts_by_year[1].year	2024
counts_by_year[1].cited_by_count	8
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2402.18649
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2402.18649
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2402.18649
primary_location.id	pmh:oai:arXiv.org:2402.18649
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2402.18649
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2402.18649
publication_date	2024-02-28
publication_year	2024
referenced_works_count	0
abstract_inverted_index.a	184
abstract_inverted_index.In	78
abstract_inverted_index.To	96, 176, 245
abstract_inverted_index.We	220
abstract_inverted_index.an	257, 261
abstract_inverted_index.as	12, 22, 114
abstract_inverted_index.be	150
abstract_inverted_index.do	97
abstract_inverted_index.in	214, 289
abstract_inverted_index.is	288
abstract_inverted_index.it	191
abstract_inverted_index.of	19, 41, 65, 86, 90, 103, 111, 119, 140, 145, 162, 165, 170, 173, 251
abstract_inverted_index.on	49, 54, 92, 101, 116, 132
abstract_inverted_index.or	279
abstract_inverted_index.so	26, 76
abstract_inverted_index.to	192, 232, 243, 274, 283
abstract_inverted_index.we	81, 99, 182, 255
abstract_inverted_index.(1)	156
abstract_inverted_index.(2)	160
abstract_inverted_index.(3)	168
abstract_inverted_index.LLM	10, 50, 66, 87, 112, 124, 127, 147, 195, 209
abstract_inverted_index.Our	199, 286
abstract_inverted_index.all	270
abstract_inverted_index.and	25, 75, 107, 125, 128, 135, 167, 186, 189
abstract_inverted_index.are	5, 34, 240
abstract_inverted_index.but	57, 212
abstract_inverted_index.can	149, 263
abstract_inverted_index.has	227
abstract_inverted_index.its	215, 234
abstract_inverted_index.key	154
abstract_inverted_index.new	179
abstract_inverted_index.not	205
abstract_inverted_index.on.	27
abstract_inverted_index.our	252
abstract_inverted_index.so,	98
abstract_inverted_index.the	13, 30, 39, 60, 63, 84, 93, 104, 109, 117, 120, 136, 142, 146, 163, 171, 193, 208, 224, 248, 266, 272, 276, 290
abstract_inverted_index.top	102
abstract_inverted_index.GPT4	226
abstract_inverted_index.LLM,	56, 141
abstract_inverted_index.also	35, 213
abstract_inverted_index.chat	268
abstract_inverted_index.core	14
abstract_inverted_index.demo	287
abstract_inverted_index.flow	106, 122
abstract_inverted_index.gain	280
abstract_inverted_index.into	152
abstract_inverted_index.just	206
abstract_inverted_index.lens	64
abstract_inverted_index.need	273
abstract_inverted_index.on).	77
abstract_inverted_index.over	38
abstract_inverted_index.such	21, 42
abstract_inverted_index.that	222
abstract_inverted_index.this	79, 133, 178
abstract_inverted_index.with	8, 16, 29, 68, 217
abstract_inverted_index.(LLM)	3
abstract_inverted_index.Along	28
abstract_inverted_index.Based	131
abstract_inverted_index.GPT4.	198, 285
abstract_inverted_index.LLMs.	95
abstract_inverted_index.Large	0
abstract_inverted_index.Model	2
abstract_inverted_index.apply	190
abstract_inverted_index.build	100
abstract_inverted_index.focus	53
abstract_inverted_index.found	221
abstract_inverted_index.great	31
abstract_inverted_index.input	278
abstract_inverted_index.link:	291
abstract_inverted_index.model	210
abstract_inverted_index.often	52
abstract_inverted_index.other	69, 129, 218
abstract_inverted_index.still	241
abstract_inverted_index.there	33
abstract_inverted_index.these	174, 237
abstract_inverted_index.three	153
abstract_inverted_index.where	260
abstract_inverted_index.(e.g.,	71
abstract_inverted_index.OpenAI	197, 225, 284
abstract_inverted_index.access	282
abstract_inverted_index.attack	143, 180, 259
abstract_inverted_index.direct	281
abstract_inverted_index.ground	177
abstract_inverted_index.itself	211
abstract_inverted_index.layers	18
abstract_inverted_index.nature	139
abstract_inverted_index.paper,	80
abstract_inverted_index.safety	230, 235, 238
abstract_inverted_index.system	148
abstract_inverted_index.unique	137
abstract_inverted_index.user's	267, 277
abstract_inverted_index.within	123, 207
abstract_inverted_index.acquire	265
abstract_inverted_index.analyze	83
abstract_inverted_index.between	126
abstract_inverted_index.exposes	201
abstract_inverted_index.further	246
abstract_inverted_index.improve	233
abstract_inverted_index.instead	89
abstract_inverted_index.issues,	204
abstract_inverted_index.objects	20, 70
abstract_inverted_index.propose	183
abstract_inverted_index.serving	11
abstract_inverted_index.several	202
abstract_inverted_index.studies	48
abstract_inverted_index.surface	144
abstract_inverted_index.system,	196
abstract_inverted_index.systems	4, 67, 113
abstract_inverted_index.threats	250
abstract_inverted_index.through	62
abstract_inverted_index.without	58, 271
abstract_inverted_index.However,	46
abstract_inverted_index.Language	1
abstract_inverted_index.Sandbox,	74
abstract_inverted_index.Webtool,	73
abstract_inverted_index.although	223
abstract_inverted_index.analysis	161, 169
abstract_inverted_index.approach	188
abstract_inverted_index.concerns	37
abstract_inverted_index.designed	228
abstract_inverted_index.existing	47
abstract_inverted_index.focusing	91
abstract_inverted_index.history,	269
abstract_inverted_index.numerous	229
abstract_inverted_index.objects.	130
abstract_inverted_index.plugins,	23
abstract_inverted_index.sandbox,	24
abstract_inverted_index.security	40, 51, 85, 110, 158, 203
abstract_inverted_index.surface,	181
abstract_inverted_index.systems,	88
abstract_inverted_index.systems.	45
abstract_inverted_index.Frontend,	72
abstract_inverted_index.adversary	262
abstract_inverted_index.alignment	118
abstract_inverted_index.analysis,	159
abstract_inverted_index.construct	256
abstract_inverted_index.ecosystem	61
abstract_inverted_index.examining	59
abstract_inverted_index.existence	164
abstract_inverted_index.features,	236
abstract_inverted_index.formulate	108
abstract_inverted_index.illicitly	264
abstract_inverted_index.additional	17
abstract_inverted_index.attackers.	244
abstract_inverted_index.decomposed	151
abstract_inverted_index.discovered	253
abstract_inverted_index.end-to-end	258
abstract_inverted_index.foundation	15
abstract_inverted_index.increasing	36
abstract_inverted_index.individual	9, 55, 94
abstract_inverted_index.inherently	6
abstract_inverted_index.manipulate	275
abstract_inverted_index.multi-step	187
abstract_inverted_index.potential,	32
abstract_inverted_index.real-world	249
abstract_inverted_index.robustness	172
abstract_inverted_index.vulnerable	242
abstract_inverted_index.components.	219
abstract_inverted_index.components:	155
abstract_inverted_index.constraints	115, 231, 239
abstract_inverted_index.demonstrate	247
abstract_inverted_index.information	105, 121
abstract_inverted_index.integration	216
abstract_inverted_index.intelligent	44
abstract_inverted_index.multi-layer	157, 185
abstract_inverted_index.constraints,	166
abstract_inverted_index.constraints.	175
abstract_inverted_index.construction	134
abstract_inverted_index.state-of-art	194
abstract_inverted_index.investigation	200
abstract_inverted_index.probabilistic	43, 138
abstract_inverted_index.compositional,	7
abstract_inverted_index.systematically	82
abstract_inverted_index.vulnerabilities,	254
abstract_inverted_index.https://fzwark.github.io/LLM-System-Attack-Demo/	292
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	5
citation_normalized_percentile