MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2502.04360
Retrieval-Augmented Generation (RAG) offers a way to mitigate hallucinations in Large Language Models (LLMs) by grounding their outputs in knowledge retrieved from external sources. Because these external data stores are often built from private resources and data, they are exposed to extraction attacks, in which attackers attempt to steal data from the private databases. Existing RAG extraction attacks often rely on manually crafted prompts, which limits their effectiveness. In this paper, we introduce MARAGE, a framework for optimizing an adversarial string that, when appended to user queries submitted to a target RAG system, causes the system to output the retrieved RAG data verbatim. MARAGE uses a continuous optimization scheme that integrates gradients from multiple models with different architectures simultaneously, which enhances the transferability of the optimized string to unseen models. Additionally, we propose a strategy that emphasizes the initial tokens in the target RAG data, further improving the attack's generalizability. Evaluations show that MARAGE consistently outperforms both manual and optimization-based baselines across multiple LLMs and RAG datasets, while maintaining robust transferability to previously unseen models. Moreover, we conduct probing tasks to shed light on why MARAGE is more effective than the baselines and to analyze the impact of our approach on the model's internal state.
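The two ideas named in the abstract, combining gradients from several models and emphasizing the initial target tokens, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how tokens are weighted or how gradients are combined, so the geometric decay in `position_weights`, the plain mean in `average_gradients`, and all function names here are assumptions chosen only for illustration.

```python
from typing import List, Sequence


def position_weights(n_tokens: int, decay: float = 0.9) -> List[float]:
    """Normalized weights that shrink geometrically with token position.

    Assumption: geometric decay is one possible way to emphasize the
    initial tokens; the abstract does not specify the actual scheme.
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    raw = [decay ** i for i in range(n_tokens)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_loss(token_nll: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-token negative log-likelihoods for one model."""
    if len(token_nll) != len(weights):
        raise ValueError("token_nll and weights must have the same length")
    return sum(w * nll for w, nll in zip(weights, token_nll))


def average_gradients(per_model_grads: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of the gradients computed on each model.

    Assumption: a plain mean stands in for whatever integration rule
    the paper uses to combine gradients across architectures.
    """
    if not per_model_grads:
        raise ValueError("need gradients from at least one model")
    n_models = len(per_model_grads)
    return [sum(column) / n_models for column in zip(*per_model_grads)]


# Hypothetical gradients from two models over a two-dimensional string embedding.
print(average_gradients([[1.0, 2.0], [3.0, 4.0]]))  # → [2.0, 3.0]
```

In a real attack the per-token losses and gradients would come from differentiating each LLM's loss with respect to a continuous representation of the adversarial string; the sketch only shows where the token weighting and the multi-model aggregation would enter.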
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2502.04360, https://arxiv.org/pdf/2502.04360
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4407308219
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4407308219 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2502.04360 (Digital Object Identifier)
- Title: MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-02-05 (full publication date if available)
- Authors: Xiaoling Hu, Eric Minwei Liu, Weizhou Wang, Xiangyu Guo, David Lie (list of authors in order)
- Landing page: https://arxiv.org/abs/2502.04360 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2502.04360 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2502.04360 (direct OA link when available)
- Concepts: Adversarial system, Computer science, Extraction (chemistry), Data mining, Information retrieval, Artificial intelligence, Chromatography, Chemistry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4407308219 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2502.04360 |
| ids.doi | https://doi.org/10.48550/arxiv.2502.04360 |
| ids.openalex | https://openalex.org/W4407308219 |
| fwci | |
| type | preprint |
| title | MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11689 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9902999997138977 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Adversarial Robustness in Machine Learning |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C37736160 |
| concepts[0].level | 2 |
| concepts[0].score | 0.807422399520874 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1801315 |
| concepts[0].display_name | Adversarial system |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7014756798744202 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C4725764 |
| concepts[2].level | 2 |
| concepts[2].score | 0.46200722455978394 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q844704 |
| concepts[2].display_name | Extraction (chemistry) |
| concepts[3].id | https://openalex.org/C124101348 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3651029169559479 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[3].display_name | Data mining |
| concepts[4].id | https://openalex.org/C23123220 |
| concepts[4].level | 1 |
| concepts[4].score | 0.3618581295013428 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[4].display_name | Information retrieval |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.3354484438896179 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C43617362 |
| concepts[6].level | 1 |
| concepts[6].score | 0.0 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q170050 |
| concepts[6].display_name | Chromatography |
| concepts[7].id | https://openalex.org/C185592680 |
| concepts[7].level | 0 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[7].display_name | Chemistry |
| keywords[0].id | https://openalex.org/keywords/adversarial-system |
| keywords[0].score | 0.807422399520874 |
| keywords[0].display_name | Adversarial system |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7014756798744202 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/extraction |
| keywords[2].score | 0.46200722455978394 |
| keywords[2].display_name | Extraction (chemistry) |
| keywords[3].id | https://openalex.org/keywords/data-mining |
| keywords[3].score | 0.3651029169559479 |
| keywords[3].display_name | Data mining |
| keywords[4].id | https://openalex.org/keywords/information-retrieval |
| keywords[4].score | 0.3618581295013428 |
| keywords[4].display_name | Information retrieval |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.3354484438896179 |
| keywords[5].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2502.04360 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2502.04360 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2502.04360 |
| locations[1].id | doi:10.48550/arxiv.2502.04360 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2502.04360 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5022509823 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3188-3005 |
| authorships[0].author.display_name | Xiaoling Hu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Hu, Xiao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5071301697 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-3594-6678 |
| authorships[1].author.display_name | Eric Minwei Liu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Liu, Eric |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5009906998 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-4309-9077 |
| authorships[2].author.display_name | Weizhou Wang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Wang, Weizhou |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5026529043 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-9405-3792 |
| authorships[3].author.display_name | Xiangyu Guo |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Guo, Xiangyu |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5049933072 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-2000-6827 |
| authorships[4].author.display_name | David Lie |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Lie, David |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2502.04360 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11689 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9902999997138977 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Adversarial Robustness in Machine Learning |
| related_works | https://openalex.org/W2502115930, https://openalex.org/W2482350142, https://openalex.org/W4246396837, https://openalex.org/W3126451824, https://openalex.org/W1561927205, https://openalex.org/W3191453585, https://openalex.org/W4297672492, https://openalex.org/W4310988119, https://openalex.org/W4285226279, https://openalex.org/W4288019534 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2502.04360 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2502.04360 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2502.04360 |
| primary_location.id | pmh:oai:arXiv.org:2502.04360 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2502.04360 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2502.04360 |
| publication_date | 2025-02-05 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 4, 75, 92, 106, 134 |
| abstract_inverted_index.In | 70 |
| abstract_inverted_index.an | 81 |
| abstract_inverted_index.by | 14 |
| abstract_inverted_index.in | 9, 31, 45, 141 |
| abstract_inverted_index.is | 189 |
| abstract_inverted_index.of | 26, 42, 124, 201 |
| abstract_inverted_index.on | 62, 184, 204 |
| abstract_inverted_index.to | 6, 18, 40, 49, 87, 91, 120, 128, 172, 181, 193, 197 |
| abstract_inverted_index.we | 73, 132, 177 |
| abstract_inverted_index.RAG | 57, 94, 101, 144, 166 |
| abstract_inverted_index.The | 24 |
| abstract_inverted_index.and | 29, 159, 165, 196 |
| abstract_inverted_index.can | 37 |
| abstract_inverted_index.for | 79 |
| abstract_inverted_index.our | 202 |
| abstract_inverted_index.the | 99, 122, 125, 138, 142, 148, 185, 194, 199, 205 |
| abstract_inverted_index.use | 25 |
| abstract_inverted_index.why | 187 |
| abstract_inverted_index.LLMs | 164 |
| abstract_inverted_index.both | 157 |
| abstract_inverted_index.data | 30, 35, 51, 102 |
| abstract_inverted_index.from | 21, 52, 113 |
| abstract_inverted_index.more | 190 |
| abstract_inverted_index.rely | 61 |
| abstract_inverted_index.shed | 182 |
| abstract_inverted_index.show | 152 |
| abstract_inverted_index.that | 110, 136, 153 |
| abstract_inverted_index.them | 39 |
| abstract_inverted_index.this | 71 |
| abstract_inverted_index.user | 88 |
| abstract_inverted_index.when | 85 |
| abstract_inverted_index.with | 116 |
| abstract_inverted_index.(RAG) | 2 |
| abstract_inverted_index.Large | 10 |
| abstract_inverted_index.data, | 145 |
| abstract_inverted_index.light | 183 |
| abstract_inverted_index.limit | 67 |
| abstract_inverted_index.often | 60 |
| abstract_inverted_index.risks | 41 |
| abstract_inverted_index.steal | 50 |
| abstract_inverted_index.tasks | 180 |
| abstract_inverted_index.that, | 84 |
| abstract_inverted_index.their | 16, 68 |
| abstract_inverted_index.these | 33, 53 |
| abstract_inverted_index.which | 46, 66 |
| abstract_inverted_index.while | 168 |
| abstract_inverted_index.(LLMs) | 13 |
| abstract_inverted_index.MARAGE | 78, 104, 154, 188 |
| abstract_inverted_index.Models | 12 |
| abstract_inverted_index.across | 162 |
| abstract_inverted_index.called | 77 |
| abstract_inverted_index.causes | 96 |
| abstract_inverted_index.expose | 38 |
| abstract_inverted_index.impact | 200 |
| abstract_inverted_index.manual | 158 |
| abstract_inverted_index.models | 115 |
| abstract_inverted_index.offers | 3 |
| abstract_inverted_index.paper, | 72 |
| abstract_inverted_index.robust | 170 |
| abstract_inverted_index.scheme | 109 |
| abstract_inverted_index.state. | 208 |
| abstract_inverted_index.stores | 36 |
| abstract_inverted_index.string | 83, 127 |
| abstract_inverted_index.target | 93, 143 |
| abstract_inverted_index.tokens | 140 |
| abstract_inverted_index.unseen | 129, 174 |
| abstract_inverted_index.analyze | 198 |
| abstract_inverted_index.attacks | 59 |
| abstract_inverted_index.attempt | 48 |
| abstract_inverted_index.conduct | 178 |
| abstract_inverted_index.crafted | 64 |
| abstract_inverted_index.enhance | 121 |
| abstract_inverted_index.further | 146 |
| abstract_inverted_index.initial | 139 |
| abstract_inverted_index.model's | 206 |
| abstract_inverted_index.models. | 130, 175 |
| abstract_inverted_index.outputs | 17, 97 |
| abstract_inverted_index.private | 27, 54 |
| abstract_inverted_index.probing | 179 |
| abstract_inverted_index.propose | 133 |
| abstract_inverted_index.queries | 89 |
| abstract_inverted_index.reasons | 186 |
| abstract_inverted_index.system, | 95 |
| abstract_inverted_index.Existing | 56 |
| abstract_inverted_index.Language | 11 |
| abstract_inverted_index.appended | 86 |
| abstract_inverted_index.approach | 203 |
| abstract_inverted_index.attack's | 149 |
| abstract_inverted_index.attacks, | 44 |
| abstract_inverted_index.compared | 192 |
| abstract_inverted_index.external | 22, 34 |
| abstract_inverted_index.internal | 207 |
| abstract_inverted_index.manually | 63 |
| abstract_inverted_index.mitigate | 7 |
| abstract_inverted_index.multiple | 114, 163 |
| abstract_inverted_index.prompts, | 65 |
| abstract_inverted_index.solution | 5 |
| abstract_inverted_index.sources. | 23 |
| abstract_inverted_index.strategy | 135 |
| abstract_inverted_index.Moreover, | 176 |
| abstract_inverted_index.attackers | 47 |
| abstract_inverted_index.baselines | 161, 195 |
| abstract_inverted_index.datasets, | 167 |
| abstract_inverted_index.different | 117 |
| abstract_inverted_index.effective | 191 |
| abstract_inverted_index.framework | 76 |
| abstract_inverted_index.gradients | 112 |
| abstract_inverted_index.grounding | 15 |
| abstract_inverted_index.improving | 147 |
| abstract_inverted_index.introduce | 74 |
| abstract_inverted_index.knowledge | 19 |
| abstract_inverted_index.leverages | 105 |
| abstract_inverted_index.optimized | 126 |
| abstract_inverted_index.resources | 28 |
| abstract_inverted_index.retrieved | 20, 100 |
| abstract_inverted_index.submitted | 90 |
| abstract_inverted_index.verbatim. | 103 |
| abstract_inverted_index.Generation | 1 |
| abstract_inverted_index.containing | 98 |
| abstract_inverted_index.continuous | 107 |
| abstract_inverted_index.databases. | 55 |
| abstract_inverted_index.emphasizes | 137 |
| abstract_inverted_index.extraction | 43, 58 |
| abstract_inverted_index.integrates | 111 |
| abstract_inverted_index.optimizing | 80 |
| abstract_inverted_index.previously | 173 |
| abstract_inverted_index.Evaluations | 151 |
| abstract_inverted_index.adversarial | 82 |
| abstract_inverted_index.maintaining | 169 |
| abstract_inverted_index.outperforms | 156 |
| abstract_inverted_index.consistently | 155 |
| abstract_inverted_index.constructing | 32 |
| abstract_inverted_index.optimization | 108 |
| abstract_inverted_index.Additionally, | 131 |
| abstract_inverted_index.architectures | 118 |
| abstract_inverted_index.effectiveness. | 69 |
| abstract_inverted_index.hallucinations | 8 |
| abstract_inverted_index.simultaneously | 119 |
| abstract_inverted_index.transferability | 123, 171 |
| abstract_inverted_index.generalizability. | 150 |
| abstract_inverted_index.optimization-based | 160 |
| abstract_inverted_index.Retrieval-Augmented | 0 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
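
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each word to the positions where it occurs. A minimal sketch of turning such a mapping back into running text follows. The function name `rebuild_abstract` is hypothetical, and `sample` is a hand-copied excerpt covering only positions 0 to 8 of the rows above, not the full payload.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a word -> positions mapping."""
    by_position: Dict[int, str] = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Emit the words in ascending position order, separated by spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload: only the first nine positions are included here.
sample = {
    "Retrieval-Augmented": [0],
    "Generation": [1],
    "(RAG)": [2],
    "offers": [3],
    "a": [4],
    "solution": [5],
    "to": [6],
    "mitigate": [7],
    "hallucinations": [8],
}

print(rebuild_abstract(sample))
# → Retrieval-Augmented Generation (RAG) offers a solution to mitigate hallucinations
```

Because punctuation stays attached to the words in the index (for example `data,` and `models.`), joining on single spaces is enough to recover readable text.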