MEF: A Capability-Aware Multi-Encryption Framework for Evaluating Vulnerabilities in Black-Box Large Language Models
2025 · Open Access
DOI: https://doi.org/10.48550/arxiv.2505.23404
Recent advancements in adversarial jailbreak attacks have exposed critical vulnerabilities in Large Language Models (LLMs), enabling the circumvention of alignment safeguards through increasingly sophisticated prompt manipulations. Based on our experiments, we found that the effectiveness of jailbreak strategies is influenced by the comprehension ability of the attacked LLM. Building on this insight, we propose a capability-aware Multi-Encryption Framework (MEF) for evaluating vulnerabilities in black-box LLMs. Specifically, MEF first categorizes the comprehension ability level of the LLM, then applies different strategies accordingly: For models with limited comprehension ability, MEF adopts the Fu+En1 strategy, which integrates layered semantic mutations with an encryption technique, more effectively contributing to evasion of the LLM's defenses at the input and inference stages. For models with strong comprehension ability, MEF uses a more complex Fu+En1+En2 strategy, in which additional dual-ended encryption techniques are applied to the LLM's responses, further contributing to evasion of the LLM's defenses at the output stage. Experimental results demonstrate the effectiveness of our approach, achieving attack success rates of 98.9% on GPT-4o (29 May 2025 release) and 99.8% on GPT-4.1 (8 July 2025 release). Our work contributes to a deeper understanding of the vulnerabilities in current LLM alignment mechanisms.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2505.23404, https://arxiv.org/pdf/2505.23404
- OA Status: green
- OpenAlex ID: https://openalex.org/W4414876843
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4414876843 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2505.23404 (Digital Object Identifier)
- Title: MEF: A Capability-Aware Multi-Encryption Framework for Evaluating Vulnerabilities in Black-Box Large Language Models (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-05-29 (Full publication date if available)
- Authors: Mingyu Yu, Sheng Wang, Yu Wei, Su‐Juan Qin, Fei Gao, Wenmin Li (List of authors in order)
- Landing page: https://arxiv.org/abs/2505.23404 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2505.23404 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2505.23404 (Direct OA link when available)
- Cited by: 0 (Total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4414876843 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2505.23404 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.23404 |
| ids.openalex | https://openalex.org/W4414876843 |
| fwci | |
| type | preprint |
| title | MEF: A Capability-Aware Multi-Encryption Framework for Evaluating Vulnerabilities in Black-Box Large Language Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11614 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.7516000270843506 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1710 |
| topics[0].subfield.display_name | Information Systems |
| topics[0].display_name | Cloud Data Security Solutions |
| topics[1].id | https://openalex.org/T10764 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.7028999924659729 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Privacy-Preserving Technologies in Data |
| topics[2].id | https://openalex.org/T11424 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.6579999923706055 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Security and Verification in Computing |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2505.23404 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2505.23404 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2505.23404 |
| locations[1].id | doi:10.48550/arxiv.2505.23404 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2505.23404 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5083118115 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-6730-7478 |
| authorships[0].author.display_name | Mingyu Yu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yu, Mingyu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100371313 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2696-2171 |
| authorships[1].author.display_name | Sheng Wang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wang, Wei |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5008125230 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-8473-6856 |
| authorships[2].author.display_name | Yu Wei |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Wei, Yanjie |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5040230786 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-6405-6711 |
| authorships[3].author.display_name | Su‐Juan Qin |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Qin, Sujuan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100318655 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-6739-5076 |
| authorships[4].author.display_name | Fei Gao |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Gao, Fei |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5108781441 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Wenmin Li |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Li, Wenmin |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2505.23404 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | MEF: A Capability-Aware Multi-Encryption Framework for Evaluating Vulnerabilities in Black-Box Large Language Models |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11614 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.7516000270843506 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1710 |
| primary_topic.subfield.display_name | Information Systems |
| primary_topic.display_name | Cloud Data Security Solutions |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2505.23404 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2505.23404 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2505.23404 |
| primary_location.id | pmh:oai:arXiv.org:2505.23404 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2505.23404 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2505.23404 |
| publication_date | 2025-05-29 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 54, 124, 185 |
| abstract_inverted_index.(8 | 177 |
| abstract_inverted_index.an | 98 |
| abstract_inverted_index.at | 110, 149 |
| abstract_inverted_index.by | 40 |
| abstract_inverted_index.in | 2, 10, 62, 129, 191 |
| abstract_inverted_index.is | 38 |
| abstract_inverted_index.of | 18, 35, 44, 73, 106, 145, 158, 165, 188 |
| abstract_inverted_index.on | 27, 49, 167, 175 |
| abstract_inverted_index.to | 104, 137, 143, 184 |
| abstract_inverted_index.we | 30, 52 |
| abstract_inverted_index.(29 | 169 |
| abstract_inverted_index.For | 81, 116 |
| abstract_inverted_index.LLM | 193 |
| abstract_inverted_index.MEF | 66, 87, 122 |
| abstract_inverted_index.May | 170 |
| abstract_inverted_index.Our | 181 |
| abstract_inverted_index.and | 113, 173 |
| abstract_inverted_index.are | 135 |
| abstract_inverted_index.for | 59 |
| abstract_inverted_index.our | 28, 159 |
| abstract_inverted_index.the | 16, 33, 41, 45, 69, 74, 89, 107, 111, 138, 146, 150, 156, 189 |
| abstract_inverted_index.2025 | 171, 179 |
| abstract_inverted_index.July | 178 |
| abstract_inverted_index.LLM, | 75 |
| abstract_inverted_index.LLM. | 47 |
| abstract_inverted_index.have | 6 |
| abstract_inverted_index.more | 101, 125 |
| abstract_inverted_index.that | 32 |
| abstract_inverted_index.then | 76 |
| abstract_inverted_index.this | 50 |
| abstract_inverted_index.uses | 123 |
| abstract_inverted_index.with | 83, 97, 118 |
| abstract_inverted_index.work | 182 |
| abstract_inverted_index.(MEF) | 58 |
| abstract_inverted_index.98.9% | 166 |
| abstract_inverted_index.99.8% | 174 |
| abstract_inverted_index.Based | 26 |
| abstract_inverted_index.LLM's | 108, 139, 147 |
| abstract_inverted_index.LLMs. | 64 |
| abstract_inverted_index.Large | 11 |
| abstract_inverted_index.first | 67 |
| abstract_inverted_index.found | 31 |
| abstract_inverted_index.input | 112 |
| abstract_inverted_index.level | 72 |
| abstract_inverted_index.rates | 164 |
| abstract_inverted_index.which | 92, 130 |
| abstract_inverted_index.Fu+En1 | 90 |
| abstract_inverted_index.GPT-4o | 168 |
| abstract_inverted_index.Models | 13 |
| abstract_inverted_index.Recent | 0 |
| abstract_inverted_index.adopts | 88 |
| abstract_inverted_index.attack | 162 |
| abstract_inverted_index.deeper | 186 |
| abstract_inverted_index.models | 82, 117 |
| abstract_inverted_index.output | 151 |
| abstract_inverted_index.prompt | 24 |
| abstract_inverted_index.stage. | 152 |
| abstract_inverted_index.strong | 119 |
| abstract_inverted_index.(LLMs), | 14 |
| abstract_inverted_index.GPT-4.1 | 176 |
| abstract_inverted_index.ability | 43, 71 |
| abstract_inverted_index.applied | 136 |
| abstract_inverted_index.applies | 77 |
| abstract_inverted_index.attacks | 5 |
| abstract_inverted_index.complex | 126 |
| abstract_inverted_index.current | 192 |
| abstract_inverted_index.evasion | 105, 144 |
| abstract_inverted_index.exposed | 7 |
| abstract_inverted_index.further | 141 |
| abstract_inverted_index.layered | 94 |
| abstract_inverted_index.limited | 84 |
| abstract_inverted_index.propose | 53 |
| abstract_inverted_index.results | 154 |
| abstract_inverted_index.stages. | 115 |
| abstract_inverted_index.success | 163 |
| abstract_inverted_index.through | 21 |
| abstract_inverted_index.Building | 48 |
| abstract_inverted_index.Language | 12 |
| abstract_inverted_index.ability, | 86, 121 |
| abstract_inverted_index.attacked | 46 |
| abstract_inverted_index.critical | 8 |
| abstract_inverted_index.defenses | 109, 148 |
| abstract_inverted_index.enabling | 15 |
| abstract_inverted_index.insight, | 51 |
| abstract_inverted_index.release) | 172 |
| abstract_inverted_index.semantic | 95 |
| abstract_inverted_index.Framework | 57 |
| abstract_inverted_index.achieving | 161 |
| abstract_inverted_index.alignment | 19, 194 |
| abstract_inverted_index.approach, | 160 |
| abstract_inverted_index.black-box | 63 |
| abstract_inverted_index.different | 78 |
| abstract_inverted_index.inference | 114 |
| abstract_inverted_index.jailbreak | 4, 36 |
| abstract_inverted_index.mutations | 96 |
| abstract_inverted_index.release). | 180 |
| abstract_inverted_index.strategy, | 91, 128 |
| abstract_inverted_index.Fu+En1+En2 | 127 |
| abstract_inverted_index.additional | 131 |
| abstract_inverted_index.dual-ended | 132 |
| abstract_inverted_index.encryption | 99, 133 |
| abstract_inverted_index.evaluating | 60 |
| abstract_inverted_index.influenced | 39 |
| abstract_inverted_index.integrates | 93 |
| abstract_inverted_index.responses, | 140 |
| abstract_inverted_index.safeguards | 20 |
| abstract_inverted_index.strategies | 37, 79 |
| abstract_inverted_index.technique, | 100 |
| abstract_inverted_index.techniques | 134 |
| abstract_inverted_index.adversarial | 3 |
| abstract_inverted_index.categorizes | 68 |
| abstract_inverted_index.contributes | 183 |
| abstract_inverted_index.demonstrate | 155 |
| abstract_inverted_index.effectively | 102 |
| abstract_inverted_index.mechanisms. | 195 |
| abstract_inverted_index.Experimental | 153 |
| abstract_inverted_index.accordingly: | 80 |
| abstract_inverted_index.advancements | 1 |
| abstract_inverted_index.contributing | 103, 142 |
| abstract_inverted_index.experiments, | 29 |
| abstract_inverted_index.increasingly | 22 |
| abstract_inverted_index.Specifically, | 65 |
| abstract_inverted_index.circumvention | 17 |
| abstract_inverted_index.comprehension | 42, 70, 85, 120 |
| abstract_inverted_index.effectiveness | 34, 157 |
| abstract_inverted_index.sophisticated | 23 |
| abstract_inverted_index.understanding | 187 |
| abstract_inverted_index.manipulations. | 25 |
| abstract_inverted_index.vulnerabilities | 9, 61, 190 |
| abstract_inverted_index.Multi-Encryption | 56 |
| abstract_inverted_index.capability-aware | 55 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| citation_normalized_percentile |
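
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. The following is a minimal Python sketch of how such rows could be turned back into running text, assuming the rows keep the pipe-delimited layout shown in this table. The helper names `parse_rows` and `rebuild_abstract` are illustrative only and are not part of any OpenAlex client library.

```python
import re

# Matches one payload row such as "| abstract_inverted_index.in | 2, 10, 62 |".
# Assumption: tokens contain no whitespace, which holds for the rows in this table.
ROW = re.compile(r"^\|\s*abstract_inverted_index\.(\S+)\s*\|\s*([\d,\s]+?)\s*\|\s*$")


def parse_rows(lines):
    """Collect token -> list of positions from pipe-delimited table rows."""
    index = {}
    for line in lines:
        match = ROW.match(line)
        if match:
            token, numbers = match.groups()
            index[token] = [int(n) for n in numbers.split(",")]
    return index


def rebuild_abstract(index):
    """Invert the map to position -> token and join the tokens in position order."""
    by_position = {
        pos: token for token, positions in index.items() for pos in positions
    }
    return " ".join(by_position[p] for p in sorted(by_position))


# Six rows copied from the table; together they cover positions 0 through 5.
rows = [
    "| abstract_inverted_index.in | 2, 10, 62, 129, 191 |",
    "| abstract_inverted_index.Recent | 0 |",
    "| abstract_inverted_index.attacks | 5 |",
    "| abstract_inverted_index.jailbreak | 4, 36 |",
    "| abstract_inverted_index.adversarial | 3 |",
    "| abstract_inverted_index.advancements | 1 |",
]

index = parse_rows(rows)
# Keep only the opening positions so the partial sample reads contiguously.
first_six = {token: [p for p in ps if p < 6] for token, ps in index.items()}
print(rebuild_abstract(first_six))
# prints "Recent advancements in adversarial jailbreak attacks"
```

Passing every `abstract_inverted_index.*` row to `parse_rows` and then calling `rebuild_abstract` on the result would yield the full abstract, with punctuation preserved because it is stored as part of each token (for example `LLM,` and `LLM.` are separate keys).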