Stepwise Reasoning Error Disruption Attack of LLMs
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2412.11934
Large language models (LLMs) have made remarkable strides in complex reasoning tasks, but their safety and robustness in reasoning processes remain underexplored. Existing attacks on LLM reasoning are constrained by specific settings or lack of imperceptibility, limiting their feasibility and generalizability. To address these challenges, we propose the Stepwise rEasoning Error Disruption (SEED) attack, which subtly injects errors into prior reasoning steps to mislead the model into producing incorrect subsequent reasoning and final answers. Unlike previous methods, SEED is compatible with zero-shot and few-shot settings, maintains the natural reasoning flow, and ensures covert execution without modifying the instruction. Extensive experiments on four datasets across four different models demonstrate SEED's effectiveness, revealing the vulnerabilities of LLMs to disruptions in reasoning processes. These findings underscore the need for greater attention to the robustness of LLM reasoning to ensure safety in practical applications. Our code is available at: https://github.com/Applied-Machine-Learning-Lab/SEED-Attack.
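
The abstract describes the attack only at a high level: an error is placed into an earlier reasoning step while the instruction itself stays untouched, and the model is left to continue from there. The toy sketch below illustrates that general idea for robustness testing. It is not the authors' implementation (see the linked repository for that); the function names `perturb_last_number` and `build_prompt`, the integer-offset perturbation, and the prompt layout are all assumptions made for illustration.

```python
import re


def perturb_last_number(step: str, offset: int = 1) -> str:
    """Shift the final integer in a reasoning step by `offset`.

    Hypothetical stand-in for an error-injection rule; the paper's
    actual perturbation strategy is not specified in the abstract.
    """
    matches = list(re.finditer(r"-?\d+", step))
    if not matches:
        return step  # nothing numeric to perturb, leave the step unchanged
    last = matches[-1]
    new_value = int(last.group()) + offset
    return step[:last.start()] + str(new_value) + step[last.end():]


def build_prompt(question: str, prior_steps: list[str], inject_at: int,
                 offset: int = 1) -> str:
    """Assemble a continuation prompt whose prior step `inject_at` is perturbed.

    The question (instruction) is copied verbatim; only one prior
    reasoning step is altered, and the next step is left open.
    """
    steps = list(prior_steps)  # copy so the caller's list is not mutated
    steps[inject_at] = perturb_last_number(steps[inject_at], offset)
    lines = [f"Question: {question}", "Reasoning so far:"]
    lines += [f"Step {i + 1}: {s}" for i, s in enumerate(steps)]
    lines.append(f"Step {len(steps) + 1}:")  # left open for the model to continue
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_prompt(
        question="Tom has 3 apples and buys 4 more, then doubles them. How many?",
        prior_steps=["3 + 4 = 7", "Doubling means multiplying by 2"],
        inject_at=0,
    )
    print(prompt)
```

Comparing a model's answers on the clean and perturbed prompts gives a rough sense of how far a single upstream error propagates, which is the kind of vulnerability the abstract points to.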
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2412.11934, https://arxiv.org/pdf/2412.11934
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405561501
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405561501 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2412.11934 (Digital Object Identifier)
- Title: Stepwise Reasoning Error Disruption Attack of LLMs (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-12-16 (Full publication date if available)
- Authors: Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu (List of authors in order)
- Landing page: https://arxiv.org/abs/2412.11934 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2412.11934 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2412.11934 (Direct OA link when available)
- Concepts: Computer security, Computer science (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4405561501 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2412.11934 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.11934 |
| ids.openalex | https://openalex.org/W4405561501 |
| fwci | |
| type | preprint |
| title | Stepwise Reasoning Error Disruption Attack of LLMs |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10927 |
| topics[0].field.id | https://openalex.org/fields/33 |
| topics[0].field.display_name | Social Sciences |
| topics[0].score | 0.9595000147819519 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/3312 |
| topics[0].subfield.display_name | Sociology and Political Science |
| topics[0].display_name | Access Control and Trust |
| topics[1].id | https://openalex.org/T11424 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9591000080108643 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Security and Verification in Computing |
| topics[2].id | https://openalex.org/T11614 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9580000042915344 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1710 |
| topics[2].subfield.display_name | Information Systems |
| topics[2].display_name | Cloud Data Security Solutions |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C38652104 |
| concepts[0].level | 1 |
| concepts[0].score | 0.35369980335235596 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[0].display_name | Computer security |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.33326563239097595 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| keywords[0].id | https://openalex.org/keywords/computer-security |
| keywords[0].score | 0.35369980335235596 |
| keywords[0].display_name | Computer security |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.33326563239097595 |
| keywords[1].display_name | Computer science |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.11934 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.11934 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.11934 |
| locations[1].id | doi:10.48550/arxiv.2412.11934 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.11934 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5081921599 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-7921-0155 |
| authorships[0].author.display_name | Jingyu Peng |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Peng, Jingyu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5014749523 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7449-9834 |
| authorships[1].author.display_name | Maolin Wang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wang, Maolin |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5113826340 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-4468-5529 |
| authorships[2].author.display_name | Xiangyu Zhao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Zhao, Xiangyu |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5100323973 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-5221-1868 |
| authorships[3].author.display_name | Kai Zhang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhang, Kai |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101914989 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-2500-4318 |
| authorships[4].author.display_name | Wanyu Wang |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wang, Wanyu |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5115539407 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Pengyue Jia |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Jia, Pengyue |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5101541166 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-6954-6576 |
| authorships[6].author.display_name | Qidong Liu |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Liu, Qidong |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5054719216 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-8522-6142 |
| authorships[7].author.display_name | Ruocheng Guo |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Guo, Ruocheng |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5115597047 |
| authorships[8].author.orcid | |
| authorships[8].author.display_name | Qi Liu |
| authorships[8].author_position | last |
| authorships[8].raw_author_name | Liu, Qi |
| authorships[8].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.11934 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Stepwise Reasoning Error Disruption Attack of LLMs |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10927 |
| primary_topic.field.id | https://openalex.org/fields/33 |
| primary_topic.field.display_name | Social Sciences |
| primary_topic.score | 0.9595000147819519 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/3312 |
| primary_topic.subfield.display_name | Sociology and Political Science |
| primary_topic.display_name | Access Control and Trust |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.11934 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.11934 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.11934 |
| primary_location.id | pmh:oai:arXiv.org:2412.11934 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.11934 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.11934 |
| publication_date | 2024-12-16 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.To | 41 |
| abstract_inverted_index.by | 29 |
| abstract_inverted_index.in | 8, 17, 117, 137 |
| abstract_inverted_index.is | 78, 142 |
| abstract_inverted_index.of | 34, 113, 131 |
| abstract_inverted_index.on | 24, 100 |
| abstract_inverted_index.or | 32 |
| abstract_inverted_index.to | 62, 115, 128, 134 |
| abstract_inverted_index.we | 45 |
| abstract_inverted_index.LLM | 25, 132 |
| abstract_inverted_index.Our | 140 |
| abstract_inverted_index.and | 15, 39, 71, 82, 90 |
| abstract_inverted_index.are | 27 |
| abstract_inverted_index.at: | 144 |
| abstract_inverted_index.but | 12 |
| abstract_inverted_index.for | 125 |
| abstract_inverted_index.the | 47, 64, 86, 96, 111, 123, 129 |
| abstract_inverted_index.LLMs | 114 |
| abstract_inverted_index.SEED | 77 |
| abstract_inverted_index.code | 141 |
| abstract_inverted_index.four | 101, 104 |
| abstract_inverted_index.have | 4 |
| abstract_inverted_index.into | 58, 66 |
| abstract_inverted_index.lack | 33 |
| abstract_inverted_index.made | 5 |
| abstract_inverted_index.need | 124 |
| abstract_inverted_index.with | 80 |
| abstract_inverted_index.Error | 50 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.These | 120 |
| abstract_inverted_index.final | 72 |
| abstract_inverted_index.flow, | 89 |
| abstract_inverted_index.model | 65 |
| abstract_inverted_index.prior | 59 |
| abstract_inverted_index.steps | 61 |
| abstract_inverted_index.their | 13, 37 |
| abstract_inverted_index.these | 43 |
| abstract_inverted_index.which | 54 |
| abstract_inverted_index.(LLMs) | 3 |
| abstract_inverted_index.(SEED) | 52 |
| abstract_inverted_index.SEED's | 108 |
| abstract_inverted_index.Unlike | 74 |
| abstract_inverted_index.across | 103 |
| abstract_inverted_index.covert | 92 |
| abstract_inverted_index.ensure | 135 |
| abstract_inverted_index.errors | 57 |
| abstract_inverted_index.models | 2, 106 |
| abstract_inverted_index.remain | 20 |
| abstract_inverted_index.safety | 14, 136 |
| abstract_inverted_index.subtly | 55 |
| abstract_inverted_index.tasks, | 11 |
| abstract_inverted_index.address | 42 |
| abstract_inverted_index.attack, | 53 |
| abstract_inverted_index.attacks | 23 |
| abstract_inverted_index.complex | 9 |
| abstract_inverted_index.ensures | 91 |
| abstract_inverted_index.greater | 126 |
| abstract_inverted_index.injects | 56 |
| abstract_inverted_index.mislead | 63 |
| abstract_inverted_index.natural | 87 |
| abstract_inverted_index.propose | 46 |
| abstract_inverted_index.strides | 7 |
| abstract_inverted_index.without | 94 |
| abstract_inverted_index.Existing | 22 |
| abstract_inverted_index.Stepwise | 48 |
| abstract_inverted_index.answers. | 73 |
| abstract_inverted_index.datasets | 102 |
| abstract_inverted_index.few-shot | 83 |
| abstract_inverted_index.findings | 121 |
| abstract_inverted_index.language | 1 |
| abstract_inverted_index.limiting | 36 |
| abstract_inverted_index.methods, | 76 |
| abstract_inverted_index.previous | 75 |
| abstract_inverted_index.settings | 31 |
| abstract_inverted_index.specific | 30 |
| abstract_inverted_index.Extensive | 98 |
| abstract_inverted_index.attention | 127 |
| abstract_inverted_index.available | 143 |
| abstract_inverted_index.different | 105 |
| abstract_inverted_index.execution | 93 |
| abstract_inverted_index.incorrect | 68 |
| abstract_inverted_index.maintains | 85 |
| abstract_inverted_index.modifying | 95 |
| abstract_inverted_index.practical | 138 |
| abstract_inverted_index.processes | 19 |
| abstract_inverted_index.producing | 67 |
| abstract_inverted_index.rEasoning | 49 |
| abstract_inverted_index.reasoning | 10, 18, 26, 60, 70, 88, 118, 133 |
| abstract_inverted_index.revealing | 110 |
| abstract_inverted_index.settings, | 84 |
| abstract_inverted_index.zero-shot | 81 |
| abstract_inverted_index.Disruption | 51 |
| abstract_inverted_index.compatible | 79 |
| abstract_inverted_index.processes. | 119 |
| abstract_inverted_index.remarkable | 6 |
| abstract_inverted_index.robustness | 16, 130 |
| abstract_inverted_index.subsequent | 69 |
| abstract_inverted_index.underscore | 122 |
| abstract_inverted_index.challenges, | 44 |
| abstract_inverted_index.constrained | 28 |
| abstract_inverted_index.demonstrate | 107 |
| abstract_inverted_index.disruptions | 116 |
| abstract_inverted_index.experiments | 99 |
| abstract_inverted_index.feasibility | 38 |
| abstract_inverted_index.instruction. | 97 |
| abstract_inverted_index.applications. | 139 |
| abstract_inverted_index.effectiveness, | 109 |
| abstract_inverted_index.underexplored. | 21 |
| abstract_inverted_index.vulnerabilities | 112 |
| abstract_inverted_index.generalizability. | 40 |
| abstract_inverted_index.imperceptibility, | 35 |
| abstract_inverted_index.https://github.com/Applied-Machine-Learning-Lab/SEED-Attack. | 145 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 9 |
| citation_normalized_percentile |
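
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text follows. The function name `rebuild_abstract` is an assumption for illustration, and the sample dictionary is a trimmed subset of the rows in the table (only positions 0 to 5), not the full payload.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild plain text from a token -> positions mapping."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


if __name__ == "__main__":
    # Trimmed sample: the first six tokens of the abstract only.
    sample = {
        "Large": [0],
        "language": [1],
        "models": [2],
        "(LLMs)": [3],
        "have": [4],
        "made": [5],
    }
    print(rebuild_abstract(sample))  # prints "Large language models (LLMs) have made"
```

Applied to the full set of rows, the same function should reproduce the abstract given near the top of this record, since punctuation is kept attached to its token in the index.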