Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2506.03765
Adversarial detection protects models from adversarial attacks by refusing suspicious test samples. However, current detection methods often suffer from weak generalization: their effectiveness tends to degrade significantly when applied to adversarially trained models rather than naturally trained ones, and they generally struggle to achieve consistent effectiveness across both white-box and black-box attack settings. In this work, we observe that an auxiliary model, differing from the primary model in training strategy or model architecture, tends to assign low confidence to the primary model's predictions on adversarial examples (AEs), while preserving high confidence on normal examples (NEs). Based on this discovery, we propose Prediction Inconsistency Detector (PID), a lightweight and generalizable detection framework to distinguish AEs from NEs by capturing the prediction inconsistency between the primal and auxiliary models. PID is compatible with both naturally and adversarially trained primal models and outperforms four detection methods across 3 white-box, 3 black-box, and 1 mixed adversarial attacks. Specifically, PID achieves average AUC scores of 99.29% and 99.30% on CIFAR-10 when the primal model is naturally and adversarially trained, respectively, and 98.31% and 96.81% on ImageNet under the same conditions, outperforming existing SOTAs by 4.70% to 25.46%.
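The core idea in the abstract (an auxiliary model tends to give low confidence to the primal model's predicted label on AEs, and high confidence on NEs) can be illustrated with a small sketch. This is not the authors' implementation: the scoring rule (one minus the auxiliary softmax probability of the primal prediction), the function names `inconsistency_scores` and `flag_adversarial`, the threshold of 0.5, and the toy logits are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def inconsistency_scores(primal_logits, aux_logits):
    """Hypothetical score: 1 - auxiliary confidence in the primal model's label."""
    primal_pred = primal_logits.argmax(axis=1)           # label chosen by the primal model
    aux_probs = softmax(aux_logits)                      # auxiliary model's class probabilities
    aux_conf = aux_probs[np.arange(len(primal_pred)), primal_pred]
    return 1.0 - aux_conf

def flag_adversarial(primal_logits, aux_logits, threshold=0.5):
    """Flag samples whose inconsistency score exceeds an assumed threshold."""
    return inconsistency_scores(primal_logits, aux_logits) > threshold

# Toy logits for two samples and three classes (made up for this example).
primal_logits = np.array([[4.0, 0.0, 0.0],    # primal predicts class 0
                          [0.0, 5.0, 0.0]])   # primal predicts class 1
aux_logits = np.array([[3.0, 0.0, 0.0],       # auxiliary agrees with class 0
                       [4.0, 0.0, 0.0]])      # auxiliary prefers class 0, not class 1

scores = inconsistency_scores(primal_logits, aux_logits)
flags = flag_adversarial(primal_logits, aux_logits)
print(scores, flags)
```

In this toy run the first sample gets a low score because both models back the same label, while the second gets a high score and is flagged. In practice a threshold would be chosen on held-out data rather than fixed in advance.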
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2506.03765, https://arxiv.org/pdf/2506.03765
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416073812
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416073812 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2506.03765 (Digital Object Identifier)
- Title: Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-06-04 (full publication date if available)
- Authors: Sicong Han, Chenhao Lin, Zhengyu Zhao, Xiyuan Wang, Xinlei He, Qian Li, Cong Wang, Qian Wang, Chao Shen (list of authors in order)
- Landing page: https://arxiv.org/abs/2506.03765 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2506.03765 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2506.03765 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4416073812 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2506.03765 |
| ids.doi | https://doi.org/10.48550/arxiv.2506.03765 |
| ids.openalex | https://openalex.org/W4416073812 |
| fwci | |
| type | preprint |
| title | Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2506.03765 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by-nc-sa |
| locations[0].pdf_url | https://arxiv.org/pdf/2506.03765 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by-nc-sa |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2506.03765 |
| locations[1].id | doi:10.48550/arxiv.2506.03765 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2506.03765 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5010311797 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8459-4701 |
| authorships[0].author.display_name | Sicong Han |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Han, Sicong |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5077967584 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-9327-9260 |
| authorships[1].author.display_name | Chenhao Lin |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Lin, Chenhao |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101795752 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1678-9694 |
| authorships[2].author.display_name | Zhengyu Zhao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Zhao, Zhengyu |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5036688249 |
| authorships[3].author.orcid | https://orcid.org/0009-0008-1839-2010 |
| authorships[3].author.display_name | Xiyuan Wang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Wang, Xiyuan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5044579853 |
| authorships[4].author.orcid | https://orcid.org/0009-0007-3879-9080 |
| authorships[4].author.display_name | Xinlei He |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | He, Xinlei |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5100340635 |
| authorships[5].author.orcid | https://orcid.org/0000-0002-9530-4925 |
| authorships[5].author.display_name | Qian Li |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Li, Qian |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5100390483 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-4539-2525 |
| authorships[6].author.display_name | Cong Wang |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Wang, Cong |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5100391116 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-8967-8525 |
| authorships[7].author.display_name | Qian Wang |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Wang, Qian |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5064997871 |
| authorships[8].author.orcid | https://orcid.org/0000-0003-4147-4934 |
| authorships[8].author.display_name | Chao Shen |
| authorships[8].author_position | last |
| authorships[8].raw_author_name | Shen, Chao |
| authorships[8].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2506.03765 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T09:53:01.849860 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2506.03765 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by-nc-sa |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2506.03765 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-nc-sa |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2506.03765 |
| primary_location.id | pmh:oai:arXiv.org:2506.03765 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by-nc-sa |
| primary_location.pdf_url | https://arxiv.org/pdf/2506.03765 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by-nc-sa |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2506.03765 |
| publication_date | 2025-06-04 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.1 | 149 |
| abstract_inverted_index.3 | 144, 146 |
| abstract_inverted_index.a | 105 |
| abstract_inverted_index.In | 53 |
| abstract_inverted_index.an | 59 |
| abstract_inverted_index.by | 7, 116, 188 |
| abstract_inverted_index.in | 67 |
| abstract_inverted_index.is | 128, 169 |
| abstract_inverted_index.of | 159 |
| abstract_inverted_index.on | 83, 91, 96, 163, 179 |
| abstract_inverted_index.or | 70 |
| abstract_inverted_index.to | 24, 29, 42, 74, 78, 111 |
| abstract_inverted_index.we | 56, 99 |
| abstract_inverted_index.AEs | 113 |
| abstract_inverted_index.AUC | 157 |
| abstract_inverted_index.NEs | 115 |
| abstract_inverted_index.PID | 127, 154 |
| abstract_inverted_index.and | 38, 49, 107, 124, 133, 138, 148, 161, 171, 175, 177 |
| abstract_inverted_index.low | 76 |
| abstract_inverted_index.the | 64, 79, 118, 122, 166, 182 |
| abstract_inverted_index.both | 47, 131 |
| abstract_inverted_index.four | 140 |
| abstract_inverted_index.from | 4, 18, 63, 114 |
| abstract_inverted_index.high | 89 |
| abstract_inverted_index.same | 183 |
| abstract_inverted_index.test | 10 |
| abstract_inverted_index.than | 34 |
| abstract_inverted_index.that | 58 |
| abstract_inverted_index.they | 39 |
| abstract_inverted_index.this | 54, 97 |
| abstract_inverted_index.weak | 19 |
| abstract_inverted_index.when | 27, 165 |
| abstract_inverted_index.with | 130 |
| abstract_inverted_index.Based | 95 |
| abstract_inverted_index.SOTAs | 187 |
| abstract_inverted_index.mixed | 150 |
| abstract_inverted_index.model | 66, 71, 168 |
| abstract_inverted_index.often | 16 |
| abstract_inverted_index.ones, | 37 |
| abstract_inverted_index.tends | 23, 73 |
| abstract_inverted_index.their | 21 |
| abstract_inverted_index.under | 181 |
| abstract_inverted_index.while | 87 |
| abstract_inverted_index.work, | 55 |
| abstract_inverted_index.(AEs), | 86 |
| abstract_inverted_index.(NEs). | 94 |
| abstract_inverted_index.(PID), | 104 |
| abstract_inverted_index.96.81% | 178 |
| abstract_inverted_index.98.31% | 176 |
| abstract_inverted_index.across | 46, 143 |
| abstract_inverted_index.assign | 75 |
| abstract_inverted_index.attack | 51 |
| abstract_inverted_index.model, | 61 |
| abstract_inverted_index.models | 3, 32, 137 |
| abstract_inverted_index.normal | 92 |
| abstract_inverted_index.primal | 123, 136, 167 |
| abstract_inverted_index.rather | 33 |
| abstract_inverted_index.scores | 158 |
| abstract_inverted_index.suffer | 17 |
| abstract_inverted_index.99.29\% | 160 |
| abstract_inverted_index.99.30\% | 162 |
| abstract_inverted_index.achieve | 43 |
| abstract_inverted_index.applied | 28 |
| abstract_inverted_index.attacks | 6 |
| abstract_inverted_index.average | 156 |
| abstract_inverted_index.between | 121 |
| abstract_inverted_index.current | 13 |
| abstract_inverted_index.degrade | 25 |
| abstract_inverted_index.methods | 15, 142 |
| abstract_inverted_index.model's | 81 |
| abstract_inverted_index.models. | 126 |
| abstract_inverted_index.observe | 57 |
| abstract_inverted_index.primary | 65, 80 |
| abstract_inverted_index.propose | 100 |
| abstract_inverted_index.trained | 31, 36, 135 |
| abstract_inverted_index.CIFAR-10 | 164 |
| abstract_inverted_index.Detector | 103 |
| abstract_inverted_index.However, | 12 |
| abstract_inverted_index.ImageNet | 180 |
| abstract_inverted_index.achieves | 155 |
| abstract_inverted_index.attacks. | 152 |
| abstract_inverted_index.examples | 85, 93 |
| abstract_inverted_index.existing | 186 |
| abstract_inverted_index.protects | 2 |
| abstract_inverted_index.refusing | 8 |
| abstract_inverted_index.samples. | 11 |
| abstract_inverted_index.strategy | 69 |
| abstract_inverted_index.struggle | 41 |
| abstract_inverted_index.trained, | 173 |
| abstract_inverted_index.training | 68 |
| abstract_inverted_index.auxiliary | 60, 125 |
| abstract_inverted_index.black-box | 50 |
| abstract_inverted_index.capturing | 117 |
| abstract_inverted_index.detection | 1, 14, 109, 141 |
| abstract_inverted_index.differing | 62 |
| abstract_inverted_index.framework | 110 |
| abstract_inverted_index.generally | 40 |
| abstract_inverted_index.naturally | 35, 132, 170 |
| abstract_inverted_index.settings. | 52 |
| abstract_inverted_index.white-box | 48 |
| abstract_inverted_index.Prediction | 101 |
| abstract_inverted_index.black-box, | 147 |
| abstract_inverted_index.compatible | 129 |
| abstract_inverted_index.confidence | 77, 90 |
| abstract_inverted_index.consistent | 44 |
| abstract_inverted_index.discovery, | 98 |
| abstract_inverted_index.prediction | 119 |
| abstract_inverted_index.preserving | 88 |
| abstract_inverted_index.suspicious | 9 |
| abstract_inverted_index.white-box, | 145 |
| abstract_inverted_index.Adversarial | 0 |
| abstract_inverted_index.adversarial | 5, 84, 151 |
| abstract_inverted_index.conditions, | 184 |
| abstract_inverted_index.distinguish | 112 |
| abstract_inverted_index.lightweight | 106 |
| abstract_inverted_index.outperforms | 139 |
| abstract_inverted_index.predictions | 82 |
| abstract_inverted_index.Inconsistency | 102 |
| abstract_inverted_index.Specifically, | 153 |
| abstract_inverted_index.adversarially | 30, 134, 172 |
| abstract_inverted_index.architecture, | 72 |
| abstract_inverted_index.effectiveness | 22, 45 |
| abstract_inverted_index.generalizable | 108 |
| abstract_inverted_index.inconsistency | 120 |
| abstract_inverted_index.outperforming | 185 |
| abstract_inverted_index.respectively, | 174 |
| abstract_inverted_index.significantly | 26 |
| abstract_inverted_index.generalization: | 20 |
| abstract_inverted_index.4.70%$\sim$25.46%. | 189 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 9 |
| citation_normalized_percentile | |
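
The `abstract_inverted_index.*` rows above map each word to the positions at which it occurs in the abstract. As a minimal sketch, the text can be rebuilt by placing every word at its positions and joining in order. The function name `rebuild_abstract` is hypothetical, and the `sample` dictionary is a hand-picked subset of the rows above, limited to positions 0 through 11 (the first sentence), so later occurrences of words such as "detection" and "from" are deliberately left out.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} mapping by sorting on position."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))

# Subset of the payload rows above, covering only the first sentence.
sample = {
    "Adversarial": [0],
    "detection": [1],
    "protects": [2],
    "models": [3],
    "from": [4],
    "adversarial": [5],
    "attacks": [6],
    "by": [7],
    "refusing": [8],
    "suspicious": [9],
    "test": [10],
    "samples.": [11],
}

first_sentence = rebuild_abstract(sample)
print(first_sentence)
```

Feeding in the full set of rows, with each comma-separated cell parsed into a list of integers, would yield the complete abstract in the same way.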