Privacy-Preserving Model and Preprocessing Verification for Machine Learning
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2501.08236
This paper presents a framework for privacy-preserving verification of machine learning models, focusing on models trained on sensitive data. By integrating Local Differential Privacy (LDP) with model explanations from LIME and SHAP, the framework enables robust verification without compromising individual privacy. It addresses two key tasks: binary classification, to verify whether a target model was trained correctly with the appropriate preprocessing steps, and multi-class classification, to identify specific preprocessing errors. Evaluations on three real-world datasets (Diabetes, Adult, and Student Record) demonstrate that while the ML-based approach is particularly effective in binary tasks, the threshold-based method performs comparably in multi-class tasks. Results indicate that although verification accuracy varies across datasets and noise levels, the framework provides effective detection of preprocessing errors, strong privacy guarantees, and practical applicability for safeguarding sensitive data.
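The abstract names two ingredients, LDP noise and a threshold-based check on model explanations, without giving their details here. The following is only a minimal sketch under stated assumptions, not the authors' method: it assumes the Laplace mechanism on bounded numeric features for LDP, assumes explanation vectors (for example LIME or SHAP attributions) are already available as plain lists of floats, and uses a Euclidean distance threshold. The names `ldp_perturb` and `passes_threshold` are hypothetical.

```python
import math
import random


def ldp_perturb(value, lower, upper, epsilon, rng):
    """Release one bounded numeric value under epsilon-LDP (Laplace mechanism, assumed)."""
    clipped = min(max(value, lower), upper)
    scale = (upper - lower) / epsilon  # sensitivity of a bounded value divided by epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise


def passes_threshold(reference_attr, target_attr, threshold):
    """Hypothetical check: accept if two attribution vectors are within `threshold`."""
    return math.dist(reference_attr, target_attr) <= threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Perturb a row of features that are assumed to be scaled to [0, 1].
    noisy_row = [ldp_perturb(v, 0.0, 1.0, epsilon=1.0, rng=rng) for v in [0.2, 0.7, 0.5]]
    print(noisy_row)
    # Compare attributions of a reference model and a target model.
    print(passes_threshold([0.5, 0.2, 0.3], [0.48, 0.22, 0.30], threshold=0.1))
```

A smaller `epsilon` widens the noise and would make the two attribution vectors drift apart, which is consistent with the abstract's remark that verification accuracy varies with the noise level.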
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2501.08236, https://arxiv.org/pdf/2501.08236
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406449961
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406449961 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2501.08236 (Digital Object Identifier)
- Title: Privacy-Preserving Model and Preprocessing Verification for Machine Learning (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-01-14 (full publication date if available)
- Authors: Wenbiao Li, Anisa Halimi, Xiaoqian Jiang, Jaideep Vaidya, Erman Ayday (list of authors in order)
- Landing page: https://arxiv.org/abs/2501.08236 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2501.08236 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2501.08236 (direct OA link when available)
- Concepts: Computer science, Preprocessor, Data pre-processing, Artificial intelligence, Machine learning, Computer security, Data mining (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4406449961 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2501.08236 |
| ids.doi | https://doi.org/10.48550/arxiv.2501.08236 |
| ids.openalex | https://openalex.org/W4406449961 |
| fwci | |
| type | preprint |
| title | Privacy-Preserving Model and Preprocessing Verification for Machine Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10764 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8256999850273132 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Privacy-Preserving Technologies in Data |
| topics[1].id | https://openalex.org/T12026 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.7075999975204468 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Explainable Artificial Intelligence (XAI) |
| topics[2].id | https://openalex.org/T11614 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.6819999814033508 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1710 |
| topics[2].subfield.display_name | Information Systems |
| topics[2].display_name | Cloud Data Security Solutions |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7491588592529297 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C34736171 |
| concepts[1].level | 2 |
| concepts[1].score | 0.690380334854126 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q918333 |
| concepts[1].display_name | Preprocessor |
| concepts[2].id | https://openalex.org/C10551718 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5345112085342407 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q5227332 |
| concepts[2].display_name | Data pre-processing |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.5150956511497498 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C119857082 |
| concepts[4].level | 1 |
| concepts[4].score | 0.46331787109375 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[4].display_name | Machine learning |
| concepts[5].id | https://openalex.org/C38652104 |
| concepts[5].level | 1 |
| concepts[5].score | 0.401019424200058 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[5].display_name | Computer security |
| concepts[6].id | https://openalex.org/C124101348 |
| concepts[6].level | 1 |
| concepts[6].score | 0.3573465347290039 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[6].display_name | Data mining |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7491588592529297 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/preprocessor |
| keywords[1].score | 0.690380334854126 |
| keywords[1].display_name | Preprocessor |
| keywords[2].id | https://openalex.org/keywords/data-pre-processing |
| keywords[2].score | 0.5345112085342407 |
| keywords[2].display_name | Data pre-processing |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.5150956511497498 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/machine-learning |
| keywords[4].score | 0.46331787109375 |
| keywords[4].display_name | Machine learning |
| keywords[5].id | https://openalex.org/keywords/computer-security |
| keywords[5].score | 0.401019424200058 |
| keywords[5].display_name | Computer security |
| keywords[6].id | https://openalex.org/keywords/data-mining |
| keywords[6].score | 0.3573465347290039 |
| keywords[6].display_name | Data mining |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2501.08236 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2501.08236 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2501.08236 |
| locations[1].id | doi:10.48550/arxiv.2501.08236 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2501.08236 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5034327508 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1674-7988 |
| authorships[0].author.display_name | Wenbiao Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Li, Wenbiao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5014259795 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Anisa Halimi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Halimi, Anisa |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5055458864 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-9933-2205 |
| authorships[2].author.display_name | Xiaoqian Jiang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Jiang, Xiaoqian |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5034878799 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-7420-6947 |
| authorships[3].author.display_name | Jaideep Vaidya |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Vaidya, Jaideep |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5028326739 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-3383-1081 |
| authorships[4].author.display_name | Erman Ayday |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Ayday, Erman |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2501.08236 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Privacy-Preserving Model and Preprocessing Verification for Machine Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10764 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8256999850273132 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Privacy-Preserving Technologies in Data |
| related_works | https://openalex.org/W2989490741, https://openalex.org/W3092506759, https://openalex.org/W2367545121, https://openalex.org/W4248881655, https://openalex.org/W2482165163, https://openalex.org/W3010890513, https://openalex.org/W120741642, https://openalex.org/W138569904, https://openalex.org/W2390914021, https://openalex.org/W2952736244 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2501.08236 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2501.08236 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2501.08236 |
| primary_location.id | pmh:oai:arXiv.org:2501.08236 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2501.08236 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2501.08236 |
| publication_date | 2025-01-14 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 3, 50 |
| abstract_inverted_index.It | 40 |
| abstract_inverted_index.by | 56 |
| abstract_inverted_index.if | 49 |
| abstract_inverted_index.in | 87, 95 |
| abstract_inverted_index.is | 84 |
| abstract_inverted_index.of | 8, 115 |
| abstract_inverted_index.on | 13, 16, 71 |
| abstract_inverted_index.to | 47, 65 |
| abstract_inverted_index.and | 29, 62, 76, 107, 121 |
| abstract_inverted_index.for | 5, 124 |
| abstract_inverted_index.key | 43 |
| abstract_inverted_index.our | 31 |
| abstract_inverted_index.the | 58, 81, 90, 110 |
| abstract_inverted_index.two | 42 |
| abstract_inverted_index.was | 53 |
| abstract_inverted_index.LIME | 28 |
| abstract_inverted_index.This | 0 |
| abstract_inverted_index.from | 27 |
| abstract_inverted_index.that | 79, 100 |
| abstract_inverted_index.with | 24 |
| abstract_inverted_index.(LDP) | 23 |
| abstract_inverted_index.Local | 20 |
| abstract_inverted_index.SHAP, | 30 |
| abstract_inverted_index.data. | 18, 127 |
| abstract_inverted_index.model | 25, 52 |
| abstract_inverted_index.noise | 108 |
| abstract_inverted_index.paper | 1 |
| abstract_inverted_index.three | 72 |
| abstract_inverted_index.while | 80 |
| abstract_inverted_index.Adult, | 75 |
| abstract_inverted_index.across | 105 |
| abstract_inverted_index.binary | 45, 88 |
| abstract_inverted_index.method | 92 |
| abstract_inverted_index.models | 14 |
| abstract_inverted_index.robust | 34 |
| abstract_inverted_index.steps, | 61 |
| abstract_inverted_index.strong | 118 |
| abstract_inverted_index.target | 51 |
| abstract_inverted_index.tasks, | 89 |
| abstract_inverted_index.tasks. | 97 |
| abstract_inverted_index.tasks: | 44 |
| abstract_inverted_index.varies | 104 |
| abstract_inverted_index.verify | 48 |
| abstract_inverted_index.Privacy | 22 |
| abstract_inverted_index.Results | 98 |
| abstract_inverted_index.Student | 77 |
| abstract_inverted_index.enables | 33 |
| abstract_inverted_index.errors, | 117 |
| abstract_inverted_index.errors. | 69 |
| abstract_inverted_index.levels, | 109 |
| abstract_inverted_index.machine | 9 |
| abstract_inverted_index.models, | 11 |
| abstract_inverted_index.privacy | 119 |
| abstract_inverted_index.trained | 15, 54 |
| abstract_inverted_index.without | 36 |
| abstract_inverted_index.ML-based | 82 |
| abstract_inverted_index.accuracy | 103 |
| abstract_inverted_index.although | 101 |
| abstract_inverted_index.applying | 57 |
| abstract_inverted_index.approach | 83 |
| abstract_inverted_index.datasets | 106 |
| abstract_inverted_index.focusing | 12 |
| abstract_inverted_index.identify | 66 |
| abstract_inverted_index.indicate | 99 |
| abstract_inverted_index.learning | 10 |
| abstract_inverted_index.performs | 93 |
| abstract_inverted_index.presents | 2 |
| abstract_inverted_index.privacy. | 39 |
| abstract_inverted_index.provides | 112 |
| abstract_inverted_index.specific | 67 |
| abstract_inverted_index.addresses | 41 |
| abstract_inverted_index.correctly | 55 |
| abstract_inverted_index.detection | 114 |
| abstract_inverted_index.effective | 86, 113 |
| abstract_inverted_index.framework | 4, 32, 111 |
| abstract_inverted_index.practical | 122 |
| abstract_inverted_index.sensitive | 17, 126 |
| abstract_inverted_index.comparably | 94 |
| abstract_inverted_index.individual | 38 |
| abstract_inverted_index.real-world | 73 |
| abstract_inverted_index.Evaluations | 70 |
| abstract_inverted_index.Integrating | 19 |
| abstract_inverted_index.appropriate | 59 |
| abstract_inverted_index.guarantees, | 120 |
| abstract_inverted_index.multi-class | 63, 96 |
| abstract_inverted_index.Differential | 21 |
| abstract_inverted_index.compromising | 37 |
| abstract_inverted_index.explanations | 26 |
| abstract_inverted_index.particularly | 85 |
| abstract_inverted_index.safeguarding | 125 |
| abstract_inverted_index.verification | 7, 35, 102 |
| abstract_inverted_index.applicability | 123 |
| abstract_inverted_index.preprocessing | 60, 68, 116 |
| abstract_inverted_index.classification, | 46, 64 |
| abstract_inverted_index.threshold-based | 91 |
| abstract_inverted_index.Record-demonstrate | 78 |
| abstract_inverted_index.datasets-Diabetes, | 74 |
| abstract_inverted_index.privacy-preserving | 6 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
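The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the zero-based positions where it occurs. As a minimal sketch, assuming the index has been loaded into a Python dict of token to position list, the text can be rebuilt by inverting that map and joining tokens in position order. The function name `rebuild_abstract` is hypothetical, and the sample dict holds only the first eight tokens from the table.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from a token -> list-of-positions mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order; missing positions are skipped.
    return " ".join(by_position[pos] for pos in sorted(by_position))


if __name__ == "__main__":
    # Excerpt of the payload above, limited to positions 0 through 7.
    sample = {
        "This": [0],
        "paper": [1],
        "presents": [2],
        "a": [3],
        "framework": [4],
        "for": [5],
        "privacy-preserving": [6],
        "verification": [7],
    }
    print(rebuild_abstract(sample))
    # prints: This paper presents a framework for privacy-preserving verification
```

Because punctuation stays attached to tokens in the index (for example `tasks:` and `data.`), joining on single spaces is enough to recover readable text.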