Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference
Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2304.13274
The large number of ReLU and MAC operations in deep neural networks makes them ill-suited for latency- and compute-efficient private inference. In this paper, we present a model optimization method that allows a model to learn to be shallow. In particular, we leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its preceding and succeeding convolution layers into a single shallow block. Unlike existing ReLU reduction methods, our joint reduction method can yield models that reduce both ReLUs and linear operations, by up to 1.73x and 1.47x, respectively, evaluated with ResNet18 on CIFAR-100 without any significant accuracy drop.
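The merging step described in the abstract relies on a general fact: once the non-linearity between two convolutions is removed, the two linear layers compose into one. The following is a minimal sketch of that fact in one dimension, not the authors' implementation. The helper names `conv1d`, `relu`, and `merge_kernels` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def conv1d(signal, kernel):
    """Full 1-D convolution of two sequences (output length n + m - 1)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def relu(values):
    """Element-wise ReLU."""
    return [max(v, 0.0) for v in values]


def merge_kernels(k1, k2):
    """Fold two back-to-back linear kernels into one equivalent kernel."""
    return conv1d(k1, k2)


def all_close(a, b, tol=1e-9):
    """True when both sequences have equal length and near-equal entries."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


# Illustrative input and kernels (assumed values, not from the paper).
x = [1.0, -2.0, 3.0, 0.5, -1.0]
k1 = [0.5, 1.0, -0.5]
k2 = [1.0, 0.0, 2.0]

# Two linear layers applied in sequence, with no ReLU in between.
two_layers = conv1d(conv1d(x, k1), k2)

# One layer using the merged kernel (length 3 + 3 - 1 = 5).
merged = merge_kernels(k1, k2)
one_layer = conv1d(x, merged)

# With a ReLU between the layers, the merged kernel is no longer equivalent.
with_relu = conv1d(relu(conv1d(x, k1)), k2)

print(all_close(two_layers, one_layer))
print(all_close(with_relu, one_layer))
```

The merged kernel is wider than either original kernel, so in this sketch merging removes one layer of depth but does not by itself guarantee fewer multiply-accumulate operations. How the paper obtains its reported MAC reduction is not covered by this example.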
Metadata
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2304.13274, https://arxiv.org/pdf/2304.13274
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4367189850
All OpenAlex metadata
- OpenAlex ID: https://openalex.org/W4367189850 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2304.13274 (Digital Object Identifier)
- Title: Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-04-26 (full publication date if available)
- Authors: Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel (list of authors in order)
- Landing page: https://arxiv.org/abs/2304.13274 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2304.13274 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2304.13274 (direct OA link when available)
- Concepts: Inference, Computer science, Merge (version control), Convolutional neural network, Leverage (statistics), Latency (audio), Reduction (mathematics), Deep learning, Artificial intelligence, Algorithm, Computer engineering, Parallel computing, Mathematics, Telecommunications, Geometry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4367189850 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2304.13274 |
| ids.doi | https://doi.org/10.48550/arxiv.2304.13274 |
| ids.openalex | https://openalex.org/W4367189850 |
| fwci | |
| type | preprint |
| title | Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11689 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9983000159263611 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Adversarial Robustness in Machine Learning |
| topics[1].id | https://openalex.org/T10036 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.998199999332428 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Neural Network Applications |
| topics[2].id | https://openalex.org/T10764 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9926999807357788 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Privacy-Preserving Technologies in Data |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2776214188 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7930399179458618 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[0].display_name | Inference |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.685430645942688 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C197129107 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6567244529724121 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1921621 |
| concepts[2].display_name | Merge (version control) |
| concepts[3].id | https://openalex.org/C81363708 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5888683199882507 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q17084460 |
| concepts[3].display_name | Convolutional neural network |
| concepts[4].id | https://openalex.org/C153083717 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5472861528396606 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q6535263 |
| concepts[4].display_name | Leverage (statistics) |
| concepts[5].id | https://openalex.org/C82876162 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5199071764945984 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q17096504 |
| concepts[5].display_name | Latency (audio) |
| concepts[6].id | https://openalex.org/C111335779 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4874575138092041 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q3454686 |
| concepts[6].display_name | Reduction (mathematics) |
| concepts[7].id | https://openalex.org/C108583219 |
| concepts[7].level | 2 |
| concepts[7].score | 0.471695214509964 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[7].display_name | Deep learning |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.4397997558116913 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C11413529 |
| concepts[9].level | 1 |
| concepts[9].score | 0.3964127004146576 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[9].display_name | Algorithm |
| concepts[10].id | https://openalex.org/C113775141 |
| concepts[10].level | 1 |
| concepts[10].score | 0.37626272439956665 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q428691 |
| concepts[10].display_name | Computer engineering |
| concepts[11].id | https://openalex.org/C173608175 |
| concepts[11].level | 1 |
| concepts[11].score | 0.2473386824131012 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q232661 |
| concepts[11].display_name | Parallel computing |
| concepts[12].id | https://openalex.org/C33923547 |
| concepts[12].level | 0 |
| concepts[12].score | 0.15751081705093384 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[12].display_name | Mathematics |
| concepts[13].id | https://openalex.org/C76155785 |
| concepts[13].level | 1 |
| concepts[13].score | 0.08113458752632141 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q418 |
| concepts[13].display_name | Telecommunications |
| concepts[14].id | https://openalex.org/C2524010 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[14].display_name | Geometry |
| keywords[0].id | https://openalex.org/keywords/inference |
| keywords[0].score | 0.7930399179458618 |
| keywords[0].display_name | Inference |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.685430645942688 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/merge |
| keywords[2].score | 0.6567244529724121 |
| keywords[2].display_name | Merge (version control) |
| keywords[3].id | https://openalex.org/keywords/convolutional-neural-network |
| keywords[3].score | 0.5888683199882507 |
| keywords[3].display_name | Convolutional neural network |
| keywords[4].id | https://openalex.org/keywords/leverage |
| keywords[4].score | 0.5472861528396606 |
| keywords[4].display_name | Leverage (statistics) |
| keywords[5].id | https://openalex.org/keywords/latency |
| keywords[5].score | 0.5199071764945984 |
| keywords[5].display_name | Latency (audio) |
| keywords[6].id | https://openalex.org/keywords/reduction |
| keywords[6].score | 0.4874575138092041 |
| keywords[6].display_name | Reduction (mathematics) |
| keywords[7].id | https://openalex.org/keywords/deep-learning |
| keywords[7].score | 0.471695214509964 |
| keywords[7].display_name | Deep learning |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.4397997558116913 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/algorithm |
| keywords[9].score | 0.3964127004146576 |
| keywords[9].display_name | Algorithm |
| keywords[10].id | https://openalex.org/keywords/computer-engineering |
| keywords[10].score | 0.37626272439956665 |
| keywords[10].display_name | Computer engineering |
| keywords[11].id | https://openalex.org/keywords/parallel-computing |
| keywords[11].score | 0.2473386824131012 |
| keywords[11].display_name | Parallel computing |
| keywords[12].id | https://openalex.org/keywords/mathematics |
| keywords[12].score | 0.15751081705093384 |
| keywords[12].display_name | Mathematics |
| keywords[13].id | https://openalex.org/keywords/telecommunications |
| keywords[13].score | 0.08113458752632141 |
| keywords[13].display_name | Telecommunications |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2304.13274 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2304.13274 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2304.13274 |
| locations[1].id | doi:10.48550/arxiv.2304.13274 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2304.13274 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5087095284 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5815-8765 |
| authorships[0].author.display_name | Souvik Kundu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Kundu, Souvik |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5019021341 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-5253-5478 |
| authorships[1].author.display_name | Yuke Zhang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhang, Yuke |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5022810915 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-8193-7158 |
| authorships[2].author.display_name | Dake Chen |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Chen, Dake |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5084205024 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-8283-0168 |
| authorships[3].author.display_name | Peter A. Beerel |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Beerel, Peter A. |
| authorships[3].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2304.13274 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11689 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9983000159263611 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Adversarial Robustness in Machine Learning |
| related_works | https://openalex.org/W2486541857, https://openalex.org/W4234886518, https://openalex.org/W2389591058, https://openalex.org/W2382112581, https://openalex.org/W3124036233, https://openalex.org/W4229787472, https://openalex.org/W2108840191, https://openalex.org/W4293226380, https://openalex.org/W2759366996, https://openalex.org/W2110679372 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2304.13274 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2304.13274 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2304.13274 |
| primary_location.id | pmh:oai:arXiv.org:2304.13274 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2304.13274 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2304.13274 |
| publication_date | 2023-04-26 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 25, 31, 46, 51, 63 |
| abstract_inverted_index.In | 20, 38 |
| abstract_inverted_index.be | 36 |
| abstract_inverted_index.by | 87 |
| abstract_inverted_index.of | 2, 7, 45, 81 |
| abstract_inverted_index.on | 97 |
| abstract_inverted_index.to | 33, 35, 49, 62, 89 |
| abstract_inverted_index.up | 88 |
| abstract_inverted_index.we | 23, 40 |
| abstract_inverted_index.MAC | 5 |
| abstract_inverted_index.and | 4, 16, 54, 58, 84, 91 |
| abstract_inverted_index.any | 100 |
| abstract_inverted_index.can | 75 |
| abstract_inverted_index.for | 14 |
| abstract_inverted_index.its | 56 |
| abstract_inverted_index.our | 71 |
| abstract_inverted_index.the | 42 |
| abstract_inverted_index.Deep | 8 |
| abstract_inverted_index.ReLU | 3, 43, 52, 68 |
| abstract_inverted_index.both | 82 |
| abstract_inverted_index.make | 11 |
| abstract_inverted_index.that | 29 |
| abstract_inverted_index.them | 12 |
| abstract_inverted_index.this | 21 |
| abstract_inverted_index.with | 78, 95 |
| abstract_inverted_index.1.73x | 90 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.ReLUs | 83 |
| abstract_inverted_index.block | 48 |
| abstract_inverted_index.joint | 72 |
| abstract_inverted_index.layer | 53 |
| abstract_inverted_index.learn | 34 |
| abstract_inverted_index.merge | 55 |
| abstract_inverted_index.model | 26, 32 |
| abstract_inverted_index.yield | 76 |
| abstract_inverted_index.1.47x, | 92 |
| abstract_inverted_index.Unlike | 66 |
| abstract_inverted_index.allows | 30 |
| abstract_inverted_index.block. | 65 |
| abstract_inverted_index.layers | 61 |
| abstract_inverted_index.linear | 85 |
| abstract_inverted_index.method | 28, 74 |
| abstract_inverted_index.models | 77 |
| abstract_inverted_index.neural | 9 |
| abstract_inverted_index.number | 1 |
| abstract_inverted_index.paper, | 22 |
| abstract_inverted_index.remove | 50 |
| abstract_inverted_index.latency | 15 |
| abstract_inverted_index.present | 24 |
| abstract_inverted_index.private | 18 |
| abstract_inverted_index.shallow | 64 |
| abstract_inverted_index.without | 99 |
| abstract_inverted_index.ResNet18 | 96 |
| abstract_inverted_index.existing | 67 |
| abstract_inverted_index.improved | 79 |
| abstract_inverted_index.leverage | 41 |
| abstract_inverted_index.methods, | 70 |
| abstract_inverted_index.networks | 10 |
| abstract_inverted_index.shallow. | 37 |
| abstract_inverted_index.CIFAR-100 | 98 |
| abstract_inverted_index.evaluated | 94 |
| abstract_inverted_index.preceding | 59 |
| abstract_inverted_index.reduction | 69, 73, 80 |
| abstract_inverted_index.ill-suited | 13 |
| abstract_inverted_index.inference. | 19 |
| abstract_inverted_index.operations | 6, 86 |
| abstract_inverted_index.succeeding | 57 |
| abstract_inverted_index.convolution | 60 |
| abstract_inverted_index.particular, | 39 |
| abstract_inverted_index.sensitivity | 44 |
| abstract_inverted_index.significant | 101 |
| abstract_inverted_index.optimization | 27 |
| abstract_inverted_index.convolutional | 47 |
| abstract_inverted_index.respectively, | 93 |
| abstract_inverted_index.accuracy-drop. | 102 |
| abstract_inverted_index.compute-efficient | 17 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
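
The `abstract_inverted_index.*` rows above map each word of the abstract to the positions where it occurs. A minimal sketch of how such an index can be turned back into running text follows. The function name `rebuild_abstract` is hypothetical, and the sample dictionary holds only the entries for positions 0 through 8 copied from the table, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Emit words in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Subset of the table rows above, restricted to positions 0-8.
sample_index = {
    "Large": [0],
    "number": [1],
    "of": [2, 7],
    "ReLU": [3],
    "and": [4],
    "MAC": [5],
    "operations": [6],
    "Deep": [8],
}

opening_words = rebuild_abstract(sample_index)
print(opening_words)
```

Because punctuation is stored as part of each token in the table (for example `paper,` and `block.`), joining on single spaces should be enough to recover the abstract as published when the full index is supplied.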