Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel

2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2304.13274

The large number of ReLU and MAC operations in deep neural networks makes them ill-suited for latency- and compute-efficient private inference. In this paper, we present a model optimization method that allows a model to learn to be shallow. In particular, we leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its preceding and succeeding convolution layers into a shallow block. Unlike existing ReLU reduction methods, our joint reduction method can yield models with improved reduction of both ReLUs and linear operations, by up to 1.73x and 1.47x respectively, when evaluated with ResNet18 on CIFAR-100, without any significant accuracy drop.
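The abstract does not spell out the merge procedure, so the following is only a minimal sketch of the underlying idea, not the authors' implementation. It assumes bias-free, stride-1, unpadded 1D convolutions with small integer weights, and the helper names `conv1d` and `merge_kernels` are hypothetical. It shows that once the ReLU between two convolutions is removed, the two linear layers compose into one convolution with a kernel of size K1 + K2 - 1.

```python
def conv1d(x, w):
    """Valid (no padding, stride 1) multi-channel 1D cross-correlation.

    x: [C_in][L] input, w: [C_out][C_in][K] kernel -> [C_out][L - K + 1].
    """
    k = len(w[0][0])
    length = len(x[0]) - k + 1
    return [
        [
            sum(w_o[c][j] * x[c][t + j] for c in range(len(x)) for j in range(k))
            for t in range(length)
        ]
        for w_o in w
    ]


def merge_kernels(w1, w2):
    """Fold w1 (applied first) and w2 (applied second) into a single kernel.

    Only valid when nothing non-linear (such as a ReLU) sits between them.
    """
    c_in, k1 = len(w1[0]), len(w1[0][0])
    c_mid, k2 = len(w2[0]), len(w2[0][0])
    merged = [[[0] * (k1 + k2 - 1) for _ in range(c_in)] for _ in w2]
    for p, w2_p in enumerate(w2):
        for o in range(c_mid):
            for c in range(c_in):
                for j in range(k2):
                    for k in range(k1):
                        # Taps at offsets j and k land on offset j + k.
                        merged[p][c][j + k] += w2_p[o][j] * w1[o][c][k]
    return merged


# Illustrative data (made up): 2 input channels, 3 middle channels, 2 outputs.
x = [[1, 2, 0, -1, 3, 2], [0, 1, 1, 2, -2, 1]]
w1 = [[[1, 0, -1], [2, 1, 0]], [[0, 1, 1], [-1, 0, 1]], [[1, 1, 1], [0, 0, 2]]]
w2 = [[[1, -1], [0, 2], [1, 0]], [[2, 0], [1, 1], [0, -1]]]

two_step = conv1d(conv1d(x, w1), w2)         # conv -> conv, no ReLU between
one_step = conv1d(x, merge_kernels(w1, w2))  # single merged conv
print(two_step == one_step)
```

With a ReLU left between the two layers, this equality generally fails, which is why the merge is tied to removing the non-linearity. The sketch ignores biases, padding, stride, and batch normalization, all of which a real merge would have to handle.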

Related Topics

Computer Science

Convolutional Neural Network

Deep Learning

Artificial Intelligence

Concepts

Inference, Computer science, Merge (version control), Convolutional neural network, Leverage (statistics), Latency (audio), Reduction (mathematics), Deep learning, Artificial intelligence, Algorithm, Computer engineering, Parallel computing, Mathematics, Telecommunications, Geometry

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2304.13274
PDF: https://arxiv.org/pdf/2304.13274
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4367189850

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4367189850

Canonical identifier for this work in OpenAlex
DOI: https://doi.org/10.48550/arxiv.2304.13274

Digital Object Identifier
Title: Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

Work title
Type: preprint

OpenAlex work type
Language: en

Primary language
Publication year: 2023

Year of publication
Publication date: 2023-04-26

Full publication date if available
Authors: Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel

List of authors in order
Landing page: https://arxiv.org/abs/2304.13274

Publisher landing page
PDF URL: https://arxiv.org/pdf/2304.13274

Direct link to full text PDF
Open access: Yes

Whether a free full text is available
OA status: green

Open access status per OpenAlex
OA URL: https://arxiv.org/pdf/2304.13274

Direct OA link when available
Concepts: Inference, Computer science, Merge (version control), Convolutional neural network, Leverage (statistics), Latency (audio), Reduction (mathematics), Deep learning, Artificial intelligence, Algorithm, Computer engineering, Parallel computing, Mathematics, Telecommunications, Geometry

Top concepts (fields/topics) attached by OpenAlex
Cited by: 0

Total citation count in OpenAlex
Related works (count): 10

Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4367189850
doi	https://doi.org/10.48550/arxiv.2304.13274
ids.doi	https://doi.org/10.48550/arxiv.2304.13274
ids.openalex	https://openalex.org/W4367189850
fwci
type	preprint
title	Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T11689
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.9983000159263611
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1702
topics[0].subfield.display_name	Artificial Intelligence
topics[0].display_name	Adversarial Robustness in Machine Learning
topics[1].id	https://openalex.org/T10036
topics[1].field.id	https://openalex.org/fields/17
topics[1].field.display_name	Computer Science
topics[1].score	0.998199999332428
topics[1].domain.id	https://openalex.org/domains/3
topics[1].domain.display_name	Physical Sciences
topics[1].subfield.id	https://openalex.org/subfields/1707
topics[1].subfield.display_name	Computer Vision and Pattern Recognition
topics[1].display_name	Advanced Neural Network Applications
topics[2].id	https://openalex.org/T10764
topics[2].field.id	https://openalex.org/fields/17
topics[2].field.display_name	Computer Science
topics[2].score	0.9926999807357788
topics[2].domain.id	https://openalex.org/domains/3
topics[2].domain.display_name	Physical Sciences
topics[2].subfield.id	https://openalex.org/subfields/1702
topics[2].subfield.display_name	Artificial Intelligence
topics[2].display_name	Privacy-Preserving Technologies in Data
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C2776214188
concepts[0].level	2
concepts[0].score	0.7930399179458618
concepts[0].wikidata	https://www.wikidata.org/wiki/Q408386
concepts[0].display_name	Inference
concepts[1].id	https://openalex.org/C41008148
concepts[1].level	0
concepts[1].score	0.685430645942688
concepts[1].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[1].display_name	Computer science
concepts[2].id	https://openalex.org/C197129107
concepts[2].level	2
concepts[2].score	0.6567244529724121
concepts[2].wikidata	https://www.wikidata.org/wiki/Q1921621
concepts[2].display_name	Merge (version control)
concepts[3].id	https://openalex.org/C81363708
concepts[3].level	2
concepts[3].score	0.5888683199882507
concepts[3].wikidata	https://www.wikidata.org/wiki/Q17084460
concepts[3].display_name	Convolutional neural network
concepts[4].id	https://openalex.org/C153083717
concepts[4].level	2
concepts[4].score	0.5472861528396606
concepts[4].wikidata	https://www.wikidata.org/wiki/Q6535263
concepts[4].display_name	Leverage (statistics)
concepts[5].id	https://openalex.org/C82876162
concepts[5].level	2
concepts[5].score	0.5199071764945984
concepts[5].wikidata	https://www.wikidata.org/wiki/Q17096504
concepts[5].display_name	Latency (audio)
concepts[6].id	https://openalex.org/C111335779
concepts[6].level	2
concepts[6].score	0.4874575138092041
concepts[6].wikidata	https://www.wikidata.org/wiki/Q3454686
concepts[6].display_name	Reduction (mathematics)
concepts[7].id	https://openalex.org/C108583219
concepts[7].level	2
concepts[7].score	0.471695214509964
concepts[7].wikidata	https://www.wikidata.org/wiki/Q197536
concepts[7].display_name	Deep learning
concepts[8].id	https://openalex.org/C154945302
concepts[8].level	1
concepts[8].score	0.4397997558116913
concepts[8].wikidata	https://www.wikidata.org/wiki/Q11660
concepts[8].display_name	Artificial intelligence
concepts[9].id	https://openalex.org/C11413529
concepts[9].level	1
concepts[9].score	0.3964127004146576
concepts[9].wikidata	https://www.wikidata.org/wiki/Q8366
concepts[9].display_name	Algorithm
concepts[10].id	https://openalex.org/C113775141
concepts[10].level	1
concepts[10].score	0.37626272439956665
concepts[10].wikidata	https://www.wikidata.org/wiki/Q428691
concepts[10].display_name	Computer engineering
concepts[11].id	https://openalex.org/C173608175
concepts[11].level	1
concepts[11].score	0.2473386824131012
concepts[11].wikidata	https://www.wikidata.org/wiki/Q232661
concepts[11].display_name	Parallel computing
concepts[12].id	https://openalex.org/C33923547
concepts[12].level	0
concepts[12].score	0.15751081705093384
concepts[12].wikidata	https://www.wikidata.org/wiki/Q395
concepts[12].display_name	Mathematics
concepts[13].id	https://openalex.org/C76155785
concepts[13].level	1
concepts[13].score	0.08113458752632141
concepts[13].wikidata	https://www.wikidata.org/wiki/Q418
concepts[13].display_name	Telecommunications
concepts[14].id	https://openalex.org/C2524010
concepts[14].level	1
concepts[14].score	0.0
concepts[14].wikidata	https://www.wikidata.org/wiki/Q8087
concepts[14].display_name	Geometry
keywords[0].id	https://openalex.org/keywords/inference
keywords[0].score	0.7930399179458618
keywords[0].display_name	Inference
keywords[1].id	https://openalex.org/keywords/computer-science
keywords[1].score	0.685430645942688
keywords[1].display_name	Computer science
keywords[2].id	https://openalex.org/keywords/merge
keywords[2].score	0.6567244529724121
keywords[2].display_name	Merge (version control)
keywords[3].id	https://openalex.org/keywords/convolutional-neural-network
keywords[3].score	0.5888683199882507
keywords[3].display_name	Convolutional neural network
keywords[4].id	https://openalex.org/keywords/leverage
keywords[4].score	0.5472861528396606
keywords[4].display_name	Leverage (statistics)
keywords[5].id	https://openalex.org/keywords/latency
keywords[5].score	0.5199071764945984
keywords[5].display_name	Latency (audio)
keywords[6].id	https://openalex.org/keywords/reduction
keywords[6].score	0.4874575138092041
keywords[6].display_name	Reduction (mathematics)
keywords[7].id	https://openalex.org/keywords/deep-learning
keywords[7].score	0.471695214509964
keywords[7].display_name	Deep learning
keywords[8].id	https://openalex.org/keywords/artificial-intelligence
keywords[8].score	0.4397997558116913
keywords[8].display_name	Artificial intelligence
keywords[9].id	https://openalex.org/keywords/algorithm
keywords[9].score	0.3964127004146576
keywords[9].display_name	Algorithm
keywords[10].id	https://openalex.org/keywords/computer-engineering
keywords[10].score	0.37626272439956665
keywords[10].display_name	Computer engineering
keywords[11].id	https://openalex.org/keywords/parallel-computing
keywords[11].score	0.2473386824131012
keywords[11].display_name	Parallel computing
keywords[12].id	https://openalex.org/keywords/mathematics
keywords[12].score	0.15751081705093384
keywords[12].display_name	Mathematics
keywords[13].id	https://openalex.org/keywords/telecommunications
keywords[13].score	0.08113458752632141
keywords[13].display_name	Telecommunications
language	en
locations[0].id	pmh:oai:arXiv.org:2304.13274
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2304.13274
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2304.13274
locations[1].id	doi:10.48550/arxiv.2304.13274
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2304.13274
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5087095284
authorships[0].author.orcid	https://orcid.org/0000-0001-5815-8765
authorships[0].author.display_name	Souvik Kundu
authorships[0].author_position	first
authorships[0].raw_author_name	Kundu, Souvik
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5019021341
authorships[1].author.orcid	https://orcid.org/0000-0001-5253-5478
authorships[1].author.display_name	Yuke Zhang
authorships[1].author_position	middle
authorships[1].raw_author_name	Zhang, Yuke
authorships[1].is_corresponding	False
authorships[2].author.id	https://openalex.org/A5022810915
authorships[2].author.orcid	https://orcid.org/0000-0002-8193-7158
authorships[2].author.display_name	Dake Chen
authorships[2].author_position	middle
authorships[2].raw_author_name	Chen, Dake
authorships[2].is_corresponding	False
authorships[3].author.id	https://openalex.org/A5084205024
authorships[3].author.orcid	https://orcid.org/0000-0002-8283-0168
authorships[3].author.display_name	Peter A. Beerel
authorships[3].author_position	last
authorships[3].raw_author_name	Beerel, Peter A.
authorships[3].is_corresponding	False
has_content.pdf	True
has_content.grobid_xml	True
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2304.13274
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2025-10-10T00:00:00
display_name	Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference
has_fulltext	True
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T11689
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.9983000159263611
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1702
primary_topic.subfield.display_name	Artificial Intelligence
primary_topic.display_name	Adversarial Robustness in Machine Learning
related_works	https://openalex.org/W2486541857, https://openalex.org/W4234886518, https://openalex.org/W2389591058, https://openalex.org/W2382112581, https://openalex.org/W3124036233, https://openalex.org/W4229787472, https://openalex.org/W2108840191, https://openalex.org/W4293226380, https://openalex.org/W2759366996, https://openalex.org/W2110679372
cited_by_count	0
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2304.13274
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2304.13274
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2304.13274
primary_location.id	pmh:oai:arXiv.org:2304.13274
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2304.13274
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2304.13274
publication_date	2023-04-26
publication_year	2023
referenced_works_count	0
abstract_inverted_index.a	25, 31, 46, 51, 63
abstract_inverted_index.In	20, 38
abstract_inverted_index.be	36
abstract_inverted_index.by	87
abstract_inverted_index.of	2, 7, 45, 81
abstract_inverted_index.on	97
abstract_inverted_index.to	33, 35, 49, 62, 89
abstract_inverted_index.up	88
abstract_inverted_index.we	23, 40
abstract_inverted_index.MAC	5
abstract_inverted_index.and	4, 16, 54, 58, 84, 91
abstract_inverted_index.any	100
abstract_inverted_index.can	75
abstract_inverted_index.for	14
abstract_inverted_index.its	56
abstract_inverted_index.our	71
abstract_inverted_index.the	42
abstract_inverted_index.Deep	8
abstract_inverted_index.ReLU	3, 43, 52, 68
abstract_inverted_index.both	82
abstract_inverted_index.make	11
abstract_inverted_index.that	29
abstract_inverted_index.them	12
abstract_inverted_index.this	21
abstract_inverted_index.with	78, 95
abstract_inverted_index.1.73x	90
abstract_inverted_index.Large	0
abstract_inverted_index.ReLUs	83
abstract_inverted_index.block	48
abstract_inverted_index.joint	72
abstract_inverted_index.layer	53
abstract_inverted_index.learn	34
abstract_inverted_index.merge	55
abstract_inverted_index.model	26, 32
abstract_inverted_index.yield	76
abstract_inverted_index.1.47x,	92
abstract_inverted_index.Unlike	66
abstract_inverted_index.allows	30
abstract_inverted_index.block.	65
abstract_inverted_index.layers	61
abstract_inverted_index.linear	85
abstract_inverted_index.method	28, 74
abstract_inverted_index.models	77
abstract_inverted_index.neural	9
abstract_inverted_index.number	1
abstract_inverted_index.paper,	22
abstract_inverted_index.remove	50
abstract_inverted_index.latency	15
abstract_inverted_index.present	24
abstract_inverted_index.private	18
abstract_inverted_index.shallow	64
abstract_inverted_index.without	99
abstract_inverted_index.ResNet18	96
abstract_inverted_index.existing	67
abstract_inverted_index.improved	79
abstract_inverted_index.leverage	41
abstract_inverted_index.methods,	70
abstract_inverted_index.networks	10
abstract_inverted_index.shallow.	37
abstract_inverted_index.CIFAR-100	98
abstract_inverted_index.evaluated	94
abstract_inverted_index.preceding	59
abstract_inverted_index.reduction	69, 73, 80
abstract_inverted_index.ill-suited	13
abstract_inverted_index.inference.	19
abstract_inverted_index.operations	6, 86
abstract_inverted_index.succeeding	57
abstract_inverted_index.convolution	60
abstract_inverted_index.particular,	39
abstract_inverted_index.sensitivity	44
abstract_inverted_index.significant	101
abstract_inverted_index.optimization	27
abstract_inverted_index.convolutional	47
abstract_inverted_index.respectively,	93
abstract_inverted_index.accuracy-drop.	102
abstract_inverted_index.compute-efficient	17
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	4
citation_normalized_percentile
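
The abstract_inverted_index rows in the payload map each token of the abstract to the word positions where it occurs. As a minimal sketch, assuming the index has been loaded into a plain dictionary of token to position list, the text can be rebuilt by placing each token at its positions and reading them in order. The function name `rebuild_text` is hypothetical, and the sample holds only the entries for positions 0 to 8, copied from the rows above.

```python
def rebuild_text(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Read the tokens back in position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Truncated sample: only positions 0..8 of the abstract.
sample = {
    "Large": [0],
    "number": [1],
    "of": [2, 7],
    "ReLU": [3],
    "and": [4],
    "MAC": [5],
    "operations": [6],
    "Deep": [8],
}

print(rebuild_text(sample))
# prints: Large number of ReLU and MAC operations of Deep
```

Tokens keep their original punctuation and capitalization (for example "1.47x," and "block."), so joining on single spaces is enough to recover the abstract as written.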