Consistent Sparse Deep Learning: Theory and Computation
Deep learning has been the engine powering many successes of data science. However, the deep neural network (DNN), as the basic model of deep learning, is often excessively over-parameterized, causing many difficulties in training, prediction and interpretation. We propose a frequentist-like method for learning sparse DNNs and justify its consistency under the Bayesian framework: the proposed method can learn a sparse DNN with at most O(n/\log(n)) connections and nice theoretical guarantees such as posterior consistency, variable selection consistency and asymptotically optimal generalization bounds. In particular, we establish posterior consistency for the sparse DNN with a mixture Gaussian prior, show that the structure of the sparse DNN can be consistently determined using a Laplace approximation-based marginal posterior inclusion probability approach, and use Bayesian evidence to elicit sparse DNNs learned by an optimization method such as stochastic gradient descent in multiple runs with different initializations. The proposed method is computationally more efficient than standard Bayesian methods for large-scale sparse DNNs. The numerical results indicate that the proposed method can perform very well for large-scale network compression and high-dimensional nonlinear variable selection, both advancing interpretable machine learning. This talk is based on joint work with Yan Sun and Qifan Song.
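The structure-selection idea in the abstract can be illustrated with a small sketch. It assumes a two-component mixture Gaussian prior on each weight, a narrow "spike" N(0, sigma0^2) and a wide "slab" N(0, sigma1^2) with slab weight lam, and it evaluates the inclusion probability at a point estimate of each weight as a simplified stand-in for the Laplace approximation-based marginal probability. The function names, the hyperparameter values and the 0.5 cutoff are illustrative assumptions, not details taken from the talk.

```python
import math


def log_gaussian(w, sigma):
    """Log density of N(0, sigma^2) evaluated at w."""
    return -0.5 * (w / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def inclusion_probability(w, lam, sigma0, sigma1):
    """Probability that weight w came from the slab rather than the spike.

    Assumed prior: lam * N(0, sigma1^2) + (1 - lam) * N(0, sigma0^2),
    with sigma0 < sigma1. Computed in the log domain for stability.
    """
    log_slab = math.log(lam) + log_gaussian(w, sigma1)
    log_spike = math.log(1.0 - lam) + log_gaussian(w, sigma0)
    # Cap the exponent so math.exp cannot overflow for extreme settings.
    diff = min(log_spike - log_slab, 700.0)
    return 1.0 / (1.0 + math.exp(diff))


def sparsify(weights, lam=0.01, sigma0=0.01, sigma1=1.0, cutoff=0.5):
    """Zero out every weight whose inclusion probability is at most cutoff."""
    return [
        w if inclusion_probability(w, lam, sigma0, sigma1) > cutoff else 0.0
        for w in weights
    ]


if __name__ == "__main__":
    # Hypothetical trained weights: two near zero, two clearly non-zero.
    trained = [0.001, -0.5, 0.02, 1.2]
    print(sparsify(trained))
```

Under these assumptions the rule reduces to a magnitude threshold: a connection is kept only when its weight is too large to be plausibly explained by the spike component.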
Related Topics
- Type
- article
- Language
- en
- Landing Page
- http://doi.org/10.11159/icsta21.004
- https://doi.org/10.11159/icsta21.004
- OA Status
- bronze
- Cited By
- 4
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4214642831
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4214642831
- DOI
- https://doi.org/10.11159/icsta21.004
- Title
- Consistent Sparse Deep Learning: Theory and Computation
- Type
- article
- Language
- en
- Publication year
- 2021
- Publication date
- 2021-08-01
- Authors
- Faming Liang
- Landing page
- https://doi.org/10.11159/icsta21.004
- PDF URL
- https://doi.org/10.11159/icsta21.004
- Open access
- Yes
- OA status
- bronze
- OA URL
- https://doi.org/10.11159/icsta21.004
- Concepts
- Computer science, Computation, Artificial intelligence, Deep learning, Machine learning, Algorithm
- Cited by
- 4
- Citations by year (recent)
- 2025: 1, 2023: 1, 2022: 1, 2021: 1
- Related works (count)
- 10
Full payload
| id | https://openalex.org/W4214642831 |
|---|---|
| doi | https://doi.org/10.11159/icsta21.004 |
| ids.doi | https://doi.org/10.11159/icsta21.004 |
| ids.openalex | https://openalex.org/W4214642831 |
| fwci | 0.36785912 |
| type | article |
| title | Consistent Sparse Deep Learning: Theory and Computation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10320 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.17960000038146973 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Neural Networks and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.665755033493042 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C45374587 |
| concepts[1].level | 2 |
| concepts[1].score | 0.5806915760040283 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q12525525 |
| concepts[1].display_name | Computation |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.5550685524940491 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C108583219 |
| concepts[3].level | 2 |
| concepts[3].score | 0.4939735531806946 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[3].display_name | Deep learning |
| concepts[4].id | https://openalex.org/C119857082 |
| concepts[4].level | 1 |
| concepts[4].score | 0.37468481063842773 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[4].display_name | Machine learning |
| concepts[5].id | https://openalex.org/C11413529 |
| concepts[5].level | 1 |
| concepts[5].score | 0.2463456392288208 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[5].display_name | Algorithm |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.665755033493042 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/computation |
| keywords[1].score | 0.5806915760040283 |
| keywords[1].display_name | Computation |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.5550685524940491 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/deep-learning |
| keywords[3].score | 0.4939735531806946 |
| keywords[3].display_name | Deep learning |
| keywords[4].id | https://openalex.org/keywords/machine-learning |
| keywords[4].score | 0.37468481063842773 |
| keywords[4].display_name | Machine learning |
| keywords[5].id | https://openalex.org/keywords/algorithm |
| keywords[5].score | 0.2463456392288208 |
| keywords[5].display_name | Algorithm |
| language | en |
| locations[0].id | doi:10.11159/icsta21.004 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4386872458 |
| locations[0].source.issn | 2562-7767 |
| locations[0].source.type | conference |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 2562-7767 |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Proceedings of the International Conference on Statistics, Theory and Applications (ICSTA ...) |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].license | |
| locations[0].pdf_url | https://doi.org/10.11159/icsta21.004 |
| locations[0].version | publishedVersion |
| locations[0].raw_type | proceedings-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | International Conference on Statistics: Theory and Applications |
| locations[0].landing_page_url | http://doi.org/10.11159/icsta21.004 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5085287370 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-1177-5501 |
| authorships[0].author.display_name | Faming Liang |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I219193219 |
| authorships[0].affiliations[0].raw_affiliation_string | Purdue University, USA |
| authorships[0].institutions[0].id | https://openalex.org/I219193219 |
| authorships[0].institutions[0].ror | https://ror.org/02dqehb95 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I219193219 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Purdue University West Lafayette |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Faming Liang |
| authorships[0].is_corresponding | True |
| authorships[0].raw_affiliation_strings | Purdue University, USA |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.11159/icsta21.004 |
| open_access.oa_status | bronze |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Consistent Sparse Deep Learning: Theory and Computation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10320 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.17960000038146973 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Neural Networks and Applications |
| related_works | https://openalex.org/W2731899572, https://openalex.org/W2961085424, https://openalex.org/W3215138031, https://openalex.org/W4306674287, https://openalex.org/W4286629047, https://openalex.org/W3009238340, https://openalex.org/W2939353110, https://openalex.org/W4321369474, https://openalex.org/W4360585206, https://openalex.org/W4285208911 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2023 |
| counts_by_year[1].cited_by_count | 1 |
| counts_by_year[2].year | 2022 |
| counts_by_year[2].cited_by_count | 1 |
| counts_by_year[3].year | 2021 |
| counts_by_year[3].cited_by_count | 1 |
| locations_count | 1 |
| best_oa_location.id | doi:10.11159/icsta21.004 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4386872458 |
| best_oa_location.source.issn | 2562-7767 |
| best_oa_location.source.type | conference |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | 2562-7767 |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Proceedings of the International Conference on Statistics, Theory and Applications (ICSTA ...) |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://doi.org/10.11159/icsta21.004 |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | proceedings-article |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | International Conference on Statistics: Theory and Applications |
| best_oa_location.landing_page_url | http://doi.org/10.11159/icsta21.004 |
| primary_location.id | doi:10.11159/icsta21.004 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4386872458 |
| primary_location.source.issn | 2562-7767 |
| primary_location.source.type | conference |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 2562-7767 |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Proceedings of the International Conference on Statistics, Theory and Applications (ICSTA ...) |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.license | |
| primary_location.pdf_url | https://doi.org/10.11159/icsta21.004 |
| primary_location.version | publishedVersion |
| primary_location.raw_type | proceedings-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | International Conference on Statistics: Theory and Applications |
| primary_location.landing_page_url | http://doi.org/10.11159/icsta21.004 |
| publication_date | 2021-08-01 |
| publication_year | 2021 |
| referenced_works_count | 0 |
| cited_by_percentile_year.max | 95 |
| cited_by_percentile_year.min | 89 |
| corresponding_author_ids | https://openalex.org/A5085287370 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 1 |
| corresponding_institution_ids | https://openalex.org/I219193219 |
| citation_normalized_percentile.value | 0.62620136 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |