Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2505.18280
Bayesian neural networks (BNNs) treat neural network weights as random variables and perform inference on the weight posterior, with the aim of providing posterior uncertainty estimates and avoiding overfitting. However, selecting appropriate prior distributions remains challenging, and BNNs may suffer from catastrophically inflated variance or poor predictive performance when the priors are poorly chosen. Existing BNN designs place various priors on the weights, but these priors either fail to shrink noisy signals sufficiently or are prone to over-shrinking important signals in the weights. To alleviate this problem, we propose R2D2-Net, which imposes the R^2-induced Dirichlet Decomposition (R2D2) prior on the BNN weights. R2D2-Net can effectively shrink irrelevant coefficients towards zero while preventing key features from over-shrinkage. To approximate the posterior distribution of the weights more accurately, we further propose a variational Gibbs inference algorithm that combines Gibbs updates with gradient-based optimization. This strategy enhances the stability and consistency of estimation when the variational objective involving the shrinkage parameters is non-convex. We also analyze the evidence lower bound (ELBO) and the posterior concentration rates from a theoretical perspective. Experiments on natural and medical image classification and uncertainty estimation tasks demonstrate satisfactory performance of our method.
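To make the shrinkage behaviour described in the abstract more concrete, the following is a minimal sketch that draws a weight vector from an R2D2-style prior. It assumes the normal scale-mixture hierarchy commonly given for the R2D2 prior in linear regression (a Dirichlet vector `phi` that splits a global variance `omega` across coefficients, with `omega` built from two gamma draws); the abstract does not state the exact parameterization used in R2D2-Net, so the function name `sample_r2d2_prior` and the hyperparameters `a_pi`, `b`, and `sigma` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def sample_r2d2_prior(p, a_pi=0.5, b=0.5, sigma=1.0, rng=None):
    """Draw one weight vector of length p from an R2D2-style prior.

    Assumed hierarchy (regression form of the R2D2 prior, not taken
    from the abstract):
        xi          ~ Gamma(shape=b, rate=1)
        omega | xi  ~ Gamma(shape=a, rate=xi), with a = p * a_pi
        phi         ~ Dirichlet(a_pi, ..., a_pi)
        psi_j       ~ Exponential(rate=1/2)
        beta_j      ~ Normal(0, psi_j * phi_j * omega * sigma^2 / 2)
    """
    rng = np.random.default_rng() if rng is None else rng
    a = p * a_pi

    xi = rng.gamma(shape=b, scale=1.0)           # rate 1 -> scale 1
    omega = rng.gamma(shape=a, scale=1.0 / xi)   # rate xi -> scale 1/xi
    phi = rng.dirichlet(np.full(p, a_pi))        # local shares, sum to 1
    psi = rng.exponential(scale=2.0, size=p)     # rate 1/2 -> scale 2

    variance = psi * phi * omega * sigma**2 / 2.0
    beta = rng.normal(loc=0.0, scale=np.sqrt(variance))
    return beta, phi, omega


if __name__ == "__main__":
    beta, phi, omega = sample_r2d2_prior(p=100, rng=np.random.default_rng(0))
    # Most of the total variance is typically assigned to a few coordinates,
    # so many draws sit near zero while a few remain large.
    print("share of variance held by the top 5 coordinates:",
          np.sort(phi)[-5:].sum())
```

Under these assumptions, a small `a_pi` concentrates the Dirichlet mass on few coordinates, which is one way to picture "shrinking irrelevant coefficients while preserving key features".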
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2505.18280 (PDF: https://arxiv.org/pdf/2505.18280)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4414581395
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4414581395 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2505.18280 (Digital Object Identifier)
- Title: Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025
- Publication date: 2025-05-23
- Authors: Tsai Hor Chan, Ding Zhang, Guosheng Yin, Lequan Yu (in order)
- Landing page: https://arxiv.org/abs/2505.18280 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2505.18280 (direct link to full-text PDF)
- Open access: Yes (a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2505.18280 (direct OA link)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4414581395 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2505.18280 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.18280 |
| ids.openalex | https://openalex.org/W4414581395 |
| fwci | |
| type | preprint |
| title | Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10876 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.9607999920845032 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2207 |
| topics[0].subfield.display_name | Control and Systems Engineering |
| topics[0].display_name | Fault Detection and Control Systems |
| topics[1].id | https://openalex.org/T10320 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9585999846458435 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Neural Networks and Applications |
| topics[2].id | https://openalex.org/T11512 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9377999901771545 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Anomaly Detection Techniques and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2505.18280 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2505.18280 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2505.18280 |
| locations[1].id | doi:10.48550/arxiv.2505.18280 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2505.18280 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5073653170 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3545-397X |
| authorships[0].author.display_name | Tsai Hor Chan |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Chan, Tsai Hor |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5073843936 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7906-9497 |
| authorships[1].author.display_name | Ding Zhang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zhang, Dora Yan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5080151722 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-3276-1392 |
| authorships[2].author.display_name | Guosheng Yin |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yin, Guosheng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5012581106 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-9315-6527 |
| authorships[3].author.display_name | Lequan Yu |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Yu, Lequan |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2505.18280 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10876 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.9607999920845032 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2207 |
| primary_topic.subfield.display_name | Control and Systems Engineering |
| primary_topic.display_name | Fault Detection and Control Systems |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2505.18280 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2505.18280 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2505.18280 |
| primary_location.id | pmh:oai:arXiv.org:2505.18280 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2505.18280 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2505.18280 |
| publication_date | 2025-05-23 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 36, 98, 140, 186 |
| abstract_inverted_index.To | 92, 128 |
| abstract_inverted_index.We | 172 |
| abstract_inverted_index.as | 8 |
| abstract_inverted_index.by | 21 |
| abstract_inverted_index.in | 89, 160 |
| abstract_inverted_index.is | 170 |
| abstract_inverted_index.it | 74 |
| abstract_inverted_index.of | 31, 70, 133, 204 |
| abstract_inverted_index.on | 24, 190 |
| abstract_inverted_index.or | 47, 81 |
| abstract_inverted_index.to | 13, 65, 76, 85, 109 |
| abstract_inverted_index.we | 96, 137 |
| abstract_inverted_index.BNN | 60, 111 |
| abstract_inverted_index.The | 113 |
| abstract_inverted_index.aim | 12 |
| abstract_inverted_index.and | 18, 39, 151, 158, 180, 193, 197 |
| abstract_inverted_index.are | 54, 83 |
| abstract_inverted_index.can | 115 |
| abstract_inverted_index.for | 56 |
| abstract_inverted_index.key | 124 |
| abstract_inverted_index.may | 41 |
| abstract_inverted_index.our | 205 |
| abstract_inverted_index.the | 25, 29, 57, 68, 90, 103, 110, 130, 147, 163, 167, 175, 181 |
| abstract_inverted_index.BNNs | 40 |
| abstract_inverted_index.This | 154 |
| abstract_inverted_index.also | 173 |
| abstract_inverted_index.both | 191 |
| abstract_inverted_index.from | 43, 126, 185 |
| abstract_inverted_index.made | 55 |
| abstract_inverted_index.make | 73 |
| abstract_inverted_index.more | 135 |
| abstract_inverted_index.poor | 48, 52 |
| abstract_inverted_index.that | 145 |
| abstract_inverted_index.they | 82 |
| abstract_inverted_index.this | 94 |
| abstract_inverted_index.when | 51, 162 |
| abstract_inverted_index.Gibbs | 142, 148 |
| abstract_inverted_index.apply | 62 |
| abstract_inverted_index.avoid | 19 |
| abstract_inverted_index.bound | 178 |
| abstract_inverted_index.image | 195 |
| abstract_inverted_index.lower | 177 |
| abstract_inverted_index.noisy | 79 |
| abstract_inverted_index.novel | 99 |
| abstract_inverted_index.prior | 33, 108 |
| abstract_inverted_index.prone | 84 |
| abstract_inverted_index.rates | 184 |
| abstract_inverted_index.task, | 38 |
| abstract_inverted_index.tasks | 200 |
| abstract_inverted_index.these | 71 |
| abstract_inverted_index.treat | 4 |
| abstract_inverted_index.which | 11, 101 |
| abstract_inverted_index.while | 67, 122 |
| abstract_inverted_index.zero, | 121 |
| abstract_inverted_index.(BNNs) | 3 |
| abstract_inverted_index.(ELBO) | 179 |
| abstract_inverted_index.(R2D2) | 107 |
| abstract_inverted_index.neural | 1, 5 |
| abstract_inverted_index.priors | 64, 72 |
| abstract_inverted_index.random | 9 |
| abstract_inverted_index.shrink | 78, 117 |
| abstract_inverted_index.suffer | 42 |
| abstract_inverted_index.analyze | 174 |
| abstract_inverted_index.choices | 53 |
| abstract_inverted_index.designs | 61 |
| abstract_inverted_index.further | 138 |
| abstract_inverted_index.imposes | 102 |
| abstract_inverted_index.medical | 194 |
| abstract_inverted_index.method. | 206 |
| abstract_inverted_index.natural | 192 |
| abstract_inverted_index.network | 6 |
| abstract_inverted_index.priors. | 58 |
| abstract_inverted_index.propose | 97, 139 |
| abstract_inverted_index.provide | 14 |
| abstract_inverted_index.remains | 35 |
| abstract_inverted_index.signals | 80, 88 |
| abstract_inverted_index.towards | 120 |
| abstract_inverted_index.weights | 7, 134 |
| abstract_inverted_index.Bayesian | 0 |
| abstract_inverted_index.Existing | 59 |
| abstract_inverted_index.However, | 28 |
| abstract_inverted_index.R2D2-Net | 114 |
| abstract_inverted_index.combines | 146 |
| abstract_inverted_index.enhances | 156 |
| abstract_inverted_index.evidence | 176 |
| abstract_inverted_index.features | 125 |
| abstract_inverted_index.inflated | 45 |
| abstract_inverted_index.networks | 2 |
| abstract_inverted_index.problem, | 95 |
| abstract_inverted_index.strategy | 155 |
| abstract_inverted_index.updating | 149 |
| abstract_inverted_index.variance | 46 |
| abstract_inverted_index.weights, | 66 |
| abstract_inverted_index.weights. | 27, 91, 112 |
| abstract_inverted_index.Dirichlet | 105 |
| abstract_inverted_index.R2D2-Net, | 100 |
| abstract_inverted_index.algorithm | 144 |
| abstract_inverted_index.alleviate | 93 |
| abstract_inverted_index.different | 63 |
| abstract_inverted_index.difficult | 75 |
| abstract_inverted_index.estimates | 17 |
| abstract_inverted_index.important | 87 |
| abstract_inverted_index.inference | 23, 143 |
| abstract_inverted_index.involving | 166 |
| abstract_inverted_index.objective | 165 |
| abstract_inverted_index.posterior | 15, 26, 131, 182 |
| abstract_inverted_index.procedure | 150 |
| abstract_inverted_index.selection | 30 |
| abstract_inverted_index.shrinkage | 168 |
| abstract_inverted_index.stability | 157 |
| abstract_inverted_index.behaviours | 69 |
| abstract_inverted_index.estimation | 161, 199 |
| abstract_inverted_index.irrelevant | 118 |
| abstract_inverted_index.parameters | 169 |
| abstract_inverted_index.performing | 22 |
| abstract_inverted_index.predictive | 49 |
| abstract_inverted_index.preventing | 123 |
| abstract_inverted_index.variables, | 10 |
| abstract_inverted_index.Experiments | 189 |
| abstract_inverted_index.R^2-induced | 104 |
| abstract_inverted_index.accurately, | 136 |
| abstract_inverted_index.appropriate | 32 |
| abstract_inverted_index.approximate | 129 |
| abstract_inverted_index.challenging | 37 |
| abstract_inverted_index.consistency | 159 |
| abstract_inverted_index.demonstrate | 201 |
| abstract_inverted_index.effectively | 116 |
| abstract_inverted_index.non-convex. | 171 |
| abstract_inverted_index.overfitting | 20 |
| abstract_inverted_index.performance | 50, 203 |
| abstract_inverted_index.theoretical | 187 |
| abstract_inverted_index.uncertainty | 16, 198 |
| abstract_inverted_index.variational | 141, 164 |
| abstract_inverted_index.catastrophic | 44 |
| abstract_inverted_index.coefficients | 119 |
| abstract_inverted_index.distribution | 132 |
| abstract_inverted_index.perspective. | 188 |
| abstract_inverted_index.satisfactory | 202 |
| abstract_inverted_index.sufficiently | 77 |
| abstract_inverted_index.Decomposition | 106 |
| abstract_inverted_index.concentration | 183 |
| abstract_inverted_index.distributions | 34 |
| abstract_inverted_index.optimization. | 153 |
| abstract_inverted_index.overshrinking | 86 |
| abstract_inverted_index.classification | 196 |
| abstract_inverted_index.gradient-based | 152 |
| abstract_inverted_index.over-shrinkage. | 127 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
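
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch, assuming the index has been loaded into a plain Python dict of token to list of positions, the text can be rebuilt by placing every token at its positions and joining them in order. The helper name `rebuild_abstract` is hypothetical, and the sample dict below contains only the first eleven positions copied from the table.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Subset of the rows listed in the table (positions 0 to 10 only).
sample_index = {
    "Bayesian": [0],
    "neural": [1, 5],
    "networks": [2],
    "(BNNs)": [3],
    "treat": [4],
    "network": [6],
    "weights": [7],
    "as": [8],
    "random": [9],
    "variables,": [10],
}

if __name__ == "__main__":
    print(rebuild_abstract(sample_index))
    # prints: Bayesian neural networks (BNNs) treat neural network weights as random variables,
```

Applying the same function to the full index should reproduce the abstract shown at the top of the record, with punctuation attached to tokens exactly as stored.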