Making Batch Normalization Great in Federated Deep Learning
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2303.06530
Batch Normalization (BN) is widely used in centralized deep learning to improve convergence and generalization. However, in federated learning (FL) with decentralized data, prior work has observed that training with BN could hinder performance and suggested replacing it with Group Normalization (GN). In this paper, we revisit this substitution by expanding the empirical study conducted in prior work. Surprisingly, we find that BN outperforms GN in many FL settings. The exceptions are high-frequency communication and extreme non-IID regimes. We reinvestigate factors that are believed to cause this problem, including the mismatch of BN statistics across clients and the deviation of gradients during local training. We empirically identify a simple practice that could reduce the impacts of these factors while maintaining the strength of BN. Our approach, which we named FIXBN, is fairly easy to implement, without any additional training or communication costs, and performs favorably across a wide range of FL settings. We hope that our study could serve as a valuable reference for future practical usage and theoretical analysis in FL.
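The "mismatch of BN statistics across clients" mentioned in the abstract can be illustrated with a small sketch. This is not the FIXBN procedure, which the abstract does not spell out. It only shows, for one feature and two hypothetical clients with made-up non-IID data, how per-client batch statistics can differ from the statistics of the pooled data. The helper `bn_stats` and the client values are assumptions introduced for illustration, and population variance is assumed for the batch variance.

```python
from statistics import fmean, pvariance


def bn_stats(batch):
    """Mean and (population) variance of one feature over a batch,
    i.e. the statistics a BN layer would normalize with."""
    return fmean(batch), pvariance(batch)


# Hypothetical activations of one feature on two clients (non-IID split).
client_a = [0.0, 1.0, 2.0, 3.0]
client_b = [10.0, 11.0, 12.0, 13.0]

mean_a, var_a = bn_stats(client_a)
mean_b, var_b = bn_stats(client_b)

# Statistics a centralized learner would see on the pooled data.
pooled_mean, pooled_var = bn_stats(client_a + client_b)

# Naively averaging the per-client statistics (equal client sizes).
avg_mean = (mean_a + mean_b) / 2
avg_var = (var_a + var_b) / 2

print(f"client A: mean={mean_a}, var={var_a}")
print(f"client B: mean={mean_b}, var={var_b}")
print(f"pooled:   mean={pooled_mean}, var={pooled_var}")
print(f"averaged: mean={avg_mean}, var={avg_var}")
```

In this toy case the averaged mean matches the pooled mean, but the averaged variance misses the between-client spread, so each client normalizes its activations with statistics that differ from both the other client's and the pooled ones.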
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2303.06530 (PDF: https://arxiv.org/pdf/2303.06530)
- OA Status: green
- Cited By: 4
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4324316636
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4324316636
- DOI: https://doi.org/10.48550/arxiv.2303.06530
- Title: Making Batch Normalization Great in Federated Deep Learning
- Type: preprint
- Language: en
- Publication year: 2023
- Publication date: 2023-03-12
- Authors: Jike Zhong, Hong-You Chen, Wei‐Lun Chao
- Landing page: https://arxiv.org/abs/2303.06530
- PDF URL: https://arxiv.org/pdf/2303.06530
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2303.06530
- Concepts: Normalization (sociology), Computer science, Deep learning, Artificial intelligence, Sociology, Anthropology
- Cited by: 4
- Citations by year (recent): 2025: 2, 2024: 2
- Related works (count): 10
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4324316636 |
| doi | https://doi.org/10.48550/arxiv.2303.06530 |
| ids.doi | https://doi.org/10.48550/arxiv.2303.06530 |
| ids.openalex | https://openalex.org/W4324316636 |
| fwci | |
| type | preprint |
| title | Making Batch Normalization Great in Federated Deep Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10320 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9067000150680542 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Neural Networks and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C136886441 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8065153360366821 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q926129 |
| concepts[0].display_name | Normalization (sociology) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6333236694335938 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C108583219 |
| concepts[2].level | 2 |
| concepts[2].score | 0.4603680372238159 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[2].display_name | Deep learning |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.4291980564594269 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C144024400 |
| concepts[4].level | 0 |
| concepts[4].score | 0.055474668741226196 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21201 |
| concepts[4].display_name | Sociology |
| concepts[5].id | https://openalex.org/C19165224 |
| concepts[5].level | 1 |
| concepts[5].score | 0.0 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q23404 |
| concepts[5].display_name | Anthropology |
| keywords[0].id | https://openalex.org/keywords/normalization |
| keywords[0].score | 0.8065153360366821 |
| keywords[0].display_name | Normalization (sociology) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6333236694335938 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/deep-learning |
| keywords[2].score | 0.4603680372238159 |
| keywords[2].display_name | Deep learning |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.4291980564594269 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/sociology |
| keywords[4].score | 0.055474668741226196 |
| keywords[4].display_name | Sociology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2303.06530 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2303.06530 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2303.06530 |
| locations[1].id | doi:10.48550/arxiv.2303.06530 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2303.06530 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5088198600 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Jike Zhong |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhong, Jike |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5012764388 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-8127-5588 |
| authorships[1].author.display_name | Hong-You Chen |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Chen, Hong-You |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101520942 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1269-7231 |
| authorships[2].author.display_name | Wei‐Lun Chao |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Chao, Wei-Lun |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2303.06530 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Making Batch Normalization Great in Federated Deep Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10320 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9067000150680542 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Neural Networks and Applications |
| related_works | https://openalex.org/W2731899572, https://openalex.org/W3215138031, https://openalex.org/W3009238340, https://openalex.org/W4321369474, https://openalex.org/W4360585206, https://openalex.org/W4285208911, https://openalex.org/W3082895349, https://openalex.org/W4213079790, https://openalex.org/W2248239756, https://openalex.org/W4323565446 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 2 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 2 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2303.06530 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2303.06530 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2303.06530 |
| primary_location.id | pmh:oai:arXiv.org:2303.06530 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2303.06530 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2303.06530 |
| publication_date | 2023-03-12 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 107, 146, 160 |
| abstract_inverted_index.BN | 30, 62, 92 |
| abstract_inverted_index.FL | 67, 150 |
| abstract_inverted_index.GN | 64 |
| abstract_inverted_index.In | 42 |
| abstract_inverted_index.We | 78, 104, 152 |
| abstract_inverted_index.as | 159 |
| abstract_inverted_index.by | 49 |
| abstract_inverted_index.in | 6, 16, 55, 65, 170 |
| abstract_inverted_index.is | 3, 130 |
| abstract_inverted_index.it | 37 |
| abstract_inverted_index.of | 91, 99, 115, 122, 149 |
| abstract_inverted_index.or | 139 |
| abstract_inverted_index.to | 10, 84, 133 |
| abstract_inverted_index.we | 45, 59, 127 |
| abstract_inverted_index.BN. | 123 |
| abstract_inverted_index.FL. | 171 |
| abstract_inverted_index.Our | 124 |
| abstract_inverted_index.The | 69 |
| abstract_inverted_index.and | 13, 34, 74, 96, 142, 167 |
| abstract_inverted_index.any | 136 |
| abstract_inverted_index.are | 71, 82 |
| abstract_inverted_index.for | 163 |
| abstract_inverted_index.has | 25 |
| abstract_inverted_index.our | 155 |
| abstract_inverted_index.the | 51, 89, 97, 113, 120 |
| abstract_inverted_index.(BN) | 2 |
| abstract_inverted_index.(FL) | 19 |
| abstract_inverted_index.deep | 8 |
| abstract_inverted_index.easy | 132 |
| abstract_inverted_index.find | 60 |
| abstract_inverted_index.hope | 153 |
| abstract_inverted_index.many | 66 |
| abstract_inverted_index.that | 27, 61, 81, 110, 154 |
| abstract_inverted_index.this | 43, 47, 86 |
| abstract_inverted_index.used | 5 |
| abstract_inverted_index.wide | 147 |
| abstract_inverted_index.with | 20, 29, 38 |
| abstract_inverted_index.work | 24 |
| abstract_inverted_index.(GN). | 41 |
| abstract_inverted_index.Batch | 0 |
| abstract_inverted_index.Group | 39 |
| abstract_inverted_index.cause | 85 |
| abstract_inverted_index.could | 31, 111, 157 |
| abstract_inverted_index.data, | 22 |
| abstract_inverted_index.local | 102 |
| abstract_inverted_index.named | 128 |
| abstract_inverted_index.prior | 23, 56 |
| abstract_inverted_index.range | 148 |
| abstract_inverted_index.serve | 158 |
| abstract_inverted_index.study | 53, 156 |
| abstract_inverted_index.these | 116 |
| abstract_inverted_index.usage | 166 |
| abstract_inverted_index.which | 126 |
| abstract_inverted_index.while | 118 |
| abstract_inverted_index.work. | 57 |
| abstract_inverted_index.FIXBN, | 129 |
| abstract_inverted_index.across | 94, 145 |
| abstract_inverted_index.costs, | 141 |
| abstract_inverted_index.during | 101 |
| abstract_inverted_index.fairly | 131 |
| abstract_inverted_index.future | 164 |
| abstract_inverted_index.hinder | 32 |
| abstract_inverted_index.paper, | 44 |
| abstract_inverted_index.reduce | 112 |
| abstract_inverted_index.simple | 108 |
| abstract_inverted_index.widely | 4 |
| abstract_inverted_index.clients | 95 |
| abstract_inverted_index.extreme | 75 |
| abstract_inverted_index.factors | 80, 117 |
| abstract_inverted_index.impacts | 114 |
| abstract_inverted_index.improve | 11 |
| abstract_inverted_index.non-IID | 76 |
| abstract_inverted_index.revisit | 46 |
| abstract_inverted_index.without | 135 |
| abstract_inverted_index.However, | 15 |
| abstract_inverted_index.analysis | 169 |
| abstract_inverted_index.believed | 83 |
| abstract_inverted_index.identify | 106 |
| abstract_inverted_index.learning | 9, 18 |
| abstract_inverted_index.mismatch | 90 |
| abstract_inverted_index.observed | 26 |
| abstract_inverted_index.performs | 143 |
| abstract_inverted_index.practice | 109 |
| abstract_inverted_index.problem, | 87 |
| abstract_inverted_index.regimes. | 77 |
| abstract_inverted_index.strength | 121 |
| abstract_inverted_index.training | 28, 138 |
| abstract_inverted_index.valuable | 161 |
| abstract_inverted_index.approach, | 125 |
| abstract_inverted_index.conducted | 54 |
| abstract_inverted_index.deviation | 98 |
| abstract_inverted_index.empirical | 52 |
| abstract_inverted_index.expanding | 50 |
| abstract_inverted_index.favorably | 144 |
| abstract_inverted_index.gradients | 100 |
| abstract_inverted_index.including | 88 |
| abstract_inverted_index.practical | 165 |
| abstract_inverted_index.reference | 162 |
| abstract_inverted_index.replacing | 36 |
| abstract_inverted_index.settings. | 68, 151 |
| abstract_inverted_index.suggested | 35 |
| abstract_inverted_index.training. | 103 |
| abstract_inverted_index.additional | 137 |
| abstract_inverted_index.exceptions | 70 |
| abstract_inverted_index.implement, | 134 |
| abstract_inverted_index.statistics | 93 |
| abstract_inverted_index.convergence | 12 |
| abstract_inverted_index.empirically | 105 |
| abstract_inverted_index.maintaining | 119 |
| abstract_inverted_index.outperforms | 63 |
| abstract_inverted_index.performance | 33 |
| abstract_inverted_index.theoretical | 168 |
| abstract_inverted_index.{federated} | 17 |
| abstract_inverted_index.substitution | 48 |
| abstract_inverted_index.Normalization | 1, 40 |
| abstract_inverted_index.Surprisingly, | 58 |
| abstract_inverted_index.communication | 73, 140 |
| abstract_inverted_index.decentralized | 21 |
| abstract_inverted_index.reinvestigate | 79 |
| abstract_inverted_index.{centralized} | 7 |
| abstract_inverted_index.high-frequency | 72 |
| abstract_inverted_index.generalization. | 14 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.5099999904632568 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile | |
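
The `abstract_inverted_index.*` rows in the payload above store the abstract as a mapping from each word to the zero-based positions where it occurs. A minimal sketch of turning such a mapping back into text follows. The function name `decode_inverted_index` and its `stop` parameter are hypothetical helpers, not part of any OpenAlex client library, and the sample dictionary copies only a handful of rows from the table.

```python
def decode_inverted_index(inverted_index, stop=None):
    """Rebuild text from a {word: [positions]} mapping.

    If `stop` is given, only positions below `stop` are kept, which is
    handy when the mapping is a partial sample of the full index.
    """
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            if stop is None or pos < stop:
                by_position[pos] = word
    # Emit words in position order; gaps are skipped silently.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# A few rows copied from the payload table (word -> positions).
sample = {
    "Batch": [0],
    "Normalization": [1, 40],
    "(BN)": [2],
    "is": [3, 130],
    "widely": [4],
    "used": [5],
    "in": [6, 16, 55, 65, 170],
}

print(decode_inverted_index(sample, stop=7))
```

Passing the full mapping without `stop` would yield the complete abstract in its original word order.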