Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2305.16891
Recently, significant progress has been made in understanding the generalization of neural networks (NNs) trained by gradient descent (GD) using the algorithmic stability approach. However, most of the existing research has focused on one-hidden-layer NNs and has not addressed the impact of different network scaling parameters. In this paper, we greatly extend the previous work \cite{lei2022stability,richards2021stability} by conducting a comprehensive stability and generalization analysis of GD for multi-layer NNs. For two-layer NNs, our results are established under general network scaling parameters, relaxing previous conditions. In the case of three-layer NNs, our technical contribution lies in demonstrating its nearly co-coercive property by utilizing a novel induction strategy that thoroughly explores the effects of over-parameterization. As a direct application of our general findings, we derive the excess risk rate of $O(1/\sqrt{n})$ for GD algorithms in both two-layer and three-layer NNs. This sheds light on sufficient or necessary conditions for under-parameterized and over-parameterized NNs trained by GD to attain the desired risk rate of $O(1/\sqrt{n})$. Moreover, we demonstrate that as the scaling parameter increases or the network complexity decreases, less over-parameterization is required for GD to achieve the desired error rates. Additionally, under a low-noise condition, we obtain a fast risk rate of $O(1/n)$ for GD in both two-layer and three-layer NNs.
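The abstract refers to a network scaling parameter and an excess risk rate without writing either down. The block below is a minimal sketch of one common way to formalize them. The notation is an assumption chosen for illustration and is not taken from the abstract: $m$ is the hidden width, $c$ the scaling parameter, $\sigma$ the activation, $a_k$ the output weights, $\eta$ the step size, $L$ the population risk, $L_S$ the empirical risk on the sample $S$ of size $n$, and $W^*$ a fixed reference model that does not depend on $S$.

```latex
% Two-layer network with scaling parameter c, trained by gradient descent
% (illustrative notation, assumed rather than quoted from the paper).
f_W(x) = \frac{1}{m^{c}} \sum_{k=1}^{m} a_k \, \sigma(w_k^\top x),
\qquad
W_{t+1} = W_t - \eta \, \nabla L_S(W_t).

% Excess risk of the GD output W_T, using E[L_S(W^*)] = L(W^*) for fixed W^*:
\mathbb{E}[L(W_T)] - L(W^*)
  = \underbrace{\mathbb{E}\bigl[L(W_T) - L_S(W_T)\bigr]}_{\text{generalization gap}}
  + \underbrace{\mathbb{E}\bigl[L_S(W_T) - L_S(W^*)\bigr]}_{\text{optimization error}}.
```

Under this reading, the algorithmic stability approach bounds the first term and an optimization analysis bounds the second; the stated $O(1/\sqrt{n})$ and $O(1/n)$ rates apply to the left-hand side.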
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2305.16891 (PDF: https://arxiv.org/pdf/2305.16891)
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4378718232
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4378718232
- DOI: https://doi.org/10.48550/arxiv.2305.16891
- Title: Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
- Type: preprint
- Language: en
- Publication year: 2023
- Publication date: 2023-05-26
- Authors (in order): Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding‐Xuan Zhou
- Landing page: https://arxiv.org/abs/2305.16891
- PDF URL: https://arxiv.org/pdf/2305.16891
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2305.16891
- Concepts (attached by OpenAlex): Parameterized complexity, Generalization, Gradient descent, Scaling, Artificial neural network, Layer (electronics), Stability (learning theory), Computer science, Algorithm, Applied mathematics, Mathematics, Artificial intelligence, Mathematical analysis, Machine learning, Materials science, Geometry, Composite material
- Cited by: 0
- Related works (count): 10
Full payload
| id | https://openalex.org/W4378718232 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2305.16891 |
| ids.doi | https://doi.org/10.48550/arxiv.2305.16891 |
| ids.openalex | https://openalex.org/W4378718232 |
| fwci | |
| type | preprint |
| title | Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12676 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9993000030517578 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Machine Learning and ELM |
| topics[1].id | https://openalex.org/T10320 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9987000226974487 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Neural Networks and Applications |
| topics[2].id | https://openalex.org/T12702 |
| topics[2].field.id | https://openalex.org/fields/28 |
| topics[2].field.display_name | Neuroscience |
| topics[2].score | 0.995199978351593 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2808 |
| topics[2].subfield.display_name | Neurology |
| topics[2].display_name | Brain Tumor Detection and Classification |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C165464430 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8715852499008179 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1570441 |
| concepts[0].display_name | Parameterized complexity |
| concepts[1].id | https://openalex.org/C177148314 |
| concepts[1].level | 2 |
| concepts[1].score | 0.8177061676979065 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q170084 |
| concepts[1].display_name | Generalization |
| concepts[2].id | https://openalex.org/C153258448 |
| concepts[2].level | 3 |
| concepts[2].score | 0.7330757975578308 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1199743 |
| concepts[2].display_name | Gradient descent |
| concepts[3].id | https://openalex.org/C99844830 |
| concepts[3].level | 2 |
| concepts[3].score | 0.7102431654930115 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q102441924 |
| concepts[3].display_name | Scaling |
| concepts[4].id | https://openalex.org/C50644808 |
| concepts[4].level | 2 |
| concepts[4].score | 0.705914318561554 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[4].display_name | Artificial neural network |
| concepts[5].id | https://openalex.org/C2779227376 |
| concepts[5].level | 2 |
| concepts[5].score | 0.6452301740646362 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q6505497 |
| concepts[5].display_name | Layer (electronics) |
| concepts[6].id | https://openalex.org/C112972136 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5714186429977417 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q7595718 |
| concepts[6].display_name | Stability (learning theory) |
| concepts[7].id | https://openalex.org/C41008148 |
| concepts[7].level | 0 |
| concepts[7].score | 0.5194370746612549 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[7].display_name | Computer science |
| concepts[8].id | https://openalex.org/C11413529 |
| concepts[8].level | 1 |
| concepts[8].score | 0.4437360167503357 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[8].display_name | Algorithm |
| concepts[9].id | https://openalex.org/C28826006 |
| concepts[9].level | 1 |
| concepts[9].score | 0.4240490794181824 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q33521 |
| concepts[9].display_name | Applied mathematics |
| concepts[10].id | https://openalex.org/C33923547 |
| concepts[10].level | 0 |
| concepts[10].score | 0.3899722993373871 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[10].display_name | Mathematics |
| concepts[11].id | https://openalex.org/C154945302 |
| concepts[11].level | 1 |
| concepts[11].score | 0.27786701917648315 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[11].display_name | Artificial intelligence |
| concepts[12].id | https://openalex.org/C134306372 |
| concepts[12].level | 1 |
| concepts[12].score | 0.1409292221069336 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[12].display_name | Mathematical analysis |
| concepts[13].id | https://openalex.org/C119857082 |
| concepts[13].level | 1 |
| concepts[13].score | 0.11085370182991028 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[13].display_name | Machine learning |
| concepts[14].id | https://openalex.org/C192562407 |
| concepts[14].level | 0 |
| concepts[14].score | 0.08551988005638123 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[14].display_name | Materials science |
| concepts[15].id | https://openalex.org/C2524010 |
| concepts[15].level | 1 |
| concepts[15].score | 0.06849116086959839 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[15].display_name | Geometry |
| concepts[16].id | https://openalex.org/C159985019 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q181790 |
| concepts[16].display_name | Composite material |
| keywords[0].id | https://openalex.org/keywords/parameterized-complexity |
| keywords[0].score | 0.8715852499008179 |
| keywords[0].display_name | Parameterized complexity |
| keywords[1].id | https://openalex.org/keywords/generalization |
| keywords[1].score | 0.8177061676979065 |
| keywords[1].display_name | Generalization |
| keywords[2].id | https://openalex.org/keywords/gradient-descent |
| keywords[2].score | 0.7330757975578308 |
| keywords[2].display_name | Gradient descent |
| keywords[3].id | https://openalex.org/keywords/scaling |
| keywords[3].score | 0.7102431654930115 |
| keywords[3].display_name | Scaling |
| keywords[4].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[4].score | 0.705914318561554 |
| keywords[4].display_name | Artificial neural network |
| keywords[5].id | https://openalex.org/keywords/layer |
| keywords[5].score | 0.6452301740646362 |
| keywords[5].display_name | Layer (electronics) |
| keywords[6].id | https://openalex.org/keywords/stability |
| keywords[6].score | 0.5714186429977417 |
| keywords[6].display_name | Stability (learning theory) |
| keywords[7].id | https://openalex.org/keywords/computer-science |
| keywords[7].score | 0.5194370746612549 |
| keywords[7].display_name | Computer science |
| keywords[8].id | https://openalex.org/keywords/algorithm |
| keywords[8].score | 0.4437360167503357 |
| keywords[8].display_name | Algorithm |
| keywords[9].id | https://openalex.org/keywords/applied-mathematics |
| keywords[9].score | 0.4240490794181824 |
| keywords[9].display_name | Applied mathematics |
| keywords[10].id | https://openalex.org/keywords/mathematics |
| keywords[10].score | 0.3899722993373871 |
| keywords[10].display_name | Mathematics |
| keywords[11].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[11].score | 0.27786701917648315 |
| keywords[11].display_name | Artificial intelligence |
| keywords[12].id | https://openalex.org/keywords/mathematical-analysis |
| keywords[12].score | 0.1409292221069336 |
| keywords[12].display_name | Mathematical analysis |
| keywords[13].id | https://openalex.org/keywords/machine-learning |
| keywords[13].score | 0.11085370182991028 |
| keywords[13].display_name | Machine learning |
| keywords[14].id | https://openalex.org/keywords/materials-science |
| keywords[14].score | 0.08551988005638123 |
| keywords[14].display_name | Materials science |
| keywords[15].id | https://openalex.org/keywords/geometry |
| keywords[15].score | 0.06849116086959839 |
| keywords[15].display_name | Geometry |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2305.16891 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2305.16891 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2305.16891 |
| locations[1].id | doi:10.48550/arxiv.2305.16891 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2305.16891 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5103081079 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3104-3081 |
| authorships[0].author.display_name | Puyu Wang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wang, Puyu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5046468616 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-5383-467X |
| authorships[1].author.display_name | Yunwen Lei |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Lei, Yunwen |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5100401482 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-4908-0243 |
| authorships[2].author.display_name | Di Wang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Wang, Di |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5048960543 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Yiming Ying |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Ying, Yiming |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5006802589 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-0224-9216 |
| authorships[4].author.display_name | Ding‐Xuan Zhou |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Zhou, Ding-Xuan |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2305.16891 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T12676 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9993000030517578 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Machine Learning and ELM |
| related_works | https://openalex.org/W2051058708, https://openalex.org/W1494268238, https://openalex.org/W154868527, https://openalex.org/W1983207144, https://openalex.org/W2490706771, https://openalex.org/W2480116122, https://openalex.org/W4255576661, https://openalex.org/W3214675586, https://openalex.org/W3046835365, https://openalex.org/W2381411913 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2305.16891 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2305.16891 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2305.16891 |
| primary_location.id | pmh:oai:arXiv.org:2305.16891 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2305.16891 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2305.16891 |
| publication_date | 2023-05-26 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 58, 102, 114, 190, 195 |
| abstract_inverted_index.As | 113 |
| abstract_inverted_index.GD | 65, 130, 153, 181, 202 |
| abstract_inverted_index.In | 46, 84 |
| abstract_inverted_index.as | 166 |
| abstract_inverted_index.by | 15, 56, 100, 152 |
| abstract_inverted_index.in | 6, 94, 132, 203 |
| abstract_inverted_index.is | 178 |
| abstract_inverted_index.of | 10, 26, 41, 64, 87, 111, 117, 127, 160, 199 |
| abstract_inverted_index.on | 32, 141 |
| abstract_inverted_index.or | 143, 171 |
| abstract_inverted_index.to | 154, 182 |
| abstract_inverted_index.we | 49, 121, 163, 193 |
| abstract_inverted_index.For | 69 |
| abstract_inverted_index.NNs | 34, 150 |
| abstract_inverted_index.and | 35, 61, 135, 148, 206 |
| abstract_inverted_index.are | 74 |
| abstract_inverted_index.for | 66, 129, 146, 180, 201 |
| abstract_inverted_index.has | 3, 30, 36 |
| abstract_inverted_index.its | 96 |
| abstract_inverted_index.not | 37 |
| abstract_inverted_index.our | 72, 90, 118 |
| abstract_inverted_index.the | 8, 20, 27, 39, 52, 85, 109, 123, 156, 167, 172, 184 |
| abstract_inverted_index.(GD) | 18 |
| abstract_inverted_index.NNs, | 71, 89 |
| abstract_inverted_index.NNs. | 68, 137, 208 |
| abstract_inverted_index.This | 138 |
| abstract_inverted_index.been | 4 |
| abstract_inverted_index.both | 133, 204 |
| abstract_inverted_index.case | 86 |
| abstract_inverted_index.fast | 196 |
| abstract_inverted_index.less | 176 |
| abstract_inverted_index.lies | 93 |
| abstract_inverted_index.made | 5 |
| abstract_inverted_index.most | 25 |
| abstract_inverted_index.rate | 126, 159, 198 |
| abstract_inverted_index.risk | 125, 158, 197 |
| abstract_inverted_index.that | 106, 165 |
| abstract_inverted_index.this | 47 |
| abstract_inverted_index.work | 54 |
| abstract_inverted_index.(NNs) | 13 |
| abstract_inverted_index.error | 186 |
| abstract_inverted_index.light | 140 |
| abstract_inverted_index.novel | 103 |
| abstract_inverted_index.sheds | 139 |
| abstract_inverted_index.under | 76, 189 |
| abstract_inverted_index.using | 19 |
| abstract_inverted_index.attain | 155 |
| abstract_inverted_index.derive | 122 |
| abstract_inverted_index.direct | 115 |
| abstract_inverted_index.excess | 124 |
| abstract_inverted_index.extend | 51 |
| abstract_inverted_index.impact | 40 |
| abstract_inverted_index.nearly | 97 |
| abstract_inverted_index.neural | 11 |
| abstract_inverted_index.obtain | 194 |
| abstract_inverted_index.paper, | 48 |
| abstract_inverted_index.rates. | 187 |
| abstract_inverted_index.achieve | 183 |
| abstract_inverted_index.descent | 17 |
| abstract_inverted_index.desired | 157, 185 |
| abstract_inverted_index.effects | 110 |
| abstract_inverted_index.focused | 31 |
| abstract_inverted_index.general | 77, 119 |
| abstract_inverted_index.greatly | 50 |
| abstract_inverted_index.network | 43, 78, 173 |
| abstract_inverted_index.results | 73 |
| abstract_inverted_index.scaling | 44, 79, 168 |
| abstract_inverted_index.trained | 14, 151 |
| abstract_inverted_index.$O(1/n)$ | 200 |
| abstract_inverted_index.However, | 24 |
| abstract_inverted_index.analysis | 63 |
| abstract_inverted_index.existing | 28 |
| abstract_inverted_index.explores | 108 |
| abstract_inverted_index.gradient | 16 |
| abstract_inverted_index.networks | 12 |
| abstract_inverted_index.previous | 53, 82 |
| abstract_inverted_index.progress | 2 |
| abstract_inverted_index.property | 99 |
| abstract_inverted_index.relaxing | 81 |
| abstract_inverted_index.required | 179 |
| abstract_inverted_index.research | 29 |
| abstract_inverted_index.strategy | 105 |
| abstract_inverted_index.Moreover, | 162 |
| abstract_inverted_index.Recently, | 0 |
| abstract_inverted_index.addressed | 38 |
| abstract_inverted_index.approach. | 23 |
| abstract_inverted_index.different | 42 |
| abstract_inverted_index.findings, | 120 |
| abstract_inverted_index.increases | 170 |
| abstract_inverted_index.induction | 104 |
| abstract_inverted_index.low-noise | 191 |
| abstract_inverted_index.necessary | 144 |
| abstract_inverted_index.parameter | 169 |
| abstract_inverted_index.stability | 22, 60 |
| abstract_inverted_index.technical | 91 |
| abstract_inverted_index.two-layer | 70, 134, 205 |
| abstract_inverted_index.utilizing | 101 |
| abstract_inverted_index.algorithms | 131 |
| abstract_inverted_index.complexity | 174 |
| abstract_inverted_index.condition, | 192 |
| abstract_inverted_index.conditions | 145 |
| abstract_inverted_index.conducting | 57 |
| abstract_inverted_index.decreases, | 175 |
| abstract_inverted_index.sufficient | 142 |
| abstract_inverted_index.thoroughly | 107 |
| abstract_inverted_index.algorithmic | 21 |
| abstract_inverted_index.application | 116 |
| abstract_inverted_index.co-coercive | 98 |
| abstract_inverted_index.conditions. | 83 |
| abstract_inverted_index.demonstrate | 164 |
| abstract_inverted_index.established | 75 |
| abstract_inverted_index.multi-layer | 67 |
| abstract_inverted_index.parameters, | 80 |
| abstract_inverted_index.parameters. | 45 |
| abstract_inverted_index.significant | 1 |
| abstract_inverted_index.three-layer | 88, 136, 207 |
| abstract_inverted_index.contribution | 92 |
| abstract_inverted_index.Additionally, | 188 |
| abstract_inverted_index.comprehensive | 59 |
| abstract_inverted_index.demonstrating | 95 |
| abstract_inverted_index.understanding | 7 |
| abstract_inverted_index.generalization | 9, 62 |
| abstract_inverted_index.$O(1/\sqrt{n})$ | 128 |
| abstract_inverted_index.$O(1/\sqrt{n})$. | 161 |
| abstract_inverted_index.one-hidden-layer | 33 |
| abstract_inverted_index.over-parameterized | 149 |
| abstract_inverted_index.under-parameterized | 147 |
| abstract_inverted_index.over-parameterization | 177 |
| abstract_inverted_index.over-parameterization. | 112 |
| abstract_inverted_index.\cite{lei2022stability,richards2021stability} | 55 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
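The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of how such a mapping could be turned back into running text follows. The function name `rebuild_abstract` and the `sample` dictionary are hypothetical; `sample` contains only positions 0 through 9 copied from the rows above, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping.

    Hypothetical helper for illustration; it assumes positions are
    0-based word offsets, as the rows in the table suggest.
    """
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Subset of the table rows: only the first ten word positions.
sample = {
    "Recently,": [0],
    "significant": [1],
    "progress": [2],
    "has": [3],
    "been": [4],
    "made": [5],
    "in": [6],
    "understanding": [7],
    "the": [8],
    "generalization": [9],
}

print(rebuild_abstract(sample))
```

Passing the full set of rows instead of `sample` should give back the abstract text shown at the top of this record, since every token carries all of its positions.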