Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning Article Swipe
YOU?
·
· 2024
· Open Access
·
· DOI: https://doi.org/10.48550/arxiv.2407.20209
For overparameterized optimization tasks, such as those found in modern machine learning, global minima are generally not unique. In order to understand generalization in these settings, it is vital to study to which minimum an optimization algorithm converges. The possibility of having minima that are unstable under the dynamics imposed by the optimization algorithm limits the potential minima that the algorithm can find. In this paper, we characterize the global minima that are dynamically stable/unstable for both deterministic and stochastic gradient descent (SGD). In particular, we introduce a characteristic Lyapunov exponent that depends on the local dynamics around a global minimum and rigorously prove that the sign of this Lyapunov exponent determines whether SGD can accumulate at the respective global minimum.
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2407.20209
- https://arxiv.org/pdf/2407.20209
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4401202595
Raw OpenAlex JSON
- OpenAlex ID
-
https://openalex.org/W4401202595Canonical identifier for this work in OpenAlex
- DOI
-
https://doi.org/10.48550/arxiv.2407.20209Digital Object Identifier
- Title
-
Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized LearningWork title
- Type
-
preprintOpenAlex work type
- Language
-
enPrimary language
- Publication year
-
2024Year of publication
- Publication date
-
2024-07-29Full publication date if available
- Authors
-
Dennis Chemnitz, Maximilian EngelList of authors in order
- Landing page
-
https://arxiv.org/abs/2407.20209Publisher landing page
- PDF URL
-
https://arxiv.org/pdf/2407.20209Direct link to full text PDF
- Open access
-
YesWhether a free full text is available
- OA status
-
greenOpen access status per OpenAlex
- OA URL
-
https://arxiv.org/pdf/2407.20209Direct OA link when available
- Concepts
-
Stability (learning theory), Stochastic gradient descent, Gradient descent, Descent (aeronautics), Computer science, Applied mathematics, Mathematics, Artificial intelligence, Physics, Machine learning, Artificial neural network, MeteorologyTop concepts (fields/topics) attached by OpenAlex
- Cited by
-
0Total citation count in OpenAlex
- Related works (count)
-
10Other works algorithmically related by OpenAlex
Full payload
| id | https://openalex.org/W4401202595 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2407.20209 |
| ids.doi | https://doi.org/10.48550/arxiv.2407.20209 |
| ids.openalex | https://openalex.org/W4401202595 |
| fwci | |
| type | preprint |
| title | Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11612 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9973999857902527 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Stochastic Gradient Optimization Techniques |
| topics[1].id | https://openalex.org/T10320 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9922999739646912 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Neural Networks and Applications |
| topics[2].id | https://openalex.org/T12101 |
| topics[2].field.id | https://openalex.org/fields/18 |
| topics[2].field.display_name | Decision Sciences |
| topics[2].score | 0.9922999739646912 |
| topics[2].domain.id | https://openalex.org/domains/2 |
| topics[2].domain.display_name | Social Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1803 |
| topics[2].subfield.display_name | Management Science and Operations Research |
| topics[2].display_name | Advanced Bandit Algorithms Research |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C112972136 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6980382204055786 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q7595718 |
| concepts[0].display_name | Stability (learning theory) |
| concepts[1].id | https://openalex.org/C206688291 |
| concepts[1].level | 3 |
| concepts[1].score | 0.5514989495277405 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q7617819 |
| concepts[1].display_name | Stochastic gradient descent |
| concepts[2].id | https://openalex.org/C153258448 |
| concepts[2].level | 3 |
| concepts[2].score | 0.5258429646492004 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1199743 |
| concepts[2].display_name | Gradient descent |
| concepts[3].id | https://openalex.org/C2776637919 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5155160427093506 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q624380 |
| concepts[3].display_name | Descent (aeronautics) |
| concepts[4].id | https://openalex.org/C41008148 |
| concepts[4].level | 0 |
| concepts[4].score | 0.46223360300064087 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[4].display_name | Computer science |
| concepts[5].id | https://openalex.org/C28826006 |
| concepts[5].level | 1 |
| concepts[5].score | 0.3880324959754944 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q33521 |
| concepts[5].display_name | Applied mathematics |
| concepts[6].id | https://openalex.org/C33923547 |
| concepts[6].level | 0 |
| concepts[6].score | 0.352546751499176 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[6].display_name | Mathematics |
| concepts[7].id | https://openalex.org/C154945302 |
| concepts[7].level | 1 |
| concepts[7].score | 0.3478369116783142 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[7].display_name | Artificial intelligence |
| concepts[8].id | https://openalex.org/C121332964 |
| concepts[8].level | 0 |
| concepts[8].score | 0.17613205313682556 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[8].display_name | Physics |
| concepts[9].id | https://openalex.org/C119857082 |
| concepts[9].level | 1 |
| concepts[9].score | 0.1681286096572876 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[9].display_name | Machine learning |
| concepts[10].id | https://openalex.org/C50644808 |
| concepts[10].level | 2 |
| concepts[10].score | 0.0829094648361206 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[10].display_name | Artificial neural network |
| concepts[11].id | https://openalex.org/C153294291 |
| concepts[11].level | 1 |
| concepts[11].score | 0.04828760027885437 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q25261 |
| concepts[11].display_name | Meteorology |
| keywords[0].id | https://openalex.org/keywords/stability |
| keywords[0].score | 0.6980382204055786 |
| keywords[0].display_name | Stability (learning theory) |
| keywords[1].id | https://openalex.org/keywords/stochastic-gradient-descent |
| keywords[1].score | 0.5514989495277405 |
| keywords[1].display_name | Stochastic gradient descent |
| keywords[2].id | https://openalex.org/keywords/gradient-descent |
| keywords[2].score | 0.5258429646492004 |
| keywords[2].display_name | Gradient descent |
| keywords[3].id | https://openalex.org/keywords/descent |
| keywords[3].score | 0.5155160427093506 |
| keywords[3].display_name | Descent (aeronautics) |
| keywords[4].id | https://openalex.org/keywords/computer-science |
| keywords[4].score | 0.46223360300064087 |
| keywords[4].display_name | Computer science |
| keywords[5].id | https://openalex.org/keywords/applied-mathematics |
| keywords[5].score | 0.3880324959754944 |
| keywords[5].display_name | Applied mathematics |
| keywords[6].id | https://openalex.org/keywords/mathematics |
| keywords[6].score | 0.352546751499176 |
| keywords[6].display_name | Mathematics |
| keywords[7].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[7].score | 0.3478369116783142 |
| keywords[7].display_name | Artificial intelligence |
| keywords[8].id | https://openalex.org/keywords/physics |
| keywords[8].score | 0.17613205313682556 |
| keywords[8].display_name | Physics |
| keywords[9].id | https://openalex.org/keywords/machine-learning |
| keywords[9].score | 0.1681286096572876 |
| keywords[9].display_name | Machine learning |
| keywords[10].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[10].score | 0.0829094648361206 |
| keywords[10].display_name | Artificial neural network |
| keywords[11].id | https://openalex.org/keywords/meteorology |
| keywords[11].score | 0.04828760027885437 |
| keywords[11].display_name | Meteorology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2407.20209 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2407.20209 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2407.20209 |
| locations[1].id | doi:10.48550/arxiv.2407.20209 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2407.20209 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5073946070 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3303-3533 |
| authorships[0].author.display_name | Dennis Chemnitz |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Chemnitz, Dennis |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5003876135 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-1406-8052 |
| authorships[1].author.display_name | Maximilian Engel |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Engel, Maximilian |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2407.20209 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11612 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9973999857902527 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Stochastic Gradient Optimization Techniques |
| related_works | https://openalex.org/W4206903459, https://openalex.org/W2754816816, https://openalex.org/W4366280654, https://openalex.org/W3160167280, https://openalex.org/W4231621013, https://openalex.org/W4362706668, https://openalex.org/W2015288657, https://openalex.org/W3008318776, https://openalex.org/W2041416246, https://openalex.org/W3020853991 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2407.20209 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2407.20209 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2407.20209 |
| primary_location.id | pmh:oai:arXiv.org:2407.20209 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2407.20209 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2407.20209 |
| publication_date | 2024-07-29 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 87, 98 |
| abstract_inverted_index.In | 18, 63, 83 |
| abstract_inverted_index.an | 34 |
| abstract_inverted_index.as | 5 |
| abstract_inverted_index.at | 116 |
| abstract_inverted_index.by | 50 |
| abstract_inverted_index.in | 8, 23 |
| abstract_inverted_index.is | 27 |
| abstract_inverted_index.it | 26 |
| abstract_inverted_index.of | 40, 107 |
| abstract_inverted_index.on | 93 |
| abstract_inverted_index.to | 20, 29, 31 |
| abstract_inverted_index.we | 66, 85 |
| abstract_inverted_index.For | 0 |
| abstract_inverted_index.SGD | 113 |
| abstract_inverted_index.The | 38 |
| abstract_inverted_index.and | 78, 101 |
| abstract_inverted_index.are | 14, 44, 72 |
| abstract_inverted_index.can | 61, 114 |
| abstract_inverted_index.for | 75 |
| abstract_inverted_index.not | 16 |
| abstract_inverted_index.the | 47, 51, 55, 59, 68, 94, 105, 117 |
| abstract_inverted_index.both | 76 |
| abstract_inverted_index.sign | 106 |
| abstract_inverted_index.such | 4 |
| abstract_inverted_index.that | 43, 58, 71, 91, 104 |
| abstract_inverted_index.this | 64, 108 |
| abstract_inverted_index.find. | 62 |
| abstract_inverted_index.found | 7 |
| abstract_inverted_index.local | 95 |
| abstract_inverted_index.order | 19 |
| abstract_inverted_index.prove | 103 |
| abstract_inverted_index.study | 30 |
| abstract_inverted_index.these | 24 |
| abstract_inverted_index.those | 6 |
| abstract_inverted_index.under | 46 |
| abstract_inverted_index.vital | 28 |
| abstract_inverted_index.which | 32 |
| abstract_inverted_index.(SGD). | 82 |
| abstract_inverted_index.around | 97 |
| abstract_inverted_index.global | 12, 69, 99, 119 |
| abstract_inverted_index.having | 41 |
| abstract_inverted_index.limits | 54 |
| abstract_inverted_index.minima | 13, 42, 57, 70 |
| abstract_inverted_index.modern | 9 |
| abstract_inverted_index.paper, | 65 |
| abstract_inverted_index.tasks, | 3 |
| abstract_inverted_index.depends | 92 |
| abstract_inverted_index.descent | 81 |
| abstract_inverted_index.imposed | 49 |
| abstract_inverted_index.machine | 10 |
| abstract_inverted_index.minimum | 33, 100 |
| abstract_inverted_index.unique. | 17 |
| abstract_inverted_index.whether | 112 |
| abstract_inverted_index.Lyapunov | 89, 109 |
| abstract_inverted_index.dynamics | 48, 96 |
| abstract_inverted_index.exponent | 90, 110 |
| abstract_inverted_index.gradient | 80 |
| abstract_inverted_index.minimum. | 120 |
| abstract_inverted_index.unstable | 45 |
| abstract_inverted_index.algorithm | 36, 53, 60 |
| abstract_inverted_index.generally | 15 |
| abstract_inverted_index.introduce | 86 |
| abstract_inverted_index.learning, | 11 |
| abstract_inverted_index.potential | 56 |
| abstract_inverted_index.settings, | 25 |
| abstract_inverted_index.accumulate | 115 |
| abstract_inverted_index.converges. | 37 |
| abstract_inverted_index.determines | 111 |
| abstract_inverted_index.respective | 118 |
| abstract_inverted_index.rigorously | 102 |
| abstract_inverted_index.stochastic | 79 |
| abstract_inverted_index.understand | 21 |
| abstract_inverted_index.dynamically | 73 |
| abstract_inverted_index.particular, | 84 |
| abstract_inverted_index.possibility | 39 |
| abstract_inverted_index.characterize | 67 |
| abstract_inverted_index.optimization | 2, 35, 52 |
| abstract_inverted_index.deterministic | 77 |
| abstract_inverted_index.characteristic | 88 |
| abstract_inverted_index.generalization | 22 |
| abstract_inverted_index.stable/unstable | 74 |
| abstract_inverted_index.overparameterized | 1 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |