Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent
2020 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2012.11184
Deep neural networks (DNNs) have achieved remarkable success in computer vision; however, training DNNs to satisfactory performance remains challenging and is sensitive to the empirical choice of optimization algorithm. Stochastic gradient descent (SGD) is the dominant method for training a DNN: it adjusts the network weights to minimize the DNN's loss function. As an alternative approach, neuroevolution is more in line with an evolutionary process and provides some key capabilities that are often unavailable in SGD, such as a heuristic black-box search strategy based on collaboration among individuals. This paper proposes a novel approach that combines the merits of both neuroevolution and SGD, enabling evolutionary search, parallel exploration, and an effective probe for optimal DNNs. A hierarchical cluster-based suppression algorithm is also developed to keep individuals from making similar weight updates, which improves population diversity. We implement the proposed approach on four representative DNNs using four publicly available datasets. Experimental results demonstrate that the four DNNs optimized by the proposed approach all outperform their counterparts optimized by SGD alone on all datasets. DNNs optimized by the proposed approach also outperform state-of-the-art deep networks. This work also presents a meaningful attempt toward artificial general intelligence.
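The general idea of pairing an evolutionary population with SGD can be illustrated with a toy sketch. The code below is not the authors' algorithm: it is a minimal generic loop, assuming a linear least-squares model in place of a DNN, in which each individual takes a few mini-batch SGD steps, the better half of the population survives, and Gaussian-mutated copies refill it. The hierarchical cluster-based suppression step is omitted, and every name and hyperparameter (`hybrid_search`, `pop_size`, `sigma`, and so on) is hypothetical.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean squared error of a linear model and its gradient w.r.t. w."""
    residual = X @ w - y
    return float(np.mean(residual ** 2)), 2.0 * X.T @ residual / len(y)

def hybrid_search(X, y, pop_size=8, generations=30, sgd_steps=5,
                  lr=0.05, batch=16, sigma=0.1, seed=0):
    """Toy population search: SGD refinement, then selection and mutation."""
    rng = np.random.default_rng(seed)
    population = [rng.normal(size=X.shape[1]) for _ in range(pop_size)]
    for _ in range(generations):
        # Local refinement: a few mini-batch SGD steps per individual.
        for w in population:
            for _ in range(sgd_steps):
                idx = rng.choice(len(y), size=batch, replace=False)
                _, grad = loss_and_grad(w, X[idx], y[idx])
                w -= lr * grad  # in-place update of this individual's weights
        # Selection: keep the half with the lowest full-data loss.
        population.sort(key=lambda w: loss_and_grad(w, X, y)[0])
        survivors = population[: pop_size // 2]
        # Variation: refill the population with Gaussian-mutated copies.
        children = [w + rng.normal(scale=sigma, size=w.shape) for w in survivors]
        population = survivors + children
    return min(population, key=lambda w: loss_and_grad(w, X, y)[0])

if __name__ == "__main__":
    # Synthetic regression problem, used only to exercise the loop.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=200)
    best = hybrid_search(X, y)
    print("final loss:", loss_and_grad(best, X, y)[0])
```

Survivors are carried over unmutated, so the best individual found so far is never lost between generations; the mutated children supply the exploration that plain SGD on a single model lacks.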
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2012.11184, https://arxiv.org/pdf/2012.11184
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4285708251
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4285708251 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2012.11184 (Digital Object Identifier)
- Title: Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2020 (Year of publication)
- Publication date: 2020-12-21 (Full publication date if available)
- Authors: Haichao Zhang, Kuangrong Hao, Lei Gao, Bing Wei, Xue‐song Tang (List of authors in order)
- Landing page: https://arxiv.org/abs/2012.11184 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2012.11184 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2012.11184 (Direct OA link when available)
- Concepts: Neuroevolution, Computer science, Stochastic gradient descent, Artificial intelligence, Artificial neural network, Deep neural networks, Gradient descent, Machine learning, Evolutionary algorithm, Heuristic (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 1 (Total citation count in OpenAlex)
- Citations by year (recent): 2023: 1 (Per-year citation counts (last 5 years))
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4285708251 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2012.11184 |
| ids.doi | https://doi.org/10.48550/arxiv.2012.11184 |
| ids.openalex | https://openalex.org/W4285708251 |
| fwci | |
| type | preprint |
| title | Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10036 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9994000196456909 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Advanced Neural Network Applications |
| topics[1].id | https://openalex.org/T12676 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9937999844551086 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Machine Learning and ELM |
| topics[2].id | https://openalex.org/T11612 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.992900013923645 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Stochastic Gradient Optimization Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C118070581 |
| concepts[0].level | 3 |
| concepts[0].score | 0.8770817518234253 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q2060528 |
| concepts[0].display_name | Neuroevolution |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7524068355560303 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C206688291 |
| concepts[2].level | 3 |
| concepts[2].score | 0.6501603722572327 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q7617819 |
| concepts[2].display_name | Stochastic gradient descent |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6314229965209961 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C50644808 |
| concepts[4].level | 2 |
| concepts[4].score | 0.553368330001831 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[4].display_name | Artificial neural network |
| concepts[5].id | https://openalex.org/C2984842247 |
| concepts[5].level | 3 |
| concepts[5].score | 0.5488919019699097 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[5].display_name | Deep neural networks |
| concepts[6].id | https://openalex.org/C153258448 |
| concepts[6].level | 3 |
| concepts[6].score | 0.5095177292823792 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1199743 |
| concepts[6].display_name | Gradient descent |
| concepts[7].id | https://openalex.org/C119857082 |
| concepts[7].level | 1 |
| concepts[7].score | 0.5015199184417725 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[7].display_name | Machine learning |
| concepts[8].id | https://openalex.org/C159149176 |
| concepts[8].level | 2 |
| concepts[8].score | 0.47257936000823975 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q14489129 |
| concepts[8].display_name | Evolutionary algorithm |
| concepts[9].id | https://openalex.org/C173801870 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4537317156791687 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q201413 |
| concepts[9].display_name | Heuristic |
| keywords[0].id | https://openalex.org/keywords/neuroevolution |
| keywords[0].score | 0.8770817518234253 |
| keywords[0].display_name | Neuroevolution |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7524068355560303 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/stochastic-gradient-descent |
| keywords[2].score | 0.6501603722572327 |
| keywords[2].display_name | Stochastic gradient descent |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.6314229965209961 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[4].score | 0.553368330001831 |
| keywords[4].display_name | Artificial neural network |
| keywords[5].id | https://openalex.org/keywords/deep-neural-networks |
| keywords[5].score | 0.5488919019699097 |
| keywords[5].display_name | Deep neural networks |
| keywords[6].id | https://openalex.org/keywords/gradient-descent |
| keywords[6].score | 0.5095177292823792 |
| keywords[6].display_name | Gradient descent |
| keywords[7].id | https://openalex.org/keywords/machine-learning |
| keywords[7].score | 0.5015199184417725 |
| keywords[7].display_name | Machine learning |
| keywords[8].id | https://openalex.org/keywords/evolutionary-algorithm |
| keywords[8].score | 0.47257936000823975 |
| keywords[8].display_name | Evolutionary algorithm |
| keywords[9].id | https://openalex.org/keywords/heuristic |
| keywords[9].score | 0.4537317156791687 |
| keywords[9].display_name | Heuristic |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2012.11184 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2012.11184 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2012.11184 |
| locations[1].id | doi:10.48550/arxiv.2012.11184 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2012.11184 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5101758340 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4168-3640 |
| authorships[0].author.display_name | Haichao Zhang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhang, Haichao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5001762132 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-9672-6161 |
| authorships[1].author.display_name | Kuangrong Hao |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Hao, Kuangrong |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5002631807 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-4272-9417 |
| authorships[2].author.display_name | Lei Gao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Gao, Lei |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5064572247 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-2298-1474 |
| authorships[3].author.display_name | Bing Wei |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Wei, Bing |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5048492871 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-7594-2241 |
| authorships[4].author.display_name | Xue‐song Tang |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Tang, Xuesong |
| authorships[4].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2012.11184 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10036 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9994000196456909 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Advanced Neural Network Applications |
| related_works | https://openalex.org/W2796541468, https://openalex.org/W3154810729, https://openalex.org/W4206903459, https://openalex.org/W2754816816, https://openalex.org/W4366280654, https://openalex.org/W3160167280, https://openalex.org/W4231621013, https://openalex.org/W4362706668, https://openalex.org/W3008318776, https://openalex.org/W2041416246 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2023 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2012.11184 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2012.11184 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2012.11184 |
| primary_location.id | pmh:oai:arXiv.org:2012.11184 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2012.11184 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2012.11184 |
| publication_date | 2020-12-21 |
| publication_year | 2020 |
| referenced_works_count | 0 |
| abstract_inverted_index.A | 117 |
| abstract_inverted_index.a | 40, 93, 191 |
| abstract_inverted_index.As | 53 |
| abstract_inverted_index.We | 136 |
| abstract_inverted_index.an | 27, 54, 63, 111 |
| abstract_inverted_index.as | 78 |
| abstract_inverted_index.by | 42, 158, 167, 178 |
| abstract_inverted_index.in | 8, 38, 60, 75, 88, 141 |
| abstract_inverted_index.is | 36, 58, 122 |
| abstract_inverted_index.of | 26, 100, 175 |
| abstract_inverted_index.on | 85, 146, 170 |
| abstract_inverted_index.to | 23, 47, 125 |
| abstract_inverted_index.DNN | 41 |
| abstract_inverted_index.SGD | 169 |
| abstract_inverted_index.The | 173 |
| abstract_inverted_index.all | 162, 171 |
| abstract_inverted_index.and | 19, 66, 103, 110 |
| abstract_inverted_index.are | 72 |
| abstract_inverted_index.for | 14, 30, 114, 132, 194 |
| abstract_inverted_index.key | 69 |
| abstract_inverted_index.the | 49, 79, 98, 138, 154, 159, 179 |
| abstract_inverted_index.DNNs | 13, 50, 144, 156, 176 |
| abstract_inverted_index.Deep | 0 |
| abstract_inverted_index.SGD, | 76, 104 |
| abstract_inverted_index.This | 90, 187 |
| abstract_inverted_index.also | 123, 182, 189 |
| abstract_inverted_index.both | 101 |
| abstract_inverted_index.deep | 185 |
| abstract_inverted_index.four | 142, 147, 155 |
| abstract_inverted_index.from | 21 |
| abstract_inverted_index.have | 4 |
| abstract_inverted_index.line | 61 |
| abstract_inverted_index.loss | 51 |
| abstract_inverted_index.more | 59 |
| abstract_inverted_index.ones | 165 |
| abstract_inverted_index.only | 168 |
| abstract_inverted_index.some | 68 |
| abstract_inverted_index.such | 77 |
| abstract_inverted_index.that | 71, 96, 153 |
| abstract_inverted_index.with | 62 |
| abstract_inverted_index.work | 188 |
| abstract_inverted_index.(SGD) | 35 |
| abstract_inverted_index.DNNs. | 116 |
| abstract_inverted_index.among | 130 |
| abstract_inverted_index.based | 84, 145 |
| abstract_inverted_index.novel | 94 |
| abstract_inverted_index.often | 73 |
| abstract_inverted_index.paper | 91 |
| abstract_inverted_index.probe | 113 |
| abstract_inverted_index.(DNNs) | 3 |
| abstract_inverted_index.merits | 99 |
| abstract_inverted_index.neural | 1, 44 |
| abstract_inverted_index.search | 82 |
| abstract_inverted_index.weight | 128 |
| abstract_inverted_index.attempt | 193 |
| abstract_inverted_index.descent | 34 |
| abstract_inverted_index.general | 197 |
| abstract_inverted_index.network | 45 |
| abstract_inverted_index.optimal | 115 |
| abstract_inverted_index.process | 65 |
| abstract_inverted_index.remains | 17 |
| abstract_inverted_index.results | 151 |
| abstract_inverted_index.search, | 107 |
| abstract_inverted_index.similar | 127 |
| abstract_inverted_index.success | 7 |
| abstract_inverted_index.suffers | 20 |
| abstract_inverted_index.updates | 129 |
| abstract_inverted_index.vision; | 10 |
| abstract_inverted_index.weights | 46 |
| abstract_inverted_index.achieved | 5 |
| abstract_inverted_index.approach | 95, 140, 161, 181 |
| abstract_inverted_index.combines | 97 |
| abstract_inverted_index.computer | 9 |
| abstract_inverted_index.dominant | 37 |
| abstract_inverted_index.enabling | 105 |
| abstract_inverted_index.gradient | 33 |
| abstract_inverted_index.however, | 11 |
| abstract_inverted_index.minimize | 48 |
| abstract_inverted_index.networks | 2 |
| abstract_inverted_index.overcome | 126 |
| abstract_inverted_index.parallel | 108 |
| abstract_inverted_index.presents | 190 |
| abstract_inverted_index.proposed | 139, 160, 180 |
| abstract_inverted_index.proposes | 92 |
| abstract_inverted_index.provides | 67 |
| abstract_inverted_index.pursuing | 195 |
| abstract_inverted_index.strategy | 83 |
| abstract_inverted_index.training | 12, 39 |
| abstract_inverted_index.adjusting | 43 |
| abstract_inverted_index.algorithm | 29, 121 |
| abstract_inverted_index.approach, | 56 |
| abstract_inverted_index.black-box | 81 |
| abstract_inverted_index.datasets. | 149, 172 |
| abstract_inverted_index.developed | 124 |
| abstract_inverted_index.effective | 112 |
| abstract_inverted_index.empirical | 24 |
| abstract_inverted_index.function. | 52 |
| abstract_inverted_index.heuristic | 80 |
| abstract_inverted_index.implement | 137 |
| abstract_inverted_index.improving | 133 |
| abstract_inverted_index.networks. | 186 |
| abstract_inverted_index.optimized | 157, 166, 177 |
| abstract_inverted_index.training. | 31 |
| abstract_inverted_index.Experiment | 150 |
| abstract_inverted_index.Stochastic | 32 |
| abstract_inverted_index.artificial | 196 |
| abstract_inverted_index.diversity. | 135 |
| abstract_inverted_index.individual | 86 |
| abstract_inverted_index.meaningful | 192 |
| abstract_inverted_index.outperform | 163 |
| abstract_inverted_index.population | 134 |
| abstract_inverted_index.remarkable | 6 |
| abstract_inverted_index.selections | 25 |
| abstract_inverted_index.alternative | 55 |
| abstract_inverted_index.challenging | 18 |
| abstract_inverted_index.demonstrate | 152 |
| abstract_inverted_index.individuals | 131 |
| abstract_inverted_index.outperforms | 183 |
| abstract_inverted_index.performance | 16, 174 |
| abstract_inverted_index.sensitivity | 22 |
| abstract_inverted_index.suppression | 120 |
| abstract_inverted_index.unavailable | 74 |
| abstract_inverted_index.capabilities | 70 |
| abstract_inverted_index.evolutionary | 64, 106 |
| abstract_inverted_index.exploration, | 109 |
| abstract_inverted_index.hierarchical | 118 |
| abstract_inverted_index.optimization | 28 |
| abstract_inverted_index.satisfactory | 15 |
| abstract_inverted_index.cluster-based | 119 |
| abstract_inverted_index.collaboration | 87 |
| abstract_inverted_index.corresponding | 164 |
| abstract_inverted_index.intelligence. | 198 |
| abstract_inverted_index.neuroevolution | 57, 102 |
| abstract_inverted_index.representative | 143 |
| abstract_inverted_index.neuroevolution. | 89 |
| abstract_inverted_index.state-of-the-art | 184 |
| abstract_inverted_index.publicly-available | 148 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
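
The `abstract_inverted_index` rows above map each token to the positions at which it occurs in the abstract. A minimal sketch of turning such a mapping back into running text follows, assuming the index is available as a dictionary from token to a list of integer positions; the helper name `rebuild_abstract` is hypothetical, and `sample` is a hand-copied excerpt covering only positions 0 to 7.

```python
def rebuild_abstract(inverted_index):
    """Place each token at every position listed for it, then join in order."""
    positions = {}
    for token, indices in inverted_index.items():
        for i in indices:
            positions[i] = token
    return " ".join(positions[i] for i in sorted(positions))

# Excerpt of the rows above; the full record also lists "neural" at position 44.
sample = {
    "Deep": [0], "neural": [1], "networks": [2], "(DNNs)": [3],
    "have": [4], "achieved": [5], "remarkable": [6], "success": [7],
}
print(rebuild_abstract(sample))
# prints: Deep neural networks (DNNs) have achieved remarkable success
```

Sorting the positions rather than assuming a contiguous range keeps the helper usable on partial indexes like the excerpt here.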