DSD: Dense-Sparse-Dense Training for Deep Neural Networks
2016 · Open Access
Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimportant connections with small weights and retraining the network given the sparsity constraint. In the final D (re-Dense) step, we increase the model capacity by removing the sparsity constraint, re-initialize the pruned parameters from zero and retrain the whole dense network. Experiments show that DSD training can improve the performance for a wide range of CNNs, RNNs and LSTMs on the tasks of image classification, caption generation and speech recognition. On ImageNet, DSD improved the Top1 accuracy of GoogLeNet by 1.1%, VGG-16 by 4.3%, ResNet-18 by 1.2% and ResNet-50 by 1.1%, respectively. On the WSJ'93 dataset, DSD improved DeepSpeech and DeepSpeech2 WER by 2.0% and 1.1%. On the Flickr-8K dataset, DSD improved the NeuralTalk BLEU score by over 1.7. DSD is easy to use in practice: at training time, DSD incurs only one extra hyper-parameter: the sparsity ratio in the S step. At testing time, DSD doesn't change the network architecture or incur any inference overhead. The consistent and significant performance gain of DSD experiments shows the inadequacy of the current training methods for finding the best local optimum, while DSD effectively achieves superior optimization performance for finding a better solution. DSD models are available to download at this https URL.
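The three steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy linear model, the mean-squared-error loss, the learning rate, the step counts, and the helper names `magnitude_mask`, `train`, and `dsd` are all assumptions chosen for illustration. Only the overall flow follows the abstract (dense training, magnitude pruning with retraining under the sparsity constraint, then removing the constraint with pruned weights restarting from zero).

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-|w| entries."""
    k = int(round(sparsity * weights.size))  # number of weights to prune
    mask = np.ones(weights.size)
    if k > 0:
        # Indices (into the flattened array) of the k smallest-magnitude weights
        prune_idx = np.argsort(np.abs(weights), axis=None)[:k]
        mask[prune_idx] = 0.0
    return mask.reshape(weights.shape)

def train(weights, x, y, mask, steps=200, lr=0.1):
    """Gradient descent on mean squared error; masked-out weights stay at zero."""
    w = weights * mask
    for _ in range(steps):
        grad = 2.0 * x.T @ (x @ w - y) / len(x)
        w = (w - lr * grad) * mask  # re-apply the sparsity constraint after each update
    return w

def dsd(x, y, sparsity=0.5, seed=0):
    """Toy dense-sparse-dense flow on a linear model (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    w_init = rng.normal(scale=0.1, size=(x.shape[1], 1))
    dense_mask = np.ones_like(w_init)

    # D step: train the dense model to learn weights and their importance.
    w_dense = train(w_init, x, y, dense_mask)

    # S step: prune small-magnitude weights, then retrain under the mask.
    mask = magnitude_mask(w_dense, sparsity)
    w_sparse = train(w_dense, x, y, mask)

    # re-Dense step: drop the mask; pruned weights restart from zero
    # because w_sparse already holds zeros at the pruned positions.
    w_redense = train(w_sparse, x, y, dense_mask)
    return w_dense, mask, w_sparse, w_redense
```

In this sketch, `sparsity` plays the role of the single extra hyper-parameter mentioned in the abstract, and the returned `w_redense` has the same shape as the original dense weights, so the model architecture is unchanged at inference time.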
Related Topics
- Type: article
- Language: en
- Landing Page: https://arxiv.org/pdf/1607.04381
- OA Status: green
- Cited By: 29
- Related Works: 20
- OpenAlex ID: https://openalex.org/W2963126723
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W2963126723 (canonical identifier for this work in OpenAlex)
- Title: DSD: Dense-Sparse-Dense Training for Deep Neural Networks (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2016 (year of publication)
- Publication date: 2016-07-15 (full publication date if available)
- Authors: Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Enhao Gong, Shijian Tang, Erich Elsen, Péter Vajda, Manohar Paluri, John Tran, Bryan Catanzaro, William J. Dally (list of authors in order)
- Landing page: https://arxiv.org/pdf/1607.04381 (publisher landing page)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/1607.04381 (direct OA link when available)
- Concepts: Computer science, Pruning, Inference, Residual neural network, Artificial intelligence, Overhead (engineering), Constraint (computer-aided design), Artificial neural network, Deep neural networks, Deep learning, Network architecture, Recurrent neural network, Pattern recognition (psychology), Machine learning, Algorithm, Mathematics, Operating system, Biology, Computer security, Agronomy, Geometry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 29 (total citation count in OpenAlex)
- Citations by year (recent): 2022: 1, 2021: 6, 2020: 11, 2019: 7, 2018: 4 (per-year citation counts, last 5 years)
- Related works (count): 20 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W2963126723 |
| doi | |
| ids.mag | 2963126723 |
| ids.openalex | https://openalex.org/W2963126723 |
| fwci | 1.83900622 |
| type | article |
| title | DSD: Dense-Sparse-Dense Training for Deep Neural Networks |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10036 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9983999729156494 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Advanced Neural Network Applications |
| topics[1].id | https://openalex.org/T11307 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9958999752998352 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Domain Adaptation and Few-Shot Learning |
| topics[2].id | https://openalex.org/T10812 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9904999732971191 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Human Pose and Action Recognition |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.821175217628479 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C108010975 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7736420035362244 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q500094 |
| concepts[1].display_name | Pruning |
| concepts[2].id | https://openalex.org/C2776214188 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6208393573760986 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[2].display_name | Inference |
| concepts[3].id | https://openalex.org/C2944601119 |
| concepts[3].level | 3 |
| concepts[3].score | 0.593669056892395 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q43744058 |
| concepts[3].display_name | Residual neural network |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.5771716833114624 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C2779960059 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5393242239952087 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q7113681 |
| concepts[5].display_name | Overhead (engineering) |
| concepts[6].id | https://openalex.org/C2776036281 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5223578214645386 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q48769818 |
| concepts[6].display_name | Constraint (computer-aided design) |
| concepts[7].id | https://openalex.org/C50644808 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5136334300041199 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[7].display_name | Artificial neural network |
| concepts[8].id | https://openalex.org/C2984842247 |
| concepts[8].level | 3 |
| concepts[8].score | 0.5133156180381775 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[8].display_name | Deep neural networks |
| concepts[9].id | https://openalex.org/C108583219 |
| concepts[9].level | 2 |
| concepts[9].score | 0.45073074102401733 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[9].display_name | Deep learning |
| concepts[10].id | https://openalex.org/C193415008 |
| concepts[10].level | 2 |
| concepts[10].score | 0.4405193328857422 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q639681 |
| concepts[10].display_name | Network architecture |
| concepts[11].id | https://openalex.org/C147168706 |
| concepts[11].level | 3 |
| concepts[11].score | 0.43108803033828735 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q1457734 |
| concepts[11].display_name | Recurrent neural network |
| concepts[12].id | https://openalex.org/C153180895 |
| concepts[12].level | 2 |
| concepts[12].score | 0.3964996337890625 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[12].display_name | Pattern recognition (psychology) |
| concepts[13].id | https://openalex.org/C119857082 |
| concepts[13].level | 1 |
| concepts[13].score | 0.352042555809021 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[13].display_name | Machine learning |
| concepts[14].id | https://openalex.org/C11413529 |
| concepts[14].level | 1 |
| concepts[14].score | 0.32115429639816284 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[14].display_name | Algorithm |
| concepts[15].id | https://openalex.org/C33923547 |
| concepts[15].level | 0 |
| concepts[15].score | 0.10532277822494507 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[15].display_name | Mathematics |
| concepts[16].id | https://openalex.org/C111919701 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[16].display_name | Operating system |
| concepts[17].id | https://openalex.org/C86803240 |
| concepts[17].level | 0 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[17].display_name | Biology |
| concepts[18].id | https://openalex.org/C38652104 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[18].display_name | Computer security |
| concepts[19].id | https://openalex.org/C6557445 |
| concepts[19].level | 1 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q173113 |
| concepts[19].display_name | Agronomy |
| concepts[20].id | https://openalex.org/C2524010 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[20].display_name | Geometry |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.821175217628479 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/pruning |
| keywords[1].score | 0.7736420035362244 |
| keywords[1].display_name | Pruning |
| keywords[2].id | https://openalex.org/keywords/inference |
| keywords[2].score | 0.6208393573760986 |
| keywords[2].display_name | Inference |
| keywords[3].id | https://openalex.org/keywords/residual-neural-network |
| keywords[3].score | 0.593669056892395 |
| keywords[3].display_name | Residual neural network |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.5771716833114624 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/overhead |
| keywords[5].score | 0.5393242239952087 |
| keywords[5].display_name | Overhead (engineering) |
| keywords[6].id | https://openalex.org/keywords/constraint |
| keywords[6].score | 0.5223578214645386 |
| keywords[6].display_name | Constraint (computer-aided design) |
| keywords[7].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[7].score | 0.5136334300041199 |
| keywords[7].display_name | Artificial neural network |
| keywords[8].id | https://openalex.org/keywords/deep-neural-networks |
| keywords[8].score | 0.5133156180381775 |
| keywords[8].display_name | Deep neural networks |
| keywords[9].id | https://openalex.org/keywords/deep-learning |
| keywords[9].score | 0.45073074102401733 |
| keywords[9].display_name | Deep learning |
| keywords[10].id | https://openalex.org/keywords/network-architecture |
| keywords[10].score | 0.4405193328857422 |
| keywords[10].display_name | Network architecture |
| keywords[11].id | https://openalex.org/keywords/recurrent-neural-network |
| keywords[11].score | 0.43108803033828735 |
| keywords[11].display_name | Recurrent neural network |
| keywords[12].id | https://openalex.org/keywords/pattern-recognition |
| keywords[12].score | 0.3964996337890625 |
| keywords[12].display_name | Pattern recognition (psychology) |
| keywords[13].id | https://openalex.org/keywords/machine-learning |
| keywords[13].score | 0.352042555809021 |
| keywords[13].display_name | Machine learning |
| keywords[14].id | https://openalex.org/keywords/algorithm |
| keywords[14].score | 0.32115429639816284 |
| keywords[14].display_name | Algorithm |
| keywords[15].id | https://openalex.org/keywords/mathematics |
| keywords[15].score | 0.10532277822494507 |
| keywords[15].display_name | Mathematics |
| language | en |
| locations[0].id | mag:2963126723 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | arXiv (Cornell University) |
| locations[0].landing_page_url | https://arxiv.org/pdf/1607.04381 |
| authorships[0].author.id | https://openalex.org/A5070926896 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4186-7618 |
| authorships[0].author.display_name | Song Han |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[0].affiliations[0].raw_affiliation_string | Stanford University, Stanford, United States |
| authorships[0].institutions[0].id | https://openalex.org/I97018004 |
| authorships[0].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Stanford University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Song Han |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Stanford University, Stanford, United States |
| authorships[1].author.id | https://openalex.org/A5063722719 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Jeff Pool |
| authorships[1].countries | GB |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I1304085615 |
| authorships[1].affiliations[0].raw_affiliation_string | Nvidia (United Kingdom), Reading, United Kingdom |
| authorships[1].institutions[0].id | https://openalex.org/I1304085615 |
| authorships[1].institutions[0].ror | https://ror.org/02kr42612 |
| authorships[1].institutions[0].type | company |
| authorships[1].institutions[0].lineage | https://openalex.org/I1304085615, https://openalex.org/I4210127875 |
| authorships[1].institutions[0].country_code | GB |
| authorships[1].institutions[0].display_name | Nvidia (United Kingdom) |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Jeff Pool |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Nvidia (United Kingdom), Reading, United Kingdom |
| authorships[2].author.id | https://openalex.org/A5079540764 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Sharan Narang |
| authorships[2].countries | CN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I98301712 |
| authorships[2].affiliations[0].raw_affiliation_string | Baidu (China), Beijing, China |
| authorships[2].institutions[0].id | https://openalex.org/I98301712 |
| authorships[2].institutions[0].ror | https://ror.org/03vs3wt56 |
| authorships[2].institutions[0].type | company |
| authorships[2].institutions[0].lineage | https://openalex.org/I98301712 |
| authorships[2].institutions[0].country_code | CN |
| authorships[2].institutions[0].display_name | Baidu (China) |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Sharan Narang |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Baidu (China), Beijing, China |
| authorships[3].author.id | https://openalex.org/A5063294319 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Huizi Mao |
| authorships[3].countries | US |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[3].affiliations[0].raw_affiliation_string | Stanford University, Stanford, United States |
| authorships[3].institutions[0].id | https://openalex.org/I97018004 |
| authorships[3].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[3].institutions[0].country_code | US |
| authorships[3].institutions[0].display_name | Stanford University |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Huizi Mao |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Stanford University, Stanford, United States |
| authorships[4].author.id | https://openalex.org/A5035343735 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-4002-909X |
| authorships[4].author.display_name | Enhao Gong |
| authorships[4].countries | US |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[4].affiliations[0].raw_affiliation_string | Stanford University, Stanford, United States |
| authorships[4].institutions[0].id | https://openalex.org/I97018004 |
| authorships[4].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[4].institutions[0].country_code | US |
| authorships[4].institutions[0].display_name | Stanford University |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Enhao Gong |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | Stanford University, Stanford, United States |
| authorships[5].author.id | https://openalex.org/A5040008908 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Shijian Tang |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Shijian Tang |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5009345506 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Erich Elsen |
| authorships[6].countries | CN |
| authorships[6].affiliations[0].institution_ids | https://openalex.org/I98301712 |
| authorships[6].affiliations[0].raw_affiliation_string | Baidu (China), Beijing, China |
| authorships[6].institutions[0].id | https://openalex.org/I98301712 |
| authorships[6].institutions[0].ror | https://ror.org/03vs3wt56 |
| authorships[6].institutions[0].type | company |
| authorships[6].institutions[0].lineage | https://openalex.org/I98301712 |
| authorships[6].institutions[0].country_code | CN |
| authorships[6].institutions[0].display_name | Baidu (China) |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Erich Elsen |
| authorships[6].is_corresponding | False |
| authorships[6].raw_affiliation_strings | Baidu (China), Beijing, China |
| authorships[7].author.id | https://openalex.org/A5048668303 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-2031-4678 |
| authorships[7].author.display_name | Péter Vajda |
| authorships[7].countries | US |
| authorships[7].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[7].affiliations[0].raw_affiliation_string | Stanford University, Stanford, United States |
| authorships[7].institutions[0].id | https://openalex.org/I97018004 |
| authorships[7].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[7].institutions[0].type | education |
| authorships[7].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[7].institutions[0].country_code | US |
| authorships[7].institutions[0].display_name | Stanford University |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Peter Vajda |
| authorships[7].is_corresponding | False |
| authorships[7].raw_affiliation_strings | Stanford University, Stanford, United States |
| authorships[8].author.id | https://openalex.org/A5054437548 |
| authorships[8].author.orcid | |
| authorships[8].author.display_name | Manohar Paluri |
| authorships[8].countries | IL |
| authorships[8].affiliations[0].institution_ids | https://openalex.org/I2252078561 |
| authorships[8].affiliations[0].raw_affiliation_string | Meta (Israel), Tel Aviv, Israel |
| authorships[8].institutions[0].id | https://openalex.org/I2252078561 |
| authorships[8].institutions[0].ror | https://ror.org/02388em19 |
| authorships[8].institutions[0].type | company |
| authorships[8].institutions[0].lineage | https://openalex.org/I2252078561, https://openalex.org/I4210114444 |
| authorships[8].institutions[0].country_code | IL |
| authorships[8].institutions[0].display_name | Meta (Israel) |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Manohar Paluri |
| authorships[8].is_corresponding | False |
| authorships[8].raw_affiliation_strings | Meta (Israel), Tel Aviv, Israel |
| authorships[9].author.id | https://openalex.org/A5083560161 |
| authorships[9].author.orcid | |
| authorships[9].author.display_name | John Tran |
| authorships[9].countries | GB |
| authorships[9].affiliations[0].institution_ids | https://openalex.org/I1304085615 |
| authorships[9].affiliations[0].raw_affiliation_string | Nvidia (United Kingdom), Reading, United Kingdom |
| authorships[9].institutions[0].id | https://openalex.org/I1304085615 |
| authorships[9].institutions[0].ror | https://ror.org/02kr42612 |
| authorships[9].institutions[0].type | company |
| authorships[9].institutions[0].lineage | https://openalex.org/I1304085615, https://openalex.org/I4210127875 |
| authorships[9].institutions[0].country_code | GB |
| authorships[9].institutions[0].display_name | Nvidia (United Kingdom) |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | John Tran |
| authorships[9].is_corresponding | False |
| authorships[9].raw_affiliation_strings | Nvidia (United Kingdom), Reading, United Kingdom |
| authorships[10].author.id | https://openalex.org/A5066242985 |
| authorships[10].author.orcid | https://orcid.org/0000-0003-0034-7728 |
| authorships[10].author.display_name | Bryan Catanzaro |
| authorships[10].countries | CN |
| authorships[10].affiliations[0].institution_ids | https://openalex.org/I98301712 |
| authorships[10].affiliations[0].raw_affiliation_string | Baidu (China), Beijing, China |
| authorships[10].institutions[0].id | https://openalex.org/I98301712 |
| authorships[10].institutions[0].ror | https://ror.org/03vs3wt56 |
| authorships[10].institutions[0].type | company |
| authorships[10].institutions[0].lineage | https://openalex.org/I98301712 |
| authorships[10].institutions[0].country_code | CN |
| authorships[10].institutions[0].display_name | Baidu (China) |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Bryan Catanzaro |
| authorships[10].is_corresponding | False |
| authorships[10].raw_affiliation_strings | Baidu (China), Beijing, China |
| authorships[11].author.id | https://openalex.org/A5084342236 |
| authorships[11].author.orcid | https://orcid.org/0000-0003-4632-2876 |
| authorships[11].author.display_name | William J. Dally |
| authorships[11].countries | US |
| authorships[11].affiliations[0].institution_ids | https://openalex.org/I97018004 |
| authorships[11].affiliations[0].raw_affiliation_string | Stanford University, Stanford, United States |
| authorships[11].institutions[0].id | https://openalex.org/I97018004 |
| authorships[11].institutions[0].ror | https://ror.org/00f54p054 |
| authorships[11].institutions[0].type | education |
| authorships[11].institutions[0].lineage | https://openalex.org/I97018004 |
| authorships[11].institutions[0].country_code | US |
| authorships[11].institutions[0].display_name | Stanford University |
| authorships[11].author_position | last |
| authorships[11].raw_author_name | William J. Dally |
| authorships[11].is_corresponding | False |
| authorships[11].raw_affiliation_strings | Stanford University, Stanford, United States |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/1607.04381 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | DSD: Dense-Sparse-Dense Training for Deep Neural Networks |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-10-10T17:16:08.811792 |
| primary_topic.id | https://openalex.org/T10036 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9983999729156494 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Advanced Neural Network Applications |
| related_works | https://openalex.org/W2194775991, https://openalex.org/W2964299589, https://openalex.org/W2963674932, https://openalex.org/W2114766824, https://openalex.org/W2963981420, https://openalex.org/W3118608800, https://openalex.org/W2163605009, https://openalex.org/W2962835968, https://openalex.org/W2964233199, https://openalex.org/W2963000224, https://openalex.org/W2117539524, https://openalex.org/W1821462560, https://openalex.org/W2963114950, https://openalex.org/W2962965870, https://openalex.org/W2276892413, https://openalex.org/W2125389748, https://openalex.org/W2112796928, https://openalex.org/W2300242332, https://openalex.org/W2161591461, https://openalex.org/W2097117768 |
| cited_by_count | 29 |
| counts_by_year[0].year | 2022 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2021 |
| counts_by_year[1].cited_by_count | 6 |
| counts_by_year[2].year | 2020 |
| counts_by_year[2].cited_by_count | 11 |
| counts_by_year[3].year | 2019 |
| counts_by_year[3].cited_by_count | 7 |
| counts_by_year[4].year | 2018 |
| counts_by_year[4].cited_by_count | 4 |
| locations_count | 1 |
| best_oa_location.id | mag:2963126723 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | arXiv (Cornell University) |
| best_oa_location.landing_page_url | https://arxiv.org/pdf/1607.04381 |
| primary_location.id | mag:2963126723 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | arXiv (Cornell University) |
| primary_location.landing_page_url | https://arxiv.org/pdf/1607.04381 |
| publication_date | 2016-07-15 |
| publication_year | 2016 |
| referenced_works_count | 0 |
| abstract_inverted_index.D | 36, 78 |
| abstract_inverted_index.S | 52, 202 |
| abstract_inverted_index.a | 5, 19, 41, 113, 250 |
| abstract_inverted_index.At | 204 |
| abstract_inverted_index.In | 33, 50, 75 |
| abstract_inverted_index.On | 132, 154, 168 |
| abstract_inverted_index.We | 16 |
| abstract_inverted_index.at | 188, 259 |
| abstract_inverted_index.by | 59, 86, 141, 144, 147, 151, 164, 178 |
| abstract_inverted_index.in | 186, 200 |
| abstract_inverted_index.is | 182 |
| abstract_inverted_index.of | 8, 116, 124, 139, 224, 230 |
| abstract_inverted_index.on | 121 |
| abstract_inverted_index.or | 213 |
| abstract_inverted_index.to | 14, 44, 184, 257 |
| abstract_inverted_index.we | 39, 55, 81 |
| abstract_inverted_index.DSD | 106, 134, 158, 172, 181, 191, 207, 225, 242, 253 |
| abstract_inverted_index.The | 218 |
| abstract_inverted_index.WER | 163 |
| abstract_inverted_index.and | 28, 48, 67, 97, 119, 129, 149, 161, 166, 220 |
| abstract_inverted_index.any | 215 |
| abstract_inverted_index.are | 255 |
| abstract_inverted_index.can | 108 |
| abstract_inverted_index.for | 23, 112, 235, 248 |
| abstract_inverted_index.one | 194 |
| abstract_inverted_index.the | 34, 51, 57, 61, 69, 72, 76, 83, 88, 92, 99, 110, 122, 136, 155, 169, 174, 197, 201, 210, 228, 231, 237 |
| abstract_inverted_index.use | 185 |
| abstract_inverted_index.1.2% | 148 |
| abstract_inverted_index.1.7. | 180 |
| abstract_inverted_index.2.0% | 165 |
| abstract_inverted_index.BLEU | 176 |
| abstract_inverted_index.DSD, | 18 |
| abstract_inverted_index.RNNs | 118 |
| abstract_inverted_index.Top1 | 137 |
| abstract_inverted_index.URL. | 262 |
| abstract_inverted_index.best | 238 |
| abstract_inverted_index.deep | 1, 25 |
| abstract_inverted_index.easy | 183 |
| abstract_inverted_index.from | 95 |
| abstract_inverted_index.gain | 223 |
| abstract_inverted_index.hard | 13 |
| abstract_inverted_index.have | 4 |
| abstract_inverted_index.only | 193 |
| abstract_inverted_index.over | 179 |
| abstract_inverted_index.show | 104 |
| abstract_inverted_index.that | 105 |
| abstract_inverted_index.them | 11 |
| abstract_inverted_index.this | 260 |
| abstract_inverted_index.very | 12 |
| abstract_inverted_index.wide | 114 |
| abstract_inverted_index.with | 64 |
| abstract_inverted_index.zero | 96 |
| abstract_inverted_index.1.1%, | 142, 152 |
| abstract_inverted_index.1.1%. | 167 |
| abstract_inverted_index.4.3%, | 145 |
| abstract_inverted_index.CNNs, | 117 |
| abstract_inverted_index.LSTMs | 120 |
| abstract_inverted_index.dense | 42, 101 |
| abstract_inverted_index.extra | 195 |
| abstract_inverted_index.final | 77 |
| abstract_inverted_index.first | 35 |
| abstract_inverted_index.flow, | 22 |
| abstract_inverted_index.given | 71 |
| abstract_inverted_index.https | 261 |
| abstract_inverted_index.image | 125 |
| abstract_inverted_index.incur | 214 |
| abstract_inverted_index.large | 6 |
| abstract_inverted_index.learn | 45 |
| abstract_inverted_index.local | 239 |
| abstract_inverted_index.model | 84 |
| abstract_inverted_index.range | 115 |
| abstract_inverted_index.ratio | 199 |
| abstract_inverted_index.score | 177 |
| abstract_inverted_index.shows | 227 |
| abstract_inverted_index.small | 65 |
| abstract_inverted_index.step, | 38, 54, 80 |
| abstract_inverted_index.step. | 203 |
| abstract_inverted_index.tasks | 123 |
| abstract_inverted_index.time, | 190, 206 |
| abstract_inverted_index.train | 40 |
| abstract_inverted_index.while | 241 |
| abstract_inverted_index.whole | 100 |
| abstract_inverted_index.Modern | 0 |
| abstract_inverted_index.VGG-16 | 143 |
| abstract_inverted_index.WSJ'93 | 156 |
| abstract_inverted_index.better | 30, 251 |
| abstract_inverted_index.change | 209 |
| abstract_inverted_index.incurs | 192 |
| abstract_inverted_index.making | 10 |
| abstract_inverted_index.models | 254 |
| abstract_inverted_index.neural | 2, 26 |
| abstract_inverted_index.number | 7 |
| abstract_inverted_index.pruned | 93 |
| abstract_inverted_index.speech | 130 |
| abstract_inverted_index.train. | 15 |
| abstract_inverted_index.(Dense) | 37 |
| abstract_inverted_index.caption | 127 |
| abstract_inverted_index.current | 232 |
| abstract_inverted_index.doesn't | 208 |
| abstract_inverted_index.finding | 236, 249 |
| abstract_inverted_index.improve | 109 |
| abstract_inverted_index.methods | 234 |
| abstract_inverted_index.network | 43, 58, 70, 211 |
| abstract_inverted_index.propose | 17 |
| abstract_inverted_index.pruning | 60 |
| abstract_inverted_index.retrain | 98 |
| abstract_inverted_index.testing | 205 |
| abstract_inverted_index.weights | 47, 66 |
| abstract_inverted_index.(Sparse) | 53 |
| abstract_inverted_index.accuracy | 138 |
| abstract_inverted_index.achieves | 244 |
| abstract_inverted_index.capacity | 85 |
| abstract_inverted_index.dataset, | 157, 171 |
| abstract_inverted_index.download | 258 |
| abstract_inverted_index.improved | 135, 159, 173 |
| abstract_inverted_index.increase | 82 |
| abstract_inverted_index.network. | 102 |
| abstract_inverted_index.networks | 3, 27 |
| abstract_inverted_index.optimum, | 240 |
| abstract_inverted_index.removing | 87 |
| abstract_inverted_index.sparsity | 73, 89, 198 |
| abstract_inverted_index.superior | 245 |
| abstract_inverted_index.training | 21, 107, 189, 233 |
| abstract_inverted_index.Flickr-8K | 170 |
| abstract_inverted_index.GoogLeNet | 140 |
| abstract_inverted_index.ImageNet, | 133 |
| abstract_inverted_index.ResNet-18 | 146 |
| abstract_inverted_index.ResNet-50 | 150 |
| abstract_inverted_index.achieving | 29 |
| abstract_inverted_index.available | 256 |
| abstract_inverted_index.inference | 216 |
| abstract_inverted_index.overhead. | 217 |
| abstract_inverted_index.practice: | 187 |
| abstract_inverted_index.solution. | 252 |
| abstract_inverted_index.(re-Dense) | 79 |
| abstract_inverted_index.DeepSpeech | 160 |
| abstract_inverted_index.NeuralTalk | 175 |
| abstract_inverted_index.connection | 46 |
| abstract_inverted_index.consistent | 219 |
| abstract_inverted_index.generation | 128 |
| abstract_inverted_index.inadequacy | 229 |
| abstract_inverted_index.parameters | 94 |
| abstract_inverted_index.regularize | 56 |
| abstract_inverted_index.retraining | 68 |
| abstract_inverted_index.DeepSpeech2 | 162 |
| abstract_inverted_index.Experiments | 103 |
| abstract_inverted_index.connections | 63 |
| abstract_inverted_index.constraint, | 90 |
| abstract_inverted_index.constraint. | 74 |
| abstract_inverted_index.effectively | 243 |
| abstract_inverted_index.experiments | 226 |
| abstract_inverted_index.importance. | 49 |
| abstract_inverted_index.parameters, | 9 |
| abstract_inverted_index.performance | 111, 222, 247 |
| abstract_inverted_index.significant | 221 |
| abstract_inverted_index.unimportant | 62 |
| abstract_inverted_index.architecture | 212 |
| abstract_inverted_index.optimization | 31, 246 |
| abstract_inverted_index.performance. | 32 |
| abstract_inverted_index.recognition. | 131 |
| abstract_inverted_index.regularizing | 24 |
| abstract_inverted_index.re-initialize | 91 |
| abstract_inverted_index.respectively. | 153 |
| abstract_inverted_index.classification, | 126 |
| abstract_inverted_index.hyper-parameter: | 196 |
| abstract_inverted_index.dense-sparse-dense | 20 |
| cited_by_percentile_year.max | 99 |
| cited_by_percentile_year.min | 89 |
| countries_distinct_count | 4 |
| institutions_distinct_count | 12 |
| citation_normalized_percentile.value | 0.91375464 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
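
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the zero-based positions where it occurs, so the text can be recovered by sorting the tokens by position. The following is a minimal sketch of that reconstruction, not an official client. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, and `sample` holds only the tokens at positions 0 to 4 taken from the rows above (the later occurrences of `deep`, `neural`, and `networks` are omitted).

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping (hypothetical helper)."""
    # Flatten the mapping into (position, token) pairs.
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Illustrative subset of the table above, limited to positions 0-4.
sample = {
    "have": [4],
    "deep": [1],
    "Modern": [0],
    "networks": [3],
    "neural": [2],
}

print(rebuild_abstract(sample))
```

Applied to the full mapping, the same approach would yield the complete abstract, with punctuation preserved because it is stored as part of each token (for example `parameters,` and `train.`).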