Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search
2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2206.13130
Knowledge Distillation (KD) has recently emerged as a popular method for compressing neural networks. Recent studies have proposed generalized distillation methods that find the parameters and architectures of student models at the same time. Still, these search methods require a lot of computation to search for architectures and have the disadvantage of considering only convolutional blocks in their search space. This paper introduces a new algorithm, named Trust Region Aware architecture search to Distill knowledge Effectively (TRADE), that rapidly finds effective student architectures from several state-of-the-art architectures using a trust-region Bayesian optimization approach. Experimental results show that the proposed TRADE algorithm consistently outperforms both the conventional NAS approach and pre-defined architectures under KD training.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2206.13130, https://arxiv.org/pdf/2206.13130
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4283703886
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4283703886 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2206.13130 (Digital Object Identifier)
- Title: Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2022 (year of publication)
- Publication date: 2022-06-27 (full publication date if available)
- Authors: Taehyeon Kim, Heesoo Myeong, Se-Young Yun (list of authors in order)
- Landing page: https://arxiv.org/abs/2206.13130 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2206.13130 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2206.13130 (direct OA link when available)
- Concepts: Computer science, Architecture, Distillation, Convolutional neural network, Bayesian optimization, Artificial intelligence, Machine learning, Computation, Search cost, Algorithm, Chemistry, Visual arts, Organic chemistry, Economics, Microeconomics, Art (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2023: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4283703886 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2206.13130 |
| ids.doi | https://doi.org/10.48550/arxiv.2206.13130 |
| ids.openalex | https://openalex.org/W4283703886 |
| fwci | |
| type | preprint |
| title | Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10036 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9991999864578247 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Advanced Neural Network Applications |
| topics[1].id | https://openalex.org/T11689 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9979000091552734 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Adversarial Robustness in Machine Learning |
| topics[2].id | https://openalex.org/T12535 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.989799976348877 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Machine Learning and Data Classification |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7247115969657898 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C123657996 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6592663526535034 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q12271 |
| concepts[1].display_name | Architecture |
| concepts[2].id | https://openalex.org/C204030448 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6234637498855591 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q101017 |
| concepts[2].display_name | Distillation |
| concepts[3].id | https://openalex.org/C81363708 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5794552564620972 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q17084460 |
| concepts[3].display_name | Convolutional neural network |
| concepts[4].id | https://openalex.org/C2778049539 |
| concepts[4].level | 2 |
| concepts[4].score | 0.48833104968070984 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q17002908 |
| concepts[4].display_name | Bayesian optimization |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.4633740484714508 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C119857082 |
| concepts[6].level | 1 |
| concepts[6].score | 0.4548396170139313 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[6].display_name | Machine learning |
| concepts[7].id | https://openalex.org/C45374587 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4380953311920166 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q12525525 |
| concepts[7].display_name | Computation |
| concepts[8].id | https://openalex.org/C21782646 |
| concepts[8].level | 2 |
| concepts[8].score | 0.41103190183639526 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q841666 |
| concepts[8].display_name | Search cost |
| concepts[9].id | https://openalex.org/C11413529 |
| concepts[9].level | 1 |
| concepts[9].score | 0.23144370317459106 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[9].display_name | Algorithm |
| concepts[10].id | https://openalex.org/C185592680 |
| concepts[10].level | 0 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[10].display_name | Chemistry |
| concepts[11].id | https://openalex.org/C153349607 |
| concepts[11].level | 1 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q36649 |
| concepts[11].display_name | Visual arts |
| concepts[12].id | https://openalex.org/C178790620 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q11351 |
| concepts[12].display_name | Organic chemistry |
| concepts[13].id | https://openalex.org/C162324750 |
| concepts[13].level | 0 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[13].display_name | Economics |
| concepts[14].id | https://openalex.org/C175444787 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q39072 |
| concepts[14].display_name | Microeconomics |
| concepts[15].id | https://openalex.org/C142362112 |
| concepts[15].level | 0 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q735 |
| concepts[15].display_name | Art |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7247115969657898 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/architecture |
| keywords[1].score | 0.6592663526535034 |
| keywords[1].display_name | Architecture |
| keywords[2].id | https://openalex.org/keywords/distillation |
| keywords[2].score | 0.6234637498855591 |
| keywords[2].display_name | Distillation |
| keywords[3].id | https://openalex.org/keywords/convolutional-neural-network |
| keywords[3].score | 0.5794552564620972 |
| keywords[3].display_name | Convolutional neural network |
| keywords[4].id | https://openalex.org/keywords/bayesian-optimization |
| keywords[4].score | 0.48833104968070984 |
| keywords[4].display_name | Bayesian optimization |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.4633740484714508 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/machine-learning |
| keywords[6].score | 0.4548396170139313 |
| keywords[6].display_name | Machine learning |
| keywords[7].id | https://openalex.org/keywords/computation |
| keywords[7].score | 0.4380953311920166 |
| keywords[7].display_name | Computation |
| keywords[8].id | https://openalex.org/keywords/search-cost |
| keywords[8].score | 0.41103190183639526 |
| keywords[8].display_name | Search cost |
| keywords[9].id | https://openalex.org/keywords/algorithm |
| keywords[9].score | 0.23144370317459106 |
| keywords[9].display_name | Algorithm |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2206.13130 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2206.13130 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2206.13130 |
| locations[1].id | doi:10.48550/arxiv.2206.13130 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | public-domain |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/public-domain |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2206.13130 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100774197 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5496-2625 |
| authorships[0].author.display_name | Taehyeon Kim |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Kim, Taehyeon |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5023870894 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Heesoo Myeong |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Myeong, Heesoo |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5091674853 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Se-Young Yun |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Yun, Se-Young |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2206.13130 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-06-30T00:00:00 |
| display_name | Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10036 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9991999864578247 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Advanced Neural Network Applications |
| related_works | https://openalex.org/W3106461837, https://openalex.org/W4293226380, https://openalex.org/W3168182983, https://openalex.org/W4387079005, https://openalex.org/W4221166418, https://openalex.org/W3117893869, https://openalex.org/W2968265130, https://openalex.org/W4231775656, https://openalex.org/W4321487865, https://openalex.org/W2046435967 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2023 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2206.13130 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2206.13130 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2206.13130 |
| primary_location.id | pmh:oai:arXiv.org:2206.13130 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2206.13130 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2206.13130 |
| publication_date | 2022-06-27 |
| publication_year | 2022 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 7, 40, 64 |
| abstract_inverted_index.In | 14 |
| abstract_inverted_index.KD | 113 |
| abstract_inverted_index.as | 6, 68 |
| abstract_inverted_index.at | 28 |
| abstract_inverted_index.in | 57 |
| abstract_inverted_index.of | 25, 42, 52 |
| abstract_inverted_index.to | 44, 74 |
| abstract_inverted_index.NAS | 107 |
| abstract_inverted_index.and | 23, 48, 109 |
| abstract_inverted_index.for | 10, 46 |
| abstract_inverted_index.has | 3, 49 |
| abstract_inverted_index.lot | 41 |
| abstract_inverted_index.new | 65 |
| abstract_inverted_index.our | 98 |
| abstract_inverted_index.the | 29, 50, 105 |
| abstract_inverted_index.(KD) | 2 |
| abstract_inverted_index.This | 61 |
| abstract_inverted_index.been | 33 |
| abstract_inverted_index.both | 104 |
| abstract_inverted_index.find | 21 |
| abstract_inverted_index.from | 85 |
| abstract_inverted_index.have | 32 |
| abstract_inverted_index.only | 54 |
| abstract_inverted_index.same | 30 |
| abstract_inverted_index.show | 97 |
| abstract_inverted_index.that | 20, 79 |
| abstract_inverted_index.this | 36 |
| abstract_inverted_index.time | 31 |
| abstract_inverted_index.Aware | 71 |
| abstract_inverted_index.TRADE | 100 |
| abstract_inverted_index.Trust | 69 |
| abstract_inverted_index.finds | 81 |
| abstract_inverted_index.paper | 62 |
| abstract_inverted_index.their | 58 |
| abstract_inverted_index.trust | 90 |
| abstract_inverted_index.under | 112 |
| abstract_inverted_index.using | 89 |
| abstract_inverted_index.Region | 70 |
| abstract_inverted_index.Still, | 35 |
| abstract_inverted_index.blocks | 56 |
| abstract_inverted_index.coined | 67 |
| abstract_inverted_index.method | 9, 38 |
| abstract_inverted_index.models | 27 |
| abstract_inverted_index.neural | 12 |
| abstract_inverted_index.recent | 15 |
| abstract_inverted_index.region | 91 |
| abstract_inverted_index.search | 37, 45, 59, 73 |
| abstract_inverted_index.space. | 60 |
| abstract_inverted_index.Distill | 75 |
| abstract_inverted_index.emerged | 5 |
| abstract_inverted_index.methods | 19 |
| abstract_inverted_index.popular | 8 |
| abstract_inverted_index.rapidly | 80 |
| abstract_inverted_index.results | 96 |
| abstract_inverted_index.several | 86 |
| abstract_inverted_index.student | 26, 83 |
| abstract_inverted_index.(TRADE), | 78 |
| abstract_inverted_index.Bayesian | 92 |
| abstract_inverted_index.approach | 108 |
| abstract_inverted_index.proposed | 99 |
| abstract_inverted_index.recently | 4 |
| abstract_inverted_index.requires | 39 |
| abstract_inverted_index.studies, | 16 |
| abstract_inverted_index.Knowledge | 0 |
| abstract_inverted_index.algorithm | 101 |
| abstract_inverted_index.approach. | 94 |
| abstract_inverted_index.effective | 82 |
| abstract_inverted_index.knowledge | 76 |
| abstract_inverted_index.networks. | 13 |
| abstract_inverted_index.proposed. | 34 |
| abstract_inverted_index.training. | 114 |
| abstract_inverted_index.algorithm, | 66 |
| abstract_inverted_index.introduces | 63 |
| abstract_inverted_index.parameters | 22 |
| abstract_inverted_index.Effectively | 77 |
| abstract_inverted_index.compressing | 11 |
| abstract_inverted_index.computation | 43 |
| abstract_inverted_index.considering | 53 |
| abstract_inverted_index.generalized | 17 |
| abstract_inverted_index.outperforms | 103 |
| abstract_inverted_index.pre-defined | 110 |
| abstract_inverted_index.Distillation | 1 |
| abstract_inverted_index.Experimental | 95 |
| abstract_inverted_index.architecture | 72, 111 |
| abstract_inverted_index.consistently | 102 |
| abstract_inverted_index.conventional | 106 |
| abstract_inverted_index.disadvantage | 51 |
| abstract_inverted_index.distillation | 18 |
| abstract_inverted_index.optimization | 93 |
| abstract_inverted_index.architectures | 24, 47, 84, 88 |
| abstract_inverted_index.convolutional | 55 |
| abstract_inverted_index.state-of-the-art | 87 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/11 |
| sustainable_development_goals[0].score | 0.4300000071525574 |
| sustainable_development_goals[0].display_name | Sustainable cities and communities |
| citation_normalized_percentile |
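
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch of how the readable abstract could be recovered from those rows, the Python below parses a few of them and reorders the tokens by position. The helper names `parse_rows` and `rebuild_abstract`, the `stop` parameter, and the `SAMPLE_ROWS` excerpt are illustrative assumptions, not part of any OpenAlex tooling; the sample covers only the tokens of the first sentence.

```python
import re

# Matches one flattened payload row, e.g.
# "| abstract_inverted_index.has | 3, 49 |"
ROW = re.compile(
    r"^\|\s*abstract_inverted_index\.(\S+)\s*\|\s*([\d,\s]+?)\s*\|\s*$"
)

# Hypothetical excerpt: rows copied from the payload table above,
# limited to the tokens that make up the first sentence.
SAMPLE_ROWS = """\
| abstract_inverted_index.a | 7, 40, 64 |
| abstract_inverted_index.as | 6, 68 |
| abstract_inverted_index.for | 10, 46 |
| abstract_inverted_index.has | 3, 49 |
| abstract_inverted_index.(KD) | 2 |
| abstract_inverted_index.method | 9, 38 |
| abstract_inverted_index.neural | 12 |
| abstract_inverted_index.emerged | 5 |
| abstract_inverted_index.popular | 8 |
| abstract_inverted_index.recently | 4 |
| abstract_inverted_index.Knowledge | 0 |
| abstract_inverted_index.networks. | 13 |
| abstract_inverted_index.compressing | 11 |
| abstract_inverted_index.Distillation | 1 |
"""


def parse_rows(text):
    """Turn flattened table rows into {token: [positions]}."""
    index = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            token, positions = match.group(1), match.group(2)
            index[token] = [int(p) for p in positions.split(",")]
    return index


def rebuild_abstract(index, stop=None):
    """Place each token at its positions and join them in order.

    If `stop` is given, only positions below it are kept, which is
    useful when the index passed in is a partial excerpt.
    """
    by_position = {}
    for token, positions in index.items():
        for pos in positions:
            if stop is None or pos < stop:
                by_position[pos] = token
    return " ".join(by_position[pos] for pos in sorted(by_position))


if __name__ == "__main__":
    print(rebuild_abstract(parse_rows(SAMPLE_ROWS), stop=14))
```

Because the sample is a partial excerpt, `stop=14` drops the later occurrences of tokens such as `a` and `has`; with the full set of rows, `stop` can be left as `None` to rebuild the whole abstract.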