Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures
2025 · Open Access
DOI: https://doi.org/10.48550/arxiv.2502.06189
Heterogeneous distillation is an effective way to transfer knowledge from cross-architecture teacher models to student models. However, existing heterogeneous distillation methods do not take full advantage of the dark knowledge hidden in the teacher's output, which limits their performance. To this end, we propose a novel framework named Multi-Level Decoupled Relational Knowledge Distillation (MLDR-KD) to unleash the potential of relational distillation in heterogeneous distillation. Concretely, we first introduce Decoupled Finegrained Relation Alignment (DFRA) at both the logit and feature levels to balance the trade-off between the distilled dark knowledge and the confidence in the correct category of the heterogeneous teacher model. Then, a Multi-Scale Dynamic Fusion (MSDF) module is applied to dynamically fuse the projected logits of multiscale features at different stages of the student model, further improving the performance of our method at the feature level. We verify our method on four architectures (CNNs, Transformers, MLPs and Mambas) and two datasets (CIFAR-100 and Tiny-ImageNet). Compared with the best available method, MLDR-KD improves student model performance by up to 4.86% on CIFAR-100 and 2.78% on Tiny-ImageNet, showing robustness and generality in heterogeneous distillation. Code will be released soon.
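The abstract does not spell out how MSDF computes its fusion, so the following is only a minimal sketch of one generic way to "dynamically fuse" per-stage logits: a softmax-weighted sum. The function names `softmax` and `fuse_stage_logits`, the `gate_scores` input, and the example numbers are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_stage_logits(stage_logits, gate_scores):
    """Weighted sum of per-stage logit vectors.

    stage_logits: one projected logit vector per student stage (equal lengths).
    gate_scores: one hypothetical, unnormalised score per stage; in a real
        model these would be produced by a learned module.
    """
    weights = softmax(gate_scores)
    num_classes = len(stage_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, stage_logits))
        for c in range(num_classes)
    ]


# Two stages, three classes; equal gate scores give each stage weight 0.5.
stage_logits = [[2.0, 0.0, -1.0], [0.0, 4.0, 1.0]]
fused = fuse_stage_logits(stage_logits, gate_scores=[0.0, 0.0])
print(fused)
```

With equal gate scores the result is the plain average of the two stages; unequal scores shift the fused logits toward the stage with the higher score.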
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2502.06189, https://arxiv.org/pdf/2502.06189
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4407386480
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4407386480 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2502.06189 (Digital Object Identifier)
- Title: Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-02-10 (full publication date if available)
- Authors: Yingshuai Yang, Ye Peng, Weihao Lin, Kuan-Han Li, Yanxuan Wen, Hao Jia, Tao Chen (list of authors in order)
- Landing page: https://arxiv.org/abs/2502.06189 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2502.06189 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2502.06189 (direct OA link when available)
- Concepts: Distillation, Computer science, Distributed computing, Process engineering, Chemistry, Engineering, Chromatography (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4407386480 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2502.06189 |
| ids.doi | https://doi.org/10.48550/arxiv.2502.06189 |
| ids.openalex | https://openalex.org/W4407386480 |
| fwci | |
| type | preprint |
| title | Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11053 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.972100019454956 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2207 |
| topics[0].subfield.display_name | Control and Systems Engineering |
| topics[0].display_name | Process Optimization and Integration |
| topics[1].id | https://openalex.org/T10791 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.9412000179290771 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2207 |
| topics[1].subfield.display_name | Control and Systems Engineering |
| topics[1].display_name | Advanced Control Systems Optimization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C204030448 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6258582472801208 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q101017 |
| concepts[0].display_name | Distillation |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5706250667572021 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C120314980 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3511230945587158 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q180634 |
| concepts[2].display_name | Distributed computing |
| concepts[3].id | https://openalex.org/C21880701 |
| concepts[3].level | 1 |
| concepts[3].score | 0.320870578289032 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2144042 |
| concepts[3].display_name | Process engineering |
| concepts[4].id | https://openalex.org/C185592680 |
| concepts[4].level | 0 |
| concepts[4].score | 0.16961669921875 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[4].display_name | Chemistry |
| concepts[5].id | https://openalex.org/C127413603 |
| concepts[5].level | 0 |
| concepts[5].score | 0.09868046641349792 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[5].display_name | Engineering |
| concepts[6].id | https://openalex.org/C43617362 |
| concepts[6].level | 1 |
| concepts[6].score | 0.07757458090782166 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q170050 |
| concepts[6].display_name | Chromatography |
| keywords[0].id | https://openalex.org/keywords/distillation |
| keywords[0].score | 0.6258582472801208 |
| keywords[0].display_name | Distillation |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5706250667572021 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/distributed-computing |
| keywords[2].score | 0.3511230945587158 |
| keywords[2].display_name | Distributed computing |
| keywords[3].id | https://openalex.org/keywords/process-engineering |
| keywords[3].score | 0.320870578289032 |
| keywords[3].display_name | Process engineering |
| keywords[4].id | https://openalex.org/keywords/chemistry |
| keywords[4].score | 0.16961669921875 |
| keywords[4].display_name | Chemistry |
| keywords[5].id | https://openalex.org/keywords/engineering |
| keywords[5].score | 0.09868046641349792 |
| keywords[5].display_name | Engineering |
| keywords[6].id | https://openalex.org/keywords/chromatography |
| keywords[6].score | 0.07757458090782166 |
| keywords[6].display_name | Chromatography |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2502.06189 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2502.06189 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2502.06189 |
| locations[1].id | doi:10.48550/arxiv.2502.06189 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2502.06189 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5050448098 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Yingshuai Yang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yang, Yaoxin |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5048305031 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-9613-7217 |
| authorships[1].author.display_name | Ye Peng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ye, Peng |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5068141650 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1499-2608 |
| authorships[2].author.display_name | Weihao Lin |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Lin, Weihao |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5013570241 |
| authorships[3].author.orcid | https://orcid.org/0009-0002-5972-7602 |
| authorships[3].author.display_name | Kuan-Han Li |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Li, Kangcong |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5101971064 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-4712-1953 |
| authorships[4].author.display_name | Yanxuan Wen |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wen, Yan |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5100668364 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-1157-5646 |
| authorships[5].author.display_name | Hao Jia |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Hao, Jia |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5100357748 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-3941-1603 |
| authorships[6].author.display_name | Tao Chen |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Chen, Tao |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2502.06189 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11053 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.972100019454956 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2207 |
| primary_topic.subfield.display_name | Control and Systems Engineering |
| primary_topic.display_name | Process Optimization and Integration |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W3085764877, https://openalex.org/W2514414740, https://openalex.org/W2377414158, https://openalex.org/W3199615306, https://openalex.org/W77207468, https://openalex.org/W3212781313, https://openalex.org/W124863575, https://openalex.org/W3203147184, https://openalex.org/W2037691954 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2502.06189 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2502.06189 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2502.06189 |
| primary_location.id | pmh:oai:arXiv.org:2502.06189 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2502.06189 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2502.06189 |
| publication_date | 2025-02-10 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 42 |
| abstract_inverted_index.We | 129 |
| abstract_inverted_index.an | 3 |
| abstract_inverted_index.at | 114 |
| abstract_inverted_index.be | 181 |
| abstract_inverted_index.do | 21 |
| abstract_inverted_index.in | 31, 59, 71, 88, 117, 126, 176 |
| abstract_inverted_index.is | 2, 103 |
| abstract_inverted_index.of | 26, 56, 92, 111, 123, 160 |
| abstract_inverted_index.on | 133, 164, 168 |
| abstract_inverted_index.to | 6, 13, 52, 77, 105, 162 |
| abstract_inverted_index.up | 161 |
| abstract_inverted_index.we | 40, 63 |
| abstract_inverted_index.and | 74, 85, 139, 144, 166, 174 |
| abstract_inverted_index.not | 22 |
| abstract_inverted_index.our | 124, 131, 152 |
| abstract_inverted_index.the | 27, 32, 54, 79, 86, 89, 93, 108, 148 |
| abstract_inverted_index.two | 141 |
| abstract_inverted_index.way | 5 |
| abstract_inverted_index.Code | 179 |
| abstract_inverted_index.MLPs | 138 |
| abstract_inverted_index.best | 149 |
| abstract_inverted_index.both | 72 |
| abstract_inverted_index.dark | 28, 83 |
| abstract_inverted_index.end, | 39 |
| abstract_inverted_index.four | 134 |
| abstract_inverted_index.from | 9 |
| abstract_inverted_index.full | 24 |
| abstract_inverted_index.fuse | 107 |
| abstract_inverted_index.take | 23 |
| abstract_inverted_index.this | 38 |
| abstract_inverted_index.will | 180 |
| abstract_inverted_index.with | 147, 158 |
| abstract_inverted_index.2.78% | 167 |
| abstract_inverted_index.4.86% | 163 |
| abstract_inverted_index.Then, | 97 |
| abstract_inverted_index.first | 64 |
| abstract_inverted_index.gains | 159 |
| abstract_inverted_index.logit | 73 |
| abstract_inverted_index.model | 156 |
| abstract_inverted_index.named | 45 |
| abstract_inverted_index.novel | 43 |
| abstract_inverted_index.soon. | 183 |
| abstract_inverted_index.their | 36 |
| abstract_inverted_index.(CNNs, | 136 |
| abstract_inverted_index.(DFRA) | 70 |
| abstract_inverted_index.(MSDF) | 101 |
| abstract_inverted_index.Fusion | 100 |
| abstract_inverted_index.hidden | 30 |
| abstract_inverted_index.level. | 128 |
| abstract_inverted_index.levels | 76 |
| abstract_inverted_index.logits | 110 |
| abstract_inverted_index.method | 125, 132 |
| abstract_inverted_index.model, | 119 |
| abstract_inverted_index.model. | 96 |
| abstract_inverted_index.models | 12 |
| abstract_inverted_index.module | 102 |
| abstract_inverted_index.stages | 116 |
| abstract_inverted_index.verify | 130 |
| abstract_inverted_index.Dynamic | 99 |
| abstract_inverted_index.MLDR-KD | 153 |
| abstract_inverted_index.applied | 104 |
| abstract_inverted_index.balance | 78 |
| abstract_inverted_index.between | 81 |
| abstract_inverted_index.correct | 90 |
| abstract_inverted_index.feature | 75, 127 |
| abstract_inverted_index.further | 120 |
| abstract_inverted_index.method, | 151 |
| abstract_inverted_index.methods | 20 |
| abstract_inverted_index.models. | 15 |
| abstract_inverted_index.output, | 34 |
| abstract_inverted_index.propose | 41 |
| abstract_inverted_index.showing | 172 |
| abstract_inverted_index.student | 14, 118, 155 |
| abstract_inverted_index.teacher | 11, 95 |
| abstract_inverted_index.unleash | 53 |
| abstract_inverted_index.Compared | 146 |
| abstract_inverted_index.However, | 16 |
| abstract_inverted_index.Mambas), | 140 |
| abstract_inverted_index.Relation | 68 |
| abstract_inverted_index.category | 91 |
| abstract_inverted_index.datasets | 142, 170 |
| abstract_inverted_index.existing | 17 |
| abstract_inverted_index.features | 113 |
| abstract_inverted_index.improves | 154 |
| abstract_inverted_index.limiting | 35 |
| abstract_inverted_index.released | 182 |
| abstract_inverted_index.transfer | 7 |
| abstract_inverted_index.(MLDR-KD) | 51 |
| abstract_inverted_index.Alignment | 69 |
| abstract_inverted_index.CIFAR-100 | 165 |
| abstract_inverted_index.Decoupled | 47, 66 |
| abstract_inverted_index.Knowledge | 49 |
| abstract_inverted_index.advantage | 25 |
| abstract_inverted_index.available | 150 |
| abstract_inverted_index.different | 115 |
| abstract_inverted_index.distilled | 82 |
| abstract_inverted_index.effective | 4 |
| abstract_inverted_index.framework | 44 |
| abstract_inverted_index.improving | 121 |
| abstract_inverted_index.introduce | 65 |
| abstract_inverted_index.knowledge | 8, 29, 84 |
| abstract_inverted_index.potential | 55 |
| abstract_inverted_index.projected | 109 |
| abstract_inverted_index.teacher's | 33 |
| abstract_inverted_index.trade-off | 80 |
| abstract_inverted_index.(CIFAR-100 | 143 |
| abstract_inverted_index.Relational | 48 |
| abstract_inverted_index.confidence | 87 |
| abstract_inverted_index.generality | 175 |
| abstract_inverted_index.multiscale | 112 |
| abstract_inverted_index.relational | 57 |
| abstract_inverted_index.robustness | 173 |
| abstract_inverted_index.Concretely, | 62 |
| abstract_inverted_index.Finegrained | 67 |
| abstract_inverted_index.Multi-Level | 46 |
| abstract_inverted_index.Multi-Scale | 98 |
| abstract_inverted_index.dynamically | 106 |
| abstract_inverted_index.performance | 122, 157 |
| abstract_inverted_index.Distillation | 50 |
| abstract_inverted_index.distillation | 1, 19, 58 |
| abstract_inverted_index.Heterogeneous | 0 |
| abstract_inverted_index.Tiny-ImageNet | 169 |
| abstract_inverted_index.Transformers, | 137 |
| abstract_inverted_index.architectures | 135 |
| abstract_inverted_index.distillation. | 61, 178 |
| abstract_inverted_index.heterogeneous | 18, 60, 94, 177 |
| abstract_inverted_index.respectively, | 171 |
| abstract_inverted_index.performance.To | 37 |
| abstract_inverted_index.Tiny-ImageNet). | 145 |
| abstract_inverted_index.cross-architecture | 10 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile |
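
The `abstract_inverted_index.*` rows above map each token of the abstract to the word positions at which it occurs. As a minimal sketch, the plain-text abstract can be rebuilt by placing every token at its positions and joining them in order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions; the sample copies nine tokens from the table and keeps only their positions 0 to 8 for brevity.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from a token -> list-of-positions mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Subset of the rows above, truncated to positions 0-8 for brevity.
sample = {
    "Heterogeneous": [0],
    "distillation": [1],
    "is": [2],
    "an": [3],
    "effective": [4],
    "way": [5],
    "to": [6],
    "transfer": [7],
    "knowledge": [8],
}
print(rebuild_abstract(sample))
# prints: Heterogeneous distillation is an effective way to transfer knowledge
```

Applied to the full set of rows, the same function reproduces the abstract exactly as indexed, including fused tokens such as `performance.To`.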