Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search

arXiv (Cornell University) • June 2022 • Taehyeon Kim, Heesoo Myeong, Se-Young Yun

Knowledge Distillation (KD) has recently emerged as a popular method for compressing neural networks. Recent studies have proposed generalized distillation methods that find the parameters and architectures of student models at the same time. However, this search requires a lot of computation and has the disadvantage of considering only convolutional blocks in its search space. This paper introduces a new algorithm, coined as Trust Region Aware architecture search to Distill…

