Mark Grobman
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
The post-training quantization (PTQ) challenge of bringing quantized neural net accuracy close to original has drawn much attention driven by industry demand. Many of the methods emphasize optimization of a specific degree-of-freedom (DoF)…
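As context for what finetuning "all degrees of freedom" can involve, here is a minimal, hypothetical sketch (not the paper's QFT algorithm): a fake-quantizer whose scale is a learnable parameter, wrapped around a linear layer so that the quantization scale, the weights, and the bias are optimized jointly against the float layer's outputs on a small calibration batch. The names `FakeQuant`, `QuantLinear`, and `round_ste` are illustrative.

```python
import torch
import torch.nn as nn

def round_ste(x: torch.Tensor) -> torch.Tensor:
    """Round with a straight-through gradient (identity in the backward pass)."""
    return x + (x.round() - x).detach()

class FakeQuant(nn.Module):
    """Uniform fake-quantizer whose scale is a learnable parameter."""
    def __init__(self, init_scale: float, num_bits: int = 8):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(init_scale))
        self.qmax = 2 ** (num_bits - 1) - 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = torch.clamp(round_ste(x / self.scale), -self.qmax, self.qmax)
        return q * self.scale

class QuantLinear(nn.Module):
    """Linear layer whose weight, bias, and weight-quantizer scale all train."""
    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.linear = linear
        self.wq = FakeQuant(init_scale=linear.weight.abs().max().item() / 127)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.linear(x, self.wq(self.linear.weight), self.linear.bias)

# Joint finetuning against the float layer's outputs on a calibration batch.
float_layer = nn.Linear(64, 64)
calib = torch.randn(256, 64)
qlayer = QuantLinear(nn.Linear(64, 64))
qlayer.linear.load_state_dict(float_layer.state_dict())
opt = torch.optim.Adam(qlayer.parameters(), lr=1e-4)  # scale, weight, and bias together
for _ in range(100):
    loss = ((qlayer(calib) - float_layer(calib).detach()) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The straight-through estimator lets gradients pass through the rounding, so all three parameter groups receive updates from the same reconstruction loss.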
Tiled Squeeze-and-Excite: Channel Attention With Local Spatial Context
In this paper we investigate the amount of spatial context required for channel attention. To this end we study the popular squeeze-and-excite (SE) block, which is a simple and lightweight channel attention mechanism. SE blocks and their nume…
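For reference, a standard SE block gates each channel using globally pooled statistics; the second module below is an illustrative "local context" variant that pools over tiles instead of the whole map. The tile size and module names are assumptions for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SqueezeExcite(nn.Module):
    """Standard SE block: squeeze with global average pooling, excite with a
    small bottleneck MLP, then rescale every channel."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        gates = self.fc(x.mean(dim=(2, 3)))   # (N, C): one gate per channel
        return x * gates.view(n, c, 1, 1)

class TiledSqueezeExcite(nn.Module):
    """Illustrative local-context variant: pool over tiles instead of the whole
    feature map, compute gates per tile with 1x1 convs, broadcast them back."""
    def __init__(self, channels: int, reduction: int = 16, tile: int = 7):
        super().__init__()
        self.tile = tile
        self.excite = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        pooled = F.adaptive_avg_pool2d(x, (max(h // self.tile, 1), max(w // self.tile, 1)))
        gates = F.interpolate(self.excite(pooled), size=(h, w), mode="nearest")
        return x * gates
```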
Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis
Quantization is an essential step in the efficient deployment of deep learning models and as such is an increasingly popular research topic. An important practical aspect that is not addressed in the current literature is how to analyze an…
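One common form such a layer-wise analysis can take, sketched here under that assumption (not necessarily the paper's exact procedure), is to fake-quantize a single layer at a time while the rest of the network stays in full precision and record the metric per layer; `eval_fn` is an assumed user-supplied validation callable.

```python
import torch

def fake_quantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniform symmetric fake-quantization of a weight tensor."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    if scale == 0:
        return w.clone()
    return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

@torch.no_grad()
def layerwise_sensitivity(model: torch.nn.Module, eval_fn, num_bits: int = 8) -> dict:
    """Quantize one layer at a time and record eval_fn(model) for each.

    eval_fn is assumed to run the user's validation loop and return a scalar
    metric (e.g. top-1 accuracy).
    """
    results = {}
    for name, module in model.named_modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
            original = module.weight.detach().clone()
            module.weight.copy_(fake_quantize(original, num_bits))
            results[name] = eval_fn(model)   # degradation attributable to this layer alone
            module.weight.copy_(original)    # restore full precision before the next layer
    return results
```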
Fighting Quantization Bias With Bias
Low-precision representation of deep neural networks (DNNs) is critical for efficient deployment of deep learning applications on embedded platforms; however, converting the network to low precision degrades its performance. Crucially, netw…
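The title points at correcting quantization-induced error through the bias terms; a minimal sketch of one such correction is below, assuming a float layer, its quantized copy with a bias, and a batch of calibration activations (all hypothetical names).

```python
import torch

@torch.no_grad()
def bias_correct(float_layer: torch.nn.Linear, quant_layer: torch.nn.Linear,
                 calib_inputs: torch.Tensor) -> None:
    """Shift the quantized layer's bias so that the mean output error on the
    calibration batch is cancelled (assumes quant_layer.bias is not None)."""
    error = float_layer(calib_inputs) - quant_layer(calib_inputs)  # (N, C_out)
    quant_layer.bias += error.mean(dim=0)                          # per-channel correction
```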
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks - for …
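The truncated abstract hints at a function-preserving per-channel rescaling; below is a hedged sketch of equalizing two consecutive convolutions separated by a ReLU, which is one plausible reading of that degree of freedom rather than the paper's exact factorization.

```python
import torch

@torch.no_grad()
def equalize_pair(conv1: torch.nn.Conv2d, conv2: torch.nn.Conv2d) -> None:
    """Multiply conv1's output channels by s and divide conv2's matching input
    channels by s. With a ReLU in between (and nothing else on the path) the
    network function is unchanged, but the weight ranges of the two layers are
    balanced and therefore easier to quantize."""
    r1 = conv1.weight.abs().amax(dim=(1, 2, 3))   # range of each conv1 output channel
    r2 = conv2.weight.abs().amax(dim=(0, 2, 3))   # range of each conv2 input channel
    s = torch.sqrt(r2 / r1.clamp(min=1e-12)).clamp(min=1e-12)

    conv1.weight.mul_(s.view(-1, 1, 1, 1))
    if conv1.bias is not None:
        conv1.bias.mul_(s)
    conv2.weight.div_(s.view(1, -1, 1, 1))
```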