arXiv (Cornell University)
Regularizing Deep Neural Networks with Stochastic Estimators of Hessian Trace
August 2022 • Yucong Liu, Shixing Yu, Tong Lin
In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent bound on the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov stability in dynamical systems. We adopt the Hutchinson method, a classical unbiased estimator of the trace of a matrix, and further accelerate its calculation using a dropout scheme. Experiments demonstrate that our method outperforms existing…
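To make the Hutchinson estimator mentioned in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `hutchinson_trace`, its parameters, and the toy matrix are assumptions chosen for illustration, and the dropout-based acceleration described in the abstract is not shown. The estimator averages v^T A v over random probe vectors v with E[v v^T] = I, which gives an unbiased estimate of tr(A) using only matrix-vector products.

```python
import numpy as np


def hutchinson_trace(matvec, dim, num_samples=100, seed=0):
    """Estimate tr(A) given only a function computing v -> A @ v.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        v = rng.choice([-1.0, 1.0], size=dim)
        # E[v^T A v] = tr(A) because E[v v^T] = I.
        total += v @ matvec(v)
    return total / num_samples


# Toy symmetric matrix standing in for a Hessian (an assumption for the demo).
demo_rng = np.random.default_rng(1)
B = demo_rng.standard_normal((50, 50))
A = (B + B.T) / 2

estimate = hutchinson_trace(lambda v: A @ v, dim=50, num_samples=2000)
exact = np.trace(A)
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

For a neural network, the explicit matrix would presumably be replaced by a Hessian-vector product computed with automatic differentiation, so the full Hessian never has to be formed. With Rademacher probes the estimate is exact for diagonal matrices, since each v_i^2 equals 1; the variance comes entirely from the off-diagonal entries.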