Next week, Dmitry Vetrov (Higher School of Economics & Samsung AI center, Moscow) will be visiting us and will give a talk titled “Interesting properties of the variational dropout framework”. You are all cordially invited to this talk on Thursday morning, July 26, at 11:00 in C1.112 (FNWI, Amsterdam Science Park).
Abstract: Recently it was shown that dropout, a popular regularization technique, can be treated as a Bayesian procedure. This Bayesian interpretation allows us to extend the initial model and to set individual dropout rates for each weight of a DNN. Variational inference automatically sets the rates to their optimal values, which surprisingly leads to very high sparsification of the DNN. The effect is similar in spirit to the well-known ARD procedure for linear models and neural networks. By exploiting a different extension, one may show that DNNs can be trained with extremely large dropout rates, even when the traditional signal-to-noise ratio is zero (e.g. when all weights in a layer have zero means and tunable variances). Coupled with recent discoveries about the loss landscape, these results provide a new perspective on building much more powerful yet compact ensembles and/or removing the redundancy in modern deep learning models. In the talk we will cover these topics and present our most recent results in exploring those models.
Bio: Dmitry Vetrov (graduated from Moscow State University in 2003, PhD in 2006) is a research professor at the Higher School of Economics, Moscow, and head of the deep learning lab at the Samsung AI center in Moscow. He is the founder and head of the Bayesian methods research group, which has become one of the strongest research groups in Russia. Three of his former PhD students have become researchers at DeepMind. His research focuses on combining the Bayesian framework with deep learning models. His group is also actively involved in building scalable tools for stochastic optimization, applying tensor decomposition methods to large-scale ML, constructing cooperative multi-agent systems, etc.