Seminar | April 24 | 12-1:30 p.m. | 560 Evans Hall
David Rolnick, University of Pennsylvania
It is well known that the expressivity of a neural network depends on its architecture, with deeper networks expressing more complex functions. For ReLU networks, which are piecewise linear, the number of distinct linear regions is a natural measure of expressivity. It is possible to construct networks for which the number of linear regions grows exponentially with depth. However, we show that the expressivity of networks is in practice far below the theoretical maximum. At initialization, we prove that the average number of regions along any one-dimensional subspace grows only linearly, instead of exponentially, in the total number of neurons. More generally, the average number of regions in a k-dimensional subspace is bounded above by the kth power of the number of neurons, irrespective of network architecture. Our theory and empirical results suggest that this behavior persists during training. We conclude that inductive bias may play a more significant role than expressivity in the success of deep networks. Joint work with Boris Hanin.
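
As a rough illustration of the quantity discussed in the abstract, the sketch below estimates how many linear regions a randomly initialized ReLU network has along a one-dimensional line through input space. Everything in it is an assumption made for illustration and not the speakers' method: the He-normal weight initialization, the bias scale of 0.1, the segment t in [-5, 5], the grid resolution, and all function names. It counts changes in the on/off activation pattern between neighboring grid points, so it can miss very small regions and should be read as an estimate, not an exact count.

```python
import numpy as np


def init_relu_net(widths, input_dim, rng):
    """Random hidden layers: He-normal weights, small Gaussian biases (assumed init)."""
    params = []
    fan_in = input_dim
    for width in widths:
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(width, fan_in))
        b = rng.normal(0.0, 0.1, size=width)
        params.append((W, b))
        fan_in = width
    return params


def activation_patterns(params, X):
    """Boolean array of shape (n_points, total_neurons): which ReLUs are active."""
    h = X
    patterns = []
    for W, b in params:
        pre = h @ W.T + b
        patterns.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(patterns, axis=1)


def count_regions_on_line(params, x0, direction, t_max=5.0, n_samples=20001):
    """Estimate the number of regions crossed by x0 + t * direction, |t| <= t_max."""
    ts = np.linspace(-t_max, t_max, n_samples)
    X = x0[None, :] + ts[:, None] * direction[None, :]
    P = activation_patterns(params, X)
    # A new region starts wherever any neuron switches state between grid points.
    changes = np.any(P[1:] != P[:-1], axis=1)
    return int(changes.sum()) + 1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    input_dim, width, trials = 8, 16, 10  # hypothetical sizes
    for depth in (1, 2, 3, 4):
        counts = []
        for _ in range(trials):
            params = init_relu_net([width] * depth, input_dim, rng)
            x0 = rng.normal(size=input_dim)
            d = rng.normal(size=input_dim)
            d /= np.linalg.norm(d)
            counts.append(count_regions_on_line(params, x0, d))
        print(f"neurons={width * depth:3d}  mean regions on line={np.mean(counts):.1f}")
```

Comparing the printed mean against the total neuron count for each depth gives an informal check of the linear-growth claim at initialization; a finer grid or a longer segment would change the estimate.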