Modern Estimation of Information Theoretic Functionals

Seminar | November 6 | 3-4 p.m. | 540 Cory Hall

 Jiantao Jiao

 Electrical Engineering and Computer Sciences (EECS)

Modern inferential tasks—ranging from graphical model learning to image registration to inference of gene regulatory networks—frequently involve estimation of information theoretic functionals such as entropy, mutual information, Kullback–Leibler divergence, and total variation distance. This talk will focus on recent progress in our understanding of the performance, structure, and deployment of near-optimal estimators for such functionals. We present general methods for constructing near-optimal estimators of these functionals in three observation models: (1) i.i.d. data from a discrete distribution, (2) a sample path from a finite-state Markov chain, and (3) i.i.d. data from a continuous distribution. In the first model, we discuss two distinct approaches, with one aimed at estimating the functionals directly, and the other aimed at reconstructing the sorted probability vector. Both methods give rise to near-optimal functional estimators. For estimating the entropy rate of a Markov chain, we provide a non-asymptotic analysis of the sample complexity in terms of the size of the state space and mixing time. We illustrate our findings by estimating the entropy rates of the English language from the Penn Treebank and the Google 1 Billion Word Dataset, which serve as benchmarks for the perplexity in language modeling. For differential entropy estimation, we characterize the minimax behavior over Besov balls, and show that a fixed-k nearest neighbor estimator adaptively achieves the minimax rates up to logarithmic factors without cognizance of the smoothness of the density. In all three models, the near-optimal estimators are efficiently computable and exhibit a “sample size enlargement” phenomenon, i.e., they attain with n samples what prior methods would have needed n log(n) samples to achieve.

Based on collaborations with Weihao Gao, Yanjun Han, Chuan-Zheng Lee, Kartik Venkat, Pramod Viswanath, Tsachy Weissman, Yihong Wu, and Tiancheng Yu.

 ashwinpm@berkeley.edu