Supervised topic models

Seminar: Neyman Seminar | December 2 | 4-5 p.m. | 1011 Evans Hall

Jon McAuliffe, Adjunct Assistant Professor, Department of Statistics, UC Berkeley

The scale of contemporary electronic text collections has led to growing interest in statistical models based on so-called topics. Formally, a topic is a probability distribution over a vocabulary. Informally, a topic is intended to capture an underlying semantic theme. Most topic models are unsupervised: only the words in the documents are modelled. I will describe supervised latent Dirichlet allocation, a model in which each document is paired with a response variable. The goal is to infer latent topics predictive of the response, then use the fitted model to predict response values for previously unobserved documents. Since exact maximum likelihood is intractable, I will present an approximate EM algorithm which uses variational inference. I'll also discuss results on example document prediction problems, with comparisons to other approaches. Some background on text modeling and variational methods will be provided.

Refreshments from about 3:45pm in 1011 Evans

510-642-2781
Copyright © 2009 UC Regents