Overlapping Clustering Models, and One (class) SVM to Bind Them All: Neyman Seminar

Seminar | September 25 | 4-5 p.m. | 1011 Evans Hall

 Purnamrita Sarkar, UT Austin

 Department of Statistics

People belong to multiple communities, words belong to multiple topics, and books cover multiple genres; overlapping clusters are commonplace. Many existing overlapping clustering methods model each person (or word, or book) as a non-negative weighted combination of “exemplars” who each belong solely to one community, plus a small amount of noise. Geometrically, each person is a point in a cone whose corners are these exemplars. This basic form encompasses the widely used Mixed Membership Stochastic Blockmodel of networks and its degree-corrected variants, as well as topic models such as Latent Dirichlet Allocation. In this talk I will show that a simple one-class SVM yields provably consistent parameter inference for all such models, and that the resulting algorithm scales to large datasets.
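
In symbols, the shared form described above can be sketched as follows. The notation ($x_i$, $e_k$, $w_{ik}$, $\varepsilon_i$, $K$) is assumed here for illustration and is not taken from the talk.

```latex
x_i \;=\; \sum_{k=1}^{K} w_{ik}\, e_k \;+\; \varepsilon_i,
\qquad w_{ik} \ge 0 \quad \text{for all } k .
```

Here $x_i$ is the observed point for person $i$, $e_1, \dots, e_K$ are the exemplars, and $\varepsilon_i$ is the small noise term. An exemplar of community $k$ has $w_{ik} > 0$ for that $k$ only, which is what places it at a corner of the cone.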

Berkeley, CA 94720, (510) 642-2781