Lecture | April 4 | 4-5:30 p.m. | Dwinelle Hall, Academic Innovation Studio, 117 Dwinelle
Distributed representations of words -- where a word is represented not by its identity but rather by the distributional properties of the contexts it appears in -- are in many ways responsible for the significant gains in accuracy that many applications in natural language processing have witnessed over the past five years, and have driven many interesting applications in the computational humanities. While widespread interest in such representations took off in 2013 with the release of word2vec (Mikolov et al., 2013), distributed representations have a much longer history, arising out of an appreciation of context advocated not only by Wittgenstein, Harris, and Firth but also by a millennium of lexicographers and concordance-makers. In this talk, I'll trace the history of distributed representations of words from their origins to their prominence today, and unpack what's new about contemporary (neural) methods of learning such representations compared to previous approaches. By focusing on the fundamentals of representation learning, I'll also discuss how we can incorporate other forms of extra-linguistic information into the representation for a word (such as time, geographical location, or author identity) and use that more complex representation for linguistic reasoning as well.
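The core idea named at the start of the abstract -- representing a word by the distributional properties of the contexts it appears in -- can be sketched in a few lines. The following is a minimal, pre-neural illustration using raw co-occurrence counts and cosine similarity; the toy corpus and window size are illustrative assumptions, not material from the talk.

```python
from collections import Counter, defaultdict
import math

# Toy corpus (an assumption for illustration only).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat chased the dog".split(),
]

def context_vectors(sentences, window=2):
    """Map each word to counts of the words appearing within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vecs = context_vectors(corpus)
# "cat" and "dog" occur in similar contexts ("the ... sat on"),
# so their context vectors are more similar than, say, "cat" and "mat".
print(cosine(vecs["cat"], vecs["dog"]))
print(cosine(vecs["cat"], vecs["mat"]))
```

Neural methods like word2vec replace these explicit count vectors with dense, learned embeddings, but the underlying distributional premise is the same -- which is what makes the longer history the talk describes possible.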