Unifying Human Processes and Machine Models for Spoken Language Interfaces
Seminar | February 12 | 4-5 p.m. | Soda Hall, HP Auditorium (306)
Gopala Krishna Anumanchipalli, Associate Researcher, University of California, San Francisco
Recent years have witnessed tremendous progress in digital speech interfaces for information access (e.g., Amazon's Alexa, Google Home, etc.). The commercial success of these applications is hailed as one of the major achievements of the AI era. Indeed, these accomplishments are made possible only by sophisticated deep learning models trained on enormous amounts of supervised data over extensive computing infrastructure. Yet these systems are not robust to variations (such as accents and out-of-vocabulary words), remain uninterpretable, and fail in unexpected ways. Most important of all, these systems cannot easily be extended to users with speech and language disabilities, who would potentially benefit the most from the availability of such technologies.

I am a speech scientist interested in computational modeling of the human speech communication system toward building intelligent spoken language systems. I will present my research in which I have tapped into human speech communication processes to build robust spoken language systems, drawing specifically on theories of phonology and on physiological data, including cortical signals recorded in humans as they produce fluent speech. The insights from these studies reveal elegant organizational principles and computational mechanisms employed by the human brain for fluent speech production, the most complex of motor behaviors. These findings hold the key to the next revolution in human-inspired, human-compatible spoken language technologies that, besides alleviating the problems faced by current systems, can meaningfully impact the lives of millions of people with speech disabilities.