I work on core algorithmic aspects of computer speech recognition. My focus is on understanding and learning from the speech signal, specifically on measuring and representing the information encoded in it for optimal pattern classification. Speech usually doesn't happen in isolation. Understanding general audio, its interplay with human speech, and its contextual significance is a vital part of my research. The final goal of my work is to enable computing machines to recognize speech better in general, and much better than currently possible in high-noise and other kinds of complex environments, using minimal external (human-generated) knowledge. On the periphery, I work on the general quest for more automation, powerful search strategies, and more scalable learning algorithms for automatic speech recognition systems.
- Computational Forensics and Investigative Intelligence
- CYSE-645 Hamad Bin Khalifa University (HBKU), Qatar
- 15-498 (W) CMU Qatar
- 15-498 (R) CMU Rwanda
- An Introduction to Knowledge based Deep Learning and Socratic Coaches
- 11-364 CMU Pittsburgh
- Yolanda Gao, PhD, Electrical and Computer Engineering
- Wayne Zhao, PhD, Electrical and Computer Engineering
- Tuomas Virtanen, Rita Singh, Bhiksha Raj (Eds.), "Techniques for Noise Robustness in Automatic Speech Recognition", 2012.
Research publications (by topic)
General theme: Forensic deductions from human voice, speech and audio signals in general. Publications in this area lag behind our current work by a couple of years at least, as we rapidly respond to the challenges posed by real crimes by creating technology that is immediately applied to the problem at hand.
- General audio analysis, microphone array processing, denoising, dereverberation, signal restoration
General theme: Our approach is to model the effects of highly nonstationary noise and reverberation as compositional phenomena. Clean signals can then be recomposed from the bases of the composition. This approach differs from ones that model audio phenomena using dynamic generative models.
- Semi-supervised learning, structure discovery, statistical pattern recognition, classification
These papers cover different topics, such as learning basic units of sound from data, discovering pronunciations for words in terms of these units, and iteratively selecting better classifiers using weaker classifiers in a gradient-ascent solution to training good acoustic models from completely untranscribed data. They also include general developments in classification techniques.
- Acoustic modeling, decoding, speech processing, speech recognition, adaptation, keyword spotting
These papers relate to core and peripheral issues in speech recognition and processing for HMM-based ASR systems.
- Systems, applications, projects
These papers describe systems developed or deployed for specific tasks. They also include papers from short-term student projects, technical reports, and other write-ups.
Patents and papers on other topics such as chaos theory, radar signal design, and geodynamics. I worked on these topics from 1993 to 1998. Chaos and complexity theory remain my favorite hobby subjects.
- Associate Editor, IEEE Signal Processing Letters