Research Interests


I work on core algorithmic aspects of computer voice recognition, and on artificial intelligence applied to voice forensics. My focus is on the development of technology for the automated discovery, measurement, representation and learning of the information encoded in the voice signal for optimal voice intelligence.

I worked on computer speech recognition and general audio processing from 1997 to 2014. During that time, I worked on a wide range of topics, including algorithms that made speech processing systems completely generalizable (agnostic to language), algorithms that enabled automated discovery and learning of information from speech, and algorithms that could process speech using minimal external (human-generated) knowledge. My goal was to enable greater automation, to create more powerful search strategies and more scalable learning algorithms for voice processing systems, and to find ways to make them work more accurately in high-noise and other complex acoustic environments.

In the course of this research, I studied the human voice very closely, and from multiple perspectives. Based on those studies, I began building up the science of profiling humans from their voice. This involves the concurrent deduction of myriad human parameters from voice. Like DNA and fingerprints, every human's voice is unique. It carries more information than we realize (or can hear). It carries signatures of the speaker's physical, physiological, medical, psychological, sociological, behavioral and environmental parameters, among other things. Profiling is focused on creating predictors that are agnostic to language and content. It is based on quantitative discovery and measurement of micro-features from the voice signal, and on the intricacies of the physics and bio-mechanics of human voice production. Because it focuses on the voice signal, and not its pragmatic content, it is agnostic to language.

More about this work.....

Media coverage...



Courses I teach


  1. Computational Forensics and AI
    Spring 2020, Website

  2. Advanced Topics: Quantum Computing Lab
    Spring 2020, Website
    The websites are not final yet.

  3. Computational Forensics and Investigative Intelligence: Syllabus


Students

  • Yolanda Gao, PhD, Electrical and Computer Engineering
  • Wayne Zhao, PhD, Electrical and Computer Engineering
  • Yandong Wen, PhD, Electrical and Computer Engineering
  • Shahan Ali Memon, Masters, LTI, School of Computer Science
  • Hira Yasin, Masters, LTI, School of Computer Science
  • Mahmoud Al Ismail, Masters, LTI, School of Computer Science
  • Daanish Ali Khan, Masters, CSD, School of Computer Science


Recent Publications

    Book

  • Profiling humans from their voice, Rita Singh, 408p. Publisher: Springer-Nature. Under processing by publisher. Expected worldwide release date: 15 June 2019.

    Currently, this book may be pre-ordered from amazon.com, and from springer.com (available as e-book and as paperback for worldwide shipping)

    For now, it may be cited using its ISBN (978-981-13-8402-8). A separate DOI will be available for each chapter in the book soon.

    Papers

  • List to be updated soon....


Literary creations


Older Publications

    Papers
  • List to be updated...
  • Voice impersonation using generative adversarial networks, Yang Gao, Rita Singh, Bhiksha Raj, Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 15-20 April 2018. pdf
  • A corrective training approach for text-independent speaker verification, Yandong Wen, Tianyan Zhou, Rita Singh, Bhiksha Raj, Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 15-20 April 2018. pdf

  • Voice disguise by mimicry: deriving statistical articulometric evidence to evaluate claimed impersonation, Rita Singh, Abelino Jiminez and Anders Oland, IET Biometrics, January 2017. pdf

  • more below....


Research publications (by topic)


  1. Forensics    Papers
    General theme: Forensic deductions from human voice. Speech and audio forensics are included.

  2. General audio analysis, microphone array processing, denoising, dereverberation, signal restoration    Papers
    General theme: Our approach is to model the effects of highly nonstationary noise and reverberation as compositional phenomena. Clean signals can then be recomposed from the bases of the composition. This approach differs from ones that model audio phenomena using dynamic generative models.

  3. Semi-supervised learning, structure discovery, statistical pattern recognition, classification    Papers
    These papers cover different topics, such as learning basic units of sound from data, discovering pronunciations for words in terms of these units, and iteratively selecting better classifiers using weaker classifiers, as a gradient ascent solution to training good acoustic models from completely untranscribed data. They also include general developments in classification techniques.

  4. Acoustic modeling, decoding, speech processing, speech recognition, adaptation, keyword spotting    Papers
    These papers relate to core and peripheral issues in speech recognition and processing for HMM-based ASR systems.

  5. Systems, applications, projects    Papers
    These papers describe systems developed or deployed for specific tasks. They also include papers from short-term student projects, technical reports and other writeups.

  6. Miscellaneous    Papers
    Patents, and papers on other topics such as chaos theory, radar signal design and geodynamics. I worked on these topics from 1993 to 1998. Chaos and complexity theory remain my favorite hobby subjects.


Other activities

  • Associate Editor, IEEE Signal Processing Letters (Retired recently!)
  • Sphinx-4
  • LDC And other things for me...


Earlier Teaching


  1. Computational Forensics and Investigative Intelligence
    Taught in Spring 2017 and Spring 2018, simultaneously at
    • CMU Pittsburgh
    • Hamad Bin Khalifa University (HBKU), Qatar
    • CMU Qatar
    • CMU Africa
  2. An Introduction to Knowledge based Deep Learning and Socratic Coaches
    11-364 CMU Pittsburgh
    This course was taught in person by Prof. James Karl Baker at the CMU Pittsburgh location. I was nominally co-instructor but couldn't help Jim much.
  3. Design and Implementation of Speech Recognition Systems
    Last taught many years ago. The earliest version was co-taught with Prof. James Baker.

About me: I'm happiest where I come from. I like simple things. I admire art. When I have time I spend much of it looking at art. I write poetry. I collect comics (the Harvey Pekar and Blake and Mortimer kind..) and puzzles (the Charles Wysocki and Jane Wooster Scott kind..). I read mysteries. I don't watch TV or movies; I haven't switched on my TV for years. I don't know if my TV works. I don't use a cellphone; I have one but it's mostly lost anyway. I'd rather watch the clouds in the sky, and the birds and the leaves. A groundhog lives in a grand home under the deck stairs just outside my window. It even has a lamp outside its home. I can tell you all about it. In the summer I wake up to the song of the cardinal. I want nothing more from life or the world, except for medical science to hurry up and make everyone well. Other than that, I am content.

Some hi-res pictures of me

