Session

Developing inclusive streaming speech processing systems for voice disorders / non-common speech

Most publicly available / open-source AI speech models are trained on healthy speech. While we often focus on the clarity and naturalness of voices, it is equally important to recognize and accommodate individuals with disordered voices.

As voice interfaces become more prevalent, it is vital to consider all kinds of speech, including disordered, affected, or atypical speech. However, such data is often scarce.

How do we deal with the imbalance between healthy and affected speech corpora? Actively including diverse voices is important; what other techniques (such as synthetic data generation, data augmentation, etc.) can we use to create inclusive speech AI systems?
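As a concrete illustration of the data-augmentation idea mentioned above, here is a minimal sketch (not the session's actual pipeline) of two common waveform-level augmentations: additive noise at a target SNR and speed perturbation via simple linear resampling. The function name and parameters are illustrative assumptions, and only NumPy is used:

```python
import numpy as np

def augment_waveform(wav, noise_snr_db=20.0, speed=1.1, rng=None):
    """Return two augmented variants of a mono waveform (1-D float array):
    (1) additive Gaussian noise at roughly `noise_snr_db` dB SNR, and
    (2) speed perturbation by factor `speed` via linear interpolation.
    This is a toy sketch; real pipelines often use torchaudio or sox."""
    rng = rng or np.random.default_rng(0)

    # 1) Additive noise: scale noise power to hit the requested SNR.
    signal_power = np.mean(wav ** 2)
    noise_power = signal_power / (10.0 ** (noise_snr_db / 10.0))
    noisy = wav + rng.normal(0.0, np.sqrt(noise_power), size=wav.shape)

    # 2) Speed perturbation: resample the time axis by `speed`
    #    (speed > 1 shortens the signal, < 1 lengthens it).
    n_out = int(len(wav) / speed)
    old_t = np.arange(len(wav))
    new_t = np.linspace(0, len(wav) - 1, n_out)
    sped = np.interp(new_t, old_t, wav)

    return noisy, sped
```

Augmentations like these can multiply a small disordered-speech corpus several-fold, though they cannot substitute for genuinely diverse recordings.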

The second part of this session touches upon audio streaming for AI models. Specifically, how do we reduce the context size of CNNs for real-time applications?

Akash Raj

CTO at Whispp

Amsterdam, The Netherlands
