The Dual Edge of Multimodal AI: Advancing Accessibility While Navigating Bias
Multimodal AI systems—capable of processing text, images, audio, and video simultaneously—present transformative opportunities for accessibility while introducing complex challenges related to bias and fairness. This presentation explores this duality through evidence-based analysis of current implementations and future directions.
For individuals with disabilities, multimodal AI creates unprecedented opportunities: visual recognition systems achieve high accuracy for common objects, real-time speech-to-text transcription operates with minimal error rates, and adaptive learning technologies significantly improve information retention for neurodivergent learners. However, these same systems exhibit concerning bias patterns: recruitment algorithms show substantial ranking disparities across demographic groups, speech recognition error rates vary considerably across accents, and gender bias in many images can be traced to spurious associations between visual elements and their text annotations.
The presentation outlines a comprehensive framework for responsible development, including: inclusive design principles (with evidence that disability consultants identify far more potential accessibility barriers than teams without lived experience of disability), representative dataset curation (addressing the reality that images in computer vision datasets rarely include people with visible disabilities), rigorous testing methodologies (conventional sampling typically captures very few users with disabilities), and ethical governance considerations (most AI practitioners want clearer accessibility standards).
Through case studies of image description technologies (showing notable accuracy disparities between Western and non-Western cultural contexts), speech recognition for diverse accents (where community-driven data collection reduced error rates for underrepresented accent groups), and emotion recognition systems (with higher error rates for non-Western expressions), the presentation provides practical insights for developing multimodal AI that enhances accessibility without reinforcing existing inequities.

Naman Goyal
Google DeepMind (previously NVIDIA, Apple), USA
Mountain View, California, United States