Naman Goyal
Google DeepMind (previously NVIDIA, Apple), USA
Mountain View, California, United States
Naman Goyal is a Machine Learning Software Engineer at Google DeepMind, where he is a foundational member of the team developing Gemini's Deep Research capabilities. His work focuses on enabling the model to handle complex, multi-step research tasks by formulating intricate plans, analyzing diverse online sources, and synthesizing comprehensive reports. He is also involved in applied research to enhance the model's reasoning, planning, and instruction-following capabilities. Before DeepMind, Naman worked at NVIDIA on the perception stack for autonomous vehicles and at Apple on multimodal learning for visually rich document understanding. He was also a Computer Vision Research Fellow at Adobe, where he developed novel training strategies for deep metric learning and image retrieval. Naman holds a Master's degree in Computer Science from Columbia University, where his thesis explored multimodal learning and on-device natural language processing for smart replies for multilingual speakers in the context of code-switching. He is a co-author of the recent paper "Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities".
Naman spoke at the following conferences in 2025, each with an audience of 500-1000+:
https://www.youtube.com/watch?v=SN_67dHJwAU
• Adobe Research World Headquarters, San Jose, Sept 2025: Architectures for the Next Generation of Enterprise AI Agents
• The AI Conference, San Francisco, Sept 2025: The Ascendancy and Challenges of Agentic Large Language Models (https://aiconference.com/speakers/naman-goyal/)
• AI Risk Summit, CISO Forum, Half Moon Bay, Aug 2025: The Ascendancy and Challenges of Agentic Large Language Models (https://www.airisksummit.com/event-session/the-ascendancy-and-challenges-of-agentic-large-language-models/)
• AI Dev Summit, San Francisco, May 2025: The Dual Edge of Multimodal AI: Advancing Accessibility While Navigating Bias (https://aidevsummit2025.sched.com/eve)
Topics
The Dual Edge of Multimodal AI: Advancing Accessibility While Navigating Bias
Multimodal AI systems—capable of processing text, images, audio, and video simultaneously—present transformative opportunities for accessibility while introducing complex challenges related to bias and fairness. This presentation explores this duality through evidence-based analysis of current implementations and future directions.
For individuals with disabilities, multimodal AI creates unprecedented opportunities: visual recognition systems achieve high accuracy for common objects, real-time speech-to-text transcription operates with minimal error rates, and adaptive learning technologies significantly improve information retention for neurodivergent learners. However, these same systems exhibit concerning bias patterns: recruitment algorithms show substantial ranking disparities across demographics, speech recognition error rates vary considerably across accents, and much of the gender bias in image datasets can be traced to problematic relationships between visual elements and text annotations.
The presentation outlines a comprehensive framework for responsible development, including: inclusive design principles (with evidence that involving disability consultants surfaces substantially more potential accessibility barriers), representative dataset curation (addressing the reality that images in computer vision datasets rarely include people with visible disabilities), rigorous testing methodologies (conventional sampling typically captures very few users with disabilities), and ethical governance considerations (most AI practitioners want clearer accessibility standards).
Through case studies including image description technologies (showing notable accuracy disparities between Western and non-Western cultural contexts), diverse speech recognition (where community-driven data collection reduced error rates for underrepresented accent groups), and emotion recognition systems (with higher error rates for non-Western expressions), the presentation provides practical insights for developing multimodal AI that enhances accessibility without reinforcing existing inequities.
The Ascendancy and Challenges of Agentic Large Language Models
The development of Large Language Models (LLMs) has shifted from passive text generators to proactive, goal-oriented "agentic LLMs," capable of planning, utilizing tools, interacting with environments, and maintaining memory. This talk provides a critical review of this rapidly evolving field, focusing in particular on innovations from late 2023 through 2025. We will explore the core architectural pillars enabling this transition, including hierarchical planning, advanced long-term memory solutions like Mem0, and sophisticated tool integration. Prominent operational frameworks such as ReAct and Plan-and-Execute will be examined alongside emerging multi-agent systems (MAS); an illustrative ReAct-style sketch follows this abstract. The talk will also critically analyze fundamental limitations like "planning hallucination", the "tyranny of the prior" in which pre-training biases override contextual information, and difficulties in robust generalization and adaptation. We will also discuss the evolving landscape of evaluation methodologies, moving beyond traditional metrics to capability-based assessments and benchmarks like BFCL v3 for tool use and LoCoMo for long-term memory.
Furthermore, the presentation will address the critical ethical imperatives and safety protocols necessitated by increasingly autonomous agents, including risks like alignment faking, multi-agent security threats, and the need for frameworks such as the Relative Danger Coefficient (RDC).
Finally, we will explore pioneering frontiers, including advanced multi-agent systems, embodied agency for physical world interaction, and the pursuit of continual and meta-learning for adaptive agents. The talk will conclude by synthesizing the current state, emphasizing that overcoming core limitations in reasoning, contextual grounding, and evaluation is crucial for realizing robust, adaptable, and aligned agentic intelligence.
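For readers unfamiliar with the ReAct pattern referenced above: it interleaves model-generated reasoning ("thoughts") with tool invocations ("actions") and feeds the results ("observations") back into the prompt until the model produces a final answer. The sketch below is a minimal illustration of that loop, not code from the talk; `call_llm`, the `search` tool, and the Thought/Action/Observation format are hypothetical stand-ins.

```python
# Minimal ReAct-style agent loop (illustrative sketch; all helpers are hypothetical).

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call. A real agent would query an actual model here;
    this stub searches once and then answers so the loop can run end to end."""
    if "Observation:" in prompt:
        return "Thought: I have enough information.\nFinal Answer: (demo answer)"
    return "Thought: I should look this up first.\nAction: search[agentic LLMs]"


def search(query: str) -> str:
    """Hypothetical search tool returning a text snippet."""
    return f"(stub result for: {query})"


TOOLS = {"search": search}


def react_agent(task: str, max_steps: int = 5) -> str:
    """Alternate Thought -> Action -> Observation steps until a final answer appears."""
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = call_llm(transcript + "Next step (Thought/Action or Final Answer):")
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:
            # Parse "Action: tool_name[argument]" and run the named tool.
            action = step.split("Action:", 1)[1].strip()
            name, _, arg = action.partition("[")
            observation = TOOLS.get(name.strip(), search)(arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return "No final answer within the step budget."


if __name__ == "__main__":
    print(react_agent("Summarize recent work on agentic LLMs"))
```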
DeveloperWeek 2026 (Sessionize event, upcoming)
AI DevSummit 2025 (Sessionize event)