Rajeshwari Sah
Rajeshwari Sah, Machine Learning Engineer at Apple
Sunnyvale, California, United States
Rajeshwari Sah is a Machine Learning Engineer at Apple, where she works on production-scale Agentic AI and Retrieval-Augmented Generation systems powering next-generation intelligent experiences. She has led the development of multilingual, voice-enabled LLM frameworks and advanced agentic orchestration pipelines that significantly improved enterprise efficiency and system reliability.
Rajeshwari’s expertise spans fine-tuning techniques such as RLHF and DPO, multi-agent collaboration, and alignment-driven evaluation frameworks. She has delivered high-impact AI solutions across healthcare, finance, and e-commerce, focusing on workflow automation, process optimization, regulatory document intelligence, and AI-driven document generation. Her work enables knowledge-driven automation and modern policy and decision-support systems for global enterprises.
She is a strong advocate for responsible AI deployment, emphasizing innovation that remains auditable, reliable, and culturally consistent across markets. Rajeshwari holds a Master of Science in Computer Science from UC San Diego and continues to contribute to the evolving landscape of applied AI.
The Agent Tax: What Teams Learn Too Late About Multi-Agent Systems
Multi-agent systems are having a moment. They promise better decomposition, specialization, and flexibility than single-agent workflows, and in demos they often look dramatically more capable. But in production, many teams discover an uncomfortable truth: the gains come with an “agent tax.”
That tax shows up in places teams underestimate at the start—latency compounding across chained calls, brittle handoffs between agents, state and memory drift, tool failures that cascade across the workflow, higher evaluation complexity, and human escalation paths that were never fully designed. What looks elegant on a whiteboard can become expensive, opaque, and hard to debug in the real world.
This session shares practical lessons from building and operating agentic systems beyond the prototype stage. I’ll walk through the hidden costs that emerge in multi-agent architectures, when those costs are justified, and the engineering patterns that make these systems manageable: constrained orchestration, clear ownership of state, failure isolation, observability for intermediate steps, and selective human-in-the-loop design. Attendees will leave with a sharper framework for deciding when multi-agent systems are worth the complexity—and how to build them so they survive production.
3 learning outcomes
1. Recognize the real “agent tax”: latency, coordination overhead, evaluation complexity, reliability risks, and operational burden that appear when multi-agent systems move into production.
2. Know when multi-agent design is actually justified versus when a simpler single-agent or deterministic workflow is the better system choice.
3. Apply practical operating patterns for production agentic systems, including orchestration boundaries, state management, observability, fault isolation, and human escalation design.
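The failure-isolation and human-escalation patterns listed above can be sketched in a few lines. This is a minimal illustrative example, not a production framework: the names (`run_with_isolation`, `AgentStepError`, the `escalate` callback) are hypothetical, and the retry/backoff policy is an assumption.

```python
# Hypothetical sketch: isolate one agent step so its failure cannot cascade,
# and hand off to a human instead of propagating a broken result.
import time


class AgentStepError(Exception):
    """Raised when an agent step fails after all retries."""


def run_with_isolation(step, payload, max_retries=2, escalate=None):
    """Run one agent step; retry transient failures, then escalate rather than cascade."""
    last_exc = None
    for attempt in range(max_retries + 1):
        try:
            return step(payload)
        except Exception as exc:
            last_exc = exc
            time.sleep(0.05 * (2 ** attempt))  # simple exponential backoff
    # The failure is contained here: downstream agents never see a partial result.
    if escalate is not None:
        return escalate(payload, last_exc)
    raise AgentStepError(f"step failed after {max_retries + 1} attempts") from last_exc
```

The key design choice is that the orchestrator, not the failing agent, owns the fallback path, so the rest of the workflow only ever sees either a valid result or an explicit escalation.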
Speaker pitch / why me
I work on production-scale AI systems and have seen the gap between impressive agent demos and the realities of shipping dependable systems. My focus is on turning ambitious AI workflows into architectures that are observable, resilient, and operationally sane. This talk is aimed at engineering leaders and practitioners who want a grounded view of what multi-agent systems cost, where they create real value, and how to avoid the most common design mistakes.
Governing the AI-Graph: Observability and Security for LLM-Generated Queries
When we give AI agents access to our GraphQL APIs, we introduce a new class of distributed system challenges: non-deterministic queries, potential N+1 floods, and authorization bypasses. How do we ensure our "AI-generated" queries are safe and efficient?
This talk bridges the gap between AI Quality Engineering and GraphQL governance. Building on my work designing evaluation frameworks for multi-agent systems, I will present strategies for monitoring and governing agents that interact with GraphQL endpoints. We will discuss how to implement "Semantic Rate Limiting" (analyzing query complexity vs. user intent) and how to evaluate the accuracy of agent-generated GraphQL syntax using "LLM-as-a-Judge" frameworks.
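As a rough illustration of the semantic-rate-limiting idea, a gateway can score the structural complexity of an agent-generated query and compare it against a budget tied to the declared intent. The scoring heuristic, intent names, and budget values below are all illustrative assumptions, not a real GraphQL cost analyzer.

```python
# Hedged sketch of "semantic rate limiting": reject an agent-generated GraphQL
# query whose structural complexity exceeds the budget for its stated intent.

INTENT_BUDGETS = {"lookup": 10, "report": 50}  # hypothetical per-intent budgets


def query_complexity(query: str) -> int:
    """Rough proxy for query cost: one point per selected field, weighted by nesting depth."""
    depth, score = 0, 0
    for line in query.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.endswith("{"):
            depth += 1
            score += depth  # opening a selection set costs its depth
        elif stripped == "}":
            depth -= 1
        else:
            score += depth  # a leaf field costs its nesting depth
    return score


def allow_query(query: str, intent: str) -> bool:
    """Unknown intents get a zero budget, so they are rejected by default."""
    return query_complexity(query) <= INTENT_BUDGETS.get(intent, 0)
```

A production version would parse the query into an AST and apply field-level cost weights, but the gating logic (complexity versus intent budget) stays the same.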
We will also cover the "Human-in-the-Loop" aspect: using GraphQL subscriptions to stream agent reasoning to human supervisors for real-time validation before a mutation is executed. Attendees will learn how to open their Graphs to AI without compromising on security or performance reliability.
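The human-in-the-loop gate described above can be reduced to a small synchronous sketch: the agent's reasoning is published to a review channel (in a real system this could back a GraphQL subscription), and the mutation runs only after an explicit approval arrives. All names here are hypothetical.

```python
# Illustrative human-in-the-loop gate: stream the agent's reasoning to a
# supervisor and block the mutation until a verdict is received.
from queue import Queue


def gated_mutation(mutation_fn, reasoning: str, review_channel: Queue, approvals: Queue):
    """Publish reasoning, wait for a human verdict, then execute or abort."""
    review_channel.put(reasoning)   # the supervisor sees the rationale before the write
    verdict = approvals.get()       # blocks until a human responds
    if verdict != "approve":
        return {"executed": False, "reason": verdict}
    return {"executed": True, "result": mutation_fn()}
```

The important property is that the mutation function is never invoked before the verdict, so a rejected proposal has no side effects to roll back.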
AI Quality Engineering: Observability, Governance & Reliability for LLM Agent Architectures
As engineering teams transition from single-model assistants to orchestrating networks of tool-using agents and complex multilingual generation pipelines, they confront a new class of systemic failures. These LLM-powered architectures behave like distributed systems, not deterministic functions, producing inter-agent disagreement, reasoning drift, semantic divergence across locales, and subtle cascade failures that traditional QA cannot detect.
This session introduces AI Quality Engineering (AI-QE), a vital discipline for making LLM-powered systems observable, reliable, and governable at scale. Drawing from real-world deployments in multi-agent orchestration and global localization workflows, we will walk through the telemetry, evaluation methods, and governance patterns required to ensure alignment and robustness over time.
Key Takeaways:
1. Distributed Reasoning Observability: Building telemetry and instrumentation to trace reasoning hand-offs across multi-agent workflows, model delegation-loop failures, and measure tool-use effectiveness.
2. Inter-Agent Evaluation Metrics: Quantifying collaboration and systemic reliability using metrics like context-retention fidelity, agreement-divergence ratios, and hallucination cascade detection.
3. Multilingual Governance: Techniques to prevent semantic drift and cultural inconsistencies across languages and locales, a core challenge in global RAG pipelines.
4. Quality Gates & CI/CD: Implementing quality gates, domain-specific rubric scoring, and LLM-as-judge ensembles that integrate directly into continuous integration and deployment processes.
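A CI/CD quality gate of the kind named in takeaway 4 can be as simple as thresholding rubric scores from an evaluation run and failing the pipeline on regression. The rubric names and threshold values below are assumptions for illustration only.

```python
# Hypothetical CI quality gate: pass only if every evaluation rubric
# meets its minimum score. Rubric names and floors are illustrative.

THRESHOLDS = {"faithfulness": 0.90, "locale_consistency": 0.85}


def quality_gate(scores: dict) -> bool:
    """Return True only if every required rubric meets its threshold.

    A missing rubric counts as 0.0, so incomplete eval runs fail closed.
    """
    return all(scores.get(rubric, 0.0) >= floor for rubric, floor in THRESHOLDS.items())
```

In a pipeline, the gate's boolean result would map to the build's exit status, so a drop in any rubric (for example, from an LLM-as-judge ensemble) blocks deployment automatically.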