Session
Synthetic Lives for Smarter Agents: Generating Personal Context Data for AI Benchmark Design
Building benchmarks for personal AI agents requires something that doesn't naturally exist: large-scale, coherent personal context data. Real user data is private. Randomly generated data lacks the internal consistency of real human lives.
For ASTRA-bench, we developed an event-driven synthetic data generation pipeline to solve this problem. Rather than generating isolated data points, we grounded all synthetic data in longitudinal life events (biography, social network, and pattern of life) for four distinct protagonists. These events then propagated into multi-source personal context: emails, calendars, messages, notes, and preferences that are causally coherent with one another.
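As a rough illustration of the event-driven idea, the sketch below derives several artifacts from each life event and tags them with the event's ID, so every email, calendar entry, and message traces back to a shared cause. It is not the ASTRA-bench pipeline: the names `LifeEvent`, `Artifact`, `propagate`, and `build_context`, the sample events, and the fixed text templates (standing in for LLM generation) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class LifeEvent:
    # Hypothetical schema: one dated event in a protagonist's life.
    event_id: str
    protagonist: str
    day: date
    summary: str
    participants: tuple[str, ...]


@dataclass(frozen=True)
class Artifact:
    source: str    # "email", "calendar", or "message"
    day: date
    text: str
    event_id: str  # link back to the originating life event


def propagate(event: LifeEvent) -> list[Artifact]:
    """Derive one artifact per source from a single life event.

    Fixed templates are an assumption made for this sketch; they only
    show how artifacts can share one underlying cause.
    """
    others = ", ".join(event.participants)
    return [
        Artifact("email", event.day - timedelta(days=3),
                 f"Invite from {event.protagonist} to {others}: {event.summary}",
                 event.event_id),
        Artifact("calendar", event.day,
                 f"{event.summary} with {others}", event.event_id),
        Artifact("message", event.day + timedelta(days=1),
                 f"Thanks for coming to the {event.summary}!", event.event_id),
    ]


def build_context(events: list[LifeEvent]) -> list[Artifact]:
    """Flatten all derived artifacts into one timeline ordered by date."""
    artifacts = [a for e in events for a in propagate(e)]
    return sorted(artifacts, key=lambda a: a.day)


# Invented sample data, for illustration only.
events = [
    LifeEvent("evt-1", "Alex", date(2025, 3, 14), "housewarming dinner",
              ("Sam", "Priya")),
    LifeEvent("evt-2", "Alex", date(2025, 3, 10), "dentist appointment",
              ("Dr. Lee",)),
]
context = build_context(events)

for artifact in context:
    print(artifact.day, artifact.source, artifact.event_id, artifact.text)
```

Because every artifact carries its `event_id`, a benchmark question can be checked against the originating event rather than against any single email or calendar entry.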
The generation itself used an agentic LLM workflow to produce 2,413 scenarios that span weeks and months of protagonist life, enabling time-evolving context that single-snapshot benchmarks cannot capture. The result: synthetic personal data realistic enough that state-of-the-art models, including Claude Opus 4.5 and DeepSeek V3.2, show significant performance degradation when reasoning over it.
Maitrik Patel
Engineering and Product Leader | AI/ML • Web • Design
San Francisco, California, United States