UISurf: Toward Universal UI Automation with Cross-Environment Agents
In this talk, I will introduce UISurf, an open-source multimodal agentic UI automation platform that enables AI agents to perceive, reason, and collaborate across browser and desktop environments to complete end-to-end tasks involving multiple user interfaces.
UISurf was designed to provide a secure and extensible playground for developing, evaluating, and testing UI automation agents within isolated environments. The platform enables researchers and developers to study how multimodal agents can interpret user interfaces, coordinate actions, and execute complex workflows safely across both web and desktop applications.
UISurf consists of three primary components: uisurf-agent, the runtime responsible for UI automation agents; uisurf-admin, the session orchestration and management service; and uisurf-app, the full-stack user application. Its multi-agent architecture includes a planning_agent that transforms natural-language requests into structured execution plans; specialized Browser and Desktop Agents for environment-specific interactions; an automation_agent that coordinates execution and inter-agent handoffs through Agent-to-Agent (A2A) communication; and a summarization_agent that generates the final task summary for the user.
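The handoff order described above can be illustrated with a small sketch. This is not UISurf's actual code or API: the function signatures, the `Step` type, the keyword-based routing, and the in-process function calls (standing in for A2A messaging) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One unit of a structured execution plan (hypothetical shape)."""
    environment: str  # "browser" or "desktop"
    instruction: str


def planning_agent(request: str) -> List[Step]:
    # Hypothetical stand-in: a real planner would call a multimodal model.
    # Here each " then "-separated clause becomes one step, routed by keyword.
    steps = []
    for clause in request.split(" then "):
        env = "browser" if "web" in clause.lower() else "desktop"
        steps.append(Step(env, clause.strip()))
    return steps


def browser_agent(step: Step) -> str:
    # Placeholder for environment-specific browser interaction.
    return f"[browser] done: {step.instruction}"


def desktop_agent(step: Step) -> str:
    # Placeholder for environment-specific desktop interaction.
    return f"[desktop] done: {step.instruction}"


def automation_agent(plan: List[Step],
                     workers: Dict[str, Callable[[Step], str]]) -> List[str]:
    # Coordinates execution: hands each step to the agent for its environment.
    # Plain function calls stand in for Agent-to-Agent (A2A) communication.
    return [workers[step.environment](step) for step in plan]


def summarization_agent(results: List[str]) -> str:
    # Produces the final task summary shown to the user.
    return f"Completed {len(results)} step(s): " + "; ".join(results)


def run_task(request: str) -> str:
    plan = planning_agent(request)
    workers = {"browser": browser_agent, "desktop": desktop_agent}
    return summarization_agent(automation_agent(plan, workers))


if __name__ == "__main__":
    print(run_task("download the report from the web portal "
                   "then open it in the spreadsheet app"))
```

In this sketch the coordinator depends only on a mapping from environment name to worker, so another environment-specific agent could be added without changing the planner or the summarizer.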
UISurf supports both fully autonomous execution and human-in-the-loop supervision, providing a practical and extensible framework for studying, benchmarking, and deploying cross-environment UI automation systems powered by multimodal AI agents.
Henry Ruiz
Research Scientist at Texas A&M AgriLife Research, GDE in AI and Cloud
College Station, Texas, United States