Exploring foci of:
arXiv (Cornell University)
OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability
November 2025 • Karen Ullrich, Arjun Subramonian, Amir Bar, Ivan Evtimov, Nikolaos Tsilivis, Randall Balestriero, Julia Kempe
Reliability is key to realizing the promise of autonomous UI-Agents, multimodal agents that directly interact with apps in the same manner as humans, as users must be able to trust an agent to complete a given task. Current evaluations rely on fixed environments, often clones of existing apps, which are limited in that they can only shed light on whether or how often an agent can complete a task within a specific environment. When deployed, however, agents are likely to encounter variations in app design and conten…
Social Environment
Impact Of The Covid-19 Pandemic On The Environment
Rio Declaration On Environment And Development
Learning Environment
World Environment Day
Integrated Development Environment
Natural Environment
Proxmox Virtual Environment
Human Impact On The Environment
Automatic Certificate Management Environment
Autumn Variations
Measure For Measure
Cinnamon (Desktop Environment)
Goldberg Variations
Enigma Variations
The Last Full Measure (2019 Film)
Hustler's P.O.M.E. (Product Of My Environment)
Cosmic (Desktop Environment)
Ministry Of Environment, Forest And Climate Change
Hot Dog Variations
Preboot Execution Environment
Tape Measure
Budgie (Desktop Environment)
Common Desktop Environment