Amy Pavel
Task Mode: Dynamic Filtering for Task-Specific Web Navigation using LLMs
Modern web interfaces are unnecessarily complex to use as they overwhelm users with excessive text and visuals unrelated to their current goals. This problem particularly impacts screen reader users (SRUs), who navigate content sequentiall…
Surfacing Variations to Calibrate Perceived Reliability of MLLM-generated Image Descriptions
Multimodal large language models (MLLMs) provide new opportunities for blind and low vision (BLV) people to access visual information in their daily lives. However, these models often produce errors that are difficult to detect without sig…
VeasyGuide: Personalized Visual Guidance for Low-vision Learners on Instructor Actions in Presentation Videos
Instructors often rely on visual actions such as pointing, marking, and sketching to convey information in educational presentation videos. These subtle visual cues often lack verbal descriptions, forcing low-vision (LV) learners to search…
TalkLess: Blending Extractive and Abstractive Summarization for Editing Speech to Preserve Content and Style
Millions of people listen to podcasts, audio stories, and lectures, but editing speech remains tedious and time-consuming. Creators remove unnecessary words, cut tangential discussions, and even re-record speech to make recordings concise …
Morae: Proactively Pausing UI Agents for User Choices
User interface (UI) agents promise to make inaccessible or complex UIs easier to access for blind and low-vision (BLV) users. However, current UI agents typically perform tasks end-to-end without involving users in critical choices or maki…
Vid2Coach: Transforming How-To Videos into Task Assistants
People use videos to learn new recipes, exercises, and crafts. Such videos remain difficult for blind and low vision (BLV) people to follow as they rely on visual comparison. Our observations of visual rehabilitation therapists (VRTs) guid…
VideoDiff: Human-AI Video Co-Creation with Alternatives
To make an engaging video, people sequence interesting moments and add visuals such as B-rolls or text. While video editing requires time and effort, AI has recently shown strong potential to make editing easier through suggestions and aut…
Lotus: Creating Short Videos From Long Videos With Abstractive and Extractive Summarization
Short-form videos are popular on platforms like TikTok and Instagram as they quickly capture viewers' attention. Many creators repurpose their long-form videos to produce short-form videos, but creators report that planning, extracting, an…
HistoryPalette: Supporting Exploration and Reuse of Past Alternatives in Image Generation and Editing
All creative tasks require creators to iteratively produce, select, and discard potentially useful ideas. Now, creativity tools include generative AI features (e.g., Photoshop Generative Fill) that increase the number of alternatives creat…
Context-Aware Image Descriptions for Web Accessibility
Blind and low vision (BLV) internet users access images on the web via text descriptions. New vision-to-language models such as GPT-V, Gemini, and LLaVa can now provide detailed image descriptions on-demand. While prior research and gui…
DesignChecker: Visual Design Support for Blind and Low Vision Web Developers
Blind and low vision (BLV) developers create websites to share knowledge and showcase their work. A well-designed website can engage audiences and deliver information effectively, yet it remains challenging for BLV developers to review …
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Enabling machines to understand structured visuals like slides and user interfaces is essential for making them accessible to people with disabilities. However, achieving such understanding computationally has required manual data collecti…
Long-Form Answers to Visual Questions from Blind and Low Vision People
Vision language models can now generate long-form answers to questions about images - long-form visual question answers (LFVQA). We contribute VizWiz-LF, a dataset of long-form answers to visual questions posed by blind and low vision (BLV…
Making Short-Form Videos Accessible with Hierarchical Video Summaries
Short videos on platforms such as TikTok, Instagram Reels, and YouTube Shorts (i.e. short-form videos) have become a primary source of information and entertainment. Many short-form videos are inaccessible to blind and low vision (BLV) …
Barriers to Photosensitive Accessibility in Virtual Reality
Virtual reality (VR) systems have grown in popularity as an immersive modality for daily activities such as gaming, socializing, and working. However, this technology is not always accessible for people with photosensitive epilepsy (PSE) w…
COMPA: Using Conversation Context to Achieve Common Ground in AAC
Group conversations often shift quickly from topic to topic, leaving a small window of time for participants to contribute. AAC users often miss this window due to the speed asymmetry between using speech and using AAC devices. AAC users m…
Accessibility Evaluation of an Assistive Social Robotic Platform for Rehabilitation and Its Improvement by Means of Haptic Devices
The use of robotic platforms with social capabilities is becoming increasingly common to support people's daily lives. Such systems are commonly referred to as Socially Assistive Robots (SAR). Many SARs focus on a human-machine interaction…
GenAssist: Making Image Generation Accessible
Blind and low vision (BLV) creators use images to communicate with sighted audiences. However, creating or retrieving images is challenging for BLV creators as it is difficult to use authoring tools or assess image search results. Thus, cr…
Exploring Community-Driven Descriptions for Making Livestreams Accessible
People watch livestreams to connect with others and learn about their hobbies. Livestreams feature multiple visual streams including the main video, webcams, on-screen overlays, and chat, all of which are inaccessible to livestream viewers…
AVscript: Accessible Video Editing with Audio-Visual Scripts
Sighted and blind and low vision (BLV) creators alike use videos to communicate with broad audiences. Yet, video editing remains inaccessible to BLV creators. Our formative study revealed that current video editing tools make it difficult …
Exploratory Thematic Analysis of Crowdsourced Photosensitivity Warnings
Films often include sequences of flashing lights for visual effect that may inadvertently trigger seizures when viewed by individuals with photosensitive epilepsy (PSE). Warnings about photosensitive risk in films can help people with PSE …
SlideSpecs: Automatic and Interactive Presentation Feedback Collation
Presenters often collect audience feedback through practice talks to refine their presentations. In formative interviews, we find that although text feedback and verbal discussions allow presenters to receive feedback, organizing that feed…
Diffscriber: Describing Visual Design Changes to Support Mixed-Ability Collaborative Presentation Authoring
Visual slide-based presentations are ubiquitous, yet slide authoring tools are largely inaccessible to people who are blind or visually impaired (BVI). When authoring presentations, the 9 BVI presenters in our formative study usually work …
CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding
Authors make their videos visually accessible by adding audio descriptions (AD), and auditorily accessible by adding closed captions (CC). However, creating AD and CC is challenging and tedious, especially for non-professional describers a…
Tech Help Desk: Support for Local Entrepreneurs Addressing the Long Tail of Computing Challenges
Even entrepreneurs whose businesses are not technological (e.g., handmade goods) need to be able to use a wide range of computing technologies in order to achieve their business goals. In this paper, we follow a participatory action resear…
Toward supporting quality alt text in computing publications
While researchers have examined alternative (alt) text for social media and news contexts, few have studied the status and challenges for authoring alt text of figures in computing-related publications. These figures are distinct, often co…