Qianli Xu
Implantable Medical Electronic Devices: Sensing Mechanisms, Communication Methods, and the Biodegradable Future Open
In the context of the relentless pursuit of precision, intelligence, and personalization within the realm of medical technology, the real-time monitoring of human physiological signals has assumed heightened significance. Implantable wirel…
SPASCA: Social Presence and Support with Conversational Agent for Persons Living with Dementia Open
We present SPASCA - a conversational AI system that promotes psychological and cognitive well-being of persons living with dementia (PLWD). This system features an AI agent that provides social presence and support to PLWD through verbal c…
VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting Open
Large Language Model (LLM)-based agents have shown promise in procedural tasks, but the potential of multimodal instructions augmented by texts and videos to assist users remains under-explored. To address this gap, we propose the Visually…
DOTA: Distributional Test-Time Adaptation of Vision-Language Models Open
Vision-language foundation models (VLMs), such as CLIP, exhibit remarkable performance across a wide range of tasks. However, deploying these models can be unreliable when significant distribution gaps exist between training and test data,…
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation Open
A well-known dilemma in large vision-language models (e.g., GPT-4, LLaVA) is that while increasing the number of vision tokens generally enhances visual understanding, it also significantly raises memory and computational costs, especially…
Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models Open
Recent advancements in large vision-language models (LVLMs) have demonstrated impressive capability in visual information understanding with human language. Despite these advances, LVLMs still face challenges with multimodal hallucination,…
Team I2R-VI-FF Technical Report on EPIC-KITCHENS VISOR Hand Object Segmentation Challenge 2023 Open
In this report, we present our approach to the EPIC-KITCHENS VISOR Hand Object Segmentation Challenge, which focuses on the estimation of the relation between the hands and the objects given a single frame as input. The EPIC-KITCHENS VISOR…
Masked Diffusion with Task-awareness for Procedure Planning in Instructional Videos Open
A key challenge with procedure planning in instructional videos lies in how to handle a large decision space consisting of a multitude of action types that belong to various tasks. To understand real-world video content, an AI agent must p…
Team VI-I2R Technical Report on EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022 Open
In this report, we present the technical details of our submission to the EPIC-KITCHENS-100 Unsupervised Domain Adaptation (UDA) Challenge for Action Recognition 2022. This task aims to adapt an action recognition model trained on a labele…
GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations Open
The usage of exocentric and egocentric videos in Video Question Answering (VQA) is a new endeavor in human-robot interaction and collaboration studies. Particularly for egocentric videos, one may leverage eye-gaze information to understand…
Barriers associated with the public use of sports facilities in China: a qualitative study Open
Visuo-Tactile Manipulation Planning Using Reinforcement Learning with Affordance Representation Open
Robots are increasingly expected to manipulate objects in ever more unstructured environments where the object properties have high perceptual uncertainty from any single sensory modality. This directly impacts successful object manipulati…
TAILOR: Teaching with Active and Incremental Learning for Object Registration Open
When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and labor-intensive. We present TAILOR -- a method and system for object registration with active and incremental learning. Wh…
TAILOR: Teaching with Active and Incremental Learning for Object Registration Open
When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and labor-intensive. We present TAILOR - a method and system for object registration with active and incremental learning. …
Gesture Enhanced Comprehension of Ambiguous Human-to-Robot Instructions Open
This work demonstrates the feasibility and benefits of using pointing gestures, a naturally-generated additional input modality, to improve the multi-modal comprehension accuracy of human instructions to robotic agents for collaborative ta…
Neural correlates of retrieval-based enhancement of autobiographical memory in older adults Open
Toward Modelling of Transformational Change Processes in Farm Decision-Making Open
SocioGlass: social interaction assistance with face recognition on google glass Open
We present SocioGlass - a system built on Google Glass paired with a mobile phone that provides a user with in-situ information about an acquaintance in face-to-face communication. The system can recognize faces from the live feed of visua…
MedHelp: enhancing medication compliance for demented elderly people with wearable visual intelligence Open
Dementia results in much stress in senior citizens and immensely affects their quality of life. It also incurs huge financial and emotional burdens to their family members. Personal information assistance may alleviate such a problem b…
Eye-2-I: Eye-tracking for just-in-time implicit user profiling Open
For many applications, such as targeted advertising and content recommendation, knowing users' traits and interests is a prerequisite. User profiling is a helpful approach for this purpose. However, current methods, i.e. self-reporting, we…
Exploring users' attitudes towards social interaction assistance on Google Glass Open
Wearable vision brings about new opportunities for augmenting humans in social interactions. However, along with it come privacy concerns and possible information overload. We explore users' needs and attitudes toward augmented interactio…