Explanipedia

Implantable Medical Electronic Devices: Sensing Mechanisms, Communication Methods, and the Biodegradable Future

Zhengdao Chu, Yukai Zhou, S.Z. Li, Qianli Xu, Lijia Pan · 2025

In the context of the relentless pursuit of precision, intelligence, and personalization within the realm of medical technology, the real-time monitoring of human physiological signals has assumed heightened significance. Implantable wirel…

SPASCA: Social Presence and Support with Conversational Agent for Persons Living with Dementia

Ali Köksal, Jung-Eok Gu, Kotaro Hara, Jing Jiang, Joo‐Hwee Lim, et al. · 2025

We present SPASCA - a conversational AI system that promotes psychological and cognitive well-being of persons living with dementia (PLWD). This system features an AI agent that provides social presence and support to PLWD through verbal c…

VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting

Muhammet Furkan Ilaslan, Ali Köksal, Kevin Qinghong Lin, Burak Satar, Mike Zheng Shou, et al. · 2025

Large Language Model (LLM)-based agents have shown promise in procedural tasks, but the potential of multimodal instructions augmented by texts and videos to assist users remains under-explored. To address this gap, we propose the Visually…

VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting

Muhammet Furkan Ilaslan, Ali Köksal, Kevin Qinghong Lin, Burak Satar, Mike Zheng Shou, et al. · 2024

Large Language Model (LLM)-based agents have shown promise in procedural tasks, but the potential of multimodal instructions augmented by texts and videos to assist users remains under-explored. To address this gap, we propose the Visually…

DOTA: Distributional Test-Time Adaptation of Vision-Language Models

Zongbo Han, Jialong Yang, Junfan Li, Qinghua Hu, Qianli Xu, et al. · 2024

Vision-language foundation models (VLMs), such as CLIP, exhibit remarkable performance across a wide range of tasks. However, deploying these models can be unreliable when significant distribution gaps exist between training and test data,…

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation

Shiwei Wu, Joya Chen, Kevin Qinghong Lin, Qimeng Wang, Yan Gao, et al. · 2024

A well-known dilemma in large vision-language models (e.g., GPT-4, LLaVA) is that while increasing the number of vision tokens generally enhances visual understanding, it also significantly raises memory and computational costs, especially…

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models

Zongbo Han, Zechen Bai, Haiyang Mei, Qianli Xu, Changqing Zhang, et al. · 2024

Recent advancements in large vision-language models (LVLMs) have demonstrated impressive capability in visual information understanding with human language. Despite these advances, LVLMs still face challenges with multimodal hallucination,…

Team I2R-VI-FF Technical Report on EPIC-KITCHENS VISOR Hand Object Segmentation Challenge 2023

Fen Fang, Y. C. Cheng, Ying Sun, Qianli Xu · 2023

In this report, we present our approach to the EPIC-KITCHENS VISOR Hand Object Segmentation Challenge, which focuses on the estimation of the relation between the hands and the objects given a single frame as input. The EPIC-KITCHENS VISOR…

Masked Diffusion with Task-awareness for Procedure Planning in Instructional Videos

Fen Fang, Yun Liu, Ali Köksal, Qianli Xu, Joo‐Hwee Lim · 2023

A key challenge with procedure planning in instructional videos lies in how to handle a large decision space consisting of a multitude of action types that belong to various tasks. To understand real-world video content, an AI agent must p…

Team VI-I2R Technical Report on EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022

Yi Cheng, Dongyun Lin, Fen Fang, Hao Xuan Woon, Qianli Xu, et al. · 2023

In this report, we present the technical details of our submission to the EPIC-KITCHENS-100 Unsupervised Domain Adaptation (UDA) Challenge for Action Recognition 2022. This task aims to adapt an action recognition model trained on a labele…

GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations

Muhammet Furkan Ilaslan, Chenan Song, Joya Chen, Difei Gao, Weixian Lei, et al. · 2023

The usage of exocentric and egocentric videos in Video Question Answering (VQA) is a new endeavor in human-robot interaction and collaboration studies. Particularly for egocentric videos, one may leverage eye-gaze information to understand…

Barriers associated with the public use of sports facilities in China: a qualitative study

Wei Gao, Weisheng Feng, Qianli Xu, Shihui Lu, Keqiang Cao · 2022

Visuo-Tactile Manipulation Planning Using Reinforcement Learning with Affordance Representation

Wenyu Liang, Fen Fang, Cihan Acar, Wei Qi Toh, Ying Sun, et al. · 2022

Robots are increasingly expected to manipulate objects in ever more unstructured environments where the object properties have high perceptual uncertainty from any single sensory modality. This directly impacts successful object manipulati…

TAILOR: Teaching with Active and Incremental Learning for Object Registration

Qianli Xu, Nicolas Gauthier, Wenyu Liang, Fen Fang, Hui Tan, et al. · 2022

When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and labor-intensive. We present TAILOR -- a method and system for object registration with active and incremental learning. Wh…

TAILOR: Teaching with Active and Incremental Learning for Object Registration

Qianli Xu, Nicolas Gauthier, Wenyu Liang, Fen Fang, Hui Li Tan, et al. · 2021

When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and labor-intensive. We present TAILOR - a method and system for object registration with active and incremental learning. …

Gesture Enhanced Comprehension of Ambiguous Human-to-Robot Instructions

Dulanga Weerakoon, Vigneshwaran Subbaraju, Nipuni Karumpulli, Tuan Tran, Qianli Xu, et al. · 2020

This work demonstrates the feasibility and benefits of using pointing gestures, a naturally-generated additional input modality, to improve the multi-modal comprehension accuracy of human instructions to robotic agents for collaborative ta…

Neural correlates of retrieval-based enhancement of autobiographical memory in older adults

Qianli Xu, Jiayi Zhang, Joanes Grandjean, Cheston Tan, Vigneshwaran Subbaraju, et al. · 2020

Toward Modelling of Transformational Change Processes in Farm Decision-Making

Sylvie Huet, Cyrille Rigolot, Qianli Xu, Y. De Cacqueray-Valmenier, Isabelle Boisdon · 2018

[Departement_IRSTEA] Ecotechnologies [TR1_IRSTEA] MOTIVE [ADD1_IRSTEA] Adaptation of territories to global change [ADD2_IRSTEA] Territorial bioeconomy

SocioGlass: social interaction assistance with face recognition on Google Glass

Qianli Xu, Shue Ching Chia, Bappaditya Mandal, Liyuan Li, Joo‐Hwee Lim, et al. · 2016

We present SocioGlass - a system built on Google Glass paired with a mobile phone that provides a user with in-situ information about an acquaintance in face-to-face communication. The system can recognize faces from the live feed of visua…

MedHelp: enhancing medication compliance for demented elderly people with wearable visual intelligence

Qianli Xu, Shue Ching Chia, Joo‐Hwee Lim, Yiqun Li, Bappaditya Mandal, et al. · 2016

Dementia results in much stress in senior citizens and immensely affects their quality of life. It also incurs huge financial and emotional burdens to their family members. Personal information assistance may alleviate such a problem b…

Eye-2-I: Eye-tracking for just-in-time implicit user profiling

Keng-Teck Ma, Qianli Xu, Liyuan Li, Terence Sim, Mohan Kankanhalli, et al. · 2015

For many applications, such as targeted advertising and content recommendation, knowing users' traits and interests is a prerequisite. User profiling is a helpful approach for this purpose. However, current methods, i.e. self-reporting, we…

Exploring users' attitudes towards social interaction assistance on Google Glass

Qianli Xu, Michal Mukawa, Liyuan Li, Joo‐Hwee Lim, Cheston Tan, et al. · 2015

Wearable vision brings about new opportunities for augmenting humans in social interactions. However, along with it comes privacy concerns and possible information overload. We explore users' needs and attitudes toward augmented interactio…
