Yong Liu
An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care Open
We present Meta-EyeFM, an integrated language-vision foundation model designed for conversational diagnostics and triaging in primary eye care. By combining a large language model (LLM) with eight task-specific vision foundation models (VF…
Eco-friendly production of AgNPs by ultrasound-intensified continuous method, and process evaluation via life cycle assessment and machine learning Open
Owing to their excellent biocompatibility and antibacterial properties, the global annual production of silver nanoparticles (AgNPs) is estimated at 400-800 tons. Therefore, developing a green and safe approach for AgNPs synthesis is urgen…
DRL: Discriminative Representation Learning with Parallel Adapters for Class Incremental Learning Open
With the excellent representation capabilities of Pre-Trained Models (PTMs), remarkable progress has been made in non-rehearsal Class-Incremental Learning (CIL) research. However, it remains an extremely challenging task due to three conun…
MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents Open
To enhance the efficiency of GUI agents on various platforms like smartphones and computers, a hybrid paradigm that combines flexible GUI operations with efficient shortcuts (e.g., API, deep links) is emerging as a promising direction. How…
Decentralized coordinated emergency frequency control strategy for renewables-rich power systems based on multi-agent reinforcement learning Open
With the rapid growth of renewable energy in the power system and the implementation of high-capacity LCC-HVDC projects, the frequency support capability of the grid is decreasing. Existing centralized single-measurement methods and model-…
3D Neighbor2Neighbor-based unsupervised deep learning for noise reduction in OCT imaging: insights from multiple clinical datasets Open
Optical coherence tomography (OCT), a non-invasive three-dimensional imaging technique, plays a crucial role in the early diagnosis and precise treatment within the fields of ophthalmology and dermatology. However, speckle and electrical n…
PromptAL: Sample-Aware Dynamic Soft Prompts for Few-Shot Active Learning Open
Active learning (AL) aims to optimize model training and reduce annotation costs by selecting the most informative samples for labeling. Typically, AL methods rely on the empirical distribution of labeled data to define the decision bounda…
Yarn Color Measurement Method Based on Digital Photography Open
To overcome the complexity of yarn color measurement using spectrophotometry with yarn winding techniques and to enhance consistency with human visual perception, a yarn color measurement method based on digital photography is proposed. Th…
Deep deterministic policy gradient-based automatic negotiation framework for shared decision-making Open
Shared Decision-Making (SDM), a patient-centered approach to medical care, improves treatment outcomes and patient satisfaction. However, traditional SDM struggles in handling complex medical scenarios, dynamic patient preferences, and mul…
Efficient Learning of A Unified Policy For Whole-body Manipulation and Locomotion Skills Open
Equipping quadruped robots with manipulators provides unique loco-manipulation capabilities, enabling diverse practical applications. This integration creates a more complex system that has increased difficulties in modeling and control. R…
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams Open
Benefiting from the advances in large language models and cross-modal alignment, existing multimodal large language models have achieved prominent performance in image and short video understanding. However, the understanding of long video…
Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot Open
In-Context Learning (ICL) is an essential emergent ability of Large Language Models (LLMs), and recent studies introduce Chain-of-Thought (CoT) to exemplars of ICL to enhance the reasoning capability, especially in mathematics tasks. Howev…
DreamLight: Towards Harmonious and Consistent Image Relighting Open
We introduce a model named DreamLight for universal image relighting in this work, which can seamlessly composite subjects into a new background while maintaining aesthetic uniformity in terms of lighting and color tone. The background can…
Application of BITCN-BIGRU Neural Network Based on ICPO Optimization in Pit Deformation Prediction Open
Predicting pit deformation to prevent safety accidents is the primary objective of pit deformation forecasting. A reliable predictive model enhances the ability to accurately monitor future deformation trends in pits. To enhance the predic…
On the Emergence of Weak-to-Strong Generalization: A Bias-Variance Perspective Open
Weak-to-strong generalization (W2SG) refers to the phenomenon where a strong student model, trained on a dataset labeled by a weak teacher, ultimately outperforms the teacher on the target task. Recent studies attribute this performance ga…
AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection Open
Universal visual anomaly detection aims to identify anomalies from novel or unseen vision domains without additional fine-tuning, which is critical in open scenarios. Recent studies have demonstrated that pre-trained vision-language models…
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects Open
With the rapid rise of large language models (LLMs), phone automation has undergone transformative changes. This paper systematically reviews LLM-driven phone GUI agents, highlighting their evolution from script-based automation to intelli…
Abnormal eye movement, brain regional homogeneity in schizophrenia and clinical high-risk individuals and their associated gene expression profiles Open
Clinical high-risk (CHR) is a prodromal period before psychosis characterized by attenuated, transient, or intermittent psychotic symptoms and declining functioning. They exhibit eye movement abnormalities and brain functional damage compa…
Effect of Covalent Functionalization on Thermal Transport across Paraffin/Graphene Nanocomposite Interfaces Open
The low interfacial heat transfer efficiency limits the thermal conductivity (TC) of the paraffin/graphene composite phase change material. In this study, n-octadecane is used to represent paraffin. The interfacial thermal conductance (ITC…
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering Open
Albeit progress has been made in Composed Image Retrieval (CIR), we empirically find that a certain percentage of failure retrieval results are not consistent with their relative captions. To address this issue, this work provides a Visual…
Quantitative Evaluation of Internal Pavement Distresses Based on 3D Ground Penetrating Radar Open
Asphalt pavement will inevitably produce internal distresses during service, which increases the risk of deterioration of pavement structural performance. Although three-dimensional ground penetrating radar (3D GPR) with a multi-channel a…
AS-GCL: Asymmetric Spectral Augmentation on Graph Contrastive Learning Open
Graph Contrastive Learning (GCL) has emerged as the foremost approach for self-supervised learning on graph-structured data. GCL reduces reliance on labeled data by learning robust representations from various augmented views. However, exi…
Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL Open
As large language models advance toward superhuman performance, ensuring their alignment with human values and abilities grows increasingly complex. Weak-to-strong generalization offers a promising approach by leveraging predictions from w…
The Capabilities and Limitations of Weak-to-Strong Generalization: Generalization and Calibration Open
Weak-to-strong generalization, where weakly supervised strong models outperform their weaker teachers, offers a promising approach to aligning superhuman models with human values. To deepen the understanding of this approach, we provide th…
RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation Open
In recent years, significant advancements have been made in deep learning for medical image segmentation, particularly with convolutional neural networks (CNNs) and transformer models. However, CNNs face limitations in capturing long-range…
Hyperbolic Binary Neural Network Open
Binary Neural Network (BNN) converts full-precision weights and activations into their extreme 1-bit counterparts, making it particularly suitable for deployment on lightweight mobile devices. While binary neural networks are typically for…