Caili Guo
TOM: An Open-Source Tongue Segmentation Method with Multi-Teacher Distillation and Task-Specific Data Augmentation
Tongue imaging serves as a valuable diagnostic tool, particularly in Traditional Chinese Medicine (TCM). The quality of tongue surface segmentation significantly affects the accuracy of tongue image classification and subsequent diagnosis …
RIDAS: A Multi-Agent Framework for AI-RAN with Representation- and Intention-Driven Agents
Sixth generation (6G) networks demand tight integration of artificial intelligence (AI) into radio access networks (RANs) to meet stringent quality of service (QoS) and resource efficiency requirements. Existing solutions struggle to bridg…
Remote Sensing-Based Land Use/Land Cover and Photovoltaic Panel Classification via Google Earth Engine and Deep Learning
This study presents an integrated approach for land use and land cover (LULC) and photovoltaic panel mapping in Qinghai Province by combining Google Earth Engine (GEE) with deep neural network (DNN) modelling. Time-series, cloud-free Lands…
Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models
Recent studies have focused on leveraging large-scale artificial intelligence (LAI) models to improve semantic representation and compression capabilities. However, the substantial computational demands of LAI models pose significant chall…
The value of cytokines in evaluating the efficacy of glucocorticoids in the treatment of severe Mycoplasma pneumoniae pneumonia in children
Real-Time Oil Spill Concentration Assessment Through Fluorescence Imaging and Deep Learning
Conformal Distributed Remote Inference in Sensor Networks Under Reliability and Communication Constraints
This paper presents the communication-constrained distributed conformal risk control (CD-CRC) framework, a novel decision-making framework for sensor networks under communication constraints. Targeting multi-label classification problems, such…
On the Impact of Uncertainty and Calibration on Likelihood-Ratio Membership Inference Attacks
In a membership inference attack (MIA), an attacker exploits the overconfidence exhibited by typical machine learning models to determine whether a specific data point was used to train a target model. In this paper, we analyze the perform…
A Survey on Indoor Visible Light Positioning Systems: Fundamentals, Applications, and Challenges
The growing demand for location-based services in areas like virtual reality, robot control, and navigation has intensified the focus on indoor localization. Visible light positioning (VLP), leveraging visible light communications (VLC), b…
OFDM-Based Digital Semantic Communication with Importance Awareness
Semantic communication (SemCom) has received considerable attention for its ability to reduce data transmission size while maintaining task performance. However, existing works mainly focus on analog SemCom with simple channel models, whic…
Multi-View Visual Semantic Embedding for Cross-Modal Image-Text Retrieval
Federated Inference With Reliable Uncertainty Quantification Over Wireless Channels via Conformal Prediction
In this paper, we consider a wireless federated inference scenario in which devices and a server share a pre-trained machine learning model. The devices communicate statistical information about their local data to the server over a common…
Revisiting Hard Negative Mining in Contrastive Learning for Visual Understanding
Efficiently mining and distinguishing hard negatives is the key to Contrastive Learning (CL) in various visual understanding tasks. By properly emphasizing the penalty of hard negatives, Hard Negative Mining (HNM) can improve the CL perfor…
Boundary-Aware Proposal Generation Method for Temporal Action Localization
The goal of Temporal Action Localization (TAL) is to find the categories and temporal boundaries of actions in an untrimmed video. Most TAL methods rely heavily on action recognition models that are sensitive to action labels rather than t…
Disentangled Information Bottleneck guided Privacy-Protective JSCC for Image Transmission
Joint source and channel coding (JSCC) has attracted increasing attention due to its robustness and high efficiency. However, JSCC is vulnerable to privacy leakage due to the high relevance between the source image and channel input. In th…
Privacy-Aware Joint Source-Channel Coding for image transmission based on Disentangled Information Bottleneck
Current privacy-aware joint source-channel coding (JSCC) works aim at avoiding private information transmission by adversarially training the JSCC encoder and decoder under specific signal-to-noise ratios (SNRs) of eavesdroppers. However, …
Federated Inference with Reliable Uncertainty Quantification over Wireless Channels via Conformal Prediction
In this paper, we consider a wireless federated inference scenario in which devices and a server share a pre-trained machine learning model. The devices communicate statistical information about their local data to the server over a common…
Integrating Listwise Ranking into Pairwise-based Image-Text Retrieval
Image-Text Retrieval (ITR) is essentially a ranking problem. Given a query caption, the goal is to rank candidate images from most to least relevant. The current ITR datasets are constructed in a pairwise manner. Image-text pairs are…
Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image-Text Matching
Recently, a series of Image-Text Matching (ITM) methods have achieved impressive performance. However, we observe that most existing ITM models suffer from vanishing gradients at the beginning of training, which makes these models prone to falli…
Deep Joint Source-Channel Coding for Wireless Image Transmission with Semantic Importance
The sixth-generation mobile communication system proposes the vision of smart interconnection of everything, which requires accomplishing communication tasks while ensuring the performance of intelligent tasks. A joint source-channel codin…
Physical Layer Authentication Based on Channel Polarization Response in Dual-Polarized Antenna Communication Systems
This study presents a novel approach for physical layer authentication based on channel polarization response (CPR). CPR is sensitive to variation in the physical properties of scatterers, and the CPR difference between various channels is…
Information Bottleneck-Inspired Type Based Multiple Access for Remote Estimation in IoT Systems
Type-based multiple access (TBMA) is a semantics-aware multiple access protocol for remote inference. In TBMA, codewords are reused across transmitting sensors, with each codeword being assigned to a different observation value. Existing T…
Joint design of ordered QR precoding and SIC detection for MIMO VLC systems
Ordered successive interference cancellation (OSIC) detection has been investigated to mitigate the high spatial correlation for multiple-input multiple-output (MIMO) visible light communication (VLC) systems. However, existing OSIC scheme…
Image-Text Retrieval with Binary and Continuous Label Supervision
Most image-text retrieval work adopts binary labels indicating whether a pair of image and text matches or not. Such a binary indicator covers only a limited subset of image-text semantic relations, which is insufficient to represent relev…
Unified Loss of Pair Similarity Optimization for Vision-Language Retrieval
There are two popular loss functions used for vision-language retrieval, i.e., triplet loss and contrastive learning loss, both of which essentially minimize the difference between the similarities of negative pairs and positive pairs. More…
Deep Joint Source-Channel Coding Based on Semantics of Pixels
The semantic information of the image for intelligent tasks is hidden behind the pixels, and slight changes in the pixels will affect the performance of intelligent tasks. In order to preserve semantic information behind pixels for intelli…
Adaptable Semantic Compression and Resource Allocation for Task-Oriented Communications
Task-oriented communication is a new paradigm that aims at providing efficient connectivity for accomplishing intelligent tasks rather than the reception of every transmitted bit. In this paper, a deep learning-based task-oriented communic…
Positioning Using Visible Light Communications: A Perspective Arcs Approach
Visible light positioning (VLP) is an accurate indoor positioning technology that uses luminaires as transmitters. In particular, circular luminaires are a common source type for VLP, but they are typically treated only as point sources for po…
Adaptive Information Bottleneck Guided Joint Source and Channel Coding for Image Transmission
Joint source and channel coding (JSCC) for image transmission has attracted increasing attention due to its robustness and high efficiency. However, the existing deep JSCC research mainly focuses on minimizing the distortion between the tr…
Semantic-assisted image compression
Conventional image compression methods typically aim at pixel-level consistency while ignoring the performance of downstream AI tasks. To solve this problem, this paper proposes a Semantic-Assisted Image Compression method (SAIC), which can…