Young-Joo Suh
MIRA: A Transformer-Based Framework for Idler Roller Anomaly Detection and Localization
Monitoring the condition of belt conveyor idlers is critical for ensuring safe and efficient operation of industrial conveying systems. However, existing methods often suffer from limited scalability and delayed fault detection, particular…
HyFLM: A Hypernetwork-Based Federated Learning with Multidimensional Trajectory Optimization on Diffusion Paths
The effective training of large-scale distributed deep learning models has become an active and emerging research area in recent years. Federated learning (FL) can address those challenges by training global models through parameter exchan…
UDirEar: Heading Direction Tracking with Commercial UWB Earbud by Interaural Distance Calibration
Accurate heading direction tracking is essential for immersive VR/AR, spatial audio rendering, and robotic navigation. Existing IMU-based methods suffer from drift and vibration artifacts, vision-based approaches require LoS and raise priv…
A Wi-Fi Fingerprinting Indoor Localization Framework Using Feature-Level Augmentation via Variational Graph Auto-Encoder
Wi-Fi fingerprinting is a widely adopted technique for indoor localization in location-based services (LBS) due to its cost-effectiveness and ease of deployment using existing infrastructure. However, the performance of these systems often…
EmoSDS: Unified Emotionally Adaptive Spoken Dialogue System Using Self-Supervised Speech Representations
In recent years, advancements in artificial intelligence, speech, and natural language processing technology have enhanced spoken dialogue systems (SDSs), enabling natural, voice-based human–computer interaction. However, discrete, token-b…
MILD: Minimizing Idle Listening Energy Consumption via Down-Clocking for Energy-Efficient Wi-Fi Communications
Mobile devices, such as smartphones and laptops, face energy consumption challenges due to battery limitations, with Wi-Fi being one of the major sources of energy consumption in these devices. The IEEE 802.11 standard addresses this issue…
Domain Generalized Open-Set Fault Detection and Diagnosis for Belt Conveyor Systems With Prototype Learning
Belt conveyor systems are essential across various industries but are prone to faults due to their distinctive design and challenging operational environments. Various approaches have been explored for fault detection and diagnosis (FDD) i…
Improving Monocular Depth Estimation Through Knowledge Distillation: Better Visual Quality and Efficiency
This paper introduces a novel knowledge distillation (KD) framework for monocular depth estimation (MDE), incorporating dynamic weight adaptation to address critical challenges. The proposed approach effectively mitigates visual limitation…
Designing a Multivariate Belt Conveyor Idler Stall Detection and Identification System with Scalability Analysis
Belt conveyor idlers are freely rotating rollers that support the conveyor belt, and they can cause severe frictional damage to the belt when they fail. Therefore, fast and accurate detection of idler faults is crucial for the effective mainte…
QR-VC: Leveraging Quantization Residuals for Linear Disentanglement in Zero-Shot Voice Conversion
Zero-shot voice conversion is a technique that alters the speaker identity of an input speech to match a target speaker using only a single reference utterance, without requiring additional training. Recent approaches extensively utilize s…
WKNN-Based Wi-Fi Fingerprinting with Deep Distance Metric Learning via Siamese Triplet Network for Indoor Positioning
Weighted k-nearest neighbor (WKNN)-based Wi-Fi fingerprinting is popular in indoor location-based services due to its ease of implementation and low computational cost. KNN-based methods rely on distance metrics to select the nearest neigh…
Exploring Public Data Vulnerabilities in Semi-Supervised Learning Models through Gray-box Adversarial Attack
Semi-supervised learning (SSL) models, integrating labeled and unlabeled data, have gained prominence in vision-based tasks, yet their susceptibility to adversarial attacks remains underexplored. This paper unveils the vulnerability of SSL…
Automatic Fingerprint Data Labeling Using WiFi Signal and Smartphone Camera for Indoor Positioning
WiFi fingerprinting has been one of the most practical approaches for implementing an indoor positioning system. However, the need to measure location labels for fingerprint data has hindered the deployment of WiFi fingerprint‐based positi…
AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion
This paper proposes a simple and robust zero-shot voice conversion system with a cycle structure and mel-spectrogram pre-processing. Previous works suffer from information loss and poor synthesis quality due to their reliance on a carefull…
GConvLoc: WiFi Fingerprinting-Based Indoor Localization Using Graph Convolutional Networks
We propose GConvLoc, a WiFi fingerprinting-based indoor localization method utilizing graph convolutional networks. Using the graph structure, we can consider the fingerprint data of the reference points and their location labels in addit…
Glocal Retriever: Glocal Image Retrieval Using the Combination of Global and Local Descriptors
Development of deep learning has led to progress in computer vision, including metric learning tasks such as image retrieval, through convolutional neural networks. In image retrieval, the metric distance (i.e., the similarity) between the…
Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech
Although recent neural text-to-speech (TTS) systems have achieved high-quality speech synthesis, there are cases where a TTS system generates low-quality speech, mainly caused by limited training data or information loss during knowledg…
Data-driven modeling reveals the Western dominance of global public interest in earthquakes
Catastrophic earthquakes stimulate information-seeking behaviors beyond the affected geographical boundaries; however, our understanding of the dynamics of global public interest in earthquakes remains limited. Herein, we harness Big Data …
Improving Classification Accuracy of Hand Gesture Recognition Based on 60 GHz FMCW Radar with Deep Learning Domain Adaptation
With the recent development of small radars with high resolution, various human–computer interaction (HCI) applications using them have been developed. In particular, a method of applying a user’s hand gesture recognition using a short-ran…
Perceptually Guided End-to-End Text-to-Speech With MOS Prediction
Although recent end-to-end text-to-speech (TTS) systems have achieved high-quality speech synthesis, there are still several factors that degrade the quality of synthesized speech, including lack of training data or information loss during…
Perceptually Guided End-to-End Text-to-Speech.
Several fast text-to-speech (TTS) models have been proposed for real-time processing, but there is room for improvement in speech quality. Meanwhile, there is a mismatch between the loss function for training and the mean opinion score (MO…
Non-parallel voice conversion based on source-to-target direct mapping
Recent work using phonetic posteriorgrams (PPGs) for non-parallel voice conversion has significantly increased the usability of voice conversion, since the source and target databases are no longer required to have matching content. In this …
Designing and Implementing an Enhanced Bluetooth Low Energy Scanner with User-Level Channel Awareness and Simultaneous Channel Scanning
This paper proposes an enhanced BLE scanner with user-level channel awareness and simultaneous channel scanning to increase theoretical scanning capability by up to three times. With better scanning capability, channel analysis quality als…
An end-to-end synthesis method for Korean text-to-speech systems
A typical statistical parametric speech synthesis (text-to-speech, TTS) system consists of separate modules, such as a text analysis module, an acoustic modeling module, and a speech synthesis module. This causes two problems: 1) expert kn…