Junmo Kim
Prior knowledge of layer-specific pruning numbers guarantees effective random pruning at initialization
Several pruning methods prune a neural network at initialization. These methods carefully determine the importance of each weight, retaining only the important ones and pruning the others. However, subsequent studies have shown that random…
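For context on what "layer-specific pruning numbers" means in practice, below is a minimal sketch of random pruning at initialization when per-layer sparsities are given in advance. It assumes a PyTorch model; the function and argument names are illustrative, not from the paper.

```python
import torch

def random_prune_at_init(model, layer_sparsity):
    """Randomly prune each weight tensor to a prescribed per-layer sparsity.

    `layer_sparsity` maps parameter names to the fraction of weights to remove;
    which weights are removed within a layer is chosen uniformly at random.
    """
    masks = {}
    for name, param in model.named_parameters():
        sparsity = layer_sparsity.get(name, 0.0)
        num_prune = int(sparsity * param.numel())
        # Pick the pruned positions uniformly at random.
        idx = torch.randperm(param.numel(), device=param.device)[:num_prune]
        mask = torch.ones(param.numel(), device=param.device)
        mask[idx] = 0.0
        masks[name] = mask.view_as(param)
        param.data.mul_(masks[name])  # zero out the pruned weights
    return masks
```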
Preference Distillation via Value based Reinforcement Learning
Direct Preference Optimization (DPO) is a powerful paradigm to align language models with human preferences using pairwise comparisons. However, its binary win-or-loss supervision often proves insufficient for training small models with li…
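As a reminder of the pairwise win-or-loss supervision the snippet refers to, here is a minimal sketch of the standard DPO objective (variable names are illustrative): the loss rewards the policy for increasing its log-likelihood margin on the preferred response relative to a frozen reference model.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over a batch of (chosen, rejected) pairs.

    Each argument is a (batch,) tensor of summed token log-probabilities;
    `beta` controls how far the policy may drift from the reference model.
    """
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Binary win-or-loss supervision: push the chosen ratio above the rejected one.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```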
Comparison Reveals Commonality: Customized Image Generation through Contrastive Inversion
The recent demand for customized image generation raises a need for techniques that effectively extract the common concept from small sets of images. Existing methods typically rely on additional guidance, such as text prompts or spatial m…
DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization
Diffusion models have achieved remarkable success in image generation but come with significant computational costs, posing challenges for deployment in resource-constrained environments. Recent post-training quantization (PTQ) methods hav…
FairASR: Fair Audio Contrastive Learning for Automatic Speech Recognition
Large-scale ASR models have achieved remarkable gains in accuracy and robustness. However, fairness issues remain largely unaddressed despite their critical importance in real-world applications. In this work, we introduce FairASR, a syste…
InfiniteAudio: Infinite-Length Audio Generation with Consistency
This paper presents InfiniteAudio, a simple yet effective strategy for generating infinite-length audio using diffusion-based text-to-audio methods. Current approaches face memory constraints because the output size increases with input le…
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion
Video dataset condensation has emerged as a critical technique for addressing the computational challenges associated with large-scale video data processing in deep learning applications. While significant progress has been made in image d…
DAM: Domain-Aware Module for Multi-Domain Dataset Condensation
Dataset Condensation (DC) has emerged as a promising solution to mitigate the computational and storage burdens associated with training deep learning models. However, existing DC methods largely overlook the multi-domain nature of modern …
Enhancing self-supervised visual representation learning through adversarially generated examples
Self-supervised learning has emerged as a powerful paradigm for leveraging unlabeled data to learn rich feature representations. However, the efficacy of self-supervised models is often limited by the degree and complexity of the augmentat…
SFLD: Reducing the content bias for AI-generated Image Detection
Identifying AI-generated content is critical for the safe and ethical use of generative AI. Recent research has focused on developing detectors that generalize to unknown generators, with popular methods relying either on high-level featur…
Instruct-4DGS: Efficient Dynamic Scene Editing via 4D Gaussian-based Static-Dynamic Separation
Recent 4D dynamic scene editing methods require editing thousands of 2D images used for dynamic scene synthesis and updating the entire scene with additional training loops, resulting in several hours of processing to edit a single dynamic…
Tailored Channel Pruning: Achieve Targeted Model Complexity Through Adaptive Sparsity Regularization
In deep learning, the size and complexity of neural networks have rapidly increased in pursuit of higher performance. However, this poses a challenge in resource-limited environments, such as mobile devices, particularly wh…
Lacticaseibacillus casei IDCC 3451 alleviates cognitive and behavioral functions by reshaping the gut microbiome and regulating intestinal barrier integrity in chronic stress animal models
Lacticaseibacillus casei IDCC 3451 (3451) was evaluated for its effects on the gut-brain axis using Caenorhabditis elegans (C. elegans) and mouse models of stress and inflammation. In C. elegans, 3451 extended lifespans by 25 %, improved m…
Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance
Masked generative models (MGMs) have shown impressive generative ability while requiring an order of magnitude fewer sampling steps than continuous diffusion models. However, MGMs still underperform in image synthesis compared t…
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models
Finding appropriate prompts for a specific task has become an important issue as the use of Large Language Models (LLMs) has expanded. Reinforcement Learning (RL) is widely used for prompt tuning, but its inherent instability and enviro…
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
In this paper, we propose a new method to enhance compositional understanding in pre-trained vision and language models (VLMs) without sacrificing performance in zero-shot multi-modal tasks. Traditional fine-tuning approaches often improve…
Pretrained Patient Trajectories for Adverse Drug Event Prediction Using Common Data Model-based Electronic Health Records
Background Pretraining electronic health record (EHR) data using language models by treating patient trajectories as natural language sentences has enhanced performance across various medical tasks. However, EHR pretraining models have nev…
Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models using Stepwise Spectral Analysis
Generative diffusion models have emerged as a powerful tool for high-quality image synthesis, yet their iterative nature demands significant computational resources. This paper proposes an efficient time step sampling method based on an im…
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning
In recent years, advancements in representation learning and language models have propelled Automated Captioning (AC) to new heights, enabling the generation of human-level descriptions. Leveraging these advancements, we propose AVCap, an …
Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition
Vision and language models (VLMs) such as CLIP have showcased remarkable zero-shot recognition abilities yet face challenges in visio-linguistic compositionality, particularly in linguistic comprehension and fine-grained image-text alignme…
Towards Understanding Dual BN In Hybrid Adversarial Training
There is a growing concern about applying batch normalization (BN) in adversarial training (AT), especially when the model is trained on both adversarial samples and clean samples (termed Hybrid-AT). With the assumption that adversarial an…
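For readers unfamiliar with the dual BN setup named in the title, the sketch below shows the usual construction in Hybrid-AT (a generic setup, not necessarily the paper's exact design; the class name is illustrative): two batch-norm branches share one feature extractor, and each batch is routed to the branch matching its sample type so clean and adversarial statistics stay separate.

```python
import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    """Two BN branches over the same features: one for clean inputs,
    one for adversarial inputs, as commonly used in Hybrid-AT."""

    def __init__(self, num_features):
        super().__init__()
        self.bn_clean = nn.BatchNorm2d(num_features)
        self.bn_adv = nn.BatchNorm2d(num_features)

    def forward(self, x, adversarial: bool = False):
        # Route the batch to the branch matching its sample type.
        return self.bn_adv(x) if adversarial else self.bn_clean(x)
```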
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object
We establish rigorous benchmarks for visual perception robustness. Synthetic images such as ImageNet-C, ImageNet-9, and Stylized ImageNet provide a specific type of evaluation over synthetic corruptions, backgrounds, and textures, yet those…
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Rotation-equivariance is an essential yet challenging property in oriented object detection. While general object detectors naturally gain robustness to spatial shifts from the translation-equivariance of conventional CNNs, achie…
Modeling Stereo-Confidence out of the End-to-End Stereo-Matching Network via Disparity Plane Sweep
We propose a novel stereo-confidence that can be measured externally to various stereo-matching networks, offering an alternative input modality choice of the cost volume for learning-based approaches, especially in safety-critical systems…
Foreseeing Reconstruction Quality of Gradient Inversion: An Optimization Perspective
Gradient inversion attacks can leak data privacy when clients share weight updates with the server in federated learning (FL). Existing studies mainly use L2 or cosine distance as the loss function for gradient matching in the attack. Our …
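As background for the gradient-matching loss the snippet mentions, the sketch below shows the standard attack formulation (function and argument names are illustrative): the attacker optimizes a dummy input and label so that their gradients match the client's shared gradients under an L2 or cosine distance.

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(dummy_grads, true_grads, metric="l2"):
    """Distance between the attacker's dummy gradients and the gradients
    shared by a client, as minimized in gradient inversion attacks.

    Both arguments are lists of per-parameter gradient tensors.
    """
    if metric == "l2":
        return sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    # Cosine variant: 1 - cosine similarity over the flattened gradients.
    dg = torch.cat([g.flatten() for g in dummy_grads])
    tg = torch.cat([g.flatten() for g in true_grads])
    return 1.0 - F.cosine_similarity(dg, tg, dim=0)
```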
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Recent advancements in self-supervised audio-visual representation learning have demonstrated its potential to capture rich and comprehensive representations. However, despite the advantages of data augmentation verified in many learning m…
Stereo-Matching Knowledge Distilled Monocular Depth Estimation Filtered by Multiple Disparity Consistency
In stereo-matching knowledge distillation methods for self-supervised monocular depth estimation, the stereo-matching network's knowledge is distilled into a monocular depth network through pseudo-depth maps. In these methods, the learn…
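To make the pseudo-depth distillation step concrete, here is a minimal sketch of the generic setup the snippet describes (illustrative names; the paper's disparity-consistency filtering is only represented by a validity mask): a frozen stereo-matching network supplies pseudo-depth targets, and the monocular network regresses to them on pixels judged reliable.

```python
import torch

def pseudo_depth_distillation_loss(mono_depth, pseudo_depth, valid_mask):
    """L1 distillation of stereo pseudo-depth into a monocular depth network,
    restricted to pixels where the pseudo-depth target is trusted."""
    diff = torch.abs(mono_depth - pseudo_depth)
    return (diff * valid_mask).sum() / valid_mask.sum().clamp(min=1)

@torch.no_grad()
def make_pseudo_depth(stereo_net, left_img, right_img):
    # The frozen stereo-matching network provides the pseudo-depth targets.
    return stereo_net(left_img, right_img)
```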