Sungjin Ahn
Numerical Investigation of the Phase Change Behavior of Liquefied CO2 in a Type-C Cryogenic Tank
As global warming accelerates, the Paris Agreement has emphasized the urgent need for technologies that reduce and manage carbon dioxide emissions. Consequently, carbon capture and storage (CCS) has emerged as a critical area of research. …
CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter
Developing general-purpose embodied agents is a core challenge in AI. Minecraft provides rich complexity and internet-scale data, but its slow speed and engineering overhead make it unsuitable for rapid prototyping. Crafter offers a lightw…
Fast Monte Carlo Tree Diffusion: 100x Speedup via Parallel Sparse Planning
Diffusion models have recently emerged as a powerful approach for trajectory planning. However, their inherently non-sequential nature limits their effectiveness in long-horizon reasoning tasks at test time. The recently proposed Monte Car…
Slot-MLLM: Object-Centric Visual Tokenization for Multimodal LLM
Recently, multimodal large language models (MLLMs) have emerged as a key approach in achieving artificial general intelligence. In particular, vision-language MLLMs have been developed to generate not only text but also visual outputs from…
Adaptive Inference-Time Scaling via Cyclic Diffusion Search
Diffusion models have demonstrated strong generative capabilities across domains ranging from image synthesis to complex reasoning tasks. However, most inference-time scaling methods rely on fixed denoising schedules, limiting their abilit…
Modeling Cascading Driver Interventions in Partially Automated Traffic: A Semi-Markov Chain Approach
This paper presents an analytical modeling framework for partially automated traffic, incorporating cascading driver intervention behaviors. In this framework, drivers of partially automated vehicles have the flexibility to switch driving …
Extendable Planning via Multiscale Diffusion
Long-horizon planning is crucial in complex environments, but diffusion-based planners like Diffuser are limited by the trajectory lengths observed during training. This creates a dilemma: long trajectories are needed for effective plannin…
Monte Carlo Tree Diffusion for System 2 Planning
Diffusion models have recently emerged as a powerful tool for planning. However, unlike Monte Carlo Tree Search (MCTS), whose performance naturally improves with inference-time computation scaling, standard diffusion-based planners offer onl…
Dreamweaver: Learning Compositional World Models from Pixels
Humans have an innate ability to decompose their perceptions of the world into objects and their attributes, such as colors, shapes, and movement patterns. This cognitive process enables us to imagine novel futures by recombining familiar …
Aerodynamic Design and Performance Analysis of a Large-Scale Composite Blade for Wind Turbines
In this study, we determined an aerodynamic configuration to design structures applying composites for large-scale horizontal-axis wind turbine blades. A new aerodynamic and structural design method for large wind turbine blades is present…
Analysis of Delaminated Composite Plates Using 3D Degenerated Plate Element Considering Geometric Non-Linearity
This paper presents a numerical investigation of delaminated composite plates using a 3D degenerated plate element, with a focus on geometric nonlinearity. An 8-noded degenerated element is used to model the composite plate, ensuring highe…
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Significant advances have been made in developing general-purpose embodied AI in environments like Minecraft through the adoption of LLM-augmented hierarchical approaches. While these approaches, which combine high-level planners with low-…
From Pixels to Information: Artificial Intelligence in Fluorescence Microscopy
This review explores how artificial intelligence (AI) is transforming fluorescence microscopy, providing an overview of its fundamental principles and recent advancements. The roles of AI in improving image quality and introducing new imag…
Slot State Space Models
Recent State Space Models (SSMs) such as S4, S5, and Mamba have shown remarkable computational benefits in long-range temporal dependency modeling. However, in many sequence modeling problems, the underlying process is inherently modular a…
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
Despite the recent advancements in offline RL, no unified algorithm could achieve superior performance across a broad range of tasks. Offline value function learning, in particular, struggles with sparse-reward, long-horizon tasks…
Learning to Compose: Improving Object Centric Learning by Injecting Compositionality
Learning compositional representation is a key aspect of object-centric learning as it enables flexible systematic generalization and supports complex visual reasoning. However, most of the existing approaches rely on auto-encoding objecti…
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
Model-based reinforcement learning (MBRL) has been a primary approach to ameliorating the sample efficiency issue as well as to make a generalist agent. However, there has not been much effort toward enhancing the strategy of dreaming itse…
Parallelized Spatiotemporal Binding
While modern best practices advocate for scalable architectures that support long-range interactions, object-centric models are yet to fully embrace these architectures. In particular, existing object-centric models for handling sequential…
Spatially-Aware Transformer for Embodied Agents
Episodic memory plays a crucial role in various cognitive processes, such as the ability to mentally recall past events. While cognitive science emphasizes the significance of spatial context in the formation and retrieval of episodic memo…
Neural Language of Thought Models
The Language of Thought Hypothesis suggests that human cognition operates on a structured, language-like system of mental representations. While neural language models can naturally benefit from the compositional structure inherently and e…
Simple Hierarchical Planning with Diffusion
Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets. However, they often face computational challenges and can falter in generalization, especially in capturing temporal abstractions for …
Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models
Systematic compositionality, or the ability to adapt to novel situations by creating a mental model of the world using reusable pieces of knowledge, remains a significant challenge in machine learning. While there has been considerable pro…
Facing Off World Model Backbones: RNNs, Transformers, and S4
World models are a fundamental component in model-based reinforcement learning (MBRL). To perform temporally extended and consistent simulations of the future in partially observable environments, world models need to possess long-term mem…
Object-Centric Slot Diffusion
The recent success of transformer-based image generative models in object-centric learning highlights the importance of powerful image generators for handling complex scenes. However, despite the high expressiveness of diffusion models in …
An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning
Unsupervised object-centric representation (OCR) learning has recently drawn attention as a new paradigm of visual representation. This is because of its potential of being an effective pre-training technique for various downstream tasks i…
Neural Systematic Binder
The key to high-level cognition is believed to be the ability to systematically manipulate and compose knowledge pieces. While token-like structured knowledge representations are naturally provided in text, it is elusive how to obtain them…
Derivation of Risk Factors to Quantify the Risk of Safety Accidents for Small and Medium-Sized Enterprises in Construction Industry
The accident rate in the construction industry is much higher than in other industries. In particular, small- and medium-sized construction sites need to be managed by differentiating them from large construction sites. In order to create …
Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos
Unsupervised object-centric learning aims to represent the modular, compositional, and causal structure of a scene as a set of object representations and thereby promises to resolve many critical limitations of traditional single-vector re…