Explanipedia

Reward Modeling of Goal-directed Gaze Control

Seoyoung Ahn, Zhibo Yang, Sounak Mondal, Ruoyu Xue, Minh Hoai, et al. · 2025

Goal-directed visual search in natural scenes is a complex behavior that requires the flexible integration of vision, memory, and contextual knowledge. Here we introduce a reward-based framework that unifies these processes by learning tar…

Multi-view Gaze Target Estimation

Qiaomu Miao, Vivek Raju Golani, Jingyi Xu, Paramartha Dutta, Minh Hoai, et al. · 2025

This paper presents a method that utilizes multiple camera views for the gaze target estimation (GTE) task. The approach integrates information from different camera views to improve accuracy and expand applicability, addressing limitation…

Low-Rank Head Avatar Personalization with Registers

Sai Tanmay Reddy Chakkera, Aggelina Chatziagapi, Md Moniruzzaman, Chen-Ping Yu, Yi-Hsuan Tsai, et al. · 2025

We introduce a novel method for low-rank personalization of a generic model for head avatar generation. Prior work proposes generic models that achieve high-quality face animation by leveraging large-scale datasets of multiple identities. …

Adaptive Multitask Neural Network for High-Fidelity Wake Flow Modeling of Wind Farms

Dichang Zhang, Christian Santoni, Zexia Zhang, Dimitris Samaras, Ali Khosronejad · 2025

Wind turbine wake modeling is critical for the design and optimization of wind farms. Traditional methods often struggle with the trade-off between accuracy and computational cost. Recently, data-driven neural networks have emerged as a pr…

Ciliated cell domains with locally coordinated ciliary motion generate a mosaic of microflows in the brain’s lateral ventricles

Kennelia A. Mellanson, Lei Zhou, Tatyana V. Michurina, Anatoly Mikhailik, Helene Benveniste, et al. · 2025

Circulation of cerebrospinal fluid (CSF) through the brain’s ventricles is essential for maintaining brain homeostasis and supporting neurogenesis. CSF flow is supported by the structural polarization of multiciliated cells, which align wi…

TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model

Meilong Xu, Saumya Gupta, Xiaoling Hu, Chen Li, Shahira Abousamra, et al. · 2024

Accurately modeling multi-class cell topology is crucial in digital pathology, as it provides critical insights into tissue structure and pathology. The synthetic generation of cell topology enables realistic simulations of complex tissue …

MLI-NeRF: Multi-Light Intrinsic-Aware Neural Radiance Fields

Yixiong Yang, Shilin Hu, Haoyu Wu, Ramon Miró Baldrich, Dimitris Samaras, et al. · 2024

Current methods for extracting intrinsic image components, such as reflectance and shading, primarily rely on statistical priors. These methods focus mainly on simple synthetic scenes and isolated objects and struggle to perform well on ch…

Instance-Aware Generalized Referring Expression Segmentation

E-Ro Nguyen, Hieu Lê, Dimitris Samaras, Michael S. Ryoo · 2024

Recent works on Generalized Referring Expression Segmentation (GRES) struggle with handling complex expressions referring to multiple distinct objects. This is because these methods typically employ an end-to-end foreground-background segm…

Direct and Explicit 3D Generation from a Single Image

Haoyu Wu, Meher Gitika Karumuri, Chuhang Zou, Seungbae Bang, Yuelong Li, et al. · 2024

Current image-to-3D approaches suffer from high computational costs and lack scalability for high-resolution outputs. In contrast, we introduce a novel framework to directly generate explicit surface geometry and texture using multi-view 2…

Fast constrained sampling in pre-trained diffusion models

Alexandros Graikos, Nebojša Jojić, Dimitris Samaras · 2024

Large denoising diffusion models, such as Stable Diffusion, have been trained on billions of image-caption pairs to perform text-conditioned image generation. As a byproduct of this training, these models have acquired general knowledge ab…

TopoDiffusionNet: A Topology-aware Diffusion Model

Saumya Gupta, Dimitris Samaras, Chao Chen · 2024

Diffusion models excel at creating visually impressive images but often struggle to generate images with a specified topology. The Betti number, which represents the number of structures in an image, is a fundamental measure in topology. Y…

Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation

Jingyi Xu, Hieu Lê, Zhixin Shu, Yang Wang, Yi-Hsuan Tsai, et al. · 2024

Human emotional expression is inherently dynamic, complex, and fluid, characterized by smooth transitions in intensity throughout verbal communication. However, the modeling of such intensity fluctuations has been largely overlooked by pre…

JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation

Sai Tanmay Reddy Chakkera, Aggelina Chatziagapi, Dimitris Samaras · 2024

We introduce a novel method for joint expression and audio-guided talking face generation. Recent approaches either struggle to preserve the speaker identity or fail to produce faithful facial expressions. To address these challenges, we p…

Shadow Removal Refinement via Material-Consistent Shadow Edges

Shilin Hu, Hieu Lê, ShahRukh Athar, Sagnik Das, Dimitris Samaras · 2024

Shadow boundaries can be confused with material boundaries as both exhibit sharp changes in luminance or contrast within a scene. However, shadows do not modify the intrinsic color or texture of surfaces. Therefore, on both sides of shadow…

Look Hear: Gaze Prediction for Speech-directed Human Attention

Sounak Mondal, Seoyoung Ahn, Zhibo Yang, Niranjan Balasubramanian, Dimitris Samaras, et al. · 2024

For computer systems to effectively interact with humans using spoken language, they need to understand how the words being generated affect the users' moment-by-moment attention. Our study focuses on the incremental prediction of attentio…

Assessing Sample Quality via the Latent Space of Generative Models

Jingyi Xu, Hieu Lê, Dimitris Samaras · 2024

Advances in generative models increase the need for sample quality assessment. To do so, previous methods rely on a pre-trained feature extractor to embed the generated samples and real samples into a common space for comparison. However, …

MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition

Aggelina Chatziagapi, Grigorios G. Chrysos, Dimitris Samaras · 2024

We introduce MIGS (Multi-Identity Gaussian Splatting), a novel method that learns a single neural representation for multiple identities, using only monocular videos. Recent 3D Gaussian Splatting (3DGS) approaches for human avatars require…

Uncertainty Estimation for Tumor Prediction with Unlabeled Data

Ju-Young Yun, Shahira Abousamra, Li Chen, Rajarsi Gupta, Tahsin Kurç, et al. · 2024

Estimating uncertainty of a neural network is crucial in providing transparency and trustworthiness. In this paper, we focus on uncertainty estimation for digital pathology prediction models. To explore the large amount of unlabeled data i…

Learning Relighting and Intrinsic Decomposition in Neural Radiance Fields

Yixiong Yang, Shilin Hu, Haoyu Wu, Ramon Miró Baldrich, Dimitris Samaras, et al. · 2024

The task of extracting intrinsic components, such as reflectance and shading, from neural radiance fields is of growing interest. However, current methods largely focus on synthetic scenes and isolated objects, overlooking the complexities…

Learned Representation-Guided Diffusion Models for Large-Image Generation

Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le, Saarthak Kapse, Prateek Prasanna, et al. · 2024

To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains lik…

Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following

Qiaomu Miao, Alexandros Graikos, Jingwei Zhang, Sounak Mondal, Minh Hoai, et al. · 2024

Training gaze following models requires a large number of images with gaze target coordinates annotated by human annotators, which is a laborious and inherently ambiguous process. We propose the first semi-supervised method for gaze follow…

Toward ultra-efficient high fidelity predictions of wind turbine wakes: Augmenting the accuracy of engineering models via LES-trained machine learning

Christian Santoni, Dichang Zhang, Zexia Zhang, Dimitris Samaras, Fotis Sotiropoulos, et al. · 2024

This study proposes a novel machine learning (ML) methodology for the efficient and cost-effective prediction of high-fidelity three-dimensional velocity fields in the wake of utility-scale turbines. The model consists of an auto-encoder c…

MI-NeRF: Learning a Single Face NeRF from Multiple Identities

Aggelina Chatziagapi, Grigorios G. Chrysos, Dimitris Samaras · 2024

In this work, we introduce a method that learns a single dynamic neural radiance field (NeRF) from monocular talking face videos of multiple identities. NeRFs have shown remarkable results in modeling the 4D dynamics and appearance of huma…

Self-supervised co-salient object detection via feature correspondence at multiple scales

Souradeep Chakraborty, Dimitris Samaras · 2024

Our paper introduces a novel two-stage self-supervised approach for detecting co-occurring salient objects (CoSOD) in image groups without requiring segmentation annotations. Unlike existing unsupervised methods that rely solely on patch-l…

Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos

Alfredo Rivero, ShahRukh Athar, Zhixin Shu, Dimitris Samaras · 2024

Creating controllable 3D human portraits from casual smartphone videos is highly desirable due to their immense value in AR/VR applications. The recent development of 3D Gaussian Splatting (3DGS) has shown improvements in rendering quality…

SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, et al. · 2023

Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides. Traditionally, MIL interpretability is limited to ident…

MORCIC: Model Order Reduction Techniques for Electromagnetic Models of Integrated Circuits

Dimitrios Garyfallou, Athanasios Stefanou, Christos Giamouzis, Moschos Antoniadis, Georgios Chararas, et al. · 2023

Model order reduction (MOR) is crucial for the design process of integrated circuits. Specifically, the vast amount of passive RLCk elements in electromagnetic models extracted from physical layouts exacerbates the extraction time, the sto…

Unsupervised and semi-supervised co-salient object detection via segmentation frequency statistics

Souradeep Chakraborty, Shujon Naha, Muhammet Baştan, Amit Kumar K C, Dimitris Samaras · 2023

In this paper, we address the detection of co-occurring salient objects (CoSOD) in an image group using frequency statistics in an unsupervised manner, which further enable us to develop a semi-supervised method. While previous works have …

A systematic study of key elements underlying molecular property prediction

Jianyuan Deng, Zhibo Yang, Hehe Wang, Iwao Ojima, Dimitris Samaras, et al. · 2023
