Explanipedia

Faster VGGT with Block-Sparse Global Attention Open

C. Wang, C. Max Schmidt, Jens Piekenbrinck, Bastian Leibe · 2025

Efficient and accurate feed-forward multi-view reconstruction has long been an important task in computer vision. Recent transformer-based models like VGGT and $π^3$ have achieved impressive results with simple architectures, yet they face…

Pretrained Models from "MaskTerial: A Foundation Model for Automated 2D Material Flake Detection" Open

Jan-Lucas Uslu, A. N. Nekrasov, Alexander Hermans, Bernd Beschoten, Bastian Leibe, et al. · 2025

This repo hosts the pretrained model weights from "MaskTerial: A Foundation Model for Automated 2D Material Flake Detection". The models follow the naming scheme "MODELTYPE_MODELNAME_MATERIAL.zip". The code for the model is on GitHub: MaskT…

How do Foundation Models Compare to Skeleton-Based Approaches for Gesture Recognition in Human-Robot Interaction? Open

Stephanie Käs, Anton Burenko, Louis Markert, Onur Çulha, D. Mack, et al. · 2025

Gestures enable non-verbal human-robot communication, especially in noisy environments like agile production. Traditional deep learning-based gesture recognition relies on task-specific architectures using images, videos, or skeletal pose …

Systematic Comparison of Projection Methods for Monocular 3D Human Pose Estimation on Fisheye Images Open

Stephanie Käs, S. John Peter, Henrik Thillmann, Anton Burenko, David B. Adrian, et al. · 2025

Fisheye cameras offer robots the ability to capture human movements across a wider field of view (FOV) than standard pinhole cameras, making them particularly useful for applications in human-robot interaction and automotive contexts. Howe…

Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving Open

A. N. Nekrasov, Malcolm Burdorf, Stewart Worrall, Bastian Leibe, Julie Stephany Berrío · 2025

To operate safely, autonomous vehicles (AVs) need to detect and handle unexpected objects or anomalies on the road. While significant research exists for anomaly detection and segmentation in 2D, research progress in 3D is underexplored. E…

Acquisition of high-quality images for camera calibration in robotics applications via speech prompts Open

Timm Linder, Kadir Yılmaz, David B. Adrian, Bastian Leibe · 2025

Accurate intrinsic and extrinsic camera calibration can be an important prerequisite for robotic applications that rely on vision as input. While there is ongoing research on enabling camera calibration using natural images, many systems i…

Panoptic-CUDAL: Rural Australia Point Cloud Dataset in Rainy Conditions Open

Tzu-Yun Tseng, A. N. Nekrasov, Malcolm Burdorf, Bastian Leibe, Julie Stephany Berrío, et al. · 2025

Existing autonomous driving datasets are predominantly oriented towards well-structured urban settings and favourable weather conditions, leaving the complexities of rural environments and adverse weather conditions largely unaddressed. Al…

OCCUQ: Exploring Efficient Uncertainty Quantification for 3D Occupancy Prediction Open

Severin Heidrich, Till Beemelmanns, A. N. Nekrasov, Bastian Leibe, Lutz Eckstein · 2025

Autonomous driving has the potential to significantly enhance productivity and provide numerous societal benefits. Ensuring robustness in these safety-critical systems is essential, particularly when vehicles must navigate adverse weather …

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Open

Glenn Garcia, Karim Abou Zeid, Christian Schmidt, Daan de Geus, Alexander Hermans, et al. · 2025

Recent work showed that large diffusion models can be reused as highly precise monocular depth estimators by casting depth estimation as an image-conditional image generation task. While the proposed model achieved state-of-the-art results…

MaskTerial: a foundation model for automated 2D material flake detection Open

Jan-Lucas Uslu, A. N. Nekrasov, Alexander Hermans, Bernd Beschoten, Bastian Leibe, et al. · 2025

MaskTerial is a foundation model for 2D material flake detection that uses synthetic pretraining and uncertainty modeling to enable fast adaptation to new materials with as few as 5–10 images.

Look Gauss, No Pose: Novel View Synthesis using Gaussian Splatting without Accurate Pose Initialization Open

C. Max Schmidt, Jens Piekenbrinck, Bastian Leibe · 2024

3D Gaussian Splatting has recently emerged as a powerful tool for fast and accurate novel-view synthesis from a set of posed input images. However, like most novel-view synthesis approaches, it relies on accurate camera pose information, l…

Interactive4D: Interactive 4D LiDAR Segmentation Open

Ilya Fradlin, Idil Esen Zulfikar, Kadir Yılmaz, Theodora Kontogianni, Bastian Leibe · 2024

Interactive segmentation has an important role in facilitating the annotation process of future LiDAR datasets. Existing approaches sequentially segment individual objects at each LiDAR scan, repeating the process throughout the entire seq…

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Open

Glenn Garcia, Karim Abou Zeid, Christian Schmidt, Daan de Geus, Alexander Hermans, et al. · 2024

Recent work showed that large diffusion models can be reused as highly precise monocular depth estimators by casting depth estimation as an image-conditional image generation task. While the proposed model achieved state-of-the-art results…

OoDIS: Anomaly Instance Segmentation and Detection Benchmark Open

A. N. Nekrasov, Rui Zhou, Miriam Ackermann, Alexander Hermans, Bastian Leibe, et al. · 2024

Safe navigation of self-driving cars and robots requires a precise understanding of their environment. Training data for perception systems cannot cover the wide variety of objects that may appear during deployment. Thus, reliable identifi…

Cyto R-CNN and CytoNuke Dataset: Towards reliable whole-cell segmentation in bright-field histological images Open

Johannes Raufeisen, Kunpeng Xie, Fabian Hörst, Till Braunschweig, Jianning Li, et al. · 2024

Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects Open

Zicong Fan, Takehiko Ohkawa, Linlin Yang, Lin Nie, Zhishan Zhou, et al. · 2024

We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion g…

Point-VOS: Pointing Up Video Object Segmentation Open

Idil Esen Zulfikar, Sabarinath Mahadevan, Paul Voigtlaender, Bastian Leibe · 2024

Current state-of-the-art Video Object Segmentation (VOS) methods rely on dense per-object mask annotations both during training and testing. This requires time-consuming and costly video annotation mechanisms. We propose a novel Point-VOS …

An Ordinal Regression Framework for a Deep Learning Based Severity Assessment for Chest Radiographs Open

Patrick Wienholt, Alexander Hermans, Firas Khader, Behrus Puladi, Bastian Leibe, et al. · 2024

This study investigates the application of ordinal regression methods for categorizing disease severity in chest radiographs. We propose a framework that divides the ordinal regression problem into three parts: a model, a target function, …

ControlRoom3D: Room Generation using Semantic Proxy Rooms Open

Jonas Schult, Sam S. Tsai, Lukas Höllein, Bichen Wu, Jialiang Wang, et al. · 2023

Manually creating 3D environments for AR/VR applications is a complex process requiring expert knowledge in 3D modeling software. Pioneering works facilitate this process by generating room meshes conditioned on textual style descriptions.…

BUSSARD -- Better Understanding Social Situations for Autonomous Robot Decision-Making Open

Stefan Schiffer, Astrid M. Rosenthal‐von der Pütten, Bastian Leibe · 2023

We report on our effort to create a corpus dataset of different social context situations in an office setting for further disciplinary and interdisciplinary research in computer vision, psychology, and human-robot-interaction. For social …

Mask4Former: Mask Transformer for 4D Panoptic Segmentation Open

Kadir Yılmaz, Jonas Schult, Alexey Nekrasov, Bastian Leibe · 2023

Accurately perceiving and tracking instances over time is essential for the decision-making processes of autonomous agents interacting safely in dynamic environments. With this intention, we propose Mask4Former for the challenging task of …

Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis Open

Jonathon Luiten, Georgios Kopanas, Bastian Leibe, Deva Ramanan · 2023

We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work…

UGainS: Uncertainty Guided Anomaly Instance Segmentation Open

A. N. Nekrasov, Alexander Hermans, Lars Kuhnert, Bastian Leibe · 2023

A single unexpected object on the road can cause an accident or may lead to injuries. To prevent this, we need a reliable mechanism for finding anomalous objects on the road. This task, called anomaly segmentation, can be a stepping stone …

AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation Open

Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult, Francis Engelmann, Bastian Leibe, et al. · 2023

During interactive segmentation, a model and a user work together to delineate objects of interest in a 3D point cloud. In an iterative process, the model assigns each data point to an object (or the background), while the user corrects er…

DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer Open

Amit Kumar Rana, Sabarinath Mahadevan, Alexander Hermans, Bastian Leibe · 2023

Most state-of-the-art instance segmentation methods rely on large amounts of pixel-precise ground-truth annotations for training, which are expensive to create. Interactive segmentation networks help generate such annotations based on an i…

Point2Vec for Self-Supervised Representation Learning on Point Clouds Open

Karim Abou Zeid, Jonas Schult, Alexander Hermans, Bastian Leibe · 2023

Recently, the self-supervised learning framework data2vec has shown inspiring performance for various modalities using a masked student-teacher approach. However, it remains open whether such a framework generalizes to the unique challenge…

TarViS: A Unified Approach for Target-based Video Segmentation Open

Ali Athar, Alexander Hermans, Jonathon Luiten, Deva Ramanan, Bastian Leibe · 2023

The general domain of video segmentation is currently fragmented into different tasks spanning multiple benchmarks. Despite rapid progress in the state-of-the-art, current methods are overwhelmingly task-specific and cannot conceptually ge…

3D Segmentation of Humans in Point Clouds with Synthetic Data Open

Ayça Takmaz, Jonas Schult, Irem Kaftan, Mertcan Akçay, Bastian Leibe, et al. · 2023

Segmenting humans in 3D indoor scenes has become increasingly important with the rise of human-centered robotics and AR/VR applications. To this end, we propose the task of joint 3D human semantic segmentation, instance segmentation and mu…

Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats Open

István Sárándi, Alexander Hermans, Bastian Leibe · 2022

Deep learning-based 3D human pose estimation performs best when trained on large amounts of labeled data, making combined learning from many datasets an important research direction. One obstacle to this endeavor is the different skeleton…

Pedestrian-Robot Interactions on Autonomous Crowd Navigation: Reactive Control Methods and Evaluation Metrics Open

Diego Páez-Granados, Yujie He, David Gonon, Dan Jia, Bastian Leibe, et al. · 2022

Autonomous navigation in highly populated areas remains a challenging task for robots because of the difficulty in guaranteeing safe interactions with pedestrians in unstructured situations. In this work, we present a crowd navigation cont…

Bastian Leibe