Esa Rahtu
Evaluating Fisheye-Compatible 3D Gaussian Splatting Methods on Real Images Beyond 180 Degree Field of View
We present the first evaluation of fisheye-based 3D Gaussian Splatting methods, Fisheye-GS and 3DGUT, on real images with fields of view exceeding 180 degrees. Our study covers both indoor and outdoor scenes captured with 200 degree fisheye…
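For intuition on why this setting needs dedicated camera models at all: a pinhole projection (r = f·tan θ) diverges as a ray approaches 90 degrees off-axis, while common fisheye models such as the equidistant projection (r = f·θ) keep mapping rays to finite pixel coordinates well past that point, which is what makes fields of view beyond 180 degrees representable. The Python sketch below illustrates only this generic equidistant model; the exact calibrations used by Fisheye-GS or 3DGUT may differ, and the focal length and principal point values are made up for the example.

import numpy as np

def equidistant_project(ray_dir, f, cx, cy):
    """Project a unit ray (camera frame, +z optical axis) with the equidistant fisheye model.

    r = f * theta, where theta is the angle to the optical axis. Unlike the
    pinhole model (r = f * tan(theta)), this stays finite for theta >= 90 deg,
    which is why fields of view beyond 180 degrees remain representable.
    """
    x, y, z = ray_dir
    theta = np.arccos(np.clip(z, -1.0, 1.0))   # angle to optical axis
    phi = np.arctan2(y, x)                     # azimuth around the axis
    r = f * theta                              # equidistant radial mapping
    return cx + r * np.cos(phi), cy + r * np.sin(phi)

# A ray 100 degrees off-axis (outside any pinhole image) still lands on the sensor:
ray = np.array([np.sin(np.radians(100)), 0.0, np.cos(np.radians(100))])
u, v = equidistant_project(ray, f=300.0, cx=960.0, cy=960.0)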
Hall_In Indoor Scene for FIORD: A Fisheye Indoor-Outdoor Dataset with LIDAR Ground Truth for 3D Scene Reconstruction and Benchmarking
The fifth indoor scene for the FIORD dataset.
AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones
Geometric priors are often used to enhance 3D reconstruction. With many smartphones featuring low-resolution depth sensors and the prevalence of off-the-shelf monocular geometry estimators, incorporating geometric priors as regularization …
L2C -- Learning to Learn to Compress
In this paper we present an end-to-end meta-learned system for image compression. Traditional machine learning based approaches to image compression train one or more neural networks for generalization performance. However, at inference tim…
Temporally Aligned Audio for Video with Autoregression
We introduce V-AURA, the first autoregressive model to achieve high temporal alignment and relevance in video-to-audio generation. V-AURA uses a high-framerate visual feature extractor and a cross-modal audio-visual feature fusion strategy…
UDGS-SLAM : UniDepth Assisted Gaussian Splatting for Monocular SLAM
Recent advancements in monocular neural depth estimation, particularly those achieved by the UniDepth network, have prompted the investigation of integrating UniDepth within a Gaussian splatting framework for monocular SLAM. This study pre…
DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing
High-fidelity 3D reconstruction of common indoor scenes is crucial for VR and AR applications. 3D Gaussian splatting, a novel differentiable rendering technique, has achieved state-of-the-art novel view synthesis results with high renderin…
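As a rough illustration of what "depth and normal priors" mean in this context (a generic sketch, not DN-Splatter's actual loss design, whose weighting and robust terms are not reproduced here): rendered depth and normal maps can be penalized against sensor or monocular-estimator priors alongside the usual photometric term.

import torch
import torch.nn.functional as F

def prior_regularized_loss(rendered_rgb, gt_rgb,
                           rendered_depth, prior_depth,
                           rendered_normal, prior_normal,
                           lambda_d=0.2, lambda_n=0.1):
    """Generic photometric + depth/normal prior objective for splat optimization.

    Illustrative only: the weights and loss shapes are assumptions, not the
    paper's exact formulation.
    """
    photo = torch.abs(rendered_rgb - gt_rgb).mean()           # L1 color term
    depth = torch.abs(rendered_depth - prior_depth).mean()    # depth prior term
    normal = (1.0 - F.cosine_similarity(
        rendered_normal, prior_normal, dim=-1)).mean()        # normal alignment term
    return photo + lambda_d * depth + lambda_n * normal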
Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion
High-quality scene reconstruction and novel view synthesis based on Gaussian Splatting (3DGS) typically require steady, high-quality photographs, often impractical to capture with handheld cameras. We present a method that adapts to camera…
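To make the blur-compensation idea concrete, one common way to model motion blur in a differentiable renderer is to average sharp renders from several camera poses sampled across the exposure interval. The sketch below shows only that generic averaging step, with render_fn and the pose sampling treated as placeholders; it does not reproduce the paper's trajectory model, rolling-shutter handling, or pose optimization.

import torch

def render_blurred(render_fn, poses_during_exposure):
    """Approximate motion blur by averaging sharp renders along the exposure.

    render_fn(pose) -> (H, W, 3) image tensor; poses_during_exposure is a list
    of camera poses sampled over the shutter interval. Purely illustrative.
    """
    frames = [render_fn(pose) for pose in poses_during_exposure]
    return torch.stack(frames, dim=0).mean(dim=0)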
GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting
This paper introduces GS-Pose, a unified framework for localizing and estimating the 6D pose of novel objects. GS-Pose begins with a set of posed RGB images of a previously unseen object and builds three distinct representations stored in …
Synchformer: Efficient Synchronization from Sparse Cues
Our objective is audio-visual synchronization with a focus on 'in-the-wild' videos, such as those on YouTube, where synchronization cues can be sparse. Our contributions include a novel audio-visual synchronization model, and training that…
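For readers unfamiliar with the task, audio-visual synchronization is often posed as predicting a (discretized) temporal offset between the audio and visual streams from their features. The toy module below frames it as offset classification over a fixed set of bins; the feature dimensions, fusion architecture, and class count are placeholders and are not Synchformer's actual design.

import torch
import torch.nn as nn

class OffsetClassifier(nn.Module):
    """Toy head that classifies the audio-visual offset into discrete bins."""

    def __init__(self, feat_dim=512, num_offsets=21):
        super().__init__()
        self.fuse = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=feat_dim, nhead=8, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(feat_dim, num_offsets)   # one logit per offset bin

    def forward(self, audio_feats, visual_feats):
        # audio_feats: (B, T_a, D), visual_feats: (B, T_v, D)
        tokens = torch.cat([audio_feats, visual_feats], dim=1)
        fused = self.fuse(tokens)
        return self.head(fused.mean(dim=1))            # (B, num_offsets) logits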
NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines
The recent progress in artificial intelligence has led to an ever-increasing usage of images and videos by machine analysis algorithms, mainly neural networks. Nonetheless, compression, storage and transmission of media have traditionally …
Cascaded and Generalizable Neural Radiance Fields for Fast View Synthesis
We present CG-NeRF, a cascaded and generalizable neural radiance fields method for view synthesis. Recent generalizing view synthesis methods can render high-quality novel views using a set of nearby input views. However, the rendering spee…
MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis (iPhone Part 3)
Metaverse technologies demand accurate, real-time, and immersive modeling on consumer-grade hardware for both non-human perception (e.g., drone/robot/autonomous car navigation) and immersive technologies like AR/VR, requiring both structur…
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
Human evaluation is critical for validating the performance of text-to-image generative models, as this highly cognitive process requires deep comprehension of text and images. However, our survey of 37 recent papers reveals that many work…
FinnWoodlands Dataset
While the availability of large and diverse datasets has contributed to significant breakthroughs in autonomous driving and indoor applications, forestry applications are still lagging behind and new forest datasets would most certainly co…
MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation
Acquiring labeled 6D poses from real images is an expensive and time-consuming task. Though massive amounts of synthetic RGB images are easy to obtain, the models trained on them suffer from noticeable performance degradation due to the sy…
BS3D: Building-scale 3D Reconstruction from RGB-D Images
Various datasets have been proposed for simultaneous localization and mapping (SLAM) and related problems. Existing datasets often include small environments, have incomplete ground truth, or lack important sensor data, such as depth and i…
PanDepth: Joint Panoptic Segmentation and Depth Completion
Understanding 3D environments semantically is pivotal in autonomous driving applications where multiple computer vision tasks are involved. Multi-task models provide different types of outputs for a given scene, yielding a more holistic re…
Bridging the Gap Between Image Coding for Machines and Humans
Image coding for machines (ICM) aims at reducing the bitrate required to represent an image while minimizing the drop in machine vision analysis accuracy. In many use cases, such as surveillance, it is also important that the visual qualit…
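To unpack that objective in a small sketch: image coding for machines is usually trained as a trade-off between the estimated bitrate of the coded representation and the loss of a downstream machine-vision network, with an optional pixel-level distortion term when the decoded image should also look acceptable to humans. The weights and function below are illustrative assumptions, not the formulation used in the paper.

import torch

def icm_objective(rate_bits, task_loss, distortion=None,
                  lambda_task=1.0, lambda_dist=0.1):
    """Generic rate / task-loss trade-off for image coding for machines.

    rate_bits: estimated bits of the coded representation (e.g. from an
    entropy model); task_loss: loss of the machine-vision network on the
    decoded image; distortion: optional pixel-level term for human viewing.
    Weights are illustrative placeholders.
    """
    loss = rate_bits + lambda_task * task_loss
    if distortion is not None:
        loss = loss + lambda_dist * distortion
    return loss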
Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors
The objective of this paper is audio-visual synchronisation of general videos 'in the wild'. For such videos, the events that may be harnessed for synchronisation cues may be spatially small and may occur only infrequently during a many se…