Explanipedia

World Simulation with Video Foundation Models for Physical AI Open

NVIDIA, Adnan Ali, Junjie Bai, Madhu Bala , et al. · 2025

We introduce [Cosmos-Predict2.5], the latest generation of the Cosmos World Foundation Models for Physical AI. Built on a flow-based architecture, [Cosmos-Predict2.5] unifies Text2World, Image2World, and Video2World generation in a single …

4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture Open

Yutian Chen, Shi Guo, Tianshuo Yang, Lihe Ding, Xiuyuan Yu , et al. · 2025

Reconstructing fast-dynamic scenes from multi-view videos is crucial for high-speed motion analysis and realistic 4D reconstruction. However, the majority of 4D capture systems are limited to frame rates below 30 FPS (frames per second), a…

ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary Open

Zeqi Gu, Yin Cui, Zhaoshuo Li, Fangyin Wei, Yunhao Ge , et al. · 2025

Designing 3D scenes is traditionally a challenging task that demands both artistic expertise and proficiency with complex software. Recent advances in text-to-3D generation have greatly simplified this process by letting users create scene…

Parallel Sequence Modeling via Generalized Spatial Propagation Network Open

Hongjun Wang, Wonmin Byeon, Jiarui Xu, Jinwei Gu, Ka Chun Cheung , et al. · 2025

We present the Generalized Spatial Propagation Network (GSPN), a new attention mechanism optimized for vision tasks that inherently captures 2D spatial structures. Existing attention models, including transformers, linear attention, and st…

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images Open

Lingen Li, Zhaoyang Zhang, Yaowei Li, Jiale Xu, Xiaoyu Li , et al. · 2024

Recent advancements in generative models have significantly improved novel view synthesis (NVS) from multi-view data. However, existing methods depend on external multi-view alignment processes, such as explicit pose estimation or pre-reco…

From Sim-to-Real: Toward General Event-based Low-light Frame Interpolation with Per-scene Optimization Open

Ziran Zhang, Yongrui Ma, Yueting Chen, Feng Zhang, Jinwei Gu , et al. · 2024

Video Frame Interpolation (VFI) is important for video enhancement, frame rate up-conversion, and slow-motion generation. The introduction of event cameras, which capture per-pixel brightness changes asynchronously, has significantly en…

AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection Open

Yujin Wang, Tianyi Xu, Fan Zhang, Tianfan Xue, Jinwei Gu · 2024

Image Signal Processors (ISPs) convert raw sensor signals into digital images, which significantly influence the image quality and the performance of downstream computer vision tasks. Designing ISP pipeline and tuning ISP parameters are tw…

DualDn: Dual-domain Denoising via Differentiable ISP Open

Ruikang Li, Yujin Wang, Shiqi Chen, Fan Zhang, Jinwei Gu , et al. · 2024

Image denoising is a critical component in a camera's Image Signal Processing (ISP) pipeline. There are two typical ways to inject a denoiser into the ISP pipeline: applying a denoiser directly to captured raw frames (raw domain) or to the…

PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging Open

Xin Cai, Zhiyuan You, Hailong Zhang, Wentao Liu, Jinwei Gu , et al. · 2024

Lensless cameras offer significant advantages in size, weight, and cost compared to traditional lens-based systems. Without a focusing lens, lensless cameras rely on computational algorithms to recover the scenes from multiplexed measureme…

System Structural Error Analysis in Binocular Vision Measurement Systems Open

Miao Yang, Yuquan Qiu, Xinyu Wang, Jinwei Gu, Perry Xiao · 2024

A binocular stereo vision measurement system is widely used in fields such as industrial inspection and marine engineering due to its high accuracy, low cost, and ease of deployment. An unreasonable structural design can lead to difficulti…

Compact Nd: YVO₄ laser system based on Vapor chamber passive cooling techniques Open

Zhuoran Li, Jun Yu, Ke Zhan, Jinwei Gu, Qingyue Cui , et al. · 2024

A compact Nd: YVO₄ laser system based on vapor chamber passive cooling technique has been developed and explored for the first time to our best knowledge. An average power of 8.84 W with beam quality of M² < 2.2 and slope efficiency of 44%…

Matting by Generation Open

Zhixiang Wang, Baiang Li, Jian Wang, Yu-Lun Liu, Jinwei Gu , et al. · 2024

This paper introduces an innovative approach for image matting that redefines the traditional regression-based task as a generative modeling challenge. Our method harnesses the capabilities of latent diffusion models, enriched with exte…

LenslessFace: An End-to-End Optimized Lensless System for Privacy-Preserving Face Verification Open

Xin Cai, Hailong Zhang, Chenchen Wang, Wentao Liu, Jinwei Gu , et al. · 2024

Lensless cameras, innovatively replacing traditional lenses for ultra-thin, flat optics, encode light directly onto sensors, producing images that are not immediately recognizable. This compact, lightweight, and cost-effective imaging solu…

Learning-based lens wavefront aberration recovery Open

Li‐Qun Chen, Yuyao Hu, Jiewen Nie, Tianfan Xue, Jinwei Gu · 2024

Wavefront aberration describes the deviation of a wavefront in an imaging system from a desired perfect shape, such as a plane or a sphere, which may be caused by a variety of factors, such as imperfections in optical equipment, atmospheri…

Cached Transformers: Improving Transformers with Differentiable Memory Cache Open

Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu , et al. · 2024

This work introduces a new Transformer model called Cached Transformer, which uses Gated Recurrent Cached (GRC) attention to extend the self-attention mechanism with a differentiable memory cache of tokens. GRC attention enables attending …

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions Open

Gangwei Xu, Yujin Wang, Jinwei Gu, Tianfan Xue, Xin Yang · 2024

Reconstructing High Dynamic Range (HDR) video from image sequences captured with alternating exposures is challenging, especially in the presence of large camera or object motion. Existing methods typically align low dynamic range sequence…

Event-Based Motion Magnification Open

Yutian Chen, Shi Guo, Fangzheng Yu, Feng Zhang, Jinwei Gu , et al. · 2024

Detecting and magnifying imperceptible high-frequency motions in real-world scenarios has substantial implications for industrial and medical applications. These motions are characterized by small amplitudes and high frequencies. Tradition…

Cached Transformers: Improving Transformers with Differentiable Memory Cache Open

Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu , et al. · 2023

This work introduces a new Transformer model called Cached Transformer, which uses Gated Recurrent Cached (GRC) attention to extend the self-attention mechanism with a differentiable memory cache of tokens. GRC attention enables attending …

AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion Open

Yitong Jiang, Zhaoyang Zhang, Tianfan Xue, Jinwei Gu · 2023

We present AutoDIR, an innovative all-in-one image restoration system incorporating latent diffusion. AutoDIR excels in its ability to automatically identify and restore images suffering from a range of unknown degradations. AutoDIR offers…

Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising Open

Yujin Wang, Lingen Li, Tianfan Xue, Jinwei Gu · 2023

Image denoising is a fundamental and challenging task in the field of computer vision. Most supervised denoising methods learn to reconstruct clean images from noisy inputs, which have intrinsic spectral bias and tend to produce over-smoot…

Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration Open

Kechun Liu, Yitong Jiang, Inchang Choi, Jinwei Gu · 2023

Recent work on discrete generative priors, in the form of codebooks, has shown exciting performance for image reconstruction and restoration, as the discrete prior space spanned by the codebooks increases the robustness against diverse ima…

MIPI 2023 Challenge on Nighttime Flare Removal: Methods and Results Open

Yuekun Dai, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Qingpeng Zhu , et al. · 2023

Developing and integrating advanced image sensors with novel algorithms in camera systems are prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for re…

MIPI 2023 Challenge on RGB+ToF Depth Completion: Methods and Results Open

Qingpeng Zhu, Wenxiu Sun, Yuekun Dai, Chongyi Li, Shangchen Zhou , et al. · 2023

Depth completion from RGB images and sparse Time-of-Flight (ToF) measurements is an important problem in computer vision and robotics. While traditional methods for depth completion have relied on stereo vision or structured light techniqu…

MIPI 2023 Challenge on RGBW Fusion: Methods and Results Open

Qianhui Sun, Qingyu Yang, Chongyi Li, Shangchen Zhou, Ruicheng Feng , et al. · 2023

Developing and integrating advanced image sensors with novel algorithms in camera systems are prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for re…

MIPI 2023 Challenge on RGBW Remosaic: Methods and Results Open

Qianhui Sun, Qingyu Yang, Yu Li, Shangchen Zhou, Ruicheng Feng , et al. · 2023

Developing and integrating advanced image sensors with novel algorithms in camera systems are prevalent with the increasing demand for computational photography and imaging on mobile platforms. However, the lack of high-quality data for re…

Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera Open

Ruicheng Feng, Chongyi Li, Huaijin Chen, Shuai Li, Jinwei Gu , et al. · 2023

Due to the difficulty in collecting large-scale and perfectly aligned paired training data for Under-Display Camera (UDC) image restoration, previous methods resort to monitor-based image systems or simulation-based methods, sacrificing th…

Random Weights Networks Work as Loss Prior Constraint for Image Restoration Open

Man Zhou, Naishan Zheng, Jie Huang, Xiangyu Rui, Chunle Guo , et al. · 2023

In this paper, orthogonal to the existing data and model studies, we instead resort our efforts to investigate the potential of loss function in a new perspective and present our belief "Random Weights Networks can Be Acted as Loss Prior …

Real-time Controllable Denoising for Image and Video Open

Zhaoyang Zhang, Yitong Jiang, Wenqi Shao, Xiaogang Wang, Ping Luo , et al. · 2023

Controllable image denoising aims to generate clean samples with human perceptual priors and balance sharpness and smoothness. In traditional filter-based denoising methods, this can be easily achieved by adjusting the filtering strength. …

Overexposure Mask Fusion: Generalizable Reverse ISP Multi-Step Refinement Open

Jinha Kim, Jun Jiang, Jinwei Gu · 2022

With the advent of deep learning methods replacing the ISP in transforming sensor RAW readings into RGB images, numerous methodologies solidified into real-life applications. Equally potent is the task of inverting this process which will …

Jinwei Gu