Explanipedia

Specific Scenario Generation Method for Trustworthiness Testing of Autonomous Vehicles Based on Interaction Coding Open

Yuntao Chang, Chenyun Xi, Zhengxiong Luo · 2025

In response to the problems of rough modeling and insufficient coverage of edge interaction scenarios in autonomous driving tests, this paper proposes a scene generation method based on interaction coding. The method constructs a hierarchi…

Seedream 4.0: Toward Next-generation Multimodal Image Generation Open

Team Seedream, Yunpeng Chen, Yu Gao, Lixue Gong, et al. · 2025

We introduce Seedream 4.0, an efficient and high-performance multimodal image generation system that unifies text-to-image (T2I) synthesis, image editing, and multi-image composition within a single framework. We develop a highly efficient…

Rolling Bearing Life Prediction Based on Improved Transformer Encoding Layer and Multi-Scale Convolution Open

Zhengxiong Luo, Zhihai Wang, Xiaoqin Liu, Yingming Yang · 2025

To accurately and reliably characterize the degradation trend of rolling bearings and predict their life cycle, this paper proposes a bearing life prediction model based on an improved transformer encoder layer and multi-scale convolution.…

Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation Open

Chao‐Yaug Liao, Liyang Liu, Xun Wang, Zhengxiong Luo, Xinyu Zhang, et al. · 2025

Recent progress in unified models for image understanding and generation has been impressive, yet most approaches remain limited to single-modal generation conditioned on multiple modalities. In this paper, we present Mogao, a unified fram…

Autoregressive Video Generation without Vector Quantization Open

Haoge Deng, Ting Pan, Haiwen Diao, Zhengxiong Luo, Yufeng Cui, et al. · 2024

Computer science Mathematics Chemistry

This paper presents a novel approach that enables autoregressive video generation with high efficiency. We propose to reformulate the video generation problem as a non-quantized autoregressive modeling of temporal frame-by-frame prediction…

You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale Open

Baorui Ma, Huachen Gao, Haoge Deng, Zhengxiong Luo, Tiejun Huang, et al. · 2024

Computer science Geography

Recent 3D generation models typically rely on limited-scale 3D 'gold-labels' or 2D diffusion priors for 3D content creation. However, their performance is upper-bounded by constrained 3D priors due to the lack of scalable learning paradigm…

Emu3: Next-Token Prediction is All You Need Open

Xinlong Wang, Xiaosong Zhang, Zhengxiong Luo, Quan Sun, Yufeng Cui, et al. · 2024

Computer science

While next-token prediction is considered a promising path towards artificial general intelligence, it has struggled to excel in multimodal tasks, which are still dominated by diffusion models (e.g., Stable Diffusion) and compositional app…

LawLuo: A Multi-Agent Collaborative Framework for Multi-Round Chinese Legal Consultation Open

Jian Sun, Chengxiao Dai, Zhengxiong Luo, Yangbo Chang, Yang Li · 2024

Business Computer science

Legal Large Language Models (LLMs) have shown promise in providing legal consultations to non-experts. However, most existing Chinese legal consultation models are based on single-agent systems, which differ from real-world legal consultat…

Generative Multimodal Models are In-Context Learners Open

Quan Sun, Yufeng Cui, Xiaosong Zhang, Fan Zhang, Qiying Yu, et al. · 2023

Computer science Biology Economics

The human ability to easily solve multimodal tasks in context (i.e., with only a few demonstrations or simple instructions) is what current multimodal systems have largely struggled to imitate. In this work, we demonstrate that the task-a…

Notice of Removal: VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation Open

Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, et al. · 2023

Computer science Physics

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distributi…

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation Open

Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, et al. · 2023

Computer science Physics

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distributi…

Learning the Degradation Distribution for Blind Image Super-Resolution Open

Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan · 2022

Computer science Mathematics Philosophy

Synthetic high-resolution (HR) and low-resolution (LR) pairs are widely used in existing super-resolution (SR) methods. To avoid the domain gap between synthetic and test images, most previous methods try to adaptively learn the synthesizin…

DFAN: Dual Feature Aggregation Network for Lightweight Image Super‐Resolution Open

Li Shang, Guixuan Zhang, Zhengxiong Luo, Jie Liu · 2022

Computer science Political science Art

With the power of deep learning, super‐resolution (SR) methods enjoy a dramatic boost in performance. However, they usually have a large model size and high computational complexity, which hinders their application on devices with limited me…

Approaching the Limit of Image Rescaling via Flow Guidance Open

Shang Li, Guixuan Zhang, Zhengxiong Luo, Jie Liu, Zhi Zeng, et al. · 2021

Computer science Mathematics Biology

Image downscaling and upscaling are two basic rescaling operations. Once an image is downscaled, it is difficult to reconstruct via upscaling due to the loss of information. To make these two processes more compatible and improve the…

Adaptive Dilated Convolution For Human Pose Estimation Open

Zhengxiong Luo, Zhicheng Wang, Yan Huang, Liang Wang, Tieniu Tan, et al. · 2021

Computer science Mathematics Physics

Most existing human pose estimation (HPE) methods exploit multi-scale information by fusing feature maps of four different spatial sizes, i.e., 1/4, 1/8, 1/16, and 1/32 of the input image. There are two drawbacks of this strategy: 1)…

End-to-end Alternating Optimization for Blind Super Resolution Open

Zhengxiong Luo, Yan Huang, Li Shang, Liang Wang, Tieniu Tan · 2021

Computer science Mathematics

Previous methods decompose the blind super-resolution (SR) problem into two sequential steps: (i) estimating the blur kernel from a given low-resolution (LR) image and (ii) restoring the SR image based on the estimated kernel…

Rethinking the Heatmap Regression for Bottom-up Human Pose Estimation Open

Zhengxiong Luo, Zhicheng Wang, Yan Huang, Tieniu Tan, Erjin Zhou · 2020

Computer science Mathematics Physics

Heatmap regression has become the most prevalent choice for current human pose estimation methods. The ground-truth heatmaps are usually constructed by covering all skeletal keypoints with 2D Gaussian kernels. The standard deviations of th…

Efficient Human Pose Estimation by Learning Deeply Aggregated Representations Open

Zhengxiong Luo, Zhicheng Wang, Yuanhao Cai, Guan-An Wang, Yan Huang, et al. · 2020

Computer science Mathematics Physics

In this paper, we propose an efficient human pose estimation network (DANet) by learning deeply aggregated representations. Most existing models explore multi-scale information mainly from features with different spatial sizes. Powerful mu…

Unfolding the Alternating Optimization for Blind Super Resolution Open

Zhengxiong Luo, Yan Huang, Li Shang, Liang Wang, Tieniu Tan · 2020

Computer science Mathematics

Previous methods decompose the blind super-resolution (SR) problem into two sequential steps: (i) estimating the blur kernel from a given low-resolution (LR) image and (ii) restoring the SR image based on the estimated kernel. This two-step …

Learning Delicate Local Representations for Multi-Person Pose Estimation Open

Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, et al. · 2020

Computer science Biology

In this paper, we propose a novel method called Residual Steps Network (RSN). RSN aggregates features with the same spatial size (Intra-level features) efficiently to obtain delicate local representations, which retain rich low-level spati…

Zhengxiong Luo