Explanipedia

The Role of Video Generation in Enhancing Data-Limited Action Understanding Open

Wei Li, Dezhao Luo, Dongbao Yang, Z.C. Li, Weiping Wang, et al. · 2025

Video action understanding tasks in real-world scenarios always suffer from data limitations. In this paper, we address the data-limited action understanding problem by bridging data scarcity. We propose a novel method that employs a text-to-vi…

DCA: Dividing and Conquering Amnesia in Incremental Object Detection Open

Aoting Zhang, Dongbao Yang, Chang Liu, Xiaopeng Hong, Miwei Shang, et al. · 2025

Computer science Psychology

Incremental object detection (IOD) aims to cultivate an object detector that can continuously localize and recognize novel classes while preserving its performance on previous classes. Existing methods achieve certain success by improving …

Specifying What You Know or Not for Multi-Label Class-Incremental Learning Open

Aoting Zhang, Dongbao Yang, Chang Liu, Xiaopeng Hong, Yu Zhou · 2025

Computer science Psychology

Existing class incremental learning is mainly designed for single-label classification task, which is ill-equipped for multi-label scenarios due to the inherent contradiction of learning objectives for samples with incomplete labels. We ar…

Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance Open

Jiahao Lyu, Wei Wang, Dongbao Yang, Jinwen Zhong, Zhou Yu · 2025

Computer science Philosophy Economics

Scene text spotting has attracted the enthusiasm of relevant researchers in recent years. Most existing scene text spotters follow the detection-then-recognition paradigm, where the vanilla detection module hardly determines the reading or…

Robust Multimodal Sentiment Analysis of Image-Text Pairs by Distribution-Based Feature Recovery and Fusion Open

Daiqing Wu, Dongbao Yang, Yu Zhou, Can Ma · 2024

Computer science Philosophy

As posts on social media increase rapidly, analyzing the sentiments embedded in image-text pairs has become a popular research topic in recent years. Although existing works achieve impressive accomplishments in simultaneously harnessing i…

Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs Open

Daiqing Wu, Dongbao Yang, Yu Zhou, Can Ma · 2024

Computer science Psychology

Visual emotion recognition (VER) is a longstanding field that has garnered increasing attention with the advancement of deep neural networks. Although recent studies have achieved notable improvements by leveraging the knowledge embedded w…

First Creating Backgrounds Then Rendering Texts: A New Paradigm for Visual Text Blending Open

Z.C. Li, Shu Yan, Wei Zeng, Dongbao Yang, Yu Zhou · 2024

Computer science

Diffusion models, known for their impressive image generation abilities, have played a pivotal role in the rise of visual text generation. Nevertheless, existing visual text generation methods often focus on generating entire images with t…

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control Open

Wei Zeng, Shu Yan, Zhenhang Li, Dongbao Yang, Yu Zhou · 2024

Computer science Physics

Centred on content modification and style preservation, Scene Text Editing (STE) remains a challenging task despite considerable progress in text-to-image synthesis and text-driven image manipulation recently. GAN-based STE methods general…

Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval Open

Gangyan Zeng, Yuan Zhang, Jin Wei, Dongbao Yang, Peng Zhang, et al. · 2024

Computer science Physics

Scene text retrieval aims to find all images containing the query text from an image gallery. Current efforts tend to adopt an Optical Character Recognition (OCR) pipeline, which requires complicated text detection and/or recognition proce…

Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition Open

Daiqing Wu, Dongbao Yang, Huawen Shen, Can Ma, Yu Zhou · 2024

Computer science Chemistry

With the proliferation of social media posts in recent years, the need to detect sentiments in multimodal (image-text) content has grown rapidly. Since posts are user-generated, the image and text from the same post can express different o…

Pseudo Object Replay and Mining for Incremental Object Detection Open

Dongbao Yang, Yu Zhou, Xiaopeng Hong, Aoting Zhang, Xin Wei, et al. · 2023

Computer science Philosophy

Incremental object detection (IOD) aims to mitigate catastrophic forgetting for object detectors when incrementally learning to detect new emerging object classes without using original training data. Most existing IOD methods benefit from…

Perceiving Ambiguity and Semantics without Recognition: An Efficient and Effective Ambiguous Scene Text Detector Open

Yan Shu, Wei Wang, Yu Zhou, Shaohui Liu, Aoting Zhang, et al. · 2023

Computer science Psychology

Ambiguous scene text detection is an extremely challenging task. Existing text detectors that rely solely on visual cues often suffer from confusion due to being evenly distributed in rows/columns or incomplete detection owing to large cha…

One-Shot Replay: Boosting Incremental Object Detection via Retrospecting One Object Open

Dongbao Yang, Yu Zhou, Xiaopeng Hong, Aoting Zhang, Weiping Wang · 2023

Computer science Philosophy

Modern object detectors are ill-equipped to incrementally learn new emerging object classes over time due to the well-known phenomenon of catastrophic forgetting. Due to data privacy or limited storage, few or no images of the old data can…

Masked and Permuted Implicit Context Learning for Scene Text Recognition Open

Xiaomeng Yang, Zhi Qiao, Wei Jin, Yu Zhou, Ye Yuan, et al. · 2023

Computer science Economics Geography

Scene Text Recognition (STR) is difficult because of the variations in text styles, shapes, and backgrounds. Though the integration of linguistic information enhances models' performance, existing methods based on either permuted language …

Multi-View Correlation Distillation for Incremental Object Detection Open

Dongbao Yang, Yu Zhou, Weiping Wang · 2021

Computer science Chemistry Philosophy

In real applications, new object classes often emerge after the detection model has been trained on a prepared dataset with fixed classes. Due to the storage burden and the privacy of old data, sometimes it is impractical to train the mode…

Two-Level Residual Distillation based Triple Network for Incremental Object Detection Open

Dongbao Yang, Yu Zhou, Dayan Wu, Can Ma, Fei Yang, et al. · 2020

Computer science Biology Chemistry

Modern object detection methods based on convolutional neural network suffer from severe catastrophic forgetting in learning new classes without original data. Due to time consumption, storage burden and privacy of old data, it is inadvisa…

Self-Training for Domain Adaptive Scene Text Detection Open

Yudi Chen, Wei Wang, Yu Zhou, Fei Yang, Dongbao Yang, et al. · 2020

Computer science Mathematics

Though deep learning based scene text detection has achieved great progress, well-trained detectors suffer from severe performance degradation for different domains. In general, a tremendous amount of data is indispensable to train the det…

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition Open

Zhi Qiao, Yu Zhou, Dongbao Yang, Yucan Zhou, Weiping Wang · 2020

Computer science Geography

Scene text recognition is a hot research topic in computer vision. Recently, many recognition methods based on the encoder-decoder framework have been proposed, and they can handle scene texts of perspective distortion and curve shape. Nev…

Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning Open

Dezhao Luo, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, et al. · 2020

Computer science Mathematics Political science

We propose a novel self-supervised method, referred to as Video Cloze Procedure (VCP), to learn rich spatial-temporal representations. VCP first generates “blanks” by withholding video clips and then creates “options” by applying spatio-te…

Curved Text Detection in Natural Scene Images with Semi- and Weakly-Supervised Learning Open

Xugong Qin, Yu Zhou, Dongbao Yang, Weiping Wang · 2019

Computer science Mathematics

Detecting curved text in the wild is very challenging. Recently, most state-of-the-art methods are segmentation based and require pixel-level annotations. We propose a novel scheme to train an accurate text detector using only a small amou…
