Keyu Wen
Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks
Text-to-image synthesis aims to generate a photo-realistic and semantically consistent image from a specific text description. The images synthesized by off-the-shelf models usually contain limited components compared with the corresponding im…
A Unified Two-Stage Group Semantics Propagation and Contrastive Learning Network for Co-Saliency Detection
Co-saliency detection (CoSOD) aims at discovering the repetitive salient objects from multiple images. Two primary challenges are group semantics extraction and noise object suppression. In this paper, we present a unified Two-stage grOup …
Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation Learning and Retrieval
Recently, cross-modal pre-training has become a hot topic because of its wide application in downstream tasks including retrieval, captioning, and question answering. However, existing methods adopt a one-stream …
Learning Dual Semantic Relations With Graph Attention for Image-Text Matching
Image-Text Matching is one major task in cross-modal information processing. The main challenge is to learn the unified visual and textual representations. Previous methods that perform well on this task primarily focus on not only the …