Tianyun Zhong
Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction Open
With the emergence of large language models (LLMs), there is an expectation that LLMs can effectively extract explicit information from complex real-world documents (e.g., papers, reports). However, most LLMs generate paragraph-style answe…
Computational Machine Ethics: A Survey Open
Computational Machine Ethics (CME) is an interdisciplinary field that integrates moral philosophy into an agent’s decision-making process, contributing to the broader domain of Artificial Intelligence Ethics. Technological advancements hav…
FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation Open
Diffusion-based audio-driven talking avatar methods have recently gained attention for their high-fidelity, vivid, and expressive results. However, their slow inference speed limits practical applications. Despite the development of variou…
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes Open
Talking face generation (TFG) aims to animate a target identity's face to create realistic talking videos. Personalized TFG is a variant that emphasizes the perceptual identity similarity of the synthesized result (from the perspective of …
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Open
With the introduction of diffusion-based video generation techniques, audio-conditioned human video generation has recently achieved significant breakthroughs in both the naturalness of motion and the synthesis of portrait details. Due to …
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention Open
Diffusion-based video generation technology has advanced significantly, catalyzing a proliferation of research in human animation. However, the majority of these studies are confined to same-modality driving settings, with cross-modality h…
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices Open
Existing neural head avatar methods have achieved significant progress in the image quality and motion range of portrait animation. However, these methods neglect the computational overhead, and to the best of our knowledge, none is desig…
Superior and Pragmatic Talking Face Generation with Teacher-Student Framework Open
Talking face generation technology creates talking videos from arbitrary appearance and motion signal, with the "arbitrary" offering ease of use but also introducing challenges in practical applications. Existing methods work well with sta…
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis Open
One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video. The existing methods fail to simultaneously achieve the…
Language Model is a Branch Predictor for Simultaneous Machine Translation Open
The primary objective of simultaneous machine translation (SiMT) is to minimize latency while preserving the quality of the final translation. Drawing inspiration from CPU branch prediction techniques, we propose incorporating branch predi…
Gloss Attention for Gloss-free Sign Language Translation Open
Most sign language translation (SLT) methods to date require gloss annotations to provide additional supervision information; however, gloss annotations are not easy to acquire. To solve this problem, we first perform an analysis of …
UIRISC at SemEval-2023 Task 10: Explainable Detection of Online Sexism by Ensembling Fine-tuning Language Models Open
Under the umbrella of anonymous social networks, many women have suffered from abuse, discrimination, and other sexist expressions online. However, existing methods based on keyword filtering and matching performed poorly on online sexism …