Feilong Chen
A time series algorithm to predict surgery in neonatal necrotizing enterocolitis
The LSTM model with FL exhibits high precision and recall in forecasting the need for surgical intervention 1 or 2 days ahead. This predictive capability holds promise for enhancing infants' outcomes by facilitating timely clinical decisio…
DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder
Generating high-quality and person-generic visual dubbing remains a challenge. Recent innovation has seen the advent of a two-stage paradigm, decoupling the rendering and lip synchronization process facilitated by intermediate representati…
[Recent research on machine learning in the diagnosis and treatment of necrotizing enterocolitis in neonates].
Necrotizing enterocolitis (NEC), with the main manifestations of bloody stool, abdominal distension, and vomiting, is one of the leading causes of death in neonates, and early identification and diagnosis are crucial for the prognosis of N…
VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition
Enhancing automatic speech recognition (ASR) performance by leveraging additional multimodal information has shown promising results in previous studies. However, most of these works have primarily focused on utilizing visual cues derived …
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Large language models (LLMs) have demonstrated remarkable language abilities. GPT-4, based on advanced LLMs, exhibits extraordinary multimodal capabilities beyond previous visual language models. We attribute this to the use of more advanc…
Relationship between depression and lifestyle factors in Chinese adults using multi-level generalized estimation equation model
To the editor: In China, depression is a common mental illness with a lifetime prevalence of 6.8% and a 12-month prevalence of 3.6%.[1] The study aimed to determine the prevalence of depression and its relationship with associated lifestyl…
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Large-scale pre-trained language models (PLMs) have shown great potential in natural language processing tasks. Leveraging the capabilities of PLMs to enhance automatic speech recognition (ASR) systems has also emerged as a promising resea…
VLP: A Survey on Vision-language Pre-training
DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations
Capturing complex contextual dependencies plays a vital role in Emotion Recognition in Conversations (ERC). Previous studies have predominantly focused on speaker-aware context modeling, overlooking the discourse structure of the conversat…
Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog
Visual dialog requires models to give reasonable answers according to a series of coherent questions and related visual concepts in images. However, most current work either focuses on attention-based fusion or pre-training on large-scale …
An Online Sparse Streaming Feature Selection Algorithm
Online streaming feature selection (OSFS), which conducts feature selection in an online manner, plays an important role in dealing with high-dimensional data. In many real applications such as intelligent healthcare platform, streaming fe…
HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval
In the past few years, the emergence of vision-language pre-training (VLP) has brought cross-modal retrieval to a new era. However, due to the latency and computation demand, it is commonly challenging to apply VLP in a real-time online re…
Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning
Visual Dialog is a challenging vision-language task since the visual dialog agent needs to answer a series of questions after reasoning over both the image content and dialog history. Though existing methods try to deal with the cross-moda…
VLP: A Survey on Vision-Language Pre-training
In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) to a new era. Substantial works have shown they are beneficial for downstream uni-m…
Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation
Visual dialogue is a challenging task since it needs to answer a series of coherent questions on the basis of understanding the visual environment. Previous studies focus on the implicit exploration of multimodal co-reference by implicitly…
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Visual dialog, which aims to hold a meaningful conversation with humans about a given image, is a challenging task that requires models to reason the complex dependencies among visual content, dialog history, and current questions. Graph n…
Unsupervised Knowledge Selection for Dialogue Generation
Knowledge selection is an important and challenging task which could provide the appropriate knowledge for informative dialogue generation. However, the needed gold knowledge label is difficult to collect in reality. In this paper, we study …
Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation
Visual dialogue is a challenging task since it needs to answer a series of coherent questions on the basis of understanding the visual environment. Previous studies focus on the implicit exploration of multimodal coreference by implicitly a…
Big Archive‐Assisted Ensemble of Many‐Objective Evolutionary Algorithms
Multiobjective evolutionary algorithms (MOEAs) have witnessed prosperity in solving many‐objective optimization problems (MaOPs) over the past three decades. Unfortunately, no one single MOEA equipped with given parameter settings, mating‐…
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Visual dialog, which aims to hold a meaningful conversation with humans about a given image, is a challenging task that requires models to reason the complex dependencies among visual content, dialog history, and current questions. Graph ne…
Learning to Ground Visual Objects for Visual Dialog
Visual dialog is challenging since it needs to answer a series of coherent questions based on understanding the visual environment. How to ground related visual objects is one of the key problems. Previous studies utilize the question and …
Monocytic MDSCs skew Th17 cells toward a pro-osteoclastogenic phenotype and potentiate bone erosion in rheumatoid arthritis
Objectives: While myeloid-derived suppressor cells (MDSCs) were previously shown to promote a proinflammatory T helper (Th) 17 response in autoimmune conditions, a potential impact of the MDSC-Th17 immune axis on abnormal bone destruction i…
DMRM: A Dual-Channel Multi-Hop Reasoning Model for Visual Dialog
Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image. It remains a challenging task since it requires the agent to fully understand a given question before making an…
The pan-cancer landscape of netrin family reveals potential oncogenic biomarkers
Recent cancer studies have found that the netrin family of proteins plays vital roles in the development of some cancers. However, the functions of the many variants of these proteins in cancer remain incompletely understood. In this work,…
Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation
Knowledge selection plays an important role in knowledge-grounded dialogue, which is a challenging task to generate more informative responses by leveraging external knowledge. Recently, latent variable models have been proposed to deal wi…
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image. It remains a challenging task since it requires the agent to fully understand a given question before making an…
Study on Problems and Countermeasures of Cargo Transportation at Haikou Airport
With the continuous development of China's economy and the improvement of various infrastructures, air transportation has gradually become the main force of domestic transport of personnel and material transportation. At present, China's na…
Multimodal Transportation: The Case of Laptop from Chongqing in China to Rotterdam in Europe
Multimodal transportation is a key component of modern logistics systems, especially for long-distance transnational transportation. This paper explores the various alternative routes for laptop exports from Chongqing, China to Rotterdam, t…