Chuan Wu
DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism
Context parallelism has emerged as a key technique to support long-context training, a growing trend in generative AI for modern large models. However, existing context parallel methods rely on static parallelization configurations that ov…
Robust LLM Training Infrastructure at ByteDance
Leadless pacemakers: A review of communication methods, energy management, and clinical applications
Leadless pacemakers have emerged as a mainstream clinical solution, and their communication capabilities, crucial for reliable pacing and device monitoring, continue to evolve. This review systematically examines the fundamental principles…
On the Interplay between Graph Structure and Learning Algorithms in Graph Neural Networks
This paper studies the interplay between learning algorithms and graph structure for graph neural networks (GNNs). Existing theoretical studies on the learning dynamics of GNNs primarily focus on the convergence rates of learning algorithm…
Microbial Corrosion Behavior of L245 Pipeline Steel in the Presence of Iron-Oxidizing Bacteria and Shewanella algae
Microbiologically influenced corrosion (MIC) poses significant challenges in oilfield water injection environments, leading to substantial socioeconomic losses. L245 steel, a low-alloy steel widely used in oil and gas pipelines due to its …
HybridFlow: A Flexible and Efficient RLHF Framework
Reinforcement Learning from Human Feedback (RLHF) is widely used in Large Language Model (LLM) alignment. Traditional RL can be modeled as a dataflow, where each node represents computation of a neural network (NN) and each edge denotes…
Mitigating Unfairness in Differentially-Private Federated Learning
Federated learning is a new learning paradigm which utilizes crowdsourced data stored at dispersed user devices (aka clients) to learn a global model. Studies have shown that even though data are kept on local devices, an adversary is stil…
Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models
High-quality data resources play a crucial role in learning large language models (LLMs), particularly for low-resource languages like Cantonese. Despite having more than 85 million native speakers, Cantonese is still considered a low-reso…
A machine-learning approach to optimize nutritional properties and organic wastes recycling efficiency conversed by black soldier fly (Hermetia illucens)
Suboptimal nutrition in organic waste limits the growth of black soldier fly (BSF) larvae, thereby reducing biowaste recycling efficiency. In this study, weight gain data from BSF larvae fed diets with distinct nutrient compositions were u…
ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom
Echo: Simulating Distributed Training At Scale
Simulation offers unique values for both enumeration and extrapolation purposes, and is becoming increasingly important for managing the massive machine learning (ML) clusters and large-scale distributed training jobs. In this paper, we bu…
Metastable pitting corrosion behavior of the Incoloy 825 liner of metallurgically clad pipe in simulated oilfield produced water
This study investigates the metastable pitting corrosion behavior and passive film characteristics of metallurgically clad pipe (MCP) 825 within simulated oilfield produced water. Electrochemical testing and microstructural examination wer…
How Well Do LLMs Handle Cantonese? Benchmarking Cantonese Capabilities of Large Language Models
The rapid evolution of large language models (LLMs) has transformed the competitive landscape in natural language processing (NLP), particularly for English and other data-rich languages. However, underrepresented languages like Cantonese,…
Heta: Distributed Training of Heterogeneous Graph Neural Networks
Heterogeneous Graph Neural Networks (HGNNs) leverage diverse semantic relationships in Heterogeneous Graphs (HetGs) and have demonstrated remarkable learning performance in various applications. However, current distributed GNN training sy…
Data Augmentation of Multi-turn Psychological Dialogue via Knowledge-driven Progressive Thought Prompting
Existing dialogue data augmentation (DA) techniques predominantly focus on augmenting utterance-level dialogues, which makes it difficult to take dialogue contextual information into account. The advent of large language models (LLMs) has …
Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping
The Mixture-of-Experts (MoE) technique plays a crucial role in expanding the size of DNN model parameters. However, it faces the challenge of extended all-to-all communication latency during the training process. Existing methods attempt to…
Preface: Heavy metal(loid)s at mining & metallurgical sites: Fate, risk and remediation
BG-HGNN: Toward Scalable and Efficient Heterogeneous Graph Neural Network
Many computer vision and machine learning problems are modelled as learning tasks on heterogeneous graphs, featuring a wide array of relations from diverse types of nodes and edges. Heterogeneous graph neural networks (HGNNs) stand out as …
On the Topology Awareness and Generalization Performance of Graph Neural Networks
Many computer vision and machine learning problems are modelled as learning tasks on graphs, where graph neural networks (GNNs) have emerged as a dominant tool for learning representations of graph-structured data. A key feature of GNNs is the…
LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
Recent breakthroughs in Large-scale language models (LLMs) have demonstrated impressive performance on various tasks. The immense sizes of LLMs have led to very high resource demand and cost for running the models. Though the models are la…
Towards Robust Graph Incremental Learning on Evolving Graphs
Incremental learning is a machine learning approach that involves training a model on a sequence of tasks, rather than all tasks at once. This ability to learn incrementally from a stream of tasks is crucial for many real-world application…
HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis
Single-Program-Multiple-Data (SPMD) parallelism has recently been adopted to train large deep neural networks (DNNs). Few studies have explored its applicability on heterogeneous clusters, to fully exploit available resources for large mod…
Identifying Frailty in Older Adults Receiving Home Care Assessment Using Machine Learning: Longitudinal Observational Study on the Role of Classifier, Feature Selection, and Sample Size
Background: Machine learning techniques are starting to be used in various health care data sets to identify frail persons who may benefit from interventions. However, evidence about the performance of machine learning techniques compared t…
Microstructural Evolution and Mechanical Properties of Ti6Al4V Alloy Manufactured by the Multi-Pass Hot Caliber Rolling at 700℃ and 800℃ with Different Reductions
A Multi-Criteria Decision Support Model for New Energy Vehicle Selection Considering Social Media Influencer Reviews and Personalized Preferences
GNNFlow: A Distributed Framework for Continuous Temporal GNN Learning on Dynamic Graphs
Graph Neural Networks (GNNs) play a crucial role in various fields. However, most existing deep graph learning frameworks assume pre-stored static graphs and do not support training on graph streams. In contrast, many real-world graphs are…
DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
Multi-task model training has been adopted to enable a single deep neural network model (often a large language model) to handle multiple tasks (e.g., question answering and text summarization). Multi-task training commonly receives input …
CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs
Deep Neural Networks (DNNs) have shown excellent performance in a wide range of machine learning applications. Knowing the latency of running a DNN model or tensor program on a specific device is useful in various tasks, such as DNN graph-…