Xingfu Wu
Coordinated Power Management on Heterogeneous Systems
Performance prediction is essential for energy-efficient computing in heterogeneous computing systems that integrate CPUs and GPUs. However, traditional performance modeling methods often rely on exhaustive offline profiling, which becomes…
Leveraging LLMs to Automate Energy-Aware Refactoring of Parallel Scientific Codes
While large language models (LLMs) are increasingly used for generating parallel scientific codes, most efforts emphasize functional correctness, often overlooking performance, especially energy efficiency. We propose LASSI-EE, an automate…
SAGIPS: a physics-inspired scalable asynchronous generative inverse-problem solver
Solving large-scale inverse problems using deep-learning algorithms has become an essential part of modern research and industrial applications. The complexity of the underlying inverse problem may require the utilization of high performa…
Modeling performance of data collection systems for high-energy physics
Exponential increases in scientific experimental data are outpacing silicon technology progress, necessitating heterogeneous computing systems—particularly those utilizing machine learning (ML)—to meet future scientific computing demands. …
ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales
As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging. We propose a low‐overhead autotuning fram…
Integrating ytopt and libEnsemble to autotune OpenMC
Ytopt is a Python machine-learning-based autotuning software package developed within the ECP PROTEAS-TUNE project. The ytopt software adopts an asynchronous search framework that consists of sampling a small number of input parameter conf…
Co-Design of 2D Heterojunctions for Data Filtering in Tracking Systems
As particle physics experiments evolve to achieve higher energies and resolutions, handling the massive data volumes produced by silicon pixel detectors, which are used for charged particle tracking, poses a significant challenge. To addre…
LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes
This paper addresses the problem of providing a novel approach to sourcing significant training data for LLMs focused on science and engineering. In particular, a crucial challenge is sourcing parallel scientific codes in the ranges of mil…
Modeling Performance of Data Collection Systems for High-Energy Physics
Exponential increases in scientific experimental data are outstripping the rate of progress in silicon technology. As a result, heterogeneous combinations of architectures and process or device technologies are increasingly important to me…
An Autotuning-based Optimization Framework for Mixed-kernel SVM Classifications in Smart Pixel Datasets and Heterojunction Transistors
Support Vector Machine (SVM) is a state-of-the-art classification method widely used in science and engineering due to its high accuracy, its ability to deal with high dimensional data, and its flexibility in modeling diverse sources of da…
SAGIPS: A Scalable Asynchronous Generative Inverse Problem Solver
Large-scale deep learning algorithms for solving inverse problems have become an essential part of modern research and industrial applications. The complexity of the underlying inverse problem often poses challenges to the algorithm and requir…
Industrial multivariate time-series data anomaly detection incorporating attention mechanisms and adversarial training
To address the challenges faced in industrial anomaly detection, including data sample imbalance, lack of anomaly labels, and complex spatiotemporal relationships in high-dimensional data, this paper proposes a novel multi-modal time-serie…
Integrating ytopt and libEnsemble to Autotune OpenMC
ytopt is a Python machine-learning-based autotuning software package developed within the ECP PROTEAS-TUNE project. The ytopt software adopts an asynchronous search framework that consists of sampling a small number of input parameter conf…
Autotuning Apache TVM-based Scientific Applications Using Bayesian Optimization
Apache TVM (Tensor Virtual Machine), an open source machine learning compiler framework designed to optimize computations across various hardware platforms, provides an opportunity to improve the performance of dense matrix factorizations …
Transfer-learning-based Autotuning using Gaussian Copula
As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, e…
ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales
As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging. We propose a low-overhead autotuning fram…
DNPC: A Dynamic Node-Level Power Capping Library for Scientific Applications
As the race to exascale computing accelerates, power consumption continues to be a critical challenge. While several technologies are available for power management, balancing energy efficiency and application performance during executio…
Performance Debugging and Tuning of Flash-X with Data Analysis Tools
State-of-the-art multiphysics simulations running on large scale leadership computing platforms have many variables contributing to their performance and scaling behavior. We recently encountered an interesting performance anomaly in Flash…
Performance and power modeling and prediction using MuMMI and 10 machine learning methods
Energy‐efficient scientific applications require insight into how high performance computing system features impact the applications' power and performance. This insight can result from the development of performance and power mode…
Autotuning PolyBench benchmarks with LLVM Clang/Polly loop optimization pragmas using Bayesian optimization
We develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter space search and compare four different supervised learning methods within Bayesian optimization and evaluate their effectiveness. We sel…
Utilizing ensemble learning for performance and power modeling and improvement of parallel cancer deep learning CANDLE benchmarks
Machine learning (ML) continues to grow in importance across nearly all domains in modeling to learn from data. Often a tradeoff exists between a model's ability to minimize bias and variance. In this article, we utilize ensemble learning …
Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations
Polly is the LLVM project's polyhedral loop nest optimizer. Recently, user-directed loop transformation pragmas were proposed based on LLVM/Clang and Polly. The search space exposed by the transformation pragmas is a tree, wherein each nod…
Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization (extended version)
In this paper, we develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter space search and compare four different supervised learning methods within Bayesian optimization and evaluate their effect…
Utilizing Ensemble Learning for Performance and Power Modeling and Improvement of Parallel Cancer Deep Learning CANDLE Benchmarks
Machine learning (ML) continues to grow in importance across nearly all domains and is a natural tool in modeling to learn from data. Often a tradeoff exists between a model's ability to minimize bias and variance. In this paper, we utiliz…
Performance and Power Modeling and Prediction Using MuMMI and Ten Machine Learning Methods
In this paper, we use the modeling and prediction tool MuMMI (Multiple Metrics Modeling Infrastructure) and ten machine learning methods to model and predict performance and power and compare their prediction error rates. We use a fault-tolera…
Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization
Autotuning is an approach that explores a search space of possible implementations/configurations of a kernel or an application by selecting and evaluating a subset of implementations/configurations on a target platform and/or use model…
Autotuning Search Space for Loop Transformations
One of the challenges for optimizing compilers is to predict whether applying an optimization will improve its execution speed. Programmers may override the compiler's profitability heuristic using optimization directives such as pragmas i…
Toward an End-to-End Auto-tuning Framework in HPC PowerStack
Efficiently utilizing procured power and optimizing performance of scientific applications under power and energy constraints are challenging. The HPC PowerStack defines a software stack to manage power and energy of high-performance compu…
Distribution and molecular identification of Meloidogyne spp. parasitising flue-cured tobacco in Yunnan, China
Twenty-one populations of root-knot nematodes (RKNs) were recovered from diseased roots collected from flue-cured tobacco in 21 locations in Yunnan (China) during 2014-2015. Molecular diagnosis on species was performed based on characteris…