Carl M. Pearson
Trilinos: Enabling Scientific Computing Across Diverse Hardware Architectures at Scale
Trilinos is a community-developed, open-source software framework that facilitates building large-scale, complex, multiscale, multiphysics simulation code bases for scientific and engineering problems. Since the Trilinos framework has unde…
Interconnect Bandwidth Heterogeneity on AMD MI250x and Infinity Fabric
Demand for low-latency and high-bandwidth data transfer between GPUs has driven the development of multi-GPU nodes. Physical constraints on the manufacture and integration of such systems have yielded heterogeneous intra-node interconnects,…
Machine Learning for CUDA+MPI Design Rules
We present a new strategy for automatically exploring the design space of key CUDA+MPI programs and providing design rules that discriminate slow from fast implementations. In such programs, the order of operations (e.g., GPU kernels, MPI …
tenzing
SAND2022-3576 O tenzing provides techniques for improving the performance of key applications and libraries. The program is specified as a directed acyclic graph of operations. It uses Monte-Carlo tree search to explore the design space of…
TEMPI: An Interposed MPI Library with Canonical Representation of MPI Datatypes [Slides]
These points are covered in this presentation: Distributed GPU stencil, non-contiguous data; Equivalence of strided datatypes and minimal representation; GPU communication methods; Deploying on managed systems; Large messages and MPI datat…
TEMPI: An Interposed MPI Library with Canonical Representation of MPI Datatypes [Poster]
TEMPI provides a transparent non-contiguous data-handling layer compatible with various MPIs. MPI Datatypes are a powerful abstraction for allowing an MPI implementation to operate on non-contiguous data. CUDA-aware MPI implementations mus…
Biotechnology and Bioengineering: Volume 116, Number 4, April 2019
US$18141 (US and Rest of World), € 11707 (Europe), £ 9260 (UK
SCOPE: C3SR Systems Characterization and Benchmarking Framework
This report presents the design of the Scope infrastructure for extensible and portable benchmarking. Improvements in high-performance computing systems rely on coordination across different levels of system abstraction. Developing and de…