Michèle Weiland
Protocol for Developing the Sustainable Scientific Machine-learning Reporting Toolkit (SSMART) v1
This protocol describes the development of the Sustainable Scientific Machine-learning Reporting Toolkit (SSMART). This will be a checklist-based reporting guideline for scientific machine learning (SML) based studies in the physical scien…
Sustainable AI: Experiences, Challenges & Recommendations
The use of Artificial Intelligence (AI) and Machine Learning (ML) as part of scientific workloads is becoming increasingly widespread. It is imperative to understand how to configure AI and ML applications on HPC systems to optimise their …
Performance and scaling of the LFRic weather and climate model on different generations of HPE Cray EX supercomputers
This study presents scaling results and a performance analysis across different supercomputers and compilers for the Met Office weather and climate model, LFRic. The model is shown to scale to large numbers of nodes which meets the design …
Evaluating and optimising compiler code generation for NVIDIA Grace
In this paper, we explore the performance of the main optimising compiler toolchains currently available for high-performance AArch64 processors, namely the Arm Compiler for Linux (ACFL), GNU, LLVM and the NVIDIA HPC (NVHPC) compilers, on …
Morpheus: A library for efficient runtime switching of sparse matrix storage formats
Sparse matrix storage formats have evolved over the years to better exploit the particular strengths of different hardware architectures or to better match the sparsity patterns of matrices, with the aim to optimize operations on the matri…
An approach to performance portability through generic programming
The expanding hardware diversity in high performance computing adds enormous complexity to scientific software development. Developers who aim to write maintainable software have two options: 1) To use a so-called data locality abstraction…
Vectorizing and distributing number‐theoretic transform to count Goldbach partitions on Arm‐based supercomputers
In this article, we explore the usage of scalable vector extension (SVE) to vectorize number‐theoretic transforms (NTTs). In particular, we show that 64‐bit modular arithmetic operations, including modular multiplication, can be ef…
eCSE-0302 Final Report
This report presents the work conducted for the ARCHER2 eCSE-0302 project. The project goal is to address the I/O bottleneck in the Xcompact3D CFD application by facilitating user-defined in-situ analyses. This was achieved by first addin…
Morpheus unleashed: Fast cross-platform SpMV on emerging architectures
Sparse matrices and linear algebra are at the heart of scientific simulations. Over the years, more than 70 sparse matrix storage formats have been developed, targeting a wide range of hardware architectures and matrix types, each of which…
Optimizing Sparse Linear Algebra Through Automatic Format Selection and Machine Learning
Sparse matrices are an integral part of scientific simulations. As hardware evolves new sparse matrix storage formats are proposed aiming to exploit optimizations specific to the new hardware. In the era of heterogeneous computing, users o…
Exploiting dynamic sparse matrices for performance portable linear algebra operations
Sparse matrices and linear algebra are at the heart of scientific simulations. More than 70 sparse matrix storage formats have been developed over the years, targeting a wide range of hardware architectures and matrix types. Each format is…
Interim Report: Complexity, Challenges and Opportunities for Carbon Neutral Digital Research
UK Research and Innovation has placed the net zero transition at the top of its priorities (UK Research & Innovation 2020). This has two implications. Firstly, UKRI supports research which develops and identifies solutions which create adv…
Performance Evaluation of Adaptive Routing on Dragonfly-based Production Systems
Performance of applications in production environments can be sensitive to network congestion. Cray Aries supports adaptively routing each network packet independently based on the load or congestion encountered as a packet traverses the n…
Porting the microphysics model CASIM to GPU and KNL Cray machines
CASIM is a microphysics scheme which calculates the interaction between moisture droplets in the atmosphere and forms a critical part of weather and climate modelling codes. However the calculations involved are computationally intensive a…
A highly scalable Met Office NERC Cloud model
Large Eddy Simulation is a critical modelling tool for scientists investigating atmospheric flows, turbulence and cloud microphysics. Within the UK, the principal LES model used by the atmospheric research community is the Met Office Large…
Investigating Applications on the A64FX
The A64FX processor from Fujitsu, being designed for computational simulation and machine learning applications, has the potential for unprecedented performance in HPC systems. In this paper, we evaluate the A64FX by benchmarking against a…
Progressive Load Balancing in Distributed Memory
System performance variability is a significant challenge to scalability of tightly-coupled iterative applications. Asynchronous variants perform better, but an imbalance in progress can result in slower convergence or even failure to conv…
An early evaluation of Intel's optane DC persistent memory module and its impact on high-performance scientific applications
Memory and I/O performance bottlenecks in supercomputing simulations are two key challenges that must be addressed on the road to Exascale. The new byte-addressable persistent non-volatile memory technology from Intel, DCPMM, promises to b…
Evaluating the Arm Ecosystem for High Performance Computing
In recent years, Arm-based processors have arrived on the HPC scene, offering an alternative to the existing status quo, which was largely dominated by x86 processors. In this paper, we evaluate the Arm ecosystem, both the hardware offering a…
Leveraging MPI RMA to optimize halo‐swapping communications in MONC on Cray machines
Remote Memory Access (RMA), also known as single‐sided communications, provides a way for reading and writing directly into the memory of other processes without having to issue explicit message passing style communication calls. P…
Architectures for High Performance Computing and Data Systems using Byte-Addressable Persistent Memory
Non-volatile, byte addressable, memory technology with performance close to main memory promises to revolutionise computing systems in the near future. Such memory technology provides the potential for extremely large memory regions (i.e. …
Exploiting the Performance Benefits of Storage Class Memory for HPC and HPDA Workflows
Byte-addressable storage class memory (SCM) is an upcoming technology that will transform the memory and storage hierarchy of HPC systems by dramatically reducing the latency gap between DRAM and persistent storage. In this paper, we discu…
Met Office NERC Cloud model (MONC)
This is the source code for the Met Office NERC Cloud model (MONC) which is an atmospheric model used to study clouds and turbulent flows. It has been shown to scale to over 32768 compute cores and includes both the simulation (computation…
Progressive load balancing of asynchronous algorithms
Massively parallel supercomputers are susceptible to variable performance due to factors such as differences in chip manufacturing, heat management and network congestion. As a result, the same code with the same input can have a different…
In situ data analytics for highly scalable cloud modelling on Cray machines
MONC is a highly scalable modelling tool for the investigation of atmospheric flows, turbulence, and cloud microphysics. Typical simulations produce very large amounts of raw data, which must then be analysed for scientific investi…
On the trade-offs between energy to solution and runtime for real-world CFD test-cases
This paper provides an insight into the optimisation of runtime and energy performance for two widely used CFD codes. Energy efficiency is a hot topic in HPC, and methods to reduce the energy consumption of large machines are an active area…