Sameh Abdulah
Scalable Asynchronous Federated Modeling for Spatial Data
Spatial data are central to applications such as environmental monitoring and urban planning, but are often distributed across devices where privacy and communication constraints limit direct sharing. Federated modeling offers a practical …
High-Performance Statistical Computing (HPSC): Challenges, Opportunities, and Future Directions
We recognize the emergence of a statistical computing community focused on working with large computing platforms and producing software and applications that exemplify high-performance statistical computing (HPSC). The statistical computi…
RCOMPSs: A Scalable Runtime System for R Code Execution on Manycore Systems
R has become a cornerstone of scientific and statistical computing due to its extensive package ecosystem, expressive syntax, and strong support for reproducible analysis. However, as data sizes and computational demands grow, native R par…
Scaled Block Vecchia Approximation for High-Dimensional Gaussian Process Emulation on GPUs
Emulating computationally intensive scientific simulations is crucial for enabling uncertainty quantification, optimization, and informed decision-making at scale. Gaussian Processes (GPs) offer a flexible and data-efficient foundation for…
Decentralized Inference for Spatial Data Using Low-Rank Models
Advancements in information technology have enabled the creation of massive spatial datasets, driving the need for scalable and efficient computational methodologies. While offering viable solutions, centralized frameworks are limited by v…
GPU-Accelerated Modified Bessel Function of the Second Kind for Gaussian Processes
Modified Bessel functions of the second kind are widely used in physics, engineering, spatial statistics, and machine learning. Since contemporary scientific applications, including machine learning, rely on GPUs for acceleration, providin…
Accelerating Mixed-Precision Out-of-Core Cholesky Factorization with Static Task Scheduling
This paper explores the performance optimization of out-of-core (OOC) Cholesky factorization on shared-memory systems equipped with multiple GPUs. We employ fine-grained computational tasks to expose concurrency while creating opportunitie…
Block Vecchia Approximation for Scalable and Efficient Gaussian Process Computations
Gaussian Processes (GPs) are vital for modeling and predicting irregularly-spaced, large geospatial datasets. However, their computations often pose significant challenges in large-scale applications. One popular method to approximate GPs …
A Novel Approach to Translate Structural Aggregation Queries to MapReduce Code
Data management applications are growing and require more attention, especially in the "big data" era. Thus, supporting such applications with novel and efficient algorithms that achieve higher performance is critical. Array database manag…
Boosting Earth System Model Outputs And Saving PetaBytes in their Storage Using Exascale Climate Emulators
We present the design and scalable implementation of an exascale climate emulator for addressing the escalating computational and storage requirements of high-resolution Earth System Model simulations. We utilize the spherical harmonic tra…
MPCR: Multi- and Mixed-Precision Computations Package in R
Computational statistics has traditionally utilized double-precision (64-bit) data structures and full-precision operations, resulting in higher-than-necessary accuracy for certain applications. Recently, there has been a growing interest …
Parallel Approximations for High-Dimensional Multivariate Normal Probability Computation in Confidence Region Detection Applications
Addressing the statistical challenge of computing the multivariate normal (MVN) probability in high dimensions holds significant potential for enhancing various applications. One common way to compute high-dimensional MVN probabilities is …
GPU-Accelerated Vecchia Approximations of Gaussian Processes for Geospatial Data using Batched Matrix Computations
Gaussian processes (GPs) are commonly used for geospatial analysis, but they suffer from high computational complexity when dealing with massive data. For instance, the log-likelihood function required in estimating the statistical model p…
On the Impact of Spatial Covariance Matrix Ordering on Tile Low-Rank Estimation of Matérn Parameters
Spatial statistical modeling and prediction involve generating and manipulating an n × n symmetric positive definite covariance matrix, where n denotes the number of spatial locations. However, when n is large, processing this covariance mat…
Portability and Scalability Evaluation of Large-Scale Statistical Modeling and Prediction Software through HPC-Ready Containers
HPC-based applications often have complex workflows with many software dependencies that hinder their portability on contemporary HPC architectures. In addition, these applications often require extraordinary efforts to deploy and execute …
Which Parameterization of the Matérn Covariance Function?
The Matérn family of covariance functions is currently the most popularly used model in spatial statistics, geostatistics, and machine learning to specify the correlation between two geographical locations based on spatial distance. Compar…
Efficient Large-scale Nonstationary Spatial Covariance Function Estimation Using Convolutional Neural Networks
Spatial processes observed in various fields, such as climate and environmental science, often occur on a large scale and demonstrate spatial nonstationarity. Fitting a Gaussian process with a nonstationary Matérn covariance is challenging…
Front Cover Image, Volume 34, Number 1, February 2023
The cover image is based on the Research Article Large-scale environmental data science with ExaGeoStatR by Sameh Abdulah et al., https://doi.org/10.1002/env.2770. Image Credit: Xavier Pita, KAUST.
A Novel Approach to Translate Structural Aggregation Queries to MapReduce Code
Data management applications are growing rapidly and require more attention, especially in the big data era. Thus, it is critical to support these applications with novel and efficient algorithms that satisfy higher performan…
The Second Competition on Spatial Statistics for Large Datasets
In the last few decades, the size of spatial and spatio-temporal datasets in many research areas has rapidly increased with the development of data collection technologies. As a result, classical statistical methods in spatial statistics a…
Parallel space-time likelihood optimization for air pollution prediction on large-scale systems
Gaussian geostatistical space-time modeling is an effective tool for performing statistical inference of field data evolving in space and time, generalizing spatial modeling alone at the cost of the greater complexity of operations and sto…
Editorial: Large-Scale Spatial Data Science
Publisher: School of Statistics, Renmin University of China, Journal: Journal of Data Science, Title: Editorial - Large-Scale Spatial Data Science, Authors: Sameh Abdulah, Stefano Castruccio, Marc G. Genton
Active clustering data streams with affinity propagation
Most existing applications have a large number of evolving data streams. Clustering data streams is still a critical problem for these applications, as the data evolve and change over time. Most existing algorithms are unsupervised l…