MIMD
Exploiting recent SIMD architectural advances for irregular applications
A broad class of applications involve indirect or data-dependent memory accesses and are referred to as irregular applications. Recent developments in SIMD architectures – specifically, the emergence of wider SIMD lanes, combination of SIMD…
GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks
Generative Adversarial Networks (GANs) are one of the most recent deep learning models that generate synthetic data from limited genuine datasets. GANs are on the frontier as further extension of deep learning into many domains (e.g., medi…
Multi-core K-means
Today's microprocessors consist of multiple cores each of which can perform multiple additions, multiplications, or other operations simultaneously in one clock cycle. To maximize performance, two types of parallelism must be applied in a …
A Permutational Boltzmann Machine with Parallel Tempering for Solving Combinatorial Optimization Problems
Boltzmann Machines are recurrent neural networks that have been used extensively in combinatorial optimization due to their simplicity and ease of parallelization. This paper introduces the Permutational Boltzmann Machine, a neural network…
Dual-rotor misalignment fault quantitative identification based on DBN and improved D-S evidence theory
Misalignment faults are the main factor affecting the normal running of a dual-rotor system. Quantitative identification of the misalignment fault is an important way to ensure the safe and stable service of the dual-rotor system, while the id…
Enabling SIMT Execution Model on Homogeneous Multi-Core System
The single-instruction multiple-thread (SIMT) machine has emerged as a primary computing device in high-performance computing, since the SIMT execution paradigm can exploit data-level parallelism effectively. This article explores the SIMT execut…
iPUG for Multiple Graphcore IPUs: Optimizing Performance and Scalability of Parallel Breadth-First Search
This research delves into the potential of Graphcore's IPU AI-optimized architecture in the realm of High-Performance Computing (HPC). Focusing on solving complex challenges like sparse systems and irregular applications, the dissertation …
Nested MIMD-SIMD Parallelization for Heterogeneous Microprocessors
Heterogeneous microprocessors integrate a CPU and GPU on the same chip, providing fast CPU-GPU communication and enabling cores to compute on data “in place.” This permits exploiting a finer granularity of parallelism on the integrated GPU…
MODIFICATION AND PARALLELIZATION OF GENETIC ALGORITHM FOR SYNTHESIS OF ARTIFICIAL NEURAL NETWORKS
Context. The problem of automated synthesis of artificial neural networks for further use in diagnosing, forecasting and pattern recognition is solved. The object of the study was the process of synthesis of ANN using a modified genetic al…
Exploiting Flow Graph of System of ODEs to Accelerate the Simulation of Biologically-Detailed Neural Networks
Exposing parallelism in scientific applications has become a core requirement for efficiently running on modern distributed multicore SIMD compute architectures. The granularity of parallelism that can be attained is a key determinant for …
Stochastic optimization of GeantV code by use of genetic algorithms
GeantV is a complex system based on the interaction of different modules needed for detector simulation, which include transport of particles in fields, physics models simulating their interactions with matter and a geometrical modeler lib…
MIMD Programs Execution Support on SIMD Machines: A Holistic Survey
The Single Instruction Multiple Data (SIMD) architecture, supported by various high-performance computing platforms, efficiently utilizes data-level parallelism. The SIMD model is used in traditional CPUs, dedicated vector systems, and acc…
Derivation of short-term design rainfall intensity from daily rainfall data for urban drainage design using empirical equations in Robe town, Ethiopia
Flooding is a significant impact that regularly affects the majority of cities/towns in developing countries due to inadequate drainage systems that were designed without considering hydrological-hydraulic efficiency caused by design rainf…
Distributed PLC Based on Multicore CPUs - Architecture and Programming
The paper presents a complete approach to distributed control system design including architecture and programming. The system consists of distributed controllers connected with a network. A multiple core controller architecture is propose…
THE PARSEC MACHINE: A NON-NEWTONIAN SUPRA-LINEAR SUPERCOMPUTER
This paper describes how transfer-learning can turn a Beowulf cluster into a full super-computer with supra-linear qualitative acceleration. Harmonic Analysis is used as a real-world example to show the kind of result that can be achieved w…
A Fast Topological Parallel Algorithm for Traversing Large Datasets
This work presents a parallel implementation of a graph-generating algorithm designed to be straightforwardly adapted to traverse large datasets. This new approach has been validated in a correlated scenario known as the word ladder proble…
Efficient parallel execution of genetic algorithms on Epiphany manycore processor
Recent years have seen a growing trend towards the introduction of more advanced manycore processors. On the other hand, there is also growing popularity of cheap, credit-card-sized devices offering more and more advanced features and co…
Task Mapping and Scheduling on RISC-V MIMD Processor With Vector Accelerator Using Model-Based Parallelization
In this paper, we propose a model-based workflow to generate parallel code on a multiple instruction stream, multiple data stream (MIMD) processor with vector accelerator (MIMDV) from a Simulink model. Solving data- and task-parallelism is…
Massively Parallel Graph Drawing and Representation Learning
To fully exploit the performance potential of modern multi-core processors, machine learning and data mining algorithms for big data must be parallelized in multiple ways. Today's CPUs consist of multiple cores, each following an independe…
Analysis and selection of the structure of a multiprocessor computing system according to the performance criterion
Objectives. Analysis of the various architectures of computing systems (CSs) used in recent decades has allowed us to identify the most common structures. One of the key features is the use of mass-produced equipment to create data process…
Parallelization of the Symplectic Massive Body Algorithm (SyMBA) N-body Code
Direct N-body simulations of a large number of particles, especially in the study of planetesimal dynamics and planet formation, have been computationally challenging even with modern machines. This work presents the combination of fully …
Load-Balancing and Performance of a Gridless Particle Simulation on MIMD, SIMD, and Vector Supercomputers
Our charged particle simulation models a relativistic electron beam for which the field solution is local and thus requires no grid. We have implemented the simulation on a CRAY and on two parallel machines, an nCUBE 2 and Connection Machin…
Analysis of High Performance Parallel Computing Instruction Sets
This study explores existing design principles of processor architecture and identifies a future design approach that will help solve existing business problems operable in a scalable environment. We considered the two broa…
On the Efficiency of Algorithms with Multi-level Parallelism
The paper investigates the efficiency of algorithms for solving computational mathematics problems that use a multilevel model of parallel computing on heterogeneous computer systems. A methodology for estimating the acceleration of algori…
New polymorphous computing fabric.
This paper introduces a new polymorphous computing Fabric well suited to DSP and Image Processing and describes its implementation on a Configurable System on a Chip (CSOC). The architecture is highly parameterized and enables customizatio…
HIGH PERFORMANCE PARALLEL SORT FOR SHARED AND DISTRIBUTED MEMORY MIMD
We present four high performance hybrid sorting methods developed for various parallel platforms: shared-memory multiprocessors, distributed multiprocessors, and clusters taking advantage of the existence of both shared and distributed memory. M…
SIMD-MIMD cocktail in a hybrid memory glass
Hybrid memory systems consisting of DRAM and NVRAM offer a great opportunity for column-oriented data systems to persistently store and to efficiently process columnar data completely in main memory. While vectorization (SIMD) of query ope…
Multiple Instruction Multiple Data (MIMD) Implementation on Clusters of Terminals
Today's lifestyle is thoroughly bound up with computers and technology, and we are all part of that crowd. Many scientific and economic fields need large computing power for their solutions, but most solutions are highly econ…
Enhanced π Approximation Through MIMD Parallel Computing: An Efficiency Analysis Utilizing Raspberry Pi
Multiple Instruction Multiple Data is one of the parallel computing architectures in Flynn's taxonomy, where cores can execute independent sets of instructions on independent sets of data. Parallel computing could be used in scientific subf…
(Poly)Logarithmic Time Construction of Round-optimal $n$-Block Broadcast Schedules for Broadcast and irregular Allgather in MPI
We give a fast(er), communication-free, parallel construction of optimal communication schedules that allow broadcasting of $n$ distinct blocks of data from a root processor to all other processors in $1$-ported, $p$-processor networks wit…