TY - JOUR
T1 - Performance evaluation of OpenMP-based algorithms for handling Kronecker descriptors
AU - Lima, Antonio M.
AU - Netto, Marco A.S.
AU - Webber, Thais
AU - Czekster, Ricardo M.
AU - De Rose, Cesar A.F.
AU - Fernandes, Paulo
PY - 2012/5/1
Y1 - 2012/5/1
AB - Numerical analysis of Markovian models is relevant for performance evaluation and probabilistic analysis of systems’ behavior from several fields in science and engineering. These models can be represented in a compact fashion using Kronecker algebra. The Vector-Descriptor Product (VDP) is the key operation to obtain stationary and transient solutions of models represented by Kronecker-based descriptors. VDP algorithms are usually CPU intensive, requiring alternatives such as data partitioning to produce results in less time. This paper introduces a set of parallel implementations of a hybrid algorithm for handling descriptors and a detailed performance analysis on four real Markovian models. The implementations are based on different scheduling strategies using OpenMP and existing techniques of static and dynamic load balancing, along with data partitioning presented in the literature. The performance evaluation study contains analysis of speed-up, synchronization and scheduling overheads, task mapping policies, and memory affinity. The results presented here provide insights into different implementation choices for an application on shared-memory systems and how this application benefited from this architecture.
KW - Kronecker descriptors
KW - Markovian models
KW - NUMA machines
KW - OpenMP
KW - Parallel algorithms
KW - Performance evaluation
KW - Scientific computing
UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-84859161221&partnerID=MN8TOARS
UR - https://www.sciencedirect.com/science/article/pii/S0743731512000354?via%3Dihub
U2 - 10.1016/j.jpdc.2012.02.001
DO - 10.1016/j.jpdc.2012.02.001
M3 - Article
SN - 0743-7315
VL - 72
SP - 678
EP - 692
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
IS - 5
ER -