Publication information

Acceleration of dRMSD Calculation and Efficient Usage of GPU Caches

Title in Czech Akcelerace dRMSD výpočtu a efektivní užití GPU cache
Authors

FILIPOVIČ Jiří, PLHÁK Jan, STŘELÁK David

Year of publication 2015
Type Article in Proceedings
Conference Proceedings of IEEE International Conference on High Performance Computing & Simulation
MU Faculty or unit

Faculty of Informatics

Citation
DOI http://dx.doi.org/10.1109/HPCSim.2015.7237020
Field Informatics
Keywords RMSD; GPU; code optimization; cache
Description In this paper, we introduce GPU acceleration of the dRMSD algorithm, which is used to compare different structures of a molecule. Compared to a multithreaded CPU implementation, we reach a 13.4x speedup in clustering and a 62.7x speedup in 1:1 dRMSD computation using a mid-range GPU. The dRMSD computation exhibits strong memory locality and is therefore compute-bound. Alongside a conservative implementation using shared memory, we implement variants of the algorithm that rely on GPU caches to maintain memory locality. Our cache-based implementation reaches 96.5 % and 91.6 % of the shared memory performance on Fermi and Maxwell, respectively. We identify several performance pitfalls related to cache blocking in compute-bound codes and suggest optimization techniques to improve performance.
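For orientation, the quantity being accelerated can be sketched on the CPU as follows. This is a minimal reference sketch, not the GPU implementation described in the paper. It assumes the common definition of dRMSD as the root mean square of the differences between corresponding intra-molecular atom-pair distances, normalized by the number of pairs N(N-1)/2; the normalization and the function names `pairwise_distances` and `drmsd` are assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return Euclidean distances for all atom pairs i < j, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(conf_a, conf_b):
    """Distance RMSD between two conformations of the same molecule.

    Assumed definition: sqrt(mean over pairs i < j of (dA_ij - dB_ij)^2).
    Each conformation is a sequence of (x, y, z) tuples in the same atom order.
    """
    if len(conf_a) != len(conf_b):
        raise ValueError("conformations must have the same number of atoms")
    if len(conf_a) < 2:
        raise ValueError("at least two atoms are required")
    dist_a = pairwise_distances(conf_a)
    dist_b = pairwise_distances(conf_b)
    squared_sum = sum((da - db) ** 2 for da, db in zip(dist_a, dist_b))
    return math.sqrt(squared_sum / len(dist_a))


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    b = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in a]  # same shape, scaled
    print(drmsd(a, a), drmsd(a, b))
```

In this formulation the coordinates of N atoms are read to produce N(N-1)/2 pair distances, so each coordinate is reused many times. That reuse is one plausible reading of the memory locality the description mentions.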
Related projects: