DFG implementation on multi GPU cluster with computation-communication overlap - AGPIG
Conference paper. Year: 2011

DFG implementation on multi GPU cluster with computation-communication overlap

Abstract

Nowadays, computers embed many CPUs and at least one GPU. Workstations can host several GPU cards, which are well suited to scientific and engineering computations. Such computers are linked through high-bandwidth networks to form clusters for HPC. These machines provide highly parallel multicore architectures while being cost-effective. Moreover, they significantly reduce dissipated power and space requirements compared to classical HPC clusters. NVIDIA and ATI recently announced their Tesla and FireStream boards, each delivering more than 500 gigaflops in double precision while dissipating less than 250 W per single-GPU board. However, the real challenge is to achieve the highest performance on multi-GPU architectures. The programmer has to design architecture-specific code that covers GPU communications and memory management, task scheduling, and synchronization. A high-level abstract programming model is therefore required to express all these operations. In this paper, we propose a design flow that allows an efficient implementation of a DSP application, specified as a data flow graph (DFG), on a multi-GPU computer cluster. We focus particularly on implementing communications effectively by automating the computation-communication overlap. After presenting related work, we show the benefit of computation-communication overlap on multi-GPU architectures. We then present our design flow, which allows an efficient implementation of an algorithm expressed as a DFG on a multi-GPU architecture. Finally, we apply it to a real-world 3D granulometry application developed for materials research.
Main file
DFG_implementation_on_multi_GPU_cluster_with_computation-communication_overlap_.pdf (863.7 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00657536, version 1 (6 January 2012)

Identifiers

  • HAL Id: hal-00657536, version 1

Cite

Sylvain Huet, Vincent Boulos, Vincent Fristot, Luc Salvo. DFG implementation on multi GPU cluster with computation-communication overlap. DASIP 2011 - Conference on Design and Architectures for Signal and Image Processing, Nov 2011, Tampere, Finland. pp.1-8. ⟨hal-00657536⟩
223 views
275 downloads
