Abstract:
Several features of an efficient implementation of the lattice Boltzmann method (LBM) for hybrid supercomputers with many graphics processing units (GPUs) are discussed. The main strategies for reducing the memory required by the LBM are described. The dependence of the solver's performance on the number of GPUs in use is analyzed on the Lomonosov supercomputer installed at Moscow State University.