Abstract:
A dataflow processor can execute up to 16 instructions per cycle, compared with 4 to 6 for the best von Neumann processors.
Simulation of the vector dataflow processor (VDP) showed that the vector performance of a single core can be raised to 256 flops per clock, and that a modern manufacturing process allows up to 4 such cores to be implemented on a single die. This paper presents simulation results for a matrix multiplication program and a 2D stencil on a dual-core VDP with shared memory.
The results show that the matrix multiplication program scales well on the VDP, while the performance of the 2D stencil is limited by shared memory bandwidth. (In Russian).
Key words and phrases: supercomputer, vector processor, dataflow architecture, performance evaluation, matrix multiplication, 2D stencil.