Abstract:
The MapReduce framework efficiently executes tasks that are easily divisible. However, optimally scheduling a group of such tasks is itself an NP-hard problem. This article describes a mathematical model of the MapReduce framework in two distributed environments: with and without communication latency. For the environment without communication latency, an optimal scheduling plan for MapReduce tasks is proposed. For the environment with communication latency, the author estimates the range of environment parameter values over which the proposed plan remains efficient.
Keywords: scheduling in distributed environment, scheduling in the presence of communication, optimal scheduling, MapReduce framework.