In the mid-90s we would only look to a mainframe for any kind of large scale computing jobs. That changed when we got distributed computing over fast networks, whereby we could leverage several simpler computers to solve a complex job. It became the de facto method to employ large number of “garden variety” computers that would work together in a group to break down massive problems into smaller bite size chunks and solve them simultaneously.
Advent of cloud computing gave this approach a boost because one did not have to own thousands of computers to crunch large data sets. Instead, one could lease the machines as and when needed. MapReduce is one technology that features heavily in this space. MapReduce is an approach to manage and process large datasets in a distributed manner.