Big Data 101 – Demystifying the enabling technologies

Although the Big Data hype has only recently reached mainstream media and popular culture, Big Data has been on the technology radar for some time now. It’s one of those fashionable terms that means different things to different people.

In the mid-90s, we would look only to a mainframe for any kind of large-scale computing job. That changed with the arrival of distributed computing over fast networks, which let us use several simpler computers to solve a complex job. It became the de facto method to employ a large number of “garden variety” computers working together as a group, breaking massive problems down into smaller, bite-size chunks and solving them simultaneously.

The advent of cloud computing gave this approach a boost, because one no longer had to own thousands of computers to crunch large data sets. Instead, one could lease the machines as and when needed. MapReduce is one technology that features heavily in this space: it is an approach to managing and processing large data sets in a distributed manner.
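To make the idea concrete, here is a minimal sketch of the MapReduce pattern applied to the classic word-count problem. It runs in a single Python process purely for illustration; a real framework would spread the map and reduce steps across many machines. The function names (map_words, shuffle, reduce_counts, word_count) and the sample text are made up for this example and do not come from any particular MapReduce library.

```python
from collections import defaultdict


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(chunks):
    """Run map, shuffle and reduce in sequence over a list of text chunks.

    In a distributed setting each chunk could be mapped on a different
    machine; here the chunks are simply processed one after another.
    """
    mapped = []
    for chunk in chunks:
        mapped.extend(map_words(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: two small chunks standing in for a large data set.
    chunks = ["big data is big", "data is everywhere"]
    print(word_count(chunks))
```

Each chunk is mapped independently of the others, which is what allows the work to be handed out to many “garden variety” machines at once; only the shuffle and reduce steps need to bring the partial results back together.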
