Tuesday, 30 August 2016

Hadoop MapReduce

in: MapReduce

MapReduce

One line Definition :

To process huge amount of data.

Introduction

Its a programming model to process huge amount of data.
Huge data means, exists in terabytes or petabytes.
Data can be any form structured and unstructured.. etc.
It's basically difficult and hard to handle the huge amount of data.
We can achieve this with MapReduce.

Benefits

Simplicity

Developers have a choice to develop the applications in their native technical languages like Java, C++ or Python, and MapReduce jobs are easy to run.

Scalability

MapReduce can process the terabytes and petabytes of data which stored in HDFS on cluster mode.

Speed

MapReduce process the data in an speed way becasue of its holding the feature of Parallel processing.

Recovery

Its a BIG challenge to keep the data from failures.
MapReduce takes care of data failures or loses.
MapReduce clones the data in machines to overcome the data failure.

Thanks for your time.

- Nireekshan