Block
- Internally, a file is split into one or more blocks.
- In Hadoop 1.0 the default block size is 64MB.
- In Hadoop 2.0 the default block size is 128MB.
- We can customize this size at the configuration level.
- These split blocks are stored on Data nodes.
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg151u-ir-C-ewX59iC3zH7enGyZN86e5CroGXNXb4SWIUPPtu5qm0uPcazN5DiGnYjI9xSQnhm8cYaVw74RKMr_lHdsqHLWiNKmivFjdHBEUWEPlceF2QRK8pp8tS3LciolA6D7hLed0W5/s640/block.png)
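As a rough illustration (not actual Hadoop code), splitting a file into fixed-size blocks works like this, assuming the Hadoop 2.0 default of 128MB:

```python
# Illustrative sketch only: how a file's size maps to HDFS blocks.
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the Hadoop 2.0 default

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks a file would occupy."""
    blocks = []
    remaining = file_size
    while remaining > 0:
        # Every block is full-size except possibly the last one.
        blocks.append(min(block_size, remaining))
        remaining -= block_size
    return blocks

# A 300 MB file becomes two full 128 MB blocks plus one 44 MB block.
sizes = split_into_blocks(300 * 1024 * 1024)
print([s // (1024 * 1024) for s in sizes])  # [128, 128, 44]
```

Note that the last block only occupies as much space as the remaining data, so a 300MB file uses 300MB of raw storage (before replication), not 3 x 128MB.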
- Replicas of each block are stored on different Data nodes to achieve fault tolerance.
- Hadoop maintains a replication factor; by default the replication factor is 3.
- We can customize this value:
- At the cluster level
- At file creation
- At a later stage, for an already stored file
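As a sketch of the cluster-level option, both the block size and the replication factor can be set in hdfs-site.xml (property names shown are the Hadoop 2.x ones):

```xml
<!-- hdfs-site.xml: cluster-wide defaults (Hadoop 2.x property names) -->
<configuration>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value> <!-- 128 MB -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- default replication factor -->
  </property>
</configuration>
```

For an already stored file, the replication factor can be changed with the `hdfs dfs -setrep` shell command, for example `hdfs dfs -setrep -w 2 /path/to/file` (path shown is hypothetical).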
Thanks for your time.
Nireekshan