
Hadoop and MapReduce Tutorial | Introduction to Apache Hadoop

Apache Hadoop

Apache Hadoop is an open-source software framework for storage and large-scale processing of data-sets on clusters of commodity hardware. Hadoop is an Apache top-level project being built and used by a global community of contributors and users. It is licensed under the Apache License 2.0.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
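The "simple programming model" referred to above is MapReduce. The following is a minimal, in-memory sketch of that model in Python; it is not the Hadoop API (real Hadoop jobs implement Mapper and Reducer classes, typically in Java, and run across a cluster), but it illustrates the same map, shuffle/group, and reduce flow on a single machine using a word-count example.

```python
# Minimal single-machine sketch of the MapReduce programming model.
# Illustrative only -- this is not the Hadoop API.
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values for each key.
    return {word: sum(counts) for word, counts in groups.items()}

documents = ["Hadoop stores data", "Hadoop processes data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

On a real cluster, the map tasks run in parallel on the nodes holding each data block, and the framework handles the shuffle and any task failures automatically.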

Advantages & Disadvantages

Advantages of Hadoop

Scalability: Hadoop scales from a single server to thousands of machines, each offering local computation and storage.

Fault tolerance: failures are detected and handled at the application layer, so the service remains available even when individual machines fail.

Cost: Hadoop runs on clusters of commodity hardware rather than expensive specialized servers.

Disadvantages of Hadoop

The following are commonly cited weaknesses of the Hadoop framework:

MapReduce is a batch-oriented architecture, so it is inefficient for workloads that need real-time or low-latency data access.

HDFS is optimized for large files; storing very many small files inflates NameNode memory usage and degrades performance.

