Learning Resources
 

 

Introduction
  • History of the Hadoop project
  • Need for Hadoop
  • Components of the Hadoop project
HDFS
  • Basics (Blocks, Namenodes and Datanodes)
  • Interfaces and the data read and write process
  • HAR files and distcp
  • Command Line Interface
  • SequenceFile and MapFile, checksumming, codecs and Writables
MapReduce
  • Basics and Configuration API
  • Combiner functions and streaming
  • Counters, sorting, joins and side data
  • Input formats (Text, binary, database, multiple)
  • Output formats (Text, binary, database, multiple)
  • Submission and initialization of jobs and tasks
  • JobTracker and TaskTracker classes
  • Scheduling, shuffle and sort
  • Environment and side effects
  • Debugging and Optimizing
Cluster
  • Installation
  • Configuration
  • Testing and benchmarking
Administration
  • dfsadmin, fsck and balancer
  • log4j logging, log levels, stack trace and metrics
  • Backup and filesystem checks
  • Adding and removing nodes
Pig
  • Installation, local and Hadoop modes
  • Grunt, script and embedded execution
  • Pig Latin
  • UDFs and data processing operators
HBase
  • Need and evolution
  • Installation
  • Clients
ZooKeeper
  • Installation
  • Group membership and management
  • Znodes
  • API, triggers and ACLs
  • States, consistency and sessions
  • Implementation
