Learning Resources
Introduction
- History of the Hadoop project
- Need for Hadoop and its requirements
- Components of the Hadoop project
HDFS
- Basics (Blocks, Namenodes and Datanodes)
- Interfaces and the data read and write process
- HAR files and distcp
- Command Line Interface
- SequenceFile and MapFile, Checksumming, codecs and Writables
MapReduce
- Basics and Configuration API
- Combiner functions and streaming
- Counters, sorting, joins and side data
- Input formats (Text, binary, database, multiple)
- Output formats (Text, binary, database, multiple)
- Submission and initialization of jobs and tasks
- JobTracker and TaskTracker classes
- Scheduling, Shuffle and sort
- Environment and side effects
- Debugging and Optimizing
Cluster
- Installation
- Configuration
- Testing and benchmarking
Administration
- dfsadmin, fsck and balancer
- log4j logging, log levels, stack trace and metrics
- Backup and filesystem checks
- Addition and removal of nodes
Pig
- Installation, local and Hadoop modes
- Grunt, script and embedded execution
- Pig Latin
- UDFs and data processing operators
HBase
- Need and evolution
- Installation
- Clients
ZooKeeper
- Installation
- Group membership and management
- Znodes
- API, watch triggers and ACLs
- States, consistency and sessions
- Implementation
Apply for Certification
https://www.vskills.in/certification/Certified-Cassandra-Professional