LIVE VIRTUAL
Duration: 4 Days
Videos: 0
COST: $995
AUDIENCE: Developers, Solution Architects
Comfortable with programming and Unix commands
Section 1: Introduction to Hadoop
Hadoop history, concepts, ecosystem, distributions, high-level architecture, Hadoop myths, Hadoop challenges, hardware / software
Section 2: HDFS
concepts (horizontal scaling, replication, data locality, rack awareness), architecture, NameNode, Secondary NameNode, DataNode, communications / heartbeats, block manager / balancer, health check / safemode, read / write path, file system abstractions, data integrity, future of HDFS: NameNode HA, Federation, lab exercises
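As a rough illustration of the replication concept listed above, the following sketch estimates how many blocks and how much raw disk space a single file occupies in HDFS. The function name `hdfs_footprint` is hypothetical, and the 64 MB block size and replication factor of 3 are assumed values, not figures stated in this outline; real clusters set both through configuration.

```python
import math

def hdfs_footprint(file_size_mb, block_size_mb=64, replication=3):
    """Estimate the storage footprint of one file (illustrative only).

    block_size_mb and replication are assumed defaults for this sketch.
    """
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times across different DataNodes.
    block_replicas = blocks * replication
    # A partial last block is not padded, so raw usage is size x replication.
    raw_storage_mb = file_size_mb * replication
    return blocks, block_replicas, raw_storage_mb

if __name__ == "__main__":
    blocks, replicas, raw_mb = hdfs_footprint(1000)
    print(blocks, replicas, raw_mb)
```

Under these assumptions, a 1000 MB file is split into 16 blocks, stored as 48 block replicas, and consumes 3000 MB of raw disk across the cluster.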
Section 3: MapReduce
MapReduce concepts, daemons: JobTracker / TaskTracker, phases: driver, mapper, shuffle/sort, reducer, counters, distributed cache, combiners, MapReduce configuration, MR types and formats, sorting, joins (map-side & reduce-side), job schedulers, unit testing, thinking in MapReduce, future of MapReduce (YARN), lab exercises
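The phases listed above (driver, mapper, shuffle/sort, reducer) can be sketched as a single-process word count. This is a minimal in-memory simulation for intuition only, not Hadoop's Java API and not the course's lab code; the function names `mapper`, `shuffle_sort`, `reducer`, and `run_job` are hypothetical.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle_sort(pairs):
    """Shuffle/sort phase: group values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reducer(key, values):
    """Reduce phase: collapse all values for one key into a single count."""
    return key, sum(values)

def run_job(lines):
    """Driver: wire the phases together in order."""
    mapped = [pair for line in lines for pair in mapper(line)]
    return [reducer(key, values) for key, values in shuffle_sort(mapped)]

if __name__ == "__main__":
    for word, count in run_job(["big data big cluster", "data node"]):
        print(word, count)
```

In a real cluster the map and reduce calls run as separate tasks on different nodes, and the shuffle moves data over the network; here everything runs sequentially in one process so the data flow between phases is easy to follow.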
Section 4: Pig
Pig vs Java MapReduce, Pig job flow, Pig Latin language, lab exercises
Section 5: Hive
Hive concepts, architecture, data types, Hive vs SQL, lab exercises
Section 6: HBase
Intro, concepts, architecture, HBase vs RDBMS, read path / write path, schema design, lab exercises