Duration : 5 Days | Classes : 4 | Modules : 0 | Labs : 0 | Videos : 5
This course begins with the motivation for big data and explains the components of Hadoop, including the Hadoop cluster and distributed file systems. It covers how to use HDFS distributed storage, what MapReduce is, and the role it plays in distributed processing. Participants write a word count program in Java and explore MapReduce further in Python.
Audience : Hadoop Developer
Prerequisites : basic computer skills, basic knowledge of programming
Course outline :
Module 01 - Big Data: motivation, Hadoop components
Module 02 - Using Hadoop HDFS distributed storage
Module 03 - Distributed processing with MapReduce
Module 04 - Word count Java program in MapReduce
Module 05 - A better word count program
Module 06 - MapReduce and other languages (a simple example in Python)
Hive -
Module 01 - Hive: Basic concepts
Module 02 - Hive: Joins
Module 03 - Hive: Partitions
Module 04 - Hive: Bucketing and external tables
Module 05 - Hive: Data pipeline version 1 (a basic data warehouse)
Module 06 - Hive: Data pipeline upgrade (builds on the previous case)
Pig -
Module 01 - Pig: Basic concepts and comparison with Hive
Module 02 - Pig: Programming language
Module 03 - Pig: Programming language (continuation)
Module 04 - Pig: Reading data from Hive tables
Module 05 - Pig: Ad hoc data analytics with Pig
Module 06 - Pig: Re-implementing the data pipeline using Pig
Classes
Class name | Start date | End date | Seats | Labs
Hadoop and Big Data Fundamentals | 16 Feb 2015 | 20 Feb 2015 | 20 | 40
Hadoop and Big Data Fundamentals (Live Virtual) | 13 Apr 2015 | 17 Apr 2015 | 20 | 20
Hadoop and Big Data Fundamentals (Live Virtual) | 15 Jun 2015 | 19 Jun 2015 | 20 | 20
Hadoop and Big Data Fundamentals (Live Virtual) | 13 Jul 2015 | 17 Jul 2015 | 20 | 20