Big Data Hadoop Developer

After completing the course successfully, participants should be able to:

Explain the need for Big Data and list its applications

Demonstrate mastery of HDFS concepts and the MapReduce framework

Use Kafka, Sqoop and Flume to load data into the Hadoop Distributed File System (HDFS)

Run queries using Pig and Hive

Develop and demonstrate programs in Apache Spark

Work with NoSQL databases such as HBase and Cassandra

Discuss and differentiate commercial Hadoop distributions such as Cloudera and Hortonworks

Differentiate between Hadoop 1.0 and Hadoop 2.0

  • Weekend batch: 16 classes of 5 hours each.
  • Weekday batch: 40 classes of 2 hours each.
  • Total: 80 hours.
  • Pre-Training
  • Actual Training
  • Post-Training

SYLLABUS

ARCHITECTURE
  • Introduction to Big Data Technology Stack
  • Introduction to Hadoop and Ecosystem Build
  • Understanding Cluster Setup Activities
  • HDFS Architecture
  • HIVE Architecture
  • PIG Architecture
  • Introduction to NOSQL
  • HBase Architecture
  • Understanding Cloudera Manager and HUE
HADOOP DEVELOPER

MAPREDUCE

  • Introduction to MapReduce
  • MapReduce Engine – JobTracker
  • TaskTracker
  • MapReduce Programming Model
  • Mapper Class
  • Reducer Class
  • Executing MapReduce Jobs
  • MapReduce and Java
  • MapReduce Programs in Java and Eclipse
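As a rough illustration of the MapReduce programming model listed above, the sketch below imitates the map, shuffle and reduce phases for a word count in plain Python. It does not use the Hadoop API; the function names `mapper`, `shuffle`, `reducer` and `run_job` are hypothetical and chosen only for this example. In the course itself the equivalent logic would be written as Mapper and Reducer classes in Java.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def run_job(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(pairs)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data hadoop", "hadoop big"]))
```

On a real cluster the framework performs the shuffle between nodes; here it is a single in-memory dictionary, which is enough to show how keys flow from the mapper to the reducer.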
Hive – The Data Warehouse in Hadoop
  • Concepts of Hive
  • Hive Architecture
  • Metastore
  • Driver
  • Thrift Server
  • Web Interface
  • JDBC / ODBC
  • CLI
  • Introduction to HQL (Hive Query Language)
  • The Hive Data Model
  • Partitions
  • Data types
  • Hive Configuration
  • Sample Hive Queries and commands
PIG
  • Pig Execution Mode
  • Local Mode
  • MapReduce Mode
  • Pig Engine
  • Pig Latin Scripts
  • Interactive Mode
  • Batch Mode
  • Configuring Pig
  • Sample Pig Scripts
HADOOP DATA ANALYTICS
  • Working With Hive: E-Commerce Use Case
  • Working With Pig: Financial Use Case
  • Twitter Use Case: Sentiment Analysis
  • MR Optimization
  • Custom Combiner, Custom Partitioner And Distributed Cache
  • Advanced MapReduce
  • Datatypes in MapReduce
  • Input Formats in MapReduce
  • Output Formats in MapReduce
  • Joins in MapReduce
  • Reduce side join
  • Replicated join
  • Composite join
  • Use cases of MapReduce
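To give a feel for the "Reduce side join" topic listed above, here is a minimal sketch in plain Python, not Hadoop code. Each mapper tags its records with the source they came from, the records are grouped by join key, and the reducer pairs up the two sides. The record layouts (customers as `(customer_id, name)`, orders as `(order_id, customer_id, amount)`) and all function names are assumptions made up for this illustration.

```python
from collections import defaultdict


def map_customers(record):
    """Tag a customer record with "C" and key it by customer id."""
    customer_id, name = record
    return customer_id, ("C", name)


def map_orders(record):
    """Tag an order record with "O" and key it by customer id."""
    order_id, customer_id, amount = record
    return customer_id, ("O", (order_id, amount))


def reduce_join(key, tagged_values):
    """Pair every customer value with every order value sharing the key."""
    names = [value for tag, value in tagged_values if tag == "C"]
    orders = [value for tag, value in tagged_values if tag == "O"]
    return [(key, name, order_id, amount)
            for name in names
            for order_id, amount in orders]


def run_join(customers, orders):
    """Simulate map, shuffle (group by key) and reduce for an inner join."""
    grouped = defaultdict(list)
    for key, value in map(map_customers, customers):
        grouped[key].append(value)
    for key, value in map(map_orders, orders):
        grouped[key].append(value)
    rows = []
    for key in sorted(grouped):
        rows.extend(reduce_join(key, grouped[key]))
    return rows


if __name__ == "__main__":
    customers = [(1, "Asha"), (2, "Ben")]
    orders = [(100, 1, 250), (101, 1, 75), (102, 3, 40)]
    for row in run_join(customers, orders):
        print(row)
```

Keys that appear on only one side produce no output, so this sketch behaves like an inner join. A replicated (map-side) join would instead load the smaller data set into memory in every mapper and skip the shuffle.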
HADOOP DATA INGESTION:

SQOOP, FLUME AND KAFKA – MOVING DATA TO AND FROM HDFS

  • Introduction to Sqoop
  • Sqoop Connectors to RDBMS
  • Importing Data into Hive with Sqoop
  • Sqoop Commands
  • Introduction to Flume
  • Flume Data Model
  • Flume Examples
  • Use Cases of Sqoop and Flume
  • Introduction to Kafka
  • Basic operations
  • Consumer Group Examples
  • Use Cases of Apache Kafka
NOSQL DATABASES
  • Introduction to NoSQL Databases
  • History of NoSQL
  • RDBMS vs NoSQL Comparison
  • Popular NoSQL Databases
HBase – Distributed Columnar Database
  • NoSQL Movement
  • HBase Architecture
  • Region Servers
  • HBase Storage
  • Introduction to Zookeeper
  • Entities of Zookeeper
  • Leader
  • Follower
  • Observer
  • Zookeeper Data Model
  • Configuring HBase and Zookeeper
  • HBase Examples
  • HBase Use Cases
APACHE SPARK
  • Introduction to Apache Spark
  • Apache Spark RDD
  • Core Programming
PROJECT USE CASES:
  • Entertainment Use Case
  • Twitter Use Case
  • Health Care Use Case
  • E-Commerce Use Case
  • Bio-Informatics Use Case