Hadoop Training Course

Hadoop Application Development

Hadoop, formally known as Apache Hadoop, is an open-source software framework that supports large-scale, data-intensive distributed applications and is released under a free license. It can run on a single machine, but its real strength shows when it scales across hundreds or thousands of machines, each with its own processor cores. Hadoop's design is derived from Google's papers on the Google File System and MapReduce.
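To make the MapReduce idea concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_words, shuffle, reduce_counts, run_word_count) are illustrative assumptions, and a real Hadoop job would spread the map and reduce steps across many machines.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word, counts):
    """Reduce step: combine all values for one key into a single total."""
    return word, sum(counts)


def run_word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(pairs)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(run_word_count(sample))
```

Each step depends only on its own input, which is what lets a framework such as Hadoop run many map and reduce tasks in parallel on separate machines.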

What Scope Does Hadoop Training Hold for Candidates?

When you browse job sites for prospective opportunities, you will come across many vacancies for Hadoop professionals. Companies are constantly looking for certified Hadoop professionals with adequate skills and experience. With demand increasing every day, large companies are on the lookout for suitable candidates to manage the huge volumes of data in their firms. You could be one of them if you have suitable Hadoop training and certification. Major employers such as Yahoo, Google, LinkedIn, Cisco, IBM, eBay, Facebook, Amazon and more hire qualified Hadoop professionals as developers, Hadoop administrators, data scientists, Hadoop engineers, Hadoop architects, Hadoop consultants and more.

Requirements for Hadoop Courses

Passionate candidates with basic internet and computer skills can apply for the basic-level course. For the advanced-level Hadoop courses, candidates with knowledge of Linux systems administration and management, Java, and other programming or scripting languages such as Python, C, C++ and PHP are preferred.

Benefits Offered by Magneto Academy

Printed course material and software (Hadoop)
Practical and classroom sessions
Soft skills training
Opportunity to work on a live Hadoop project and get involved in the project phases
Regular preps and practicals
Hands-on lab exercises
Presentation material
Online training by practicing Hadoop experts
Job placements with leading companies

Course Details

Modules Topic Description
Module 1 Introduction and Overview of Hadoop
  • What is Hadoop?
  • History of Hadoop
  • Building Blocks - Hadoop Eco-System
  • Who is behind Hadoop?
  • What Hadoop is good for and what it is not
  • Parallel Computing vs. Distributed Computing
  • How to configure Hadoop on your system
  • NameNode architecture (EditLog, FsImage, location of replicas)
  • Secondary NameNode architecture
  • DataNode architecture
Module 2 Hadoop Distributed FileSystem (HDFS)
  • HDFS Overview and Architecture
  • HDFS Installation
  • HDFS Use Cases
  • Hadoop FileSystem Shell
  • FileSystem Java API
Module 3 HBase - The Hadoop Database
  • HBase Overview and Architecture
  • HBase Installation
  • HBase Shell
  • Java Client API
  • Java Administrative API
  • Filters
  • Scan Caching and Batching
  • Key Design
  • Table Design
Module 4 MapReduce 2.0/YARN
  • MapReduce 2.0 and YARN Overview
  • MapReduce 2.0 and YARN Architecture
  • Installation
  • Input and Output Formats
  • Job Scheduling (FIFO, Fair Scheduler, Capacity Scheduler)
  • HDFS and HBase as Source and Sink
  • Job Configuration
  • Job Submission and Monitoring
  • Anatomy of Job Execution on YARN
  • Distributed Cache
  • Hadoop Streaming
Module 5 Hadoop Developer Tasks
  • Writing a MapReduce program
  • Reading and writing data using Java
  • Hadoop Eclipse integration
  • Mapper in detail
  • Reducer in detail
  • Using Combiners
  • Reducing Intermediate Data with Combiners
  • Writing Partitioners for Better Load Balancing
  • Sorting in HDFS
  • Searching in HDFS
  • Indexing in HDFS
  • Hands-On Exercise
Module 6 Hadoop Administrative Tasks
  • Routine Administrative Procedures
  • Understanding dfsadmin and mradmin
  • Block Scanner, Balancer
  • Health Check & Safe mode
  • DataNode commissioning/decommissioning
  • Monitoring and Debugging on a production cluster
  • NameNode Backup and Recovery
  • Upgrading Hadoop
Module 7 MapReduce Workflows
  • Decomposing Problems into MapReduce Workflow
  • Using JobControl
  • Oozie Introduction and Architecture
  • Oozie Installation
  • Developing, Deploying, and Executing Oozie Workflows
Module 8 Pig
  • Pig Overview
  • Installation
  • Pig Latin
  • Developing Pig Scripts
  • Processing Big Data with Pig
  • Joining data-sets with Pig
Module 9 Inheritance
  • Types of inheritance
  • Advantages of inheritance
  • Single inheritance
  • Multilevel inheritance
  • Hierarchical inheritance
  • Overriding methods
  • Runtime polymorphism
Module 10 Hive
  • Hive Overview
  • Installation
  • HiveQL
