BIG DATA HADOOP ADMINISTRATION
A Big Data Hadoop Admin Training course for Administrators and aspiring Administrators. Coverage extends to Big Data on the Cloud and DevOps for Hadoop and the Cloud.
COURSE OBJECTIVE:
Provide an in-depth understanding of the Architecture and of implementing a Hadoop cluster using CDH and HDP. This course covers Installation, Configuration, Securing and Monitoring of a Hadoop Cluster with all ecosystem components to set up production-level Big Data management. We also discuss the Data Lake concept with real-time Analytics use cases. The course is oriented towards Certification and also provides real-time implementation to prepare participants to work in production.
LESSON PLANS
SESSION 1: UNDERSTANDING BIG DATA FROM ADMINISTRATION PERSPECTIVE
Session Goal: Introduction to the topology of common existing limitations when dealing with a large amount of data along with the common solutions. The goal here is to lay down the foundation of a heterogeneous architecture that will be described in the following Sessions.
SESSION 2: HADOOP CORE COMPONENTS: PLANNING, INSTALLATION & CONFIGURATION
Session Goal: Introduction to the core components of Hadoop: HDFS, MapReduce and YARN. Participants will also learn Planning, Installation and Configuration in various scenarios.
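As a taste of the configuration work this session covers, a minimal setup typically defines the default filesystem URI and the block replication factor. The property names below are standard Hadoop configuration; the hostname and port are illustrative placeholders:

```xml
<!-- core-site.xml: default filesystem URI (hostname is a placeholder) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
</configuration>

<!-- hdfs-site.xml: how many copies of each block HDFS keeps -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>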
SESSION 3: MANAGING & TROUBLESHOOTING HDFS
Session Goal: HDFS is a major component of Hadoop that needs to be understood properly. You will learn various HDFS features such as Replication, NameNode and DataNode. Participants will also learn the various ways of managing HDFS when it is in bad health, and of ensuring a balanced HDFS is retained in production.
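By way of illustration, day-to-day HDFS health management revolves around a few standard commands (the path and threshold below are examples; these commands require a running cluster):

```shell
# Report overall capacity and live/dead DataNodes
hdfs dfsadmin -report

# Check the filesystem for missing, corrupt or under-replicated blocks
hdfs fsck / -files -blocks

# Rebalance data across DataNodes; stop once utilisation is within 10%
hdfs balancer -threshold 10
```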
SESSION 4: UNDERSTANDING LOAD BALANCING & FAILOVER
Session Goal: Understand the limitations of Hadoop version 1.0 and the High Availability (HA) capability added in version 2.0. Participants will learn to plan and configure NameNode HA using various methods, along with Manual and Automated Failover Configuration using ZooKeeper.
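As a sketch of what NameNode HA configuration looks like, the hdfs-site.xml fragment below names a nameservice with two NameNodes and enables automatic failover via ZooKeeper. The property names are standard Hadoop HA configuration; the nameservice name, hostnames and ports are illustrative:

```xml
<!-- hdfs-site.xml (fragment): two NameNodes behind one nameservice -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2:8020</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml (fragment): ZooKeeper quorum used for failover -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>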
SESSION 5: MANAGING JOBS (MAPREDUCE & YARN)
Session Goal: Leverage the Data Processing capabilities of MapReduce and YARN. Learn to use MapReduce components such as JobTracker and TaskTracker, and YARN components such as ResourceManager, NodeManager, Containers and ApplicationMaster.
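As an example of the job-management workflow, an administrator typically inspects and controls applications through the `yarn` CLI (the application ID below is illustrative; these commands require a running cluster):

```shell
# List applications currently known to the ResourceManager
yarn application -list

# Show NodeManager status across the cluster
yarn node -list

# Kill a misbehaving application by its ID (ID shown is a placeholder)
yarn application -kill application_1700000000000_0001
```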
SESSION 6: SECURITY APPLICATION: AUTHORISATION & AUTHENTICATION
Session Goal: Securing HDFS and Jobs is one of the important tasks of an Administrator. This session covers configuring and applying Authentication and Authorisation using frameworks such as Kerberos and Ranger.
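To give a flavour of Kerberos administration: once `hadoop.security.authentication` is set to `kerberos` in core-site.xml, users and services authenticate via tickets. The principal and realm below are illustrative placeholders; these commands require a configured KDC:

```shell
# Obtain a Kerberos ticket for a principal (realm is a placeholder)
kinit hdfs-admin@EXAMPLE.COM

# Inspect the current ticket cache and expiry times
klist
```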
SESSION 7: HADOOP ECOSYSTEM CONFIGURATION
Session Goal:
You may already know by now that Hadoop provides the Infrastructure to manage Big Data. Numerous products are designed to utilise Hadoop Infrastructure and to provide simplified and enhanced capabilities for Data Processing and Management. Here you will learn, as Administrators, to set up and use ecosystem products such as Sqoop, Hive, Pig, ZooKeeper, HBase, Flume, Mahout and Spark.
SESSION 8: NETWORK CONSIDERATIONS & TOPOLOGIES: RACK CONFIGURATION
Session Goal: The most important aspect of a Distributed System is the Network. Hadoop, being a Distributed Architecture, relies heavily on the Network and its reliability. Here you'll learn about the various Network Architectures and Considerations that you as an Administrator have to focus on to get a reliable Hadoop Cluster.
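Rack awareness is the practical centrepiece of this session: Hadoop learns the cluster topology from a script configured via the standard `net.topology.script.file.name` property, which receives one or more IPs or hostnames and must print one rack path per argument. A minimal sketch, with illustrative subnets and rack names:

```shell
#!/bin/bash
# Hypothetical rack-topology script; subnets and rack names are placeholders.
# Hadoop invokes it with one or more IPs/hostnames and expects one rack
# path per argument, in order.
resolve_rack() {
  case "$1" in
    10.1.1.*) echo "/dc1/rack1" ;;
    10.1.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
}

for node in "$@"; do
  resolve_rack "$node"
done
```

Nodes that the script cannot classify fall into `/default-rack`, which is also Hadoop's behaviour when no script is configured at all.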
SESSION 9: SETTING UP MONITORING SYSTEMS
Session Goal: Monitoring is a continuous task. You'll learn and understand Monitoring Parameters and set up Monitoring Agents such as Ganglia and Nagios.
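As a small illustration of the Nagios side, a basic availability check can simply probe the NameNode RPC port with the standard `check_tcp` plugin. The host name below is a placeholder, and the port assumes the common default of 8020:

```
# Hypothetical Nagios service definition (host name is a placeholder)
define service {
    use                     generic-service
    host_name               namenode-host
    service_description     HDFS NameNode RPC port
    check_command           check_tcp!8020
}
```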
SESSION 10: TUNING HADOOP CLUSTER & MAINTAINING QOS
Session Goal: You'll learn to fine-tune a Hadoop Cluster to meet QoS specifications and requirements. You'll also learn about common issues and the Action Plans necessary to resolve them.
SESSION 11: EVALUATING HADOOP DISTRIBUTIONS (CDH, HDP & PIVOTAL)
Session Goal: As you are already aware, Hadoop is distributed in various flavours. Participants will learn to evaluate the capabilities of popular distributions such as CDH, HDP and Pivotal HD.
SESSION 12: CLOUD & BIG DATA INTEGRATION
Session Goal: The Cloud today is considered a great alternative for designing Architectures because of the elasticity and resource management capabilities it offers. Here you'll learn to set up and evaluate Big Data on Cloud Infrastructure.
SESSION 13: HADOOP & DEVOPS
Session Goal: Here you'll learn the use of DevOps practices to manage Hadoop on the Cloud.
CASE STUDY AND PROJECTS
Case studies are an integral part of the training. As part of this course we will ensure you implement Real-time Case Studies in various domains, which include:
TRAINING FEATURES
1) Extensive Real-time Live Examples, Projects & POCs for improved practical competency, deployment readiness and implementation.
2) Custom Lab, Software and Environment provided with Real-time Project Simulation.
3) Recorded Videos complemented with the corresponding lecture PPTs, materials & lab guides (provided as MP4, PDF and PPT files for offline access as well).
4) Certification and Job-Interview Counselling & Coaching after every training.