BIG DATA HADOOP DEVELOPMENT
The Big Data Hadoop Development course provides in-depth coverage of MapReduce and YARN development, along with the Spark framework and NoSQL. The R language for data analytics and Solr as an enterprise search application are also introduced in this module.
COURSE OBJECTIVE
With the market's growing need for skilled Big Data analysts and developers, we have designed this course to build development skills in Big Data technologies. The course covers the ecosystem products used for ingesting, cleansing and extracting value from Big Data, using best practices for handling Big Data. It is taught through real-time use cases and implementations from various domains, including predictive analytics and Data Science algorithms.
LESSON PLANS
SESSION 1: UNDERSTANDING THE BIG DATA PROBLEM
Session Goal: An introduction to the common limitations encountered when dealing with large amounts of data, along with the common solutions. The goal is to lay the foundation for the heterogeneous architecture described in the following sessions.
SESSION 2: UNDERSTANDING DEVELOPMENT CHALLENGES
Session Goal: IT has become far more important to many organisations than it was before. With the emergence of the Internet of Things and the Industrial Internet, previously unconnected devices will become datafied and start generating vast amounts of data. This session covers the challenges involved in Big Data development.
SESSION 3: PROGRAMMING MAPREDUCE JOBS & YARN DEVELOPMENT
Session Goal: Implementing the components of MapReduce and YARN, and deriving an approach for working out the logic and patterns needed to implement various data processing methodologies.
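As an illustration of the map, shuffle and reduce pattern this session is built around, here is a minimal in-memory sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

On a real cluster the mapper and reducer run in parallel across nodes and the shuffle moves data over the network; the sketch only shows how the three phases fit together.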
SESSION 4: DATA PROCESSING USING PIG
Session Goal: Understanding the Pig Latin language and its implementation. Pig reduces the complexity of MapReduce by providing a language called Pig Latin and an interpreter, Pig, that translates Pig scripts into MapReduce jobs.
SESSION 5: DEVELOPMENT USING HIVE & IMPALA
Session Goal: Hive, originally developed at Facebook, provides a simplified data processing technique similar to SQL. Here you'll learn the fundamentals and advanced concepts of Hive along with Impala, and see a critical comparison between the two.
SESSION 6: PROCESS CHOREOGRAPHY USING OOZIE
Session Goal: Understanding the need to combine data processing jobs and the role Oozie plays in choreographing the process. You'll also understand how a collection of actions is arranged in a control dependency structure, a Directed Acyclic Graph (DAG).
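To make the idea of actions arranged in a Directed Acyclic Graph concrete, the sketch below orders a small workflow so that every action runs only after the actions it depends on. It uses Python's standard graphlib module rather than Oozie itself, and the action names (ingest, pig_clean, hive_load, report) are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions it depends on.
WORKFLOW = {
    "ingest": set(),
    "pig_clean": {"ingest"},
    "hive_load": {"ingest"},
    "report": {"pig_clean", "hive_load"},
}


def execution_order(actions):
    """Return one valid run order in which dependencies always come first."""
    return list(TopologicalSorter(actions).static_order())


def is_valid_dag(actions):
    """Return False when the dependencies contain a cycle, so no run order exists."""
    try:
        execution_order(actions)
        return True
    except CycleError:
        return False


if __name__ == "__main__":
    print(execution_order(WORKFLOW))
```

The two middle actions do not depend on each other, so either may come first; a workflow scheduler is free to run such independent actions in parallel.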
SESSION 7: DEVELOPMENT USING SPARK FRAMEWORK
Session Goal: Understanding the architecture of Spark and its relevance to processing Big Data. You'll learn about the Spark core along with Spark components such as Spark Streaming, Spark SQL, Spark MLlib and GraphX, with in-depth coverage of their implementation using Scala.
SESSION 8: DATA MODELLING USING NOSQL
Session Goal: NoSQL is an important approach to overcoming the scalability problems and restrictions of an RDBMS. This session will help you understand storage architecture, CRUD operations and querying NoSQL stores.
SESSION 9: UNDERSTANDING DATA SCIENCE & R LANGUAGE
Session Goal: This special session will help developers understand analytics and the various approaches to analytics on Big Data.
SESSION 10: BIG DATA ANALYTICS & TECHNIQUES OF IMPLEMENTATION USING R
Session Goal: Analytics requires a proper understanding of statistics and mathematics. Here you'll learn to implement various algorithms using R and other tools.
SESSION 11: IMPLEMENTING SOLR IN BIG DATA
Session Goal: Data is growing exponentially in this age of advancement and innovation, and handling such massive data demands a strong focus on the development of scalable search engines. Here you'll learn about Solr, an open source enterprise search application that provides search functionality over structured and unstructured data.
CASE STUDY AND PROJECTS:
Case studies are an integral part of the training. As part of this course, you will implement real-time case studies in various domains, including:
TRAINING FEATURES:
1) Extensive real-time live examples, projects and POCs to improve practical competency and ensure deployment and implementation readiness.
2) Custom lab, software and environment provided, with real-time project simulation.
3) Recorded videos complemented by the corresponding lecture slides, materials and lab guides (provided as MP4 videos, PDF and PPT files for offline access as well).
4) Certification and job-interview counselling and coaching after every training.