Curriculum


• The Motivation for Hadoop
• Hadoop Overview
• Data Storage: HDFS
• Distributed Data Processing: YARN, MapReduce, and Spark
• Data Processing and Analysis: Pig, Hive, and Impala
• Data Integration: Sqoop
• Other Hadoop Data Tools
• Exercise Scenarios Explanation

• What Is Pig?
• Pig’s Features
• Pig Use Cases
• Interacting with Pig

• Pig Latin Syntax
• Loading Data
• Simple Data Types
• Field Definitions
• Data Output
• Viewing the Schema
• Filtering and Sorting Data
• Commonly-Used Functions

• Storage Formats
• Complex/Nested Data Types
• Grouping
• Built-In Functions for Complex Data
• Iterating Grouped Data

• Techniques for Combining Data Sets
• Joining Data Sets in Pig
• Set Operations
• Splitting Data Sets

Pig Troubleshooting and Optimization
• Troubleshooting Pig
• Logging
• Using Hadoop’s Web UI
• Data Sampling and Debugging
• Performance Overview
• Understanding the Execution Plan
• Tips for Improving the Performance of Your Pig Jobs

• What Is Hive?
• What Is Impala?
• Schema and Data Storage
• Comparing Hive to Traditional Databases
• Hive Use Cases

• Databases and Tables
• Basic Hive and Impala Query Language Syntax
• Data Types
• Differences Between Hive and Impala Query Syntax
• Using Hue to Execute Queries
• Using the Impala Shell

• Data Storage
• Creating Databases and Tables
• Loading Data
• Altering Databases and Tables
• Simplifying Queries with Views
• Storing Query Results

• Partitioning Tables
• Choosing a File Format
• Managing Metadata
• Controlling Access to Data

• Joining Datasets
• Common Built-In Functions
• Aggregation and Windowing

• How Impala Executes Queries
• Extending Impala with User-Defined Functions
• Improving Impala Performance

• Complex Values in Hive
• Using Regular Expressions in Hive
• Sentiment Analysis and N-Grams
• Conclusion

• Understanding Query Performance
• Controlling Job Execution Plan
• Bucketing
• Indexing Data

• SerDes
• Data Transformation with Custom Scripts
• User-Defined Functions
• Parameterized Queries

• Comparing MapReduce, Pig, Hive, Impala, and Relational Databases
• Which to Choose?

• The Case for Apache Hadoop
• Why Hadoop?
• Core Hadoop Components
• Fundamental Concepts

• HDFS Features
• Writing and Reading Files
• NameNode Memory Considerations
• Overview of HDFS Security
• Using the NameNode Web UI
• Using the Hadoop File Shell

• Ingesting Data from External Sources with Flume
• Ingesting Data from Relational Databases with Sqoop
• REST Interfaces
• Best Practices for Importing Data

• What Is MapReduce?
• Basic MapReduce Concepts
• YARN Cluster Architecture
• Resource Allocation
• Failure Recovery
• Using the YARN Web UI
• MapReduce Version 1

• General Planning Considerations
• Choosing the Right Hardware
• Network Considerations
• Configuring Nodes
• Planning for Cluster Management

• Deployment Types
• Installing Hadoop
• Specifying the Hadoop Configuration
• Performing Initial HDFS Configuration
• Performing Initial YARN and MapReduce Configuration
• Hadoop Logging

• What Is a Hadoop Client?
• Installing and Configuring Hadoop Clients
• Installing and Configuring Hue
• Hue Authentication and Authorization

• The Motivation for Cloudera Manager
• Cloudera Manager Features
• Express and Enterprise Versions
• Cloudera Manager Topology
• Installing Cloudera Manager
• Installing Hadoop Using Cloudera Manager
• Performing Basic Administration Tasks Using Cloudera Manager

• Advanced Configuration Parameters
• Configuring Hadoop Ports
• Explicitly Including and Excluding Hosts
• Configuring HDFS for Rack Awareness
• Configuring HDFS High Availability

• Why Hadoop Security Is Important
• Hadoop’s Security System Concepts
• What Kerberos Is and How It Works
• Securing a Hadoop Cluster with Kerberos

• Managing Running Jobs
• Scheduling Hadoop Jobs
• Configuring the Fair Scheduler
• Impala Query Scheduling

• Checking HDFS Status
• Copying Data Between Clusters
• Adding and Removing Cluster Nodes
• Rebalancing the Cluster
• Cluster Upgrading

• General System Monitoring
• Monitoring Hadoop Clusters
• Troubleshooting Hadoop Clusters
• Common Misconfigurations

Training Options

Self-Paced Learning

46,996.00 29,900.00
  • Learn at your own time and pace
  • Gain an on-the-job style learning experience through high-quality videos built by industry experts
  • Interactive sessions comparable to a classroom experience
  • End-to-end course content similar to instructor-led virtual or classroom training
  • Cost-effective and convenient

Blended Learning

  • Everything in Self-Paced Learning, plus
  • Learn in an instructor-led online training class

Corporate Training

Customized to your team’s needs

  • Customized learning delivery model (self-paced and/or instructor-led)
  • Flexible pricing options
  • Enterprise grade learning management system (LMS)
  • Enterprise dashboards for individuals and teams
  • 24×7 learner assistance and support

Course Description

BigData Hadoop Analyst

The Hadoop Analyst training enables you to work with the versatile frameworks of the Apache Hadoop ecosystem. This Big Data Analyst course teaches you to master Big Data analysis using Hadoop, Pig, and Hive.

What Will You Learn in This Course?

  • Apache Hadoop Fundamentals
  • Introduction to Apache Hive and Impala
  • Querying with Apache Hive and Impala
  • Data Management
  • Data Storage and Performance
  • Analytic Functions and Windowing
  • Analysing Text
  • Apache Hive Optimization
  • Extending Apache Hive and Impala
  • Preparing for the Cloudera CCA Data Analyst (CCA159) exam

What are the pre-requisites for this Hadoop Analyst Certification Course?

There are no formal prerequisites for Hadoop Analyst Training, but basic knowledge of the Linux command-line interface is beneficial.

BigData Hadoop Developer

This Spark and Hadoop Developer training enables you to work with the versatile frameworks of the Apache Hadoop ecosystem. It is a comprehensive Hadoop Big Data training course, designed by industry experts around current job requirements, that helps you learn the Big Data Hadoop and Spark modules. This Cloudera Hadoop and Spark training prepares you for the Cloudera CCA175 exam.

What Will You Learn in This Course?

  • Introduction to Hadoop and the Hadoop Ecosystem
  • Hadoop Architecture and HDFS
  • Importing Relational Data with Apache Sqoop
  • Introduction to Impala and Hive
  • Modelling and Managing Data with Impala and Hive
  • Data Partitioning and Capturing Data with Apache Flume
  • Spark Basics, Working with RDDs in Spark
  • Writing and Deploying Spark Applications
  • Parallel Programming with Spark, Spark Caching and Persistence
  • Common Patterns in Spark Data Processing
  • Preview: Spark SQL
  • Preparing for the Cloudera CCA Spark and Hadoop Developer (CCA175) exam

What are the pre-requisites for this Spark and Hadoop Developer Certification Course?

There are no formal prerequisites for Spark and Hadoop Developer Training, but basic knowledge of the Linux command-line interface is beneficial.

BigData Hadoop Administrator

The Hadoop Administrator training enables you to work with the versatile frameworks of the Apache Hadoop ecosystem. This Big Data administrator course covers Hadoop installation and configuration, computational frameworks for processing Big Data, Hadoop administrator activities, cluster management with Sqoop, Flume, Pig, Hive, Impala, and Cloudera.

What Will You Learn in This Course?

  • Hadoop architecture and its main components
  • Hadoop installation and configuration
  • Hadoop Distributed File System (HDFS)
  • The MapReduce abstraction and how it works
  • Troubleshooting cluster issues and recovering from node failures
  • Concepts of Hive, Pig, Oozie, Sqoop and Flume
  • Optimizing a Hadoop cluster for high performance
  • Preparing for the Cloudera Certified Administrator for Apache Hadoop exam

What are the pre-requisites for this Hadoop Administration Certification Course?

There are no formal prerequisites for Hadoop Administration Training, but basic knowledge of the Linux command-line interface is beneficial.

BigData Hadoop Tester

This BigData Hadoop Tester training enables you to work with the versatile frameworks of the Apache Hadoop ecosystem. It gives you the skills to detect, analyse, and rectify errors in the Hadoop framework. You will be trained in the Hadoop software and architecture, MapReduce, HDFS, and components such as Sqoop, Flume, and Oozie. You will also gain experience with test case scenarios, proof-of-concept implementations, and real-world scenarios. It is a comprehensive Hadoop Big Data training course, designed by industry experts around current job requirements, that helps you learn Big Data Hadoop testing.

What Will You Learn in This Course?

  • Introduction to Hadoop and the Hadoop Ecosystem
  • Hadoop Architecture and HDFS
  • Getting Data into HDFS
  • Hadoop Testing
  • Big Data Testing
  • System Testing
  • Security Testing
  • Automation Testing
  • Oozie

What are the pre-requisites for this Big Data Hadoop Testing Certification Course?

There are no formal prerequisites for BigData Hadoop Testing Training, but basic knowledge of the Linux command-line interface is beneficial.

Big Data Security with Kerberos

Tecklearn's Big Data Security with Kerberos course covers the Kerberos protocol in detail. Kerberos is a widely used secure network authentication protocol based on symmetric cryptography. In the Hadoop big data ecosystem, Kerberos is the only built-in user authentication mode that is secure. Upon completion of this online training, you will have a solid understanding of, and hands-on experience with, Big Data security with Kerberos.

Benefits

  • Average salary for a Hadoop Administrator ranges from approximately $104,528 to $141,391 per annum – Indeed.com
  • Hadoop Market is expected to reach $99.31B by 2022 growing at a CAGR of 42.1% from 2015 – Forbes
  • Kerberos is built into all major operating systems, from companies such as Microsoft, Apple, Red Hat, and Sun. Kerberos is the authentication mechanism for Microsoft’s Active Directory and even for some devices like the Xbox.

What Will You Learn in This Course?

  • Introduction to Cloudera Manager
  • Advanced Cluster Configuration
  • Hadoop Security
  • Securing a Hadoop Cluster with Kerberos

Key Features

Self-Paced Online Video

• Self-paced Videos: 120 Hrs
• Exercises & Project Work: 216 Hrs
• A 360-degree learning approach that you can adapt to your learning style

1 Year Unlimited Access

You get 1 year of unlimited access to the LMS, which hosts the presentations, quizzes, installation guide, and class recordings.

24 x 7 Expert Support

Our 24x7 online support team resolves all your technical queries through a ticket-based tracking system.

Certification

Successfully complete your course and Tecklearn will provide you with a Course Completion Certificate.

Real-life Case Studies

Live project based on any of the selected use cases, involving implementation of the various Big Data Hadoop concepts.

Learn at your Convenience

• Certification and Job Assistance
• Flexible Schedule

Reviews

Kunal Puri

BigData Hadoop-Architect (All in 1)

I completed Hadoop Architect course from Tecklearn. It was awesome learning experience. I also could complete Cloudera Certification after this course and reading couple of books in addition. Trainer was very good and Content, Presentation was excellent. Also, Practice tests were also great and was actually making us to remember key stuff.

Gaurav Saxena

BigData Hadoop-Architect (All in 1)

I am thankful to Tecklearn which is one of the best Educational organization. I have undergone two highly rated courses (Big data and Hadoop Architect, Spark and Scala). Now i am doing well with the stuff learnt, after getting certified for big data and hadoop, I'm getting many offers from many companies. After the great experience of learning hadoop technology, I am now keen to enroll for Data science course. I hope i get the same learning experience which i got while undergoing my previous courses. I heartily thank tecklearn for helping me to make my career.

Punit Chauhan

BigData Hadoop-Architect (All in 1)

The classes are very informative. Many simple and real world examples make the technical topics easy to understand. Now, I will never forget the Big Data and Hadoop concepts.

Krishna Pal Singh Songara

BigData Hadoop-Architect (All in 1)

I love Tecklearn Courses since It has evolving content (Trending technologies are keep evolving as online training provider - needs to deliver latest updated content.) in the course.

Patrick Wilson

BigData Hadoop-Architect (All in 1)

I would like to thank Tecklearn for providing such a excellent environment for learning online. The Hadoop developer course was well organised and included most of the topics which a developer should know.

Certification

This course is designed to help you clear the following certifications:

  • Cloudera CCA Administrator Exam (CCA131)
  • Cloudera CCA Data Analyst Exam (CCA159)
  • Cloudera CCA Spark and Hadoop Developer Exam (CCA175)

As part of this training, you will work on projects and assignments based on real-world industry scenarios, helping you fast-track your career. Tecklearn’s Course Completion Certificate for the courses included in this combo course is awarded upon completion of the course.

Projects

  • To put your knowledge into action, you will work on various industry-based projects that cover significant real-world use cases.
  • These projects are fully in line with the modules in the curriculum and help you clear the certification exam.

FAQ Content


You will never miss a lecture at Tecklearn. Tecklearn provides recordings of each class so you can review them as needed before the next session.

You have lifetime access to the Support Team, which is available 24/7. The team will help you resolve queries during and after the course.

After enrolment, you get instant, lifetime access to the LMS, including the complete set of previous class recordings, PPTs, PDFs, and assignments. Access to our 24x7 support team is also granted instantly, so you can start learning right away.

Yes, you have lifetime access to the course material once you have enrolled in the course.

All the instructors at Tecklearn are industry practitioners with a minimum of 10-15 years of relevant IT experience. Each of them has gone through a rigorous selection process that includes profile screening, technical evaluation, and a training demo before being certified to train for us. We also ensure that only trainers with a high alumni rating remain on our faculty.

Learning pedagogy has evolved with the advent of technology. Online training adds convenience and quality to the training module. With our 24x7 support system, our online learners have someone to help them at any time, even after the class ends. This is one of the driving factors in making sure that people achieve their end learning objective. We also provide lifetime access to our updated course material to all our learners.

Tecklearn actively provides placement assistance to all learners who have successfully completed the training. We also help you with job interview and resume preparation.