Artificial Intelligence
Articles
eBooks
Interview Questions
Videos
Keras
Articles
Create Model using both Sequential and Functional API in Keras
Deep dive into Keras - Convolutional Neural Network (CNN)
Deep Dive into Modules provided by Keras Library
Detailed understanding of Keras Layers
Detailed understanding of Keras applications
Detailed understanding of the Keras Model compilation process
How Keras helps in Deep Learning and the Architecture of the Keras Library
How to write a simple MLP-based Artificial Neural Network to perform regression prediction
Keras Backend Implementations and overview of Deep Learning
Overview of Deep Learning Library Keras and How to install Keras Library on your machine
Write a simple Long Short-Term Memory (LSTM) based RNN to do sequence analysis
eBooks
Interview Questions
Videos
TensorFlow
Articles
Concept of Agents and Environments in AI
Hidden Layer Perceptron in TensorFlow
Multi-layer Perceptron in TensorFlow
Concept of Fuzzy Logic Systems
Deep Dive into TensorFlow Playground
Difference between TensorFlow and Keras
Difference between TensorFlow and PyTorch
Difference between TensorFlow and Theano
How to Install TensorFlow Through pip in Windows
Idea of Intelligence and components of Intelligence
Implementation of Neural Network in TensorFlow
Linear Regression in TensorFlow
Machine Learning and Deep Learning
How to Install TensorFlow through Anaconda
Introduction to Convolutional Neural Network in TensorFlow
Long short-term memory (LSTM) RNN in TensorFlow
What are Artificial Neural Networks?
Working of Convolutional Neural Network
Advantages and Disadvantages of TensorFlow
Architecture of TensorFlow explained
AI - Popular Search Algorithms
Artificial Intelligence - Research Areas
Artificial Neural Network in TensorFlow
CIFAR-10 and CIFAR-100 Dataset in TensorFlow
Classification of Neural Network in TensorFlow
TensorFlow Single and Multiple GPU
TensorFlow Security and TensorFlow Vs Caffe
Style Transferring in TensorFlow
Single Layer Perceptron in TensorFlow
Robotics in Artificial Intelligence
Recurrent Neural Network (RNN) in TensorFlow
eBooks
Interview Questions
Videos
Data Science Introduction and How to set up Python
How to use Hibernate Query Language
Handling Arrays and Strings in PHP
Cookies and Sessions Handling in PHP
Model Evaluation and Model Prediction in Keras
How to create our own Customized Layer in Keras Library
Overview of Artificial Intelligence and its Application
Natural Language Processing in AI
Top Artificial Intelligence Interview Questions and Answers
Top Oracle DBA Interview Questions and Answers
Basics of Splunk and Installation of Splunk Environment
Microsoft Azure Solutions Architect Certification Exam Questions (AZ-300 & AZ-301)
Best Approach for Storing data to AWS DynamoDB and S3 – AWS Implementation
Maintain High Availability in AWS with anticipated Additional Load
BI and Visualization
Articles
eBooks
Videos
Cognos Analytics
Articles
Perform Report Operations in IBM Cognos
Introduction to IBM Cognos and its Components and Services
How to open, create, save, run and print report in Cognos
How to open, create and save Analysis in Analysis Studio in Cognos
How to create report in Report Studio
How to create List and CrossTab Report in Cognos
How to create a package using Cognos
Filters and Custom Calculations in Cognos
Data Warehouse Schemas, ETL and Reporting Tools
Cognos Studios and other capabilities
eBooks
Videos
Cognos - Relationships in Metadata Model
Top Tableau Desktop Interview Questions and Answers
Top Tableau Server Interview Questions and Answers
Top Power BI Interview Questions and Answers
Top Cognos TM1 Interview Questions and Answers
Cognos TM1
eBooks
Interview Questions
Videos
Microsoft Excel
Articles
How to Merge & Wrap Cells, Borders and Shades and Apply Formatting in Excel
Backstage View and Explore Window in Excel
Creating Formulas, Copying Formulas in Excel
Data Sorting and Using Ranges in Excel
Data Tables and Pivot Tables in Excel
Excel Fill Handle and Excel If Function
Freeze Panes and Conditional Format in Excel
Header and Footer, Page Break and Set Background in Excel
How to add Graphics and Perform Cross-Referencing in Excel
How to Create and Copy Worksheet in Excel
How to Create Worksheets in Excel
How to enter values and move around in Excel
How to Insert Comments and Add Text Box in Excel
How to Open, Close, Delete and Hide Worksheet in Excel
How to Perform Copy & Paste, Find & Replace in Excel
Using Styles, Themes and Templates in Excel
Using Functions and Built-in Functions in Excel
Translate Worksheet and workbook Security in Excel
Simple Charts and Pivot Charts in Excel
Sheet Options, Adjust Margins and Page Orientation in Excel
Printing Worksheets and Email workbooks in Excel
Perform Spell Check, Zoom In-Out and Use Special Symbols in Excel
How to use COUNT, COUNTIF, and COUNTIFS Function and Advanced If in Excel
How to Undo Changes, Setting Cell and Fonts, Text Decoration in Excel
How to Select, Insert, Delete and Move Data in Excel
How to Rotate Cells, Setting Colors and Text Alignment in Excel
eBooks
Interview Questions
Videos
OBIEE
Articles
Concept of Testing Repository in OBIEE
Understanding Schemas in OBIEE
Overview of Oracle Business Intelligence Enterprise Edition (OBIEE)
Multiple Logical Table Sources, Calculation Measures and Dimension Hierarchies
Level-Based Measures and Aggregates in OBIEE
Deep Dive into Repositories in OBIEE
eBooks
Interview Questions
Videos
Pentaho
Articles
User interfaces available in Pentaho and their navigation
Overview of Pentaho and How to install Pentaho on your system
How to use the Pentaho Reporting Designer
How to use Grouping in Pentaho
How to use Functions in Reports in Pentaho
How to create Chart Report in Pentaho
eBooks
Interview Questions
Videos
Power BI
Articles
Visualization Options in Power BI
Power BI Data Sources and How to connect with them
Power BI - Supported Data Sources
Power BI - Comparison with Other BI Tools
Overview of Power BI Embedded, Power BI Gateway and Power BI Report Server
Overview of Business Intelligence (BI) and Power BI
How to use various DAX functions in Power BI
How to Share Power BI Dashboard
How to Integrate Excel in Power BI
How to Download and Install Power BI Desktop
eBooks
Interview Questions
Videos
QlikView
Articles
List Box and Multi Box in QlikView
Navigation Options in QlikView
Overview of Data files (QVD) in QlikView
Processing Web Files in QlikView
Resident Load, Preceding Load and Incremental Load in QlikView
How to create Cross Tables in QlikView
How to create Pie Chart in QlikView
Straight Tables and Pivot Tables in QlikView
Database Connection in QlikView
Dimensions and Measures in QlikView
Usage of Keep Command in QlikView
Using Peek and RangeSum Function in QlikView
Handling Delimited Files in QlikView
Handling Excel Files in QlikView
How to create Bar Chart in QlikView
Using Match and Rank Function in QlikView
Overview of QlikView and How to install QlikView on your machine
Inline Data and Scripting in QlikView
Data Transformation in QlikView
Creating Dashboard in QlikView
Concept of Star Schema and Synthetic Key in QlikView
Concatenation and Master Calendar in QlikView
Column Manipulation in QlikView
eBooks
Interview Questions
Videos
Circular Reference in QlikView
Qlik Sense
Articles
Navigating in Qlik Sense Selections
Qlik Sense Conditional Functions
Qlik Sense Counter and Exponential and Logarithmic Functions
Qlik Sense Developer: Roles and Responsibilities
Overview of Gauge Chart in Qlik Sense
Qlik Sense Advantages and Limitations
Qlik Sense Architecture Components
Qlik Sense Capabilities for people, Groups and Organizations
What is Qlik Sense Pivot Table?
Ways of Qlik Sense Collaboration
Qlik Sense Formatting Functions
Qlik Sense Distribution and Trigonometric and Hyperbolic Functions
Qlik Sense Mapping and Logical Functions
Qlik Sense Financial Functions
Types of Qlik Sense Aggregation Functions
Tableau vs Qlik Sense vs Power BI
Significance of Text and Image in Qlik Sense
Set Analysis and Set Expressions in Qlik Sense
QlikView Vs Qlik Sense: Overview
Using a Scatter Plot in Qlik Sense
Types of Operators in Qlik Sense
Qlik Sense System Requirements
Treemap Visualization in Qlik Sense
Qlik Sense Interpretation Functions
Modulo Functions in Qlik Sense
Key Performance Indicators (KPI) in Qlik Sense
Introduction to Qlik Sense Mashup
How to Manage Content and Resources in Qlik Management Console
How to Interact With Qlik Sense Visualizations?
How to Interact with Qlik Sense interface
How to create Qlik Sense Application
General Numeric Functions in Qlik Sense
Concept of Social Engineering Attacks and Cross-Site Scripting
Components of Qlik Sense Desktop
eBooks
Interview Questions
Videos
SSAS
eBooks
Interview Questions
Videos
SSIS
eBooks
Interview Questions
Videos
SSRS
eBooks
Interview Questions
Videos
Tableau Desktop
Articles
How to create Pareto Chart in Tableau
How to create Gantt Chart in Tableau
How to create Dual Axis Chart, Box Plot and Heat Map in Tableau
How to create Crosstab and Motion Chart in Tableau
How to create Bump and Bubble Chart in Tableau
How to create Bar, Line and Pie Chart in Tableau
How to Build Hierarchy and Groups in Tableau
Filter Operations and Extract Filters in Tableau
Different Tools of Tableau and Tableau Architecture
Understanding Tableau Navigation and Data Terminology
Understanding Tableau Desktop Workspace
Data Window, Data Types, Data Aggregation and File Types in Tableau
Top 10 Data Visualization Tools
Tableau Quick and Context Filters
Perform Table Calculations in Tableau
Perform Data Sorting in Tableau
Perform Calculation and Operators and Functions in Tableau
Overview of Tableau and Data Visualization
How to perform Numeric, String and Date Calculations in Tableau
How to Join Data in Tableau using multiple sources
Condition Filters, Data Source and Top Filters in Tableau
Comparison of Tableau and Power BI
How to install Tableau on your system
How to create Waterfall, Bullet and Area Chart in Tableau
eBooks
Interview Questions
Videos
Tableau Server
TIBCO BW
eBooks
Videos
How to Clone Repository in Git
Overview of SSRS and its Architecture
Overview of SSIS and why SSIS is required
Introduction to TIBCO Business Works (TIBCO BW)
Overview of Tableau Server and How to install it
Overview of SSAS and its Architecture
Setting Up Distributed Servers in Tableau Server
Aggregate Functions in QlikView
Concept of Data Warehouse and Dimension Modelling
Business and Presentation Layer of OBIEE explained
BI Tools for giant Data Visualization
Aggregation Functions in Qlik Sense
Introduction to Cognos TM1 Perspective
How to Setup TM1 Application Server
How to Configure Security in TM1
Concept of Dimensions in Cognos TM1
Cognos TM1 Installation and Configuration
How to create Tree Maps and Heat Maps in Tableau
How to create Scatter Plot and Histogram Chart in Tableau
How to add Page Footer Fields in Pentaho
Formatting Report Elements in Pentaho Reporting Designer
How to Perform Data Validation and Data Filtering in Excel
How to create Power BI Dashboard and Reports
Top Qlik Sense Interview Questions and Answers
Top Microsoft BI Interview Questions and Answers
Top TIBCO Spotfire Interview Questions and Answers
Top OBIEE Interview Questions and Answers
Top QlikView Interview Questions and Answers
Top TIBCO Business Works Interview Questions and Answers
Top Oracle Hyperion Interview Questions and Answers
Top Pentaho Interview Questions and Answers
Top IBM DataStage Interview Questions and Answers
Top IBM Cognos Analytics Interview Questions and Answers
Big Data
eBooks
Videos
Apache Cassandra
Articles
Deep dive into Cassandra Query Language Collections and user-defined data types
Deep dive into Cassandra Shell Commands
How to Create and Alter Tables in Apache Cassandra
How to Create and Drop Indexes in Apache Cassandra
How to create, alter and drop Keyspaces in Cassandra
How to Drop and Truncate Tables in Apache Cassandra
How to set up Both cqlsh and Java environments to work with Cassandra
How to Perform CRUD (Create, Read, Update and Delete) Operations on a Table in Apache Cassandra
Introduction to Apache Cassandra, History and Architecture
Overview of How Cassandra Stores its data
Overview of important classes in Cassandra and introduction to the Cassandra Query Language shell
eBooks
Interview Questions
Videos
Apache NiFi
Articles
How to Monitor System statistics using Apache NiFi
Concept of Logging in Apache NiFi
Basic Concepts of Apache NiFi and its Installation
Deep Dive into Apache NiFi – Flow Files, Queues, Process Groups and Labels
Deep dive into Apache NiFi Processors
Detailed understanding of Apache NiFi Templates
How to Administer Apache NiFi and Create Flows in Apache NiFi
Understanding Apache NiFi APIs with request and response examples
Understanding Apache NiFi Processor Categorization and its relationships
Introduction to Apache NiFi, its History, Features and Architecture
eBooks
Interview Questions
Videos
Apache Oozie
eBooks
Interview Questions
Videos
Apache Pig
Articles
Explanation of Apache Pig Group and Cogroup Operators
Detailed Study of Architecture of Apache Pig
Deep Dive into Pig Latin Diagnostic Operators
Deep Dive into Apache Pig Functions: Load & Store, Bag & Tuple, String, Date-time, Math
Apache Pig Basics, Features and Comparison with MapReduce, Hive & SQL and History of Apache Pig
Explanation of Shell and Utility Commands provided by Apache Grunt Shell
How to Install Apache Pig and Configure Pig
How to Load data to Apache Pig from Hadoop File System
How to run Apache Pig Scripts in Batch Mode
How to Store data in Apache Pig using Store Operator
How to use Cross Operator and Union Operator in Pig Latin
How to use Split and Filter Operator in Apache Pig Latin
How to use the Join Operators in Pig Latin
How to use Distinct, For Each, Order By, Limit Operators and Eval Functions in Apache Pig
eBooks
Interview Questions
Videos
Apache Spark
Articles
Overview of Scala programming language and How to install Scala on your system
How to Install Apache Spark on your system
How to perform pattern matching in Scala and use of Regex expressions
How to use Functions in Scala programming Language
How to use Collections in Scala
How to use Arrays in Scala Programming Language
How to perform Exception Handling in Scala Language
How to Deploy Spark Application on Cluster
Extractor Object in Scala and how to perform pattern matching using extractors
Details of Data Types and Basic Literals in Scala
Detailed understanding of Operators in Scala Language
Deep dive into File Handling in Scala
Deep dive into Advanced programming in Spark
Basics of Scala Programming Language
Concept of String Manipulation in Scala
Conditional statements and Loop control structures in Scala
Concept of Resilient Distributed Datasets (RDD) in Apache Spark
How to use Classes and Objects in Scala programming
Overview of Apache Spark Framework
Spark Core and implementation of RDD transformations and actions in RDD programming
eBooks
Interview Questions
Videos
Detailed understanding of Scala Access Modifiers
Apache Sqoop
Articles
How to use the Apache Sqoop Eval and Codegen tool
How to list out the databases and tables of a particular database using Sqoop
How to import data and tables from MySQL to Hadoop HDFS
How to export data back from Hadoop HDFS to RDBMS and Create and maintain the Sqoop jobs
Introduction, Installation and Configuration of Apache Sqoop
eBooks
Interview Questions
Videos
Apache Storm
Articles
Application of Apache Storm Framework in Yahoo Finance
Concept of Cluster Architecture in Apache Storm
Introduction to Apache Storm and Core Concepts of Apache Storm
How to Install Apache Storm framework on your machine
How to implement Mobile Call log Analyzer using Apache Storm
How Apache Storm is used in Twitter
eBooks
Interview Questions
Videos
Detailed understanding of Workflow of Apache Storm
Hadoop and MapReduce
Articles
Concept of Combiners in Hadoop MapReduce
Concept of MapReduce in BigData
Detailed understanding of Hadoop Architecture and Hadoop Distributed File System (HDFS)
Concept of Partitioner in MapReduce and its implementation using example
Deep dive into Hadoop administration
Deep dive into the MapReduce API
Detailed understanding of Hadoop Distributed File System (HDFS)
Phases of MapReduce Data Flow and detailed understanding of the MapReduce API
Overview of YARN and its components and benefits of YARN
Overview of Big Data and Hadoop, Big Data technologies
Implementation of Word Count program using Hadoop MapReduce
Operation of MapReduce in Hadoop framework using Java
Implementation of Character Count program using Hadoop MapReduce
How to set up Hadoop Multi-Node Cluster on a distributed environment
How to perform operations in Hadoop and commands used in Hadoop
How to install Hadoop on your system
eBooks
Videos
How to install Hadoop Framework on your system
How the MapReduce Algorithm works using example
HBase
Articles
Deep dive into HBase architecture
Deep dive into Java Client API for HBase and its associated classes
How to create and List Table in HBase shell
How to create data in an HBase table
How to delete data in Table in HBase
How to enable and disable a Table using HBase shell
How to install HBase and configure on your system
How to make changes to an existing Table and describe it in HBase
How to read data from Table in HBase
How to start HBase interactive shell and how HBase general commands works
How to Stop HBase using Java API
How to update data in Table using HBase Shell
How to verify the existence of a Table and How to Drop a Table in HBase
Overview of HBase, its Advantages, Features and history
Deep dive into HBase Scan, Count and Truncate command and how to achieve security in HBase
eBooks
Interview Questions
Videos
Top Apache HBase Interview Questions and Answers
Hive and Impala
Articles
Concept of Partitioning of table in Hive
Detailed understanding of built-in functions available in Hive
Different Data Types in Hive involved in the creation of a table
How to Alter the attributes of a table and delete a Table in Hive
How to create a table in Hive and how to insert data into it
How to create and drop a database in Hive
How to create and manage Views and Create and Drop an index in Hive
How to install Hive on your system
How to perform Join operations in Hive Query Language (HQL)
How to use the select statement in Hive Query Language
Introduction to Impala, its features, advantages and disadvantages
How to start Impala Shell and the various options of the shell
How to select a database using Command and select database using Hue Browser in Impala
How to perform changes on a given table and how to delete table in Impala
How to fetch the data from one or more tables in a database and fetch description in Impala
How to download, install and set up Impala in your system
How to create a table in the required database in Impala
How to Create, Alter and Drop a View in Impala
Explanation of Union Clause, With Clause and Distinct Operator in Impala
Explanation of Limit Clause and Offset Clause in Impala
Data Types in Impala Query Language
Detailed understanding of Architecture of Impala
Explanation of Order by Clause, Group by Clause and Having Clause in Impala
How to add new records into an existing table in a database using INSERT in Impala
How to create a database in Impala
eBooks
Videos
How to drop a database in Impala
Top Apache Impala Interview Questions and Answers
Top Apache Hive Interview Questions and Answers
MongoDB
Articles
Advanced Indexing in MongoDB and Limitation of Indexing in MongoDB
Concept of Capped Collections and Auto-Increment Sequence in MongoDB
Concept of Map Reduce in MongoDB
Concept of Relationships and Database References in MongoDB
Concept of Sharding process and How to create a backup in MongoDB
Data Modelling in MongoDB and How to create and Drop database in MongoDB
Deep dive into Covered Queries in MongoDB and Analyzing queries
Deep dive into Replication process in MongoDB
How to Create and Drop a collection using MongoDB
How to Insert, Update, Delete and Query Document in MongoDB Collection
How to Install MongoDB on your system
How to limit records using MongoDB and use projection in MongoDB
How to Set up MongoDB JDBC driver
How to sort records in MongoDB and concept of Indexing and Aggregation in MongoDB
How to use Regex Expressions and Text Search in MongoDB
MongoDB Administration using RockMongo and concept of GridFS in MongoDB
Overview of MongoDB, its history and purpose of building MongoDB
Understand NoSQL Databases and MongoDB advantages over Relational DBMS
eBooks
Interview Questions
Videos
Splunk
Articles
Deep dive into Splunk Search processing Language (SPL)
How to perform Basic Search in Splunk
How to perform searching using fields in Splunk
How to perform Time Range search in Splunk
How to share and export the search result in Splunk
A Deep Dive into Splunk Web Interface
eBooks
Videos
Top Splunk SIEM Interview Questions and Answers
Top Splunk Interview Questions and Answers
User Defined Functions in Apache Pig Latin
How to use Variables in Scala with the help of example
Deep Dive into Trident – an extension of Apache Storm
Top Big Data Hadoop Interview Questions and Answers
Top MongoDB Interview Questions and Answers
Top Scala Interview Questions and Answers
Top Hadoop Administration Interview Questions and Answers
Top Apache Sqoop Interview Questions and Answers
Top Apache NiFi Interview Questions and Answers
Top Apache Flume Interview Questions and Answers
Top Apache Spark Interview Questions and Answers
Top Apache Pig Interview Questions and Answers
Top Apache Cassandra Interview Questions and Answers
Top Apache Oozie Interview Questions and Answers
Top Apache Storm Interview Questions and Answers
Deep Dive into Apache NiFi User Interface
Deep dive into built-in operators of Hive
Concept of Atomic Operations in MongoDB
Deep Dive into Apache Oozie Workflow
How to Configure Oozie Workflow using Property File
Concept of Coordinator applications using Apache Oozie
Basics of Apache Oozie and Oozie Editors
Deep Dive into Oozie Bundle System and CLI & Extensions
Process of Data Ingestion in Splunk Environment
Blockchain
Articles
Introduction to Ethereum and Smart Contracts
Ethereum - Interacting with Deployed Contract
Ethereum – Attaching Wallet to Ganache Blockchain
Ethereum - Creating Contract Users
Concept of Blockchain Double Spending and Bitcoin Cash
Bitcoin Forks and SegWit and Blockchain Merkle Tree
Comparison between Blockchain and Database
Basic Components of Bitcoin and Blockchain Proof of Work
Ethereum - Solidity for Contract Writing
Ethereum - Ganache for Blockchain
eBooks
Interview Questions
Videos
BlockChain and Ethereum
Articles
eBooks
Interview Questions
Videos
Introduction to Ethereum and Smart Contracts
Ethereum - Interacting with Deployed Contract
Ethereum – Attaching Wallet to Ganache Blockchain
Ethereum - Creating Contract Users
Concept of Blockchain Double Spending and Bitcoin Cash
Bitcoin Forks and SegWit and Blockchain Merkle Tree
Comparison between Blockchain and Database
Basic Components of Bitcoin and Blockchain Proof of Work
Ethereum - Solidity for Contract Writing
Ethereum - Ganache for Blockchain
Overview and History of Blockchain
Overview of Bitcoin and Key Concepts of Bitcoin
Top BlockChain Interview Questions and Answers
Top Ethereum Interview Questions and Answers
Best Approach for Storing data to AWS DynamoDB and S3 – AWS Implementation
Maintain High Availability in AWS with anticipated Additional Load
Cloud Computing
Articles
eBooks
Interview Questions
Videos
AWS
Articles
How to Use Amazon Machine Learning
How to use Amazon KCL and set up Amazon EMR
How to Set Up Amazon RDS (Relational Database Service)
How to Configure AWS Direct Connect
How to Configure Amazon Simple Storage Service (S3)
How to Configure Amazon Route 53
How AWS CloudFront Delivers the Content
Amazon Elastic Block Storage (EBS) and Storage Gateway
How to use Simple Workflow Service (SWF) and Amazon WorkMail
Understanding of AWS Management Console
How to Set Up AWS Data Pipeline
eBooks
Videos
How to Use Amazon Machine Learning
How to use Amazon KCL and set up Amazon EMR
How to Set Up Amazon RDS (Relational Database Service)
How to Configure AWS Direct Connect
How to Configure Amazon Simple Storage Service (S3)
How to Configure Amazon Route 53
How AWS CloudFront Delivers the Content
Amazon Elastic Block Storage (EBS) and Storage Gateway
How to use Simple Workflow Service (SWF) and Amazon WorkMail
Understanding of AWS Management Console
How to Set Up AWS Data Pipeline
How to Create Amazon Workspaces
Top Azure Developer Interview Questions and Answers
Top Amazon Web Services (AWS) Interview Questions and Answers
Azure
Articles
How to configure Azure Cloud Service
How to configure Azure Load Balancer
How to Configure Azure Storage Security
How to create Azure Mobile App
Overview of Microsoft Azure and Cloud Computing
Creating App Service Plan in Azure Portal
Azure Virtual Machines and Compute Service
Azure Virtual Machine Scale Set and Auto Scaling
Azure Table, Queue and Disk Storage
Azure Storage Monitoring and Resource Tool
Azure Storage Building Blocks and Storage Account
Azure Storage account and Blob service configuration
Azure SQL Managed Instance and SQL Stretch Database
Azure SQL Database and its Configuration
Azure Network Service and Azure Virtual Network
Azure Media Service and Database Service
Azure Backup and Virtual Machine Security
Azure Availability Zones and Sets and VNet Connectivity
Azure App Service Monitoring and Azure CDN
eBooks
Interview Questions
Videos
How to configure Azure Cloud Service
How to configure Azure Load Balancer
How to Configure Azure Storage Security
How to create Azure Mobile App
Overview of Microsoft Azure and Cloud Computing
Creating App Service Plan in Azure Portal
Azure Virtual Machines and Compute Service
Azure Virtual Machine Scale Set and Auto Scaling
Azure Table, Queue and Disk Storage
Azure Storage Monitoring and Resource Tool
Azure Storage Building Blocks and Storage Account
Azure Storage account and Blob service configuration
Azure SQL Managed Instance and SQL Stretch Database
Azure SQL Database and its Configuration
Azure Network Service and Azure Virtual Network
Azure Media Service and Database Service
Azure Backup and Virtual Machine Security
Azure Availability Zones and Sets and VNet Connectivity
Azure App Service Monitoring and Azure CDN
Azure App Service Backup and Security
How to Use Amazon Machine Learning
How to use Amazon KCL and set up Amazon EMR
How to Set Up Amazon RDS (Relational Database Service)
How to Configure AWS Direct Connect
How to Configure Amazon Simple Storage Service (S3)
How to Configure Amazon Route 53
How AWS CloudFront Delivers the Content
Amazon Elastic Block Storage (EBS) and Storage Gateway
How to use Simple Workflow Service (SWF) and Amazon WorkMail
Understanding of AWS Management Console
How to configure Azure Cloud Service
How to configure Azure Load Balancer
How to Configure Azure Storage Security
How to create Azure Mobile App
Overview of Microsoft Azure and Cloud Computing
Creating App Service Plan in Azure Portal
Azure Virtual Machines and Compute Service
Azure Virtual Machine Scale Set and Auto Scaling
Azure Table, Queue and Disk Storage
Azure Storage Monitoring and Resource Tool
Azure Storage Building Blocks and Storage Account
Azure Storage account and Blob service configuration
How to Set Up AWS Data Pipeline
How to Create Amazon Workspaces
Azure SQL Managed Instance and SQL Stretch Database
Azure SQL Database and its Configuration
Azure Network Service and Azure Virtual Network
Azure Media Service and Database Service
Azure Backup and Virtual Machine Security
Azure Availability Zones and Sets and VNet Connectivity
Azure App Service Monitoring and Azure CDN
Azure App Service Backup and Security
Azure API Apps and API Management
Top Azure Developer Interview Questions and Answers
Top Azure Architect Interview Questions and Answers
Top Amazon Web Services (AWS) Interview Questions and Answers
Best Approach for Storing data to AWS DynamoDB and S3 – AWS Implementation
Migration of a 3-tier e-commerce web application using Amazon Web Services (AWS)
Cyber Security
Articles
eBooks
Interview Questions
Videos
Ethical Hacking
Articles
Concept of Enumeration in Ethical Hacking
Concept of Exploitation in Ethical Hacking
Concept of Social Engineering Attacks and Cross-Site Scripting
Concept of SQL Injection Attack
Concept of TCP/IP Hijacking and Trojan Attacks
DDOS Attacks in Ethical Hacking
Ethical Hacking - Fingerprinting
Ethical Hacking - Footprinting
Processes in Ethical Hacking and Reconnaissance
eBooks
Interview Questions
Videos
Concept of Enumeration in Ethical Hacking
Concept of Exploitation in Ethical Hacking
Concept of Social Engineering Attacks and Cross-Site Scripting
Concept of SQL Injection Attack
Concept of TCP/IP Hijacking and Trojan Attacks
DDOS Attacks in Ethical Hacking
Ethical Hacking - Fingerprinting
Ethical Hacking - Footprinting
Processes in Ethical Hacking and Reconnaissance
Concept of Enumeration in Ethical Hacking
Concept of Exploitation in Ethical Hacking
Concept of Social Engineering Attacks and Cross-Site Scripting
Concept of SQL Injection Attack
Concept of TCP/IP Hijacking and Trojan Attacks
DDOS Attacks in Ethical Hacking
Ethical Hacking - Fingerprinting
Ethical Hacking - Footprinting
Processes in Ethical Hacking and Reconnaissance
Data Science
Articles
Regression Analysis in Machine learning
Regression vs Classification in Machine Learning
Simple Linear Regression in Machine Learning
Naïve Bayes Classifier Algorithm
Support Vector Machine Algorithm
Logistic Regression in Machine Learning
Linear Regression in Machine Learning
K-Nearest Neighbor (KNN) Algorithm for Machine Learning
eBooks
Interview Questions
Videos
Machine Learning
Python with Data Science
Articles
Processing JSON Data in Python and Matplotlib
Processing Unstructured Data and Linear Regression and Chi-Square Test in Python
P-Value and Correlation in Python
Python - Data Science Introduction
Relational Databases in Python
Perform Data Cleansing in Python
Performing Data Wrangling in Python
Introduction to Pandas, NumPy and SciPy Libraries
How to Read HTML Pages in Python
How to interact with MongoDB in Python
Box Plots and Scatter Plots and Heat Maps in Python
Bubble Charts and 3D Charts in Python
Data Aggregation and binomial distribution in Python
How to create Geographical Maps and Graphs in Python
Measuring Central Tendency and Variance in Python
eBooks
Interview Questions
Videos
Processing JSON Data in Python and Matplotlib
Processing Unstructured Data and Linear Regression and Chi-Square Test in Python
P-Value and Correlation in Python
Python - Data Science Introduction
Relational Databases in Python
Perform Data Cleansing in Python
Performing Data Wrangling in Python
Introduction to Pandas, NumPy and SciPy Libraries
How to Read HTML Pages in Python
How to interact with MongoDB in Python
Box Plots and Scatter Plots and Heat Maps in Python
Bubble Charts and 3D Charts in Python
Data Aggregation and binomial distribution in Python
How to create Geographical Maps and Graphs in Python
Measuring Central Tendency and Variance in Python
Normal, Binomial and Poisson distribution in Python
R Language
Articles
Arrays and Factors in R Language
Binomial Distribution and Poisson Regression in R
Analysis of Covariance in R Language
Decision making and Loops in R Language
Handling Excel and Binary Files in R
Handling XML Files in R Language
How to create Line Graphs in R
How to create Scatterplots in R
How to create Histograms and Box Plots in R
Random Forest and Survival Analysis in R
Operators and Variables in R Language
Normal Distribution in R Language
Multiple and Logistic Regression in R
eBooks
Interview Questions
Videos
Arrays and Factors in R Language
Binomial Distribution and Poisson Regression in R
Analysis of Covariance in R Language
Decision making and Loops in R Language
Handling Excel and Binary Files in R
Handling XML Files in R Language
How to create Line Graphs in R
How to create Scatterplots in R
How to create Histograms and Box Plots in R
Random Forest and Survival Analysis in R
Operators and Variables in R Language
Normal Distribution in R Language
Multiple and Logistic Regression in R
SAS
Articles
One Way ANOVA and Hypothesis Testing
Overview of SAS and its Features
SAS - Basic Syntax and Program Structure
How to Perform Standard Deviation in SAS
How to perform Correlation Analysis in SAS
How to perform Bland Altman Analysis
SAS Applications and Loops and Decision Making
SAS Intelligence Platform Architecture
String Manipulation and Arrays in SAS
How to Format Data Sets in SAS
How to create Scatter Plots in SAS
How to create Pie Charts in SAS
How to create Histogram and Simulations in SAS
How to create Box Plots in SAS
How to Create Bar Charts in SAS
How to Concatenate Data Sets in SAS
How to calculate Arithmetic Mean and Handling Date and Time
Frequency Distributions and Cross Tabulations in SAS
eBooks
Interview Questions
Videos
One Way ANOVA and Hypothesis Testing
Overview of SAS and its Features
SAS - Basic Syntax and Program Structure
How to Perform Standard Deviation in SAS
How to perform Correlation Analysis in SAS
How to perform Bland Altman Analysis
SAS Applications and Loops and Decision Making
SAS Intelligence Platform Architecture
String Manipulation and Arrays in SAS
How to Format Data Sets in SAS
How to create Scatter Plots in SAS
How to create Pie Charts in SAS
How to create Histogram and Simulations in SAS
How to create Box Plots in SAS
How to Create Bar Charts in SAS
How to Concatenate Data Sets in SAS
How to calculate Arithmetic Mean and Handling Date and Time
Frequency Distributions and Cross Tabulations in SAS
Fisher's Exact Tests and Repeated Measures Analysis in SAS
Regression Analysis in Machine learning
Regression vs Classification in Machine Learning
Simple Linear Regression in Machine Learning
Naïve Bayes Classifier Algorithm
Support Vector Machine Algorithm
Logistic Regression in Machine Learning
Linear Regression in Machine Learning
K-Nearest Neighbor (KNN) Algorithm for Machine Learning
Difference between Supervised and Unsupervised Learning
Classification Algorithm in Machine Learning
How to get datasets for Machine Learning
Processing JSON Data in Python and Matplotlib
Processing Unstructured Data and Linear Regression and Chi-Square Test in Python
P-Value and Correlation in Python
Python - Data Science Introduction
Relational Databases in Python
One Way ANOVA and Hypothesis Testing
Overview of SAS and its Features
SAS - Basic Syntax and Program Structure
How to Perform Standard Deviation in SAS
How to perform Correlation Analysis in SAS
How to perform Bland Altman Analysis
SAS Applications and Loops and Decision Making
SAS Intelligence Platform Architecture
String Manipulation and Arrays in SAS
Perform Data Cleansing in Python
Performing Data Wrangling in Python
Introduction to Pandas, NumPy and SciPy Libraries
How to Read HTML Pages in Python
How to interact with MongoDB in Python
Box Plots and Scatter Plots and Heat Maps in Python
Arrays and Factors in R Language
Binomial Distribution and Poisson Regression in R
Bubble Charts and 3D Charts in Python
Analysis of Covariance in R Language
Difference between Artificial intelligence and Machine learning
Data Preprocessing in Machine learning
Data Aggregation and binomial distribution in Python
How to create Geographical Maps and Graphs in Python
Measuring Central Tendency and Variance in Python
Introduction to Machine Learning
Normal, Binomial and Poisson distribution in Python
Installing Anaconda and Python
Applications of Machine learning
Handle Date and Time in Python
Decision making and Loops in R Language
Handling Excel and Binary Files in R
Handling XML Files in R Language
How to create Line Graphs in R
How to create Scatterplots in R
How to create Histograms and Box Plots in R
How to Format Data Sets in SAS
How to create Scatter Plots in SAS
How to create Pie Charts in SAS
How to create Histogram and Simulations in SAS
How to create Box Plots in SAS
How to Create Bar Charts in SAS
How to Concatenate Data Sets in SAS
How to calculate Arithmetic Mean and Handling Date and Time
Frequency Distributions and Cross Tabulations in SAS
Fisher's Exact Tests and Repeated Measures Analysis in SAS
Advantages and Disadvantages of SAS Programming Language
Random Forest and Survival Analysis in R
Operators and Variables in R Language
Normal Distribution in R Language
Multiple and Logistic Regression in R
Linear Regression in R Language
Top Data Science Interview Questions and Answers
Top Machine Learning Interview Questions and Answers
Top SAS Interview Questions and Answers
Top Python Interview Questions and Answers
Data Warehousing and ETL
Articles
eBooks
Interview Questions
Videos
ETL Testing
eBooks
Interview Questions
Videos
Informatica
Articles
Aggregator Transformation in Informatica
Concept of Informatica (Big Data Management) BDM
Informatica Master Data Management (MDM) Process
Lookup and Normalizer Transformation in Informatica
Performance Tuning and Partitioning in Informatica
Rank Transformation in Informatica
Router and Joiner Transformation in Informatica
Source Qualifier Transformation in Informatica
Transaction Control Transformation in Informatica
Sequence Generator Transformation in Informatica
eBooks
Interview Questions
Videos
Aggregator Transformation in Informatica
Concept of Informatica (Big Data Management) BDM
Informatica Master Data Management (MDM) Process
Lookup and Normalizer Transformation in Informatica
Performance Tuning and Partitioning in Informatica
Rank Transformation in Informatica
Router and Joiner Transformation in Informatica
Source Qualifier Transformation in Informatica
Transaction Control Transformation in Informatica
Sequence Generator Transformation in Informatica
Concept of Informatica IDQ (Informatica Data Quality)
Concept of ETL Pipeline and Files
Overview of ETL Testing and its Architecture
Aggregator Transformation in Informatica
Concept of Informatica (Big Data Management) BDM
Informatica Master Data Management (MDM) Process
Lookup and Normalizer Transformation in Informatica
Performance Tuning and Partitioning in Informatica
Rank Transformation in Informatica
Router and Joiner Transformation in Informatica
Source Qualifier Transformation in Informatica
Transaction Control Transformation in Informatica
Sequence Generator Transformation in Informatica
Concept of Informatica IDQ (Informatica Data Quality)
Installation of Informatica PowerCenter
Comparison between ETL and ELT
Detailed understanding of ETL (Extraction, Transformation and Loading) Testing
Databases
Articles
eBooks
Interview Questions
Videos
MS-SQL Server
Articles
Backup and restore a database in SQL Server
Concept of Primary Key in SQL Server
CRUD Operations of Data in MS SQL Server
How to Enable, Disable and Drop a Foreign Key
Popular Functions in MS SQL Server
SQL Server BETWEEN Condition (Operator)
SQL Server Comparison Operator
Create and Delete Table in MS SQL Server
SQL Server DISTINCT and GROUP BY Clause
eBooks
Interview Questions
Videos
Backup and restore a database in SQL Server
Concept of Primary Key in SQL Server
CRUD Operations of Data in MS SQL Server
How to Enable, Disable and Drop a Foreign Key
Popular Functions in MS SQL Server
SQL Server BETWEEN Condition (Operator)
SQL Server Comparison Operator
Create and Delete Table in MS SQL Server
SQL Server DISTINCT and GROUP BY Clause
Oracle DBA
Overview of Oracle Tablespace Group
Overview of Oracle Database and its Architecture
Oracle ALTER USER and DROP USER
Introduction to Oracle Data Pump Import and Export tool
Introduction to Oracle CREATE USER statement
How to use the Oracle STARTUP command to start up an Oracle Database instance
How to shut down the Oracle Database
How to Manage Tablespaces in Oracle
How To List Users within the Oracle Database
How to Grant SELECT Object Privilege on One or More Tables to a User and Unlock a User in Oracle
How to Grant All Privileges to a User in Oracle
How to Create User Profiles in Oracle
How to Create Oracle Database Links
How to Alter and Drop Roles in Oracle
Oracle PL-SQL
eBooks
Interview Questions
Videos
Date and Time Handling in PL-SQL
Constants and Literals and Operators in PL-SQL
Conditions and Loops in PL-SQL
Backup and restore a database in SQL Server
Concept of Primary Key in SQL Server
CRUD Operations of Data in MS SQL Server
How to Enable, Disable and Drop a Foreign Key
Popular Functions in MS SQL Server
SQL Server BETWEEN Condition (Operator)
SQL Server Comparison Operator
Create and Delete Table in MS SQL Server
SQL Server DISTINCT and GROUP BY Clause
SQL Server NOT Condition (Operator)
Overview of Oracle Tablespace Group
Overview of Oracle Database and its Architecture
Oracle ALTER USER and DROP USER
Introduction to Oracle Data Pump Import and Export tool
Introduction to Oracle CREATE USER statement
How to use the Oracle STARTUP command to start up an Oracle Database instance
How to shut down the Oracle Database
How to Manage Tablespaces in Oracle
How To List Users within the Oracle Database
How to Grant SELECT Object Privilege on One or More Tables to a User and Unlock a User in Oracle
How to Grant All Privileges to a User in Oracle
How to Create User Profiles in Oracle
How to Create Oracle Database Links
How to Alter and Drop Roles in Oracle
How to Alter and Drop Oracle Database Link
Introduction to PL-SQL and Environment setup
Top Oracle PL-SQL Interview Questions and Answers
Top MS-SQL Server Interview Questions and Answers
Best Approach for Storing data to AWS DynamoDB and S3 – AWS Implementation
Maintain High Availability in AWS with anticipated Additional Load
DevOps
eBooks
Interview Questions
Videos
Ansible
Articles
Overview of YAML and Ad-hoc commands in Ansible
A Detailed comparison of Ansible and Puppet
A Detailed comparison of Ansible Vs Chef
Detailed understanding of concept of Playbooks in Ansible
Deep dive into Pip module in Ansible
How to perform troubleshooting in Ansible
How to use variables in playbooks in Ansible and concept of exception handling
Overview of Ansible, its History and How to set-up Ansible on your machine
eBooks
Interview Questions
Videos
Overview of YAML and Ad-hoc commands in Ansible
A Detailed comparison of Ansible and Puppet
A Detailed comparison of Ansible Vs Chef
Detailed understanding of concept of Playbooks in Ansible
Deep dive into Pip module in Ansible
How to perform troubleshooting in Ansible
How to use variables in playbooks in Ansible and concept of exception handling
Overview of Ansible, its History and How to set-up Ansible on your machine
Chef
Articles
Chef-Client as Daemon and Chef-Shell
Concept of Libraries, Definition and setting environment variable
Concept of Lightweight Resource Provider and Blueprints in Chef
Concept of Templates and Dynamically Configuring Recipes
Dealing with Files and Software packages and Community Cookbooks
Execute Cookbook on Node and run Chef-Client
Detailed understanding of Resources in Chef
How to Set up Chef on your system
How to set up Test Kitchen Workflow
How to write Cross-Platform Cookbooks
Overview of Chef and its Architecture
Plain Ruby with Chef DSL and Ruby Gems with Recipes
Testing Cookbook with Test Kitchen
Roles in Chef and perform environment specific configuration
eBooks
Interview Questions
Videos
Chef-Client as Daemon and Chef-Shell
Concept of Libraries, Definition and setting environment variable
Concept of Lightweight Resource Provider and Blueprints in Chef
Concept of Templates and Dynamically Configuring Recipes
Dealing with Files and Software packages and Community Cookbooks
Execute Cookbook on Node and run Chef-Client
Detailed understanding of Resources in Chef
How to Set up Chef on your system
How to set up Test Kitchen Workflow
How to write Cross-Platform Cookbooks
Overview of Chef and its Architecture
Plain Ruby with Chef DSL and Ruby Gems with Recipes
Testing Cookbook with Test Kitchen
Roles in Chef and perform environment specific configuration
Docker
Articles
Concept of Docker Cloud Service
Deep dive into Docker Architecture
Concept of public repositories in Docker
Concept of Container Linking and Storage in Docker
Building a web Server Docker File
Working with Docker Toolbox and how to use the Jenkins Docker image from Docker Hub
Overview of Docker and its features
Managing ports and private registries in Docker
Instruction commands in Docker
How to work with Containers in Docker
How to Set-up MongoDB in Docker
How to set up Node.js in Docker
How to set up Kubernetes in Docker
How to set up ASP.net in Docker
How to perform Continuous integration using Jenkins in Docker
How to install Docker on Windows
eBooks
Interview Questions
Videos
Concept of Docker Cloud Service
Deep dive into Docker Architecture
Concept of public repositories in Docker
Concept of Container Linking and Storage in Docker
Building a web Server Docker File
Working with Docker Toolbox and how to use the Jenkins Docker image from Docker Hub
Overview of Docker and its features
Managing ports and private registries in Docker
Instruction commands in Docker
How to work with Containers in Docker
How to Set-up MongoDB in Docker
How to set up Node.js in Docker
How to set up Kubernetes in Docker
How to set up ASP.net in Docker
How to perform Continuous integration using Jenkins in Docker
How to install Docker on Windows
How to install Docker on Linux
Git and GitHub
eBooks
Interview Questions
Videos
Concept of Git Index and Git Head
Comparison of Git with SVN and Mercurial
Deep dive into Git Branching Model
Git Repository and How to Fork Repository
Git Terminology and General Tools
How to Clone Repository in Git
Working with Remote Repository
Version Control System and its Types
Overview of GitHub and Comparison of Git and GitHub
Overview of Git and its features
Merging Branches and Resolve conflicts in Git
How to use Git via the command line
How to switch branches without committing the current branch in Git
How to perform Rebasing in Git
How to Install Git on Linux (Ubuntu) and Mac
Jenkins
Articles
Deep dive into Metrics and Trends for builds
Server maintenance and Plugins Management in Jenkins
Perform Continuous Deployment using Jenkins
Overview of Jenkins, its History and Architecture
How to take Back-up in Jenkins using Backup plugin
How to set up Git and Maven Plugin in Jenkins
How to set up Distributed build and Automated deployment in Jenkins
How to set up Build jobs in Jenkins
How to run Remote tests using Jenkins
How to perform Notification, Reporting and Code Analysis
How to perform JUnit Testing in Jenkins
How to perform Automation Testing in Jenkins
How to install Jenkins on your system
Comparison of Jenkins with Ansible and Hudson Frameworks
Comparison of Jenkins with Bamboo and TeamCity
eBooks
Interview Questions
Videos
Deep dive into Metrics and Trends for builds
Server maintenance and Plugins Management in Jenkins
Perform Continuous Deployment using Jenkins
Overview of Jenkins, its History and Architecture
How to take Back-up in Jenkins using Backup plugin
How to set up Git and Maven Plugin in Jenkins
How to set up Distributed build and Automated deployment in Jenkins
How to set up Build jobs in Jenkins
How to run Remote tests using Jenkins
How to perform Notification, Reporting and Code Analysis
How to perform JUnit Testing in Jenkins
How to perform Automation Testing in Jenkins
How to install Jenkins on your system
Comparison of Jenkins with Ansible and Hudson Frameworks
Comparison of Jenkins with Bamboo and TeamCity
Comparison of Jenkins with GoCD and Maven Tools
Kubernetes
Articles
eBooks
Interview Questions
Videos
How to setup Kubernetes on your machine
How to Set up Kubernetes Dashboard
How to manage Deployments and Concept of Kubernetes Volume
How to achieve Autoscaling in Kubernetes cluster
Deep dive into Kubectl command line utility
Create an Application for Kubernetes deployment
Concept of Secrets, Network Policy and Kubernetes API
Concept of Replication Controller and Replica Sets
Concept of Node, Service and Pod in Kubernetes
Concept of Images and creating a Job in Kubernetes
Namespace, Labels and Selectors in Kubernetes
Overview of Kubernetes and its Architecture and components
Maven
Articles
Introduction to Maven and How to Set up Maven Environment
How to manage Maven Project in NetBeans and IntelliJ IDEA
How to manage a web-based project using Maven
How to import Maven Project in Eclipse IDE
How to create documentation of Application in Maven
How to automate the Deployment process in Maven
Deep dive into Build Automation
Creating Java Project in Maven
Concept of Project Object Model (POM) in Maven
Concept of Maven Repositories and Plugins in Maven
eBooks
Interview Questions
Videos
Introduction to Maven and How to Set up Maven Environment
How to manage Maven Project in NetBeans and IntelliJ IDEA
How to manage a web-based project using Maven
How to import Maven Project in Eclipse IDE
How to create documentation of Application in Maven
How to automate the Deployment process in Maven
Deep dive into Build Automation
Creating Java Project in Maven
Concept of Project Object Model (POM) in Maven
Concept of Maven Repositories and Plugins in Maven
Nagios
Articles
Look into Nagios Features, applications, Hosts and services and Commands
Overview of Nagios, its architecture and Nagios products
Ports and protocols and Add-ons and plugins in Nagios
Detailed understanding of Checks and States in Nagios
How to run Nagios plugins on other machines remotely using NRPE
eBooks
Interview Questions
Videos
Look into Nagios Features, applications, Hosts and services and Commands
Overview of Nagios, its architecture and Nagios products
Ports and protocols and Add-ons and plugins in Nagios
Detailed understanding of Checks and States in Nagios
How to run Nagios plugins on other machines remotely using NRPE
Puppet
Articles
Implementation of Live working demo project in Puppet
How to Set-up and configure Puppet Master
How to install and configure the r10k tool and validate the Puppet setup
How to install and configure Puppet on your machine
How to define Functions and Custom functions in Puppet
Concept of Templating in Puppet
Concept of Type and Provider in Puppet
How to create custom environment in Puppet
Detailed understanding of the architecture of Puppet, its components and the applications of Puppet
Detailed understanding of the environment conf file in Puppet
Deep Dive into Resources in Puppet
Concept of Resource Abstraction Layer (RAL) in Puppet
Concept of File Server in Puppet
Concept of Facter and Facts in Puppet
Understanding Puppet Manifest files and How to write a manifest file in Puppet
Overview of Puppet and its components and concept of configuration management
How to Set-up Puppet agent and How to sign and check for SSL Certificate
eBooks
Interview Questions
Videos
How to use RESTful APIs in Puppet
Implementation of Live working demo project in Puppet
How to Set-up and configure Puppet Master
How to install and configure the r10k tool and validate the Puppet setup
How to install and configure Puppet on your machine
How to define Functions and Custom functions in Puppet
Concept of Templating in Puppet
Concept of Type and Provider in Puppet
How to create custom environment in Puppet
Detailed understanding of the architecture of Puppet, its components and the applications of Puppet
Detailed understanding of the environment conf file in Puppet
Deep Dive into Resources in Puppet
Concept of Resource Abstraction Layer (RAL) in Puppet
Concept of File Server in Puppet
Concept of Facter and Facts in Puppet
Understanding Puppet Manifest files and How to write a manifest file in Puppet
Overview of Puppet and its components and concept of configuration management
How to Set-up Puppet agent and How to sign and check for SSL Certificate
Concept of Git Index and Git Head
Comparison of Git with SVN and Mercurial
Deep dive into Git Branching Model
Git Repository and How to Fork Repository
Git Terminology and General Tools
How to Clone Repository in Git
Deep dive into Metrics and Trends for builds
Concept of Docker Cloud Service
Server maintenance and Plugins Management in Jenkins
How to use RESTful APIs in Puppet
Implementation of Live working demo project in Puppet
Perform Continuous Deployment using Jenkins
Overview of Jenkins, its History and Architecture
How to take Back-up in Jenkins using Backup plugin
How to set up Git and Maven Plugin in Jenkins
How to set up Distributed build and Automated deployment in Jenkins
How to set up Build jobs in Jenkins
How to run Remote tests using Jenkins
How to perform Notification, Reporting and Code Analysis
How to perform JUnit Testing in Jenkins
How to perform Automation Testing in Jenkins
How to install Jenkins on your system
Deep dive into Docker Architecture
Concept of public repositories in Docker
Concept of Container Linking and Storage in Docker
Building a web Server Docker File
Comparison of Jenkins with Ansible and Hudson Frameworks
How to setup Kubernetes on your machine
How to Set up Kubernetes Dashboard
How to manage Deployments and Concept of Kubernetes Volume
How to achieve Autoscaling in Kubernetes cluster
Deep dive into Kubectl command line utility
Create an Application for Kubernetes deployment
Concept of Secrets, Network Policy and Kubernetes API
Concept of Replication Controller and Replica Sets
Concept of Node, Service and Pod in Kubernetes
Concept of Images and creating a Job in Kubernetes
How to Set-up and configure Puppet Master
How to install and configure the r10k tool and validate the Puppet setup
How to install and configure Puppet on your machine
How to define Functions and Custom functions in Puppet
Concept of Templating in Puppet
Concept of Type and Provider in Puppet
How to create custom environment in Puppet
Detailed understanding of the architecture of Puppet, its components and the applications of Puppet
Detailed understanding of the environment conf file in Puppet
Deep Dive into Resources in Puppet
Concept of Resource Abstraction Layer (RAL) in Puppet
Concept of File Server in Puppet
Concept of Facter and Facts in Puppet
Working with Docker Toolbox and how to use the Jenkins Docker image from Docker Hub
Overview of Docker and its features
Managing ports and private registries in Docker
Instruction commands in Docker
How to work with Containers in Docker
How to Set-up MongoDB in Docker
How to set up Node.js in Docker
How to set up Kubernetes in Docker
How to set up ASP.net in Docker
How to perform Continuous integration using Jenkins in Docker
How to install Docker on Windows
How to install Docker on Linux
Namespace, Labels and Selectors in Kubernetes
Comparison of Jenkins with Bamboo and TeamCity
Comparison of Jenkins with GoCD and Maven Tools
Comparison of Jenkins with Travis CI and Circle CI
Working with Remote Repository
Version Control System and its Types
Overview of GitHub and Comparison of Git and GitHub
Overview of Git and its features
Merging Branches and Resolve conflicts in Git
How to use Git via the command line
How to switch branches without committing the current branch in Git
How to perform Rebasing in Git
How to Install Git on Linux (Ubuntu) and Mac
How to create a new Blank Repository and commit code in it
Overview of Kubernetes and its Architecture and components
Monitor processes in Kubernetes
Introduction to Maven and How to Set up Maven Environment
How to manage Maven Project in NetBeans and IntelliJ IDEA
How to manage a web-based project using Maven
How to import Maven Project in Eclipse IDE
How to create documentation of Application in Maven
How to automate the Deployment process in Maven
Deep dive into Build Automation
Creating Java Project in Maven
Concept of Project Object Model (POM) in Maven
Concept of Maven Repositories and Plugins in Maven
Concept of Dependency Management in Maven
Understanding Puppet Manifest files and How to write a manifest file in Puppet
Overview of Puppet and its components and concept of configuration management
Look into Nagios Features, applications, Hosts and services and Commands
Overview of Nagios, its architecture and Nagios products
Ports and protocols and Add-ons and plugins in Nagios
Detailed understanding of Checks and States in Nagios
How to run Nagios plugins on other machines remotely using NRPE
Chef-Client as Daemon and Chef-Shell
Concept of Libraries, Definition and setting environment variable
Concept of Lightweight Resource Provider and Blueprints in Chef
Concept of Templates and Dynamically Configuring Recipes
Dealing with Files and Software packages and Community Cookbooks
Execute Cookbook on Node and run Chef-Client
Detailed understanding of Resources in Chef
How to Set up Chef on your system
How to set up Test Kitchen Workflow
How to write Cross-Platform Cookbooks
Overview of Chef and its Architecture
Plain Ruby with Chef DSL and Ruby Gems with Recipes
Testing Cookbook with Test Kitchen
How to Set-up Puppet agent and How to sign and check for SSL Certificate
Roles in Chef and perform environment specific configuration
Overview of YAML and Ad-hoc commands in Ansible
A Detailed comparison of Ansible and Puppet
A Detailed comparison of Ansible Vs Chef
Detailed understanding of concept of Playbooks in Ansible
Deep dive into Pip module in Ansible
How to perform troubleshooting in Ansible
How to use variables in playbooks in Ansible and concept of exception handling
Overview of Ansible, its History and How to set-up Ansible on your machine
Concept of Advanced Execution with Ansible
Popular DevOps and DevOps Automation Tools
Comparison between DevOps and Agile methodologies
Concept of DevOps Pipeline and Who are DevOps Engineers
Overview of DevOps and its Architecture
DevOps Training Certification and Azure and AWS DevOps
How to set-up Nagios on Ubuntu
Top Docker Interview Questions and Answers
Top Ansible Interview Questions and Answers
Top Chef Interview Questions and Answers
Top Git and GitHub Interview Questions and Answers
Top DevOps Interview Questions and Answers
Top Puppet Interview Questions and Answers
Top Nagios Interview Questions and Answers
Top Kubernetes Interview Questions and Answers
Digital Marketing
Articles
Understanding Mobile marketing
Understanding Google Analytics
Online Marketing - Web Analytics
Why do we need an SEO Friendly Website?
Concept of Pay Per Click (PPC) and Conversion Rate Optimization (CRO) explained
Online Marketing - Impact, Pros & Cons
Online Marketing - Blogs, banners and forums
Introduction to Online Marketing
Digital Marketing using Twitter and LinkedIn
Digital Marketing using Social Media and YouTube
Digital Marketing using Facebook and Pinterest
Digital Marketing using Content marketing and Email Marketing
eBooks
Interview Questions
Videos
SEO and SMM
Articles
Social Media Marketing using Blogs
Social Media Marketing using Facebook
Social Media Marketing using Google Plus
Social Media Marketing using LinkedIn
Social Media Marketing using Pinterest
Social Media Marketing using Twitter
Social Media Marketing using Video
Social Media Analysis and Monitoring Social Media Accounts
SMM - Image Optimization and Social Bookmarking
eBooks
Interview Questions
Videos
Social Media Marketing using Blogs
Social Media Marketing using Facebook
Social Media Marketing using Google Plus
Social Media Marketing using LinkedIn
Social Media Marketing using Pinterest
Social Media Marketing using Twitter
Social Media Marketing using Video
Social Media Analysis and Monitoring Social Media Accounts
SMM - Image Optimization and Social Bookmarking
Social Media Marketing using Blogs
Social Media Marketing using Facebook
Social Media Marketing using Google Plus
Social Media Marketing using LinkedIn
Social Media Marketing using Pinterest
Social Media Marketing using Twitter
Social Media Marketing using Video
Understanding Mobile marketing
Understanding Google Analytics
Online Marketing - Web Analytics
Why do we need an SEO Friendly Website?
Concept of Pay Per Click (PPC) and Conversion Rate Optimization (CRO) explained
Online Marketing - Impact, Pros & Cons
Online Marketing - Blogs, banners and forums
Introduction to Online Marketing
Digital Marketing using Twitter and LinkedIn
Digital Marketing using Social Media and YouTube
Digital Marketing using Facebook and Pinterest
Digital Marketing using Content marketing and Email Marketing
Overview of Digital Marketing and SEO
Social Media Analysis and Monitoring Social Media Accounts
SMM - Image Optimization and Social Bookmarking
SEO Strategy to Optimize Keywords and Metatags
Affiliate Marketing and Email Marketing
Frontend Development
Articles
eBooks
Interview Questions
Videos
Angular JS
Articles
Create Angular Application and Angular MVC Architecture
Custom Directives in Angular JS
Dependency Injection in Angular JS
Directives and Filters in Angular JS
Embedding HTML Pages within an HTML Page
Expressions and Controllers in Angular JS
How to create Forms in Angular JS
How to create Single Page Application via multiple views
Internationalization in Angular JS
Services Architecture in Angular JS
Spring Angular CRUD Application
Spring Angular Login & Logout Application
Spring Angular Search Field Application
Tables and HTML DOM in Angular JS
Using Directives and Expressions in Angular JS
eBooks
Interview Questions
Videos
Create Angular Application and Angular MVC Architecture
Custom Directives in Angular JS
Dependency Injection in Angular JS
Directives and Filters in Angular JS
Embedding HTML Pages within an HTML Page
Expressions and Controllers in Angular JS
How to create Forms in Angular JS
How to create Single Page Application via multiple views
Internationalization in Angular JS
Services Architecture in Angular JS
Spring Angular CRUD Application
Spring Angular Login & Logout Application
Spring Angular Search Field Application
Tables and HTML DOM in Angular JS
Using Directives and Expressions in Angular JS
React JS
Articles
Comparison Between AngularJS and ReactJS
How to implement flux pattern in React Applications
How to Animate elements using React
Error Handling using Error Boundaries
Environment Setup for React JS
Component Life Cycle Methods in React JS
Comparison between ReactJS and React Native
Overview of ReactJS and its Features
Overview of React Redux with an example
How to set up Router for an app
Using Refs and Keys in React JS
eBooks
Interview Questions
Videos
Comparison Between AngularJS and ReactJS
How to implement flux pattern in React Applications
How to Animate elements using React
Error Handling using Error Boundaries
Environment Setup for React JS
Component Life Cycle Methods in React JS
Comparison between ReactJS and React Native
Overview of ReactJS and its Features
Overview of React Redux with an example
How to set up Router for an app
Using Refs and Keys in React JS
Create Angular Application and Angular MVC Architecture
Custom Directives in Angular JS
Dependency Injection in Angular JS
Directives and Filters in Angular JS
Embedding HTML Pages within an HTML Page
Expressions and Controllers in Angular JS
How to create Forms in Angular JS
How to create Single Page Application via multiple views
Internationalization in Angular JS
Services Architecture in Angular JS
Spring Angular CRUD Application
Spring Angular Login & Logout Application
Spring Angular Search Field Application
Tables and HTML DOM in Angular JS
Using Directives and Expressions in Angular JS
How to Setup AngularJS Environment
Comparison Between AngularJS and ReactJS
How to implement flux pattern in React Applications
How to Animate elements using React
Error Handling using Error Boundaries
Environment Setup for React JS
Component Life Cycle Methods in React JS
Comparison between ReactJS and React Native
Overview of ReactJS and its Features
Overview of React Redux with an example
How to set up Router for an app
Using Refs and Keys in React JS
Understanding ReactJS Components
Top React JS Interview Questions and Answers
IOT
Articles
IoT project of controlling home light using WiFi NodeMCU and Relay module
IoT project of Sonar system using Ultrasonic Sensor HC-SR04 and Arduino device
IoT project of Temperature and Pressure measurement using Pressure sensor BMP180 and Arduino device
IoT (Internet of Things) Project: Google Firebase controlling LED with NodeMCU
IoT link Communication Protocol
IoT Decision Framework and Architecture
IoT in Energy and Biometrics Domain
IoT in Security Camera and Smart Home
IoT in Smart Agriculture and Healthcare Domain
IoT Network Layer and Session Layer Protocols
IoT – Platform and ThingWorx in IoT
IoT Project Google Firebase controlling LED using Android App
IoT Project: Google Firebase using NodeMCU ESP8266
Overview of Internet of Things (IoT)
CISCO Virtualized Packet Zone and Salesforce in IoT
Embedded Devices (System) in (IoT) and IoT Ecosystem
GE Predix Platform and Eclipse IoT
How is IoT transforming businesses and IoT in transportation
eBooks
Interview Questions
Videos
IoT project of controlling home light using WiFi NodeMCU and Relay module
IoT project of Sonar system using Ultrasonic Sensor HC-SR04 and Arduino device
IoT project of Temperature and Pressure measurement using Pressure sensor BMP180 and Arduino device
IoT (Internet of Things) Project: Google Firebase controlling LED with NodeMCU
IoT link Communication Protocol
IoT Decision Framework and Architecture
IoT in Energy and Biometrics Domain
IoT in Security Camera and Smart Home
IoT in Smart Agriculture and Healthcare Domain
IoT Network Layer and Session Layer Protocols
IoT – Platform and ThingWorx in IoT
IoT Project Google Firebase controlling LED using Android App
IoT Project: Google Firebase using NodeMCU ESP8266
Overview of Internet of Things (IoT)
CISCO Virtualized Packet Zone and Salesforce in IoT
Embedded Devices (System) in (IoT) and IoT Ecosystem
GE Predix Platform and Eclipse IoT
How is IoT transforming businesses and IoT in transportation
Internet of Things – Contiki and Security Flaws
Internet of Things – Security and Identity Protection
Top Internet of Things (IoT) Interview Questions and Answers
Mobile Development
Articles
eBooks
Interview Questions
Videos
Operating Systems
Articles
eBooks
Interview Questions
Videos
Programming and Frameworks
Articles
Cookies in Laravel based web applications
Encryption and Hashing in Laravel
How to create Blade Templates Layout
How to Create Façade in Laravel
How to perform Redirections and connect to Database
Installation Process of Laravel
Introduction to Laravel and its History
Laravel vs CodeIgniter and Laravel vs Symfony
Laravel vs Django and Laravel vs WordPress
Middleware Mechanism in Laravel
Process of Authentication and Authorization in Laravel
Responses in Laravel web applications
Understanding Release Process in Laravel
How to setup Check/Money Order payment method in Magento 2
Dynamic Content Handling in PHP
eBooks
Interview Questions
Videos
Hibernate and Spring
Articles
How to use Node Package Manager and REPL Terminal
Handling GET and POST Request in NodeJS
Using Sessions and POJO Classes in Hibernate
Transaction Management in Spring
Overview and Architecture of Spring Framework
ORM Overview and Overview of Hibernate
IoC Containers, AOP and JDBC Framework in Spring
Injecting Inner Beans and Collections in Spring
How to use Criteria Queries in Hibernate
How to perform Java Based Configuration in Spring
How to Install Hibernate and its Configuration
eBooks
Interview Questions
Videos
How to use Node Package Manager and REPL Terminal
Handling GET and POST Request in NodeJS
Using Sessions and POJO Classes in Hibernate
Transaction Management in Spring
Overview and Architecture of Spring Framework
ORM Overview and Overview of Hibernate
IoC Containers, AOP and JDBC Framework in Spring
Injecting Inner Beans and Collections in Spring
How to use Criteria Queries in Hibernate
How to perform Java Based Configuration in Spring
How to Install Hibernate and its Configuration
Java
Articles
Variables and Keywords in Java
Transaction Management and Batch Processing in JDBC
StringBuffer and StringBuilder Class in Java
String Vs StringBuffer Vs StringBuilder
Stream API Improvement in Java 9
Static Binding and Dynamic Binding and Final Keyword
Serialization and Reflection in Java
Properties class and Generics in Java
Method Parameter Reflection in Java
Java StringJoiner and ArrayList Vs Vector
Java Queue and Deque Interface
Java Parallel Array Sorting and Type Inference
Java Networking and Socket Programming
Java Nested Interface and Method Overloading and Overriding
Java Method References and Functional Interfaces
Java Garbage Collection and Java Runtime Class
Java forEach loop and Collectors
Java Comments and Naming Conventions
Java 9 Process API Improvement
Java 9 Module System and Control Panel
Java 9 Anonymous Inner Classes Improvement and SafeVarargs Annotation
Introduction to Java and History of Java
Inter-thread communication and Deadlock in Java
How to write the Hello World Java program
How to create Immutable class in Java
Features of Java and C++ Vs Java
Exception Handling with Method Overriding in Java
Difference between JDK, JRE, and JVM
Deep Dive into Threads in Java
Deep Dive into LinkedList in Java
Deep dive into LinkedHashMap and TreeMap
Deep dive into HashSet, LinkedHashSet and TreeSet
Deep Dive into HashMap in Java
Deep Dive into ArrayList in Java
Conditional Statements in Java
Concept of Method Overloading and Method Overriding in Java
Concept of Inheritance and Aggregation in Java
Comparable and Comparator interface in Java
eBooks
Interview Questions
Videos
Variables and Keywords in Java
Transaction Management and Batch Processing in JDBC
StringBuffer and StringBuilder Class in Java
String Vs StringBuffer Vs StringBuilder
Stream API Improvement in Java 9
Static Binding and Dynamic Binding and Final Keyword
Serialization and Reflection in Java
Properties class and Generics in Java
Method Parameter Reflection in Java
Java StringJoiner and ArrayList Vs Vector
Java Queue and Deque Interface
Java Parallel Array Sorting and Type Inference
Java Networking and Socket Programming
Java Nested Interface and Method Overloading and Overriding
Java Method References and Functional Interfaces
Java Garbage Collection and Java Runtime Class
Java forEach loop and Collectors
Java Comments and Naming Conventions
Java 9 Process API Improvement
Java 9 Module System and Control Panel
Java 9 Anonymous Inner Classes Improvement and SafeVarargs Annotation
Introduction to Java and History of Java
Inter-thread communication and Deadlock in Java
How to write the Hello World Java program
How to create Immutable class in Java
Features of Java and C++ Vs Java
Exception Handling with Method Overriding in Java
Difference between JDK, JRE, and JVM
Deep Dive into Threads in Java
Deep Dive into LinkedList in Java
Deep dive into LinkedHashMap and TreeMap
Deep dive into HashSet, LinkedHashSet and TreeSet
Deep Dive into HashMap in Java
Deep Dive into ArrayList in Java
Conditional Statements in Java
Concept of Method Overloading and Method Overriding in Java
Concept of Inheritance and Aggregation in Java
Comparable and Comparator interface in Java
JSP
eBooks
Interview Questions
Videos
Laravel
Articles
Understanding Release Process in Laravel
Responses in Laravel web applications
Process of Authentication and Authorization in Laravel
Middleware Mechanism in Laravel
Laravel vs Django and Laravel vs WordPress
Laravel vs CodeIgniter and Laravel vs Symfony
Introduction to Laravel and its History
Installation Process of Laravel
How to perform Redirections and connect to Database
How to Create Façade in Laravel
How to create Blade Templates Layout
Encryption and Hashing in Laravel
Cookies in Laravel based web applications
Contracts and CSRF Protection in Laravel
Available Validation Rules of Laravel
eBooks
Interview Questions
Videos
Understanding Release Process in Laravel
Responses in Laravel web applications
Process of Authentication and Authorization in Laravel
Middleware Mechanism in Laravel
Laravel vs Django and Laravel vs WordPress
Laravel vs CodeIgniter and Laravel vs Symfony
Introduction to Laravel and its History
Installation Process of Laravel
How to perform Redirections and connect to Database
How to Create Façade in Laravel
How to create Blade Templates Layout
Encryption and Hashing in Laravel
Cookies in Laravel based web applications
Contracts and CSRF Protection in Laravel
Available Validation Rules of Laravel
Magento
Articles
Architecture of Magento 2 and Product Overview
How to use the multi language feature of Magento
How to Setup System Theme, Page Title, Layout and New Pages in Magento
How to Setup Shipping Rates and Payment Plans in Magento
How to setup shipping methods in Magento 2
How to Setup PayPal Payment and Google Checkout in Magento
How to Setup Newsletter in Magento
How to Setup Google Analytics, YouTube Videos and Facebook Likes in Magento
How to setup Check/Money Order payment method in Magento 2
How to set up Zero Subtotal Checkout payment method in Magento 2
How to set up the tax rules, tax rates, and tax zones in Magento 2
How to set up Purchase Order (PO) payment method in Magento 2
How to set up Order Emails in Magento 2
How to set up multiple websites, stores, and store views in Magento 2
How to Set up Contact, Categories, Products and Inventory in Magento
How to set up Cash on Delivery (COD) payment method in Magento 2
How to set up Bank Transfer payment method in Magento 2
How to set up Authorize.net method in Magento 2
How to Manage Tax Classes in Magento
How to Install Magento on your system
How to install Magento 2 using Composer
How to install Magento 2 on Windows
Ways for Site Optimization in Magento
Store Configuration in Magento 2
Search Engine Optimization in Magento 2
Products and their Types in Magento 2
Overview of Magento and its Features
Orders Life Cycle in Magento 2
Basic Configuration in Magento 2
Create and Manage CMS (Content Management System) in Magento 2
How to add the product on Home page in Magento 2
How to configure and Manage the Inventory in Magento 2
How to create Attribute Sets in Magento 2
How to create Product Attributes in Magento 2
How to create Product Category in Magento 2
eBooks
Interview Questions
Videos
Architecture of Magento 2 and Product Overview
How to use the multi language feature of Magento
How to Setup System Theme, Page Title, Layout and New Pages in Magento
How to Setup Shipping Rates and Payment Plans in Magento
How to setup shipping methods in Magento 2
How to Setup PayPal Payment and Google Checkout in Magento
How to Setup Newsletter in Magento
How to Setup Google Analytics, YouTube Videos and Facebook Likes in Magento
How to setup Check/Money Order payment method in Magento 2
How to set up Zero Subtotal Checkout payment method in Magento 2
How to set up the tax rules, tax rates, and tax zones in Magento 2
How to set up Purchase Order (PO) payment method in Magento 2
How to set up Order Emails in Magento 2
How to set up multiple websites, stores, and store views in Magento 2
How to Set up Contact, Categories, Products and Inventory in Magento
How to set up Cash on Delivery (COD) payment method in Magento 2
How to set up Bank Transfer payment method in Magento 2
How to set up Authorize.net method in Magento 2
How to Manage Tax Classes in Magento
How to Install Magento on your system
How to install Magento 2 using Composer
How to install Magento 2 on Windows
Ways for Site Optimization in Magento
Store Configuration in Magento 2
Search Engine Optimization in Magento 2
Products and their Types in Magento 2
Overview of Magento and its Features
Orders Life Cycle in Magento 2
Basic Configuration in Magento 2
Create and Manage CMS (Content Management System) in Magento 2
How to add the product on Home page in Magento 2
How to configure and Manage the Inventory in Magento 2
How to create Attribute Sets in Magento 2
How to create Product Attributes in Magento 2
How to create Product Category in Magento 2
NodeJS
Articles
Scaffolding and Middleware in ExpressJS
Overview of ExpressJS, installation and Request-response model
NodeJS environment setup and Creating First Application
How to scale application in NodeJS and concept of packaging
Event Driven Programming in NodeJS
Cookies Management, Routing and Template Engine in ExpressJS
eBooks
Interview Questions
Videos
Scaffolding and Middleware in ExpressJS
Overview of ExpressJS, installation and Request-response model
NodeJS environment setup and Creating First Application
How to scale application in NodeJS and concept of packaging
Event Driven Programming in NodeJS
Cookies Management, Routing and Template Engine in ExpressJS
PHP
Articles
Variable Types and Constant Types in PHP
Operations in MySQL DB using PHP
Object Oriented Programming in PHP
Login with Facebook and PayPal Integration in PHP
Dynamic Content Handling in PHP
How to Install PHP on your system
How to access information from DB using PHP and AJAX
Error and Exception Handling in PHP
CRUD operations in MySQL DB using PHP
eBooks
Interview Questions
Videos
Variable Types and Constant Types in PHP
Operations in MySQL DB using PHP
Object Oriented Programming in PHP
Login with Facebook and PayPal Integration in PHP
Dynamic Content Handling in PHP
How to Install PHP on your system
How to access information from DB using PHP and AJAX
Error and Exception Handling in PHP
CRUD operations in MySQL DB using PHP
Python
Articles
Variable Types and Basic Operators in Python
Time Series, Geographical and Graph Data in Python
Sending Email using SMTP in Python
Processing CSV, JSON and XLS Data in Python
MySQL Database Access in Python
Multithreaded Programming in Python
Introduction to Python and Installing Python
How to draw different Charts in Python
Handling Relational and NoSQL Databases in Python
Handling Date and Time in Python
Extension Programming with C in Python
Data Wrangling and Data Aggregations in Python
Data Science Libraries in Python
eBooks
Interview Questions
Videos
Variable Types and Basic Operators in Python
Time Series, Geographical and Graph Data in Python
Sending Email using SMTP in Python
Processing CSV, JSON and XLS Data in Python
MySQL Database Access in Python
Multithreaded Programming in Python
Introduction to Python and Installing Python
How to draw different Charts in Python
Handling Relational and NoSQL Databases in Python
Handling Date and Time in Python
Extension Programming with C in Python
Data Wrangling and Data Aggregations in Python
Data Science Libraries in Python
Servlet
eBooks
Interview Questions
Videos
Spring Boot
Articles
How to write a Scheduler in Spring applications and CORS Support
Service Components in Spring Boot
Tracing Micro Service Logs in Spring Boot
How to perform Bootstrapping on a Spring Boot application
How to use Spring Boot JDBC driver connection to connect to the database
How to write a unit test case by using Mockito and Web Controller
Spring Boot - Code Structure and Build Systems
Spring Boot - Enabling Swagger2
Spring Boot - Google Cloud Platform
Spring Boot - Rest Controller Unit Test
Spring Boot - Securing Web Applications
Spring Boot - Tomcat Deployment
Spring Boot Architecture and Why Spring Boot is used
Spring Boot Security mechanisms and OAuth2 with JWT
Spring Vs Spring Boot Vs Spring MVC
Application Properties in Spring Boot
How to implement the SMS sending and making voice calls by using Spring Boot with Twilio
Building RESTful Web Services using Spring Boot
Consuming RESTful Web Services by using jQuery AJAX
Create a Web Application in Spring Boot using Thymeleaf
Creating Servlet Filter using Spring Boot
Exception Handling in Spring Boot
File Handling using Spring Boot
How to add the Google OAuth2 Sign-In by using Spring Boot application with Gradle build
How to build a Eureka Server using Spring Boot
How to build an interactive web application by using Spring Boot with Web sockets
How to Build Spring Boot Admin Server and Client
How to Create Applications that consume Restful Web Services
How to Create Spring Cloud Configuration Server
How to Configure Flyway Database in your Spring Boot application
How to create a Docker Image using Maven and Gradle
How to create a Spring Boot Application using Maven and Gradle
How to create Zuul Proxy Server application in Spring Boot
How to implement the Apache Kafka in Spring Boot application
eBooks
Interview Questions
Videos
How to write a Scheduler on the Spring applications and CORS Support
Service Components in Spring Boot
Tracing Micro Service Logs in Spring Boot
How to perform Bootstrapping on a Spring Boot application
How to use Spring Boot JDBC driver connection to connect the database
How to write a unit test case by using Mockito and Web Controller
Spring Boot - Code Structure and Build Systems
Spring Boot - Enabling Swagger2
Spring Boot - Google Cloud Platform
Spring Boot - Rest Controller Unit Test
Spring Boot - Securing Web Applications
Spring Boot - Tomcat Deployment
Spring Boot Architecture and Why Spring Boot is used
Spring Boot Security mechanisms and OAuth2 with JWT
Spring Vs Spring Boot Vs Spring MVC
Application Properties in Spring Boot
How to implement the SMS sending and making voice calls by using Spring Boot with Twilio
Building RESTful Web Services using Spring Boot
Consuming RESTful Web Services by using jQuery AJAX
Create a Web Application in Spring Boot using Thymeleaf
Creating Servlet Filter using Spring Boot
Exception Handling in Spring Boot
File Handling using Spring Boot
How to add the Google OAuth2 Sign-In by using Spring Boot application with Gradle build
How to build a Eureka Server using Spring Boot
How to build an interactive web application by using Spring Boot with Web sockets
How to Build Spring Boot Admin Server and Client
How to Create Applications that consume Restful Web Services
How to Create Spring Cloud Configuration Server
How to Configure Flyway Database in your Spring Boot application
How to create a Docker Image using Maven and Gradle
How to create a Spring Boot Application using Maven and Gradle
How to create Zuul Proxy Server application in Spring Boot
How to implement the Apache Kafka in Spring Boot application
Variable Types and Basic Operators in Python
Time Series, Geographical and Graph Data in Python
Sending Email using SMTP in Python
Processing CSV, JSON and XLS Data in Python
MySQL Database Access in Python
Multithreaded Programming in Python
Introduction to Python and Installing Python
How to draw different Charts in Python
Handling Relational and NoSQL Databases in Python
Handling Date and Time in Python
Extension Programming with C in Python
Data Wrangling and Data Aggregations in Python
Data Science Libraries in Python
Calendar and Date and Time in Python
Scaffolding and Middleware in ExpressJS
Overview of expressJS, installation and Request-response model
NodeJS environment setup and Creating First Application
How to use Node Package Manager and REPL Terminal
How to scale application in NodeJS and concept of packaging
Handling GET and POST Request in NodeJS
Event Driven Programming in NodeJS
Cookies Management, Routing and Template Engine in ExpressJS
Concept of Callbacks and Streams in NodeJS
Comparison of NodeJS with other programming languages
Using Sessions and POJO Classes in Hibernate
Transaction Management in Spring
Overview and Architecture of Spring Framework
ORM Overview and Overview of Hibernate
IoC Containers, AOP and JDBC Framework in Spring
Architecture of Magento 2 and Product Overview
Injecting Inner Beans and Collections in Spring
How to use Criteria Queries in Hibernate
How to perform Java Based Configuration in Spring
How to Install Hibernate and its Configuration
Environment Setup for Spring Framework
Variables and Keywords in Java
Transaction Management and Batch Processing in JDBC
StringBuffer and StringBuilder Class in Java
String Vs StringBuffer Vs StringBuilder
Stream API Improvement in Java 9
Static Binding and Dynamic Binding and Final Keyword
Serialization and Reflection in Java
Properties class and Generics in Java
Method Parameter Reflection in Java
Java StringJoiner and ArrayList Vs Vector
Java Queue and Deque Interface
Java Parallel Array Sorting and Type Inference
Java Networking and Socket Programming
Java Nested Interface and Method Overloading and Overriding
Java Method References and Functional Interfaces
Java Garbage Collection and Java Runtime Class
Java forEach loop and Collectors
Java Comments and Naming Conventions
Java 9 Process API Improvement
Java 9 Module System and Control Panel
Java 9 Anonymous Inner Classes Improvement and SafeVarargs Annotation
Introduction to Java and History of Java
Inter-thread communication and Deadlock in Java
How to write the Hello World Java program
How to create Immutable class in Java
Features of Java and C++ Vs Java
ExceptionHandling with MethodOverriding in Java
Difference between JDK, JRE, and JVM
Deep Dive into Threads in Java
Deep Dive into LinkedList in Java
Deep dive into LinkedHashMap and TreeMap
Deep dive into HashSet , LinkedHashSet and TreeSet
Deep Dive into HashMap in Java
Deep Dive into ArrayList in Java
Conditional Statements in Java
Concept of Method Overloading and Method Overriding in Java
Concept of Inheritance and Aggregation in Java
Comparable and Comparator interface in Java
Call by Value and Call by Reference in Java
Cookies in Laravel based web applications
Encryption and Hashing in Laravel
How to create Blade Templates Layout
How to Create Façade in Laravel
How to perform Redirections and connect to Database
Installation Process of Laravel
Introduction to Laravel and its History
Laravel vs CodeIgniter and Laravel Vs Symphony
Laravel vs Django and Laravel vs WordPress
Middleware Mechanism in Laravel
Process of Authentication and Authorization in Laravel
Responses in Laravel web applications
Understanding Release Process in Laravel
How to setup Check/Money Order payment method in Magento 2
Dynamic Content Handling in PHP
Object Oriented Programming in PHP
How to use the multi language feature of Magento
How to Setup System Theme, Page Title, Layout and New Pages in Magento
How to Setup Shipping Rates and Payment Plans in Magento
How to setup shipping methods in Magento 2
How to Setup Paypal Payment and Google checkout in Magento
How to Setup Newsletter in Magento
How to Setup Google Analytics Youtube Videos and Facebook Likes in Magento
How to setup Check/Money Order payment method in Magento 2
How to set up Zero Subtotal Checkout payment method in Magento 2
How to set up the tax rules, tax rates, and tax zones in Magento 2
How to set up Purchase Order (PO) payment method in Magento 2
How to set up Order Emails in Magento 2
How to set up multiple websites, stores, and store views in Magento 2
How to Set up Contact, Categories, Products and Inventory in Magento
How to set up Cash on Delivery (COD) payment method in Magento 2
How to set up Bank Transfer payment method in Magento 2
How to set up Authorize.net method in Magento 2
How to Manage Tax Classes in Magento
How to Install Magento on your system
How to install Magento 2 using Composer
How to install Magento 2 on windows
Ways for Site Optimization in Magento
Store Configuration in Magento 2
Search Engine Optimization in Magento 2
Products and their Types in Magento 2
Overview of Magento and its Features
Orders Life Cycle in Magento 2
Ways for Site Optimization in Magento
Basic Configuration in Magento 2
Create and Manage CMS (Content Management System) in Magento 2
How to add the product on Home page in Magento 2
How to configure and Manage the Inventory in Magento 2
How to create Attribute Sets in Magento 2
How to create Product Attributes in Magento 2
How to create Product Category in Magento 2
How to generate Order Report in Magento 2
How to create Product in Magento 2
Variable Types and Constant Types in PHP
Operations in MySQL DB using PHP
Object Oriented Programming in PHP
Login with Facebook and Paypal Integration in PHP
Dynamic Content Handling in PHP
How to Install PHP on your system
How to access information from DB using PHP and AJAX
Error and Exception Handling in PHP
CRUD operations in MySQL DB using PHP
Handling Arrays and Strings in PHP
Standard Tag Library (JSTL) in JSP
Page Redirecting and Hits Counter and Auto Refresh
Overview of Java Server Pages and its Architecture
How to Access Database with JSP
Servlets - Server HTTP Response
Servlets - Page Redirection and Auto Refresh
Internationalization in Servlets
Handling Date and Time using Servlets
Exception Handling in Servlets
Overview of Servlets and setup of Environment
How to write a Scheduler on the Spring applications and CORS Support
Service Components in Spring Boot
Tracing Micro Service Logs in Spring Boot
How to perform Bootstrapping on a Spring Boot application
How to use Spring Boot JDBC driver connection to connect the database
How to write a unit test case by using Mockito and Web Controller
Spring Boot - Code Structure and Build Systems
Spring Boot - Enabling Swagger2
Spring Boot - Google Cloud Platform
Spring Boot - Rest Controller Unit Test
Spring Boot - Securing Web Applications
Spring Boot - Tomcat Deployment
Spring Boot Architecture and Why Spring Boot is used
Spring Boot Security mechanisms and OAuth2 with JWT
Spring Vs Spring Boot Vs Spring MVC
Application Properties in Spring Boot
How to implement the SMS sending and making voice calls by using Spring Boot with Twilio
Building RESTful Web Services using Spring Boot
Consuming RESTful Web Services by using jQuery AJAX
Create a Web Application in Spring Boot using Thymeleaf
Creating Servlet Filter using Spring Boot
Exception Handling in Spring Boot
File Handling using Spring Boot
How to add the Google OAuth2 Sign-In by using Spring Boot application with Gradle build
How to build a Eureka Server using Spring Boot
How to build an interactive web application by using Spring Boot with Web sockets
How to Build Spring Boot Admin Server and Client
How to Create Applications that consume Restful Web Services
How to Create Spring Cloud Configuration Server
How to Configure Flyway Database in your Spring Boot application
How to create a Docker Image using Maven and Gradle
How to create a Spring Boot Application using Maven and Gradle
How to create Zuul Proxy Server application in Spring Boot
How to implement the Apache Kafka in Spring Boot application
How to implement the Internationalization in Spring Boot
How to implement the Hystrix in a Spring Boot application
Login with Facebook and Paypal Integration in PHP
Understanding Release Process in Laravel
Responses in Laravel web applications
Process of Authentication and Authorization in Laravel
Middleware Mechanism in Laravel
Laravel vs Django and Laravel vs WordPress
Laravel vs CodeIgniter and Laravel Vs Symphony
Introduction to Laravel and its History
Installation Process of Laravel
How to perform Redirections and connect to Database
How to Create Façade in Laravel
How to create Blade Templates Layout
Encryption and Hashing in Laravel
Cookies in Laravel based web applications
Contracts and CSRF Protection in Laravel
Available Validation Rules of Laravel
Artisan Console for interaction in Laravel
Application Structure of Laravel
Expression Language (EL) in JSP
Expression Language (EL) in JSP
Project Management and Methodologies
Articles
eBooks
Interview Questions
Videos
Robotic Process Automation
eBooks
Interview Questions
Videos
RPA-UiPath
Articles
eBooks
Interview Questions
Videos
Working of RPA and its Services
Understanding User Interface Components
UiPath Studio - Workflow Design
RPA Use Cases and Applications
RPA Life Cycle and Implementation
Recording using UiPath in Detail
Keyboard Shortcuts and Customization in UiPath Studio
Key Basics of UiPath and the related concepts
Installation of UiPath on your local system
How to work with Automation Projects in UiPath and their Debugging methods
How to deal and work with variables and arguments in UiPath
Data Scraping and Screen Scraping in UiPath
Comparison of RPA and AI, Test Automation and Traditional Automation
Architecture and Components of RPA
Advantages and drawbacks of RPA
Top Robotic Process Automation (RPA) with UiPath Interview Questions and Answers
Salesforce
Articles
Different Levels of Data Access in Salesforce
Variables & Formulas in Salesforce
Using Records, Fields and Tables in Salesforce
Using Forms and List Controllers in Salesforce
Creating Static Resources in Salesforce
Standard and Custom Objects in Salesforce platform
Overview of Salesforce and its architecture
Master Detail Relationship in Salesforce
Lookup Relationship in Salesforce
How to Import Data in Salesforce
How to Export Data from Salesforce
How to Define Sharing Rules in Salesforce
How to create Visual force Pages in Salesforce
How to create Reports and Dashboards in Salesforce
Get Started with Salesforce - Environment
How to Create a Role Hierarchy in Salesforce
eBooks
Interview Questions
Videos
Apex Programming
Articles
Classes and Methods in Apex programming language
Concept of Objects and Interfaces in Apex programming language
Database Methods and process of executing the Apex class in Salesforce
Deployment in Salesforce using Sandbox
Enterprise Application Development Example
How to Perform Debugging in Apex
How to perform the various Database Modification Functionalities in Salesforce
How to perform Unit Testing in Apex
Overview of Apex Programming and its environment
Search Functionality using SOSL and SOQL
Understand Batch Processing in Salesforce Apex
Understanding deciding, Loops and Collections in Apex
Understanding Governor Limits in Salesforce Apex
Understanding the info Types and variables in Apex programming language
Understanding the environment for Salesforce Apex development
Understanding the String Manipulation, Arrays and Constants in Apex programming language
eBooks
Interview Questions
Videos
Classes and Methods in Apex programming language
Concept of Objects and Interfaces in Apex programming language
Database Methods and process of executing the Apex class in Salesforce
Deployment in Salesforce using Sandbox
Enterprise Application Development Example
How to Perform Debugging in Apex
How to perform the various Database Modification Functionalities in Salesforce
How to perform Unit Testing in Apex
Overview of Apex Programming and its environment
Search Functionality using SOSL and SOQL
Understand Batch Processing in Salesforce Apex
Understanding deciding, Loops and Collections in Apex
Understanding Governor Limits in Salesforce Apex
Understanding the info Types and variables in Apex programming language
Understanding the environment for Salesforce Apex development
Understanding the String Manipulation, Arrays and Constants in Apex programming language
Different Levels of Data Access in Salesforce
Variables & Formulas in Salesforce
Using Records, Fields and Tables in Salesforce
Using Forms and List Controllers in Salesforce
Creating Static Resources in Salesforce
Standard and Custom Objects in Salesforce platform
Overview of Salesforce and its architecture
Master Detail Relationship in Salesforce
Lookup Relationship in Salesforce
How to Import Data in Salesforce
How to Export Data from Salesforce
How to Define Sharing Rules in Salesforce
How to create Visual force Pages in Salesforce
How to create Reports and Dashboards in Salesforce
Get Started with Salesforce - Environment
How to Create a Role Hierarchy in Salesforce
Classes and Methods in Apex programming language
Concept of Objects and Interfaces in Apex programming language
Database Methods and process of executing the Apex class in Salesforce
Deployment in Salesforce using Sandbox
Enterprise Application Development Example
How to Perform Debugging in Apex
How to perform the various Database Modification Functionalities in Salesforce
How to perform Unit Testing in Apex
Overview of Apex Programming and its environment
Search Functionality using SOSL and SOQL
Understand Batch Processing in Salesforce Apex
Understanding deciding, Loops and Collections in Apex
Understanding Governor Limits in Salesforce Apex
Understanding the info Types and variables in Apex programming language
Understanding the environment for Salesforce Apex development
Understanding the String Manipulation, Arrays and Constants in Apex programming language
Using Formula Fields in Salesforce
SAP
Articles
unv Universe in SAP Business Object
Using Formula Bar and Universe Operations in SAP Universe Designer
Using LOVs and Create, Edit and Save a Universe
How to Display Financial Tables in SAP Simple Finance
Concept of Period Lock Transaction in SAP Simple Finance
Concept of Asset Scrapping in SAP Simple Finance
Create Default Account Assignment in SAP Simple Finance
How to Create a Primary Cost in G-L Account
Asset Accounting in SAP Simple Finance
Concept of Integrated Business Planning and Integration of Simple Finance with other Modules
eBooks
Interview Questions
Videos
SAP Business Object
Articles
Using Filters in SAP BO Analysis
Sheets and Sharing Workspaces in SAP BO Analysis
Perform Conditional Formatting in SAP BO Analysis
Overview of SAP Business Object Analysis
How to create a Workspace in SAP Business Objects
How to Connect to SAP BW in SAP Business Objects
Export Options in SAP BO Analysis
Concept of Sub Analysis in SAP BO
eBooks
Interview Questions
Videos
Using Filters in SAP BO Analysis
Sheets and Sharing Workspaces in SAP BO Analysis
Perform Conditional Formatting in SAP BO Analysis
Overview of SAP Business Object Analysis
How to create a Workspace in SAP Business Objects
How to Connect to SAP BW in SAP Business Objects
Export Options in SAP BO Analysis
Concept of Sub Analysis in SAP BO
Calculations in SAP BO Analysis
SAP Hana
Articles
Alert Monitoring and Logging in SAP Hana
Authentications and Authorization Methods in SAP HANA
DXC Replication Method and CTL Method and MDX provider in SAP Hana
Excel Integration with SAP Hana and Bi 4.0 Connectivity to Hana Views
User Administration & Role Management and Security Overview in SAP Hana
Usage of SQL Script in SAP Hana
SQL Triggers, Synonym and Data Profiling in SAP Hana
SQL Overview and Data Types in SAP Hana
SQL Functions and Operators in SAP Hana
SQL Expressions, Stored Procedures and Sequences in SAP Hana
Packages and Attribute and Analytic View in SAP Hana
Modeling and Schemas in SAP HANA
Log Based and ETL Based Replication in SAP Hana
License Management and Auditing in SAP Hana
Information Modeler and System Monitor in SAP HANA
High Availability and Backup and Recovery in SAP Hana
Export and Import Options in Sap Hana
eBooks
Videos
Alert Monitoring and Logging in SAP Hana
Authentications and Authorization Methods in SAP HANA
DXC Replication Method and CTL Method and MDX provider in SAP Hana
Excel Integration with SAP Hana and Bi 4.0 Connectivity to Hana Views
User Administration & Role Management and Security Overview in SAP Hana
Usage of SQL Script in SAP Hana
SQL Triggers, Synonym and Data Profiling in SAP Hana
SQL Overview and Data Types in SAP Hana
SQL Functions and Operators in SAP Hana
SQL Expressions, Stored Procedures and Sequences in SAP Hana
Packages and Attribute and Analytic View in SAP Hana
Modeling and Schemas in SAP HANA
Log Based and ETL Based Replication in SAP Hana
License Management and Auditing in SAP Hana
Information Modeler and System Monitor in SAP HANA
High Availability and Backup and Recovery in SAP Hana
Export and Import Options in Sap Hana
Data Replication Overview in SAP Hana
Analytic Privileges and Information Composer in SAP Hana
SAP Hana Adminstration
Articles
SAP HANA Admin Studio and System Management
Overview of SAP HANA Administration
SAP HANA License Management and Multitenant DB Container Management
Smart Data Access and Integration with Hadoop
How to Start, Stop and Monitor a HANA System
HANA XS Application Service and Data Provisioning in SAP Hana
Data Compression and Solman Integration in SAP Hana
eBooks
Interview Questions
Videos
SAP HANA Admin Studio and System Management
Overview of SAP HANA Administration
SAP HANA License Management and Multitenant DB Container Management
Smart Data Access and Integration with Hadoop
How to Start, Stop and Monitor a HANA System
HANA XS Application Service and Data Provisioning in SAP Hana
Data Compression and Solman Integration in SAP Hana
SAP Hana Finance
Articles
Profitability Analysis and Management Accounting in SAP Simple Finance
Overview of SAP Hana and SAP Hana Finance
Migration and Manual Reposting of Costs in SAP Simple Finance
How to Display Financial Tables in SAP Simple Finance
Concept of Period Lock Transaction in SAP Simple Finance
Concept of Asset Scrapping in SAP Simple Finance
Create Default Account Assignment in SAP Simple Finance
How to Create a Primary Cost in G-L Account
Ledger Management in SAP Simple Finance
Reporting Options and G/L Accounting in SAP Simple Finance
Universal Journal and Document Number in SAP Simple Finance
SAP Simple Finance Architecture and Deployment Options
Asset Accounting in SAP Simple Finance
Concept of Integrated Business Planning and Integration of Simple Finance with other Modules
eBooks
Interview Questions
Videos
Profitability Analysis and Management Accounting in SAP Simple Finance
Overview of SAP Hana and SAP Hana Finance
Migration and Manual Reposting of Costs in SAP Simple Finance
How to Display Financial Tables in SAP Simple Finance
Concept of Period Lock Transaction in SAP Simple Finance
Concept of Asset Scrapping in SAP Simple Finance
Create Default Account Assignment in SAP Simple Finance
How to Create a Primary Cost in G-L Account
Ledger Management in SAP Simple Finance
Reporting Options and G/L Accounting in SAP Simple Finance
Universal Journal and Document Number in SAP Simple Finance
SAP Simple Finance Architecture and Deployment Options
Asset Accounting in SAP Simple Finance
Concept of Integrated Business Planning and Integration of Simple Finance with other Modules
SAP Hana Logistics
Articles
Supply Chain Planning and Integrated Business Planning in SAP Hana Logistics
Overview of SAP Hana Simple Logistics
MRP Procedures and Key Features in SAP Simple Logistics
MIGO Transactions in SAP Simple Logistics
Manufacturing Process in SAP Simple Logistics
Invoice Management and Operational Procurement in SAP Simple Logistics
How to Manage Business Partner in SAP Simple Logistics
How to Execute MRP Live planning
How to Create Business Partner in SAP HANA Logistics
Fiori UX and Deployment and Procurement Types in SAP Hana Logistics
Execute Discrete Production in SAP Hana Logistics
Contract Management and Perform Procurement Transfer Stock in SAP Hana Logistics
eBooks
Interview Questions
Videos
Supply Chain Planning and Integrated Business Planning in SAP Hana Logistics
Overview of SAP Hana Simple Logistics
MRP Procedures and Key Features in SAP Simple Logistics
MIGO Transactions in SAP Simple Logistics
Manufacturing Process in SAP Simple Logistics
Invoice Management and Operational Procurement in SAP Simple Logistics
How to Manage Business Partner in SAP Simple Logistics
How to Execute MRP Live planning
How to Create Business Partner in SAP HANA Logistics
Fiori UX and Deployment and Procurement Types in SAP Hana Logistics
Execute Discrete Production in SAP Hana Logistics
Contract Management and Perform Procurement Transfer Stock in SAP Hana Logistics
Concept of Simplification Item in SAP Simple Logistics
SAP UDT & IDT
Articles
Building Data Foundation in SAP IDT
Building Query in Query Panel, Publishing in SAP IDT
Business Layer Properties in SAP IDT
Dealing with Published Universes in SAP IDT
Deploying Universe in SAP Universe Designer
Format Editor Overview in SAP IDT
How to create universe in SAP IDT
How to use Table Browser and Derived Tables in SAP Universal Designer
Joins In Data Foundation in SAP IDT
Managing Connections in SAP IDT
Managing Resources in Repository, Qualifiers and Owners
OLAP Data Sources in SAP Universe Designer
Overview of SAP Universe Designer
unv Universe in SAP Business Object
Using Formula Bar and Universe Operations in SAP Universe Designer
Using LOVs and Create, Edit and Save a Universe
Concept of Calculated Measures and Aggregate Awareness
Business Layer View in SAP IDT
eBooks
Interview Questions
Videos
Building Data Foundation in SAP IDT
Building Query in Query Panel, Publishing in SAP IDT
Business Layer Properties in SAP IDT
Dealing with Published Universes in SAP IDT
Deploying Universe in SAP Universe Designer
Format Editor Overview in SAP IDT
How to create universe in SAP IDT
How to use Table Browser and Derived Tables in SAP Universal Designer
Joins In Data Foundation in SAP IDT
Managing Connections in SAP IDT
Managing Resources in Repository, Qualifiers and Owners
OLAP Data Sources in SAP Universe Designer
Overview of SAP Universe Designer
unv Universe in SAP Business Object
Using Formula Bar and Universe Operations in SAP Universe Designer
Using LOVs and Create, Edit and Save a Universe
Concept of Calculated Measures and Aggregate Awareness
Business Layer View in SAP IDT
Sap Webi
Articles
Working with Reports in SAP Webi
Sending Documents in SAP Web Intelligence
Query Filters and Filters Type in SAP Webi
Queries using Bex and Analysis View in SAP Webi
How to use Formulas and Variables in SAP Webi
How to use Breaks, Sorts and Ranking Data in SAP Webi
How to Create SAP Webi documents
How to achieve Conditional Formatting in SAP Webi
eBooks
Interview Questions
Videos
Working with Reports in SAP Webi
Sending Documents in SAP Web Intelligence
Query Filters and Filters Type in SAP Webi
Queries using Bex and Analysis View in SAP Webi
How to use Formulas and Variables in SAP Webi
How to use Breaks, Sorts and Ranking Data in SAP Webi
How to Create SAP Webi documents
How to achieve Conditional Formatting in SAP Webi
SAP HANA Admin Studio and System Management
Overview of SAP HANA Administration
SAP HANA License Management and Multitenant DB Container Management
Smart Data Access and Integration with Hadoop
Building Data Foundation in SAP IDT
Building Query in Query Panel, Publishing in SAP IDT
Business Layer Properties in SAP IDT
Dealing with Published Universes in SAP IDT
Deploying Universe in SAP Universe Designer
Format Editor Overview in SAP IDT
How to create universe in SAP IDT
How to use Table Browser and Derived Tables in SAP Universal Designer
Joins In Data Foundation in SAP IDT
Managing Connections in SAP IDT
Managing Resources in Repository, Qualifiers and Owners
OLAP Data Sources in SAP Universe Designer
Overview of SAP Universe Designer
unv Universe in SAP Business Object
Using Formula Bar and Universe Operations in SAP Universe Designer
Using LOVs and Create, Edit and Save a Universe
Concept of Calculated Measures and Aggregate Awareness
Business Layer View in SAP IDT
Profitability Analysis and Management Accounting in SAP Simple Finance
Overview of SAP Hana and SAP Hana Finance
Migration and Manual Reposting of Costs in SAP Simple Finance
How to Display Financial Tables in SAP Simple Finance
Concept of Period Lock Transaction in SAP Simple Finance
Concept of Asset Scrapping in SAP Simple Finance
Create Default Account Assignment in SAP Simple Finance
How to Create a Primary Cost in G-L Account
Ledger Management in SAP Simple Finance
Reporting Options and G/L Accounting in SAP Simple Finance
Universal Journal and Document Number in SAP Simple Finance
SAP Simple Finance Architecture and Deployment Options
Alert Monitoring and Logging in SAP Hana
Authentications and Authorization Methods in SAP HANA
DXC Replication Method and CTL Method and MDX provider in SAP Hana
Excel Integration with SAP Hana and Bi 4.0 Connectivity to Hana Views
Working with Reports in SAP Webi
Sending Documents in SAP Web Intelligence
Query Filters and Filters Type in SAP Webi
Queries using Bex and Analysis View in SAP Webi
How to use Formulas and Variables in SAP Webi
How to use Breaks, Sorts and Ranking Data in SAP Webi
How to Create SAP Webi documents
How to achieve Conditional Formatting in SAP Webi
Filtering Report Data in SAP Webi
Drill Options in Reports and Sharing Reports in SAP Webi
Supply Chain Planning and Integrated Business Planning in SAP Hana Logistics
Overview of SAP Hana Simple Logistics
MRP Procedures and Key Features in SAP Simple Logistics
MIGO Transactions in SAP Simple Logistics
Manufacturing Process in SAP Simple Logistics
Invoice Management and Operational Procurement in SAP Simple Logistics
How to Manage Business Partner in SAP Simple Logistics
How to Execute MRP Live planning
How to Create Business Partner in SAP HANA Logistics
Fiori UX and Deployment and Procurement Types in SAP Hana Logistics
Execute Discrete Production in SAP Hana Logistics
Contract Management and Perform Procurement Transfer Stock in SAP Hana Logistics
Concept of Simplification Item in SAP Simple Logistics
Analyze Sales Orders in SAP Simple Logistics
User Administration & Role Management and Security Overview in SAP Hana
Usage of SQL Script in SAP Hana
SQL Triggers, Synonym and Data Profiling in SAP Hana
SQL Overview and Data Types in SAP Hana
SQL Functions and Operators in SAP Hana
SQL Expressions, Stored Procedures and Sequences in SAP Hana
Packages and Attribute and Analytic View in SAP Hana
Modeling and Schemas in SAP HANA
Log Based and ETL Based Replication in SAP Hana
License Management and Auditing in SAP Hana
Information Modeler and System Monitor in SAP HANA
High Availability and Backup and Recovery in SAP Hana
Export and Import Options in Sap Hana
Data Replication Overview in SAP Hana
Analytic Privileges and Information Composer in SAP Hana
Using Filters in SAP BO Analysis
Sheets and Sharing Workspaces in SAP BO Analysis
Perform Conditional Formatting in SAP BO Analysis
Overview of SAP Business Object Analysis
How to create a Workspace in SAP Business Objects
Asset Accounting in SAP Simple Finance
Concept of Integrated Business Planning and Integration of Simple Finance with other Modules
How to Connect to SAP BW in SAP Business Objects
Export Options in SAP BO Analysis
Concept of Sub Analysis in SAP BO
Calculations in SAP BO Analysis
Aggregations and Hierarchies in SAP BO Analysis
SAP IDT - Overview and User Interface
Creating Parameters and Schemas in SAP Universe Designer
How to Start, Stop and Monitor a HANA System
HANA XS Application Service and Data Provisioning in SAP Hana
Data Compression and Solman Integration in SAP Hana
Authentication Methods supported by SAP HANA
Auditing Activities in SAP Hana
Top SAP S4 HANA Logistics Interview Questions and Answers
Top SAP S4 HANA Finance Interview Questions and Answers
Top SAP HANA Interview Questions and Answers
Software Testing
Articles
eBooks
Interview Questions
Videos
Selenium WebDriver
Articles
How to run your Selenium Test Scripts on IE Browser
How to run your Selenium Test Scripts on Firefox Browser
Comparison of Selenium vs QTP and Selenium Tool Suite
How to run your Selenium Test Scripts on Safari Browser
Overview of Selenium WebDriver
Overview of Selenium, its features and limitations
Scrolling an internet page in Selenium WebDriver
Selenium IDE- Locating Strategies by Identifier and By Id
Selenium IDE- Locating Strategies by Name, XPath , CSS and DOM
How to run your Selenium Test Scripts on Chrome Browser
How to Handle Alerts in Selenium WebDriver
Selenium WebDriver - Navigation and Web Element Commands
How to handle radio buttons and checkbox in selenium web driver
Selenium WebDriver - Browser Commands
Selenium WebDriver- Locating Strategies and Handling Drop-downs
Comparison between Selenium WebDriver and Selenium RC
Creating Test Cases Manually in Selenium IDE
How to create Login test suit in Selenium IDE
How to create your First Selenium Automation Test Script
Selenium IDE- Commands (Selenese)
Using Assertions in Selenium WebDriver
Overview of Selenium Integrated Development Environment (IDE)
eBooks
Interview Questions
Videos
How to run your Selenium Test Scripts on IE Browser
How to run your Selenium Test Scripts on Firefox Browser
Comparison of Selenium vs QTP and Selenium Tool Suite
How to run your Selenium Test Scripts on Safari Browser
Overview of Selenium WebDriver
Overview of Selenium, its features and limitations
Scrolling an internet page in Selenium WebDriver
Selenium IDE- Locating Strategies by Identifier and By Id
Selenium IDE- Locating Strategies by Name, XPath , CSS and DOM
How to run your Selenium Test Scripts on Chrome Browser
How to Handle Alerts in Selenium WebDriver
Selenium WebDriver - Navigation and Web Element Commands
How to handle radio buttons and checkbox in selenium web driver
Selenium WebDriver - Browser Commands
Selenium WebDriver- Locating Strategies and Handling Drop-downs
Comparison between Selenium WebDriver and Selenium RC
Creating Test Cases Manually in Selenium IDE
How to create Login test suit in Selenium IDE
How to create your First Selenium Automation Test Script
Selenium IDE- Commands (Selenese)
Using Assertions in Selenium WebDriver
Overview of Selenium Integrated Development Environment (IDE)
Selenium with Maven
Articles
Execute Selenium code through Maven and TestNG
How to Configure Selenium using NUnit in Visual Studio
How to Configure Selenium with Visual Studio in C#
How to handle or download dependency Jar using Maven
Write a Selenium test script using C#
Selenium Test Script using NUnit
How to write a Selenium test script using C#
eBooks
Interview Questions
Videos
Execute Selenium code through Maven and TestNG
How to Configure Selenium using NUnit in Visual Studio
How to Configure Selenium with Visual Studio in C#
How to handle or download dependency Jar using Maven
Write a Selenium test script using C#
Selenium Test Script using NUnit
How to write a Selenium test script using C#
Test NG
Articles
How to Run test cases in TestNG without java compiler
Overview of TestNG and its Features
Importance of XML file in TestNG Configuration
How to use TestNG Annotation Attributes
How to Run test cases with Regex in TestNG
How to install TestNG Framework and Configuration in Eclipse
How to enable and disable test cases in TestNG
eBooks
Interview Questions
Videos
How to Run test cases in TestNG without java compiler
Overview of TestNG and its Features
Importance of XML file in TestNG Configuration
How to use TestNG Annotation Attributes
How to Run test cases with Regex in TestNG
How to install TestNG Framework and Configuration in Eclipse
How to enable and disable test cases in TestNG
How to run your Selenium Test Scripts on IE Browser
How to run your Selenium Test Scripts on Firefox Browser
Comparison of Selenium vs QTP and Selenium Tool Suite
How to run your Selenium Test Scripts on Safari Browser
Overview of Selenium WebDriver
Overview of Selenium, its features and limitations
Scrolling an internet page in Selenium WebDriver
Selenium IDE- Locating Strategies by Identifier and By Id
Selenium IDE- Locating Strategies by Name, XPath , CSS and DOM
How to run your Selenium Test Scripts on Chrome Browser
How to Handle Alerts in Selenium WebDriver
Selenium WebDriver - Navigation and Web Element Commands
How to handle radio buttons and checkbox in selenium web driver
Selenium WebDriver - Browser Commands
Selenium WebDriver- Locating Strategies and Handling Drop-downs
Comparison between Selenium WebDriver and Selenium RC
Creating Test Cases Manually in Selenium IDE
How to create Login test suit in Selenium IDE
How to create your First Selenium Automation Test Script
Execute Selenium code through Maven and TestNG
How to Configure Selenium using NUnit in Visual Studio
How to Configure Selenium with Visual Studio in C#
How to handle or download dependency Jar using Maven
Write a Selenium test script using C#
Selenium Test Script using NUnit
How to write a Selenium test script using C#
Write and Execute the Selenium test script
Using Maven with Selenium TestNG
Selenium IDE- Commands (Selenese)
Using Assertions in Selenium WebDriver
Overview of Selenium Integrated Development Environment (IDE)
How to Run test cases in TestNG without java compiler
Overview of TestNG and its Features
Importance of XML file in TestNG Configuration
How to use TestNG Annotation Attributes
How to Run test cases with Regex in TestNG
How to install TestNG Framework and Configuration in Eclipse
How to enable and disable test cases in TestNG
How to create TestNG Listeners
Top Apache Spark Interview Questions and Answers
What is lazy evaluation in Spark?
When Spark operates on a dataset, it only remembers the instructions. A transformation such as map() called on an RDD is not performed immediately. Transformations in Spark are not evaluated until you perform an action; this deferral, known as lazy evaluation, lets Spark optimize the overall data processing workflow.
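As a minimal sketch (the data and names are illustrative), the map below is only recorded; nothing executes until the count action is called:
val numbers = sc.parallelize(1 to 100000)   // sc is the SparkContext
val doubled = numbers.map(_ * 2)            // transformation: recorded in the lineage, not executed
val total = doubled.count()                 // action: triggers evaluation of the whole lineage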
What is shuffling in Spark? When does it occur?
Shuffling is the process of redistributing data across partitions that may lead to data movement across the executors. The shuffle operation is implemented differently in Spark compared to Hadoop.
Shuffling has 2 important compression parameters:
- spark.shuffle.compress – checks whether the engine would compress shuffle outputs or not
- spark.shuffle.spill.compress – decides whether to compress intermediate shuffle spill files or not
Shuffling occurs while joining two tables or while performing byKey operations such as groupByKey or reduceByKey.
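For example, the following sketch (data illustrative) triggers a shuffle, because records with the same key must end up in the same partition:
val pairs = sc.parallelize(Seq(("a", 1), ("b", 1), ("a", 1)))
val counts = pairs.reduceByKey(_ + _)   // a byKey operation that redistributes data across partitions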
What is the use of coalesce in Spark?
Spark uses a coalesce method to reduce the number of partitions in a DataFrame.
Suppose you read data from a CSV file into an RDD with four partitions and then run a filter operation to remove all the multiples of 10 from the data.
After filtering, some partitions are left nearly empty, so it makes sense to reduce the number of partitions, which can be achieved with coalesce.
The resulting RDD holds the same data in fewer, fuller partitions.
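A minimal sketch under those assumptions (the data and partition counts are illustrative):
val numbers = sc.parallelize(1 to 40, 4)     // an RDD with four partitions
val filtered = numbers.filter(_ % 10 != 0)   // remove the multiples of 10, leaving sparse partitions
val compacted = filtered.coalesce(2)         // merge partitions without a full shuffle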
How is Apache Spark different from MapReduce?
Explain how Spark runs applications with the help of its architecture.
Spark applications run as independent processes that are coordinated by the SparkSession object in the driver program. The resource manager or cluster manager assigns tasks to the worker nodes with one task per partition. Iterative algorithms apply operations repeatedly to the data so they can benefit from caching datasets across iterations. A task applies its unit of work to the dataset in its partition and outputs a new partition dataset. Finally, the results are sent back to the driver application or can be saved to the disk.
What are the important components of the Spark ecosystem?
Apache Spark has 3 main categories that comprise its ecosystem. Those are:
- Language support: Spark integrates with several languages for application development and analytics: Java, Python, Scala, and R.
- Core Components: Spark supports 5 main core components: Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and GraphX.
- Cluster Management: Spark can be run in 3 environments. Those are the Standalone cluster, Apache Mesos, and YARN.
What are the different cluster managers available in Apache Spark?
- Standalone Mode: By default, applications submitted to the standalone mode cluster run in FIFO order, and each application tries to use all available nodes. You can launch a standalone cluster either manually, by starting a master and workers by hand, or by using the provided launch scripts. It is also possible to run these daemons on a single machine for testing.
- Apache Mesos: Apache Mesos is an open-source project to manage computer clusters, and can also run Hadoop applications. The advantages of deploying Spark with Mesos include dynamic partitioning between Spark and other frameworks as well as scalable partitioning between multiple instances of Spark.
- Hadoop YARN: Apache YARN is the cluster resource manager of Hadoop 2. Spark can be run on YARN as well.
- Kubernetes: Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
What is the significance of Resilient Distributed Datasets in Spark?
Resilient Distributed Datasets are the fundamental data structure of Apache Spark, embedded in Spark Core. RDDs are immutable, fault-tolerant, distributed collections of objects that can be operated on in parallel. RDDs are split into partitions and can be executed on different nodes of a cluster.
RDDs are created by either transformation of existing RDDs or by loading an external dataset from stable storage like HDFS or HBase.
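For instance (the path and names are illustrative):
val logs = sc.textFile("hdfs://namenode:8020/data/events.log")   // loaded from external storage
val errors = logs.filter(_.contains("ERROR"))                    // a new RDD derived by transforming an existing one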
How can you trigger automatic clean-ups in Spark to handle accumulated metadata?
To trigger the clean-ups, you need to set the parameter spark.cleaner.ttl.
How can you connect Spark to Apache Mesos?
There are a total of 4 steps that can help you connect Spark to Apache Mesos.
- Configure the Spark Driver program to connect with Apache Mesos
- Put the Spark binary package in a location accessible by Mesos
- Install Spark in the same location as Apache Mesos
- Configure the spark.mesos.executor.home property for pointing to the location where Spark is installed
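A hedged configuration sketch of these steps (the host, paths, and package location are illustrative assumptions):
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MesosExample")
  .setMaster("mesos://mesosMaster:5050")                                     // connect the driver to the Mesos master
  .set("spark.executor.uri", "hdfs://namenode:8020/spark/spark-binary.tgz")  // Spark binary package accessible to Mesos
  .set("spark.mesos.executor.home", "/opt/spark")                            // where Spark is installed on the agents
val sc = new SparkContext(conf)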
What makes Spark good at low latency workloads like graph processing and Machine Learning?
Apache Spark stores data in-memory for faster processing and for building machine learning models. Machine learning algorithms require multiple iterations and different conceptual steps to create an optimal model, and graph algorithms traverse all the nodes and edges to generate a graph. Because the intermediate data stays in memory across iterations, these low-latency workloads run much faster than they would with disk-based processing.
What is a Parquet file and what are its advantages?
Parquet is a columnar format that is supported by several data processing systems. With the Parquet file, Spark can perform both read and write operations.
Some of the advantages of having a Parquet file are:
- It enables you to fetch specific columns for access.
- It consumes less space
- It follows the type-specific encoding
- It limits I/O operations, since only the required columns are scanned
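A brief sketch of reading and writing Parquet (the paths are illustrative):
val people = spark.read.json("people.json")
people.write.parquet("people.parquet")                            // write in the columnar Parquet format
val names = spark.read.parquet("people.parquet").select("name")   // only the requested column is scanned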
How can you calculate the executor memory?
The calculation starts from the cluster specification: the number of nodes, the cores per node, and the memory per node.
From the core count you first derive the number of cores per executor and the number of executors per node, and from those the memory available to each executor.
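A worked example under assumed numbers (10 nodes, 16 cores and 64 GB of RAM per node):
- Reserve 1 core and 1 GB per node for the operating system and Hadoop daemons, leaving 15 usable cores and 63 GB per node.
- With 5 cores per executor (a common rule of thumb), executors per node = 15 / 5 = 3.
- Memory per executor = 63 GB / 3 = 21 GB; subtracting the memory overhead (roughly 7-10 percent) leaves about 19 GB of executor memory.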
What are the various functionalities supported by Spark Core?
Spark Core is the engine for parallel and distributed processing of large data sets. The various functionalities supported by Spark Core include:
- Scheduling and monitoring jobs
- Memory management
- Fault recovery
- Task dispatching
How do you convert a Spark RDD into a DataFrame?
There are 2 ways to convert a Spark RDD into a DataFrame:
- Using the helper function – toDF
import com.mapr.db.spark.sql._
val df = sc.loadFromMapRDB(<table-name>)
  .where(field("first_name") === "Peter")
  .select("_id", "first_name")
  .toDF()
- Using SparkSession.createDataFrame
You can convert an RDD[Row] to a DataFrame by calling createDataFrame on a SparkSession object:
def createDataFrame(rowRDD: RDD[Row], schema: StructType): DataFrame
Explain the types of operations supported by RDDs.
RDDs support 2 types of operation:
Transformations: Transformations are operations that are performed on an RDD to create a new RDD containing the results (Example: map, filter, join, union)
Actions: Actions are operations that return a value after running a computation on an RDD (Example: reduce, first, count)
What is a Lineage Graph?
A lineage graph is the graph of dependencies between an existing RDD and the new RDDs derived from it. It means that the dependencies between RDDs are recorded in a graph rather than the actual data being duplicated.
An RDD lineage graph is needed when we want to compute a new RDD or recover lost data from a persisted RDD that was lost. Spark does not replicate data in memory, so if any partition is lost it can be rebuilt using the RDD lineage. The lineage graph is also called an RDD operator graph or RDD dependency graph.
What do you understand about DStreams in Spark?
A Discretized Stream (DStream) is the basic abstraction provided by Spark Streaming.
It represents a continuous stream of data, either the input data received from a source or the processed data stream generated by transforming the input stream.
Explain Caching in Spark Streaming.
Caching, also known as persistence, is an optimization technique for Spark computations. Similar to RDDs, DStreams allow developers to persist the stream's data in memory: calling the persist() method on a DStream automatically persists every RDD of that DStream in memory. This helps save interim partial results so they can be reused in subsequent stages.
For input streams that receive data over the network, the default persistence level replicates the data to two nodes for fault tolerance.
What is the need for broadcast variables in Spark?
Broadcast variables allow the programmer to keep a read-only variable cached on each machine rather than shipping a copy of it with tasks. They can be used to give every node a copy of a large input dataset in an efficient manner. Spark distributes broadcast variables using efficient broadcast algorithms to reduce communication costs.
scala> val broadcastVar = sc.broadcast(Array(1, 2, 3))
broadcastVar: org.apache.spark.broadcast.Broadcast[Array[Int]] = Broadcast(0)

scala> broadcastVar.value
res0: Array[Int] = Array(1, 2, 3)
How to programmatically specify a schema for DataFrame?
DataFrame can be created programmatically with three steps:
- Create an RDD of Rows from the original RDD;
- Create the schema represented by a StructType matching the structure of Rows in the RDD created in Step 1.
- Apply the schema to the RDD of Rows via createDataFrame method provided by SparkSession.
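A minimal sketch of these three steps (the data and column names are illustrative):
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val rowRDD = sc.parallelize(Seq(Row("Michael", 29), Row("Andy", 30)))   // step 1: an RDD of Rows
val schema = StructType(Seq(                                            // step 2: a schema matching the Rows
  StructField("name", StringType, nullable = true),
  StructField("age", IntegerType, nullable = true)))
val df = spark.createDataFrame(rowRDD, schema)                          // step 3: apply the schema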
Which transformation returns a new DStream by selecting only those records of the source DStream for which the function returns true?
- a) map(func)
- b) transform(func)
- c) filter(func)
- d) count()
The correct answer is c) filter(func).
Does Apache Spark provide checkpoints?
Yes, Apache Spark provides an API for adding and managing checkpoints. Checkpointing is the process of making streaming applications resilient to failures. It allows you to save data and metadata into a checkpointing directory. In case of a failure, Spark can recover this data and start from wherever it stopped.
There are 2 types of data for which we can use checkpointing in Spark.
Metadata Checkpointing: Metadata means the data about data. It refers to saving the metadata to fault-tolerant storage like HDFS. Metadata includes configurations, DStream operations, and incomplete batches.
Data Checkpointing: Here, we save the RDD to reliable storage because its need arises in some of the stateful transformations. In this case, the upcoming RDD depends on the RDDs of previous batches.
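As a brief sketch (the batch interval and directory are illustrative), checkpointing is enabled on the streaming context:
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(10))
ssc.checkpoint("hdfs://namenode:8020/checkpoints/my-app")   // fault-tolerant checkpointing directory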
What do you mean by sliding window operation?
In networking, a sliding window controls the transmission of data packets between computers. Analogously, the Spark Streaming library provides windowed computations in which transformations on RDDs are applied over a sliding window of data, defined by a window length and a sliding interval.
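For example (the durations are illustrative, and pairs is an assumed DStream of (word, 1) tuples), the counts below cover the last 30 seconds of data and are recomputed every 10 seconds:
import org.apache.spark.streaming.Seconds

val windowedCounts = pairs.reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(30), Seconds(10))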
What are the different levels of persistence in Spark?
DISK_ONLY – Stores the RDD partitions only on the disk
MEMORY_ONLY_SER – Stores the RDD as serialized Java objects, with one byte array per partition
MEMORY_ONLY – Stores the RDD as deserialized Java objects in the JVM. If the RDD is not able to fit in the memory available, some partitions won’t be cached
OFF_HEAP – Works like MEMORY_ONLY_SER but stores the data in off-heap memory
MEMORY_AND_DISK – Stores RDD as deserialized Java objects in the JVM. In case the RDD is not able to fit in the memory, additional partitions are stored on the disk
MEMORY_AND_DISK_SER – Identical to MEMORY_ONLY_SER with the exception of storing partitions not able to fit in the memory to the disk
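A storage level is chosen per RDD; for example (rdd is an assumed existing RDD):
import org.apache.spark.storage.StorageLevel

rdd.persist(StorageLevel.MEMORY_AND_DISK)   // cache() is shorthand for persist with MEMORY_ONLY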
What is the difference between map and flatMap transformation in Spark Streaming?
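In short, map produces exactly one output element for each input element, while flatMap can produce zero or more output elements per input and flattens the results into a single stream. A minimal sketch, assuming lines is an existing DStream (or RDD) of text lines:
val words = lines.flatMap(_.split(" "))   // one input line can yield many output words
val lengths = words.map(_.length)         // exactly one output element per input word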
How would you compute the total count of unique words in Spark?
- Load the text file as an RDD:
lines = sc.textFile("hdfs://Hadoop/user/test_file.txt")
- Define a function that breaks each line into words:
def toWords(line): return line.split()
- Run the toWords function on each element of the RDD as a flatMap transformation:
words = lines.flatMap(toWords)
- Convert each word into a (key, value) pair:
def toTuple(word): return (word, 1)
wordsTuple = words.map(toTuple)
- Aggregate the counts with reduceByKey():
def add(x, y): return x + y
counts = wordsTuple.reduceByKey(add)
- Print the word counts (counts.count() gives the number of unique words):
counts.collect()
Suppose you have a huge text file. How will you check if a particular keyword exists using Spark?
lines = sc.textFile("hdfs://Hadoop/user/test_file.txt")

def isFound(line):
    if line.find("my_keyword") > -1:
        return 1
    return 0

foundBits = lines.map(isFound)
total = foundBits.reduce(lambda a, b: a + b)
if total > 0:
    print("Found")
else:
    print("Not Found")
What is the role of accumulators in Spark?
Accumulators are variables used for aggregating information across the executors. This information can be about the data or API diagnosis like how many records are corrupted or how many times a library API was called.
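A small sketch (the accumulator name, the records RDD, and the validity check are illustrative assumptions):
val corrupted = sc.longAccumulator("corruptedRecords")
records.foreach { r => if (r.trim.isEmpty) corrupted.add(1) }   // records is an assumed RDD[String]
println(corrupted.value)                                        // the aggregated value is read on the driver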
What are the different MLlib tools available in Spark?
- ML Algorithms: Classification, Regression, Clustering, and Collaborative filtering
- Featurization: Feature extraction, Transformation, Dimensionality reduction, and Selection
- Pipelines: Tools for constructing, evaluating, and tuning ML pipelines
- Persistence: Saving and loading algorithms, models, and pipelines
- Utilities: Linear algebra, statistics, data handling
What are the different data types supported by Spark MLlib?
Spark MLlib supports local vectors and matrices stored on a single machine, as well as distributed matrices.
Local Vector: MLlib supports two types of local vectors – dense and sparse
Example: vector(1.0, 0.0, 3.0)
dense format: [1.0, 0.0, 3.0]
sparse format: (3, [0, 2], [1.0, 3.0])
Labeled point: A labeled point is a local vector, either dense or sparse, that is associated with a label/response.
Example: In binary classification, a label should be either 0 (negative) or 1 (positive)
Local Matrix: A local matrix has integer type row and column indices, and double type values that are stored in a single machine.
Distributed Matrix: A distributed matrix has long-type row and column indices and double-type values, and is stored in a distributed manner in one or more RDDs.
Types of the distributed matrix:
- RowMatrix
- IndexedRowMatrix
- CoordinateMatrix
What is a Sparse Vector?
A Sparse vector is a type of local vector which is represented by an index array and a value array.
public class SparseVector
extends Object
implements Vector
Example: sparse1 = SparseVector(4, [1, 3], [3.0, 4.0])
where:
4 is the size of the vector
[1, 3] are the ordered indices of the non-zero entries
[3.0, 4.0] are their values
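The same vector can be built with the MLlib API (a minimal sketch):
import org.apache.spark.mllib.linalg.Vectors

val sparse1 = Vectors.sparse(4, Array(1, 3), Array(3.0, 4.0))   // size, indices, values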
Describe how model creation works with MLlib and how the model is applied.
MLlib has 2 components:
Transformer: A transformer reads a DataFrame and returns a new DataFrame with a specific transformation applied.
Estimator: An estimator is a machine learning algorithm that takes a DataFrame to train a model and returns the model as a transformer.
Spark MLlib lets you combine multiple transformations into a pipeline to apply complex data transformations.
A pipeline of transformers and an estimator is first fit on training data to produce a model; the fitted model, itself a transformer, can then be applied to live data.
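A minimal sketch of such a pipeline (trainingDF and liveDF are assumed DataFrames with a "text" column, and "label" for training):
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")      // transformer
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")  // transformer
val lr = new LogisticRegression().setMaxIter(10)                               // estimator
val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))

val model = pipeline.fit(trainingDF)        // training produces a PipelineModel (a transformer)
val predictions = model.transform(liveDF)   // apply the fitted model to new data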
What are the functions of Spark SQL?
Spark SQL is Apache Spark’s module for working with structured data.
Spark SQL loads the data from a variety of structured data sources.
It queries data using SQL statements, both inside a Spark program and from external tools that connect to Spark SQL through standard database connectors (JDBC/ODBC).
It provides a rich integration between SQL and regular Python/Java/Scala code, including the ability to join RDDs and SQL tables and expose custom functions in SQL.
How can you connect Hive to Spark SQL?
To connect Hive to Spark SQL, place the hive-site.xml file in the conf directory of Spark.
Using the Spark Session object, you can construct a DataFrame.
result = spark.sql("select * from <hive_table>")
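A hedged sketch of building such a session (the application name is illustrative, and <hive_table> stands in for your own table):
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("HiveExample")
  .enableHiveSupport()        // picks up the hive-site.xml placed in Spark's conf directory
  .getOrCreate()
val result = spark.sql("select * from <hive_table>")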
What is the role of Catalyst Optimizer in Spark SQL?
Catalyst optimizer leverages advanced programming language features (such as Scala's pattern matching and quasiquotes) in a novel way to build an extensible query optimizer.
How can you manipulate structured data using domain-specific language in Spark SQL?
Structured data can be manipulated using domain-Specific language as follows:
Suppose there is a DataFrame with the following information:
val df = spark.read.json("examples/src/main/resources/people.json")

// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

// Select only the "name" column
df.select("name").show()
// +-------+
// |   name|
// +-------+
// |Michael|
// |   Andy|
// | Justin|
// +-------+

// Select everybody, but increment the age by 1
df.select($"name", $"age" + 1).show()
// +-------+---------+
// |   name|(age + 1)|
// +-------+---------+
// |Michael|     null|
// |   Andy|       31|
// | Justin|       20|
// +-------+---------+

// Select people older than 21
df.filter($"age" > 21).show()
// +---+----+
// |age|name|
// +---+----+
// | 30|Andy|
// +---+----+

// Count people by age
df.groupBy("age").count().show()
// +----+-----+
// | age|count|
// +----+-----+
// |  19|    1|
// |null|    1|
// |  30|    1|
// +----+-----+
What are the different types of operators provided by the Apache GraphX library?
Property Operator: Property operators modify the vertex or edge properties using a user-defined map function and produce a new graph.
Structural Operator: Structure operators operate on the structure of an input graph and produce a new graph.
Join Operator: Join operators add data to graphs and generate new graphs.
What are the analytic algorithms provided in Apache Spark GraphX?
GraphX is Apache Spark’s API for graphs and graph-parallel computation. GraphX includes a set of graph algorithms to simplify analytics tasks. The algorithms are contained in the org.apache.spark.graphx.lib package and can be accessed directly as methods on Graph via GraphOps.
PageRank: PageRank is a graph parallel computation that measures the importance of each vertex in a graph. Example: You can run PageRank to evaluate what the most important pages in Wikipedia are.
Connected Components: The connected components algorithm labels each connected component of the graph with the ID of its lowest-numbered vertex. For example, in a social network, connected components can approximate clusters.
Triangle Counting: A vertex is part of a triangle when it has two adjacent vertices with an edge between them. GraphX implements a triangle counting algorithm in the TriangleCount object that determines the number of triangles passing through each vertex, providing a measure of clustering.
What is the PageRank algorithm in Apache Spark GraphX?
PageRank measures the importance of each vertex in a graph, assuming an edge from u to v represents an endorsement of v’s importance by u.
If a Twitter user is followed by many other users, that handle will be ranked high.
PageRank algorithm was originally developed by Larry Page and Sergey Brin to rank websites for Google. It can be applied to measure the influence of vertices in any network graph. PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The assumption is that more important websites are likely to receive more links from other websites.
A typical example of using Scala’s functional programming with Apache Spark RDDs to iteratively compute Page Ranks is shown below:
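A condensed sketch of the standard Spark PageRank example (links is an assumed RDD[(String, Iterable[String])] of page-to-outgoing-links pairs, and 10 iterations is illustrative):
var ranks = links.mapValues(_ => 1.0)                      // start every page with rank 1.0
for (_ <- 1 to 10) {
  val contribs = links.join(ranks).values.flatMap { case (urls, rank) =>
    urls.map(url => (url, rank / urls.size))               // each page shares its rank among its links
  }
  ranks = contribs.reduceByKey(_ + _).mapValues(v => 0.15 + 0.85 * v)
}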
Compare Hadoop and Spark.
We will compare Hadoop MapReduce and Spark based on the following aspects:
Table: Apache Spark versus Hadoop
What is Apache Spark?
- Apache Spark is an open-source cluster computing framework for real-time processing.
- It has a thriving open-source community and is the most active Apache project at the moment.
- Spark provides an interface for programming entire clusters with implicit data parallelism and fault-tolerance.
Spark is one of the most successful projects in the Apache Software Foundation. Spark has clearly evolved as the market leader for Big Data processing. Many organizations run Spark on clusters with thousands of nodes. Today, Spark is being adopted by major players like Amazon, eBay, and Yahoo!
What are benefits of Spark over MapReduce?
Spark has the following benefits over MapReduce:
- Due to the availability of in-memory processing, Spark executes workloads around 10 to 100 times faster than Hadoop MapReduce, which relies on persistent storage for all of its data processing tasks.
- Unlike Hadoop, Spark provides inbuilt libraries to perform multiple tasks from the same core, such as batch processing, streaming, machine learning, and interactive SQL queries. However, Hadoop only supports batch processing.
- Hadoop is highly disk-dependent whereas Spark promotes caching and in-memory data storage.
- Spark is capable of performing computations multiple times on the same dataset. This is called iterative computation while there is no iterative computing implemented by Hadoop.
What is YARN?
Similar to Hadoop, YARN is one of the key features in Spark, providing a central resource management platform to deliver scalable operations across the cluster. YARN is a distributed container manager, like Mesos, whereas Spark is a data processing tool. Spark can run on YARN the same way Hadoop MapReduce can run on YARN. Running Spark on YARN necessitates a binary distribution of Spark built with YARN support.
Do you need to install Spark on all nodes of YARN cluster?
No, because Spark runs on top of YARN and does not need to be installed on every node. Spark offers options to use YARN when dispatching jobs to the cluster, rather than its own built-in cluster manager or Mesos. Further, there are several configurations for running on YARN, including master, deploy-mode, driver-memory, executor-memory, executor-cores, and queue.
Is there any benefit of learning MapReduce if Spark is better than MapReduce?
Yes, MapReduce is a paradigm used by many big data tools including Spark as well. It is extremely relevant to use MapReduce when the data grows bigger and bigger. Most tools like Pig and Hive convert their queries into MapReduce phases to optimize them better.
Explain the concept of Resilient Distributed Dataset (RDD).
RDD stands for Resilient Distributed Dataset. An RDD is a fault-tolerant collection of operational elements that run in parallel. The partitioned data in an RDD is immutable and distributed in nature. There are primarily two types of RDD:
- Parallelized Collections: existing collections parallelized so that they run in parallel with one another.
- Hadoop Datasets: They perform functions on each file record in HDFS or other storage systems.
RDDs are basically parts of data that are stored in the memory distributed across many nodes. RDDs are lazily evaluated in Spark. This lazy evaluation is what contributes to Spark’s speed.
Explain the key features of Apache Spark.
The following are the key features of Apache Spark:
- Polyglot
- Speed
- Multiple Format Support
- Lazy Evaluation
- Real Time Computation
- Hadoop Integration
- Machine Learning
Let us look at these features in detail:
- Polyglot: Spark provides high-level APIs in Java, Scala, Python and R. Spark code can be written in any of these four languages. It provides a shell in Scala and Python. The Scala shell can be accessed through ./bin/spark-shell and Python shell through ./bin/pyspark from the installed directory.
- Speed: Spark runs up to 100 times faster than Hadoop MapReduce for large-scale data processing. Spark is able to achieve this speed through controlled partitioning. It manages data using partitions that help parallelize distributed data processing with minimal network traffic.
- Multiple Formats: Spark supports multiple data sources such as Parquet, JSON, Hive and Cassandra. The Data Sources API provides a pluggable mechanism for accessing structured data through Spark SQL. Data sources can be more than just simple pipes that convert data and pull it into Spark.
- Lazy Evaluation: Apache Spark delays its evaluation till it is absolutely necessary. This is one of the key factors contributing to its speed. Spark adds transformations to a DAG of computation, and only when the driver requests some data does this DAG actually get executed.
- Real Time Computation: Spark's computation is real-time and has low latency because of its in-memory computation. Spark is designed for massive scalability; the Spark team has documented users of the system running production clusters with thousands of nodes, and it supports several computational models.
- Hadoop Integration: Apache Spark provides smooth compatibility with Hadoop. This is a great boon for all the Big Data engineers who started their careers with Hadoop. Spark is a potential replacement for the MapReduce functions of Hadoop, while Spark has the ability to run on top of an existing Hadoop cluster using YARN for resource scheduling.
- Machine Learning: Spark’s MLlib is the machine learning component which is handy when it comes to big data processing. It eradicates the need to use multiple tools, one for processing and one for machine learning. Spark provides data engineers and data scientists with a powerful, unified engine that is both fast and easy to use.
What are the languages supported by Apache Spark and which is the most popular one?
Apache Spark supports the following four languages: Scala, Java, Python and R. Among these languages, Scala and Python have interactive shells for Spark. The Scala shell can be accessed through ./bin/spark-shell and the Python shell through ./bin/pyspark. Scala is the most widely used, since Spark itself is written in Scala.
How do we create RDDs in Spark?
Spark provides two methods to create RDD:
- By parallelizing a collection in your driver program. This makes use of SparkContext's parallelize() method:
val DataArray = Array(2,4,6,8,10)
val DataRDD = sc.parallelize(DataArray)
- By loading an external dataset from external storage like HDFS, HBase, or a shared file system (a short sketch of this second method follows).
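For the second method, a minimal sketch (the HDFS path is purely illustrative) might look like:

val linesRDD = sc.textFile("hdfs:///path/to/data.txt")   // load an external text file as an RDD of lines
println(linesRDD.count())                                // count() is an action that triggers the read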
What operations does RDD support?
RDD (Resilient Distributed Dataset) is the main logical data unit in Spark. An RDD is a distributed collection of objects: each RDD is divided into multiple partitions, and each partition can reside in memory or be stored on the disk of a different machine in the cluster. RDDs are immutable (read-only) data structures; you cannot change the original RDD, but you can always transform it into a different RDD with the changes you want.
RDDs support two types of operations: transformations and actions.
Transformations: Transformations create a new RDD from an existing one, e.g., map, reduceByKey, and filter. Transformations are executed on demand, i.e., they are computed lazily.
Actions: Actions return the final results of RDD computations. An action triggers execution using the lineage graph to load the data into the original RDD, carries out all intermediate transformations, and returns the final result to the driver program or writes it out to the file system.
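To make the distinction concrete, here is a minimal sketch (the numbers are arbitrary):

val numbers = sc.parallelize(1 to 10)
val evens = numbers.filter(_ % 2 == 0)   // transformation: lazily recorded, nothing executes yet
val total = evens.reduce(_ + _)          // action: triggers the actual computation
println(total)                           // prints 30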
What is Executor Memory in a Spark application?
Every Spark application has a fixed heap size and a fixed number of cores for its executors. The heap size is what is referred to as the Spark executor memory, which is controlled with the spark.executor.memory property or the --executor-memory flag. Every Spark application will have one executor on each worker node. The executor memory is essentially a measure of how much memory of the worker node the application will utilize.
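As a rough illustration (the 4g heap and 2 cores below are arbitrary example values, not recommendations), the same setting can be expressed through a SparkConf, mirroring the --executor-memory flag of spark-submit:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MyApp")                        // application name is illustrative
  .set("spark.executor.memory", "4g")         // heap size per executor
  .set("spark.executor.cores", "2")           // cores per executor
val sc = new SparkContext(conf)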
Define Partitions in Apache Spark.
As the name suggests, a partition is a smaller and logical division of data similar to a 'split' in MapReduce. It is a logical chunk of a large distributed data set. Partitioning is the process of deriving logical units of data to speed up processing. Spark manages data using partitions that help parallelize distributed data processing with minimal network traffic for sending data between executors. By default, Spark tries to read data into an RDD from the nodes that are close to it. Since Spark usually accesses distributed partitioned data, to optimize transformation operations it creates partitions to hold the data chunks. Everything in Spark is a partitioned RDD.
What do you understand by Transformations in Spark?
Transformations are functions applied on an RDD, resulting in another RDD. They do not execute until an action occurs. map() and filter() are examples of transformations: the former applies the function passed to it on each element of the RDD and results in another RDD, while filter() creates a new RDD by selecting elements from the current RDD that pass the function argument.
val rawData = sc.textFile("path to/movies.txt")
val moviesData = rawData.map(x => x.split(" "))
As we can see here, rawData RDD is transformed into moviesData RDD. Transformations are lazily evaluated.
Define Actions in Spark.
An action helps in bringing back the data from an RDD to the local machine. An action's execution is the result of all previously created transformations. An action triggers execution using the lineage graph to load the data into the original RDD, carries out all intermediate transformations, and returns the final result to the driver program or writes it out to the file system.
reduce() is an action that applies the function passed to it again and again until only one value is left. The take() action takes values from the RDD to a local node.
moviesData.saveAsTextFile("MoviesData.txt")
As we can see here, moviesData RDD is saved into a text file called MoviesData.txt.
Define functions of SparkCore.
Spark Core is the base engine for large-scale parallel and distributed data processing. The core is the distributed execution engine and the Java, Scala, and Python APIs offer a platform for distributed ETL application development. SparkCore performs various important functions like memory management, monitoring jobs, fault-tolerance, job scheduling and interaction with storage systems. Further, additional libraries, built atop the core allow diverse workloads for streaming, SQL, and machine learning. It is responsible for:
- Memory management and fault recovery
- Scheduling, distributing and monitoring jobs on a cluster
- Interacting with storage systems
What do you understand by Pair RDD?
Apache Spark defines the PairRDDFunctions class as:
class PairRDDFunctions[K, V] extends Logging with HadoopMapReduceUtil with Serializable
Special operations can be performed on RDDs in Spark using key/value pairs and such RDDs are referred to as Pair RDDs. Pair RDDs allow users to access each key in parallel. They have a reduceByKey() method that collects data based on each key and a join() method that combines different RDDs together, based on the elements having the same key.
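As an illustration (the key/value data below is invented), reduceByKey() and join() on pair RDDs can be used as follows:

val sales  = sc.parallelize(Seq(("apple", 2), ("banana", 1), ("apple", 3)))
val prices = sc.parallelize(Seq(("apple", 0.5), ("banana", 0.25)))

val totals = sales.reduceByKey(_ + _)     // aggregate values per key: ("apple", 5), ("banana", 1)
val joined = totals.join(prices)          // combine RDDs on matching keys: ("apple", (5, 0.5)), ("banana", (1, 0.25))
joined.collect().foreach(println)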
Name the components of Spark Ecosystem.
- Spark Core: Base engine for large-scale parallel and distributed data processing
- Spark Streaming: Used for processing real-time streaming data
- Spark SQL: Integrates relational processing with Spark’s functional programming API
- GraphX: Graphs and graph-parallel computation
- MLlib: Performs machine learning in Apache Spark
How is Streaming implemented in Spark? Explain with examples.
Spark Streaming is used for processing real-time streaming data. Thus it is a useful addition to the core Spark API. It enables high-throughput and fault-tolerant stream processing of live data streams. The fundamental stream unit is DStream which is basically a series of RDDs (Resilient Distributed Datasets) to process the real-time data. The data from different sources like Flume, HDFS is streamed and finally processed to file systems, live dashboards and databases. It is similar to batch processing as the input data is divided into streams like batches.
Is there an API for implementing graphs in Spark?
GraphX is the Spark API for graphs and graph-parallel computation. Thus, it extends the Spark RDD with a Resilient Distributed Property Graph.
The property graph is a directed multi-graph which can have multiple edges in parallel. Every edge and vertex has user-defined properties associated with it. Here, the parallel edges allow multiple relationships between the same vertices. At a high level, GraphX extends the Spark RDD abstraction by introducing the Resilient Distributed Property Graph: a directed multigraph with properties attached to each vertex and edge.
To support graph computation, GraphX exposes a set of fundamental operators (e.g., subgraph, joinVertices, and mapReduceTriplets) as well as an optimized variant of the Pregel API. In addition, GraphX includes a growing collection of graph algorithms and builders to simplify graph analytics tasks.
What is PageRank in GraphX?
PageRank measures the importance of each vertex in a graph, assuming an edge from u to v represents an endorsement of v’s importance by u. For example, if a Twitter user is followed by many others, the user will be ranked highly.
GraphX comes with static and dynamic implementations of PageRank as methods on the PageRank Object. Static PageRank runs for a fixed number of iterations, while dynamic PageRank runs until the ranks converge (i.e., stop changing by more than a specified tolerance). GraphOps allows calling these algorithms directly as methods on Graph.
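A minimal sketch of calling both variants (the edge-list path and the numeric arguments are illustrative):

import org.apache.spark.graphx.GraphLoader

val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")   // load a graph from an edge list
val dynamicRanks = graph.pageRank(0.0001).vertices                      // run until ranks converge (tolerance 0.0001)
val staticRanks  = graph.staticPageRank(10).vertices                    // run a fixed 10 iterations
dynamicRanks.take(5).foreach(println)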
How is machine learning implemented in Spark?
MLlib is the scalable machine learning library provided by Spark. It aims at making machine learning easy and scalable, with common learning algorithms and use cases like clustering, regression, collaborative filtering, dimensionality reduction, and the like.
Is there a module to implement SQL in Spark? How does it work?
Spark SQL is a new module in Spark which integrates relational processing with Spark’s functional programming API. It supports querying data either via SQL or via the Hive Query Language. For those of you familiar with RDBMS, Spark SQL will be an easy transition from your earlier tools where you can extend the boundaries of traditional relational data processing.
Spark SQL integrates relational processing with Spark’s functional programming. Further, it provides support for various data sources and makes it possible to weave SQL queries with code transformations thus resulting in a very powerful tool.
The following are the four libraries of Spark SQL.
- Data Source API
- DataFrame API
- Interpreter & Optimizer
- SQL Service
How can Apache Spark be used alongside Hadoop?
The best part of Apache Spark is its compatibility with Hadoop. As a result, this makes for a very powerful combination of technologies. Here, we will be looking at how Spark can benefit from the best of Hadoop. Using Spark and Hadoop together helps us to leverage Spark’s processing to utilize the best of Hadoop’s HDFS and YARN.
Figure: Using Spark and Hadoop
Hadoop components can be used alongside Spark in the following ways:
- HDFS: Spark can run on top of HDFS to leverage the distributed replicated storage.
- MapReduce: Spark can be used along with MapReduce in the same Hadoop cluster or separately as a processing framework.
- YARN: Spark applications can also be run on YARN (Hadoop NextGen).
- Batch & Real Time Processing: MapReduce and Spark are used together where MapReduce is used for batch processing and Spark for real-time processing.
What is RDD Lineage?
Spark does not support data replication in memory and thus, if any data is lost, it is rebuilt using RDD lineage. RDD lineage is a process that reconstructs lost data partitions. The best part is that an RDD always remembers how it was built from other datasets.
What is Spark Driver?
Spark Driver is the program that runs on the master node of the machine and declares transformations and actions on data RDDs. In simple terms, a driver in Spark creates SparkContext, connected to a given Spark Master.
The driver also delivers the RDD graphs to Master, where the standalone cluster manager runs.
What file systems does Spark support?
The following three file systems are supported by Spark:
- Hadoop Distributed File System (HDFS).
- Local File system.
- Amazon S3
List the functions of Spark SQL.
Spark SQL is capable of:
- Loading data from a variety of structured sources.
- Querying data using SQL statements, both inside a Spark program and from external tools that connect to Spark SQL through standard database connectors (JDBC/ODBC). For instance, using business intelligence tools like Tableau.
- Providing rich integration between SQL and regular Python/Java/Scala code, including the ability to join RDDs and SQL tables, expose custom functions in SQL, and more.
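For instance, a minimal sketch of querying with SQL from inside a Spark program (reusing the people.json example shown earlier) could look like:

val df = spark.read.json("examples/src/main/resources/people.json")
df.createOrReplaceTempView("people")                              // register the DataFrame as a SQL view
val adults = spark.sql("SELECT name FROM people WHERE age > 21")  // query it with a SQL statement
adults.show()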
What is Spark Executor?
When SparkContext connects to a cluster manager, it acquires an Executor on nodes in the cluster. Executors are Spark processes that run computations and store the data on the worker node. The final tasks by SparkContext are transferred to executors for their execution.
What is a Parquet file?
Parquet is a columnar format file supported by many other data processing systems. Spark SQL performs both read and write operations with Parquet files and considers it to be one of the best big data analytics formats so far.
Parquet is a columnar format, supported by many data processing systems. The advantages of having a columnar storage are as follows:
- Columnar storage limits IO operations.
- It can fetch specific columns that you need to access.
- Columnar storage consumes less space.
- It gives better-summarized data and follows type-specific encoding.
Name types of Cluster Managers in Spark.
The Spark framework supports three major types of Cluster Managers:
- Standalone: A basic manager to set up a cluster.
- Apache Mesos: Generalized/commonly-used cluster manager, also runs Hadoop MapReduce and other applications.
- YARN: Responsible for resource management in Hadoop.
What do you understand by worker node?
Worker node refers to any node that can run the application code in a cluster. The driver program must listen for and accept incoming connections from its executors and must be network addressable from the worker nodes.
A worker node is basically the slave node. The master node assigns work, and the worker node actually performs the assigned tasks. Worker nodes process the data stored on the node and report the resources to the master. Based on resource availability, the master schedules tasks.
Illustrate some demerits of using Spark.
The following are some of the demerits of using Apache Spark:
- Since Spark utilizes more storage space compared to Hadoop and MapReduce, there may arise certain problems.
- Developers need to be careful while running their applications in Spark.
- Instead of running everything on a single node, the work must be distributed over multiple clusters.
- Spark’s “in-memory” capability can become a bottleneck when it comes to cost-efficient processing of big data.
- Spark consumes a huge amount of memory when compared to Hadoop.
List some use cases where Spark outperforms Hadoop in processing.
- Sensor Data Processing: Apache Spark’s “In-memory” computing works best here, as data is retrieved and combined from different sources.
- Real Time Processing: Spark is preferred over Hadoop for real-time querying of data. e.g. Stock Market Analysis, Banking, Healthcare, Telecommunications, etc.
- Stream Processing: For processing logs and detecting frauds in live streams for alerts, Apache Spark is the best solution.
- Big Data Processing: Spark runs up to 100 times faster than Hadoop when it comes to processing medium and large-sized datasets.
What is a Sparse Vector?
A sparse vector has two parallel arrays; one for indices and the other for values. These vectors are used for storing non-zero entries to save space.
Vectors.sparse(7, Array(0,1,2,3,4,5,6), Array(1650d,50000d,800d,3.0,3.0,2009,95054))
The above sparse vector can be used instead of dense vectors.
val myHouse = Vectors.dense(4450d,2600000d,4000d,4.0,4.0,1978.0,95070d,1.0,1.0,1.0,0.0)
Can you use Spark to access and analyze data stored in Cassandra databases?
Yes, it is possible if you use the Spark Cassandra Connector. To connect Spark to a Cassandra cluster, a Cassandra Connector will need to be added to the Spark project. In this setup, a Spark executor will talk to a local Cassandra node and will only query for local data. It makes queries faster by reducing the usage of the network to send data between Spark executors (to process data) and Cassandra nodes (where data lives).
Is it possible to run Apache Spark on Apache Mesos?
Yes, Apache Spark can be run on the hardware clusters managed by Mesos. In a standalone cluster deployment, the cluster manager in the below diagram is a Spark master instance. When using Mesos, the Mesos master replaces the Spark master as the cluster manager. Mesos determines what machines handle what tasks. Because it takes into account other frameworks when scheduling these many short-lived tasks, multiple frameworks can coexist on the same cluster without resorting to a static partitioning of resources.
How can Spark be connected to Apache Mesos?
To connect Spark with Mesos:
- Configure the spark driver program to connect to Mesos.
- Spark binary package should be in a location accessible by Mesos.
- Install Apache Spark in the same location as that of Apache Mesos and configure the property ‘spark.mesos.executor.home’ to point to the location where it is installed.
How can you minimize data transfers when working with Spark?
Minimizing data transfers and avoiding shuffling helps write spark programs that run in a fast and reliable manner. The various ways in which data transfers can be minimized when working with Apache Spark are:
- Using Broadcast Variable- Broadcast variable enhances the efficiency of joins between small and large RDDs.
- Using Accumulators – Accumulators help update the values of variables in parallel while executing.
The most common way is to avoid operations ByKey, repartition or any other operations which trigger shuffles.
What are broadcast variables?
Broadcast variables allow the programmer to keep a read-only variable cached on each machine rather than shipping a copy of it with tasks. They can be used to give every node a copy of a large input dataset in an efficient manner. Spark also attempts to distribute broadcast variables using efficient broadcast algorithms to reduce communication cost.
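A minimal sketch (the lookup map is invented for illustration):

val lookup = sc.broadcast(Map("IN" -> "India", "US" -> "United States"))   // cached once per executor
val codes  = sc.parallelize(Seq("IN", "US", "IN"))
val names  = codes.map(code => lookup.value.getOrElse(code, "Unknown"))    // tasks read the broadcast value
names.collect().foreach(println)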
Please explain the concept of RDD (Resilient Distributed Dataset). Also, state how you can create RDDs in Apache Spark.
An RDD or Resilient Distribution Dataset is a fault-tolerant collection of operational elements that are capable to run in parallel. Any partitioned data in an RDD is distributed and immutable.
Fundamentally, RDDs are portions of data that are stored in memory distributed over many nodes. RDDs are lazily evaluated in Spark, which is the main factor contributing to the faster speed achieved by Apache Spark. RDDs are of two types:
- Hadoop Datasets – Perform functions on each file record in HDFS (Hadoop Distributed File System) or other types of storage systems
- Parallelized Collections – Extant RDDs running parallel with one another
There are two ways of creating an RDD in Apache Spark:
- By parallelizing a collection in the Driver program. It makes use of SparkContext’s parallelize() method. For instance:
val DataArray = Array(22,24,46,81,101)
val DataRDD = sc.parallelize(DataArray)
- By means of loading an external dataset from some external storage, including HBase, HDFS, and shared file system
Define Spark Executor?
A Spark executor is launched when SparkContext connects to the cluster manager and acquires nodes in the cluster. It runs the computations and stores the data on the worker node.
Can we run Apache Spark on the Apache Mesos?
Yes, we can run Apache Spark on the Apache Mesos by using the hardware clusters that are managed by Mesos.
Can we trigger automated clean-ups in Spark?
Yes, we can trigger automated clean-ups in Spark to handle the accumulated metadata. This can be done by setting the parameter "spark.cleaner.ttl".
What is another method than “Spark.cleaner.ttl” to trigger automated clean-ups in Spark?
Another method, besides setting "spark.cleaner.ttl", to trigger automated clean-ups in Spark is to divide the long-running jobs into different batches and write the intermediary results to disk.
What is the role of Akka in Spark?
Akka in Spark helps in the scheduling process. Workers request tasks from the master after registering, and the master assigns them; Akka handles this messaging between the workers and the master.
Define Schema RDD in Apache Spark RDD?
A Schema RDD is an RDD that carries row objects (wrappers around basic string or integer arrays) along with schema information about the type of data in each column. It has since been renamed the DataFrame API.
Why is Schema RDD designed?
Schema RDD was designed to make code debugging and unit testing easier for developers working on the Spark SQL core module.
What is the basic difference between Spark SQL, HQL, and SQL?
Spark SQL supports SQL and the Hive Query Language without changing any syntax. We can join SQL and HQL tables using Spark SQL.
What kinds of file systems are supported by Spark?
Spark supports three kinds of file systems, which include the following:
- Amazon S3
- Hadoop Distributed File System (HDFS)
- Local File System.
Explain accumulators in Apache Spark.
Accumulators are variables that are only added through an associative and commutative operation. They are used to implement counters or sums. Tracking accumulators in the UI can be useful for understanding the progress of running stages. Spark natively supports numeric accumulators. We can create named or unnamed accumulators.
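As a sketch using the Spark 2.x numeric accumulator API (the log lines are invented):

val errorCount = sc.longAccumulator("errorCount")                        // a named long accumulator
val lines = sc.parallelize(Seq("ok", "ERROR: disk", "ok", "ERROR: net"))
lines.foreach(line => if (line.startsWith("ERROR")) errorCount.add(1))   // tasks only add to it
println(errorCount.value)                                                // read on the driver: 2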
Why is there a need for broadcast variables when working with Apache Spark?
Broadcast variables are read only variables, present in-memory cache on every machine. When working with Spark, usage of broadcast variables eliminates the necessity to ship copies of a variable for every task, so data can be processed faster. Broadcast variables help in storing a lookup table inside the memory which enhances the retrieval efficiency when compared to an RDD lookup().
How can you trigger automatic clean-ups in Spark to handle accumulated metadata?
You can trigger the clean-ups by setting the parameter ‘spark.cleaner.ttl’ or by dividing the long running jobs into different batches and writing the intermediary results to the disk.
What is the significance of Sliding Window operation?
In Spark Streaming, the library provides windowed computations, where transformations on RDDs are applied over a sliding window of data. Whenever the window slides, the RDDs that fall within that particular window are combined and operated upon to produce new RDDs of the windowed DStream.
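For example, a minimal word-count-over-a-window sketch (the socket source, 30-second window, and 10-second slide are illustrative):

import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(1))            // 1-second batch interval
val words = ssc.socketTextStream("localhost", 9999)
  .flatMap(_.split(" "))
  .map(word => (word, 1))

// Count words over the last 30 seconds of data, recomputed every 10 seconds
val windowedCounts = words.reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(30), Seconds(10))
windowedCounts.print()
ssc.start()
ssc.awaitTermination()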
What is a DStream in Apache Spark?
Discretized Stream (DStream) is the basic abstraction provided by Spark Streaming. It is a continuous stream of data. It is received from a data source or from a processed data stream generated by transforming the input stream. Internally, a DStream is represented by a continuous series of RDDs and each RDD contains data from a certain interval. Any operation applied on a DStream translates to operations on the underlying RDDs.
DStreams can be created from various sources like Apache Kafka, HDFS, and Apache Flume. DStreams have two operations:
- Transformations that produce a new DStream.
- Output operations that write data to an external system.
There are many DStream transformations possible in Spark Streaming. Let us look at filter(func). filter(func) returns a new DStream by selecting only the records of the source DStream on which func returns true.
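For instance, assuming lines is a DStream[String] (e.g., created with ssc.socketTextStream), filter(func) could be used like this:

val errorLines = lines.filter(_.contains("ERROR"))   // new DStream containing only matching records
errorLines.print()                                   // output operation that writes the results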
Explain Caching in Spark Streaming.
DStreams allow developers to cache/ persist the stream’s data in memory. This is useful if the data in the DStream will be computed multiple times. This can be done using the persist() method on a DStream. For input streams that receive data over the network (such as Kafka, Flume, Sockets, etc.), the default persistence level is set to replicate the data to two nodes for fault-tolerance.
When running Spark applications, is it necessary to install Spark on all the nodes of YARN cluster?
Spark need not be installed when running a job under YARN or Mesos, because Spark can execute on top of YARN or Mesos clusters without requiring any change to the cluster.
What are the various data sources available in Spark SQL?
Parquet file, JSON datasets and Hive tables are the data sources available in Spark SQL.
What are the various levels of persistence in Apache Spark?
Apache Spark automatically persists the intermediary data from various shuffle operations, however, it is often suggested that users call persist () method on the RDD in case they plan to reuse it. Spark has various persistence levels to store the RDDs on disk or in memory or as a combination of both with different replication levels.
The various storage/persistence levels in Spark are:
- MEMORY_ONLY: Store RDD as deserialized Java objects in the JVM. If the RDD does not fit in memory, some partitions will not be cached and will be recomputed on the fly each time they’re needed. This is the default level.
- MEMORY_AND_DISK: Store RDD as deserialized Java objects in the JVM. If the RDD does not fit in memory, store the partitions that don’t fit on disk, and read them from there when they’re needed.
- MEMORY_ONLY_SER: Store RDD as serialized Java objects (one byte array per partition).
- MEMORY_AND_DISK_SER: Similar to MEMORY_ONLY_SER, but spill partitions that don’t fit in memory to disk instead of recomputing them on the fly each time they’re needed.
- DISK_ONLY: Store the RDD partitions only on disk.
- OFF_HEAP: Similar to MEMORY_ONLY_SER, but store the data in off-heap memory.
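As a small sketch (the HDFS path is illustrative), a persistence level is chosen with persist():

import org.apache.spark.storage.StorageLevel

val logs = sc.textFile("hdfs:///path/to/logs")
logs.persist(StorageLevel.MEMORY_AND_DISK)            // spill partitions that do not fit in memory to disk
println(logs.filter(_.contains("ERROR")).count())     // first action materializes and caches the RDD
println(logs.filter(_.contains("WARN")).count())      // subsequent actions reuse the persisted partitions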
Does Apache Spark provide checkpoints?
Yes. Checkpoints are similar to checkpoints in gaming: they allow an application to run 24/7 and make it resilient to failures unrelated to the application logic.
Figure: Spark Interview Questions – Checkpoints
Lineage graphs are always useful to recover RDDs from a failure, but this is generally time-consuming if the RDDs have long lineage chains. Spark has an API for checkpointing, i.e., a REPLICATE flag to persist. However, the decision on which data to checkpoint is made by the user. Checkpoints are useful when the lineage graphs are long and have wide dependencies.
How Spark uses Akka?
Spark uses Akka basically for scheduling. All the workers request for a task to master after registering. The master just assigns the task. Here Spark uses Akka for messaging between the workers and masters.
What do you understand by Lazy Evaluation?
Spark is deliberate in the manner in which it operates on data. When you tell Spark to operate on a given dataset, it heeds the instructions and makes a note of them, but it does nothing unless asked for the final result. When a transformation like map() is called on an RDD, the operation is not performed immediately. Transformations in Spark are not evaluated until you perform an action. This helps optimize the overall data processing workflow.
What do you understand by SchemaRDD in Apache Spark RDD?
SchemaRDD is an RDD that consists of row objects (wrappers around the basic string or integer arrays) with schema information about the type of data in each column.
SchemaRDD was designed as an attempt to make life easier for developers in their daily routines of code debugging and unit testing on SparkSQL core module. The idea can boil down to describing the data structures inside RDD using a formal description similar to the relational database schema. On top of all basic functions provided by common RDD APIs, SchemaRDD also provides some straightforward relational query interface functions that are realized through SparkSQL.
Now, it is officially renamed to DataFrame API on Spark’s latest trunk.
How is Spark SQL different from HQL and SQL?
Spark SQL is a special component on the Spark Core engine that supports SQL and the Hive Query Language without changing any syntax. It is possible to join SQL tables and HQL tables using Spark SQL.
Explain a scenario where you will be using Spark Streaming.
When it comes to Spark Streaming, the data is streamed in real-time onto our Spark program.
Twitter Sentiment Analysis is a real-life use case of Spark Streaming. Trending Topics can be used to create campaigns and attract a larger audience. It helps in crisis management, service adjusting and target marketing.
Sentiment refers to the emotion behind a social media mention online. Sentiment Analysis is categorizing the tweets related to a particular topic and performing data mining using Sentiment Automation Analytics Tools.
Spark Streaming can be used to gather live tweets from around the world into the Spark program. This stream can be filtered using Spark SQL and then we can filter tweets based on the sentiment. The filtering logic will be implemented using MLlib where we can learn from the emotions of the public and change our filtering scale accordingly.
Figure: Sentiments for the tweets containing the word 'Trump'.
What is Apache Spark?
Spark is a fast, easy-to-use, and flexible data processing framework. It has an advanced execution engine supporting a cyclic data flow and in-memory computing. Apache Spark can run standalone, on Hadoop, or in the cloud and is capable of accessing diverse data sources including HDFS, HBase, and Cassandra, among others.
Explain the key features of Spark.
- Apache Spark allows integrating with Hadoop.
- It has an interactive language shell, Scala (the language in which Spark is written).
- Spark consists of RDDs (Resilient Distributed Datasets), which can be cached across the computing nodes in a cluster.
- Apache Spark supports multiple analytic tools that are used for interactive query analysis, real-time analysis, and graph processing
Define RDD.
RDD is the acronym for Resilient Distribution Datasets—a fault-tolerant collection of operational elements that run in parallel. The partitioned data in an RDD is immutable and distributed. There are primarily two types of RDDs:
- Parallelized collections: The existing RDDs running in parallel with one another
- Hadoop datasets: Those performing a function on each file record in HDFS or any other storage system
What is Hive on Spark?
- Hive contains significant support for Apache Spark, wherein Hive execution can be configured to use Spark as its engine:
- hive> set spark.home=/location/to/sparkHome;
- hive> set hive.execution.engine=spark;
- Hive supports Spark on YARN mode by default.
Name the commonly used Spark Ecosystems.
- Spark SQL (Shark) for developers
- Spark Streaming for processing live data streams
- GraphX for generating and computing graphs
- MLlib (Machine Learning Algorithms)
- SparkR to promote R programming in the Spark engine
Define Spark Streaming.
- Spark supports stream processing—an extension to the Spark API allowing stream processing of live data streams.
Data from different sources like Kafka, Flume, Kinesis is processed and then pushed to file systems, live dashboards, and databases. It is similar to batch processing in terms of the input data which is here divided into streams like batches in batch processing.
What does a Spark Engine do?
A Spark engine is responsible for scheduling, distributing, and monitoring the data application across the cluster.
Define Partitions.
As the name suggests, a partition is a smaller and logical division of data similar to a ‘split’ in MapReduce. Partitioning is the process of deriving logical units of data to speed up data processing. Everything in Spark is a partitioned RDD.
What operations does an RDD support?
- Transformations
- Actions
What do you understand by Transformations in Spark?
Transformations are functions applied to RDDs, resulting in another RDD. They do not execute until an action occurs. Functions such as map() and filter() are examples of transformations: map() applies the function passed to it to every element of the RDD and produces a new RDD, while filter() creates a new RDD by selecting elements from the current RDD that pass the function argument.
Define Actions in Spark.
In Spark, an action helps in bringing back data from an RDD to the local machine. Actions are RDD operations that give non-RDD values. The reduce() function is an action that is applied repeatedly until only one value is left. The take() action takes all the values from an RDD to the local node.
Define the functions of Spark Core.
Serving as the base engine, Spark Core performs various important functions like memory management, monitoring jobs, providing fault-tolerance, job scheduling, and interaction with storage systems.
What is RDD Lineage?
Spark does not support data replication in memory and thus, if any data is lost, it is rebuilt using RDD lineage.
RDD lineage is a process that reconstructs lost data partitions. The best thing about this is that RDDs always remember how to build from other datasets.
What is Spark Driver?
Spark driver is the program that runs on the master node of a machine and declares transformations and actions on data RDDs. In simple terms, a driver in Spark creates SparkContext, connected to a given Spark Master. It also delivers RDD graphs to Master, where the standalone Cluster Manager runs.
What is GraphX?
Spark uses GraphX for graph processing to build and transform interactive graphs. The GraphX component enables programmers to reason about structured data at scale.
What does MLlib do?
MLlib is a scalable Machine Learning library provided by Spark. It aims at making Machine Learning easy and scalable with common learning algorithms and use cases like clustering, regression filtering, dimensional reduction, and the like.
What is Spark SQL?
Spark SQL, better known as Shark, is a novel module introduced in Spark to perform structured data processing. Through this module, Spark executes relational SQL queries on data. The core of this component supports an altogether different RDD called Schema RDD, composed of row objects and schema objects defining the data type of each column in a row. It is similar to a table in relational databases.
What is a Parquet file?
Parquet is a columnar format file supported by many other data processing systems. Spark SQL performs both read and write operations with the Parquet file and considers it to be one of the best Big Data Analytics formats so far.
What file systems does Apache Spark support?
- Hadoop Distributed File System (HDFS)
- Local file system
- Amazon S3
What is YARN?
Similar to Hadoop, YARN is one of the key features in Spark, providing a central resource management platform to deliver scalable operations across the cluster. Running Spark on YARN needs a binary distribution of Spark that is built with YARN support.
List the functions of Spark SQL.
Spark SQL is capable of:
- Loading data from a variety of structured sources
- Querying data using SQL statements, both inside a Spark program and from external tools that connect to Spark SQL through standard database connectors (JDBC/ODBC), e.g., using Business Intelligence tools like Tableau
- Providing rich integration between SQL and the regular Python/Java/Scala code, including the ability to join RDDs and SQL tables, expose custom functions in SQL, and more.
What are the benefits of Spark over MapReduce?
- Due to the availability of in-memory processing, Spark implements data processing 10–100x faster than Hadoop MapReduce. MapReduce, on the other hand, makes use of persistent storage for any of the data processing tasks.
- Unlike Hadoop, Spark provides in-built libraries to perform multiple tasks using batch processing, streaming, Machine Learning, and interactive SQL queries. However, Hadoop only supports batch processing.
- Hadoop is highly disk-dependent, whereas Spark promotes caching and in-memory data storage.
- Spark is capable of performing computations multiple times on the same dataset, which is called iterative computation. Whereas, there is no iterative computing implemented by Hadoop.
Is there any benefit of learning MapReduce?
Yes, MapReduce is a paradigm used by many Big Data tools, including Apache Spark. It becomes extremely relevant to use MapReduce when data grows bigger and bigger. Most tools like Pig and Hive convert their queries into MapReduce phases to optimize them better.
What is Spark Executor?
When SparkContext connects to Cluster Manager, it acquires an executor on the nodes in the cluster. Executors are Spark processes that run computations and store data on worker nodes. The final tasks by SparkContext are transferred to executors for their execution.
Name the types of Cluster Managers in Spark.
The Spark framework supports three major types of Cluster Managers.
- Standalone: A basic Cluster Manager to set up a cluster
- Apache Mesos: A generalized/commonly-used Cluster Manager, running Hadoop MapReduce and other applications
- YARN: A Cluster Manager responsible for resource management in Hadoop
What do you understand by a Worker node?
A worker node refers to any node that can run the application code in a cluster.
What is PageRank?
A unique feature and algorithm in GraphX, PageRank measures the importance of each vertex in a graph. For instance, an edge from u to v represents an endorsement of v's importance by u. In simple terms, if a user on Instagram is followed massively, he/she will be ranked high on that platform.
Do you need to install Spark on all the nodes of the YARN cluster while running Spark on YARN?
No, because Spark runs on top of YARN.
Illustrate some demerits of using Spark.
Since Spark utilizes more storage space when compared to Hadoop and MapReduce, there might arise certain problems. Developers need to be careful while running their applications on Spark. To resolve the issue, they can think of distributing the workload over multiple clusters, instead of running everything on a single node.
How to create an RDD?
Spark provides two methods to create an RDD:
- By parallelizing a collection in the driver program. This makes use of SparkContext's 'parallelize' method:
val TecklearnData = Array(2,4,6,8,10)
val distTecklearnData = sc.parallelize(TecklearnData)
- By loading an external dataset from external storage like HDFS or a shared file system
What are Spark DataFrames?
When a dataset is organized into SQL-like columns, it is known as a DataFrame.
This is, in concept, equivalent to a data table in a relational database or a literal ‘DataFrame’ in R or Python. The only difference is the fact that Spark DataFrames are optimized for Big Data.
What are Spark Datasets?
Datasets are data structures in Spark (added since Spark 1.6) that provide the JVM object benefits of RDDs (the ability to manipulate data with lambda functions), alongside a Spark SQL-optimized execution engine.
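A minimal spark-shell style sketch (assuming a SparkSession named spark; the Person case class and its values are invented for illustration):

case class Person(name: String, age: Long)

import spark.implicits._                                        // brings the encoders for toDS() into scope
val ds = Seq(Person("Andy", 32), Person("Justin", 19)).toDS()
val adults = ds.filter(p => p.age > 20)                         // typed lambda, checked at compile time
adults.show()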
Which languages can Spark be integrated with?
Spark can be integrated with the following languages:
- Python, using the Spark Python API
- R, using the R on Spark API
- Java, using the Spark Java API
- Scala, using the Spark Scala API
What do you mean by in-memory processing?
In-memory processing refers to the instant access of data from physical memory whenever the operation is called for.
This methodology significantly reduces the delay caused by the transfer of data. Spark uses this method to access large chunks of data for querying or processing.
What is lazy evaluation?
Spark implements a functionality, wherein if you create an RDD out of an existing RDD or a data source, the materialization of the RDD will not occur until the RDD needs to be interacted with. This is to ensure the avoidance of unnecessary memory and CPU usage that occurs due to certain mistakes, especially in the case of Big Data Analytics.
Apache Spark Vs Hadoop
Why Spark?
Spark is a third-generation distributed data processing platform. It is a unified big data solution for all big data processing problems such as batch, interactive, and streaming processing, so it can ease many big data problems.
What is RDD?
Spark's primary core abstraction is called the Resilient Distributed Dataset (RDD). An RDD is a collection of partitioned data that satisfies these properties: immutable, distributed, lazily evaluated, and cacheable.
What is Immutable?
Once created and assigned a value, it is not possible to change it; this property is called immutability. Spark RDDs are immutable by default; they do not allow updates and modifications. Please note that the data collection itself is not immutable, but the data values are.
What is Distributed?
An RDD's data is automatically distributed across different parallel computing nodes.
What is Cacheable?
Cacheable means keeping all the data in memory for computation, rather than going to disk. By caching data in memory, Spark can process it up to 100 times faster than Hadoop.
What is Spark engine responsibility?
The Spark engine is responsible for scheduling, distributing, and monitoring the application across the cluster.
What are common Spark Ecosystems?
- Spark SQL(Shark) for SQL developers,
- Spark Streaming for streaming data,
- MLLib for machine learning algorithms,
- GraphX for Graph computation,
- SparkR to run R on Spark engine,
- BlinkDB enabling interactive queries over massive data are common Spark ecosystems. GraphX, SparkR, and BlinkDB are in the incubation stage.
What is Partitions?
A partition is a logical division of the data; this idea is derived from the MapReduce split. Data is divided into logical units specifically so it can be processed in small chunks, which also supports scalability and speeds up the process. Input data, intermediate data, and output data are all partitioned RDDs.
How spark partition the data?
Spark uses the MapReduce API to partition the data. In the input format we can specify the number of partitions. By default, the HDFS block size is the partition size (for best performance), but it is possible to change the partition size, similar to a split.
How Spark store the data?
Spark is a processing engine; there is no storage engine. It can retrieve data from any storage engine like HDFS, S3, and other data sources.
Is it mandatory to start Hadoop to run spark application?
No, it is not mandatory, but since there is no separate storage in Spark, it uses the local file system to store the data. You can load data from the local system and process it; Hadoop or HDFS is not mandatory to run a Spark application.
What is SparkContext?
When a programmer creates RDDs, SparkContext connects to the Spark cluster to create a new SparkContext object. SparkContext tells Spark how to access the cluster. SparkConf is a key factor in creating the programmer's application.
What is SparkCore functionalities?
Spark Core is the base engine of the Apache Spark framework. Memory management, fault tolerance, scheduling and monitoring jobs, and interacting with storage systems are the primary functionalities of Spark.
How SparkSQL is different from HQL and SQL?
Spark SQL is a special component on the Spark Core engine that supports SQL and the Hive Query Language without changing any syntax. It is possible to join SQL tables and HQL tables.
When did we use Spark Streaming?
Spark Streaming is an API for real-time processing of streaming data. Spark Streaming gathers streaming data from different sources like web server log files, social media data, stock market data, or Hadoop ecosystem tools like Flume and Kafka.
How Spark Streaming API works?
The programmer sets a specific time interval in the configuration; whatever data gets into Spark within that time is separated as a batch. The input stream (DStream) goes into Spark Streaming, and the framework breaks it up into small chunks called batches, which are then fed into the Spark engine for processing. The Spark Streaming API passes those batches to the core engine, and the core engine generates the final results in the form of streaming batches. The output is also in the form of batches. This allows both streaming data and batch data to be processed.
What is Spark MLlib?
Mahout is a machine learning library for Hadoop; similarly, MLlib is a Spark library. MLlib provides different algorithms that scale out on the cluster for data processing. Most data scientists use this MLlib library.
What is GraphX?
GraphX is a Spark API for manipulating graphs and collections. It unifies ETL, other analysis, and iterative graph computation. It is a very fast graph system that provides fault tolerance and ease of use without requiring special skills.
What is File System API?
FS API can read data from different storage devices like HDFS, S3 or local FileSystem. Spark uses FS API to read data from different storage engines.
Why Partitions are immutable?
Every transformation generates a new partition. Partitions use the HDFS API so that partitions are immutable, distributed, and fault tolerant. Partitions are also aware of data locality.
What is Transformation in spark?
Spark provides two special operations on RDDs, called transformations and actions. A transformation is a lazy operation that temporarily holds the data unless and until an action is called. Each transformation generates/returns a new RDD. map, flatMap, groupByKey, reduceByKey, filter, cogroup, join, sortByKey, union, distinct, and sample are common Spark transformations.
What is Action in Spark?
Actions are RDD operations whose values are returned back to the Spark driver program, which kicks off a job to execute on the cluster. A transformation's output is an input to actions. reduce, collect, takeSample, take, first, saveAsTextFile, saveAsSequenceFile, countByKey, and foreach are common actions in Apache Spark.
What is RDD Lineage?
Lineage is the process an RDD uses to reconstruct lost partitions. Spark does not replicate the data in memory; if data is lost, the RDD uses lineage to rebuild the lost data. Each RDD remembers how it was built from other datasets.
What is Spark?
Spark is a parallel data processing framework. It allows developing fast, unified big data applications that combine batch, streaming, and interactive analytics.
What is Map and flatMap in Spark?
map() processes each input item (a specific line or row) and produces exactly one output item. In flatMap(), each input item can be mapped to zero or more output items (so the function should return a Seq rather than a single item); it is most frequently used to flatten results such as array elements, as the sketch below shows.
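A minimal sketch of the difference (the input strings are arbitrary):

val lines  = sc.parallelize(Seq("hello world", "hi"))
val mapped = lines.map(_.split(" "))        // RDD[Array[String]]: Array(hello, world), Array(hi)
val flat   = lines.flatMap(_.split(" "))    // RDD[String]: hello, world, hi
println(flat.count())                       // 3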
What are broadcast variables?
Broadcast variables let the programmer keep a read-only variable cached on each machine, rather than shipping a copy of it with tasks. Spark supports two types of shared variables: broadcast variables (like the Hadoop distributed cache) and accumulators (like Hadoop counters). Broadcast variables are stored as array buffers, which send read-only values to worker nodes.
What are Accumulators in Spark?
Accumulators are Spark's offline debuggers. Spark accumulators are similar to Hadoop counters: you can use them to count the number of events and track what is happening during a job. Only the driver program can read an accumulator's value, not the tasks.
How RDD persist the data?
There are two methods to persist the data: persist(), which lets you choose a storage level, and cache(), which persists the data in memory. Different storage level options are available, such as MEMORY_ONLY, MEMORY_AND_DISK, DISK_ONLY, and many more. Both persist() and cache() use different options depending on the task.
When do you use apache spark? OR What are the benefits of Spark over Mapreduce?
- Spark is really fast. As per their claims, it runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. It aptly utilizes RAM to produce the faster results.
- In map reduce paradigm, you write many Map-reduce tasks and then tie these tasks together using Oozie/shell script. This mechanism is very time consuming and the map-reduce task has heavy latency.
- And quite often, translating the output out of one MR job into the input of another MR job might require writing another code because Oozie may not suffice.
- In Spark, you can basically do everything using single application/console (pyspark or scala console) and get the results immediately. Switching between ‘Running something on cluster’ and ‘doing something locally’ is fairly easy and straightforward. This also leads to less context switch of the developer and more productivity.
- Spark is roughly equivalent to MapReduce and Oozie put together.
Is there is a point of learning MapReduce, then?
Yes. For the following reason:
- MapReduce is a paradigm used by many big data tools including Spark. So, understanding the MapReduce paradigm and how to convert a problem into series of MR tasks is very important.
- When the data grows beyond what can fit into the memory on your cluster, the Hadoop Map-Reduce paradigm is still very relevant.
- Almost, every other tool such as Hive or Pig converts its query into MapReduce phases. If you understand the Mapreduce then you will be able to optimize your queries better.
When running Spark on Yarn, do I need to install Spark on all nodes of Yarn Cluster?
Since spark runs on top of Yarn, it utilizes yarn for the execution of its commands over the cluster’s nodes.
So, you just have to install Spark on one node.
What are the downsides of Spark?
Spark utilizes memory. The developer has to be careful; a casual developer might make the following mistakes:
- She may end up running everything on the local node instead of distributing work over to the cluster.
- She might hit some web service too many times by way of using multiple clusters.
The first problem is well tackled by the Hadoop MapReduce paradigm, as it ensures that the data your code is churning is fairly small at any point in time, so you cannot make the mistake of trying to handle the whole dataset on a single node.
The second mistake is possible in Map-Reduce too. While writing Map-Reduce, user may hit a service from inside of map() or reduce() too many times. This overloading of service is also possible while using Spark.
What is an RDD?
The full form of RDD is Resilient Distributed Dataset. It is a representation of data located on a network which is:
- Immutable – You can operate on the rdd to produce another rdd but you can’t alter it.
- Partitioned / Parallel – The data located on RDD is operated in parallel. Any operation on RDD is done using multiple nodes.
- Resilient – If one of the nodes hosting a partition fails, another node takes over its data.
RDD provides two kinds of operations: Transformations and Actions.
What are Transformations?
The transformations are the functions that are applied on an RDD (resilient distributed data set). The transformation results in another RDD. A transformation is not executed until an action follows.
The example of transformations are:
- map() – applies the function passed to it on each element of RDD resulting in a new RDD.
- filter() – creates a new RDD by picking the elements from the current RDD which pass the function argument.
What are Actions?
An action brings back the data from the RDD to the local machine. Execution of an action results in all the previously created transformation. The example of actions are:
- reduce() – executes the function passed again and again until only one value is left. The function should take two arguments and return one value.
- take() – takes all the values back to the local node from the RDD.
Say I have a huge list of numbers in an RDD (say myrdd), and I wrote the following code to compute the average:
def myAvg(x, y):
return (x+y)/2.0;
avg = myrdd.reduce(myAvg);
What is wrong with it? And How would you correct it?
The average function is neither commutative nor associative. I would simply sum the numbers and then divide by the count:
def sum(x, y):
    return x + y

total = myrdd.reduce(sum)
avg = total / myrdd.count()
The only problem with the above code is that the total might become very big and thus overflow. So, I would rather divide each number by the count first and then sum, in the following way:
cnt = myrdd.count()

def divideByCnt(x):
    return x / cnt

myrdd1 = myrdd.map(divideByCnt)
avg = myrdd1.reduce(sum)
Say I have a huge list of numbers in a file in HDFS. Each line has one number, and I want to compute the square root of the sum of squares of these numbers. How would you do it?
# We would first load the file as RDD from HDFS on spark
numsAsText = sc.textFile("hdfs://hadoop1.knowbigdata.com/user/student/sgiri/mynumbersfile.txt")
# Define the function to compute the squares
def toSqInt(str):
    v = int(str)
    return v * v
#Run the function on spark rdd as transformation
nums = numsAsText.map(toSqInt);
#Run the summation as reduce action
total = nums.reduce(sum)
#finally compute the square root. For which we need to import math.
1
2 |
import math;
print math.sqrt(total); |
Is the following approach correct? Is the sqrtOfSumOfSq a valid reducer?
import math;
numsAsText = sc.textFile("hdfs://hadoop1.knowbigdata.com/user/student/sgiri/mynumbersfile.txt");

def toInt(str):
    return int(str);

nums = numsAsText.map(toInt);

def sqrtOfSumOfSq(x, y):
    return math.sqrt(x*x + y*y);

total = nums.reduce(sqrtOfSumOfSq);
print total;
Yes. The approach is correct and sqrtOfSumOfSq is a valid reducer: it is commutative and associative, and applying it repeatedly yields the square root of the sum of squares of all the elements.
Could you compare the pros and cons of your approach (in Question 2 above) and my approach (in Question 3 above)?
You are doing the square and square root as part of reduce action while I am squaring in map() and summing in reduce in my approach.
My approach will be faster because in your case the reducer code is heavier, as it calls math.sqrt(), and the reducer code is generally executed approximately n-1 times, where n is the number of elements in the RDD.
The only downside of my approach is that there is a higher chance of integer overflow, because I compute the squares in map() and sum them in reduce().
If you have to compute the total counts of each of the unique words on spark, how would you go about it?
# Load bigtextfile.txt as an RDD in Spark
lines = sc.textFile("hdfs://hadoop1.knowbigdata.com/user/student/sgiri/bigtextfile.txt");

# Define a function that breaks each line into words
def toWords(line):
    return line.split();

# Run toWords on each element of the RDD as a flatMap transformation.
# We use flatMap instead of map because our function returns multiple values.
words = lines.flatMap(toWords);

# Convert each word into a (key, value) pair. Here the key is the word itself and the value is 1.
def toTuple(word):
    return (word, 1);

wordsTuple = words.map(toTuple);

# Now we can easily sum the counts per key with reduceByKey().
def sum(x, y):
    return x + y;

counts = wordsTuple.reduceByKey(sum)

# Finally, print the result
print counts.collect()
In a very huge text file, you want to just check if a particular keyword exists. How would you do this using Spark?
lines = sc.textFile("hdfs://hadoop1.knowbigdata.com/user/student/sgiri/bigtextfile.txt");

def isFound(line):
    if line.find("mykeyword") > -1:
        return 1;
    return 0;

def sum(x, y):
    return x + y;

foundBits = lines.map(isFound);
total = foundBits.reduce(sum);

if total > 0:
    print "FOUND";
else:
    print "NOT FOUND";
Can you improve the performance of this code in previous answer?
Yes. The search does not stop even after the word we are looking for has been found; the map code keeps executing on all the nodes, which is very inefficient.
We could use an accumulator to report whether the word has been found and then cancel the job. Something along these lines:
import thread, threading
from time import sleep

result = "Not Set"
lock = threading.Lock()
accum = sc.accumulator(0)

def map_func(line):
    # introduce delay to emulate the slowness
    sleep(1);
    if line.find("Adventures") > -1:
        accum.add(1);
        return 1;
    return 0;

def start_job():
    global result
    try:
        sc.setJobGroup("job_to_cancel", "some description")
        lines = sc.textFile("hdfs://hadoop1.knowbigdata.com/user/student/sgiri/wordcount/input/big.txt");
        result = lines.map(map_func);
        result.take(1);
    except Exception as e:
        result = "Cancelled"
    lock.release()

def stop_job():
    while accum.value < 3:
        sleep(1);
    sc.cancelJobGroup("job_to_cancel")

supress = lock.acquire()
supress = thread.start_new_thread(start_job, tuple())
supress = thread.start_new_thread(stop_job, tuple())
supress = lock.acquire()
Can you explain the key features of Apache Spark?
- Support for Several Programming Languages – Spark code can be written in any of four programming languages, namely Java, Python, R, and Scala, and it provides high-level APIs in these languages. Additionally, Apache Spark provides shells in Python and Scala: the Python shell is launched via ./bin/pyspark, while the Scala shell is launched via ./bin/spark-shell.
- Lazy Evaluation – Apache Spark makes use of the concept of lazy evaluation, which is to delay the evaluation up until the point it becomes absolutely compulsory.
- Machine Learning – For big data processing, Apache Spark’s MLlib machine learning component is useful. It eliminates the need for using separate engines for processing and machine learning.
- Multiple Format Support – Apache Spark provides support for multiple data sources, including Cassandra, Hive, JSON, and Parquet. The Data Sources API offers a pluggable mechanism for accessing structured data via Spark SQL. These data sources can be much more than simple pipes that convert data and pull it into Spark.
- Real-Time Computation – Spark is designed especially for meeting massive scalability requirements. Thanks to its in-memory computation, Spark’s computation is real-time and has less latency.
- Speed – For large-scale data processing, Spark can be up to 100 times faster than Hadoop MapReduce. Apache Spark achieves this tremendous speed via controlled partitioning. The distributed, general-purpose cluster-computing framework manages data by means of partitions that help parallelize distributed data processing with minimal network traffic.
- Hadoop Integration – Spark offers smooth connectivity with Hadoop. In addition to being a potential replacement for the Hadoop MapReduce functions, Spark is able to run on top of an extant Hadoop cluster by means of YARN for resource scheduling.
What advantages does Spark offer over Hadoop MapReduce?
- Enhanced Speed – MapReduce makes use of persistent storage for carrying out any of the data processing tasks. On the contrary, Spark uses in-memory processing that offers about 10 to 100 times faster processing than the Hadoop MapReduce.
- Multitasking – Hadoop only supports batch processing via inbuilt libraries. Apache Spark, on the other hand, comes with built-in libraries for performing multiple tasks from the same core, including batch processing, interactive SQL queries, machine learning, and streaming.
- No Disk-Dependency – While Hadoop MapReduce is highly disk-dependent, Spark mostly uses caching and in-memory data storage.
- Iterative Computation – Performing computations several times on the same dataset is termed as iterative computation. Spark is capable of iterative computation while Hadoop MapReduce isn’t.
What are the various functions of Spark Core?
Spark Core acts as the base engine for large-scale parallel and distributed data processing. It is the distributed execution engine used in conjunction with the Java, Python, and Scala APIs that offer a platform for distributed ETL (Extract, Transform, Load) application development.
Various functions of Spark Core are:
- Distributing, monitoring, and scheduling jobs on a cluster
- Interacting with storage systems
- Memory management and fault recovery
Furthermore, additional libraries built on top of Spark Core allow it to handle diverse workloads for machine learning, streaming, and SQL query processing.
Please enumerate the various components of the Spark Ecosystem.
- GraphX – Implements graphs and graph-parallel computation
- MLlib – Used for machine learning
- Spark Core – Base engine used for large-scale parallel and distributed data processing
- Spark Streaming – Responsible for processing real-time streaming data
- Spark SQL – Integrates Spark’s functional programming API with relational processing
Is there any API available for implementing graphs in Spark?
GraphX is the API used for implementing graphs and graph-parallel computing in Apache Spark. It extends the Spark RDD with a Resilient Distributed Property Graph. It is a directed multi-graph that can have several edges in parallel.
Each edge and vertex of the Resilient Distributed Property Graph has user-defined properties associated with it. The parallel edges allow for multiple relationships between the same vertices.
In order to support graph computation, GraphX exposes a set of fundamental operators, such as joinVertices, mapReduceTriplets, and subgraph, and an optimized variant of the Pregel API.
The GraphX component also includes an increasing collection of graph algorithms and builders for simplifying graph analytics tasks.
Tell us how will you implement SQL in Spark?
The Spark SQL module helps in integrating relational processing with Spark’s functional programming API. It supports querying data via SQL or HiveQL (Hive Query Language).
Also, Spark SQL supports a wide range of data sources and allows weaving SQL queries with code transformations. The DataFrame API, Data Source API, Interpreter & Optimizer, and SQL Service are the four libraries contained in Spark SQL. A minimal sketch follows.
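As an illustration, a small PySpark sketch using the newer SparkSession entry point (the file name and column names are placeholders, not from the original article):
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-demo").getOrCreate()
df = spark.read.json("people.json")            # any supported structured source works
df.createOrReplaceTempView("people")           # expose the DataFrame to SQL
adults = spark.sql("SELECT name, age FROM people WHERE age >= 18")
adults.show()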
What do you understand by the Parquet file?
Parquet is a columnar format supported by several data processing systems. Spark SQL can perform both read and write operations on Parquet files. Columnar storage has the following advantages (a brief read/write sketch follows the list):
- Able to fetch specific columns for access
- Consumes less space
- Follows type-specific encoding
- Limited I/O operations
- Offers better-summarized data
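A brief sketch of reading and writing Parquet with Spark SQL, reusing the spark session from the earlier sketch (paths are illustrative):
df = spark.read.json("people.json")            # load some structured data
df.write.parquet("people.parquet")             # write it out in columnar Parquet format
back = spark.read.parquet("people.parquet")    # read the Parquet data back
back.select("name").show()                     # only the requested column needs to be scanned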
Can you explain how you can use Apache Spark along with Hadoop?
Having compatibility with Hadoop is one of the leading advantages of Apache Spark. The duo makes up for a powerful tech pair. Using Apache Spark and Hadoop allows for making use of Spark’s unparalleled processing power in line with the best of Hadoop’s HDFS and YARN abilities.
Following are the ways of using Hadoop Components with Apache Spark:
- Batch & Real-Time Processing – MapReduce and Spark can be used together where the former handles the batch processing and the latter is responsible for real-time processing
- HDFS – Spark is able to run on top of the HDFS for leveraging the distributed replicated storage
- MapReduce – It is possible to use Apache Spark along with MapReduce in the same Hadoop cluster or independently as a processing framework
- YARN – Spark applications can run on YARN
Name various types of Cluster Managers in Spark.
- Apache Mesos – Commonly used cluster manager
- Standalone – A basic cluster manager for setting up a cluster
- YARN – Used for resource management
Is it possible to use Apache Spark for accessing and analyzing data stored in Cassandra databases?
Yes, it is possible to use Apache Spark for accessing as well as analyzing data stored in Cassandra databases using the Spark Cassandra Connector. The connector needs to be added to the Spark project; with it, a Spark executor talks to a local Cassandra node and queries only local data.
Connecting Cassandra with Apache Spark allows making queries faster by means of reducing the usage of the network for sending data between Spark executors and Cassandra nodes.
What do you mean by the worker node?
Any node that is capable of running the code in a cluster can be said to be a worker node. The driver program needs to listen for incoming connections and then accept the same from its executors. Additionally, the driver program must be network addressable from the worker nodes.
A worker node is basically a slave node. The master node assigns work that the worker node then performs. Worker nodes process the data stored on the node and report their resources to the master node. The master node schedules tasks based on resource availability.
Please explain the sparse vector in Spark.
A sparse vector is used for storing non-zero entries for saving space. It has two parallel arrays.
- One for indices
- The other for values
An example of a sparse vector is as follows.
Vectors.sparse(7,Array(0,1,2,3,4,5,6),Array(1650d,50000d,800d,3.0,3.0,2009,95054))
What are the main operations of RDD?
There are two main operations of RDD, which include:
- Transformations
- Actions
Define Transformations in Spark?
Transformations are functions applied to an RDD that help in creating another RDD. A transformation does not execute until an action takes place. Examples of transformations are map() and filter().
How will you connect Apache Spark with Apache Mesos?
The step-by-step procedure for connecting Apache Spark with Apache Mesos is:
- Configure the Spark driver program to connect with Apache Mesos
- Put the Spark binary package in a location accessible by Mesos
- Install Apache Spark in the same location as that of the Apache Mesos
- Configure the spark.mesos.executor.home property for pointing to the location where the Apache Spark is installed
Can you explain how to minimize data transfers while working with Spark?
Minimizing data transfers as well as avoiding shuffling helps in writing Spark programs capable of running reliably and fast. Several ways for minimizing data transfers while working with Apache Spark are:
- Avoiding shuffles – ByKey operations, repartition, and other operations that trigger shuffles should be avoided where possible
- Using Accumulators – Accumulators provide a way for updating the values of variables while executing the same in parallel
- Using Broadcast Variables – A broadcast variable helps in enhancing the efficiency of joins between small and large RDDs
What are broadcast variables in Apache Spark? Why do we need them?
Rather than shipping a copy of a variable with tasks, a broadcast variable helps in keeping a read-only cached version of the variable on each machine.
Broadcast variables are also used to provide every node with a copy of a large input dataset. Apache Spark tries to distribute broadcast variables by using effectual broadcast algorithms for reducing communication costs.
Using broadcast variables removes the need to ship copies of a variable for every task, so data can be processed more quickly. Compared to an RDD lookup(), a broadcast variable lets you keep a lookup table in memory on every node, which enhances retrieval efficiency. A minimal sketch:
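The lookup data below is made up purely for illustration; sc is the SparkContext from the PySpark shell:
lookup = {"IN": "India", "US": "United States"}   # small dataset to share with every node
b_lookup = sc.broadcast(lookup)                   # read-only cached copy on each machine

def to_country(code):
    return b_lookup.value.get(code, "Unknown")

codes = sc.parallelize(["IN", "US", "IN", "UK"])
print(codes.map(to_country).collect())            # ['India', 'United States', 'India', 'Unknown']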
Please provide an explanation on DStream in Spark.
DStream is a contraction for Discretized Stream. It is the basic abstraction offered by Spark Streaming and is a continuous stream of data. DStream is received from either a processed data stream generated by transforming the input stream or directly from a data source.
A DStream is represented by a continuous series of RDDs, where each RDD contains data from a certain interval. An operation applied to a DStream is analogous to applying the same operation on the underlying RDDs. A DStream has two operations:
- Output operations responsible for writing data to an external system
- Transformations resulting in the production of a new DStream
It is possible to create DStream from various sources, including Apache Kafka, Apache Flume, and HDFS. Also, Spark Streaming provides support for several DStream transformations.
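A minimal Spark Streaming sketch (assuming a socket text source on localhost:9999, purely for illustration):
from pyspark.streaming import StreamingContext

ssc = StreamingContext(sc, 5)                     # 5-second micro-batches
lines = ssc.socketTextStream("localhost", 9999)   # input DStream
pairs = lines.flatMap(lambda l: l.split()).map(lambda w: (w, 1))
counts = pairs.reduceByKey(lambda a, b: a + b)    # transformation producing a new DStream
counts.pprint()                                   # output operation
ssc.start()
ssc.awaitTermination()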
Does Apache Spark provide checkpoints?
Yes, Apache Spark provides checkpoints. They allow a program to run around the clock, in addition to making it resilient to failures unrelated to application logic. Lineage graphs are used for recovering RDDs from a failure.
Apache Spark comes with an API for adding and managing checkpoints. The user decides which data to checkpoint. Checkpoints are preferred over lineage graphs when the lineage graphs are long and have wide dependencies.
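A small sketch of RDD checkpointing (the checkpoint directory is an assumed, illustrative path; in practice it would be reliable storage such as HDFS):
sc.setCheckpointDir("/tmp/spark-checkpoints")     # where checkpointed data will be saved
rdd = sc.parallelize(range(1000)).map(lambda x: x * x)
rdd.checkpoint()                                  # request that the lineage be truncated
print(rdd.count())                                # the first action materializes and checkpoints the RDD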
What are the different levels of persistence in Spark?
Although the intermediary data from different shuffle operations automatically persists in Spark, it is recommended to use the persist() method on an RDD if the data is to be reused.
Apache Spark features several persistence levels for storing RDDs on disk, in memory, or a combination of the two, with distinct replication levels. These persistence levels are (a short sketch follows the list):
- DISK_ONLY – Stores the RDD partitions only on the disk.
- MEMORY_AND_DISK – Stores RDD as deserialized Java objects in the JVM. In case the RDD isn’t able to fit in the memory, additional partitions are stored on the disk. These are read from here each time the requirement arises.
- MEMORY_ONLY_SER – Stores RDD as serialized Java objects with one-byte array per partition.
- MEMORY_AND_DISK_SER – Identical to MEMORY_ONLY_SER with the exception of storing partitions not able to fit in the memory to the disk in place of recomputing them on the fly when required.
- MEMORY_ONLY – The default level, it stores the RDD as deserialized Java objects in the JVM. In case the RDD isn’t able to fit in the memory available, some partitions won’t be cached, resulting in recomputing the same on the fly every time they are required.
- OFF_HEAP – Works like MEMORY_ONLY_SER but stores the data in off-heap memory.
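A short sketch of choosing a persistence level (the file path is illustrative):
from pyspark import StorageLevel

logs = sc.textFile("/tmp/biglogfile.txt")           # illustrative input
logs.persist(StorageLevel.MEMORY_AND_DISK)          # spill partitions to disk if memory is short
print(logs.count())                                 # first action materializes and caches the RDD
print(logs.filter(lambda l: "ERROR" in l).count())  # reuses the persisted partitions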
Can you list down the limitations of using Apache Spark?
- It doesn’t have a built-in file management system. Hence, it needs to be integrated with other platforms like Hadoop for benefitting from a file management system
- Higher latency but consequently, lower throughput
- No support for true real-time data stream processing – the live data stream is partitioned into batches in Apache Spark, and after processing the results are again produced in batches. Hence, Spark Streaming is micro-batch processing rather than truly real-time data processing
- Lesser number of algorithms available
- Spark streaming doesn’t support record-based window criteria
- The work needs to be distributed over multiple clusters instead of running everything on a single node
- While using Apache Spark for cost-efficient processing of big data, its ‘in-memory’ ability becomes a bottleneck
Define Apache Spark?
Apache Spark is an easy-to-use, highly flexible, and fast processing framework with an advanced engine that supports cyclic data flow and in-memory computing. It can run standalone, in the cloud, or on Hadoop, providing access to varied data sources such as Cassandra, HDFS, HBase, and others.
What is the main purpose of the Spark Engine?
The main purpose of the Spark Engine is to schedule, monitor, and distribute the data application across the cluster.
Define Partitions in Apache Spark?
Partitions in Apache Spark split the data into smaller, relevant, and more logical divisions, similar to the splits in MapReduce. Partitioning is the process of deriving logical units of data so that data processing can proceed at a speedy pace. In Apache Spark, data is partitioned within Resilient Distributed Datasets (RDDs). A small sketch follows.
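A tiny illustration (assuming the PySpark shell’s sc):
rdd = sc.parallelize(range(100), 4)   # explicitly request 4 partitions
print(rdd.getNumPartitions())         # 4: each partition is processed by a separate task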
What is the function of the Map ()?
The function of map() is to iterate over every element (for example, every line) in the RDD, apply a function to it, and produce a new RDD from the results.
What is the function of filter()?
The function of filter() is to create a new RDD by selecting those elements from the existing RDD that pass the function argument.
What are the Actions in Spark?
Actions in Spark help in bringing data back from an RDD to the local machine. They include various RDD operations that return non-RDD values. Actions in Spark include functions such as reduce() and take().
What is the difference between the reduce() and take() functions?
reduce() is an action that applies the supplied function repeatedly until only one value is left, while take() is an action that brings values from the RDD back to the local node.
What are the similarities and differences between coalesce() and repartition() in Spark?
The similarity is that both coalesce() and repartition() are used to modify the number of partitions in an RDD. The difference is that repartition() is implemented in terms of coalesce() with shuffling enabled: repartition() produces the requested number of partitions with the whole data redistributed using a hash partitioner, whereas coalesce() without a shuffle can only narrow the data into fewer partitions. See the sketch below.
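A brief sketch of the difference:
rdd = sc.parallelize(range(100), 8)
fewer = rdd.coalesce(2)               # narrows to 2 partitions, avoiding a full shuffle
more = rdd.repartition(16)            # full shuffle, redistributes data into 16 partitions
print(fewer.getNumPartitions())       # 2
print(more.getNumPartitions())        # 16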
Define RDD Lineage?
RDD Lineage is the process of reconstructing lost data partitions, since Spark does not replicate data in memory. The lineage records how an RDD was built from other datasets, so that it can be recomputed after a failure.
What is a Spark Driver?
The Spark Driver is the program that runs on the master node of the machine and declares transformations and actions on data RDDs. It creates the SparkContext connected to the given Spark Master and delivers the RDD graphs to the Master, where the standalone cluster manager runs.
What is Lazy evaluated?
Spark does not evaluate a set of operations immediately when you write them. Transformations, in particular, are lazily evaluated: they are only executed when an action is triggered.
Define YARN in Spark?
YARN in Spark acts as a central resource management platform that helps in delivering scalable operations throughout the cluster and performs the function of a distributed container manager.
Define PageRank in Spark? Give an example?
PageRank in Spark is an algorithm in GraphX that measures the importance of each vertex in a graph. For example, if a person on Facebook, Instagram, or any other social media platform has a huge number of followers, then his/her page will be ranked higher.
What is Sliding Window in Spark? Give an example?
A Sliding Window in Spark is used to specify each batch of Spark Streaming data that has to be processed. For example, you can set the batch intervals and the number of batches that you want to process through Spark Streaming.
What are the benefits of Sliding Window operations?
Sliding Window operations have the following benefits (a short sketch follows the list):
- It helps in controlling the transfer of data packets between different computer networks.
- It combines the RDDs that fall within the particular window and operates upon them to create new RDDs of the windowed DStream.
- It offers windowed computations to support the transformation of RDDs using the Spark Streaming library.
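A hedged sketch, reusing the (word, 1) pair DStream from the streaming example earlier in this article; the window and slide durations are arbitrary for illustration:
# window of the last 30 seconds, recomputed every 10 seconds
windowed = pairs.reduceByKeyAndWindow(lambda a, b: a + b, None, 30, 10)
windowed.pprint()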
So, this brings us to the end of the Apache Spark Interview Questions blog. This Tecklearn ‘Top Apache Spark Interview Questions and Answers’ blog helps you with commonly asked questions if you are looking for a job in the Apache Spark or Big Data domain. If you wish to learn Apache Spark and build a career in the Big Data domain, then check out our interactive Apache Spark and Scala Training, which comes with 24*7 support to guide you throughout your learning period.
https://www.tecklearn.com/course/apache-spark-and-scala-certification/
Apache Spark and Scala Training
About the Course
Tecklearn Spark training lets you master real-time data processing using Spark streaming, Spark SQL, Spark RDD and Spark Machine Learning libraries (Spark MLlib). This Spark certification training helps you master the essential skills of the Apache Spark open-source framework and Scala programming language, including Spark Streaming, Spark SQL, machine learning programming, GraphX programming, and Shell Scripting Spark. You will also understand the role of Spark in overcoming the limitations of MapReduce. Upon completion of this online training, you will hold a solid understanding and hands-on experience with Apache Spark.
Why Should you take Apache Spark and Scala Training?
- The average salary for Apache Spark developer ranges from approximately $93,486 per year for Developer to $128,313 per year for Data Engineer. – Indeed.com
- Wells Fargo, Microsoft, Capital One, Apple, JPMorgan Chase & many other MNC’s worldwide use Apache Spark across industries.
- Global Spark market revenue will grow to $4.2 billion by 2022 with a CAGR of 67%. – Marketanalysis.com
What you will Learn in this Course?
Introduction to Scala for Apache Spark
- What is Scala
- Why Scala for Spark
- Scala in other Frameworks
- Scala REPL
- Basic Scala Operations
- Variable Types in Scala
- Control Structures in Scala
- Loop, Functions and Procedures
- Collections in Scala
- Array Buffer, Map, Tuples, Lists
Functional Programming and OOPs Concepts in Scala
- Functional Programming
- Higher Order Functions
- Anonymous Functions
- Class in Scala
- Getters and Setters
- Custom Getters and Setters
- Constructors in Scala
- Singletons
- Extending a Class using Method Overriding
Introduction to Spark
- Introduction to Spark
- How Spark overcomes the drawbacks of MapReduce
- Concept of In Memory MapReduce
- Interactive operations on MapReduce
- Understanding Spark Stack
- HDFS Revision and Spark Hadoop YARN
- Overview of Spark and Why it is better than Hadoop
- Deployment of Spark without Hadoop
- Cloudera distribution and Spark history server
Basics of Spark
- Spark Installation guide
- Spark configuration and memory management
- Driver Memory Versus Executor Memory
- Working with Spark Shell
- Resilient distributed datasets (RDD)
- Functional programming in Spark and Understanding Architecture of Spark
Playing with Spark RDDs
- Challenges in Existing Computing Methods
- Probable Solution and How RDD Solves the Problem
- What is RDD, It’s Operations, Transformations & Actions Data Loading and Saving Through RDDs
- Key-Value Pair RDDs
- Other Pair RDDs and Two Pair RDDs
- RDD Lineage
- RDD Persistence
- Using RDD Concepts Write a Wordcount Program
- Concept of RDD Partitioning and How It Helps Achieve Parallelization
- Passing Functions to Spark
Writing and Deploying Spark Applications
- Creating a Spark application using Scala or Java
- Deploying a Spark application
- Scala built application
- Creating application using SBT
- Deploying application using Maven
- Web user interface of Spark application
- A real-world example of Spark and configuring of Spark
Parallel Processing
- Concept of Spark parallel processing
- Overview of Spark partitions
- File Based partitioning of RDDs
- Concept of HDFS and data locality
- Technique of parallel operations
- Comparing coalesce and Repartition and RDD actions
Machine Learning using Spark MLlib
- Why Machine Learning
- What is Machine Learning
- Applications of Machine Learning
- Face Detection: USE CASE
- Machine Learning Techniques
- Introduction to MLlib
- Features of MLlib and MLlib Tools
- Various ML algorithms supported by MLlib
Integrating Apache Flume and Apache Kafka
- Why Kafka, what is Kafka and Kafka architecture
- Kafka workflow and Configuring Kafka cluster
- Basic operations and Kafka monitoring tools
- Integrating Apache Flume and Apache Kafka
Apache Spark Streaming
- Why Streaming is Necessary
- What is Spark Streaming
- Spark Streaming Features
- Spark Streaming Workflow
- Streaming Context and DStreams
- Transformations on DStreams
- Describe Windowed Operators and Why it is Useful
- Important Windowed Operators
- Slice, Window and ReduceByWindow Operators
- Stateful Operators
Improving Spark Performance
- Learning about accumulators
- The common performance issues and troubleshooting the performance problems
DataFrames and Spark SQL
- Need for Spark SQL
- What is Spark SQL
- Spark SQL Architecture
- SQL Context in Spark SQL
- User Defined Functions
- Data Frames and Datasets
- Interoperating with RDDs
- JSON and Parquet File Formats
- Loading Data through Different Sources
Scheduling and Partitioning in Apache Spark
- Concept of Scheduling and Partitioning in Spark
- Hash partition and range partition
- Scheduling applications
- Static partitioning and dynamic sharing
- Concept of Fair scheduling
- Map partition with index and Zip
- High Availability
- Single-node Recovery with Local File System and High Order Functions
Got a question for us? Please mention it in the comments section and we will get back to you.