Understanding Apache Nifi Processors Categorization and its relationship

Last updated on May 30 2022
Swati Dogra

Table of Contents

Understanding Apache Nifi Processors Categorization and its relationship

Apache NiFi – Processors Categorization

Data Ingestion Processors

The processors under Data Ingestion category are used to ingest data into the NiFi data flow. These are mainly the starting point of any data flow in apache NiFi. Some of the processors that belong to these categories are GetFile, GetHTTP, GetFTP, GetKAFKA, etc.

Routing and Mediation Processors

Routing and Mediation processors are used to route the flowfiles to different processors or data flows according to the information in attributes or content of those flowfiles. These processors are also responsible to regulate the NiFi data flows. Some of the processors that belong to this category are RouteOnAttribute, RouteOnContent, ControlRate, RouteText, etc.

Database Access Processors

The processors of this Database Access category are capable of selecting or inserting data or executing and preparing other SQL statements from database. These processors mainly use data connection pool controller setting of Apache NiFi. Some of the processors that belong to this category are ExecuteSQL, PutSQL, PutDatabaseRecord, ListDatabaseTables, etc.

Attribute Extraction Processors

Attribute Extraction Processors are responsible to extract, analyze, change flowfile attributes processing within the NiFi data flow. Some of the processors that belong to this category are UpdateAttribute, EvaluateJSONPath, ExtractText, AttributesToJSON, etc.

System Interaction Processors

System Interaction processors are used to run processes or commands in any operating system. These processors also run scripts in many languages to interact with a variety of systems. Some of the processors that belong to this category are ExecuteScript, ExecuteProcess, ExecuteGroovyScript, ExecuteStreamCommand, etc.

Data Transformation Processors

Processors that belong to Data Transformation are capable of altering content of the flowfiles. These can be used to fully replace the data of a flowfile normally used when a user has to send flowfile as an HTTP body to invokeHTTP processor. Some of the processors that belong to this category are ReplaceText, JoltTransformJSON, etc.

Sending Data Processors

Sending Data Processors are generally the end processor in a data flow. These processors are responsible to store or send data to the destination server. After successful storing or sending the data, these processors DROP the flowfile with success relationship. Some of the processors that belong to this category are PutEmail, PutKafka, PutSFTP, PutFile, PutFTP, etc.

Splitting and Aggregation Processors

These processors are used to split and merge the content present in a flowfile. Some of the processors that belong to this category are SplitText, SplitJson, SplitXml, MergeContent, SplitContent, etc.

HTTP Processors

These processors deal with the HTTP and HTTPS calls. Some of the processors that belong to this category are InvokeHTTP, PostHTTP, ListenHTTP, etc.

AWS Processors

AWS processors are responsible to interaction with Amazon web services system. Some of the processors that belong to this category are GetSQS, PutSNS, PutS3Object, FetchS3Object, etc.

 

 

Apache NiFi – Processors Relationship

In an Apache NiFi data flow, flowfiles move from one to another processor through connection that gets validated using a relationship between processors. Whenever a connection is created, a developer selects one or more relationships between those processors.

10.1

As you can see within the above image, the check boxes in black rectangle are relationships. If a developer selects these check boxes then, the flowfile will terminate in that particular processor, when the relationship is success or failure or both.

Success

When a processor successfully processes a flowfile like store or fetch data from any datasource without getting any connection, authentication or any other error, then the flowfile goes to success relationship.

Failure

When a processor is not able to process a flowfile without errors like authentication error or connection problem, etc. then the flowfile goes to a failure relationship.

A developer also can transfer the flowfiles to other processors using connections. The developer can select and also load balance it, but load balancing is just released in version 1.8, which will not be covered during this tutorial.

10.2

As you’ll see within the above image the connection marked in red have failure relationship, which means all flowfiles with errors will go to the processor in left and respectively all the flowfiles without errors will be transferred to the connection marked in green.

Let us now proceed with the other relationships.

comms.failure

This relationship is met, when a Flowfile couldn’t be fetched from the remote server due to a communications failure.

not.found

Any Flowfile that we receive a ‘Not Found’ message from the remote server will move to not.found relationship.

permission.denied

When NiFi unable to fetch a flowfile from the remote server due to insufficient permission, It’ll move through this relationship.

So, this brings us to the end of Deep Dive into Apache NiFi User Interface blog. This Tecklearn ‘Understanding Apache NiFi processors categorization and its relationsip’ helps you with commonly asked questions if you are looking out for a job in Apache NiFi and Big Data Domain.

If you wish to learn Apache NiFi and build a career in Apache NiFi or Big Data domain, then check out our interactive, Apace NiFi Training, that comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

https://www.tecklearn.com/course/apache-nifi-training/

Apache NiFi Training

About the Course

Tecklearn Apache NiFi Training makes you an expert in Cluster integration and the challenges associated, Usefulness of Automation, Apache NiFi configuration challenges and etc. Apache NiFi that helps you master various aspects of automating dataflow, managing flow of information between systems, streaming analytics, the concepts of data lake and constructs, various methods of data ingestion and real-world Apache NiFi projects .Transforming the databases is becoming a challenge for many organizations and thus they often look for those who have certification in Apache NiFi to help them in automating the flow of data between the systems.

Why Should you take Apache NiFi Training?

  • The Average Salary for Apache NiFi Developers is $96,578 per year. – paysa.com
  • Micron, Macquarie Telecom Group , Dovestech, Payoff, Flexilogix , Hashmap Inc. & many other MNC’s worldwide use Ansible across industries.
  • Apache NiFi is an open source software for automating and managing the flow of data between systems. It is a powerful and reliable system to process and distribute data. It provides a web-based User Interface for creating, monitoring, & controlling data flows.

What you will Learn in this Course?

Overview of Apache NiFi and its capabilities

  • Understanding the Apache NiFi
  • Apache NiFi most interesting features and capabilities

High Level Overview of Key Apache NiFi Features

  • Key features categories: Flow management, Ease of use, Security, Extensible architecture and Flexible scaling model

Advantages of Apache NiFi over other traditional ETL tools

  • Features of NiFi which make it different form traditional ETL tool and gives NiFi an edge over them

Apache NiFi as a Data Ingestion Tool

  • Introduction to Apache NiFi for data ingestion
  • Apache NiFi Processor : Data ingestion tools available for transferring , importing , loading and processing of data

Data Lake Concepts and Constructs (Big Data & Hadoop Environment)

  • Concept of data lake and its attributes
  • Support for colocation of data in various formats and overcoming the problem of data silos

Apache NiFi capabilities in Big Data and Hadoop Environment

  • Introduction to NiFi processors which sync with data lake and Hadoop ecosystem
  • An overview of the various components of the Hadoop ecosystem and data lake

Installation Requirements and Cluster Integration

  • Apache NiFi installation requirements and cluster integration
  • Successfully running Apache NiFi and addition of processor to NiFi
  • Working with attributes and Process of scaling up and down
  • Hands On

Apache NiFi Core Concepts

  • Apache NiFi fundamental concepts
  • Overview of FlowFile, Flow Controller ,FlowFile Processor, and their attributes
  • Functions in dataflow

Architecture of Apache NiFi

  • Architecture of Apache NiFi
  • Various components including FlowFile Repository, Content Repository, Provenance Repository and web-based user interface
  • Hands On

Performance Expectation and Characteristics of NiFi

  • How to utilize maximization of resources is particularly strong with respect to CPU and disk
  • Understand the best practices and configuration tips

Queuing and Buffering Data

  • Buffering of Data in Apache NiFi
  • Concept of queuing, recovery and latency
  • Working with controller services and directed graphs
  • Data transformation and routing
  • Processor connection, addition and configuration
  • Hands On

Database Connection with Apache NiFi

  • Apache NiFi Connection with database
  • Data Splitting, Transforming and Aggregation
  • Monitoring of NiFi and process of data egress
  • Reporting and Data lineage
  • Expression language and Administration of Apache NiFi
  • Hands On

Apache NiFi Configuration Best Practices

  • Apache NiFi configuration Best Practices
  • ZooKeeper access, properties, custom properties and encryption
  • Guidelines for developers
  • Security of Data in Hadoop and NiFi Kerberos interface
  • Hands On

Apache NiFi Project

  • Apache NiFi Installation
  • Configuration and Deployment of toolbar
  • Building a dataflow using NiFi
  • Creating, importing and exporting various templates to construct a dataflow
  • Deploying Real-time ingestion and Batch ingestion in NiFi
  • Hands On

Got a question for us? Please mention it in the comments section and we will get back to you.

 

0 responses on "Understanding Apache Nifi Processors Categorization and its relationship"

Leave a Message

Your email address will not be published. Required fields are marked *