Top Splunk Interview Questions and Answers

Last updated on Feb 18 2022
Rajnikanth S

Table of Contents

Top Splunk Interview Questions and Answers

Define Splunk

Splunk is a software platform that allows users to analyze machine-generated data (from hardware devices, networks, servers, IoT devices, etc.). Splunk is widely used for searching, visualizing, monitoring, and reporting enterprise data. It processes and analyzes machine data and converts it into powerful operational intelligence by offering real-time insights into the data through accurate visualizations.

Splunk is used for analyzing machine data because:

  • It offers business insights – Splunk understands the patterns hidden within the data and turns it into real-time business insights that can be used to make informed business decisions.
  • It provides operational visibility – Splunk leverages machine data to get end-to-end visibility into company operations and then breaks it down across the infrastructure.
  • It facilitates proactive monitoring – Splunk uses machine data to monitor systems in real-time to identify system issues and vulnerabilities (external/internal breaches and attacks).

Name the common port numbers used by Splunk.

The common port numbers for Splunk are:

  • Splunk Web Port: 8000
  • Splunk Management Port: 8089
  • Splunk Network port: 514
  • Splunk Index Replication Port: 8080
  • Splunk Indexing Port: 9997
  • KV store: 8191

Name the components of Splunk architecture.

The Splunk architecture is made of the following components:

  • Search Head – It provides the GUI for searching
  • Indexer – It indexes the machine data
  • Forwarder – It forwards logs to the Indexer
  • Deployment Server – It manages the Splunk components in a distributed environment and distributes configuration apps

What are the different types of Splunk dashboards?

 

There are three different kinds of Splunk dashboards:

  • Real-time dashboards
  • Dynamic form-based dashboards
  • Dashboards for scheduled reports

Name the types of search modes supported in Splunk.

Splunk supports three types of search modes, namely:

  • Fast mode
  • Smart mode
  • Verbose mode

Name the different kinds of Splunk Forwarders.

There are two types of Splunk Forwarders:

  • Universal Forwarder (UF) – It is a lightweight Splunk agent installed on a non-Splunk system to gather data locally. UF cannot parse or index data.
  • Heavyweight Forwarder (HWF) – It is a heavyweight Splunk agent with advanced functionalities, including parsing and indexing capabilities. It is used for filtering data.

 

What is Splunk? Why is Splunk used for analyzing machine data?

This question will most likely be the first question you will be asked in any Splunk interview. You need to start by saying that:

Splunk is a platform that gives people visibility into machine data generated from hardware devices, networks, servers, IoT devices and other sources.

Splunk is used for analyzing machine data for the reasons outlined earlier: it delivers real-time business insights, end-to-end operational visibility, and proactive monitoring of systems for issues and vulnerabilities.

What are the components of Splunk?

Splunk Architecture is a topic which will make its way into any set of Splunk interview questions. As explained in the previous question, the main components of Splunk are: Forwarders, Indexers and Search Heads. You can then mention that another component called the Deployment Server (or Management Console Host) comes into the picture in the case of a larger environment. Deployment servers:

  • Act like an antivirus policy server for setting up exceptions and groups, so that you can map and create a different set of data collection policies for each type of server, whether Windows-based, Linux-based or Solaris-based
  • Can be used to control different applications running on different operating systems from a central location
  • Can be used to deploy the configurations and set policies for different applications from a central location.

Making use of deployment servers is an advantage because configurations, path naming conventions and machine naming conventions, which are independent of every host/machine, can be easily controlled using the deployment server.

Explain how Splunk works.

This is a sure-shot question because your interviewer will judge this answer of yours to understand how well you know the concept. The Forwarder acts like a dumb agent which will collect the data from the source and forward it to the Indexer. The Indexer will store the data locally in a host machine or on cloud. The Search Head is then used for searching, analyzing, visualizing and performing various other functions on the data stored in the Indexer.

What are the unique benefits of getting data into a Splunk instance via Forwarders?

You can say that the benefits of getting data into Splunk via forwarders are bandwidth throttling, a reliable TCP connection, and an encrypted SSL connection for transferring data from a forwarder to an indexer. The data forwarded to the indexer is also load balanced by default, so even if one indexer is down due to a network outage or maintenance, the data can be routed to another indexer instance in a very short time. Also, the forwarder caches the events locally before forwarding them, thus creating a temporary backup of that data.
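To illustrate, here is a minimal outputs.conf sketch for a forwarder, assuming two indexers listening on 9997; the host names and the output group name are hypothetical, and SSL settings (which vary by version) are omitted:

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# load balance across both indexers, switching targets roughly every 30 seconds
server = idx1.example.com:9997, idx2.example.com:9997
autoLBFrequency = 30
# ask the indexer to acknowledge receipt so the forwarder can re-send from its local cache if needed
useACK = true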

Briefly explain the Splunk Architecture

The consolidated view of the Splunk architecture is straightforward: Forwarders installed on the data sources collect and forward raw data; Indexers parse and store that data in indexes; Search Heads provide the interface for searching, analyzing and visualizing the indexed data; and, in larger environments, a Deployment Server distributes configurations and apps to the other components.

What is the use of License Master in Splunk?

License master in Splunk is responsible for making sure that the right amount of data gets indexed. Splunk license is based on the data volume that comes to the platform within a 24hr window and thus, it is important to make sure that the environment stays within the limits of the purchased volume.

Consider a scenario where you get 300 GB of data on day one, 500 GB of data the next day and 1 terabyte of data some other day and then it suddenly drops to 100 GB on some other day. Then, you should ideally have a 1 terabyte/day licensing model. The license master thus makes sure that the indexers within the Splunk deployment have sufficient capacity and are licensing the right amount of data.

Why use only Splunk? Why can’t I go for something that is open source?

This kind of question is asked to understand the scope of your knowledge. You can answer it by saying that Splunk has a lot of competition in the market for analyzing machine logs, doing business intelligence, performing IT operations and providing security. But there is no single tool other than Splunk that can do all of these operations, and that is where Splunk stands out and makes a difference. With Splunk you can easily scale up your infrastructure and get professional support from a company backing the platform. Some of its competitors are Sumo Logic in the cloud space of log management and ELK in the open-source category.

Which Splunk Roles can share the same machine?

This is another frequently asked Splunk interview question which will test the candidate’s hands-on knowledge. In the case of small deployments, most of the roles can be shared on the same machine, including the Indexer, Search Head and License Master. However, in the case of larger deployments the preferred practice is to host each role on a stand-alone host. Details about roles that can be shared even in the case of larger deployments are mentioned below:

  • Strategically, Indexers and Search Heads should have physically dedicated machines. Using Virtual Machines for running the instances separately is not the solution because there are certain guidelines that need to be followed for using computer resources and spinning multiple virtual machines on the same physical hardware can cause performance degradation.
  • However, a License master and Deployment server can be implemented on the same virtual box, in the same instance by spinning different Virtual machines.
  • You can spin another virtual machine on the same instance for hosting the Cluster master as long as the Deployment master is not hosted on a parallel virtual machine on that same instance because the number of connections coming to the Deployment server will be very high.
  • This is because the Deployment server not only caters to the requests coming from the Deployment master, but also to the requests coming from the Forwarders.

What happens if the License Master is unreachable?

In case the license master is unreachable, it is just not possible to search the data. However, the data coming in to the Indexer will not be affected. The data will continue to flow into your Splunk deployment, and the Indexers will continue to index the data as usual; however, you will get a warning message at the top of your Search Head or web UI saying that you have exceeded the indexing volume and you either need to reduce the amount of data coming in or buy a higher-capacity license.

Basically, the candidate is expected to answer that the indexing does not stop; only searching is halted.

Explain ‘license violation’ from Splunk perspective.

If you exceed the data limit, you will be shown a ‘license violation’ error. The license warning that is thrown up will persist for 14 days. With a commercial license you can have 5 warnings within a 30-day rolling window before your Indexer’s search results and reports stop triggering. With the free version, however, it allows only 3 warnings.

Give a few use cases of Knowledge objects.

Knowledge objects can be used in many domains. Few examples are:

Physical Security: If your organization deals with physical security, then you can leverage data containing information about earthquakes, volcanoes, flooding, etc to gain valuable insights

Application Monitoring: By using knowledge objects, you can monitor your applications in real-time and configure alerts which will notify you when your application crashes or any downtime occurs

Network Security: You can increase security in your systems by blacklisting certain IPs from getting into your network. This can be done by using the Knowledge object called lookups.

Employee Management: If you want to monitor the activity of people who are serving their notice period, then you can create a list of those people and set up a rule preventing them from copying data and using it outside

Easier Searching Of Data: With knowledge objects, you can tag information, create event types and define search constraints right at the start, and shorten them so that they are easy to remember, correlate and understand, rather than writing long search queries. The constraints in which you put your search conditions, and then shorten, are called event types.

These are some of the operations that can be done from a non-technical perspective by using knowledge objects. Knowledge objects are where Splunk is actually applied to business problems, which is why Splunk interview questions are incomplete without them.

Why should we use Splunk Alert? What are the different options while setting up Alerts?

This is a common question aimed at candidates appearing for the role of a Splunk Administrator. Alerts can be used when you want to be notified of an erroneous condition in your system. For example, send an email notification to the admin when there are more than three failed login attempts in a twenty-four hour period. Another example is when you want to run the same search query every day at a specific time to give a notification about the system status.

Different options that are available while setting up alerts are:

  • You can create a webhook so that you can write to HipChat or GitHub. You can also send an email to a group of recipients with the subject, priority, and body of the message
  • You can add results, .csv or pdf or inline with the body of the message to make sure that the recipient understands where this alert has been fired, at what conditions and what is the action he has taken
  • You can also create tickets and throttle alerts based on certain conditions like a machine name or an IP address. For example, if there is a virus outbreak, you do not want every alert to be triggered because it will lead to many tickets being created in your system which will be an overload. You can control such alerts from the alert window.
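As a rough illustration, the failed-login email alert described above could be defined in savedsearches.conf along these lines; this is a sketch, assuming the index name, search string and email address, and the exact attribute set may vary by Splunk version:

[Failed login alert]
search = index=security sourcetype=linux_secure "failed password"
dispatch.earliest_time = -24h
dispatch.latest_time = now
cron_schedule = 0 * * * *
enableSched = 1
alert_type = number of events
alert_comparator = greater than
alert_threshold = 3
action.email = 1
action.email.to = admin@example.com
action.email.subject = More than three failed logins in the last 24 hours

The same alert can, of course, be created entirely from the Splunk Web UI under Settings; the .conf view simply makes the available options explicit.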

Explain Workflow Actions

Workflow actions is one topic that will make an appearance in almost any set of Splunk interview questions. Workflow actions are not familiar to the average Splunk user and can be explained well only by those who understand them completely, so it is important that you answer this question aptly.

You can start explaining Workflow actions by first telling why it should be used.

Once you have assigned rules, created reports and schedules then what? It is not the end of the road! You can create workflow actions which will automate certain tasks. For example:

  • You can do a double click, which will perform a drill down into a particular list containing user names and their IP addresses and you can perform further search into that list
  • You can do a double click to retrieve a user name from a report and then pass that as a parameter to the next report
  • You can use the workflow actions to retrieve some data and also send some data to other fields. A use case of that is, you can pass latitude and longitude details to google maps and then you can find where an IP address or location exists.


Explain Data Models and Pivot

Data models are used for creating a structured hierarchical model of your data. It can be used when you have a large amount of unstructured data, and when you want to make use of that information without using complex search queries.

A few use cases of Data models are:

  • Create Sales Reports: If you have a sales report, then you can easily create the total number of successful purchases, below that you can create a child object containing the list of failed purchases and other views
  • Set Access Levels: If you want a structured view of users and their various access levels, you can use a data model
  • Enable Authentication: If you want structure in the authentication, you can create a model around VPN, root access, admin access, non-root admin access, authentication on various different applications to create a structure around it in a way that normalizes the way you look at data.
    So when you look at a data model called Authentication, it will not matter to Splunk what the source is; from a user perspective it becomes extremely simple because, as and when new data sources are added or old ones are deprecated, you do not have to rewrite all your searches. That is the biggest benefit of using data models and pivots.

On the other hand with pivots, you have the flexibility to create the front views of your results and then pick and choose the most appropriate filter for a better view of results. Both these options are useful for managers from a non-technical or semi-technical background.

Explain Search Factor (SF) & Replication Factor (RF)

Questions regarding Search Factor and Replication Factor are most likely asked when you are interviewing for the role of a Splunk Architect. SF & RF are terminologies related to Clustering techniques (Search head clustering & Indexer clustering).

  • The Search Factor determines the number of searchable copies of data maintained by the indexer cluster. The default value of the search factor is 2. The Replication Factor, in the case of an indexer cluster, is the number of copies of data the cluster maintains, and in the case of a search head cluster, it is the minimum number of copies of each search artifact the cluster maintains
  • A search head cluster has only a Replication Factor, whereas an indexer cluster has both a Search Factor and a Replication Factor
  • An important point to note is that the search factor must be less than or equal to the replication factor
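For context, on an indexer cluster manager these values are set in server.conf. A minimal sketch, with illustrative values and a placeholder secret, might look like this:

[clustering]
mode = master
replication_factor = 3
search_factor = 2
pass4SymmKey = <your_cluster_secret>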

Which commands are included in ‘filtering results’ category?

There will be a great deal of events coming to Splunk in a short time. Thus, it is a little complicated task to search and filter data. But thankfully there are commands like ‘search’, ‘where’, ‘sort’ and ‘rex’ that come to the rescue. That is why, filtering commands are also among the most commonly asked Splunk interview questions.

Search: The ‘search’ command is used to retrieve events from indexes or filter the results of a previous search command in the pipeline. You can retrieve events from your indexes using keywords, quoted phrases, wildcards, and key/value expressions. The ‘search’ command is implied at the beginning of any and every search operation.

Where: The ‘where’ command uses ‘eval’ expressions to filter search results, keeping only the results for which the expression evaluates to true. While the ‘search’ command retrieves events using keywords and key/value pairs, the ‘where’ command is used to drill down further into those search results. For example, a ‘search’ can be used to find the total number of nodes that are active, but it is the ‘where’ command that will return only the active nodes which are running a particular application.

Sort: The ‘sort’ command is used to sort the results by specified fields. It can sort the results in a reverse order, ascending or descending order. Apart from that, the sort command also has the capability to limit the results while sorting. For example, you can execute commands which will return only the top 5 revenue generating products in your business.

Rex: The ‘rex’ command basically allows you to extract data or particular fields from your events. For example, if you want to identify certain fields in an email id: abc@tecklearn.com, the ‘rex’ command allows you to break down the results as abc being the user id, tecklearn.com being the domain name and tecklearn as the company name. You can use rex to breakdown, slice your events and parts of each of your event record the way you want.
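A quick, hypothetical example that chains these filtering commands together; the index, sourcetype and field names are assumptions, not values from this article:

index=web sourcetype=access_combined
| rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"
| where status>=500
| stats count BY ip_address
| sort 5 -count

The implicit ‘search’ retrieves the events, ‘rex’ extracts the client IP, ‘where’ keeps only server errors, and ‘sort 5 -count’ returns the five noisiest IP addresses.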

What is a lookup command?

Differentiate between inputlookup & outputlookup commands.

The lookup command is a topic that most interviews dive into, with questions like: Can you enrich the data? How do you enrich the raw data with an external lookup?
You may be given a use case scenario where you have a CSV file and are asked to do lookups for certain product catalogs and compare the raw data with the structured CSV or JSON data, so you should be prepared to answer such questions confidently.

Lookup commands are used when you want to receive some fields from an external file (such as CSV file or any python based script) to get some value of an event. It is used to narrow the search results as it helps to reference fields in an external CSV file that match fields in your event data.

An inputlookup basically takes an input as the name suggests. For example, it would take the product price, product name as input and then match it with an internal field like a product id or an item id. Whereas, an outputlookup is used to generate an output from an existing field list. Basically, inputlookup is used to enrich the data and outputlookup is used to build their information.
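As an illustration, a hedged example of enriching events with the lookup command; it assumes a lookup definition named product_catalog already exists (configured under Settings → Lookups), and the index and field names are placeholders:

index=sales
| lookup product_catalog product_id OUTPUT product_name product_price
| table _time product_id product_name product_price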

What is the difference between ‘eval’, ‘stats’, ‘charts’ and ‘timecharts’ command?

‘Eval’ and ‘stats’ are among the most common as well as the most important commands within the Splunk SPL language and they are used interchangeably in the same way as ‘search’ and ‘where’ commands.

  • At times ‘eval’ and ‘stats’ are used interchangeably however, there is a subtle difference between the two. While ‘stats‘ command is used for computing statistics on a set of events, ‘eval’ command allows you to create a new field altogether and then use that field in subsequent parts for searching the data.
  • Another frequently asked question is the difference between ‘stats’, ‘charts’ and ‘timecharts’ commands. The difference between them is mentioned in the table below.

Stats vs Chart vs Timechart:

  • stats returns a table of statistics and can group the results by any number of fields
  • chart returns results in a chart-ready form where you choose any single field for the x-axis (using ‘over’, optionally split with ‘by’)
  • timechart is a specialised chart that always uses _time as the x-axis, so aggregations are plotted as trends over time
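A hedged illustration of the three commands, with assumed field names:

... | stats count BY host, sourcetype

... | chart count OVER host BY sourcetype

... | timechart span=1h count BY sourcetype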

What are the different types of Data Inputs in Splunk?

This is the kind of question which only somebody who has worked as a Splunk administrator can answer. The answer to the question is below.

  • The obvious and the easiest way would be by using files and directories as input
  • Configuring Network ports to receive inputs automatically and writing scripts such that the output of these scripts is pushed into Splunk is another common way
  • But a seasoned Splunk administrator would be expected to add another option called Windows inputs. These Windows inputs are of 4 types: registry input monitor, printer monitor, network monitor and Active Directory monitor.

What are the default fields for every event in Splunk?

There are 5 default fields which are indexed along with every event in Splunk.
They are host, source, sourcetype, index and timestamp.
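These default fields can be used directly when searching; for instance, in this sketch the host, source and index values are placeholders:

index=main host=webserver01 source="/var/log/secure" sourcetype=linux_secure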

Explain file precedence in Splunk.

File precedence is an important aspect of troubleshooting in Splunk for an administrator, developer, as well as an architect. All of Splunk’s configurations are written within plain text .conf files. There can be multiple copies present for each of these files, and thus it is important to know the role these files play when a Splunk instance is running or restarted. File precedence is an important concept to understand for a number of reasons:

  • To be able to plan Splunk upgrades
  • To be able to plan app upgrades
  • To be able to provide different data inputs and
  • To distribute the configurations to your Splunk deployments.

To determine the priority among copies of a configuration file, Splunk software first determines the directory scheme. The directory schemes are either a) Global or b) App/user.

When the context is global (that is, where there’s no app/user context), directory priority descends in this order:

  1. System local directory — highest priority
  2. App local directories
  3. App default directories
  4. System default directory — lowest priority

When the context is app/user, directory priority descends from user to app to system:

  1. User directories for current user — highest priority
  2. App directories for currently running app (local, followed by default)
  3. App directories for all other apps (local, followed by default) — for exported settings only
  4. System directories (local, followed by default) — lowest priority

How can we extract fields?

You can extract fields from the event list, the fields sidebar, or the field extractor in the Settings menu via the UI.
The other way is to write your own regular expressions in the props.conf configuration file.
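As a minimal sketch of the props.conf route, a search-time field extraction can be declared like this; the sourcetype name and the regex are illustrative only:

[my_sourcetype]
EXTRACT-clientip = (?<ip_address>\d+\.\d+\.\d+\.\d+)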

What is the difference between Search time and Index time field extractions?

As the name suggests, search time field extraction refers to the fields extracted while performing searches, whereas fields extracted when the data comes to the indexer are referred to as index time field extraction. You can set up index time field extraction either at the forwarder level or at the indexer level.

Another difference is that Search time field extraction’s extracted fields are not part of the metadata, so they do not consume disk space. Whereas index time field extraction’s extracted fields are a part of metadata and hence consume disk space.

Explain how data ages in Splunk?

Data coming in to the indexer is stored in directories called buckets. A bucket moves through several stages as data ages: hot, warm, cold, frozen and thawed. Over time, buckets ‘roll’ from one stage to the next.

  • The first time when data gets indexed, it goes into a hot bucket. Hot buckets are both searchable and are actively being written to. An index can have several hot buckets open at a time
  • When certain conditions occur (for example, the hot bucket reaches a certain size or splunkd gets restarted), the hot bucket becomes a warm bucket (“rolls to warm”), and a new hot bucket is created in its place. Warm buckets are searchable, but are not actively written to. There can be many warm buckets
  • Once further conditions are met (for example, the index reaches some maximum number of warm buckets), the indexer begins to roll the warm buckets to cold based on their age. It always selects the oldest warm bucket to roll to cold. Buckets continue to roll to cold as they age in this manner
  • After a set period of time, cold buckets roll to frozen, at which point they are either archived or deleted.

The bucket aging policy, which determines when a bucket moves from one stage to the next, can be modified by editing the attributes in indexes.conf.
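For instance, here is a hedged indexes.conf sketch; the index name and values are illustrative, not recommendations, but these are the kinds of attributes that influence when buckets roll:

[web_index]
homePath   = $SPLUNK_DB/web_index/db
coldPath   = $SPLUNK_DB/web_index/colddb
thawedPath = $SPLUNK_DB/web_index/thaweddb
# maximum size of a hot bucket before it rolls to warm ("auto" lets Splunk choose)
maxDataSize = auto
# maximum number of warm buckets before the oldest rolls to cold
maxWarmDBCount = 300
# seconds after which cold buckets roll to frozen (about 180 days here)
frozenTimePeriodInSecs = 15552000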

What is summary index in Splunk?

Summary index is another important Splunk interview question from an administrative perspective. You will be asked this question to find out if you know how to store your analytical data, reports and summaries. The answer to this question is below.

The biggest advantage of having a summary index is that you can retain the analytics and reports even after your data has aged out. For example:

  • Assume that your data retention policy is only for 6 months but, your data has aged out and is older than a few months. If you still want to do your own calculation or dig out some statistical value, then during that time, summary index is useful
  • For example, you can store the summary and statistics of the percentage growth of sale that took place in each of the last 6 months and you can pull the average revenue from that. That average value is stored inside summary index.

But the limitations with summary index are:

  • You cannot do a needle in the haystack kind of a search
  • You cannot drill down and find out which products contributed to the revenue
  • You cannot find out the top product from your statistics
  • You cannot drill down and nail which was the maximum contribution to that summary.

That is the use of Summary indexing and in an interview, you are expected to answer both these aspects of benefit and limitation.
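A hedged sketch of how such a summary might be populated with a scheduled search; the index and field names are assumptions, and the target summary index must already exist. The ‘collect’ command writes the aggregated results into the summary index:

index=sales earliest=-1d@d latest=@d
| stats sum(revenue) AS daily_revenue count AS daily_orders
| collect index=sales_summary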

How to exclude some events from being indexed by Splunk?

You might not want to index all your events into a Splunk instance. In that case, how will you exclude the entry of events into Splunk?
An example of this is the debug messages in your application development cycle. You can exclude such debug messages by putting those events in the null queue. These null queues are configured in transforms.conf at the forwarder level itself.

If a candidate can answer this question, then he is most likely to get hired.

What is the use of Time Zone property in Splunk? When is it required the most?

Time zone is extremely important when you are searching for events from a security or fraud perspective. If you search your events with the wrong time zone then you will end up not being able to find that particular event altogether. Splunk picks up the default time zone from your browser settings. The browser in turn picks up the current time zone from the machine you are using. Splunk picks up that timezone when the data is input, and it is required the most when you are searching and correlating data coming from different sources. For example, you can search for events that came in at 4:00 PM IST, in your London data center or Singapore data center and so on. The timezone property is thus very important to correlate such events.
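If events arrive without an explicit timezone, it can be forced per sourcetype (or per host) in props.conf wherever parsing happens; a minimal sketch, with an illustrative sourcetype name and zone:

[my_sourcetype]
TZ = Asia/Kolkata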

What is Splunk App? What is the difference between Splunk App and Add-on?

Splunk Apps are considered to be the entire collection of reports, dashboards, alerts, field extractions and lookups.
Splunk Apps minus the visual components of a report or a dashboard are Splunk Add-ons. Lookups, field extractions, etc are examples of Splunk Add-on.

Any candidate knowing this answer will be the one questioned more about the developer aspects of Splunk.

How to assign colors in a chart based on field names in Splunk UI?

You need to assign colors to charts while creating reports and presenting results. Most of the time the colors are picked by default. But what if you want to assign your own colors? For example, if your sales numbers fall below a threshold, then you might need that chart to display the graph in red color. Then, how will you be able to change the color in a Splunk Web UI?

You will have to first edit the panels built on top of a dashboard and then modify the panel settings from the UI. You can then pick and choose the colors. You can also write commands to choose the colors from a palette by inputting hexadecimal values or by writing code. But, Splunk UI is the preferred way because you have the flexibility to assign colors easily to different values based on their types in the bar chart or line chart. You can also give different gradients and set your values into a radial gauge or water gauge.
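One hedged way to pin colors to specific field values is the charting.fieldColors option in a dashboard panel’s Simple XML; the query, series names and hex values below are placeholders:

<chart>
  <search>
    <query>index=web | timechart count BY outcome</query>
  </search>
  <option name="charting.fieldColors">{"failed": 0xFF0000, "succeeded": 0x00B000}</option>
</chart>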

What is sourcetype in Splunk?

Now this question may feature at the bottom of the list, but that doesn’t mean it is the least important among other Splunk interview questions.

Sourcetype is a default field which is used to identify the data structure of an incoming event. Sourcetype determines how Splunk Enterprise formats the data during the indexing process. Source type can be set at the forwarder level for indexer extraction to identify different data formats. Because the source type controls how Splunk software formats incoming data, it is important that you assign the correct source type to your data. It is important that even the indexed version of the data (the event data) also looks the way you want, with appropriate timestamps and event breaks. This facilitates easier searching of data later.

For example, the data may be coming in the form of a CSV, such that the first line is a header, the second line is a blank line and the actual data starts from the next line. Another example where you need to use sourcetype is if you want to break a date field into 3 different columns of a CSV, one each for day, month and year, and then index it. Your answer to this question will be a decisive factor in you getting recruited.
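To make the CSV example concrete, here is a hedged props.conf sketch for a custom CSV sourcetype; the sourcetype name, timestamp field and time format are assumptions:

[my_csv_sourcetype]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = date
TIME_FORMAT = %d/%m/%Y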

 

 

Compare Splunk with Spark.

  • Deployment area – Splunk: collecting large amounts of machine-generated data; Spark: iterative applications and in-memory processing
  • Nature of tool – Splunk: proprietary; Spark: open source
  • Working mode – Splunk: streaming mode; Spark: both streaming and batch modes

What is Splunk?

Splunk is ‘Google’ for our machine-generated data. It’s a software/engine that can be used for searching, visualizing, monitoring, reporting, etc. of our enterprise data. Splunk takes valuable machine data and turns it into powerful operational intelligence by providing real-time insights into our data through charts, alerts, reports, etc.

What are the common port numbers used by Splunk?

Below are the common port numbers used by Splunk. However, we can change them if required:

  • Splunk Web port: 8000
  • Splunk Management port: 8089
  • Splunk Network port: 514
  • Splunk Index Replication port: 8080
  • Splunk Indexing port: 9997
  • KV store: 8191

What are the components of Splunk? Explain Splunk architecture.

Below are the components of Splunk:

  • Search Head: Provides the GUI for searching
  • Indexer: Indexes the machine data
  • Forwarder: Forwards logs to the Indexer
  • Deployment Server: Manages Splunk components in a distributed environment

Which is the latest Splunk version in use?

Splunk 6.3

What is Splunk Indexer? What are the stages of Splunk Indexing?

Splunk Indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are:

  • Indexing incoming data
  • Searching the indexed data

The stages of Splunk indexing follow the Splunk data pipeline: data input, parsing, indexing, and then search.

What is a Splunk Forwarder? What are the types of Splunk Forwarders?

There are two types of Splunk Forwarders as below:

  • Universal Forwarder (UF): The Splunk agent installed on a non-Splunk system to gather data locally; it can’t parse or index data.
  • Heavyweight Forwarder (HWF): A full instance of Splunk with advanced functionalities.

It generally works as a remote collector, intermediate forwarder, and possible data filter, and since it parses data, it is not recommended for production systems.

Can you name a few most important configuration files in Splunk?

  • props.conf
  • indexes.conf
  • inputs.conf
  • transforms.conf
  • server.conf

What are the types of Splunk Licenses?

  • Enterprise license
  • Free license
  • Forwarder license
  • Beta license
  • Licenses for search heads (for distributed search)
  • Licenses for cluster members (for index replication)

What is Splunk App?

Splunk app is a container/directory of configurations, searches, dashboards, etc. in Splunk.

Where is Splunk Default Configuration stored?

$SPLUNK_HOME/etc/system/default

What are the features not available in Splunk Free?

Splunk Free does not include below features:

  • Authentication and scheduled searches/alerting
  • Distributed search
  • Forwarding in TCP/HTTP (to non-Splunk)
  • Deployment management

What happens if the License Master is unreachable?

If the license master is not available, the license slave will start a 24-hour timer, after which the search will be blocked on the license slave (though indexing continues). However, users will not be able to search for data in that slave until it can reach the license master again.

What is Summary Index in Splunk?

The summary index is the default index that Splunk Enterprise uses for summary indexing if we do not indicate another one.

If we plan to run a variety of summary index reports, we may need to create additional summary indexes.

What is Splunk DB Connect?

Splunk DB Connect is a generic SQL database plugin for Splunk that allows us to easily integrate database information with Splunk queries and reports.

Can you write down a general regular expression for extracting the IP address from logs?

There are multiple ways in which we can extract the IP address from logs. Below are a few examples:

By using a regular expression:

rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"

OR

rex field=_raw "(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})"

Explain Stats vs Transaction commands.

The transaction command is the most useful in two specific cases:

  • When the unique ID (from one or more fields) alone is not sufficient to discriminate between two transactions. This is the case when the identifier is reused, for example, web sessions identified by a cookie/client IP; here, the time span or pauses are also used to segment the data into transactions. In other cases where an identifier is reused, say in DHCP logs, a particular message identifies the beginning or end of a transaction.
  • When it is desirable to see the raw text of events combined rather than an analysis of the constituent fields of the events.

In other cases, it’s usually better to use stats.

  • As the performance of the stats command is higher, it can be used especially in a distributed search environment
  • If there is a unique ID, the stats command can be used
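A hedged pair of examples, with assumed sourcetype and field names: the first stitches web events into sessions with ‘transaction’, the second gets equivalent aggregates more cheaply with ‘stats’ when a unique session ID exists:

sourcetype=access_combined | transaction clientip maxpause=15m | stats avg(duration) AS avg_session_seconds

sourcetype=access_combined | stats count dc(uri) AS distinct_pages BY JSESSIONID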

How to troubleshoot Splunk performance issues?

The answer to this question would be very wide, but mostly an interviewer would be looking for the following keywords:

  • Check splunkd.log for errors
  • Check server performance issues, i.e., CPU, memory usage, disk I/O, etc.
  • Install the SOS (Splunk on Splunk) app and check for warnings and errors in its dashboard
  • Check the number of saved searches currently running and their consumption of system resources
  • Install and enable Firebug, a Firefox extension. Log into Splunk (using Firefox) and open Firebug’s panels. Then, switch to the ‘Net’ panel (we will have to enable it). The Net panel will show us the HTTP requests and responses, along with the time spent in each. This will give us a lot of information quickly such as which requests are hanging Splunk, which requests are blameless, etc.

What are Buckets? Explain Splunk Bucket Lifecycle.

Splunk places indexed data in directories, called ‘buckets.’ It is physically a directory containing events of a certain period.

A bucket moves through several stages as it ages. Below are the various stages it goes through:

  • Hot: A hot bucket contains newly indexed data. It is open for writing. There can be one or more hot buckets for each index.
  • Warm: A warm bucket consists of data rolled out from a hot bucket. There are many warm buckets.
  • Cold: A cold bucket has data that is rolled out from a warm bucket. There are many cold buckets.
  • Frozen: A frozen bucket is comprised of data rolled out from a cold bucket. The indexer deletes frozen data by default, but we can archive it. Archived data can later be thawed (data in a frozen bucket is not searchable).

By default, the buckets are located in:

$SPLUNK_HOME/var/lib/splunk/defaultdb/db

We should see the hot-db there, and any warm buckets we have. By default, Splunk sets the bucket size to 10 GB for 64-bit systems and 750 MB on 32-bit systems.

What is the difference between stats and eventstats commands?

  • The stats command generates summary statistics of all the existing fields in the search results and saves them as values in new fields.
  • Eventstats is similar to the stats command, except that the aggregation results are added inline to each event and only if the aggregation is pertinent to that event. The eventstats command computes requested statistics, like stats does, but aggregates them to the original raw data.
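A hedged illustration with assumed field names: the first search collapses events into one row per host, while the second keeps every event and simply annotates it with the per-host average so it can be used in later filtering:

index=web | stats avg(bytes) AS avg_bytes BY host

index=web | eventstats avg(bytes) AS avg_bytes BY host | where bytes > avg_bytes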

Who are the top direct competitors to Splunk?

Logstash, Loggly, LogLogic, Sumo Logic, etc. are some of the top direct competitors to Splunk.

What do Splunk Licenses specify?

Splunk licenses specify how much data we can index per calendar day.

How does Splunk determine 1 day, from a licensing perspective?

In terms of licensing, for Splunk, 1 day is from midnight to midnight on the clock of the license master.

How are Forwarder Licenses purchased?

They are included with Splunk. Therefore, no need to purchase separately.

What is the command for restarting Splunk web server?

We can restart Splunk web server by using the following command:

splunk start splunkweb

What is the command for restarting Splunk Daemon?

The Splunk daemon can be restarted with the below command:

splunk start splunkd

What is the command used to check the running Splunk processes on Unix/Linux?

If we want to check the running Splunk Enterprise processes on Unix/Linux, we can make use of the following command:

ps aux | grep splunk

What is the command used for enabling Splunk to boot start?

To boot start Splunk, we have to use the following command:

$SPLUNK_HOME/bin/splunk enable boot-start

How to disable Splunk boot-start?

In order to disable Splunk boot-start, we can use the following:

$SPLUNK_HOME/bin/splunk disable boot-start

What is Source Type in Splunk?

Source type is Splunk’s way of identifying data.

How to reset Splunk Admin password?

Resetting the Splunk admin password depends on the version of Splunk. If we are using Splunk 7.1 and above, then we have to follow the below steps:

  • First, we have to stop our Splunk Enterprise
  • Now, we need to find the ‘passwd’ file and rename it to ‘passwd.bk’
  • Then, we have to create a file named ‘user-seed.conf’ in the below directory:

$SPLUNK_HOME/etc/system/local/

In the file, we will have to use the following command (here, in the place of ‘NEW_PASSWORD’, we will add our own new password):

[user_info]

PASSWORD = NEW_PASSWORD

  • After that, we can just restart the Splunk Enterprise and use the new password to log in

Now, if we are using a version prior to 7.1, we will follow the below steps:

  • First, stop the Splunk Enterprise
  • Find the passwd file and rename it to ‘passwd.bk’
  • Start Splunk Enterprise and log in using the default credentials of admin/changeme
  • Here, when asked to enter a new password for our admin account, we will follow the instructions

Note: In case we have created other users earlier and know their login details, copy and paste their credentials from the passwd.bk file into the passwd file and restart Splunk.

How to disable Splunk Launch Message?

Set the value OFFENSIVE=Less in splunk-launch.conf.

How to clear Splunk Search History?

We can clear Splunk search history by deleting the following file from Splunk server:

$splunk_home/var/log/splunk/searches.log

What is Btool?/How will you troubleshoot Splunk configuration files?

Splunk Btool is a command-line tool that helps us troubleshoot configuration file issues or just see what values are being used by our Splunk Enterprise installation in the existing environment.
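For instance, a couple of typical btool invocations (paths assume a default installation):

$SPLUNK_HOME/bin/splunk btool inputs list --debug
$SPLUNK_HOME/bin/splunk btool check

The --debug flag shows which .conf file each effective setting comes from, which is how file-precedence questions are usually verified in practice; btool check validates the configuration files for syntax problems.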

What is the difference between Splunk App and Splunk Add-on?

In fact, both contain preconfigured configurations, reports, etc., but a Splunk add-on does not have a visual app, whereas a Splunk app does.

What is .conf files precedence in Splunk?

File precedence is as follows:

  1. System local directory — highest priority
  2. App local directories
  3. App default directories
  4. System default directory — lowest priority

What is Fishbucket? What is Fishbucket Index?

Fishbucket is a directory or index at the default location:

/opt/splunk/var/lib/splunk

It contains seek pointers and CRCs for the files we are indexing, so ‘splunkd’ can tell us if it has read them already. We can access it through the GUI by searching for:

index=_thefishbucket

How do I exclude some events from being indexed by Splunk?

This can be done by defining a regex to match the necessary event(s) and send everything else to NullQueue. Here is a basic example that will drop everything except events that contain the string login:
In props.conf:

[source::/var/log/foo]
# Transforms must be applied in this order
# to make sure events are dropped on the
# floor prior to making their way to the
# index processor
TRANSFORMS-set = setnull,setparsing

In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = login
DEST_KEY = queue
FORMAT = indexQueue

How can I understand when Splunk has finished indexing a log file?

We can figure this out:
By watching data from Splunk's metrics log in real time:

index="_internal" source="*metrics.log" group="per_sourcetype_thruput" series="<your_sourcetype_here>" | eval MB=kb/1024 | chart sum(MB)

By watching everything split by source type:

index="_internal" source="*metrics.log" group="per_sourcetype_thruput" | eval MB=kb/1024 | chart sum(MB) avg(eps) over series

If we are having trouble with a data input and we want a way to troubleshoot it, particularly if our whitelist/blacklist rules are not working the way we expected, we will go to the following URL:

https://yoursplunkhost:8089/services/admin/inputstatus

How to set the default search time in Splunk 6?

To do this in Splunk Enterprise 6.0, we have to use ‘ui-prefs.conf’. If we set the value in the following, all our users would see it as the default setting:

$SPLUNK_HOME/etc/system/local

For example, if our

$SPLUNK_HOME/etc/system/local/ui-prefs.conf file

includes:

[search]

dispatch.earliest_time = @d

dispatch.latest_time = now

The default time range that all users will see in the search app will be today.

The configuration file reference for ui-prefs.conf is here:

http://docs.splunk.com/Documentation/Splunk/latest/Admin/Ui-prefsconf

What is Dispatch Directory?

$SPLUNK_HOME/var/run/splunk/dispatch

contains a directory for each search that is running or has completed. For example, a directory named 1434308943.358 will contain a CSV file of its search results, a search.log with details about the search execution, and other artifacts. Using the defaults (which we can override in limits.conf), these directories will be deleted 10 minutes after the search completes, unless the user saves the search results, in which case the results will be deleted after 7 days.

What is the difference between Search Head Pooling and Search Head Clustering?

Both are features provided by Splunk for the high availability of Splunk search head in case any search head goes down. However, the search head cluster is newly introduced and search head pooling will be removed in the next upcoming versions.

The search head cluster is managed by a captain, and the captain controls its slaves. The search head cluster is more reliable and efficient than the search head pooling.

If I want to add folder access logs from a windows machine to Splunk, how do I do it?

Below are the steps to add folder access logs to Splunk:

  1. Enable Object Access Audit through group policy on the Windows machine on which the folder is located
  2. Enable auditing on a specific folder for which we want to monitor logs
  3. Install Splunk universal forwarder on the Windows machine
  4. Configure universal forwarder to send security logs to Splunk indexer

How would you handle/troubleshoot Splunk License Violation Warning?

A license violation warning means that Splunk has indexed more data than our purchased license quota. We have to identify which index/source type has received more data recently than the usual daily data volume. We can check the Splunk license master pool-wise available quota and identify the pool for which the violation has occurred. Once we know the pool for which we are receiving more data, then we have to identify the top source type for which we are receiving more data than the usual data. Once the source type is identified, then we have to find out the source machine which is sending the huge number of logs and the root cause for the same and troubleshoot it, accordingly.

What is MapReduce algorithm?

MapReduce algorithm is the secret behind Splunk’s faster data searching. It’s an algorithm typically used for batch-based large-scale parallelization. It’s inspired by functional programming’s map() and reduce() functions.

How does Splunk avoid the duplicate indexing of logs?

At the indexer, Splunk keeps track of the indexed events in a directory called fishbucket with the default location:

/opt/splunk/var/lib/splunk

It contains seek pointers and CRCs for the files we are indexing, so splunkd can tell us if it has read them already.

What is the difference between Splunk SDK and Splunk Framework?

Splunk SDKs are designed to allow us to develop applications from scratch and they do not require Splunk Web or any components from the Splunk App Framework. These are separately licensed from Splunk and do not alter the Splunk Software.

Splunk App Framework resides within the Splunk web server and permits us to customize the Splunk Web UI that comes with the product and develop Splunk apps using the Splunk web server. It is an important part of the features and functionalities of Splunk, which does not license users to modify anything in Splunk.

For what purpose inputlookup and outputlookup are used in Splunk Search?

The inputlookup command is used to search the contents of a lookup table. The lookup table can be a CSV lookup or a KV store lookup. The inputlookup command is considered to be an event-generating command. An event-generating command generates events or reports from one or more indexes without transforming them. There are many commands that come under the event-generating commands such as metadata, loadjob, inputcsv, etc. The inputlookup command is one of them.

Syntax:

inputlookup [append=<bool>] [start=<int>] [max=<int>] [<filename> | <tablename>] [WHERE <search-query>]

Now coming to the outputlookup command, it writes the search results to a static lookup table, or KV store collection, that we specify. The outputlookup command cannot be used with external lookups.

Syntax:

outputlookup [append=<bool>] [create_empty=<bool>] [max=<int>] [key_field=<field_name>] (<filename> | <tablename>)
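A hedged pair of usage examples; the lookup file names, index and field names are assumptions:

| inputlookup product_catalog.csv | search product_price > 100

index=orders | stats count AS order_count BY product_id | outputlookup top_products.csv

The first search reads the CSV lookup as events and filters it; the second aggregates indexed data and writes the result back out as a new lookup file.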

 


What is Splunk?

Splunk is ‘Google’ for your machine data. It’s a software/engine which can be used for searching, visualizing, monitoring, reporting, etc. of your enterprise data. Splunk takes valuable machine data and turns it into powerful operational intelligence by providing real-time insights into your data through charts, alerts, reports, etc.


What are the common port numbers used by Splunk?

The common port numbers are the same as those listed at the start of this article (Web 8000, management 8089, network 514, index replication 8080, indexing 9997, KV store 8191); however, you can change them if required.

What are the components of Splunk / the Splunk architecture?

Below are the components of Splunk:

  • Search Head: provides the GUI for searching
  • Indexer: indexes machine data
  • Forwarder: forwards logs to the Indexer
  • Deployment Server: manages Splunk components in a distributed environment

Which is the latest Splunk version in use?

Latest Version Release – Splunk 6.3

Compare Splunk VS Logstash Vs Sumo Logic:

sp11

What is the Splunk indexer? What are the stages of Splunk indexing?

The indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are:

  • Indexing incoming data.
  • Searching the indexed data.

What is a Splunk forwarder and what are the types of Splunk forwarders?

There are two types of Splunk forwarders, as below:

  1. Universal Forwarder (UF) – A Splunk agent installed on a non-Splunk system to gather data locally; it can’t parse or index data.
  2. Heavyweight Forwarder (HWF) – A full instance of Splunk with advanced functionality. It generally works as a remote collector, intermediate forwarder, and possible data filter; because it parses data, it is not recommended for production source systems.

What are the most important configuration files of Splunk? Or: can you name a few important configuration files in Splunk?

  • props.conf
  • indexes.conf
  • inputs.conf
  • transforms.conf
  • server.conf

What are the types of Splunk licenses?

  • Enterprise license
  • Free license
  • Forwarder license
  • Beta license
  • Licenses for search heads (for distributed search)
  • Licenses for cluster members (for index replication)

What is a Splunk app?

A Splunk app is a container/directory of configurations, searches, dashboards, etc. in Splunk.

Where is the Splunk default configuration stored?

$SPLUNK_HOME/etc/system/default

What features are not available in Splunk Free?

Splunk Free lacks these features:

  • Authentication and scheduled searches/alerting
  • Distributed search
  • Forwarding in TCP/HTTP (to non-Splunk systems)
  • Deployment management

What happens if the license master is unreachable?

The license slave will start a 24-hour timer, after which search will be blocked on the license slave (though indexing continues). Users will not be able to search data in that slave until it can reach the license master again.

What is the summary index in Splunk?

The summary index is the default index that Splunk Enterprise uses for summary indexing if you do not indicate another one. If you plan to run a variety of summary index reports, you may need to create additional summary indexes.

What is Splunk DB Connect?

Splunk DB Connect is a generic SQL database plugin for Splunk that allows you to easily integrate database information with Splunk queries and reports.

Can you write down a general regular expression for extracting an IP address from logs?

There are multiple ways we can extract an IP address from logs. Below is one example using a regular expression:

rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"

What is the difference between the stats and transaction commands?

The transaction command is most useful in two specific cases:

Unique id (from one or more fields) alone is not sufficient to discriminate between two transactions. This is the case when the identifier is reused, for example web sessions identified by cookie/client IP. In this case, time span or pauses are also used to segment the data into transactions. In other cases when an identifier is reused, say in DHCP logs, a particular message may identify the beginning or end of a transaction.

When it is desirable to see the raw text of the events combined rather than analysis on the constituent fields of the events.

In other cases, it’s usually better to use stats as the performance is higher, especially in a distributed search environment. Often there is a unique id and stats can be used.

 How to troubleshoot splunk performance issues?

The answer to this question would be very wide, but basically the interviewer would be looking for the following keywords:

  • Check splunkd.log for any errors
  • Check server performance issues, i.e., CPU/memory usage, disk I/O, etc.
  • Install the SOS (Splunk on Splunk) app and check for warnings and errors in its dashboard
  • Check the number of saved searches currently running and their system resource consumption
  • Install Firebug, which is a Firefox extension. After it is installed and enabled, log into Splunk (using Firefox), open Firebug’s panels and switch to the ‘Net’ panel (you will have to enable it). The Net panel will show you the HTTP requests and responses along with the time spent in each. This will quickly give you a lot of information about which requests are hanging Splunk for a few seconds and which are blameless.

What are buckets? Explain the Splunk bucket lifecycle.

Splunk places indexed data in directories called “buckets”. A bucket is physically a directory containing events of a certain period. A bucket moves through several stages as it ages:

  • Hot: Contains newly indexed data. Open for writing. There can be one or more hot buckets for each index.
  • Warm: Data rolled from hot. There are many warm buckets.
  • Cold: Data rolled from warm. There are many cold buckets.
  • Frozen: Data rolled from cold. The indexer deletes frozen data by default, but you can also archive it. Archived data can later be thawed (data in frozen buckets is not searchable).

By default, your buckets are located in $SPLUNK_HOME/var/lib/splunk/defaultdb/db. You should see the hot-db there, and any warm buckets you have. By default, Splunk sets the bucket size to 10 GB for 64-bit systems and 750 MB on 32-bit systems.

What is the difference between the stats and eventstats commands?

The stats command generates summary statistics of all the existing fields in your search results and saves them as values in new fields. Eventstats is similar to the stats command, except that the aggregation results are added inline to each event and only if the aggregation is pertinent to that event; eventstats computes the requested statistics like stats does, but aggregates them to the original raw data.

Who are the biggest direct competitors to Splunk?

Logstash, Loggly, LogLogic, Sumo Logic, etc.

Splunk licenses specify what?

How much data you can index per calendar day

How does splunk determine 1 day, from a licensing perspective?

Midnight to midnight on the clock of the license master

How are forwarder licenses purchased?

They are included with splunk, no need to purchase separately

What is the command for restarting just the Splunk web server?

splunk start splunkweb

What is the command for restarting just the Splunk daemon?

splunk start splunkd

What is the command to check for running Splunk processes on Unix/Linux?

ps aux | grep splunk

What is the command to enable Splunk boot-start?

$SPLUNK_HOME/bin/splunk enable boot-start

How to disable Splunk boot-start?

$SPLUNK_HOME/bin/splunk disable boot-start

What is sourcetype in Splunk?

Sourcetype is Splunk’s way of identifying data.

How to reset the Splunk admin password?

To reset your password, log in to the server on which Splunk is installed, rename the passwd file (located under $SPLUNK_HOME/etc/) and then restart Splunk. After the restart, you can log in using the default username admin and password changeme.

How to disable the Splunk launch message?

Set the value OFFENSIVE=Less in splunk-launch.conf.

How to clear Splunk search history?

Delete the following file on the Splunk server:

$SPLUNK_HOME/var/log/splunk/searches.log


What is btool, or how will you troubleshoot Splunk configuration files?

Splunk btool is a command-line tool that helps us troubleshoot configuration file issues or just see what values are being used by your Splunk Enterprise installation in the existing environment.

What is the difference between a Splunk app and a Splunk add-on?

Basically, both contain preconfigured configurations, reports, etc., but a Splunk add-on does not have a visual app, whereas a Splunk app does.

What is .conf file precedence in Splunk?

File precedence is as follows:

  • System local directory — highest priority
  • App local directories
  • App default directories
  • System default directory — lowest priority

What is fishbucket, or what is the fishbucket index?

It’s a directory or index at the default location /opt/splunk/var/lib/splunk. It contains seek pointers and CRCs for the files you are indexing, so splunkd can tell if it has read them already. We can access it through the GUI by searching for “index=_thefishbucket”.

How do I exclude some events from being indexed by Splunk?

This can be done by defining a regex to match the necessary event(s) and send everything else to nullQueue. Here is a basic example that will drop everything except events that contain the string “login”. In props.conf:

[source::/var/log/foo]
# Transforms must be applied in this order
# to make sure events are dropped on the
# floor prior to making their way to the
# index processor
TRANSFORMS-set = setnull,setparsing

In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = login
DEST_KEY = queue
FORMAT = indexQueue

How can I tell when splunk is finished indexing a log file?

By watching data from Splunk's metrics log in real time:

index="_internal" source="*metrics.log" group="per_sourcetype_thruput" series="" | eval MB=kb/1024 | chart sum(MB)

or, to watch everything happening split by sourcetype:

index="_internal" source="*metrics.log" group="per_sourcetype_thruput" | eval MB=kb/1024 | chart sum(MB) avg(eps) over series

And if you're having trouble with a data input and you want a way to troubleshoot it, particularly if your whitelist/blacklist rules aren't working the way you expect, go to this URL: https://yoursplunkhost:8089/services/admin/inputstatus

 How to set the default search time in Splunk 6?

To do this in Splunk Enterprise 6.0, use ui-prefs.conf. If you set the value in $SPLUNK_HOME/etc/system/local, all your users should see it as the default setting. For example, if your $SPLUNK_HOME/etc/system/local/ui-prefs.conf file includes:


[search]

dispatch.earliest_time = @d

dispatch.latest_time = now

The default time range that all users will see in the search app will be today.
The configuration file reference for ui-prefs.conf is here: http://docs.splunk.com/Documentation/Splunk/latest/Admin/Ui-prefsconf

What is dispatch directory?

$SPLUNK_HOME/var/run/splunk/dispatch contains a directory for each search that is running or has completed. For example, a directory named 1434308943.358 will contain a CSV file of its search results, a search.log with details about the search execution, and other artifacts. Using the defaults (which you can override in limits.conf), these directories will be deleted 10 minutes after the search completes – unless the user saves the search results, in which case the results will be deleted after 7 days.

What is difference between search head pooling and search head clustering?

Both are features provided by Splunk for high availability of the search head tier in case a search head goes down. Search head clustering was introduced more recently, and search head pooling will be removed in upcoming versions. A search head cluster is managed by a captain, and the captain controls its members. Search head clustering is more reliable and efficient than search head pooling.

If I want add/onboard folder access logs from a windows machine to splunk how can I add same?

Below are steps to add folder access logs to splunk

  1. Enable Object Access Audit through group policy on windows machine on which folder is located
  2. Enable auditing on specific folder for which you want to monitor logs
  3. Install splunk universal forwarder on windows machine
  4. Configure universal forwarder to send security logs to splunk indexer

How would you handle/troubleshoot splunk license violation warning error?

A license violation warning means Splunk has indexed more data than the purchased license quota allows. We have to identify which index or sourcetype has recently received more data than its usual daily volume. On the Splunk license master we can check the available quota per pool and identify the pool for which the violation is occurring. Once we know which pool is receiving excess data, we identify the top sourcetypes contributing to it, then find the source machines sending the unusually large number of logs, determine the root cause, and troubleshoot accordingly.

What is MapReduce algorithm?

The MapReduce algorithm is the secret behind Splunk's fast data searching speed. It is an algorithm typically used for batch-based, large-scale parallelization. It is inspired by functional programming's map() and reduce() functions.

How splunk avoids duplicate indexing of logs?

At the indexer, Splunk keeps track of indexed events in a directory called the fishbucket (default location /opt/splunk/var/lib/splunk). It contains seek pointers and CRCs for the files you are indexing, so splunkd can tell whether it has already read them.

What is difference between splunk SDK and splunk framework?

Splunk SDKs are designed to allow you to develop applications from the ground up and not require Splunk Web or any components from the Splunk App Framework. These are separately licensed to you from the Splunk Software and do not alter the Splunk Software.
Splunk App Framework resides within Splunk’s web server and permits you to customize the Splunk Web UI that comes with the product and develop Splunk apps using the Splunk web server. It is an important part of the features and functionalities of Splunk Software, which does not license users to modify anything in the Splunk Software.

What would you use to edit contents of the file in Linux? Describe some of the important commands mode in vi editor?

Linux offers various file editors, such as vi, jedit, the ex line editor, and nedit.
vi has two important modes, listed below. Press 'Esc' to switch to command mode, and press 'i' to enter insert mode.

  • Command mode
  • Insert mode

How do you log in to a remote Unix box using ssh?

ssh your_username@host_ip_address

What would you use to view contents of a large file? How to copy/remove file?  How to look for help on a Linux?

  • tail -10 file_name – shows the last 10 lines of the file
  • Copy file – cp file_name . (copies the file to the current directory)
  • Remove file/directory – rm -rf directory_name
  • Manual/help – man command_name

How will you uncompress a file? How do you install Splunk or an app using the Splunk Enterprise .tgz file?

  • tar -zxvf file_name.tar.gz
  • tar xvzf splunk_package_name.tgz -C /opt
  • default directory /opt/splunk

What does grep stand for? How do you find the difference between two configuration files?

  • grep stands for "global regular expression print" (from the ed command g/re/p).
  • egrep -w 'word1|word2' /path/to/file
  • diff -u File_name1.conf File_name2.conf

Talk about Splunk architecture and various stages

Data Input stage – data is consumed from the source and broken into 64K blocks; metadata keys such as hostname, source, source type, and _time are added.
Data Storage stage – parsing and indexing.
Data Searching stage – data analysis using the search head.
Data flow: Universal Forwarder > Heavy Forwarder (optional) > Indexers > Search Head.
Deployment Server – used to distribute configuration files/apps. License Master – used to keep track of indexing utilization.

Types Of Splunk Forwarder?

  • Universal forwarder (UF) – a lightweight Splunk instance; it cannot parse or index data
  • Heavy forwarder (HF) – a full instance of Splunk with the advanced functionality of parsing and indexing

Precedence in Splunk and discuss some of the important conf files

  • When two or more stanzas specify a behaviour that affects the same item, precedence is determined by the ASCII order of the stanza names
  • We can use the priority key to override this and assign a higher or lower priority to a stanza

Important conf files

  • props.conf
  • indexes.conf
  • inputs.conf
  • transforms.conf
  • server.conf

What is summary index in Splunk?

The summary index is the default index used to store data produced by scheduled searches run over a period of time. Summary indexing lets you efficiently report on large volumes of data.
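As a rough sketch (the index and field names here are hypothetical), a scheduled search can write its aggregated results into the summary index with the collect command, and later reports can run against that much smaller index:

index=web status=5* | stats count BY host | collect index=summary

index=summary | stats sum(count) BY host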

What are the types of field extraction? How can you mask data in either case?

  • Search time field extraction
  • Index time field extraction

 What do you mean by roles based access control?

It is crucial to grant only the appropriate roles to the appropriate teams; this prevents unauthorized access to any app or data.
Access should be granted very carefully, limiting each role's search capability to only those indexes it actually needs.

What is null queue

The null queue (nullQueue) is an approach to filter out and discard unwanted data before it is indexed.

Trouble shooting Splunk errors in splunk

  • See if the process is running – ./splunk status
  • If it is running, check the log for any recent errors – tail -20 $SPLUNK_HOME/var/log/splunk/splunkd.log
  • Splunk crashes can also happen because of low disk space – check whether there are any crash*.log files
  • Check splunkd.log, metrics.log, or the web_*.log files
  • To check any conf file related concerns, use btool – ./splunk btool props list --debug > /tmp/props.conf
  • Search for errors and warnings – index=_internal log_level=error OR log_level=warn*
  • Check the dispatch directory for recent searches – $SPLUNK_HOME/var/run/splunk/dispatch
  • Enable debug mode – Splunk software has a debug parameter (--debug) that can be used when starting Splunk
  • Check the introspection data – index=_introspection

What are the types of search modes supported in splunk?

  • Fast mode
  • Verbose mode
  • Smart mode

What is difference between source & source type

Source – identifies where the data comes from.
Sourcetype – in general, refers to the structure or format of the events.
Different sources may share the same sourcetype.

Command to restart the Splunk web server

/opt/splunk/bin/splunk start splunkweb

How to use btool for splunk conf file approach

/opt/splunk/bin/splunk cmd btool inputs list

Create a new app from a template

/opt/splunk/bin/splunk create app New_App -template sample_App

Roll back your Splunk configuration bundle to the previous version

/opt/splunk/bin/splunk rollback cluster-bundle

To specify minimum disk usage in splunk

./splunk set minfreemb 20000
./splunk restart

Command to change splunkweb port to 9000 via CLI

./splunk set web-port 9000

How to turn down a peer without affecting any other peer of cluster?

./splunk offline

How to show which deployment server the instance is configured to poll?

./splunk show deploy-poll

CLI to validate bundles

./splunk validate cluster-bundle

How to see all the license pool active in our Splunk environment?

./splunk list license

Which command is used to the “filtering results” category- explain?

search, where, sort, and rex.

What is join command and what are various flavours of join command.

  • The join command is used to combine the results of a subsearch with the results of the main search; one or more fields must be common to both result sets
  • Inner join – the results of an inner join do not include events with no match
  • Left/outer join – includes all events from the main search plus the matching events with the correct field values

|join type=inner P_id [search source=table2]

Tell me the syntax of Case command

It's a comparison and conditional function:
case(X, "Y", ...)
X – Boolean expressions that are evaluated from first to last; the function returns the corresponding Y for the first expression that is true, and defaults to NULL if none is true.
| eval description=case(status==200, "OK", status==404, "NOT FOUND")

Which roles can create a data model?

Admin & power user

What is the latest Splunk version?

Refer to the "Welcome to Splunk Enterprise" page in the Splunk documentation for the current release.

Which app ships with splunk enterprise

  • Search & reporting
  • Home App

 

How do we convert unix time into string and string back to unix time format

strftime(X,Y) : converts UNIX (epoch) time X to a string, using format Y
strptime(X,Y) : parses string X using format Y back to UNIX time
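For example (the order_date field is illustrative):

... | eval readable_time = strftime(_time, "%Y-%m-%d %H:%M:%S")

... | eval epoch_time = strptime(order_date, "%Y-%m-%d %H:%M:%S")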

How do we find the total number of hosts or sourcetypes reporting to the Splunk instance? The report should consider hosts across the cluster.

|metadata type=hosts index=*  | convert ctime(firstTime) | convert ctime(lastTime) |convert ctime(recentTime)

 What is Splunk? Why Splunk is used for analysing machine data?

Splunk is a platform for analysing machine data generated from various data sources such as networks, servers, IoT devices, and so on. Splunk is used for analysing machine data for the following reasons:

  • Business Intelligence
  • Operational visibility
  • Proactive monitoring
  • Search and Investigation

Who are the competitors of Splunk in the market? Why is Splunk efficient?

Biggest competitors of Splunk are as follows

  • Sumo logic
  • ELK
  • Loglogic

Splunk is efficient because it comes with many built-in features like visualization, analysis, and apps, and it can also be deployed in the cloud through Splunk Cloud. Other platforms require plug-ins to get additional features.

 What are the benefits of getting data using forwarders?

  • Data is load balanced by default
  • Bandwidth throttling
  • Encrypted SSL connection
  • TCP connection

 What happens if License master is unreachable?

The license slave starts a 24-hour timer and keeps trying to reach the license master; after that, search is blocked on that license slave (while indexing continues) until the master is reachable again.

What is the command to get list of configuration files in Splunk?

./splunk cmd btool inputs list --debug

What is the command to stop and start Splunk service?

  • ./splunk stop
  • ./splunk start

What is index bucket? What are all stages of buckets?

Indexed data in Splunk is stored in directory called bucket. Each bucket has certain retention period after which data is rolled to next bucket. Various stages of buckets are

  • Hot
  • Warm
  • Cold
  • Frozen
  • Thawed

 What are important configuration files in Splunk?

  • Props.conf
  • inputs.conf
  • outputs.conf
  • transforms.conf
  • indexes.conf
  • deploymentclient.conf
  • serverclass.conf

What is global file precedence in Splunk?

  • System local directory – highest priority
  • App local directory
  • App default directory
  • System default directory – lowest priority

What is difference between stats and timechart command?

stats produces a table of aggregate statistics over the entire result set, whereas timechart aggregates the statistics over time, using _time as the x-axis, and produces a time-series chart with a corresponding table of statistics.

What is lookup command?

Lookup command is used to reference fields from an external csv file that matches fields in your event data.

What is the role of Deployment server?

Deployment server is a Splunk instance to deploy the configuration to other Splunk instances from a centralized location.

What are the default fields in Splunk?

  • Host
  • Source
  • Source type
  • _time
  • _raw

What is Search Factor (SF) and Replication Factor (RF) in Splunk?

The search factor determines the number of searchable copies of data maintained by an indexer cluster; the default search factor is 2. The replication factor is the number of copies of the data the cluster maintains; the default is 3. The search factor must always be less than or equal to the replication factor.
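Both values are set on the cluster master in server.conf; a minimal sketch, assuming the default values, might look like:

# server.conf on the cluster master
[clustering]
mode = master
replication_factor = 3
search_factor = 2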

What is the difference between Splunk apps and add-ons?

Splunk apps contain built-in configurations, reports, and dashboards, Splunk add-ons contain only built-in configurations and not visualization (reports or dashboards)

How can you exclude some events from being indexed in Splunk?

This can be done by using nullQueue in the transforms.conf file.
For example, in transforms.conf:

[setnull]
REGEX = <regular expression>
DEST_KEY = queue
FORMAT = nullQueue

Where does Splunk default configuration file located?

It is located under $SPLUNK_HOME/etc/system/default

Discuss about the sequence in which splunk upgrade can be done in a clustered environment?

  • Upgrade Cluster Master
  • Upgrade Search Head Cluster
  • Upgrade Indexer Cluster
  • Upgrade Standalone Indexers
  • Upgrade Deployment  Server

How do we sync and deploy configuration files and updates across multiple deployment servers in a large, multi-layered clustered environment?

On one of the deployment server, use below commands-

  • $cd ~
  • $./DS_sync.sh
  • $/opt/splunk/bin/splunk reload deploy-server -class ServerClassName

Who is responsible for indexing the right quantity of data?

The license master in Splunk is responsible for ensuring the right amount of data is indexed. Splunk licenses are based on the volume of data that enters the platform within a 24-hour window, so it is necessary to keep the environment within the limits of the purchased volume.

What is restricted when the license master cannot be reached?

When the license master is unavailable, the data going into the indexers is not affected; it continues to flow into the Splunk deployment and the indexers keep indexing it. However, a warning message is shown at the top of the search page / web UI asking you to either reduce the incoming data volume or purchase a higher-capacity license, and eventually searching can be blocked.

What happens when the data limit is exceeded?

A license violation warning message is shown on the screen. With a commercial (Enterprise) license you can receive 5 warnings in a rolling 30-day window; after that, the indexers keep indexing, but search results and reports stop working until the violation is resolved.

What are data models used for?

Data models are used to build a structured, hierarchical view of your data. They are useful when you have a large amount of unstructured data and you want to make use of that information without writing complex search queries, for example:

  • for generating sales reports
  • for controlling access levels
  • for certification and compliance reporting

How do you map keys and values?

With the help of the lookup commands. Lookups enrich event data by adding field-value combinations from lookup tables: Splunk software matches field-value pairs in the event data against field-value pairs in the external lookup table, and then appends the matching fields from the table to the events.

What is used to process huge data sets?

The MapReduce algorithm, a programming model for large-scale parallel processing. The map function collects and transforms the data in parallel for your search, and the reduce function then takes the results from the map step and aggregates them into the final result.

Define the Splunk Success Framework?

The Splunk Success Framework is a flexible collection of best practices and resources for getting value from your data faster with Splunk software. It contains everything an organization needs to plan, implement, and support a Splunk environment around its data.

Name the items that are migrated (for example, when moving to a search head cluster)?

Custom application configurations – the installer keeps the apps' default configurations, and runtime changes are applied when the apps run on the search head pool or cluster.
Individual user configurations – the installer exports the user configurations from the search head, and they are copied to each cluster member through configuration replication.

What is used to track monitored files internally?

A sub-directory in Splunk called the fishbucket. It keeps track of how far into each file Splunk has read, using seek pointers and CRCs: the CRC identifies a specific file, and the seek pointer records the position in that file from which to continue reading.

How does Splunk manage the data?

Splunk software ingests data and transforms it into searchable events. The main processing is done in the data pipeline, which acts on the data as it is indexed; this processing is called event processing. Once the data has been processed into events, you can associate the events with knowledge objects to enhance their value.

Explain event processing?

Parsing and indexing are the two stages of event processing. During parsing, Splunk software breaks the incoming blocks of data into individual events; the events are then passed on to the indexing pipeline, where Splunk software transforms them and writes them to disk.

What is the use of DB Connect?

Splunk DB Connect maps the rows, columns, and tables of a relational database directly into Splunk so the data can be indexed and searched. It connects relational database data to Splunk, makes that data usable by Splunk, and can also write data back to the relational database, so you can browse and query relational data from within Splunk.

Why is DB Connect important?

  • To import database content directly into Splunk
  • To perform on-the-fly lookups against data-warehouse details
  • To index structured database data in bulk
  • To write Splunk data back into databases in bulk
  • To validate data against the database of record
  • To scale, allocate, and monitor database jobs to limit excess load

How to ignore the incoming data?

There are two steps to ignoring (filtering out) incoming data.
First step

  • First, tell Splunk which data you want to act on.
  • Update props.conf and add a stanza (rule) that identifies the source of the data.
  • In the same stanza, use the TRANSFORMS keyword to point to a transform that decides what to do with that data.

Second step

  • In transforms.conf, define the transform that tells Splunk how you want to route the data.
  • By default the data is indexed; to ignore it, route it to the nullQueue.

Explain Transaction

The transaction command groups events that share common constraints into transactions. A transaction is built from the raw text of each member event, the time of the earliest member, and the union of all fields of every member. It adds two fields: eventcount and duration.
eventcount shows the number of events in the transaction.
duration is the difference between the timestamps of the first and last events in the transaction.

How do you remove duplicate events?

With the help of the dedup command. You specify the field (or combination of fields) whose values should be unique, and dedup removes events with duplicate combinations from the results, returning only the first event found for each unique combination of field values.
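A small illustration, assuming a hypothetical web index with clientip and useragent fields:

index=web | dedup clientip | table clientip, useragent

Only the first event found for each clientip (typically the most recent one in search order) is kept.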

Mention the use of the Dedup command of Splunk?

Avoid using dedup when searching over a very large volume of data, because it has to keep track of every unique combination of field values it has seen, which consumes memory. Use it when you need a single record (for example, the most recent one) per unique ID, as dedup returns only one event per unique value.

What is single-instance storage?

Data deduplication (single-instance storage) is a technique for eliminating redundant copies of data and reducing storage overhead. It ensures that only one unique instance of the data is retained on storage media such as disk, flash, or tape; redundant blocks are replaced with pointers to the unique copy.

What analyzes data inline in a backup system?

Inline (online) deduplication. Redundant copies are removed as the data is written to the backup store. Inline deduplication requires less backup storage, but the extra processing can create bottlenecks.

Describe pivot?

Pivot is used for dragging and dropping attributes of predefined data models and datasets to build reports without writing searches. The Pivot tool uses data models and their objects to define, subdivide, and set attributes for the event data you are interested in.

What is used for collecting the logs?

The Splunk forwarder. You install forwarders on the machines that generate the data and schedule them to collect and send it; forwarders are lightweight instances installed separately from the main Splunk instance.

Where is indexed data kept on disk?

In Splunk buckets. A bucket is a directory that contains events.

  • Hot – contains freshly indexed data and is open for writing.
  • Warm – contains data rolled out from hot buckets.
  • Cold – contains data rolled out from warm buckets.
  • Frozen – contains data rolled out from cold buckets; it is deleted by default, but can be archived.

Name the disadvantages of Splunk?

  • It can become costly for large data volumes.
  • The dashboards are practical but not as visually impressive as some alternatives.
  • You usually need Splunk training, because it is a multi-tier architecture with a learning curve.
  • The search syntax can be complicated to learn, especially regular expressions and search patterns.

Why use only Splunk? Why can’t I go for something that is open source?

This kind of question is asked to understand the scope of your knowledge. You can answer that question by saying that Splunk has a lot of competition in the market for analyzing machine logs, doing business intelligence, performing IT operations and providing security. But there is no single tool other than Splunk that can do all of these operations, and that is where Splunk stands out and makes a difference. With Splunk you can easily scale up your infrastructure and get professional support from a company backing the platform. Some of its competitors are Sumo Logic in the cloud space of log management and ELK in the open-source category. You can refer to the below table to understand how Splunk fares against other popular tools feature-wise.

Which Splunk Roles can share the same machine?

This is another frequently asked Splunk interview question which will test the candidate's hands-on knowledge. In case of small deployments, most of the roles can be shared on the same machine, including the Indexer, Search Head and License Master. However, in case of larger deployments the preferred practice is to host each role on stand-alone hosts. Details about roles that can be shared even in case of larger deployments are mentioned below:

Strategically, Indexers and Search Heads should have physically dedicated machines. Using Virtual Machines for running the instances separately is not the solution because there are certain guidelines that need to be followed for using computer resources and spinning multiple virtual machines on the same physical hardware can cause performance degradation.

  • However, a License master and Deployment server can be implemented on the same virtual box, in the same instance by spinning different Virtual machines.
  • You can spin another virtual machine on the same instance for hosting the Cluster master as long as the Deployment master is not hosted on a parallel virtual machine on that same instance because the number of connections coming to the Deployment server will be very high.
  • This is because the Deployment server not only caters to the requests coming from the Deployment master, but also to the requests coming from the Forwarders.


What is known as a central resource for searching?

A search head cluster. Its members are interchangeable: the same searches can be run, the same dashboards viewed, and the same results accessed from any member of the cluster. To achieve this interchangeability, the search heads in the cluster share configurations, apps, and job scheduling.

How do you recover a non-functioning search head cluster?

If the cluster loses the majority of its members and a static captain has not been deployed, the cluster cannot elect a captain and stops functioning as a cluster. Once a majority of members is reachable again, they elect a captain and the cluster resumes normal operation, including:

  • Runtime configuration replication
  • Scheduled reports

Effect of a non-functioning cluster?

If you lose the majority, you cannot elect a captain, and the members carry on functioning as independent search heads. They can serve only ad-hoc searches; scheduled searches and alerts may not run, because the scheduling function is degraded.

Name the features of a knowledge object?

  • They can be shared with and used by the right groups of people in the company.
  • They normalize event data when you apply a knowledge object naming convention and retire duplicate or poorly named objects.
  • They can be configured to accelerate searches and pivots.
  • They can be built into data models for Pivot users.

Name the uses of Knowledge object?

Knowledge objects are created and stored using Splunk software. They may contain details that are relevant only to some users, so they need to be managed. Examples include:

  • Fields and field extractions – the first layer; fields are extracted automatically from the IT data.
  • Event types and transactions – used to group together interesting sets of similar events.
  • Lookups and workflow actions – categories of knowledge object that enrich the data in different ways.

What is used to manage groups of field information?

Tags and aliases are used to manage and normalize sets of field information. Use tags to group field values together and to reference different aspects of an identity; when various sources use different field names for the same data, normalize the data by using field aliases.

What is Splunk Administration?

Splunk is mainly used to make machine data accessible, usable, and valuable to everyone. It also helps to examine the massive volumes of machine data produced by technology infrastructure and IT systems, whether physical, virtual, or in the cloud.

How does Splunk help in the Organization?

Most of the corporations are investing in this technology as it helps to examine their end-to-end infrastructures, shun service outages & gain real-time critical insights into client experience, key business metrics & transactions.

 What are the pre-requisites required to take Splunk Administration training?

To take Splunk Administration training, there are no particular pre-requisites, but aspirants who already have subject knowledge and skills in system administration, Linux, Windows, etc. will have an added advantage.

Who can take Splunk Administration training in the reputed institutions?

 Desired aspirants who are aspiring to become Splunk Administration expertise in the current IT world can take up this course. IT Employee, Information Security Professionals, Business Analytics professionals, Splunk beginners, Big Data technology expertise can take the course to have a bright career future.

 What is Splunk free?

 Splunk Free is completely a free version of Splunk. It is a free license that will never expire & will allow you to index with 500 MB per day. If the users required more amount of data, then one can purchase an Enterprise license.

 How to configure Splunk?

Behind the working of Splunk, the configuration files are the main brains that control its entire behaviour. All of these files are saved with a .conf extension and, with the appropriate access, one can easily read or edit them.

What are the components of a Splunk enterprise deployment?

The components of a Splunk Enterprise deployment are the indexer, search head, forwarder, and deployment server, together with indexer clustering and index replication.

 How is a career path in Splunk Administration?

Splunk Administration career is extremely lucrative where the experts are getting the highest paid salary range when compared to other technologies. The various job roles in the Splunk careers are high such as system engineers, Software engineers, programming analysts, security engineers, solutions architects & technical services manager.

Why choose Splunk when compared to other open-source option?

Splunk administration faces tough competition from open-source options, but it stands out in terms of data analysis, enhancing business intelligence, and also providing security and managing IT operations without additional plug-ins.

What do you mean by license violation in Splunk?

A license violation is simply what occurs when the indexed data volume exceeds the purchased limit. There will be at least 5 warnings with a commercial license, while the free version allows 3 warnings.

(OR)

This situation takes place when the data limit within the platform is exceeded. When you are using commercially licensed software, it generates 5 alerts in the platform; in case you are using a free version of the software, it generates only three alerts.

What will you do in case License Master is unreachable?

In case the license master is unreachable, it is simply not possible to search the data. The data coming into the indexer is not affected, though, and it continues to move into the Splunk deployment; the indexer continues to index the data. The only change you will notice is an alert message on the search head warning that the indexed volume has been exceeded. To tackle the situation, you have to either reduce the volume of data flowing in or get a higher-capacity license. Indexing never stops; only the search function is blocked.

(OR)

If the license master is unreachable, then there is no possibility to search the data. However, the data flow itself is not affected: it continues to flow into the Splunk deployment and the indexers continue to index it. Moreover, there will be a warning message when the indexing volume is exceeded.

Why is Splunk administration used for the analysis of machine data?

Splunk administration is considered a great tool because it gives visibility into data generated by machines such as hardware devices, IoT devices, servers, and other sources. As it provides crucial insights into IT operations, it is used for analyzing machine data with ease.

How do you explain the working of Splunk?

The working of Splunk administration is based mainly on three components: the forwarder, the indexer, and the search head.

 What is the main function of the 3 components?

The main function of the forwarder is to collect the data from various sources and send it to the indexers. The indexer then stores the received data on the host machine or in the cloud for later use. The search head performs functions such as searching, analyzing, and visualizing the data.

 What is the use of deployment server in Splunk administration?

The deployment server efficiently controls host-independent configurations, path naming conventions, and machine naming conventions from a central location; a minimal configuration sketch follows.
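As a rough sketch (the hostnames, server class, and app names below are hypothetical), a forwarder points at the deployment server in deploymentclient.conf, and the deployment server maps clients to apps in serverclass.conf:

# deploymentclient.conf on the forwarder
[target-broker:deploymentServer]
targetUri = deploy.example.com:8089

# serverclass.conf on the deployment server
[serverClass:linux_forwarders]
whitelist.0 = linuxfwd*.example.com

[serverClass:linux_forwarders:app:my_inputs_app]
restartSplunkd = true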

Does Splunk administration support user authentication systems?

 The Splunk administration will support the various authentication systems such as Splunk internal authentication with role-based user access, LDAP, A scripted authentication API for use with an external authentication system like PAM or RADIUS, Multifactor authentication & Single Sign-on.

How to discover or modify the current LDAP configurations?

Follow these steps to discover or modify the current LDAP configuration: click the Access controls option under Users and authentication, then click LDAP. From that page you can manage the specific LDAP strategies, view their information, and track the mappings of LDAP groups to Splunk roles.

 What is Splunk cloud administration?

 Here, mostly all the tasks will be handled by the Splunk cloud administrator to use the data in an efficient manner.

(OR)

In order to use all data effectively, all necessary tasks are supposed to be handled by this cloud administration.

How to install & upgrade Splunk enterprise?

First, the planning of the installation process should be efficient and confidential. Then, estimate your hardware requirements. The third step is to install Splunk Enterprise on Windows, Unix, Linux, or macOS, etc.; an earlier version can also be upgraded if required.

(OR)

The installation of Splunk Enterprise should be a confidential one, and for the same you need to check the hardware requirements so that the platform can be implemented easily. After checking, you can install the enterprise on operating systems such as Windows, Linux, macOS, and others. In addition, you can also upgrade the enterprise when needed from time to time.

What do you understand by Splunk Administration? What is the latest version of the tool Splunk?

 Splunk can be regarded as a platform that makes data accessible to users. You can have easy visibility of data generated from hardware devices, networks, servers and other sources. The Splunk administration helps to analyze plenty of data that is used in various plenty of IT operations, security, threat and for detecting any fraud cases. Splunk is a vital tool that is used in businesses for data analytics.

The latest version of the tool is Splunk 6.3.

Explain components of Splunk architecture?

 There are four components of this architecture namely –

1. Indexer – It helps to index machine data
2. Forwarder – It helps forward logs to an indexer
3. Search Head – It provides a GUI (graphical user interface) for searching while using this tool
4. Deployment server – It helps to manage the tool components in a distributed environment

How does this platform work?

 This platform works with three components as mentioned above where forwarder assembles data from various sources and forwards it to the indexer. Thereafter, the indexer holds the data for some time in its host storage or machine that acts like cloud storage.

Then, the search head can be used for various purposes such as analyzing, visualizing and searching the data that is stored in the indexer.

In the case of a larger platform, a fourth component, the Deployment Server, is also involved. It acts like a central policy server (similar to an antivirus policy server), distributing configurations to the other components. Together these components help convert the collected data into results that answer the posted query, and the final information is displayed via a chart or report that is understandable to a wide audience.

How many types of Splunk forwarder are there?

 The Splunk forwarder is of two types, and they are namely –

1. Heavy Forwarders – It works as an intermediate forwarder that parses and analyzes the data before it is sent to the indexer.
2. Universal Forwarders – It is a lightweight agent that forwards raw data to the indexer with minimal processing; it cannot parse or index the data.

Do you have any idea about how many types of Splunk licenses are there?

 There are mainly six types of licenses pertaining to this platform. They are as follows –

1. Free license
2. Beta license
3. Enterprise license
4. Forwarder license
5. Licenses for cluster members
6. Licenses for search heads

How this platform helps in business organizations? Is it possible to use anything open source?

 In these days of advanced technology, most of the organizations are investing in using this platform that aids to analyze end-to-end infrastructure and data. In addition, it also helps in various transactions in the business organization. The platform is used for various IT operation and providing adequate security. In this regard, this Splunk is one of the best software that helps in carrying out all operations.

With the help of this platform, you are able to enhance the infrastructure and provide a good backup for the organization. This platform stands out as the best one when compared to any other open-source software that is able to manage all functions efficiently. The open-source services require plugins for supporting customer support and also to carry out any data input function.

To consider some of its competitors, Sumo Logic is another platform and ELK is another one that can be considered as open-source software.

Can you explain what alerts are available in Splunk?

 Alerts are action in this platform that is generated by any saved search results shown from time to time. As soon as the alerts are shown, other subsequent actions start to occur. For example, it can send an email when an alert is triggered suddenly. The alerts are mainly of three types and they are as follows –

  1. Pre-result alerts –It is a common type of alert that runs for most of the time. The alert is set in such a way that whenever a result comes out for any search, the alerts are triggered.
    2. Rolling-window alerts –These types are hybrid alerts that are shown on real-time search and do not come up with every search result that the platform shows. All the events are well examined within the rolling window and gives out a specific time in which the required event is met by the window.
    3. Scheduled alerts – This is the third category of alerts that mainly functions to assess the history of search results over a given span of time. In this case, you can set time span, schedule and trigger the condition as an alert.

What are the types of common port numbers that are allotted to this platform?

The common port numbers depend on the services on which the platform functions. These are listed below (a minimal configuration sketch follows the list):

  • Splunk Management Port – 8089
  • Splunk Web Port – 8000
  • Splunk Indexing Port – 9997
  • Splunk Network Port – 514
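The sketch below shows where these ports typically appear in the configuration; the values shown are just the defaults used as examples:

# web.conf – Splunk Web and management ports
[settings]
httpport = 8000
mgmtHostPort = 127.0.0.1:8089

# inputs.conf on the indexer – listen for forwarders on the indexing port
[splunktcp://9997]
disabled = 0

# inputs.conf – receive syslog on the network port
[udp://514]
sourcetype = syslog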

Draw a comparison between Splunk and spark?

Considering the deployment area, Splunk helps to collect data generated by machines and makes it accessible to a larger audience. It is a proprietary tool that works in streaming mode.

Spark, on the contrary, helps with in-memory data processing. It is basically open-source software that works in both streaming and batch modes.

State some advantages of analyzing data into Splunk via forwarders.

Some of the benefits of getting data in via forwarders are the TCP connection, bandwidth throttling, and the encrypted SSL connection used when transferring data from a forwarder to an indexer. In addition, the data is load balanced by default, so if one indexer is slow or down due to network issues, the data can be forwarded to another indexer within a short time; the forwarder also caches the events locally before forwarding, providing a temporary backup of the data.

How does License Master help in Splunk?

The license master in the platform is responsible for ensuring that the right amount of data is indexed. The license is based on the volume of data that flows into the platform within a 24-hour window, and the license master helps ensure that the environment stays within the limits of its licensed volume.

If you wish to use a free version of Splunk, which features are lacking?

 The free version of the platforms lacks certain features such as

  1. Authentication and scheduled searches
    2. Distributed search
    3. Deployment management of the platform
    4. Forwarding of data into TCP or HTTP

How can you get an IP address from the logs while using this platform?

The regular expression can be used for extracting the same, as shown below:

rex field=_raw "(?<ip>\d+\.\d+\.\d+\.\d+)"

OR

rex field=_raw "(?<ip>([0-9]{1,3}[.]){3}[0-9]{1,3})"

How can you solve any issues in this platform pertaining to its performance? 

Some of the probable steps that help figure out performance issues are mentioned below:

  • Look at splunkd.log for the presence of any errors. In addition, check the performance of the server itself, such as memory usage and disk input/output.
  • Install the Splunk on Splunk (SOS) application and look for any errors or warnings shown on its dashboards.
  • Check the number of saved searches currently running and make sure it does not exceed the system limits; if it does, the platform fails to provide the required alerts.
  • If using the Firefox browser, install the Firebug extension, enable it, log onto the above-mentioned platform, open Firebug's panels, and go to the Net panel, which shows the details of the HTTP requests and the responses made.

Name some configuration file relating to the platform.

 Some of the important files are as follows:

  1. props.conf
    2. indexes.conf
    3. inputs.conf
    4. transforms.conf
    5. server.conf

What do you understand by Splunk app?

 It is an application that has lists of configurations, search results, dashboards, etc. that works within the above-mentioned platform.

What steps to follow in case you forget the password of the platform?

To reset the password of the platform, you have to log in to the server on which the platform has been installed and rename the password file located at $SPLUNK_HOME/etc/passwd. After this, you need to restart the platform and then log in using the default username admin and password changeme.

Can you delete the search history of the platform?

Yes, the search history can be cleared; to do so, you will have to delete '$splunk_home/var/log/splunk/searches.log' found on the server of Splunk.

How can you disable launch message that pops up on the platform?

You need to set the value OFFENSIVE=Less in splunk-launch.conf on the platform to disable the launch message.

Draw a difference between Splunk app and Splunk add-on?

A common factor between the app and the add-on is that both have preconfigured configurations. The point of difference is that the latter lacks the visual components (dashboards and views) that come preconfigured in Splunk apps.

What is the use of Splunk alert and what are the various options available while setting up the alert on the platform?

These alerts created within the platform help to flag any erroneous condition that might arise within the system, for example sending a notification email to the admin when there are repeated failed login attempts within a twenty-four-hour span.

What is Splunk tool?

Splunk is a powerful platform for searching, analyzing, monitoring, visualizing and reporting of your enterprise data. It acquires important machine data and then converts it into powerful operational intelligence by giving real-time insight to your data using alerts, dashboards, and charts, etc.

Explain the working of Splunk?

Splunk works into three phases –

  • The first phase – it gathers data to solve your query from many sources as required.
  • The second phase – it converts that data into results that can solve your query.
  • The third phase – it displays the information/answers via a chart, report or graph, which is understood by large audiences.

What are the components of Splunk?

Splunk has four important components :

  • Indexer – It indexes the machine data
  • Forwarder – Refers to Splunk instances that forward data to the remote indexers
  • Search Head – Provides GUI for searching
  • Deployment Server –Manages the Splunk components like indexer, forwarder, and search head in a computing environment.

What are the types of Splunk forwarder?

Splunk has two types of Splunk forwarder which are as follows:

  1. Universal Forwarders – It forwards the incoming data to the indexer with only minimal processing; it cannot parse or index the data.
  2. Heavy Forwarders – It parses the data before forwarding it to the indexer and can work as an intermediate forwarder or remote collector.

What are alerts in Splunk?

An alert is an action that a saved search triggers at regular intervals, set over a time range, based on the results of the search. When an alert is triggered, various actions occur as a consequence, for instance sending an email to a predefined list of people. A configuration sketch follows the list of alert types below.

Three types of alerts:

  1. Pre-result alerts: Most commonly used alert type and runs in real-time for an all-time span. These alerts are designed such that whenever a search returns a result, they are triggered.
  2. Scheduled alerts: The second most common- scheduled results are set up to evaluate the results of a historical search result running over a set time range on a regular schedule. You can define a time range, schedule and the trigger condition to an alert.
  3. Rolling-window alerts: These are a hybrid of pre-result and scheduled alerts. Like the former, they are based on a real-time search, but they do not trigger each time the search returns a matching result; instead, they examine all events within a rolling time window and trigger when the specified condition is met by the events in that window, much as a scheduled alert is triggered by a scheduled search.
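A rough sketch of what a scheduled alert can look like in savedsearches.conf; the search string, schedule, and e-mail address here are hypothetical:

[Failed logins alert]
# run the (hypothetical) search every 15 minutes over the last 15 minutes
search = index=_audit action=login attempt info=failed
enableSched = 1
cron_schedule = */15 * * * *
dispatch.earliest_time = -15m
dispatch.latest_time = now
# trigger when the search returns any events and send an email
alert_type = number of events
alert_comparator = greater than
alert_threshold = 0
action.email = 1
action.email.to = admin@example.com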

What are Splunk buckets? Explain the bucket lifecycle?

A directory that contains indexed data is known as a Splunk bucket. Each bucket contains events from a certain period. The bucket lifecycle includes the following stages (a sample indexes.conf sketch follows the list):

  • Hot – It contains newly indexed data and is open for writing. For each index, there are one or more hot buckets available
  • Warm – Data rolled from hot
  • Cold – Data rolled from warm
  • Frozen – Data rolled from cold. The indexer deletes frozen data by default but users can also archive it.
  • Thawed – Data restored from an archive. If you archive frozen data, you can later return it to the index by thawing (defrosting) it.
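A minimal indexes.conf sketch showing where these stages live on disk and when data rolls to frozen; the index name and retention value are hypothetical:

[my_index]
# homePath holds hot and warm buckets, coldPath holds cold buckets,
# thawedPath is where thawed (restored) data goes
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# roll buckets to frozen (delete or archive) after roughly 180 days
frozenTimePeriodInSecs = 15552000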

What command is used to enable and disable Splunk to boot start?

To enable Splunk to start at boot time, use the following command:

$SPLUNK_HOME/bin/splunk enable boot-start

To disable boot-start, use the following command:

$SPLUNK_HOME/bin/splunk disable boot-start

What is the eval command?

It evaluates an expression and assigns the resulting value to a destination field. If the destination field matches an already existing field name, the existing field is overwritten with the result of the eval expression. This command can evaluate Boolean, mathematical, and string expressions. A short example follows the list below.

Using eval command:

  • Convert Values
  • Round Values
  • Perform Calculations
  • User conditional statements
  • Format Values
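For example (the field names here are illustrative):

... | eval status_label = if(status == 200, "OK", "Error")
... | eval mb = round(bytes / 1024 / 1024, 2)
... | eval full_name = first_name . " " . last_name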

What are the lookup command and its use case?

The lookup command enriches events by looking at the value of a field in an event, referencing a lookup table, and adding the fields from the matching rows in the lookup table to your event.
Example (where usertogroup is the name of the lookup table):
… | lookup usertogroup user as local_user OUTPUT group as user_group

What is input lookup command?

The inputlookup command returns the whole lookup table as search results.
For example:
…| inputlookup intellipaatlookup returns a search result for every row in the table intellipaatlookup, which has two field values: host and machine_type.

Explain the output lookup command?

This command writes the current search results to a lookup table on disk.
For example:
…| outputlookup intellipaattable.csv saves all the results into intellipaattable.csv.

What commands are included in the filtering results category?

  • where – Evaluates an expression for filtering results. If the evaluation is successful and the result is TRUE, the result is retained; otherwise, the result is discarded.
  • dedup – Removes subsequent results that match specified criteria.
  • head – Returns the first count results. Using head permits a search to stop retrieving events from disk when it finds the desired number of results.
  • tail – Unlike head command, this returns the last result

What commands are included in reporting results category?

  • top – Finds the most frequent tuple of values of all fields in the field list along with the count and percentage.
  • rare – Finds least frequent tuple of values of all fields in the field list.
  • stats – Calculates aggregate statistics over a dataset
  • chart – Creates tabular data output suitable for charting
  • timechart – Creates a time-series chart with a corresponding table of statistics.
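Two quick examples of these reporting commands, assuming a hypothetical web access index:

index=web sourcetype=access_combined | top limit=5 uri_path

index=web sourcetype=access_combined | timechart span=1h count BY status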

What commands are included in the grouping results category?

transaction – Groups events that meet different constraints into transactions, where transactions are the collections of events possibly from multiple sources.
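For example, assuming web events that share a clientip field, the following groups them into sessions and reports the duration and eventcount fields that transaction adds:

index=web | transaction clientip maxspan=30m maxpause=5m | table clientip, duration, eventcount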

What is the use of sort command?


It sorts the search results by the specified fields.
Syntax:
sort [<count>] <sort-by-clause>... [desc]
Example:
… | sort num(ip), -str(URL)
This sorts results by ip value in ascending order and by URL value in descending order.

Explain the difference between search head pooling and search head clustering?


Search head pooling is a group of connected servers that are used to share the load, Configuration and user data Whereas Search head clustering is a group of Splunk Enterprise search heads used to serve as a central resource for searching. Since the search head cluster supports member interchangeability, the same searches and dashboards can be run and viewed from any member of the cluster.

Explain the function of Alert Manager?


 Alert manager displays the list of most recently fired alerts, i.e. alert instances. It provides a link to view the search results from that triggered alert. It also displays the alert’s name, app, type (scheduled, real-time, or rolling window), severity and mode.

What is SOS?

SOS stands for Splunk on Splunk. It is a Splunk app that provides a graphical view of your Splunk environment performance and issues. It has the following purposes:

  • Diagnostic tool to analyze and troubleshoot problems
  • Examine Splunk environment performance
  • Solve indexing performance issues
  • Observe scheduler activities and issues
  • See the details of the scheduler and user-driven search activity
  • Search, view and compare configuration files of Splunk

What is Splunk DB connect?

It is a general SQL database plugin that permits you to easily combine database information with Splunk queries and reports. It provides reliable, scalable and real-time integration between Splunk Enterprise and relational databases.

What is the difference between the Splunk App Framework and Splunk SDKs?

Splunk App Framework resides within Splunk’s web server and permits you to customize the Splunk Web UI that comes with the product and develop Splunk apps using the Splunk web server. It is an important part of the features and functionalities of Splunk Software, which does not license users to modify anything in the Splunk Software.

Splunk SDKs are designed to allow you to develop applications from the ground up and not require Splunk Web or any components from the Splunk App Framework. These are separately licensed to you from the Splunk Software and do not alter the Splunk Software.

What is Splunk indexer and explain its stages?

The indexer is a Splunk Enterprise component that creates and manages indexes. The main functions of an indexer are:

  • Indexing incoming data
  • Searching indexed data

Splunk indexer has the following stages:

Input: Splunk Enterprise acquires the raw data from various input sources and breaks it into 64K blocks and assign them some metadata keys.

These keys include host, source and source type of the data.

  • Parsing: Also known as event processing, during this stage, the Enterprise analyzes and transforms the data, breaks data into streams, identifies, parses and sets timestamps, performs metadata annotation and transformation of data.
  • Indexing: In this phase, the parsed events are written on the disk index including both compressed data and the associated index files.
  • Searching: The ‘Search’ function plays a major role during this phase as it handles all searching aspects (interactive, scheduled searches, reports, dashboards, alerts) on the indexed data and stores saved searches, events, field extractions and views.

What is the use of replacing command?

The replace command performs a search-and-replace of specified field values with replacement values. The values in a search-and-replace are case-sensitive.
Syntax:
replace (<wc-string> WITH <wc-string>)... [IN <field-list>]
Example:
… | replace *localhost WITH localhost IN host
This changes any host value that ends with "localhost" to "localhost".

List .conf files by priority?

File precedence in Splunk is as follows:

  • System local directory: top priority
  • App local directories
  • App default directories
  • System default directory: lowest priority

Where is Splunk default configuration stored?

Splunk default configuration is stored at $SPLUNK_HOME/etc/system/default

How to reset Splunk admin password?

To reset the password, follow these steps:

Log in to server on which Splunk is installed
Rename the password file at $SPLUNK_HOME/etc/passwd
Restart Splunk
After the restart, you can log in using default username: admin

password: changeme.

State the difference between the stats and eventstats commands?

stats – This command produces summary statistics of all existing fields in your search results and stores them as values in new fields.
eventstats – It is the same as the stats command, except that the aggregation results are added inline to every event, and only if the aggregation is applicable to that event. It computes the requested statistics like stats, but aggregates them onto the original raw events.

What Are the Components Of Splunk/Splunk Architecture?

Below is components of Splunk:

  • Search head – provides GUI for searching
  • Indexer – indexes machine data
  • Forwarder -Forwards logs to Indexer
  • Deployment server – Manages Splunk components in a distributed environment

Which Is Latest Splunk Version In Use?

Splunk 6.3.

What Is A Splunk Forwarder And What Are Types Of Splunk Forwarder?

 There are two types of Splunk forwarder as below:

  • Universal forwarder (UF) – a lightweight Splunk agent installed on a non-Splunk system to gather data locally; it cannot parse or index data.
  • Heavyweight forwarder (HWF) – a full instance of Splunk with advanced functionality. It generally works as a remote collector, intermediate forwarder, and possible data filter; because heavy forwarders parse data, they are not recommended for production systems.

What Are Most Important Configuration Files Of Splunk Or Can You Tell Name Of Few Important Configuration Files In Splunk?

  • props.conf
  • indexes.conf
  • inputs.conf
  • transforms.conf
  • server.conf

What Are Types Of Splunk Licenses

  • Enterprise license
  • Free license
  • Forwarder license
  • Beta license
  • Licenses for search heads (for distributed search)
  • Licenses for cluster members (for index replication)

What Is Splunk App?

A Splunk app is a container/directory of configurations, searches, dashboards, etc., in Splunk.

Where Is the Splunk Default Configuration Stored?

$SPLUNK_HOME/etc/system/default

What Features Are Not Available In Splunk Free?


 Splunk free lacks these features:

  • Authentication and scheduled searches/alerting
  • Distributed search
  • Forwarding in TCP/HTTP (to non-Splunk)
  • Deployment management

What Happens If The License Master Is Unreachable?

The license slave will start a 24-hour timer, after which search will be blocked on the license slave (though indexing continues). Users will not be able to search data on that slave until it can reach the license master again.

What Is Summary Index In Splunk?

The Summary index is the default summary index (the index that Splunk Enterprise uses if you do not indicate another one).
If you plan to run a variety of summary index reports you may need to create additional summary indexes.

Can You Write Down A General Regular Expression For Extracting Ip Address From Logs?

There are multiple ways we can extract an IP address from logs. Below are a few examples.
Regular expression for extracting an IP address:
rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"
OR
rex field=_raw "(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})"

What Is Difference Between Stats Vs Transaction Command?

The transaction command is most useful in two specific cases:

  • When a unique id (from one or more fields) alone is not sufficient to discriminate between two transactions. This is the case when the identifier is reused, for example, web sessions identified by cookie/client IP. Here, time spans or pauses are also used to segment the data into transactions. In other cases where an identifier is reused, say in DHCP logs, a particular message may identify the beginning or end of a transaction.
  • When it is desirable to see the raw text of the events combined, rather than an analysis of the constituent fields of the events.

In other cases, it is usually better to use stats, as its performance is higher, especially in a distributed search environment. Often there is a unique id, and stats can be used.

How To Troubleshoot Splunk Performance Issues?

  • Check splunkd.log for any errors.
  • Check for server performance issues, i.e. CPU/memory usage, disk I/O, etc.
  • Install the SOS (Splunk on Splunk) app and check for warnings and errors in its dashboards.
  • Check the number of saved searches currently running and their system resource consumption.
  • Install Firebug, a Firefox extension. After it is installed and enabled, log into Splunk using Firefox, open Firebug's panels, and switch to the 'Net' panel (you will have to enable it) to see which HTTP requests are slow.

Who Are the Biggest Direct Competitors To Splunk?

  • Logstash
  • Loggly
  • LogLogic
  • Sumo Logic, etc.

How Does Splunk Determine 1 Day, From A Licensing Perspective?


 Midnight to midnight on the clock of the license master.

How Are Forwarder Licenses Purchased?

They are included with Splunk, no need to purchase separately.

How To Disable Splunk Launch Message?

Set the value OFFENSIVE=Less in splunk-launch.conf

What Is Btool, Or How Will You Troubleshoot Splunk Configuration Files?

btool is a command-line tool that helps troubleshoot configuration file issues, or simply shows what values are being used by your Splunk Enterprise installation in an existing environment.

What Is the Difference Between Splunk App And Splunk Add-on?

Both contain preconfigured configurations, reports, etc., but a Splunk add-on does not have a visual app, whereas a Splunk app does have a preconfigured visual component.

What Is .conf Files Precedence In Splunk?
File precedence is as follows:

  • System local directory: highest priority
  • App local directories
  • App default directories
  • System default directory: lowest priority

What Is Fishbucket Or What Is Fishbucket Index?


It is a directory or index at the default location /opt/splunk/var/lib/splunk. It contains seek pointers and CRCs for the files you are indexing, so splunkd can tell if it has read them already. You can access it through the GUI by searching for "index=_thefishbucket".

What Is Dispatch Directory?

$SPLUNK_HOME/var/run/splunk/dispatch contains a directory for each search that is running or has completed. For example, a directory named 1434308943.358 will contain a CSV file of its search results, a search.log with details about the search execution, and other artifacts. Using the defaults (which you can override in limits.conf), these directories will be deleted 10 minutes after the search completes, unless the user saves the search results, in which case the results will be deleted after 7 days.

What Is the Difference Between Search Head Pooling And Search Head Clustering?

Both are features provided by Splunk for high availability of the search head tier, in case any one search head goes down. Search head clustering is the newer feature, and search head pooling has been deprecated and will be removed in upcoming versions. A search head cluster is managed by a captain, and the captain coordinates the other members. Search head clustering is more reliable and efficient than search head pooling.

What are the benefits of feeding data into a Splunk instance through Splunk Forwarders?

If you feed the data into a Splunk instance via Splunk Forwarders, you can reap three significant benefits – TCP connection, bandwidth throttling, and an encrypted SSL connection to transfer data from a Forwarder to an Indexer. Splunk’s architecture is such that the data forwarded to the Indexer is load-balanced by default.

So, even if one Indexer goes down for some reason, the data can quickly re-route itself via another Indexer instance. Furthermore, Splunk Forwarders cache the events locally before forwarding them, thereby creating a temporary backup of the data.

What is the “Summary Index” in Splunk?

In Splunk, the Summary Index refers to the default Splunk index that stores data resulting from scheduled searches over time. Essentially, it is the index that Splunk Enterprise uses if a user does not specify or indicate another one.

The most significant advantage of the Summary Index is that it allows you to retain the analytics and reports even after your data has aged.

What is the purpose of Splunk DB Connect?

 

Splunk DB Connect is a generic SQL database plugin designed for Splunk. It enables users to integrate database information with Splunk queries and reports seamlessly.

What is the function of the Splunk Indexer?

As the name suggests, the Splunk Indexer creates and manages indexes. It has two core functions – to index raw data into an index and to search and manage the indexed data.

Name a few important Splunk search commands.

Some of the important search commands in Splunk are:

  • Abstract
  • Erex
  • Addtotals
  • Accum
  • Filldown
  • Typer
  • Rename
  • Anomalies

What are some of the most important configuration files in Splunk?

The most crucial configuration files in Splunk are:

  • props.conf
  • indexes.conf
  • inputs.conf
  • transforms.conf
  • server.conf

What is the importance of the License Master in Splunk? What happens if the License Master is unreachable?

In Splunk, the License Master ensures that the right amount of data gets indexed. Since the Splunk license is based on the data volume that reaches the platform within a 24-hour window, the License Master ensures that your Splunk environment stays within the constraints of the purchased volume.

If ever the License Master is unreachable, a user cannot search the data. However, this will not affect the data flowing into the Indexer – data will continue to flow into the Splunk deployment, and the Indexers will keep indexing it. The Search Head will display a warning message at the top indicating that the user has exceeded the indexing volume. In this case, they must either reduce the amount of data flowing in or purchase additional capacity for the Splunk license.

Explain ‘license violation’ in the Splunk perspective.

Anytime you exceed the data limit, a 'license violation' warning will show on the dashboard. The warning remains for 14 days. With a commercial Splunk license, users can receive five warnings in a rolling 30-day window before the Indexer's search results and reports stop triggering. With the free version, users get only three warnings.

What is the general expression for extracting IP address from logs?

Although you can extract the IP address from logs in many ways, the regular expression for it would be:

rex field=_raw “(?<ip_address>\d+\.\d+\.\d+\.\d+)”

OR

rex field=_raw “(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})”

How can you troubleshoot Splunk performance issues?

To troubleshoot Splunk performance issues, perform the following steps:

  • Check splunkd.log to find any errors
  • Check server performance issues (CPU/memory usage, disk i/o, etc.)
  • Check the number of saved searches that are running at present and also their system resources consumption.
  • Install the SOS (Splunk on Splunk) app and see if the dashboard displays any warning or errors.
  • Install Firebug (a Firefox extension) and enable it. Then log into Splunk using Firefox, open Firebug's panels, and switch to the 'Net' panel (you may need to enable it). The Net panel displays the HTTP requests and responses, along with the time spent in each. This will allow you to see which requests are slowing down Splunk and affecting overall performance.

What are Buckets? Explain Splunk Bucket Lifecycle.

Buckets are directories that store the indexed data in Splunk. So, it is a physical directory that chronicles the events of a specific period. A bucket undergoes several stages of transformation over time. They are:

  • Hot – A hot bucket comprises the newly indexed data, and hence, it is open for writing and new additions. An index can have one or more hot buckets.
  • Warm – A warm bucket contains the data that is rolled out from a hot bucket.
  • Cold – A cold bucket has data that is rolled out from a warm bucket.
  • Frozen – A frozen bucket contains the data rolled out from a cold bucket. The Splunk Indexer deletes the frozen data by default. However, there’s an option to archive it. An important thing to remember here is that frozen data is not searchable.

What purpose does the Time Zone property serve in Splunk?

In Splunk, Time Zone is crucial for searching for events from a security or fraud perspective. Splunk sets the default Time Zone for you from your browser settings. The browser further picks up the current Time Zone from the machine you are using. So, if you search for any event with the wrong Time Zone, you will not find anything relevant for that search.

The Time Zone becomes extremely important when you are searching and correlating data pouring in from multiple different sources.

Define Sourcetype in Splunk.

In Splunk, Sourcetype refers to the default field that is used to identify the data structure of an incoming event. Sourcetype should be set at the forwarder level for indexer extraction to help identify different data formats. It determines how Splunk Enterprise formats the data during the indexing process. This being the case, you must ensure to assign the correct Sourcetype to your data. To make data searching even easier, you should provide accurate timestamps, and event breaks to the indexed data (the event data).

Explain the difference between Stats and Eventstats commands.

In Splunk, the Stats command is used to generate the summary statistics of all the existing fields in the search results and save them as values in newly created fields. Although the Eventstats command is pretty similar to the Stats command, it adds the aggregation results inline to each event (if only the aggregation is pertinent to that particular event). So, while both the commands compute the requested statistics, the Eventstats command aggregates the statistics into the original raw data.

Differentiate between Splunk App and Add-on.

Splunk Apps refer to the complete collection of reports, dashboards, alerts, field extractions, and lookups. However, Splunk Add-ons only contain built-in configurations – they do not have dashboards or reports.

What is the command to stop and start Splunk service?

The command to start Splunk service is: ./splunk start

The command to stop Splunk service is: ./splunk stop

How can you clear the Splunk search history?

To clear the Splunk search history, you need to delete the following file from Splunk server:

$splunk_home/var/log/splunk/searches.log

What is Btool in Splunk?

Btool in Splunk is a command-line tool that is used for troubleshooting configuration file issues. It also helps check what values are being used by a user’s Splunk Enterprise installation in the existing environment.

What is the need for Splunk Alert? Specify the type of options you get while setting up Splunk Alerts.

Splunk Alerts help notify users of any erroneous condition in their systems. For instance, a user can set up Alerts for email notification to be sent to the admin in case there are more than three failed login attempts within 24 hours.

The different options you get while setting up Alerts include:

  • You can create a webhook, which lets you post to services such as HipChat or GitHub, or you can send an email to a group of recipients with your chosen subject, priority, and message body.
  • You can attach results in CSV or PDF format, or include them inline with the body of the message, to help the recipient understand where and under what conditions the alert was triggered and what actions have been taken for it.
  • You can create tickets and throttle alerts based on specific conditions such as the machine name or IP address. These alerts can be controlled from the alert window.

What is a Fishbucket and what is the Index for it?

Fishbucket is an index directory resting at the default location, that is:

/opt/splunk/var/lib/splunk

Fishbucket includes seek pointers and CRCs for the indexed files. To access the Fishbucket, you can use the GUI for searching:

index=_thefishbucket

What is the Dispatch Directory?

 

The Dispatch Directory includes a directory for individual searches that are either running or have completed. The configuration for the Dispatch Directory is as follows:

$SPLUNK_HOME/var/run/splunk/dispatch

Let's assume there is a directory named 1434308943.358. This directory will contain a CSV file of all the search results, a search.log containing details about the search execution, and other relevant information. With the default configuration, this directory is deleted 10 minutes after the search completes. If you save the search results, they will be deleted after seven days.

How can you add folder access logs from a Windows machine to Splunk?

To add folder access logs from a Windows machine to Splunk, you must follow the steps listed below:

  • Go to Group Policy and enable Object Access Audit on the Windows machine where the folder is located.
  • Now you have to enable auditing on the specific folder for which you want to monitor access logs.
  • Install Splunk Universal Forwarder on the Windows machine.
  • Configure the Universal Forwarder to send security logs to the Splunk Indexer.

How does Splunk avoid duplicate indexing of logs?

The Splunk Indexer keeps track of all the indexed events in a directory called the Fishbucket, which contains seek pointers and CRCs for all the files currently being indexed. splunkd compares these values to determine whether it has already read a file (or part of a file) and skips content it has already indexed.

What is the configuration files precedence in Splunk?

The precedence of configuration files in Splunk is as follows:

  • System Local Directory (highest priority)
  • App Local Directories
  • App Default Directories
  • System Default Directory (lowest priority)

Define “Search Factor” and “Replication Factor.”

Both Search Factor (SF) and Replication Factor (RF) are clustering terminologies in Splunk. While the SF (with a default value of 2) determines the number of searchable copies of data maintained by the Indexer cluster, the RF represents the number of copies of data maintained by the Indexer cluster. An important thing to remember is that SF must always be less than or equal to the replication factor. Also, the Search Head cluster only has a Search Factor, whereas an Indexer cluster has both SF and RF.

Why is the lookup command used? Differentiate between inputlookup & outputlookup commands.

In Splunk, lookup commands are used when you want to receive specific fields from an external file (for example, a Python-based script, or a CSV file) to obtain a value of an event. It helps narrow the search results by referencing the fields in an external CSV file that matches fields in the event data.

The inputlookup command is used when you want to read a lookup table in as search results. For instance, it can bring in product names and prices that you then match against an internal field such as a product ID. On the contrary, the outputlookup command is used to write search results out to a lookup table.

Differentiate between Splunk SDK and Splunk Framework.

Splunk SDKs are primarily designed to help users develop applications from scratch. They do not require Splunk Web or any other component from the Splunk App Framework to function. Splunk SDKs are separately licensed from Splunk. As opposed to this, the Splunk App Framework rests within the Splunk Web Server. It allows users to customize the Splunk Web UI that accompanies the product. Although it lets you develop Splunk apps, you have to do so by using the Splunk Web Server.

How to know when Splunk has completed indexing a log file?

You can figure out whether or not Splunk has completed indexing a log file in two ways:

  1. By monitoring the data from Splunk's metrics log in real time:

index="_internal" source="*metrics.log" group="per_sourcetype_thruput" series="<your_sourcetype_here>" | eval MB=kb/1024 | chart sum(MB)

  2. By monitoring all the metrics split by source type:

index="_internal" source="*metrics.log" group="per_sourcetype_thruput" | eval MB=kb/1024 | chart sum(MB) avg(eps) over series

How to list all the saved searches in Splunk?

You can list all saved searches with the following search:
| rest /servicesNS/-/-/saved/searches splunk_server=local

List out common ports used by Splunk.

Common ports used by Splunk are as follows:

  • Web Port: 8000
  • Management Port: 8089
  • Network port: 514
  • Index Replication Port: 8080
  • Indexing Port: 9997
  • KV store: 8191

Explain Splunk components

The fundamental components of Splunk are:

  • Universal forwarder: a lightweight component that forwards data to the Splunk indexer (or to a heavy forwarder).
  • Heavy forwarder: a heavier component that allows you to filter the required data before forwarding it.
  • Search head: this component is used to gain intelligence and perform reporting.
  • License manager: the license is based on volume and usage, for example 50 GB per day. Splunk regularly checks the licensing details.
  • Load balancer: in addition to the default Splunk load balancing, you can also use your own personalized load balancer.

What are the disadvantages of using Splunk?

Some disadvantages of using Splunk tool are:

  • Splunk can prove expensive for large data volumes.
  • Dashboards are functional but not as effective as some other monitoring tools.
  • Its learning curve is steep, and you need Splunk training as it is a multi-tier architecture, so you need to spend a lot of time learning this tool.
  • Searches are difficult to understand, especially regular expressions and search syntax.

What are the pros of getting data into a Splunk instance using forwarders?

The advantages of getting data into Splunk via forwarders are TCP connection, bandwidth throttling, and secure SSL connection for transferring crucial data from a forwarder to an indexer.

What is the importance of license master in Splunk?

License master in Splunk ensures that the right amount of data gets indexed. It ensures that the environment remains within the limits of the purchased volume as Splunk license depends on the data volume, which comes to the platform within a 24-hour window.

Name some important configuration files of Splunk

Commonly used Splunk configuration files are:

  • inputs.conf
  • transforms.conf
  • server.conf
  • indexes.conf
  • props.conf

Explain license violation in Splunk.

It is a warning error that occurs when you exceed the data limit. This warning will persist for 14 days. With a commercial license, you may have 5 warnings within a 1-month rolling window; beyond that, your Indexer's search results and reports stop triggering.

However, in a free version, license violation warning shows only 3 counts of warning.

What is the use of Splunk alert?

Alerts can be used when you have to monitor for and respond to specific events. For example, sending an email notification to the user when there are more than three failed login attempts in a 24-hour period.

Explain map-reduce algorithm

The map-reduce algorithm is a technique used by Splunk to increase data searching speed. It is inspired by two functional programming functions: 1) map() and 2) reduce().

Here, the map() function is associated with the Mapper class and the reduce() function with the Reducer class.

Explain different types of data inputs in Splunk?

Following are different types of data inputs in Splunk:

  • Using files and directories as input
  • Configuring Network ports to receive inputs automatically
  • Windows inputs. These Windows inputs are of four types: 1) Active Directory monitor, 2) printer monitor, 3) network monitor, and 4) registry inputs monitor.

How Splunk avoids duplicate log indexing?

Splunk keeps track of indexed events in the fishbucket directory. It contains CRCs and seek pointers for the files you are indexing, so splunkd can tell whether it has already read them.

Explain pivot and data models.

Pivots are used to create the front views of your output and then choose the proper filter for a better view of this output. Both options are beneficial for people from semi-technical or non-technical backgrounds.

Data models are most commonly used for creating a hierarchical model of data. However, they can also be used when you have a large amount of unstructured data, and they help you make use of that information without writing complicated search queries.

Explain search factor and replication factor?

The search factor determines the number of searchable copies of data maintained by the indexer cluster, i.e., how many searchable copies of each bucket are available.

The replication factor determines the total number of copies of data maintained by the cluster, as well as the number of copies that each site maintains in a multisite cluster.

What is the use of lookup command?

Lookup command is generally used when you want to get some fields from an external file. It helps you to narrow the search results as it helps to reference fields in an external file that match fields in your event data.

Explain default fields for an event in Splunk

There are five default fields that accompany every event in Splunk: 1) host, 2) source, 3) sourcetype, 4) index, and 5) timestamp.

How can you extract fields?

You can extract fields using the UI from the fields sidebar, the event list, or the Settings menu (via the field extractor).

Another way to extract fields in Splunk is to write your own regular expressions in a props.conf configuration file, as sketched below.

What do you mean by summary index?

A summary index is a special index that stores the results calculated by Splunk. It is a fast and cheap way to run a query over a longer period of time.

How to prevent events from being indexed by Splunk?

You can prevent events from being indexed by Splunk by excluding debug messages and routing them to the null queue. The null queue is configured in transforms.conf, and this should be done at the (heavy) forwarder level itself, as sketched below.

Define Splunk DB connect

It is a SQL database plugin that enables you to import tables, rows, and columns from a database directly into Splunk. Splunk DB Connect helps in providing reliable and scalable integration between databases and Splunk Enterprise.

Define Splunk buckets

It is the directory used by Splunk Enterprise to store raw data and index files. An index is made up of buckets, which are organized by the age of the data.

What is the function of Alert Manager?

The Alert Manager adds workflow to Splunk. Its purpose is to provide a common app with dashboards to search for alerts or events.

How can you troubleshoot Splunk performance issues?

Three ways to troubleshoot Splunk performance issues:

  • See server performance issues.
  • See for errors in splunkd.log.
  • Install Splunk app and check for warnings and errors in the dashboard.

What is the difference between Index time and Search time?

Index time is the period when the data is consumed, up to the point when it is written to disk. Search time takes place while a search runs, as events are composed and fields are extracted by the search.

How to reset the Splunk administrator password?

In order to reset the administrator password, perform the following steps:

  1. Log in to the server on which Splunk is installed.
  2. Rename the password file and then start Splunk again.
  3. After this, you can sign in using the username admin (or administrator) with the password changeme.

Name the command which is used to the “filtering results” category

The commands used in the "filtering results" category are: where, sort, rex, and search.

List out different types of Splunk licenses

The types of Splunk licenses are as follows:

  • Free license
  • Beta license
  • Search heads license
  • Cluster members license
  • Forwarder license
  • Enterprise license

List out the number of categories of the SPL commands.

The SPL commands are classified into five categories:

1) Filtering Results, 2) Sorting Results, 3) Filtering Grouping Results, 4) Adding Fields, and 5) Reporting Results.

What is eval command?

This command is used to calculate an expression. The eval command evaluates Boolean, string, and mathematical expressions. You can use multiple eval expressions in a single search by separating them with commas.

Name commands which are included in the reporting results category

Following are the commands which are included in the reporting results category:

  • rare
  • chart
  • timechart
  • top
  • stats

What is SOS?

Splunk on Splunk or SOS is a Splunk app that helps you to analyze and troubleshoot Splunk environment performance and issues.

Name features which are not available in Splunk free version?

Splunk Free lacks the following features:

  • Authentication and scheduled searches/alerting
  • Distributed search
  • Forwarding in TCP/HTTP (to non-Splunk systems)
  • Deployment management

What is a null queue?

A null queue is an approach to filter out unwanted incoming events so that they are never indexed by Splunk Enterprise.

Explain types of search modes in Splunk?

There are three types of search modules. They are:

  • Fast mode: It increases the searching speed by limiting search data.
  • Verbose mode: This mode returns all possible fields and event data.
  • Smart mode: It is a default setting in a Splunk app. Smart mode toggles the search behavior based on transforming commands.

What is the main difference between source & source type

The source field identifies where a particular event originates from, while the sourcetype determines how Splunk processes the incoming data stream into events according to its nature.

What is a join command?

It is used to combine the results of a subsearch with the results of the main search. The fields on which you join must be common to each result set. You can also join a result set to itself using the selfjoin command in Splunk.

How to start and stop Splunk service?

To start and stop the Splunk service, use the following commands from $SPLUNK_HOME/bin:

./splunk start

./splunk stop

Where to download Splunk Cloud?

Visit website: https://www.splunk.com/ to download a free trial of Splunk Cloud.

What is the difference between stats and time chart command?

The stats command produces a table of aggregate statistics grouped by the fields you specify, whereas the timechart command produces a time-series chart in which _time is always the x-axis and the results can be split by only one additional field.

Define deployment server

Deployment server is a Splunk instance that acts as a centralized configuration manager. It is used to deploy the configuration to other Splunk instances.

What is Time Zone property in Splunk?

Time zone property provides the output for a specific time zone. Splunk takes the default time zone from browser settings. The browser takes the current time zone from the computer system, which is currently in use. Splunk takes that time zone when users are searching and correlating bulk data coming from other sources.

What is Splunk DB Connect?

Splunk DB Connect is a plugin that allows you to add relational database data to Splunk reports. It helps in providing reliable and scalable integration between relational databases and Splunk Enterprise.

How to install forwarder remotely?

You can make use of a bash script in order to install forwarder remotely.

What is the use of syslog server?

A syslog server is used to collect data from various devices like routers and switches, as well as application logs from web servers. You can use rsyslog or syslog-ng to configure a syslog server.

How to monitor forwarders?

Use the forwarder tab available on the DMC (Distributed Management Console) to monitor the status of forwarders and the deployment server to manage them.

What is the use of Splunk btool?

It is a command-line tool that is designed to solve configuration related issues.

Name Splunk alternatives

Some Splunk alternatives are:

  • Sumo logic
  • Loglogic
  • Loggly
  • Logstash

What is KV store in Splunk?

The Key-Value (KV) store allows you to store and retrieve data inside Splunk. The KV store also helps you to:

  • Manage job queue
  • Store metadata
  • Examine the workflow

What do you mean by deployer in Splunk?

The deployer is a Splunk Enterprise instance used to distribute apps to the members of a search head cluster. It can also be used to distribute app and user configuration information.

When to use auto_high_volume in Splunk?

It is used when an index is expected to be high volume, i.e., around 10 GB of data or more; auto_high_volume sets the maximum bucket size to 10 GB on 64-bit systems.

What is the stats command?

It is a Splunk command that is used to arrange report data in a tabular format.

What is a regex command?

The regex command removes results that do not match the specified regular expression.

What is input lookup command?

The inputlookup command returns the contents of a lookup table in the search results.

What is the output lookup command?

The outputlookup command writes search results to a lookup table on disk.

List out various stages of bucket lifecycle

Stages of bucket lifecycle are as follows:

  • Hot
  • Warm
  • Cold
  • Frozen

Name stages of Splunk indexer

Stages of Splunk indexer are:

  • Input
  • Parsing
  • Indexing
  • Searching

Explain how Splunk works?

There are three phases in which Splunk works:

  • The first phase: Splunk ingests data from various sources.
  • The second phase: it parses and indexes the data so that it can be used to answer queries.
  • The third phase: it displays the answers via graphs, reports, or charts that can be understood by the audience.

What are the three versions of Splunk?

Splunk is available in three different versions: 1) Splunk Enterprise, 2) Splunk Light, and 3) Splunk Cloud.

  • Splunk Enterprise: the edition used by many IT organizations. It helps you analyze the data from various websites and applications.
  • Splunk Cloud: a SaaS (Software as a Service) offering. It provides almost the same features as the Enterprise version, including APIs, SDKs, and apps.
  • Splunk Light: a free version which allows you to create reports, and to search and edit your log data. Splunk Light has limited functionality and features compared to the other versions.

Name companies which are using Splunk

Well-known companies which are using Splunk tool are:

  • Cisco
  • Facebook
  • Bosch
  • Adobe
  • IBM
  • Walmart
  • Salesforce

What is SPL?

Search Processing Language, or SPL, is the language that contains functions, commands, and arguments. It is used to get the desired output from the indexed data.

Define monitoring in Splunk

Monitoring refers to reports that you can watch visually on an ongoing basis.

Name the domain in which knowledge objects can be used

Following are a few domains in which knowledge objects can be used:

  • Application Monitoring
  • Employee Management
  • Physical Security
  • Network Security

How many roles are there in Splunk?

There are three roles in Splunk: 1) Admin, 2) Power, and 3) User.

Are search terms in Splunk case sensitive?

No, Search terms in Splunk are not case sensitive.

Can search results be used to change the existing search?

Yes, the search result can be used to make changes in an existing search.

List out layout options for search results.

Following are a few layout options for search result:

  • List
  • Table
  • Raw

What are the formats in which search result be exported?

The search result can be exported into JSON, CSV, XML, and PDF.

Explain types of Boolean operators in Splunk.

Splunk supports three types of Boolean operators; they are:

  • AND: It is implied between two terms, so you do not need to write it.
  • OR: It determines that either one of the two arguments should be true.
  • NOT: It is used to filter out events containing a specific term.

Explain the use of top command in Splunk

The top command is used to display the common values of a field, with their percentage and count.

What is the use of stats command?

It calculates aggregate statistics over a dataset, such as count, sum, and average.

What are the types of alerts in Splunk?

There are mainly three types of alerts available in Splunk:

  • Scheduled alert: It is an alert that is based on a historical search. It runs periodically with a set schedule.
  • Per result alert: This alert is based on a real-time search that runs over all time.
  • Rolling window alert: An alert that is based on real-time search. This search is set to run within a specific rolling time window that you define.

List various types of Splunk dashboards.

  • Dynamic form-based dashboards
  • Dashboards as scheduled reports
  • Real time dashboards

What is the use of tags in Splunk?

They are used to assign names to specific field and value pairs. The field can be an event type, source, sourcetype, or host.

How to increase the size of Splunk data storage?

In order to increase the size of data storage, you can either add more space to index or add more indexers.

Distinguish between Splunk apps and add-ons

The only difference between Splunk apps and add-ons is that Splunk apps contain built-in reports, configurations, and dashboards, whereas Splunk add-ons contain only built-in configurations; they do not contain dashboards or reports.

Define dispatch directory in Splunk?

Dispatch directory stores status like running or completed.

What is the primary difference between stats and eventstats commands

The stats command provides summary statistics of the existing fields available in the search output and stores them as values in new fields. On the other hand, the eventstats command adds the aggregation results inline to every event, but only if the aggregation applies to that particular event.

What do you mean by source type in Splunk?

Sourcetype is a default field that identifies the data structure of an event. It determines how Splunk formats the data while indexing.

Define calculated fields?

Calculated fields are fields that perform calculations using the values of one or more fields already present in a specific event (see the sketch below).

List out some Splunk search commands

Following are some search commands available in Splunk:

  • Abstract
  • Erex
  • Addtotals
  • Accum
  • Filldown
  • Typer
  • Rename
  • Anomalies

 

So, this brings us to the end of the Splunk Interview Questions blog. This Tecklearn 'Top Splunk Interview Questions and Answers' article helps you with commonly asked questions if you are looking for a job in Splunk or the Big Data domain. If you wish to learn Splunk and build a career in the Big Data domain, then check out our interactive Splunk Developer and Admin Training, which comes with 24*7 support to guide you throughout your learning period.

https://www.tecklearn.com/course/splunk-training-and-certification-developer-and-admin/

Splunk Developer & Admin Training

About the Course

Tecklearn’s Splunk Training covers all aspects of Splunk development and Splunk administration from basic to expert level. The trainee will go through various aspects of Splunk installation, configuration, etc. and also learn to create reports and dashboards, both using Splunk’s searching and reporting commands. As part of the course, also work on Splunk deployment management, indexes, parsing, Splunk cluster implementation, and more. With this online Splunk training, you can quickly get up and run with the Splunk platform and successfully clear the Splunk Certification exam.

Why Should you take Splunk Developer and Admin Training?

  • A Splunk Development Operations Engineer can take home a salary of up to $148,590 (Indeed.com).
  • 13,000+ customers in over 110 countries are already using Splunk to gain operational intelligence & reduce operational cost.
  • IDC predicts that by 2020 the world will be home to 40 trillion GB of data. The demand to process this data is higher than ever.

What you will Learn in this Course?

Splunk Administration

Overview of Splunk

  • Need for Splunk and its features
  • Splunk Products and their Use-Case
  • Splunk Components: Search Head, Indexer, Forwarder, Deployment Server & License Master
  • Splunk Licensing options

Splunk Architecture

  • Introduction to the architecture of Splunk

Splunk Installation

  • Download and Install Splunk
  • Configure Splunk
  • Creation of index

Splunk Configuration Files

  • Introduction to Splunk configuration files
  • Managing the .conf files

Splunk App and Apps Management

  • Splunk App
  • How to develop Splunk apps
  • Splunk App Management
  • Splunk App add-ons
  • App permissions and Implementation

User roles and authentication

  • Introduction to Authentication techniques
  • User Creation and Management
  • Splunk Admin Roles and Responsibilities
  • Splunk License Management

Splunk Index Management

  • Splunk Indexes
  • Segregation of the Splunk Indexes
  • Concept of Splunk Buckets and Bucket Classification
  • Creating New Index and estimating Index storage

Various Splunk Input Methods

  • Understanding the input methods
  • Agentless input types

Splunk Universal Forwarder

  • Universal Forwarder management
  • Overview of Splunk Universal Forwarder

Deployment Management in Splunk

  • Implementing the Splunk tool and deploying it on server
  • Splunk environment setup and Splunk client group deployment

Basic Production Environment

  • Universal Forwarder
  • Forwarder Management
  • Data management
  • Troubleshooting and Monitoring

Splunk Search Engine

  • Integrating Search using Head Clustering and Indexer Clustering
  • Conversion of machine-generated data to operational intelligence
  • Set up Dashboard, Charts and Reports

Search Scaling and Monitoring

  • Splunk Distributed Management Console for monitoring
  • Large-scale deployment and overcoming execution hurdles
  • Distributed search concepts
  • Improving search performance

Splunk Cluster Implementation and Index Clustering

  • Cluster indexing
  • Configuring the cluster behaviour
  • Index and search behaviour

Distributed Management Console

  • Introduction to Splunk distributed management console
  • How to deploy distributed search in Splunk environment

Splunk Developer

Splunk Development Concepts

  • Roles and Responsibilities of Splunk developer

Basic Searching

  • Basic Searching using Splunk query
  • Build Search, refine search and time range using Auto-complete
  • Controlling a search job and Identifying the contents of search

Using Fields in Searches

  • Using Fields in search
  • Deployment of Field Extractor and Fields Sidebar for REGEX field extraction

Splunk Search Commands

  • Search command
  • General search practices
  • Concept of search pipeline
  • Specify indexes in search
  • Deployment of the various search commands: Fields, Sort, Tables, Rename, rex and erex

Creating Reports and Dashboards

  • Creation of Reports, Charts and Dashboards
  • Editing Dashboards and Reports
  • Adding reports to dashboard

Creating Alerts

  • Create alerts
  • Understanding alerts
  • Viewing fired alerts

Splunk Commands

  • Splunk Search Commands
  • Transforming Commands
  • Reporting Commands
  • Mapping and Single Value Commands

Lookups

  • Concept of data lookups, examples and lookup tables

Automatic Lookups

  • Configuring and Defining automatic lookups
  • Deploying lookups in reports and searches

Splunk Queries

  • Splunk Queries
  • Splunk Query Repository

Splunk Search Processing Language

  • Learn about the Search Processing Language

Analyzing, Calculating and Formatting results

  • Calculating and analyzing results
  • Value conversion
  • Conditional statements and filtering calculated search results

Splunk Reports and Visualizations

  • Explore the available visualizations
  • Create charts and time charts
  • Omit null values and format results

Got a question for us? Please mention it in the comments section and we will get back to you.

 
