Subsetting Data Sets in SAS

Last updated on Dec 13 2021
Vaidehi Reddy

Subsetting a SAS data set means extracting part of it by selecting fewer variables, fewer observations, or both. Variables are subset with the KEEP and DROP statements, while observations are subset with the DELETE statement.

The result of the subsetting operation is held in a new data set, which can be used for further analysis. Subsetting is mainly used to analyze part of a data set without the variables or observations that are not relevant to the analysis.

Subsetting Variables

In this method we extract only a few variables from the entire data set.

Syntax

The basic syntax for subsetting variables in SAS is:

KEEP var1 var2 ... ;
DROP var1 var2 ... ;

The parameters are as follows:

  • var1 and var2 are the names of the variables in the data set that need to be kept or dropped.

Example

Consider the SAS data set below, which contains the employee details of an organization. If we are interested only in the name (ename) and department (DEPT) values, we can use the following code.

DATA Employee;
   INPUT empid ename $ salary DEPT $;
   DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

/* Keep only the name and department variables */
DATA OnlyDept;
   SET Employee;
   KEEP ename DEPT;
RUN;

PROC PRINT DATA = OnlyDept;
RUN;

When the above code is executed, we get the following output.

[Figure: Subsetting Variables. PROC PRINT output of OnlyDept with only the ename and DEPT columns for all eight employees.]

The same result can be obtained by dropping the variables that are not required, as the following code illustrates.

DATA Employee;
   INPUT empid ename $ salary DEPT $;
   DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

/* Drop the variables that are not needed; ename and DEPT remain */
DATA OnlyDept;
   SET Employee;
   DROP empid salary;
RUN;

PROC PRINT DATA = OnlyDept;
RUN;
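
A closely related form, not used in the examples above, is the KEEP= (or DROP=) data set option written in parentheses after the data set name. The following is a minimal sketch, assuming the same Employee data; placed on the SET statement, the option limits which variables are read into the DATA step in the first place.

```sas
/* Same Employee data as in the examples above */
DATA Employee;
   INPUT empid ename $ salary DEPT $;
   DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

/* KEEP= data set option: only ename and DEPT are read from Employee */
DATA OnlyDept;
   SET Employee (KEEP = ename DEPT);
RUN;

PROC PRINT DATA = OnlyDept;
RUN;
```

The resulting OnlyDept data set has the same two variables as in the KEEP and DROP statement versions.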

Subsetting Observations

In this method we extract only a few observations from the entire data set.

Syntax

We use the DELETE statement inside a DATA step. An observation for which the condition is true is not written to the new data set; all other observations are kept.

The syntax for subsetting observations is:

IF Var Condition THEN DELETE ;

The parameters are as follows:

  • Var is the name of the variable whose value is tested by the specified condition; observations for which the condition is true are deleted.

Example

Consider the SAS data set below, which contains the employee details of an organization. If we are interested only in the employees with a salary of 700 or more, we delete the observations with a salary below 700 using the following code.

DATA Employee;
   INPUT empid name $ salary DEPT $;
   DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

/* Remove every observation with a salary below 700 */
DATA OnlyDept;
   SET Employee;
   IF salary < 700 THEN DELETE;
RUN;

PROC PRINT DATA = OnlyDept;
RUN;

When the above code is executed, we get the following output.

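[Figure: Subsetting Observations. PROC PRINT output of OnlyDept listing the three employees with a salary of 700 or more: Ryan, Gary and Rasmi.]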
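
The same selection can also be written as a positive condition instead of a deletion. The following is a minimal sketch, assuming the same Employee data; the data set names HighSalary and HighSalaryWhere are made up for illustration. It shows two standard alternatives to IF ... THEN DELETE: a subsetting IF statement, which keeps only the observations that satisfy the condition, and a WHERE statement, which filters observations as they are read from the input data set.

```sas
/* Same Employee data as in the example above */
DATA Employee;
   INPUT empid name $ salary DEPT $;
   DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

/* Subsetting IF: only observations with salary >= 700 are written out */
DATA HighSalary;          /* hypothetical data set name */
   SET Employee;
   IF salary >= 700;
RUN;

/* WHERE statement: observations are filtered as they are read */
DATA HighSalaryWhere;     /* hypothetical data set name */
   SET Employee;
   WHERE salary >= 700;
RUN;

PROC PRINT DATA = HighSalary;
RUN;

PROC PRINT DATA = HighSalaryWhere;
RUN;
```

For this data, both steps select the same observations as the DELETE version. Which form reads best is a matter of preference: stating the condition to keep is often easier to follow than stating the condition to delete.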

This brings us to the end of the blog. This Tecklearn 'Subsetting Data Sets in SAS' blog covers questions commonly asked of candidates looking for a job in SAS. If you wish to learn SAS and build a career in the data analytics domain, check out our interactive SAS Training for SAS BASE Certification course, which comes with 24/7 support to guide you throughout your learning period. The course details are available at:

https://www.tecklearn.com/course/sas-training-for-sas-base-certification/
