How to Concatenate Data Sets in SAS

Last updated on Sep 14 2022
Nitin Pawar

Table of Contents

How to Concatenate Data Sets in SAS

Multiple SAS data sets can be concatenated to give a single data set using the SET statement. The total number of observations in the concatenated data set is the sum of the number of observations in the original data sets. The order of observations is sequential. All observations from the first data set are followed by all observations from the second data set, and so on.

Ideally all the combining data sets have same variables, but in case they have different number of variables, then in the result all the variables appear, with missing values for the smaller data set.

Syntax

The basic syntax for SET statement in SAS is −

SET data-set 1 data-set 2 data-set 3…..;

Following is the description of the parameters used −

• data-set1,data-set2 are dataset names written one after another.

Example

Consider the employee data of an organization which is available in two different data sets, one for the IT department and another for Non-It department. To get the complete details of all the employees we concatenate both the data sets using the SET statement shown as below.

DATA ITDEPT;

INPUT empid name $ salary ;

DATALINES;

1 Rick 623.3

3 Mike 611.5

6 Tusar 578.6

;

RUN;

DATA NON_ITDEPT;

INPUT empid name $ salary ;

DATALINES;

2 Dan 515.2

4 Ryan 729.1

5 Gary 843.25

7 Pranab 632.8

8 Rasmi 722.5

RUN;

DATA All_Dept;

SET ITDEPT NON_ITDEPT;

RUN;

PROC PRINT DATA = All_Dept;

RUN;

When the above code is executed, we get the following output.

sas 24

Scenarios

When we have many variations in the data sets for concatenation, the result of variables can differ but the total number of observations in the concatenated data set is always the sum of the observations in each data set. We will consider below many scenarios on this variation.

Different number of variables

If one of the original data set has more number of variables then another, then the data sets still get combined but in the smaller data set those variables appear as missing.

Example

In below example the first data set has an extra variable named DOJ. In the result the value of DOJ for second data set will appear as missing.

DATA ITDEPT;
INPUT empid name $ salary DOJ date9. ;
DATALINES;
1 Rick 623.3 02APR2001
3 Mike 611.5 21OCT2000
6 Tusar 578.6 01MAR2009
;
RUN;
DATA NON_ITDEPT;
INPUT empid name $ salary ;
DATALINES;
2 Dan 515.2
4 Ryan 729.1
5 Gary 843.25
7 Pranab 632.8
8 Rasmi 722.5
RUN;

DATA All_Dept;
SET ITDEPT NON_ITDEPT;
RUN;
PROC PRINT DATA = All_Dept;
RUN;

When the above code is executed, we get the following output.

sas 23

Different variable name

In this scenario the data sets have same number of variables but a variable name differs between them. In that case a normal concatenation will produce all the variables in the result set and giving missing results for the two variables which differ. While we may not change the variable name in the original data sets we can apply the RENAME function in the concatenated data set we create. That will produce the same result as a normal concatenation but of course with one new variable name in place of two different variable names present in the original data set.

Example

In the below example data set ITDEPT has the variable name ename whereas the data set NON_ITDEPT has the variable name empname. But both of these variables represent the same type(character). We apply the RENAME function in the SET statement as shown below.

DATA ITDEPT;
INPUT empid ename $ salary ;
DATALINES;
1 Rick 623.3
3 Mike 611.5
6 Tusar 578.6
;
RUN;

DATA NON_ITDEPT;
INPUT empid empname $ salary ;
DATALINES;
2 Dan 515.2
4 Ryan 729.1
5 Gary 843.25
7 Pranab 632.8
8 Rasmi 722.5
RUN;
DATA All_Dept;
SET ITDEPT(RENAME =(ename = Employee) ) NON_ITDEPT(RENAME =(empname = Employee) );
RUN;
PROC PRINT DATA = All_Dept;
RUN;

When the above code is executed, we get the following output.

sas 25

Different variable lengths

If the variable lengths in the two data sets is different than the concatenated data set will have values in which some data is truncated for the variable with smaller length. It happens if the first data set has a smaller length. To solve this we apply the higher length to both the data set as shown below.

Example

In the below example the variable ename is of length 5 in the first data set and 7 in the second. When concatenating we apply the LENGTH statement in the concatenated data set to set the ename length to 7.

DATA ITDEPT;

INPUT empid 1-2 ename $ 3-7 salary 8-14 ;

DATALINES;

1 Rick 623.3

3 Mike 611.5

6 Tusar 578.6

;

RUN;

DATA NON_ITDEPT;

INPUT empid 1-2 ename $ 3-9 salary 10-16 ;

DATALINES;

2 Dan 515.2

4 Ryan 729.1

5 Gary 843.25

7 Pranab 632.8

8 Rasmi 722.5

RUN;

DATA All_Dept;

LENGTH ename $ 7 ;

SET ITDEPT NON_ITDEPT ;

RUN;

PROC PRINT DATA = All_Dept;

RUN;

When the above code is executed, we get the following output.

sas 26

So, this brings us to the end of blog. This Tecklearn ‘How to Concatenate Data Sets in SAS’ blog helps you with commonly asked questions if you are looking out for a job in SAS. If you wish to learn SAS and build a career in Data Analytics domain, then check out our interactive, SAS Training for SAS BASE CertificationTraining, that comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

SAS Training for SAS BASE Certification

SAS Training for SAS BASE Certification Training

About the Course

SAS Certification Training is intended to make you an expert in SAS programming and Analytics. You will be able to analyse and write SAS code for real problems, learn to use SAS to work with datasets, perform advanced statistical techniques to obtain optimized results with Advanced SAS programming. In this SAS online training course, you will also learn SAS macros, Machine Learning, PROC SQL, procedure, statistical analysis and decision trees. You will also work on real-life projects and prepare for the SAS Certified Base Programmer certification exam. Upon the completion of this SAS online training, you will have enough proficiency in reading spreadsheets, databases, using SAS functions for manipulating this data and debugging it.

Why Should you take SAS Training?

• The average salary for a Business Intelligence Developer skilled in SAS is $100k (PayScale salary data)

• SAS, Google, Facebook, Twitter, Netflix, Accenture & other MNCs worldwide are using SAS for their Data analysis activities and advance their existing systems.

• SAS is a Leader in 2017 Gartner Magic Quadrant for Data Science Platform.

What you will Learn in this Course?

Introduction to SAS

• Introduction to SAS

• Installation of SAS

• SAS windows

• Working with data sets

• Walk through of SAS windows like output, search, editor etc

SAS Enterprise Guide

• How to read and subset the data sets

• SET Statement

• Infile and Infile Options

• SAS Format -Format Vs Informat

SAS Operators and Functions

• Using Variables

• Defining and using KEEP and DROP statements

• Output Statement

• Retain Statement

• SUM Statement

Advanced SAS Procedures

• PROC Import

• PROC Print

• Data Step Vs Proc

• Deep Dive into Proc

Customizing Datasets

• SAS Arrays

• Useful SAS Functions

• PUT/INPUT Functions

• Date/Time Functions

• Numeric Functions

• Character Functions

SAS Format and SAS Graphs

• SAS Format statements

• Understanding PROC GCHART, various graphs, bar charts: pie, bar

Sorting Techniques

• NODUP

• NODUKEY

• NODUP Vs NODUKEY

Data Transformation Function

• Character functions, numeric functions and converting variable type

• Use functions in data transformation

Deep Dive into SAS Procedures, Functions and Statements

• Find Function

• Scan Function

• MERGE Statement

• BY Statement

• Joins

• Procedures Vs Function

• Where Vs If

• What is Missover

• NMISS

• CMISS

PROC SQL

• SELECT statement

• Sorting of Data

• CASE expression

• Other SELECT statement clauses

• JOINS and UNIONS

Using SAS Macros

• Benefits of SAS Macros

• Macro Variables

• Macro Code Constituents and Macro Step

• Positional Parameters to Macros

Got a question for us? Please mention it in the comments section and we will get back to you.

 

 

 

 

 

 

 

 

 

 

0 responses on "How to Concatenate Data Sets in SAS"

Leave a Message

Your email address will not be published. Required fields are marked *