This blog contains some basic SAS commands and their applications, along with the output. The problems are taken from the book "Learning SAS by Example: A Programmer's Guide", Chapter 7 onwards. All the programs have been saved in the library a16. Let's begin.
*Setting up the library;
libname a16 "/folders/myshortcuts/myfolder/iSAS/a15016";
Chapter 7 - Problem 1;
data a16.a16_school;
length Quiz $ 1;
input Age Quiz $ Midterm Final;
*Using IF and ELSE IF statements, compute two new variables as follows:
Grade (numeric), with a value of 6 if Age is 12 and a value of 8 if Age is 13;
if Age = 12 then Grade = 6;
else if Age = 13 then Grade = 8;
*The quiz grades have numerical equivalents as follows: A = 95, B = 85, C = 75, D = 70, and F = 65;
if Quiz = 'A' then Quiz_eq = 95;
else if Quiz = 'B' then Quiz_eq = 85;
else if Quiz = 'C' then Quiz_eq = 75;
else if Quiz = 'D' then Quiz_eq = 70;
else if Quiz = 'F' then quiz_eq = 65;
*Compute a course grade (Course) as a weighted average of the
Quiz (20%), Midterm (30%) and Final (50%);
Course = (.2*Quiz_eq + .3*Midterm + .5*Final);
datalines;
12 A 92 95
12 B 88 88
13 C 78 75
13 A 92 93
12 F 55 62
13 B 88 82
;
proc print data = a16.a16_school;
run;
Output:
Learning:
Use of 'if then else' and computing a new variable.
Chapter 7 - Problem 2;
*Creating Data set HOSP;
libname a16 "/folders/myshortcuts/myfolder/iSAS/a15016";
data a16.a16_hosp;
do j = 1 to 1000;
AdmitDate = int(ranuni(1234)*1200 + 15500);
quarter = intck('qtr','01jan2002'd,AdmitDate);
do i = 1 to quarter;
if ranuni(0) lt .1 and weekday(AdmitDate) eq 1 then
AdmitDate = AdmitDate + 1;
if ranuni(0) lt .1 and weekday(AdmitDate) eq 7 then
AdmitDate = AdmitDate - int(3*ranuni(0) + 1);
DOB = int(25000*Ranuni(0) + '01jan1920'd);
DischrDate = AdmitDate + abs(10*rannor(0) + 1);
Subject + 1;
output;
end;
end;
drop i j;
format AdmitDate DOB DischrDate mmddyy10.;
run;
*Use PROC PRINT to list observations for Subject
values of 5, 100, 150, and 200;
*Using In Operator;
proc print data = a16.a16_hosp;
WHERE Subject in (5 100 150 200);
run;
*Using OR Operator;
proc print data = a16.a16_hosp;
WHERE Subject = 5 OR Subject = 100 OR Subject = 150 OR Subject = 200;
run;
Output: The output remains the same in either case.
Learning
Use of OR and IN operator
Chapter 7 - Problems 4;
*Creating a subset of Sales with Region and TotalSales Variable;
Data a16.Sales_new (keep= Region TotalSales Weight);
set a16.Sales;
*Adding a new variable called Weight with values of 1.5 for the North Region,
1.7 for the South Region, and 2.0 for the West and East Regions. Use a SELECT statement to do this;
Select;
when (Region eq 'North') Weight = 1.5;
when (Region eq 'South') Weight = 1.7;
when (Region eq 'West') Weight = 2.0;
when (Region eq 'East') Weight = 2.0;
otherwise;
end;
run;
proc print data = a16.Sales_new;
run;
Output:
Learning
Use of SELECT statement
Chapter 8 - Problem 1;
*Creating a dataset named Vital;
data a16.a16_vitals;
input ID : 3. Age Pulse SBP DBP;
label SBP = "Systolic Blood Pressure"
DBP = "Diastolic Blood Pressure";
datalines;
001 23 68 120 80
002 55 72 188 96
003 78 82 200 100
004 18 58 110 70
005 43 52 120 82
006 37 74 150 98
007 . 82 140 100
;
proc print data = a16.a16_vitals;
run;
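data a16.a16_newvitals;
set a16.a16_vitals;
*Set the lengths up front, otherwise the first value (Low) fixes them at 3 and High is cut to Hig;
length PulseGroup SBPgroup $ 4;
*A missing Age compares as less than 50, so subject 007 also enters this block;
if Age lt 50 then do;
if Pulse lt 70 then PulseGroup = 'Low';
else if Pulse ge 70 then PulseGroup = 'High';
If SBP lt 130 then SBPgroup = 'Low';
else if SBP ge 130 then SBPgroup = 'High';
End;
run;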
Output:
Chapter 8 - Problem 2;
data a16.a16_monthsales;
input month sales;
SumSales + sales;
datalines;
1 4000
2 5000
3 .
4 5500
5 5000
6 6000
7 6500
8 4500
9 5100
10 5700
11 6500
12 7500
;
run;
proc print data = a16.a16_monthsales;
run;
Output:
Learning
Use of SUM Statement
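To make the point about the SUM statement a little more concrete, here is a minimal sketch that compares it with an ordinary assignment. The variable PlainTotal and the four data values are assumptions made up for this illustration; they are not part of the book problem.

```sas
data _null_;
   input sales;
   retain PlainTotal 0;
   * SUM statement: the total is retained, starts at 0 and skips missing values;
   SumSales + sales;
   * Ordinary assignment: one missing value makes every later total missing;
   PlainTotal = PlainTotal + sales;
   put sales= SumSales= PlainTotal=;
datalines;
4000
5000
.
5500
;
```

In the log, SumSales should keep growing past the missing month, while PlainTotal turns missing at the third record and stays that way.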
Chapter 8 - Problem 4;
data a16.a16_missing;
input A $ B $ C $;
MissA = missing(A);
MissB = missing(B);
MissC = missing(C);
TotalMiss = MissA+MissB+MissC;
datalines;
X Y Z
X Y Y
Z Z Z
X X .
Y Z .
X . .
;
run;
proc print data = a16.a16_missing;
run;
Output:
Learning
Use of the missing( ) function to flag missing values and count them across the variables in each observation
Chapter 9 - Problem 1;
data a16.a16_dates;
input @1 subject $3.
@4 dob mmddyy10.
@14 visit date9.;
age=yrdif(dob,visit);
datalines;
00110/21/195011Nov2006
00201/02/195525May2005
00312/25/200525Dec2006
;
run;
proc print data = a16.a16_dates;
run;
Output
Learning
Use of the Yrdif( ) function to find the difference in years between two dates. Helpful when calculating an age.
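As a rough illustration of this function, the sketch below computes an age from the two dates on the first data line of the problem. The 'AGE' basis is an assumption that needs SAS 9.3 or later, and the names age_exact and age_years are made up for this example.

```sas
data _null_;
   dob   = '21oct1950'd;
   visit = '11nov2006'd;
   * Fractional number of years between the two dates;
   age_exact = yrdif(dob, visit, 'AGE');
   * Completed years, which is how an age is usually reported;
   age_years = floor(age_exact);
   put age_exact= 6.2 age_years=;
run;
```

Using floor( ) rather than round( ) keeps a person from being reported as a year older shortly before the birthday.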
Chapter 9 - Problem 2;
data a16.a16_threedates;
input @1 date1960_2006 mmddyy8.;
format date1960_2006 date9.;
datalines;
01/01/11
02/23/05
03/15/15
05/09/06
;
proc print data=a16.a16_threedates;
run;
Output
Chapter 9 - Problem 4;
data a16.a16_hosp;
do j = 1 to 1000;
AdmitDate = int(ranuni(1234)*1200 + 15500);
quarter = intck('qtr','01jan2002'd,AdmitDate);
do i = 1 to quarter;
if ranuni(0) lt .1 and weekday(AdmitDate) eq 1 then
AdmitDate = AdmitDate + 1;
if ranuni(0) lt .1 and weekday(AdmitDate) eq 7 then
AdmitDate = AdmitDate - int(3*ranuni(0) + 1);
DOB = int(25000*Ranuni(0) + '01jan1920'd);
DischrDate = AdmitDate + abs(10*rannor(0) + 1);
Subject + 1;
output;
end;
end;
drop i j;
format AdmitDate DOB DischrDate mmddyy10.;
run;
proc print data=a16.a16_hosp (obs=10);
run;
Output
Learning
Use of Ranuni( ) function to generate random numbers between 0 and 1
Chapter 10 - Problem 1;
data a16.a16_blood;
infile "/folders/myshortcuts/myfolder/iSAS/Datasets/blood.txt" truncover;
input Subject Gender $ BloodType $ AgeGroup $ WBC RBC Chol;
proc print data = a16.a16_blood (obs=5);
run;
Output
data a16.a16_subsetA;
set a16.a16_blood;
where Gender = "Female" and BloodType = "AB";
Combined = (0.001*WBC)+ RBC;
run;
proc print data = a16.a16_subsetA (obs=5);
run;
Output
data a16.a16_subsetB;
set a16.a16_blood;
where Gender = "Female" and BloodType = "AB" and (0.001*WBC)+ RBC ge 14;
Combined = (0.001*WBC)+ RBC;
run;
proc print data = a16.a16_subsetB(obs=5);
run;
Output
Learning
Learnt how to create subsets of a data set using SET together with a WHERE statement
Chapter 10 - Problem 2;
data a16.a16_monday2002;
set a16.a16_hosp;
day = weekday(AdmitDate);
year = year(AdmitDate);
age = yrdif(DOB, AdmitDate);
age = round(age);
*weekday( ) and year( ) return numbers, so compare them with numeric constants;
if day = 2 and year = 2002;
run;
proc print data = a16.a16_monday2002;
run;
Output

Learning
Use of weekday( ) and year( ) to extract the day of the week and the year from a date, and round( ) to round off a number
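A small sketch of these three functions on a single value may help. The date and the number being rounded are hypothetical and chosen only for illustration.

```sas
data _null_;
   * Hypothetical date that falls on a Monday;
   d = '07jan2002'd;
   * weekday returns 1 for Sunday, 2 for Monday, up to 7 for Saturday;
   day = weekday(d);
   yr = year(d);
   * The second argument of round sets the rounding unit;
   half = round(12.345, .5);
   put day= yr= half=;
run;
```

All three results are numeric, which is why the filter on day and year should use numbers rather than quoted strings.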
Chapter 10 - Problem 4;
data a16.a16_bicycles;
input Country & $25.
Model & $14.
Manuf : $10.
Units : 5.
UnitCost : comma8.;
TotalSales = (Units * UnitCost) / 1000;
format UnitCost TotalSales dollar10.;
label TotalSales = "Sales in Thousands"
Manuf = "Manufacturer";
datalines;
USA  Road Bike  Trek 5000 $2,200
USA  Road Bike  Cannondale 2000 $2,100
USA  Mountain Bike  Trek 6000 $1,200
USA  Mountain Bike  Cannondale 4000 $2,700
USA  Hybrid  Trek 4500 $650
France  Road Bike  Trek 3400 $2,500
France  Road Bike  Cannondale 900 $3,700
France  Mountain Bike  Trek 5600 $1,300
France  Mountain Bike  Cannondale 800 $1,899
France  Hybrid  Trek 1100 $540
United Kingdom  Road Bike  Trek 2444 $2,100
United Kingdom  Road Bike  Cannondale 1200 $2,123
United Kingdom  Hybrid  Trek 800 $490
United Kingdom  Hybrid  Cannondale 500 $880
United Kingdom  Mountain Bike  Trek 1211 $1,121
Italy  Hybrid  Trek 700 $690
Italy  Road Bike  Trek 4500 $2,890
Italy  Mountain Bike  Trek 3400 $1,877
;
run;
proc print data = a16.a16_bicycles (obs = 10);
run;
Output
data a16.a16_mountainUSA a16.a16_roadFrance;
set a16.a16_bicycles;
if Country eq "USA" and Model eq "Mountain Bike"
then output a16.a16_mountainUSA;
if Country eq "France" and Model eq "Road Bike"
then output a16.a16_roadFrance;
run;
proc print data = a16.a16_mountainUSA;
run;
Output
proc print data = a16.a16_roadFrance;
run;
Output
Chapter 11 - Problem 1;
data a16.a16_health;
input Subj : $3.
Height
Weight;
datalines;
001 68 155
003 74 250
004 63 110
005 60 95
;
run;
data a16.a16_bmi (drop = height weight);
set a16.a16_health;
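*Convert pounds to kilograms and inches to metres;
weight_kg=weight/2.2;
height_mts=height*.0254;
*BMI is weight in kilograms divided by the square of height in metres;
bmi=weight_kg/height_mts**2;
bmi_round=round(bmi,1);
bmi_tenth=round(bmi,.1);
bmi_group=round(bmi,5);
bmi_trunc=int(bmi);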
run;
title "BMI";
proc print data=a16.a16_bmi;
run;
Output
Chapter 11 - Problem 2;
data a16.a16_miss;
set a16.a16_blood;
if missing(wbc) then missWBC+1;
if missing(rbc) then missRBC+1;
if missing(chol) then missCHOL+1;
run;
proc print data=a16.a16_miss (obs=10);
run;
Output
Chapter 11 - Problem 3;
data a16.a16_blood_miss;
set a16.a16_blood;
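*The blood data set has AgeGroup rather than age, so AgeGroup is assumed to be the intended variable;
if missing(wbc) then call missing(AgeGroup, rbc, chol);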
run;
title "Missing Value";
proc print data=a16.a16_blood_miss (obs = 10);
run;
Output
Chapter 12 - Problem 2;
*Data set MIXED;
data a16.a16_mixed;
input Name & $20. ID;
datalines;
Daniel Fields  123
Patrice Helms  233
Thomas chien  998
;
data a16.a16_mixed1(drop = first last first1 first2 last1 last2 f l);
set a16.a16_mixed;
name_low=lowcase(name);
name_prop=propcase(name);
first=scan(name,1,' ');
last=scan(name,2,' ');
first1=upcase(substr(first,1,1));
first2=lowcase(substr(first,2));
f=trim(first1)||trim(first2);
last1=upcase(substr(last,1,1));
last2=lowcase(substr(last,2));
l=trim(last1)||trim(last2);
name_hard=catx(' ',f,l);
run;
title "Names";
proc print data=a16.a16_mixed1;
run;
Output
Learning
Use of lowcase( ) and propcase( ) to convert letters to lowercase and proper case respectively
Chapter 12 - Problem 3;
*Data set NAMES_AND_MORE;
data a16.a16_names_and_more;
input Name $20.
Phone & $14.
Height & $10.
Mixed & $8.;
datalines;
Roger Cody          (908)782-1234  5ft. 10in.  50 1/8
Thomas Jefferson    (315) 848-8484  6ft. 1in.  23 1/2
Marco Polo          (800)123-4567  5Ft. 6in.  40
Brian Watson        (518)355-1766  5ft. 10in  89 3/4
Michael DeMarco     (445)232-2233  6ft.  76 1/3
;
proc print data=a16.a16_names_and_more;
run;
Output
data a16.a16_name_number;
set a16.a16_names_and_more;
name=compbl(propcase(name));
phone=compress(phone,'( ) -');
*Phone is a character variable, so the numeric value goes into a new variable;
PhoneNum=input(phone,10.);
run;
title "Names and More";
proc print data=a16.a16_name_number;
run;
Output
Learning
Use of COMPBL( ) and COMPRESS( ) function. The former converts two or more blanks to a single blank; the latter removes blanks (default action) or characters that you specify from a character value
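The difference between the two functions is easier to see side by side. The sketch below is only an illustration: the value of raw and the variable names are hypothetical, and the third call uses the 'kd' modifiers (keep digits) as one example of telling COMPRESS which characters to act on.

```sas
data _null_;
   length raw squeezed stripped digits $ 30;
   * Hypothetical value with extra blanks and phone punctuation;
   raw = 'Roger    Cody (908) 782-1234';
   * Runs of blanks become a single blank;
   squeezed = compbl(raw);
   * Default action removes every blank;
   stripped = compress(raw);
   * Modifiers k and d together keep only the digits;
   digits = compress(raw, , 'kd');
   put squeezed= / stripped= / digits=;
run;
```

COMPBL is the safer choice for names, where one blank between words must survive, while COMPRESS suits values such as phone numbers where the separators carry no information.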
Chapter 12 - Problem 4;
data a16.a16_height (drop = h1 h2 h3);
set a16.a16_names_and_more;
*Keep only the digits (kd) of each piece and read them as numbers;
h1=input(compress(scan(height,1,' '),,'kd'),10.);
h2=input(compress(scan(height,2,' '),,'kd'),10.);
h3=h1*12;
Height_New=sum(h3,h2);
run;
title "Names and More";
proc print data=a16.a16_height;
run;
Output
Chapter 13 - Problem 2;
data a16.a16_survey2;
input ID
(Q1-Q5)(1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;
run;
data a16.a16_survey2_new;
set a16.a16_survey2;
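array ques{5} Q1-Q5;
*Assuming the goal is to reverse the 1-5 scale (1 becomes 5, 5 becomes 1), each answer is subtracted from 6;
do i = 1 to 5;
ques{i} = 6 - ques{i};
end;
drop i;
run;
proc print data=a16.a16_survey2_new;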
run;
Output
Chapter 13 - Problem 3;
*Data set NINES;
data a16.a16_nines;
infile datalines missover;
input x y z (Char1-Char3)(:$1.) a1-a5;
datalines;
1 2 3 a b c 99 88 77 66 55
2 999 999 d c e 999 7 999
10 20 999 b b b 999 999 999 33 44
;
proc print data=a16.a16_nines;
run;
Output
data a16.a16_nonines;
set a16.a16_nines;
array n{*} _numeric_;
do i=1 to dim(n);
if n{i}=999 then call missing(n{i});
end;
drop i;
run;
title "No Nines";
proc print data=a16.a16_nonines;
run;
Output
Chapter 13 - Problem 5;
data a16.a16_pass;
infile datalines missover;
input id test1-test5;
datalines;
001 90 88 92 95 90
002 64 64 77 72 71
003 68 69 80 75 70
004 88 77 66 77 67
;
run;
proc print data=a16.a16_pass;
run;
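data a16.a16_test;
set a16.a16_pass;
array pass{5} test1-test5;
*The original post breaks off here. Assumption: the step counts the tests scored above 65;
NumPassed = 0;
do i=1 to 5;
if pass{i}>65 then NumPassed = NumPassed + 1;
end;
drop i;
run;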
Output
Chapter 14 - Problem 1;
title "Blood and Subject";
proc print data=a16.a16_blood (obs=10)label;
label wbc="WHITE BLOOD CELL"
RBC="RED BLOOD CELL"
chol="CHOLESTEROL";
id subject;
run;
Output
Chapter 14 - Question 2;
*Creating Data set SALES;
data a16.a16_sales;
input EmpID : $4.
Name & $15.
Region : $5.
Customer & $18.
Date : mmddyy10.
Item : $8.
Quantity : 5.
UnitCost : dollar9.;
TotalSales = Quantity * UnitCost;
/* format date mmddyy10. UnitCost TotalSales dollar9.;*/
drop Date;
datalines;
1843 George Smith  North Barco Corporation  10/10/2006 144L 50 $8.99
1843 George Smith  South Cost Cutter's  10/11/2006 122 100 $5.99
1843 George Smith  North Minimart Inc.  10/11/2006 188S 3 $5,199
1843 George Smith  North Barco Corporation  10/15/2006 908X 1 $5,129
1843 George Smith  South Ely Corp.  10/15/2006 122L 10 $29.95
0177 Glenda Johnson  East Food Unlimited  9/1/2006 188X 100 $6.99
0177 Glenda Johnson  East Shop and Drop  9/2/2006 144L 100 $8.99
1843 George Smith  South Cost Cutter's  10/18/2006 855W 1 $9,109
9888 Sharon Lu  West Cost Cutter's  11/14/2006 122 50 $5.99
9888 Sharon Lu  West Pet's are Us  11/15/2006 100W 1000 $1.99
0017 Jason Nguyen  East Roger's Spirits  11/15/2006 122L 500 $39.99
0017 Jason Nguyen  South Spirited Spirits  12/22/2006 407XX 100 $19.95
0177 Glenda Johnson  North Minimart Inc.  12/21/2006 777 5 $10.500
0177 Glenda Johnson  East Barco Corporation  12/20/2006 733 2 $10,000
1843 George Smith  North Minimart Inc.  11/19/2006 188S 3 $5,199
;
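*BY-group printing needs the data sorted by Region first;
proc sort data=a16.a16_sales;
by Region;
run;
title "Sales";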
proc print data=a16.a16_sales;
by Region;
var region quantity TotalSales;
sum Quantity TotalSales;
run;
Output
Chapter 14 - Problem 3;
title"First 10 Observation";
title2 "admitted on september 2004";
title3 "older than 83 years";
proc print data=a16.a16_hosp n ="Number of Patients" label;
where year(admitdate) eq 2004 and
month(admitdate) eq 9 and
yrdif(dob,admitdate,'actual')ge 83;
id subject;
label dob = "Date of Birth"
admitdate= "Admission date"
dischrdate="Discharge Date";
var dob admitdate dischrdate;
run;
Output
Chapter 15 - Problem 1;
proc report data=a16.a16_blood(obs=5) nowd headline;
column subject wbc rbc;
define wbc/display"White Blood Cell";
define rbc/ display "red blood cell";
run;
Output
Chapter 15 - Problem 2;
proc report data=a16.a16_blood headline nowd;
column gender wbc rbc;
define gender/ group;
define wbc/ analysis mean format=6.3 "White Blood Cell" ;
define rbc/ analysis mean format=6.3 "red blood cell" ;
break after gender/ ol summarize skip;
run;
quit;
proc print data=a16.a16_blood;
run;
Output
Chapter 15 - Problem 3;
proc report data=a16.a16_hosp headline;
column subject admitdate dob age;
define admitdate/ display "admission date";
define dob/ display "date of birth";
define age/computed;
compute age;
age=yrdif(dob,admitdate,"actual");
endcomp;
run;
Output

Your blog will look much better, if you:
1. Make it structured. You can have a separate page for each chapter and an index page linking to the chapter pages. You can also add links on each chapter page to go back to the index and to the previous and next pages, if any.
2. Introduce the chapter (i.e. the page) in your own words.
3. For each problem, explain the problem you are attempting to solve.
4. You can have screenshots of the programs as well.
5. For each problem solved, write what you learned by solving the problem. Even one line is fine for this.
6. Write a conclusion or summary for each page.
Remember to :
1. Number the problems solved by giving Chapter # and Problem # (just like you did)
2. "Beautify" the programs before posting. There is a menu item in SAS Studio which does the beautification for you. Also, you may like to post the programs the way you have posted the results.
It would help if you can think of this exercise as making your class notes and homework copy, combined.
I understand it is a lot of work now, given that you are trying to do it at the eleventh hour. For the next submission, if time permits, you can give it a try.
Cheers
Thanks for the feedback, sir. I shall definitely incorporate all these things in my next submission.