AcHi BlOgS: SAS

SAS FUNCTIONS
Arithmetic Functions
ABS(argument)
returns absolute value
DIM<n>(array-name)
returns the number of elements in a one-dimensional array or the number of elements in a specified dimension of a multidimensional array.
n specifies the dimension, in a multidimensional array, for which you want to know the number of elements.
DIM(array-name,bound-n)
returns the number of elements in a one-dimensional array or the number of elements in the specified dimension of a multidimensional array
bound-n specifies the dimension in a multidimensional array, for which you want to know the number of elements.
HBOUND(array-name)
returns the upper bound of an array
HBOUND(array-name,bound-n)
returns the upper bound of an array
LBOUND(array-name)
returns the lower bound of an array
LBOUND(array-name,bound-n)
returns the lower bound of an array
MAX(argument,argument, ...)
returns the largest value of the numeric arguments
MIN(argument,argument, ...)
returns the smallest value of the numeric arguments
MOD(argument-1, argument-2)
returns the remainder
SIGN(argument)
returns the sign of a value or 0
SQRT(argument)
returns the square root
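
A minimal sketch of the array-bound and arithmetic functions listed above. The array score and the variables s2-s5 are hypothetical names chosen for illustration; the array is deliberately indexed from 2 to 5 so that DIM, LBOUND and HBOUND return different values.

```sas
data _null_;
  /* hypothetical array indexed 2..5 with initial values */
  array score{2:5} s2-s5 (10 20 30 40);

  n  = dim(score);     /* number of elements: 4 */
  lo = lbound(score);  /* lower bound: 2 */
  hi = hbound(score);  /* upper bound: 5 */

  /* looping from LBOUND to HBOUND works whatever the bounds are */
  total = 0;
  do i = lbound(score) to hbound(score);
    total = total + score{i};
  end;

  r  = mod(17, 5);     /* remainder of 17/5: 2 */
  sg = sign(-3.2);     /* -1 */
  a  = abs(-3.2);      /* 3.2 */

  put n= lo= hi= total= r= sg= a=;
run;
```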

Character Functions
BYTE(n)
returns one character in the ASCII or EBCDIC collating sequence where n is an integer representing a specific ASCII or EBCDIC character
COLLATE(start-position<,end-position>) | (start-position<,,length>)
returns an ASCII or EBCDIC collating sequence character string
COMPBL(source)
removes multiple blanks between words in a character string
COMPRESS(source<,characters-to-remove>)
removes specific characters from a character string
DEQUOTE(argument)
removes quotation marks from a character value
INDEX(source,excerpt)
searches the source for the character string specified by the excerpt
INDEXC(source,excerpt-1<, ... excerpt-n>)
searches the source for any character present in the excerpt
INDEXW(source,excerpt)
searches the source for a specified pattern as a word
LEFT(argument)
left-aligns a SAS character string
LENGTH(argument)
returns the length of an argument
LOWCASE(argument)
converts all letters in an argument to lowercase
QUOTE(argument)
adds double quotation marks to a character value
RANK(x)
returns the position of a character in the ASCII or EBCDIC collating sequence
REPEAT(argument,n)
repeats a character expression
REVERSE(argument)
reverses a character expression
RIGHT(argument)
right-aligns a character expression
SCAN(argument,n<,delimiters>)
returns a given word from a character expression
SOUNDEX(argument)
encodes a string to facilitate searching
SUBSTR(argument,position<,n>)=characters-to-replace
replaces character value contents
var=SUBSTR(argument,position<,n>)
extracts a substring from an argument. (var is any valid SAS variable name.)
TRANSLATE(source,to-1,from-1<,...to-n,from-n>)
replaces specific characters in a character expression
TRANWRD(source,target,replacement)
replaces or removes all occurrences of a word in a character string
TRIM(argument)
removes trailing blanks from character expression and returns one blank if the expression is missing
TRIMN(argument)
removes trailing blanks from character expressions and returns a null string if the expression is missing
UPCASE(argument)
converts all letters in an argument to uppercase
VERIFY(source,excerpt-1<,...excerpt-n>)
returns the position of the first character unique to an expression
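
A short sketch of a few of the character functions above applied to one string. The variable names and the sample values are made up for illustration.

```sas
data _null_;
  length name $30 surname given $15 phone $20;
  name  = 'Smith,   John  Q';
  phone = '(919) 555-0100';

  tidy    = compbl(name);             /* runs of blanks collapsed to one blank */
  surname = scan(name, 1, ',');       /* first word, comma as delimiter: Smith */
  given   = scan(name, 2, ', ');      /* second word, comma or blank as delimiter: John */
  pos     = index(name, 'John');      /* position where 'John' starts; 0 if not found */
  first5  = substr(name, 1, 5);       /* Smith */
  digits  = compress(phone, '() -');  /* removes parentheses, blanks and hyphens */
  shout   = upcase(surname);          /* SMITH */
  renamed = tranwrd(name, 'John', 'Jane');

  put tidy= / surname= given= pos= first5= / digits= shout= / renamed=;
run;
```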

Date and Time Functions
DATDIF(sdate,edate,basis)
returns the number of days between two dates
DATE()
returns the current date as a SAS date value
DATEJUL(julian-date)
converts a Julian date to a SAS date value
DATEPART(datetime)
extracts the date from a SAS datetime value
DATETIME()
returns the current date and time of day
DAY(date)
returns the day of the month from a SAS date value
DHMS(date,hour,minute,second)
returns a SAS datetime value from date, hour, minute, and second
HMS(hour,minute,second)
returns a SAS time value from hour, minute, and second
HOUR(time | datetime)
returns the hour from a SAS time or datetime value
INTCK('interval',from,to)
returns the number of time intervals in a given time span
INTNX('interval',start-from,increment<,'alignment'>)
advances a date, time, or datetime value by a given interval, and returns a date, time, or datetime value
JULDATE(date)
returns the Julian date from a SAS date value
MDY(month,day,year)
returns a SAS date value from month, day, and year values
MINUTE(time | datetime)
returns the minute from a SAS time or datetime value
MONTH(date)
returns the month from a SAS date value
QTR(date)
returns the quarter of the year from a SAS date value
SECOND(time | datetime)
returns the second from a SAS time or datetime value
TIME()
returns the current time of day
TIMEPART(datetime)
extracts a time value from a SAS datetime value
TODAY()
returns the current date as a SAS date value
WEEKDAY(date)
returns the day of the week from a SAS date value
YEAR(date)
returns the year from a SAS date value
YRDIF(sdate,edate,basis)
returns the difference in years between two dates
YYQ(year,quarter)
returns a SAS date value from the year and quarter
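
The following sketch shows how some of the date functions above fit together. The dates are arbitrary examples. Note that INTCK counts interval boundaries crossed rather than elapsed time, and that INTNX aligns to the beginning of the interval unless an alignment is given.

```sas
data _null_;
  start = mdy(1, 15, 2008);                       /* 15JAN2008 */
  stop  = mdy(3, 10, 2008);                       /* 10MAR2008 */

  months     = intck('month', start, stop);       /* 2: crosses 01FEB and 01MAR */
  next_start = intnx('month', start, 1);          /* 01FEB2008 (default alignment) */
  same_day   = intnx('month', start, 1, 'same');  /* 15FEB2008 */

  dt = dhms(start, 14, 30, 0);                    /* datetime value: 15JAN2008 14:30:00 */
  d  = datepart(dt);                              /* back to the date part */
  h  = hour(dt);                                  /* 14 */
  q  = qtr(start);                                /* 1 */

  put start= date9. stop= date9. months=;
  put next_start= date9. same_day= date9.;
  put dt= datetime18. d= date9. h= q=;
run;
```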

Mathematical Functions
AIRY(x)
returns the value of the AIRY function
DAIRY(x)
returns the derivative of the AIRY function
DIGAMMA(argument)
returns the value of the DIGAMMA function
ERF(argument)
returns the value of the (normal) error function
ERFC(argument)
returns the value of the complementary (normal) error function
EXP(argument)
returns the value of the exponential function
GAMMA(argument)
returns the value of the GAMMA function
IBESSEL(nu,x,kode)
returns the value of the modified Bessel function
JBESSEL(nu,x)
returns the value of the Bessel function
LGAMMA(argument)
returns the natural logarithm of the GAMMA function
LOG(argument)
returns the natural (base e) logarithm
LOG2(argument)
returns the logarithm to the base 2
LOG10(argument)
returns the logarithm to the base 10
TRIGAMMA(argument)
returns the value of the TRIGAMMA function

Noncentrality Functions
CNONCT(x,df,prob)
returns the noncentrality parameter from a chi-squared distribution
FNONCT(x,ndf,ddf,prob)
returns the value of the noncentrality parameter of an F distribution
TNONCT(x,df,prob)
returns the value of the noncentrality parameter from the Student's t distribution

Probability and Density Functions
CDF('dist',quantile,parm-1,...,parm-k)
computes cumulative distribution functions
LOGPDF | LOGPMF('dist',quantile,parm-1,...,parm-k)
computes the logarithm of a probability density (mass) function. The two functions are identical.
LOGSDF('dist',quantile,parm-1,...,parm-k)
computes the logarithm of a survival function
PDF | PMF('dist',quantile,parm-1,...,parm-k)
computes probability density (mass) functions
POISSON(m,n)
returns the probability from a Poisson distribution
PROBBETA(x,a,b)
returns the probability from a beta distribution
PROBBNML(p,n,m)
returns the probability from a binomial distribution
PROBCHI(x,df<,nc>)
returns the probability from a chi-squared distribution
PROBF(x,ndf,ddf<,nc>)
returns the probability from an F distribution
PROBGAM(x,a)
returns the probability from a gamma distribution
PROBHYPR(N,K,n,x<,r>)
returns the probability from a hypergeometric distribution
PROBMC
returns probabilities and critical values (quantiles) from various distributions for multiple comparisons of the means of several groups
PROBNEGB(p,n,m)
returns the probability from a negative binomial distribution
PROBBNRM(x,y,r)
returns the probability from the standardized bivariate normal distribution
PROBNORM(x)
returns the probability from the standard normal distribution
PROBT(x,df<,nc>)
returns the probability from a Student's t distribution
SDF('dist',quantile,parm-1,...,parm-k)
computes a survival function

Quantile Functions
BETAINV(p,a,b)
returns a quantile from the beta distribution
CINV(p,df<,nc>)
returns a quantile from the chi-squared distribution
FINV(p,ndf,ddf<,nc>)
returns a quantile from the F distribution
GAMINV(p,a)
returns a quantile from the gamma distribution
PROBIT(p)
returns a quantile from the standard normal distribution
TINV(p,df<,nc>)
returns a quantile from the t distribution
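
The probability and quantile functions are inverses of each other. A minimal sketch of that round trip, using arbitrary probabilities and degrees of freedom:

```sas
data _null_;
  z  = probit(0.975);       /* standard normal quantile, about 1.96 */
  p  = probnorm(z);         /* probability back from the quantile: 0.975 */
  p2 = cdf('NORMAL', z);    /* same probability through the generic CDF function */

  t  = tinv(0.975, 10);     /* t quantile with 10 degrees of freedom, about 2.228 */
  pt = probt(t, 10);        /* back to 0.975 */

  put z= 8.4 p= 8.4 p2= 8.4 t= 8.4 pt= 8.4;
run;
```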

Sample Statistics Functions
CSS(argument,argument,...)
returns the corrected sum of squares
CV(argument,argument,...)
returns the coefficient of variation
KURTOSIS(argument,argument,...)
returns the kurtosis (or 4th moment)
MAX(argument,argument, ...)
returns the largest value
MIN(argument,argument, ...)
returns the smallest value
MEAN(argument,argument, ...)
returns the arithmetic mean (average)
MISSING(numeric-expression | character-expression)
returns a numeric result that indicates whether the argument contains a missing value
N(argument,argument, ...)
returns the number of nonmissing values
NMISS(argument,argument, ...)
returns the number of missing values
ORDINAL(count,argument,argument,...)
returns the largest value of a part of a list
RANGE(argument,argument,...)
returns the range of values
SKEWNESS(argument,argument,argument,...)
returns the skewness
STD(argument,argument,...)
returns the standard deviation
STDERR(argument,argument,...)
returns the standard error of the mean
SUM(argument,argument,...)
returns the sum
USS(argument,argument,...)
returns the uncorrected sum of squares
VAR(argument,argument,...)
returns the variance
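
The sample statistics functions skip missing arguments. A small sketch with made-up values q1-q4, one of which is missing:

```sas
data _null_;
  q1 = 4; q2 = .; q3 = 6; q4 = 2;

  cnt   = n(of q1-q4);      /* nonmissing values: 3 */
  miss  = nmiss(of q1-q4);  /* missing values: 1 */
  total = sum(of q1-q4);    /* 12 */
  avg   = mean(of q1-q4);   /* 12 / 3 = 4, the missing value is not counted */
  lo    = min(of q1-q4);    /* 2 */
  hi    = max(of q1-q4);    /* 6 */
  rng   = range(of q1-q4);  /* 6 - 2 = 4 */

  put cnt= miss= total= avg= lo= hi= rng=;
run;
```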

State and ZIP Code Functions
FIPNAME(expression)
converts FIPS codes to uppercase state names
FIPNAMEL(expression)
converts FIPS codes to mixed case state names
FIPSTATE(expression)
converts FIPS codes to two-character postal codes
STFIPS(postal-code)
converts state postal codes to FIPS state codes
STNAME(postal-code)
converts state postal codes to uppercase state names
Tip:
For Version 6, the maximum length of the value that is returned is 200 characters. For Version 7 and beyond, the maximum length is 20 characters.
STNAMEL(postal-code)
converts state postal codes to mixed case state names
Tip:
For Version 6, the maximum length of the value that is returned is 200 characters. For Version 7 and beyond, the maximum length is 20 characters.
ZIPFIPS(zip-code)
converts ZIP codes to FIPS state codes
ZIPNAME(zip-code)
converts ZIP codes to uppercase state names
ZIPNAMEL(zip-code)
converts ZIP codes to mixed case state names
ZIPSTATE(zip-code)
converts ZIP codes to state postal codes

Trigonometric and Hyperbolic Functions
ARCOS(argument)
returns the arccosine
ARSIN(argument)
returns the arcsine
ATAN(argument)
returns the arctangent
COS(argument)
returns the cosine
COSH(argument)
returns the hyperbolic cosine
SIN(argument)
returns the sine
SINH(argument)
returns the hyperbolic sine
TAN(argument)
returns the tangent
TANH(argument)
returns the hyperbolic tangent

Truncation Functions
CEIL(argument)
returns the smallest integer that is greater than or equal to the argument
FLOOR(argument)
returns the largest integer that is less than or equal to the argument
FUZZ(argument)
returns the nearest integer if the argument is within 1E-12
INT(argument)
returns the integer value
ROUND(argument,round-off-unit)
rounds to the nearest round-off unit
TRUNC(number, length)
truncates a numeric value to a specified length
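
CEIL, FLOOR and INT differ only for non-integer values, and the difference is easiest to see with a negative argument. A minimal sketch with arbitrary inputs:

```sas
data _null_;
  x = -2.5;

  c = ceil(x);              /* smallest integer >= x: -2 */
  f = floor(x);             /* largest integer <= x: -3 */
  i = int(x);               /* drops the fractional part: -2 */

  r1 = round(2.347, 0.01);  /* nearest hundredth: 2.35 */
  r2 = round(23, 5);        /* nearest multiple of 5: 25 */

  put c= f= i= r1= r2=;
run;
```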

Variable Information Functions
GETVARC(data-set-id,var-num)
returns the value of a SAS data set character variable
GETVARN(data-set-id,var-num)
returns the value of a SAS data set numeric variable
VARFMT(data-set-id,var-num)
returns the format assigned to a SAS data set variable
VARINFMT(data-set-id,var-num)
returns the informat assigned to a SAS data set variable
VARLABEL(data-set-id,var-num)
returns the label assigned to a SAS data set variable
VARLEN(data-set-id,var-num)
returns the length of a SAS data set variable
VARNAME(data-set-id,var-num)
returns the name of a SAS data set variable
VARNUM(data-set-id,var-name)
returns the number of a SAS data set variable's position in a SAS data set
VARRAY(name)
returns a value that indicates whether the specified name is an array
VARRAYX(expression)
returns a value that indicates whether the value of the specified argument is an array
VARTYPE(data-set-id,var-num)
returns the data type of a SAS data set variable
VFORMAT(var)
returns the format associated with the given variable
VFORMATD(var)
returns the format decimal value associated with the given variable
VFORMATDX(expression)
returns the format decimal value associated with the value of the specified argument
VFORMATN(var)
returns the format name associated with the given variable
VFORMATNX(expression)
returns the format name associated with the value of the specified argument
VFORMATW(var)
returns the format width associated with the given variable
VFORMATWX(expression)
returns the format width associated with the value of the specified argument
VFORMATX(expression)
returns the format associated with the value of the specified argument
VINARRAY(var)
returns a value that indicates whether the given variable is a member of an array
VINARRAYX(expression)
returns a value that indicates whether the value of the specified argument is a member of an array
VINFORMAT(var)
returns the informat associated with the given variable
VINFORMATD(var)
returns the informat decimal value associated with the given variable
VINFORMATDX(expression)
returns the informat decimal value associated with the value of the specified argument
VINFORMATN(var)
returns the informat name associated with the given variable
VINFORMATNX(expression)
returns the informat name associated with the value of the specified argument
VINFORMATW(var)
returns the informat width associated with the given variable
VINFORMATWX(expression)
returns the informat width associated with the value of the specified argument
VINFORMATX(expression)
returns the informat associated with the value of the specified argument
VLABEL(var)
returns the label associated with the given variable
VLABELX(expression)
returns the variable label for the value of a specified argument
VLENGTH(var)
returns the compile-time (allocated) size of the given variable
VLENGTHX(expression)
returns the compile-time (allocated) size for the value of the specified argument
VNAME(var)
returns the name of the given variable
VNAMEX(expression)
validates the value of the specified argument as a variable name
VTYPE(var)
returns the type (character or numeric) of the given variable
VTYPEX(expression)
returns the type (character or numeric) for the value of the specified argument
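
A sketch of the V-functions, which describe variables of the DATA step that is currently running. The variables subject and age are hypothetical.

```sas
data _null_;
  length subject $8 nm $32 typ_age typ_subj $1;
  subject = 'A001';
  age     = 42;

  nm       = vname(age);        /* name of the variable, as a character value */
  typ_age  = vtype(age);        /* N for numeric */
  typ_subj = vtype(subject);    /* C for character */
  len_age  = vlength(age);      /* 8, the default length of a numeric variable */
  len_subj = vlength(subject);  /* 8, from the LENGTH statement */

  put nm= typ_age= typ_subj= len_age= len_subj=;
run;
```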

Missing values in SAS
Numeric missing values are represented by a single period (.).
Character missing values are represented by a single blank enclosed in quotes (' ').
Special numeric missing values are represented by a single period followed by a single letter or an underscore (for example .A, .S, .Z, ._).
Special missing values
These are only available for numeric variables and are used for distinguishing between different types of missing values.
Responses to a questionnaire, for example, could be missing for one of several reasons (refused, illness, dead, not home). By using special missing values, each of these can be tabulated separately, but the variables are still treated as missing by SAS in data analysis.
data survey;
  /* treat the letters A, I and R in the raw data as special missing values */
  missing A I R;
  input id q1;
  cards;
8401 2
8402 A
8403 1
8404 1
8405 2
8406 3
8407 A
8408 1
8409 R
8410 2
;
run;
/* label the special missing values that occur in the data */
proc format;
  value q1f
    .A='Not home'
    .R='Refused'
  ;
run;
/* MISSPRINT lists the missing categories without including them in the percentages */
proc freq data=survey;
  table q1 / missprint;
  format q1 q1f.;
run;
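Sort order for missing values
There is a serious logic error in the following code. A missing value sorts below every number, so a missing age satisfies age < 20 and is assigned agecat=1; the final branch is never reached.
if age < 20 then agecat=1;
else if age < 50 then agecat=2;
else if age ge 50 then agecat=3;
else if age=. then agecat=9;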
Sort order  Symbol  Description
smallest    ._      underscore
            .       period
            .A-.Z   special missing values A (smallest) through Z (largest)
            -n      negative numbers
            0       zero
largest     +n      positive numbers
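
The effect of the sort order on the faulty IF/ELSE chain can be checked with a small test data set. This is only a sketch; the data set name ages and the four records are made up. The record with a missing age ends up with agecat=1, not 9.

```sas
data ages;
  input id age;
  /* the faulty logic: a missing age compares as less than 20 */
  if age < 20 then agecat=1;
  else if age < 50 then agecat=2;
  else if age ge 50 then agecat=3;
  else if age=. then agecat=9;   /* never reached */
  datalines;
1 15
2 35
3 .
4 62
;
run;

proc print data=ages;
run;
```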
Working with missing values
When transforming or creating SAS variables, the first part of the code should deal with the case where variables are missing.
if age=. then agecat=.;
else if age < 20 then agecat=1;
else if age < 50 then agecat=2;
else agecat=3;
Note that if you use special missing values then 'if age=.' cannot be used and 'if age le .Z' must be used to identify missing values.
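A minimal sketch of the point about special missing values, using a hypothetical variable age set to .A. The MISSING function listed under the sample statistics functions is another way to make the test, since it is true for ordinary and special missing values alike.

```sas
data _null_;
  age = .A;   /* a special missing value */

  if age = . then put 'age = .     : true';
  else put 'age = .     : false';          /* this branch runs, .A is not equal to . */

  if age le .Z then put 'age le .Z   : true';    /* catches ._, . and .A-.Z */
  if missing(age) then put 'missing(age): true'; /* also catches every kind of missing */
run;
```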
Note that any arithmetic operation on a missing value returns a missing value. In the following example, the variable total will be missing if any one of q1-q6 is missing.
total=q1+q2+q3+q4+q5+q6;
An alternative is to use:
total=sum(of q1-q6);
which ignores missing values, so that they are effectively treated as zero; total is missing only if all of q1-q6 are missing.
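The difference between the + operator and the SUM function can be sketched with three made-up values. Adding a literal 0 to the argument list is a common way to get 0 instead of a missing result when every argument is missing.

```sas
data _null_;
  q1 = 1; q2 = .; q3 = 3;

  plus  = q1 + q2 + q3;    /* missing, because q2 is missing */
  total = sum(of q1-q3);   /* 4, the missing value is ignored */

  allmiss = sum(., .);     /* missing, every argument is missing */
  safe    = sum(0, ., .);  /* 0, the literal 0 guarantees a nonmissing result */

  put plus= total= allmiss= safe=;
run;
```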
Even if you think that a variable should not contain any missing values, you should always write your code under the assumption that there may be missing values.
