* Design and Analysis Macro Collection Version 1.29; %let dandaverzz=1.29; /* ************************************************************************************* COPYRIGHT See legal notices for UPower/UnifyPow in its coding section below (~line 4300). All other code is ************************************************************** * Copyright (C) 2002-2010 Arnold Saxton (asaxton@utk.edu) * * University of Tennessee, Knoxville TN 37996-4500 * * This program is free software, you can redistribute it * * and/or modify it under the terms of the GNU General * * Public License as published by the Free Software * * Foundation, either version 2 of the License, or * * (at your option) any later version. Basically all * * copies, modifications or derivative works must allow * * the user to freely use the software, to copy, modify * * and distribute, and must carry this same License for * * free use. Source code must be distributed, but * * distribution charges of any magnitude are permitted. * * * * This program is distributed in the hope that it will * * be useful, but WITHOUT ANY WARRANTY, without even the * * implied warranty of MERCHANTABILITY or FITNESS FOR A * * PARTICULAR PURPOSE. See the GNU General Public License * * for more details. * * A copy of the GNU General Public License can be obtained* * from Free Software Foundation, Inc., 59 Temple Place, * * Suite 330, Boston, MA 02111-1307 USA * * or http://www.gnu.ai.mit.edu/copyleft/gpl.txt. * ************************************************************** Program Modification History WCLEAR 02/2010 1.29 Added to collection. Just run %wclear and your log, output and graph windows will be cleared. MMAOV 02/2010 1.29 Only graphics versions of plots produced if possible. Fixed problem with multiple covariate models not producing means. Covariate name including a number was fixed. 09/2009 1.28 Fixed repeat= so correct subject is case insensitive Handle no means for some glimmix dists. 07/2009 1.27 Now exclude output from html to mimic listing. Influence diagnostics changed to non-iterative, so diag=noinf is not needed. However, not available when KR is used. 06/2009 Version 1.26 applied to all, allow rank transform and print means, fixed B*T plots with by=, allow cov as variable name. Added boxcox transform, put power in titles 07/2008 1.25 Allow @3 syntax, handle multiple lsmeans statements better. 06/2008 1.25 command=, multiple plot requests, sqrt backtrans did not use transvalue 02/2008 1.24 Major fix to back transformations, improved goptions on all macros, add noinf option to stop experimental diagnostics, improve b*t graphs. 04/2007 1.23 influence diagnostics changed to iterative, and print bad residuals. 03/2007 1.22 add GLIMMIX, very crude 02/2006 1.21 fix blkcomb, fix transform+block analysis 08/2006 1.20 fix kenward-rogers use with one error term 09/2006 1.19 add printmeans option to not print means (for large models) 02/2005 1.18 handle models with no fixed effects 12/2005 1.17 save and restore options, allow list of y vars REG 02/2010 1.29 Lost collinearity diagnostics restored. Only graphics versions of plots produced if possible. 06/2009 1.26 Argument checking improved, LOF models expanded. Added boxcox transformation, added power to titles. 06/2008 1.25 Added commands= 04/2007 1.23 Improved response surface + LOF Fix observation order in response surface diagnostics Allow multiple y variables to be listed - but univariate analysis only on each one Truncate Cp plots at 5*#xvars, removed stat clutter outside graph Added list of simplest models with acceptable Cp for selection=cp CP option always requested 04/2007 1.23 dummy regression option to specify x variable 02/2005 1.18 added transformations DESIGN 06/2009 1.26 Added CCD, intial required argument, fixed switchback with no treat= 01/2004 1.10 Added to collection ORTHPOLY 02.2010 1.29 Fixed numobs=. Used to give illegal contrast coefficients that were not orthogonal. 03/2003 1.04 Added to collection PDMIX800 02/2010 1.29 cleaned up Log printing of LSD values. 07/2009 1.26 LSD values calculated for each comparison, average is put in lsdvalavgzz and printed in the log. 06/2008 1.25 Fixed LSD assumption if adjp=., sorting of sliced, log values given more info, fixed complex slicing. 09/2006 1.19 printmeans option like MMAOV 08/2003 1.05 slice correction, handles groups with one mean 03/2002 1.02 error in by processing 10/2001 1.01 printing changed again, turned off log notes 06/2001 1.00 bug in slice and printing modified PDGLM 02/2010 1.29 Added to collection 11/2004 Updated to allow multiple dependent variables. Cleaned up the code to match GLM datasets (eg, no adjust info). 06/2002 Created by converting pdmix612 v1.6 to accept SAS Version 8 output. UPOWER 06/2009 1.26 Fixed trailing blanks in contrast label, fraction in 1st errordf value 02/2008 1.24 Fix contrasts so all are processed, error from previous change. Delete more unused stuff. 03/2007 1.22 Rewrote so statements are not written to external file. 01/2005 1.16 Created by modifying UnifyPow, deleting all but MU problems. ERRORDF statement added to change df for non-residual error terms. Changes are flagged by AMS comments. VARIOGRAM 09/2009 1.28 Error checking, better handling of lags and number of bins. 09/2006 1.19 Added to collection SIMPOWER 02/2009 1.28 Added to collection ******************************************************************************************; */ /* ============================================================================================== ==============================================================================================; **** Short Documentation %MMAOV macro **** Purpose: To simplify statistical analysis of ANOVA Mixed Models **** **** Run by adding these statements: **** %include 'c:\danda.sas'; **** %mmaov(dataset, yvar, class= , fixed= , random= , other specifications); **** **** dataset is the SAS dataset name containing data to be analyzed. **** yvar is a list of dependent variables. The same model will be run on all. **** Do not separate them with commas! **** class= list variables whose values represent classification. **** Do not list variables with quantitative values to be used in regression. **** fixed= specify fixed model terms, such as treatments, their interactions, covariates, etc. **** random= specify random model terms, such as blocks, error terms, samples, etc. **** Other options: **** repeat= to specify ONE repeated measures factor, which MUST be in the lowest sub-plot. **** contrasts=%str(your statements) to request tests of contrasts. **** contrast= command= commands= can be used instead of contrasts=. **** by= to have separate analyses for each level of the variable(s) specified. **** Special Options: **** at= (cov1 cov2)=(5 60) will request least squares means to be calculated at the specified covariate values. Covariate mean values are the default. **** type= to change the repeated measures correlation pattern, AR(1) by default. The default is ar(1), autoregressive. **** adjust= to use a mean separation other than LSD (default). **** ddfm= denominator degrees of freedom method. MMAOV switches to Satterthwaite approximate df for models with more than one error term, or if repeat is used. To prevent this, user can specify the desired method. Default is CONTAIN. **** plot= list a term with 1 or 2 treatment factors to plot treatment means. **** diagnostics=OFF to not produce any checks for normality, equal variance, etc. **** =NOINF to turn off just the influence diagnostics **** plottype= GPLOT by default produces graphics window output. If you do not have SAS/GRAPH, set **** to some other value. **** mixdiffs= NO by default turns off printing of pairwise comparisons and least squares means. **** Set to some other value if you need to see the P-values for each comparison. **** printmeans=YES by default, any other value will not print any mean information to Output. **** The following options are passed to pdmix800 macro, which is included in this file. **** sort= YES or NO (default) to have least square means sorted by value. **** alpha= .05 (default), the significance level used for mean separation and lsmeans tests. **** SLICE=variables Effects containing all the slice variables will **** be subdivided, and mean separation reporting done within **** slice levels. Note that all comparisons are made, just **** reporting of comparisons across slice levels is suppressed. **** This is useful to reduce the complexity of letter groupings. **** Test0=NO by default. Any other value will print tests of Ho: lsmean=0 and confidence interval. **** **** Transform options **** Formula used TransType= TransValue= **** 10**(x-yvar) from_pH x (This will backtransform values to the pH scale) **** sqrt(yvar+valuezz) sqrt x **** log(yvar+x) log x **** log10(yvar+x) log10 x **** arsin(sqrt(yvar/x)) arcsinsqrt x **** yvar**x power x (required) **** rank rank **** by default the transvalue is 0, except arsinsqrt has a default of 1; **** rank transform does ranking within levels of random class variables (assumed to be blocks) **** **** Example: %mmaov(one,yvar,class=A B rep, fixed=A|B X,random=rep(A*B), **** contrasts=%str(contrast 'A1 vs A2' A 1 -1;) ); **** would be used for a CRD factorial, with sampling and a **** covariate, and a specific contrast requested. **** **** Datasets created: WARNING...Will overwrite user's files of the same name. Remain available for use after **** running the macro. These may be useful for printing or further processing by the user. **** -meanzz contains the least squares mean output with mean separation letter groups. **** -diffzz contains all pairwise tests of treatment means, from the pdiff option on lsmeans; **** -solzz fixed effect parameter estimates, or solution vector. Contains covariate slopes. **** -aaazz **** -btizz **** -ssezz **** -residzz from the outp= option on the model statement. Contains predicted and residual values **** with random effects included in predicted. **** -residmzz from the outpm= option on the model statment. Same as residzz but random effects are **** not in predicted, so will be in residuals. **** -hovzz **** -mzz **** -tempzz working dataset **** -tempzzz working dataset **** If covariates are used, then also aov1zz, aov3zz, aov5zz are created, containing ANOVA tables **** for the diagnostic models that check for covariate*treatment interactions. **** -optionszz saves and restores users options **** **** Other output: Mean separation values, LSD value by default, are printed in **** the SAS Log. Min and max values are printed to help decide if the average value **** is appropriate to use in unbalanced situtations. ********************************************************************************************** ============================================================================================== ============================================================================================== **** Short Documentation: %REG Macro **** **** Use by adding these statements **** %include 'c:\danda.sas'; **** %reg(data set, dependent variable, list of independents); **** Optional settings **** opt= add a list of options to the model; **** by= do a separate analysis for each level of the by variable(s) **** deg= requests a polynomial regression of the given degree **** If deg=2 and multiple independents, a response surface is run. **** As written, polynomials are limited to the 8th power. **** lof= YES or NO should a lack of fit analysis be done. **** Yes requires replicated data. **** class= list variables that identify groups for dummy regression. **** command= or commands= %str(your statements). This allows other statements like contrasts **** to be requested. **** plottype = GPLOT by default. Set it to any other value to skip **** creation of graphics, useful if SAS/GRAPH is not installed. **** transtype= and transvalue= function like %MMAOV. No backtransformation is done. **** diagnostics=ON (default) prints diagnostic output, any other value suppresses this. **** indvar= identify the X variable (needed if X is not in the model). **** Datasets created (WARNING: existing datasets are erased!) **** tempzz, tempzzz, residzz, predzz, dsetzzz *******************************************************************; ============================================================================================== ============================================================================================== **** Quick Documentation %DESIGN MACRO **** Purpose: To generate randomized layouts of various Designs. **** %design(design, options) **** First argument specifies the design requested, can be blank. **** Choices are CRD, RBD, IBD, rIBD, CCD, MSLSD, LSD, XOVR, SWI **** OPTIONS: **** treat= number of treatment levels **** rep= number of replicates **** row= number of rows in Latin Square type designs **** col= number of column in Latin Square type designs **** popsize= total population size **** square= number of Latin Squares **** seed= random number seed. A starting value for the random number generation can **** be specified with seed=up to 9 digit integer. **** period= number of periods, eg. switchback **** iblock= number of incomplete blocks **** block= number of complete blocks **** factors= names of factors for central composite design (CCD) **** min= max= min and max values for CCD factors. Defaults are 0 and 1; **** blocksize= size of an incomplete block **** WARNING Datasets zzz and designzz used here will destroy existing data **** of the same name. DESIGNZZ will contain the final design. **** **** Usage: **** Run %design() to get a list of designs and options. **** Completely randomized designs **** - specify treat=t, rep=r **** to randomly assign t treatments to r reps each **** - OR specify popsize=p, rep=r **** to randomly samply r exp. units from a population of size p **** Randomized Block Design **** - specify block=b, treat=t **** to randomly assign t treatments within b blocks **** Optionally, add rep=r, and the treatments will be replicated **** r times within each block. **** Latin Squares **** - specify row=x **** to randomly create an x by x Latin square. **** Optionally add square=s to generate s Multiple squares. **** - specify row=x, col=c **** to randomly create a crossover design. c is rounded up to a **** multiple of x. **** - specify period=3+, treat= and col= to produce a switchback. **** **** Incomplete Block Designs **** Warning: These designs use Proc Optex in SAS/QC **** - specify iblock=, treat=, blocksize= for a general incomplete block. **** - additionally specify block= if a resolvable design is desired. **** **** Central Composite Design **** Basic request is %design(CCD, factors=X1 X2) **** **** Examples: **** Create 4 blocks of size 3 for 7 treatments; **** %design(iblock=4,treat=7,blocksize=3); **** Create 4 complete blocks, each with 4 blocks of **** size 4, for 16 treatments; **** %design(block=4,iblock=4,blocksize=4,treat=16); **** Create 4 complete blocks, each with 2 iblocks of size 3, **** for 7 treatments; **** %design(block=4,iblock=2,blocksize=3,treat=7); **** This fails, so should be run without blocks; **** Not all designs are resolvable; **** **** *********************************************************************************************; ============================================================================================== ============================================================================================== **** Short Documentation: %ORTHPOLY macro **** **** Purpose: Generate orthogonal polynomial contrasts. **** Usage: List the values of your treatment levels, and optionally the number of **** observations for each treatment. **** Example: %orthpoly(50 100 120, numobs=10 10 9); **** This SAS statement is for a 3 treatment experiment, with treatment levels **** having values of 50, 100, 120. For example, this might be kg of fertilizer. **** Additionally, the fact that each treatment will have 10 observations, except **** the last, is coded. **** Algorithm: Proc IML is used, which handles unequal spacing and unequal numbers of **** observations. **** **** **** *********************************************************************************************; ============================================================================================== ============================================================================================== **** Short Documentation: %PDMIX800 macro, for SAS Version 8 or 9 ORIGINAL REFERENCE: Saxton, A.M. 1998. A macro for converting mean separation output to letter groupings in Proc Mixed. In Proc. 23rd SAS Users Group Intl., SAS Institute, Cary, NC, pp1243-1246. PURPOSE: This macro takes two data sets from Proc MIXED (Version 8), created by the DIFFS option on the LSMEANS statement. If an ADJUST= option is used, the pdiffs from this are used, not the unadjusted defaults. The pdiffs are converted to groups, labeled by numbers, and this is merged onto the lsmeans data set. The numbers are converted to letters, and for cases where more than 26 letters are needed, sections of letters are coded. For example, 3 means might have the letters A, (2)A, and (3)A. These 3 means are all different, because although all have the letter A, each A belongs to a different section, identified by (#). CAUTIONS!!!!!!! Depends on computer using ASCII characters, with 32=blank and capital letters following this. Requires temporary SAS datasets MSGRPZZ, LSDVALZZ, PDTEMPZZ, PDTEMPZZZ, PDTEMPMZZ, so any existing SAS dataset with these names will be destroyed. There may be an IML limit of 90 total characters in the group letter labels, but space for 200 are hardcoded. Since SAS/IML is used, this must be installed on the computer, along with BASE and STAT. Parameters. -First required parameter must name a dataset created by ODS OUTPUT DIFFS in proc mixed -Second required parameter must name a dataset created by ODS OUTPUT LSMEANS in proc mixed -Optional parameters, given in any order, case insensitive. SORT=YES - printing of means is in order of least squares mean value. Any value other than YES leaves means in the proc mixed sort order. ALPHA=.05 - critical probability value for deciding if means differ or not. The default is .05, and values must be between 0 and 1. WORKSIZE=1 - number of Kb of memory for IML to use. This should only be needed in very extreme circumstances as IML dynamically increases memory as needed. Values over 1M are set to 1M. TEST0=YES - this requests that 3 variables (df, t, p) be included in the printing. Any value other than NO prints all variables produced by the lsmeans. MIXFMT=NO - this removes the formatting assigned by proc mixed, which helps compress the page width of the output. This also will result in the means and std. errors being rounded, which usually is desirable. Any value besides NO retains the proc mixed formatting. NUMLET=200 - This specifies maximum number of letters that will be permitted. Many means may possibly require many letters, but memory requirements get excessive. The default of 200 should fail only in unusual cases. If failure occurs (error message in log), rerun with this option set higher. SLICE=variables Effects containing all the slice variables will be subdivided, and mean separation reporting done within slice levels. Note that all comparisons are made, just reporting of comparisons across slice levels is suppressed. This is useful to reduce the complexity of letter groupings. printmeans=YES by default, any other value will not print any mean information to Output. Example of use. Assume the file pdmix800.sas, containing the macro code, is on the c: drive. Then the code below will run MIXED, and run pdmix800 on the lsmeans. MIXED is told not to print the means and pdiffs, using the ODS exclude statement, as pdmix800 does the printing in the more desirable format. Also shown are two optional parameters. proc mixed; class block a b; model y = a b a*b; random block; lsmeans a b a*b/pdiff; ods output diffs=ppp lsmeans=mmm; ods exclude diffs lsmeans; run; %include 'c:\danda.sas'; %pdmix800(ppp,mmm,alpha=.01,sort=yes); ******************************************************************** **** Short Documentation: %PDGLM PURPOSE: This macro takes two data sets from Proc GLM (Version 8 or higher), created by the ods DIFF and LSMEANS tables. If an ADJUST= option is used, the pdiffs from this are used. The pdiffs are converted to groups, labeled by numbers, and this is merged onto the lsmeans data set. The numbers are converted to letters, and for cases where more than 26 letters are needed, sections of letters are coded. For example, 3 means might have the letters A, (2)A, and (3)A. These 3 means are all different, because although all have the letter A, each A belongs to a different section, identified by (#). Cautions: Depends on computer using ASCII characters, with 32=blank and capital letters following this. Requires temporary SAS datasets MSGRPZZ and TEMPZZ, so any existing SAS datasets with these names will be destroyed. There may be an IML limit of 90 total characters in the group letter labels, but space for 200 are hardcoded. Since IML is used, this must be installed on the computer, along with BASE and STAT. Parameters: -First required parameter must name a dataset created by ODS OUTPUT DIFFMAT= in proc glm; -Second required parameter must name a dataset created by ODS OUTPUT LSMEANS= in proc glm; -Optional parameters, given in any order, case insensitive. SORT=YES - printing of means is in order of least square mean value. Any value other than YES leaves means in the proc mixed sort order. ALPHA=.05 - critical probability value for deciding if means differ or not. The default is .05, and values must be between 0 and 1. WORKSIZE=1 - number of Kb of memory for IML to use. This should only be needed in very extreme circumstances as IML dynamically increases memory as needed. TEST0=NO - this requests that 3 variables (df, t, p) not be included in the printing. Any value other than NO prints all variables produced by the lsmeans. MIXFMT=NO - this removes the formatting assigned by proc mixed, which helps compress the page width of the output. This also will result in the means and std. errors being rounded, which usually is desirable. Any value besides NO retains the proc mixed formatting. NUMLET=200 - This specifies maximum number of letters that will be permitted. Many means may possibly require many letters, but memory requirements get excessive. The default of 200 should fail only in unusual cases. If failure occurs (error message in log), rerun with this option set higher. Example of use. Assume the file danda.sas, containing the macro code, is on the c:\ drive. Then the code below will run GLM, and run pdglm on the lsmeans. GLM is told not to print the means and pdiffs, using the ods listing exclude statement, as pdglm does the printing in the more desirable format. Also shown are two optional parameters. proc glm; class block a b; model y = a b a*b; random block; lsmeans a b a*b /pdiff; ods listing exclude lsmeans diff; ods output diff=ppp lsmeans=mmm; run; %include 'c:\danda.sas'; %pdglm(ppp,mmm,alpha=.01,sort=yes); ***************************************************************** ============================================================================================== ============================================================================================== * Short Documentation %Variogram macro * helps fit spatial variograms to data; * When done, the model type, nugget, sill and range can be used in other spatial analyses * as known values. * * Usage: * Run in three stages. * 1. Run with %variogram(data,yvar, xcoor, ycoor); * This will produce two graphs, from which distance interval (lagdist) and * number of intervals (lagmax) can be chosen. * 2. Then %variogram(data,yvar, xcoor, ycoor,lagdist=, lagmax=); * will produce a variogram. From this, begin choosing a model. * In particular start choosing nugget, sill and range. * 3 models are shown, exponential, spherical and gaussian. * 3. Run %variogram(data,yvar, xcoor, ycoor,lagdist=, lagmax=,nugget=,sill=,range=); * as often as needed to adjust nugget, sill and range to fit the observed * variogram with one of the theoretical models. * * Other options: * model= Use this to choose one model. EXP, SPH and GAU are plotted by default. ****************************************************************************************; Short documentation %SimPower Macro Purpose: Calculates power by simulating data under Ha and counting how often Ha correctly rejected. Usage: %SimPower(dataset, yvar, class=, fixed=, random=, mu=, ntotal=, alpha=, varcomp=); */ * MACRO CODE BEGINS; %let printdebugzz=0; *=1 turns on printing of debugging info; *********************************************** *** All graphics options are set here. *** User can change or override these by submitting new statements AFTER %include; *** Note some symbol definitions must be defined using the %let syntax below, because *** symbols are reused for different macros/graphs identified by comments. *** Just copy the %let symb_zz=%str() to be changed to your program, and modify; goptions ftext=swiss hpos=35 vpos=35; legend1 frame; **symbols for %mmaov except covariate plots; %let symb1zz=%str( symbol1 ;symbol1 i=join v=square l=1; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis; symbol2 ;symbol2 i=join v=triangle l=2; ); **symbols for %mmaov B*T plots; %let symb11zz=%str( symbol1 ;symbol1 i=join v=square l=1; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5) ; **always horizontal axis; legend1 ; legend1 frame across=4 label=none shape=symbol(1,.75); symbol2 ;symbol2 i=join v=triangle l=2; ); **symbols for %mmaov covariate raw data plot ; %let symb2zz=%str( symbol1 ;symbol1 i=none v=square; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis; ); **definitions for %mmaov covariate adjustment plot; %let symb3zz=%nrstr( symbol1 ;symbol1 v=none i=rq; axis2 ;axis2 label=(h=1.5); **always horizontal axis; axis3 ;axis3 order=(&ymin to &ymax by &ystep) label=(a=90 "&transtype &yvarzz"); ); **%reg univariate diagnostic output; %let symb0zz=%str( symbol1 ;symbol2 ;symbol3 ;symbol4 ;symbol5 ;symbol6 ;); **%reg predict*observed plot; %let symb8zz=%str( symbol1 ; symbol1 i=rl v=dot c=black; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis;); **%reg residual plots; %let symb9zz=%str( symbol1 ; symbol1 i=none v=dot c=black; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis;); ** %reg one reg line with CI; %let symb4zz=%str( symbol1 ;symbol2 ;symbol3 ;symbol4 ;symbol5 ;symbol6 ; symbol1 i=spline v=none c=black; symbol2 i=spline v=none c=black l=1; symbol3 i=spline v=none c=black l=1; symbol4 i=spline v=none c=black l=2; symbol5 i=spline v=none c=black l=2; symbol6 i=none v=square c=black; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis; ); ** %reg dummy reg line with CI; %let symb5zz=%str( symbol1 ;symbol2 ;symbol3 ;symbol4 ;symbol5 ;symbol6 ; symbol1 i=spline v=square l=1; symbol2 i=spline v=dot l=1; symbol3 i=spline v=circle l=1; symbol4 i=spline v=diamond l=1; symbol5 i=spline v=triangle l=1; symbol6 i=spline v=square l=2; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis; ); ** %reg dummy regression predicted*observed (define symbol for each treatment); %let symb6zz=%str(symbol1 ;symbol2 ;symbol3 ;symbol4 ;symbol5 ;symbol6 ; symbol1 i=none v=square; symbol2 i=none v=dot l=1; symbol3 i=none v=circle l=1; symbol4 i=none v=diamond l=1; symbol5 i=none v=triangle l=1; symbol6 i=none v=square l=2; axis1 ;axis1 label=(a=90 h=1.5); **always vertical axis; axis2 ;axis2 label=(h=1.5); **always horizontal axis; ); ** definitions for variogram plot; %let symb7zz=%str( symbol1 ;symbol2 ;symbol3 ;symbol4 ;symbol5 ;symbol6 ; axis1 ;axis1 minor=none label=( c=black 'Lag Distance') ; axis2 ;axis2 minor=(number=1) label=(a=90 c=black 'Variogram') ; axis3 ; axis3 label=("Minimum Distance between observations in Bin"); axis4 ; axis4 label=(a=90); symbol1 i=join l=1 v='o' c=black w=3; symbol2 i=join l=1 v='r' c=black w=3; symbol3 i=join l=1 v=square c=blue ; symbol4 i=join l=1 v=diamond c=green ; symbol5 i=join l=1 v=circle c=red ; symbol6 i=join value='.' height=.5 cv=black ; **variogram raw data; ); ******wclear utility macro to clear user windows************************************************; %macro wclear(); %if %sysfunc(exist(work.gseg,"CATALOG"))>0 %then %do; dm graph 'end'; proc catalog cat=work.gseg kill; run;quit; %end; dm output 'clear'; *dm log 'clear'; %mend; ************************************************************************************ ************************************************************************************ ************************************************************************************ ************************************************************************************; %macro mmaov(dset,yvarlist,fixed=,class=,random= , repeat= ,by= , command= , commands= , contrasts= , contrast=, adjust= , type=ar(1), ddfm=, at=, sort=NO, alpha=.05, slice=,test0=NO, diagnostics=ON, mixdiffs=NO, printmeans=YES, transtype=,transvalue=0,plot=,plottype=gplot, dist=normal); %global yvarzz printdebugzz transtypezz; %local bystatzz zzzz lastbyzz adjustzz randind covind mclistzz sclistzz covariate cov2 cov2trt covtrt subjzz nblkfact blkmodel blistzz lastblkv nerrorterms nrterms byby numbyzz dumreg print1 nfterms fixedterms curcov lastfixv flistzz numfclass t1dfpzz t1dffzz t1dfdzz mmopt prnchk plotbt serror numtrtcomb numblkcomb ; %let transtypezz=&transtype; ***transtype is only used here, to get user value; %let transtypezzz=&transtype; **will be used for titles; ************ check for syntax errors that need immediate stop; data _null_; transtype=upcase("&transtypezz"); if transtype in('' 'RANK' 'LOG' 'SQRT' 'FROM_PH' 'LOG10' 'ARSINSQRT' 'BOXCOX' 'ARCSINSQRT' 'POWER') then error='N'; else do; error='Y'; put "ERROR: Unsupported transtype option, &transtypezz.."; put "Error: Choose one of RANK LOG SQRT FROM_PH LOG10 ARSINSQRT BOXCOX ARCSINSQRT POWER." ; end; call symput('serror',error); run; ************* check datasets; %if %length(&yvarlist)=0 %then %do; %let serror=Y; %put ERROR: Dependent variable(s) must be given.; %end; %if %length(&dset)=0 %then %do; %let serror=Y; %put ERROR: Dataset name must be given.; %end; %if %sysfunc(exist(&dset))=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not exist.; %end; %if &serror = Y %then %do; %put %nrstr( %MMAOV will not be run.); %goto abort2; %end; proc printto log=user.dsetzz; *** turn off log entries by dumping to temp file; run; proc optsave out=optionszz; ** save user option settings; run; options nonotes nomprint; data optionszz; set optionszz; if optname in ('NOTES' 'MPRINT'); ***only save options that are changed; run; proc printto ; run; %if %length(&by) >0 %then %do; %let byby= &by; proc sort data=&dset; by &byby; %let bystatzz= by &byby ; data _null_; call symput('lastbyzz', scan("&byby",-1) ); run; %end; %else %let bystatzz=;; ***give options correct syntax; %if %length(&adjust) >0 %then %let adjustzz= adjust=&adjust; %else %let adjustzz=;; %if %length(&at) >0 %then %let at= at &at;; *************parse arguments for needed components********; data _null_; ***first clean up and get essential counts; length classvar xx2 xx3 $ 1024; classvar=translate( "&class" ,' ','()|*'); classvar=compbl(classvar); **may need to scan for unique vars; call symput('class',trim(classvar)); *** fix syntax of fixed model to create terms with no spaces; xx2=compbl("&fixed"); xx2=tranwrd(xx2,' *','*'); xx2=tranwrd(xx2,'* ','*'); xx2=tranwrd(xx2,' )',')'); xx2=tranwrd(xx2,'( ','('); xx2=tranwrd(xx2,' (','('); xx2=tranwrd(xx2,' |','|'); xx2=tranwrd(xx2,'| ','|'); ***translate a(b c) to a(b*c); ll=length(xx2); nested=0; do ii=1 to ll; char=substr(xx2,ii,1); if char='(' then nested+1; if char=')' then nested=nested-1; if nested>0 and char=' ' then substr(xx2,ii,1)='*'; end; nfterms=1; xx3=scan(xx2,nfterms,' '); do while (xx3 ne ''); nfterms+1; xx3=scan(xx2,nfterms,' '); end; call symput('nfterms',compress(nfterms-1)); call symput('fixed',compbl(xx2)); *** process random model to give terms with no spaces; xx2=compbl("&random"); xx2=tranwrd(xx2,' *','*'); xx2=tranwrd(xx2,'* ','*'); xx2=tranwrd(xx2,' )',')'); xx2=tranwrd(xx2,'( ','('); xx2=tranwrd(xx2,' (','('); xx2=tranwrd(xx2,' |','|'); xx2=tranwrd(xx2,'| ','|'); ***translate a(b c) to a(b*c); ll=length(xx2); nested=0; do ii=1 to ll; char=substr(xx2,ii,1); if char='(' then nested+1; if char=')' then nested=nested-1; if nested>0 and char=' ' then substr(xx2,ii,1)='*'; end; nrterms=1; xx3=scan(xx2,nrterms,' '); do while (xx3 ne ''); nrterms+1; xx3=scan(xx2,nrterms,' '); end; call symput('nrterms',compress(nrterms-1)); call symput('random',compbl(xx2)); run; %let fixednway=; data _null_; length xxfix xxr xx1 xx2 xx3 xx4 $ 2048; classvar=compbl("&class"); if length("&repeat")>0 then haverepeat=1; ****Need a list of class variables that are fixed factors; ****Take each class variable and see if it is in fixed model; ****Assumes convention that no random factor will be in fixed model; xxf=compbl("&fixed"); ii=1; xx2=scan(classvar,ii); xxr=''; xxfix=' '; kk=0; do while (xx2 ne ''); if haverepeat then do; **At the same time, get list of all class variables without the repeat var; **note this means user must not list extraneous class variables; if upcase(trim(xx2))ne upcase("&repeat") then xxr=compress(xxr||'*'||trim(xx2)); end; if index(upcase(xxf),upcase(trim(xx2)) ) ne 0 then do; **class variable is in fixed model; xxfix=compbl(xxfix||' '||trim(xx2)); kk+1; end; ii=ii+1; xx2=scan(classvar,ii); end; call symput('subjzz',substr(xxr,2));**strip off leading *; xxfix=trim(left(xxfix)); call symput('flistzz',compbl(xxfix)); **list of fixed class vars; call symput('numfclass',compress(kk)); ** number in flistzz; call symput('lastfixv',scan(xxfix,-1) ); xx=translate(trim(xxfix),'*',' '); ** interact all fixed class; if xx='*' then xx=''; ** in case there are no fixed effects; call symput('fixednway',compress(xx)); *** work with each model term to find covariates; **identify covariates as single terms not in class; length fc covtrt mclist sclist cov2 cov2trt xxcov $ 512; mclist=' ';sclist=' '; fc=' '; xxcov=' '; ii=1; xx=scan(xxf,ii,' '); *get fixed term including all components; xxcov=' '; jj=0; do while (xx ne ''); **take each factor and see if in class list; kk=1; xx1=scan(xx,kk,' *()|@'); pureclass=1; someclass=0; do while (xx1 ne ''); **look at all factors; if anyalpha(xx1)=1 then do; ** if first char is letter, check. Otherwise is a number, from @3 syntax, skip; if index(upcase(classvar),upcase(trim(xx1)) )=0 then do; **xx1 factor not in class; pureclass=0; if index(upcase(xxcov),upcase(trim(xx1)) )=0 then do; **if not in cov list, add; jj=jj+1; xxcov=compbl(xxcov||' '||xx1); mclist=compbl(mclist||' '||compress('mean_x'||trim(jj))) ; sclist=compbl(sclist||' '||compress('stddev_x'||trim(jj))); end; end; else someclass=1; end; kk=kk+1; xx1=scan(xx,kk,' *()|@'); end; if pureclass=1 then do; **collect all pure fixed terms for lsmeans; fc=compbl(fc||xx); end; if pureclass=0 and someclass=1 then dumreg=1; ****have terms like trt*x; ii=ii+1; xx=scan(xxf,ii,' '); end; call symput('fixedterms',trim(fc)); call symput ('covariate',trim(xxcov)); call symput ('covind',compress(jj)); call symput ('mclistzz',trim(mclist)); call symput ('sclistzz',trim(sclist)); if dumreg ne 1 then do; ***create model terms for checking covariate assumptions; fc=left(fc); xx1=translate(trim(fc),'|',' '); cov2=' ';cov2trt=' '; covtrt=' '; ii=1; xx=scan(xxcov,ii); do while (xx ne ''); cov2=compbl(cov2||' '||compress(xx||'*'||xx)); if xx1='|' then cov2trt=compbl(cov2trt||' '|| compress(xx||'*'||xx)); else cov2trt=compbl(cov2trt||' '|| compress(xx||'*'||xx) ||'|'||xx1); if xx1='|' then covtrt=compbl(covtrt||' '||xx); else covtrt=compbl(covtrt||' '||xx||'|'||xx1); ii=ii+1; xx=scan(xxcov,ii); end; call symput ('cov2',trim(cov2)); call symput ('cov2trt',trim(cov2trt)); call symput ('covtrt',trim(covtrt)); end; else call symput('dumreg','yes'); **** use containment method to see how many error terms are in model ***; ** take each factor from fixed, and scan random. Record if present. ** Number of error terms is number of unique combinations. ** If a fixed factor is not in any random terms, add one for residual; ** Only need to decide if there is one error term, or more; fixfactcount=0; if &nrterms>0 then do; length e1-e&nrterms $ 256; array eee e1-e&nrterms ; do kk=1 to &nrterms; eee[kk]=' '; end; xx3=compbl("&random"); xx2=compbl(xxfix); **all unique fixed factors, but no covariates; useresid=0; ii=1; xx1=scan(xx2,ii); do while (xx1 ne ''); **for each fixed factor; fixfactcount=0; do kk=1 to &nrterms; ** and each random term; xx4=scan(xx3,kk,' '); if index(upcase(xx4), upcase(trim(xx1)) ) ne 0 then do; eee[kk]=compress(eee[kk]||ii||'-'); **flag term as containing fixed effect; fixfactcount=1; end;end; if fixfactcount=0 then useresid=1; **residual error term for this fixed effect; ii+1; xx1=scan(xx2,ii); end; end; else useresid=1; numerrterms=0; do kk=1 to &nrterms; if eee[kk] ne '' then do; do ii= (kk+1) to &nrterms; if eee[kk]=eee[ii] then eee[ii]=''; end; numerrterms+1; end;end; numerrterms=numerrterms+useresid; call symput('nerrorterms',compress(numerrterms)); ** handle random effects, see if any blocking factors *****; if &nrterms=0 then call symput ('randind','0'); else do; call symput ('randind','1'); xx2=symget('random'); length blockvars $ 512; blockvars=''; do ii=1 to &nrterms; **for all random terms; xx3=scan(xx2,ii,' '); isblock=0; jj=1; xx4=scan(xxf,jj); do while (xx4 ne ''); * if any terms are fixed then not a block; if index(upcase(xx3), upcase(trim(xx4)) ) ne 0 then isblock=1; jj=jj+1; xx4=scan(xxf,jj); end; if isblock=0 then do; ** and all random factors in term must be in class to be a block; jj=1; xx4=scan(xx3,jj); inclass=1; do while (xx4 ne ''); if index(upcase(classvar), upcase(trim(xx4)) )=0 then inclass=0; jj=jj+1; xx4=scan(xx3,jj); end; if inclass=0 then isblock=1; end; if isblock=0 then blockvars=compbl(blockvars||trim(xx3)); end; call symput('blkmodel',blockvars); ****get list of unique block factors; blockvars=compbl(translate(blockvars,' ','*()|')); xx2=scan(blockvars,1); jj=1; ii=2; xx3=scan(blockvars,ii); do while (xx3 ne ''); if index(upcase(xx2), upcase(trim(xx3)) )=0 then do; jj+1; xx2=compbl(xx2||' '||xx3); end; ii=ii+1; xx3=scan(blockvars,ii); end; call symput('blistzz',compbl(xx2)); call symput('lastblkv',compress(scan(xx2,-1)) ); call symput('nblkfact',compress(jj)); end; run; %if %length(&ddfm)=0 %then %do; %if %length(&repeat)>0 | &nerrorterms>1 %then %let ddfm=kr; %else %let ddfm=contain; ; %end; *** used for debugging; %if &printdebugzz=1 %then %do; %put by processing (byby)=<&byby> (bystatzz)=<&bystatzz>; %put fixednway= <&fixednway>; %put user class list= <&class>; %put user fixed model= <&fixed>; %put user random model= <&random> ; %put fixed model pure class effects= <&fixedterms>; **used by lsmeans statement; %put fixed class list (flistzz)= <&flistzz> number=&numfclass; **class list of just the fixed factors; %put last in fixed class (lastfixv)= <&lastfixv>; %put dumreg=&dumreg covind=&covind covariates= <&covariate>; %put covariate mean name (mclistzz)= <&mclistzz>; %put covariate se name (sclistzz)= <&sclistzz>; %put cov diag model (covtrt)= <&covtrt>; %put cov diag model (cov2trt)= <&cov2trt>; %put Any random? (randind)=&randind (nrterms)= <&nrterms> ; %put block list (blistzz)= <&blistzz> numblock factors (nblkfact)= <&nblkfact>; %put subject (subjzz)= <&subjzz>; %put dummy reg flag (dumreg)= <&dumreg>; %put Number of error terms (nerrorterms)= <&nerrorterms>; %end; ***** run entire macro on each dep variable*****; %let yvarnum=1; %let yvarzz=%scan(&yvarlist,&yvarnum); %do %until (&yvarzz=); **if this dep var analysis fails, make sure old datasets are not used; proc datasets nodetails nolist; delete meanzz diffzz residzz residmzz hovzz mzz; run;quit; ***** users dataset copied to dsetzz, only dsetzz used below; data dsetzz; set &dset; OrigObsNumzz=_n_; **save original observation order; run; ******* rank transformation***********; %if %upcase(&transtypezz)=RANK %then %do; %if %length(&blistzz)>0 %then %do; proc sort data=dsetzz; by &blistzz; run; proc rank data=dsetzz out=dsetzz; by &blistzz; var &yvarzz; ranks rank_&yvarzz; run; proc sort data=dsetzz; by OrigObsNumzz; run; *original data order; %end; %else %do; proc rank data=dsetzz out=dsetzz; var &yvarzz; ranks rank_&yvarzz; run; %end; %let t_ = rank_; %end; %else %do; ********other transforms*********; %if %length(&transtypezz) > 0 %then %do; %let t_=tzz; %if %upcase(&transtypezz)=BOXCOX %then %do; ods exclude all; proc mixed data=dsetzz ; &bystatzz; class &class ; model &yvarzz = &fixed /outp=residzz outpm=residmzz ddfm=&ddfm ; %if &randind=1 %then random &random;; %if %length(&repeat)>0 %then repeated &repeat / subject=&subjzz type=&type;; &contrasts &contrast &command &commands run; proc means noprint data=residzz; var resid; output out=tempzz min=minresid; run; data tempzz; set tempzz; zzzz=1; run; data residzz; set residzz; zzzz=1; run; data residzz; merge residzz tempzz; by zzzz; resid=resid-minresid+1; drop _type_; run; ods listing; proc transreg data=residzz details pbo ; title2 'Box-Cox search for optimal power transformation'; ods output boxcox = tempzz ; %if &transvalue=0 %then model boxcox(resid / convenient lambda=-3 to 3 by 0.1) = class(&fixedterms); %else model boxcox(resid / convenient lambda=&transvalue) = class(&fixedterms); ; ods exclude details; run; options notes; data tempzz; set tempzz end=aa; format _all_; retain flagzz 0; if CI='*' and flagzz=0 then do; flagzz=1; **found lower CL; call symput("minzz",lambda); end; if flagzz=1 and CI='' then do; flagzz=2; **found upper CL; call symput("maxzz",lambda); end; if CI='<' then call symput("optzz",lambda); if aa and flagzz ne 2 then call symput("maxzz",lambda);**if CI goes to max; run; proc gplot; &symb1zz; plot loglike*lambda/href=&minzz &optzz &maxzz vaxis=axis1 haxis=axis2; run;quit; %let check4=0; data tempzz; set tempzz; where CI='<'; if _n_>0 then do; call symput('check4',4); call symput('optpow',lambda); end; run; %if &check4=0 %then %do; %put Optimal transformation not found. Rerun with transvalue giving wider search range.; %goto abort3; %end; ***set to optimum transformation found; %let transtypezz=POWER; %let transvalue= &optpow; %end; %let transtypezzz = &transtypezz; ***for title labels; data dsetzz; set dsetzz; trtypezz=upcase("&transtypezz"); valuezz=1*&transvalue; drop trtypezz valuezz; if trtypezz='LOG' then do; tzz&yvarzz=log(&yvarzz+valuezz); end; if trtypezz='SQRT' then do; tzz&yvarzz=sqrt(&yvarzz+valuezz); end; if trtypezz='FROM_PH' then do; tzz&yvarzz=10**(valuezz-&yvarzz); end; if trtypezz='LOG10' then do; tzz&yvarzz=log10(&yvarzz+valuezz); end; if trtypezz in('ARSINSQRT' 'ARCSINSQRT') then do; if valuezz=0 then tzz&yvarzz=arsin(sqrt(&yvarzz)); else tzz&yvarzz=arsin(sqrt(&yvarzz/valuezz)); end; if trtypezz='POWER' then do; tzz&yvarzz=&yvarzz**valuezz; call symput("transtypezzz", compress("Power("||valuezz||")")); end; run; %end; %else %let t_=;; %end; **transformation; *********** BEGIN ANALYSIS **************************************; %if %length(&blistzz)>0 and %length(&fixednway)>0 and %upcase(&diagnostics)^=OFF %then %do; ***********check on block*treatment interactions********; ods exclude all; proc sort data=dsetzz out=tempzz; by &byby &blistzz &flistzz; proc means noprint; by &byby &blistzz &flistzz; var &t_.&yvarzz; output out=btizz mean=&t_.&yvarzz; run; proc glm data=btizz; &bystatzz; class &blistzz &flistzz; model &t_.&yvarzz = &blkmodel &fixednway; output out=tempzz r=resid p=pred; ods output overallanova=aaazz; run; data ssezz; set aaazz; if source='Error'; keep df sse &byby; sse=ss; run; data tempzz; set tempzz; sijk=pred*pred; run; proc glm; &bystatzz; class &blistzz &flistzz; model sijk = &blkmodel &fixednway; output out=tempzzz r=resids p=preds; ods output overallanova=aaaazz; run; data tempzz; merge tempzz tempzzz; by &byby &blistzz &flistzz; drop _type_ _freq_; run; proc reg outsscp=tempzzz; &bystatzz; model resids=resid; run; data tempzz; set tempzzz; if _name_ ='resids'; ssab=resid**2/resids; run; data tempzz; merge tempzz ssezz; &bystatzz; DDF=df-1; TukeyF = ssab/( (sse-ssab)/(ddf) ); TukeyProb = 1-probf(tukeyf,1,ddf); call symput('t1dfpzz',compress(round(tukeyprob,.0001))); call symput('t1dfdzz',compress(ddf)); call symput('t1dffzz',compress(round(tukeyf,.001))); run; ods exclude none; %if %length(&byby)>0 %then %do; proc print data=tempzz; var &byby TukeyF DDF TukeyProb; title2 "Tukey Single DF Test results for &transtypezzz &yvarzz"; run; %end; ******* create trt codes, dropping missing levels; proc sort data=btizz; by &flistzz; run; data btizz; set btizz; by &flistzz; retain trtcomb 0; drop _type_ ; ***check all fixed= vars for missing, indicate by nodata=1; nodata=0; %let ii=1; %let tempvarzz =%scan(&flistzz,&ii); %do %until (&tempvarzz=); if &tempvarzz ='' then nodata=1; %let ii=%eval(&ii+1); %let tempvarzz=%scan(&flistzz,&ii); %end; ***ii loop; if nodata=0 then do; **if not missing, record level; if first.&lastfixv then trtcomb=trtcomb+1; end; if last.&lastfixv then call symput('numtrtcomb',compress(trtcomb)); run; ****create codes for nonmissing block combination levels; %let prnchk=0; **indicates if only 1 obs/treat/block; proc sort data=btizz; by &blistzz; run; data btizz; set btizz; by &blistzz; retain blkcomb 0; retain docheck 1; drop docheck nodata ; nodata=0; **check block list to see if any are missing; %let ii=1; %let tempvarzz =%scan(&blistzz,&ii); %do %until (&tempvarzz=); if &tempvarzz ='' then nodata=1; %let ii=%eval(&ii+1); %let tempvarzz=%scan(&blistzz,&ii); %end; ***ii loop; if nodata=0 then do; if first.&lastblkv then blkcomb+1; end; if docheck=1 then do; if _freq_>2 then do; docheck=0; call symput('prnchk','1'); end; end; if last.&lastblkv then call symput('numblkcomb',compress(blkcomb)); run; ******* set which variables to use for B*T plots; %if &nblkfact>1 %then %let plotblkcomb=blkcomb; %else %let plotblkcomb=&lastblkv;; %if &numfclass>1 %then %let plottrtcomb=trtcomb; %else %let plottrtcomb=&lastfixv;; proc sort data=btizz; by &byby &blistzz ; run; %if %upcase(&plottype) ^=GPLOT %then %do; proc plot data=btizz vpercent=50;; &bystatzz; title2 "Check on block*treat interactions for &transtypezzz &yvarzz"; %if %length(&byby)=0 %then %do; title3 "Tukey Single DF Test: F=&t1dffzz on 1,&t1dfdzz DF (P=&t1dfpzz)"; %end; plot &t_.&yvarzz * &plottrtcomb = &plotblkcomb ; plot &t_.&yvarzz * &plotblkcomb = &plottrtcomb ; run; %if &prnchk=1 or &numblkcomb>22 or &numtrtcomb>22 %then %do; proc print data=btizz; title2 "Codes and &transtypezzz &yvarzz mean data for block treatment interaction plots"; run; %end; %end; %if %upcase(&plottype)=GPLOT %then %do; ***graphics version of B*T plots; proc sort data=btizz; by &byby blkcomb; proc gplot data=btizz ; &bystatzz; &symb11zz; title2 "Check on block*treat interactions for &transtypezzz &yvarzz"; %if %length(&byby)=0 %then %do; title3 "Tukey Single DF Test: F=&t1dffzz on 1,&t1dfdzz DF (P=&t1dfpzz)"; %end; plot &t_.&yvarzz* &plotblkcomb = &plottrtcomb /vaxis=axis1 haxis=axis2 %if &numtrtcomb<23 %then legend=legend1 ;%else nolegend;; run; proc sort data=btizz; by &byby trtcomb; proc gplot data=btizz ; &bystatzz; &symb11zz; title2 "Check on block*treat interactions for &transtypezzz &yvarzz"; %if %length(&byby)=0 %then %do; title3 "Tukey Single DF Test: F=&t1dffzz on 1,&t1dfdzz DF (P=&t1dfpzz)"; %end; plot &t_.&yvarzz*&plottrtcomb = &plotblkcomb/vaxis=axis1 haxis=axis2 %if &numblkcomb<23 %then legend=legend1 ; %else nolegend;; run;quit; %end; %end; ***************************B*T interaction; *****************************************************; *****Handle covariate checking***********************; %if &covind > 0 and "&flistzz" ^=" " and %upcase(&diagnostics)^=OFF %then %do; %if %upcase(&plottype)^= GPLOT %then %do; proc plot data=dsetzz; &bystatzz; title2 "Raw Data Graphs for &transtypezzz &yvarzz and covariates"; %if &numfclass=1 %then plot &t_.&yvarzz*(&covariate)=&flistzz; %else plot &t_.&yvarzz*(&covariate); ; ods listing; run; %end; %else %do; proc gplot data=dsetzz; &bystatzz; &symb2zz; title2 "Raw Data Graphs for &transtypezzz &yvarzz and covariates"; %if &numfclass=1 %then plot &t_.&yvarzz*(&covariate)=&flistzz/vaxis=axis1 haxis=axis2 legend=legend1; %else plot &t_.&yvarzz*(&covariate)/vaxis=axis1 haxis=axis2 ; ; ods listing; run;quit; %end; %if &dumreg ^= yes %then %do; proc mixed data=dsetzz ; &bystatzz; title2 "Test for quadratic interactions - &transtypezzz &yvarzz"; class &class ; model &t_.&yvarzz = &fixed &covtrt &cov2 &cov2trt /htype=1; %if &randind=1 %then random &random;; %if %length(&repeat)>0 %then repeated &repeat / subject=&subjzz type=&type;; ods exclude all; ods output tests1=aov1zz; run; proc mixed data=dsetzz ; &bystatzz; title2 "Test for linear interactions - &transtypezzz &yvarzz"; class &class ; model &t_.&yvarzz = &fixed &cov2 &covtrt /htype=1; %if &randind=1 %then random &random;; %if %length(&repeat)>0 %then repeated &repeat / subject=&subjzz type=&type;; ods exclude all; ods output tests1=aov3zz; run; proc mixed data=dsetzz; &bystatzz; title2 "Test for quadratic covariate - &transtypezzz &yvarzz"; class &class ; model &t_.&yvarzz = &fixed &cov2 /htype=1; %if &randind=1 %then random &random;; %if %length(&repeat)>0 %then repeated &repeat / subject=&subjzz type=&type;; ods exclude all; ods output tests1=aov5zz; run; data aov1zz; set aov1zz; test='1:Quadratic Interaction'; data aov3zz; set aov3zz; test='2:Linear Interaction'; data aov5zz; set aov5zz; test='3:Quadratic Covariate'; data aovzz; set aov1zz aov3zz aov5zz; proc print; by test notsorted; ods listing ; title2 "Testing Standard Covariate Model Assumptions for &transtypezzz &yvarzz"; run; %end; %end; ****************************************************************; options notes mprint; %if %upcase(&dist)=NORMAL %then %do; proc mixed data=dsetzz covtest; &bystatzz; title2 "Mixed ANOVA for &transtypezzz &yvarzz"; class &class ; model &t_.&yvarzz = &fixed /outp=residzz outpm=residmzz solution ddfm=&ddfm residual %if %upcase(&diagnostics)^=NOINF & %upcase(&diagnostics)^=OFF & %upcase(&ddfm)^=KR & %upcase(&ddfm)^=KENWARDROGERS %then influence;; %if &randind=1 %then random &random;; %if %length(&repeat)>0 %then repeated &repeat / subject=&subjzz type=&type;; &contrasts &contrast &command &commands %if %length(&fixedterms)>0 %then %do; lsmeans &fixedterms / pdiff &adjustzz &at cl alpha=&alpha ; ods output diffs=diffzz lsmeans=meanzz; %if %upcase(&mixdiffs) = NO %then ods exclude diffs lsmeans;; %end; ods exclude solutionf; ods output solutionf=solzz; run; options nomprint; options nonotes; %if %length(&fixedterms)>0 and &test0=NO %then %do; **these vars are added with alpha or cl option; data meanzz; set meanzz; drop alpha upper lower; run; %end; %end; %else %do; proc glimmix data=dsetzz ; &bystatzz; title2 "GLIMMix ANOVA for &yvarzz"; class &class ; model &yvarzz = &fixed / solution ddfm=&ddfm distribution=&dist; %if &randind=1 %then random &random;; %if %length(&repeat)>0 %then random &repeat/residual subject=&subjzz type=&type;; &command; &commands; &contrasts; &contrast; %if %length(&fixedterms)>0 and (%upcase(&dist)=POISSON or %upcase(&dist)=BINOMIAL ) %then %do; lsmeans &fixedterms / pdiff &adjustzz &at ilink; ods output diffs=diffzz lsmeans=meanzz; %if %upcase(&mixdiffs) = NO %then ods exclude diffs lsmeans;; %end; output out=residzz residual; ods exclude ParameterEstimates; ods output ParameterEstimates=solzz; run; %end; options nomprint; options nonotes; ***if residzz does not exist, model did not run, so abort; %let chk4=0; data _null_; set residzz; if _n_>0 then do; call symput("chk4","3"); stop; end; run; %if &chk4>0 %then %do; *****process solution dataset to print regression slopes*********; %if &covind > 0 %then %do; data solzz; set solzz; *** print only terms with reg variables; drop xx xx1 test kk hasreg; length xx xx1 $ 200; xx=upcase(trim("&covariate")); test=upcase(effect); kk=1; xx1=scan(xx,kk); hasreg=0; do while (xx1 ne '' and hasreg=0); if index(test,trim(xx1)) ne 0 then hasreg=1; kk=kk+1; xx1=scan(xx,kk); end; if hasreg=0 or test='INTERCEPT' then delete; run; title2 "Covariate &covariate slope estimates for &transtypezzz &yvarzz"; proc print data=solzz; run; %end; %if %length(&fixednway)>0 and (%upcase(&dist)=POISSON or %upcase(&dist)=BINOMIAL or %upcase(&dist)=NORMAL ) %then %do; title2 "Mean separation for &transtypezzz &yvarzz"; ****if transformed, print means later, not here; %let print1=&printmeans; %if %length(&transtypezz)>0 %then %let print1=NO; %pdmix800(diffzz,meanzz,sort=&sort,alpha=&alpha, slice=&slice, test0=&test0, printmeans=&print1); *********if transformed, provide back transformed *******************; %if %upcase(&transtypezz)^= RANK & %length(&transtypezz) > 0 %then %do; data msgrpzz; set msgrpzz; value=1*&transvalue; trtypezz=upcase("&transtypezz"); drop temp value trtypezz; if trtypezz='SQRT' then do; btestimate=estimate*estimate-value; btstderr = 2*stderr*estimate; end; if trtypezz='FROM_PH' then do; btstderr = (log10(exp(1))/estimate)*stderr; btestimate=-log10(estimate) + value; end; if trtypezz='LOG' then do; btestimate=exp(estimate) - value; btstderr = stderr*(btestimate+value); end; if trtypezz='LOG10' then do; btestimate=10**estimate - value; btstderr = stderr*(log(10)*(btestimate+value)); end; if trtypezz in('ARSINSQRT' 'ARCSINSQRT') then do; temp=sin(estimate)**2; if value=0 then do; btestimate=temp; btstderr=stderr*sqrt(1-temp)*sqrt(btestimate)*2; end; else do; btestimate=value * temp; btstderr=stderr*sqrt(1-temp)*sqrt(temp)*2*value; end; end; if trtypezz='POWER' then do; if value=0 then do; btestimate=.; btstderr=.; end; else do; btestimate=estimate**(1/value); btstderr = stderr* (1/abs(value))* (estimate**((1-value)/value) ) ; end; end; run; %if &printmeans=YES %then %do; proc print data=msgrpzz label ; by effect adjustment bygroup notsorted; label bygroup=' Set' adjustment=' Method'; title2 "Back-transformed (bt) Mean Separation for &transtypezzz &yvarzz"; run; %end; %let plotbt=btestimate; **bt means can be plotted; %end; %end; %if %upcase(&transtypezz) = RANK %then %do; **for rank transformation, just print mean of ranks; %if &printmeans=YES %then %do; proc print data=msgrpzz label ; by effect adjustment bygroup notsorted; label bygroup=' Set' adjustment=' Method'; title2 "Mean Separation for &transtypezzz &yvarzz"; run; %end; %end; ***************************************************************; %if %length(&plot) >0 %then %do; ************* plot 1 or 2 factors if requested ****************; *** convert &plot to single words for each request; data _null_; length plot $ 2048; plot=compbl("&plot"); plot=tranwrd(plot," *","*"); plot=tranwrd(plot,"* ","*"); call symput("plot",plot ); run; %let plotnum=1; %let plotzz=%scan(&plot,&plotnum,%str( ) ); %do %while (%length(&plotzz)>0); **** loop through all plot requests; %let trta=%upcase(%scan(&plotzz,1)); %let trtb=%upcase(%scan(&plotzz,2)); %let dsid=%sysfunc(open(dsetzz,I)); **get original variable types; %let idzz=%sysfunc(varnum(&dsid,&trta)); %let typeazz=%sysfunc(vartype(&dsid,&idzz)); %if %length(&trtb)>0 %then %do; %let idzz=%sysfunc(varnum(&dsid,&trtb)); %let typebzz=%sysfunc(vartype(&dsid,&idzz)); %end; %let dsid=%sysfunc(close(&dsid)); data tempzz; set msgrpzz; format estimate; %if %length(&trtb)=0 %then %do; if upcase(EFFECT) = "&trta"; xaxis=&trta; label xaxis="&trta"; %end; %else %do; if upcase(EFFECT) in( "&trta*&trtb" "&trtb*&trta" "&trtb(&trta)" "&trta(&trtb)" ); **depending on numeric or character, decide which to put on x-axis; %if &typeazz=N and &typebzz=C %then %do; xaxis=1*&trta ; label xaxis="&trta"; %let factzz=&trtb; %end; %else %do; %if &typebzz=N %then xaxis=1*&trtb; %else xaxis=&trtb; ; label xaxis="&trtb"; %let factzz=&trta; %end; %end; run; title2 "Plot of &plotzz least squares means for &transtypezzz &yvarzz"; %if %length(&trtb)=0 %then %do; *** just a single variable, give histogram; %if %upcase(&plottype) ^=GPLOT %then %do; proc chart; &bystatzz; vbar (estimate &plotbt)*xaxis; %end; %else %do; proc Gchart; &bystatzz; &symb1zz; vbar &trta / discrete type=mean sumvar=estimate raxis=axis1 maxis=axis2; ; %end; %end; %else %do; %if %upcase(&plottype) ^=GPLOT %then %do; proc Plot; &bystatzz; plot (estimate &plotbt)*xaxis=&factzz; %end; %else %do; proc GPlot; &bystatzz; &symb1zz; plot (estimate &plotbt)*xaxis=&factzz /legend=legend1 vaxis=axis1 haxis=axis2; ; %end; %end; run; quit; %let plotnum=%eval(&plotnum+1); %let plotzz=%scan(&plot,&plotnum,%str( ) ); %end; ***plotting loop; %end; **plot= option; ************************************; ********Normality diagnostics***************************************; %if %upcase(&diagnostics)^=OFF %then %do; %if %upcase(&diagnostics)^=NOINF %then %do; proc print data=residzz; title2 "&transtypezzz &yvarzz Outliers with large Studentized Residuals"; where StudentResid>2.5 or .0 %then %do; ods exclude all; proc glm data=dsetzz; &bystatzz; class &class; model &t_.&yvarzz= &fixednway; means &fixednway/hovtest; ods output hovftest=hovzz; run; options nonotes; data hovzz; set hovzz; retain LeveneF_df LeveneP; if source ne 'Error' then do; LeveneF_df=compress(round(fvalue,.001)||'('||df); LeveneP=round(probf,.001); end; else do; LeveneF_df=compress(levenef_df||','||df||')'); output; end; keep &byby levenef_df levenep; run; proc sort data=dsetzz out=tempzz; by &byby &flistzz; data tempzz; set tempzz; obswtzz=1; ****need to get rid of missing observations or rawmeans are wrong; if &t_.&yvarzz=. then obswtzz=0; %if %length(&covariate)> 0 %then %do; array covzzz &covariate; do over covzzz; if covzzz=. then obswtzz=0; end; %end; run; proc means noprint; by &byby &flistzz; weight obswtzz; var &t_.&yvarzz &covariate; output out=mzz n=nobs mean=rawmean_y &mclistzz std=stddev_y &sclistzz stderr=stderr_y; run; data mzz; merge mzz hovzz; &bystatzz; drop _freq_ _type_; format levenep 6.4; %if %length(&lastbyzz) > 0 %then %do; if first.&lastbyzz=0 then do; levenef_df=''; levenep=.; end; %end; run; ods exclude none; proc print; %if &covind<1 %then title2 "Check equality of stddev for y=&transtypezzz &yvarzz"; %else title2 "Check equality of stddev for y=&transtypezzz &yvarzz x=&covariate" ;; run; %end; **equal variance; %end; **diag off for normality and equal variance; **************************************************************; ****If covariates and treatments, make a plot of raw means and predicted ****; %if &covind > 0 and %upcase(&diagnostics)^=OFF %then %do; %if %length(&fixednway)>0 %then %do; data mzz; set mzz; &bystatzz; retain bygrp 0; %if %length(&byby)>0 %then if first.&lastbyzz then bygrp=bygrp+1; %else bygrp=1; ; call symput('numbyzz',bygrp); ****how many by groups to do? ****; run; proc means noprint data=mzz; &bystatzz; id bygrp; var &mclistzz; output out=zzz mean=covmean_x1-covmean_x&covind; run; data zzz; merge mzz zzz ; by &byby bygrp; run; ****loop through covariates and by groups ******; proc sort data=residmzz; by &flistzz; data residmzz; set residmzz end=aa; by &flistzz; retain trtcomb 0; if first.&lastfixv then trtcomb=trtcomb+1; ** need number of treatments for plotting; if aa then call symput('numtrtcomb', compress(trtcomb)); run; %do jj=1 %to &covind; data tempzz; merge dsetzz residmzz; %if &numbyzz=1 %then bygrp=1; ; call symput('curcov',scan("&covariate", &jj)); run; %if &numbyzz>1 %then %do; proc sort; &bystatzz; %end; data tempzz; merge tempzz zzz; &bystatzz; run; %do ii=1 %to &numbyzz; data zzz; set zzz; if _n_=&ii then call symput('covmnzz',covmean_x&jj); run; %if %upcase(&plottype) ^= GPLOT %then %do; proc plot data=tempzz; &bystatzz; where bygrp=ⅈ title2 "Raw means (M) and &curcov covariate reg lines (p) for y=&transtypezzz &yvarzz"; plot rawmean_y*mean_x&jj='M' pred*&curcov='p' /href=&covmnzz overlay; run; %end; %if %upcase(&plottype)=GPLOT %then %do; ** run this separately in case graph is not installed; proc sort data=tempzz; by trtcomb; data zzz; set tempzz; by trtcomb; array vvvzz $ v1-v11; v1='U';v2='C';v3='P';v4='Z';v5='a';v6='S';v7='V';v8='D'; v9='A';v10='B';v11='X'; drop symbid; retain iii 0; if first.trtcomb then iii=iii+1; xsys='2'; ysys='2'; position='5'; when='B'; if rawmean_y ne . then do; y=rawmean_y; x=mean_x&jj; ***rawmeans labeled; function='Label'; text='M'; output; end; if &numtrtcomb < 23 and &t_.&yvarzz ne . then do; y=&t_.&yvarzz; x=&curcov; ***observations labeled; function='Label'; symbid=mod((iii-1),22)+1; if symbid le 11 then do; style='marker ' ; text=vvvzz[symbid]; end; else do; style='markere'; text=vvvzz[symbid-11]; end; output; end; run; *** get max and min for y-axis with 7 ticks; proc means noprint data=tempzz; var &t_.&yvarzz ; output out=zzzz min=var1min max=var1max ; run; data zzzz; set zzzz; var1max=round(1.1*var1max, 10**floor(log10(var1max)-1)); step1=1.1*(var1max-var1min)/6; step1=round(step1,10** floor(log10(step1)-1) ); var1min=var1max-6*step1; call symput("ymax",compress(var1max)); call symput("ymin",compress(var1min)); call symput("ystep",compress(step1)); run; %if &numtrtcomb<23 %then %do; proc gplot data=tempzz; &bystatzz; where bygrp=ⅈ title2 "Covariate adjustment of Raw means (M)"; %unquote(&symb3zz); plot pred*&curcov=trtcomb /vaxis=axis3 href=&covmnzz annotate=zzz legend=legend1 ; run; quit; %end; %else %do; ***too many treatments, so no legend; proc gplot data=tempzz; &bystatzz; where bygrp=ⅈ title2 "Covariate adjustment of Raw means (M)"; %unquote(&symb3zz); plot pred*&curcov=trtcomb /vaxis=axis3 href=&covmnzz annotate=zzz nolegend; run; quit; %end; %end; %end; %end; **plotting loop through covariates; %end; %else %do; **** create plots with no fixed effects*****; %do jj=1 %to &covind; data _null_; call symput('curcov',scan("&covariate", &jj)); run; %if &numbyzz>1 %then %do; proc sort data=residzz; &bystatzz; %end; proc gplot data=residzz; &bystatzz; title2 "Regression plots of Predicted &transtypezzz &yvarzz against &curcov"; %unquote(&symb8zz); plot pred*&curcov &yvarzz*&curcov /vaxis=axis1 overlay ; run; quit; %end; %end; %end; **cov plots; ***end for checking if residzz was created ; %end; %else %put ERROR: Analysis of &yvarzz FAILED. Check statement, model and/or dep variables and rerun.;; %let yvarnum=%eval(&yvarnum+1); %let yvarzz=%scan(&yvarlist,&yvarnum); %end; ***yvar loop; %abort3: proc optload data=optionszz; run; %abort2: %mend mmaov; ************************************************************************* ************************************************************************* ************************************************************************* *************************************************************************; %macro pdmix800(pname,lname,sort=NO,alpha=.05,worksize=1,test0=NO, mixfmt=YES,numlet=200,slice=,printmeans=YES); %global dandaverzz yvarzz bylistzz varlistzz transtypezz; **put out for possible use by backtrans; %local dsid chk3 error1 error neweffectlength lastslicevar var adjust bylist; %global printdebugzz; %put PDMIX800 &dandaverzz processing; *** check arguments; %if %length(&yvarzz)=0 %then %let yvarzz=; %let error=0; %if %length(&lname)=0 %then %let error=1; %if %sysfunc(exist(&lname)) %then %do; %let dsid=%sysfunc(open(&lname,I)); %let chk3=%sysfunc(varnum(&dsid,ESTIMATE)); %if &chk3=0 %then %let error=2; %let chk3=%sysfunc(varnum(&dsid,EFFECT)); %if &chk3=0 %then %let error=2; %let dsid=%sysfunc(close(&dsid)); %end; %else %let error=1; %if &error>0 %then %do; %if &error=1 %then %put WARNING: Dataset &lname does not exist.; %if &error=2 %then %put WARNING: Dataset &lname was not made by proc mixed.; %end; %let error1=&error; %let error=0; %if %length(&pname)=0 %then %let error=1; %if %sysfunc(exist(&pname)) %then %do; %let dsid=%sysfunc(open(&pname,I)); %let chk3=%sysfunc(varnum(&dsid,ESTIMATE)); %if &chk3=0 %then %let error=3; %let chk3=%sysfunc(attrn(&dsid,nobs)); %if &chk3=0 %then %let error=2; %let dsid=%sysfunc(close(&dsid)); %end; %else %let error=1; %if &error>0 %then %do; %if &error=1 %then %put WARNING: Dataset &pname does not exist.; %if &error=2 %then %put WARNING: There are no observations in dataset &pname.; %if &error=3 %then %put WARNING: Dataset &pname was not made by proc mixed.; %end; %if (&error or &error1) %then %do; %put ERROR: PDMIX800 terminated due to errors in input values.; %goto skip; %end; ** save setting of notes option; proc printto log=user.pdtempzz; *** turn off log entries by dumping to temp file; run; proc optsave out=optionszzz; ** save user option settings; run; options nonotes nomprint; data optionszzz; set optionszzz; if optname in ('NOTES' 'MPRINT'); ***only save options that are changed; run; proc printto; run; data _null_; *** numerical value check only possible in data step; if &alpha le 0.0 or &alpha ge 1.0 then call symput('error','3'); numlet=1*&numlet; if numlet=. then do; put 'Numlet not a number, 200 will be used'; call symput('numlet',compress(200)); numlet=200; end; if numlet<10 then do; put 'Numlet too small, 10 will be used'; call symput('numlet',compress(10)); end; if numlet>2000 then do; put 'Numlet too large, 2000 will be used'; call symput('numlet',compress(2000)); end; if &worksize>1e6 then do; put 'Worksize reset to 1M'; call symput('worksize',compress(1000000)); end; run; %if &error %then %do; %put ERROR: PDMIX800 terminated due to errors in input values.; %if &error=3 %then %put Alpha can only have values between 0 and 1.; %goto skip; %end; ****need list of variable names, either sliced or not; data _null_; *** First get unique list of all names used in BY statements; *** these come before the variable EFFECT, but include EFFECT in list; dsid=open("&lname",'i'); length namlist $ 512; ii=1; value=varname(dsid,ii); do while (value ^= 'Effect') ; if ii=1 then namlist=value; else namlist=trim(namlist)||' '||value; ii=ii+1; value=varname(dsid,ii); end; call symput('bylistzz',compbl(namlist)); **list without effect; if namlist='' then namlist=value; else namlist=trim(namlist)||' '||value; namlist=trim(namlist); call symput('bylist',namlist); **list with effect; ****************************************************; *** Now get list of all class variables (always between effect and estimate); length list list1 list2 $ 3200; start=varnum(dsid,"EFFECT") +1; ii=1;jj=start; slicein=upcase("&slice"); do while(ii); name=varname(dsid,jj); name1=upcase(name); **case sensitive names are returned by varname; type=vartype(dsid,jj); if name1 ^= 'ESTIMATE' then do; kk=indexw(slicein,name1); if kk=0 then do; list=compress(list||'='||name); if type='N' then list2= trim(list2)||' left('||trim(name)||left(")= '_' and") ; else list2= trim(list2)||' left('||trim(name)||left(")='' and") ; end; else do; if type='N' then list1= trim(list1)||' left('||trim(name)||left(")='_' or") ; else list1= trim(list1)||' left('||trim(name)||left(")='' or") ; end; jj=jj+1; end; else ii=0; end; list=substr(list,2); jj=length(list1); if jj>2 then list1=substr(list1,1,jj-2); list2=substr(list2,1,length(list2)-3); call symput('slice1',trim(list1)); call symput('varlist1',trim(list2)); list=translate(list,' ','='); call symput ('varlistzz',trim(list)); run; %if &printdebugzz=1 %then %do; %put bylist &bylist; %put bylistzz &bylistzz; %put varlistzz &varlistzz; %put varlist1 &varlist1; %put slice &slice slice1 &slice1; %end; ********** add variables to datasets ******************************; data pdtempzz; attrib ADJUSTMENT length=$30; set &pname end=aa; ** if adjusted and adj probs are not there, fill in; if ADJP=. then do; ADJP=PROBT; end; if ADJUSTMENT ='' then ADJUSTMENT=compress('LSD(P<'||"&alpha"||')'); else do; ADJUSTMENT=compress(ADJUSTMENT||'(P<'||"&alpha"||')' ); if substr(ADJUSTMENT,1,7)='Dunnett' then call symput('error','4'); end; if aa then call symput('nobsdiffzz',compress(_n_)); run; %if &error %then %do; %put ERROR: PDMIX800 terminated due to errors in input values.; %if &error=4 %then %put ADJUST=Dunnett output not supported.; %goto skip; %end; data pdtempmzz; set &lname; by &bylist notsorted; *** add bygroup variable to means dataset; retain bygroup 0; if first.effect then bygroup+1; if first.EFFECT and last.EFFECT then df0=1; **flag single means, will have no diffs; else df0=0; *dothiseffectzz=0; run; ***means and diffs data may have different effects, due to 0 df, so copy bygroup over to diffs; data pdtempzzz; set pdtempmzz end=aa; by bygroup notsorted; retain count totnobs 0; if first.bygroup then count=0; count+1; if last.bygroup then do; count=count*(count-1)/2; totnobs=totnobs+count; do ii=1 to count; output; end; end; if aa then call symput('nobsmeanzz',compress(totnobs)); keep &bylist bygroup ; run; %if &printdebugzz=1 %then %do; proc print; title3 'PDMIX800 debug datasets-check0'; run; title3; %end; **supposedly these two datasets are exactly the same size. If not, abort; %if &nobsmeanzz = &nobsdiffzz %then %do; data pdtempzz; merge pdtempzz pdtempzzz; **this copies bygroup onto the diffs; run; %end; %else %do; %put ERROR: PDMIX800 terminated - Means and Diffs datasets are not compatible.; %put WARNING: If you use multiple LSMEANS statements, all must have pdiff option.; %goto skip ; %end; %if &printdebugzz=1 %then %do; proc print; title3 'PDMIX800 debug datasets-check1';run; title3; %end; ***this sort is required to give IML data by slice; proc sort data=pdtempzz; by bygroup &slice; run; %if &printdebugzz=1 %then %do; proc print; title3 'PDMIX800 debug datasets-check2';run; title3; %end; %if %length(&slice) ne 0 %then %do; *******************************************************************; *******************************************************************; *** sort, edit, relabel diff and mean data for the slice option ***; *** this works by redefining effects that are being sliced ***; *** Example: In a 2*2 factorial, slicing the A*B interaction by A *** means only 2 comparisons are needed of the 4*3/2=6 possible. *** These are A1B1-A1B2 and A2B1-A2B2; %if %length(&varlistzz)=0 %then %put ERROR: No variables left after slicing.; %else %do; %let lastslicevar=%scan(&slice,-1); *** identify effects that will be sliced, renumber bygroup to add slices; *** bygroup is index of effects in original order. Create bygroup1zz to be effect/slice; proc sort data=pdtempmzz; by bygroup &slice; data pdtempmzz ; set pdtempmzz; by bygroup &slice; retain bygroup1zz 0; dothiseffectzz=0; *****test if effect should be sliced; if not(&slice1) then do; **no slice vars missing; if not(&varlist1) then dothiseffectzz=1; end; if dothiseffectzz then do; if first.&lastslicevar then bygroup1zz=bygroup1zz+1; end; else if first.bygroup then bygroup1zz=bygroup1zz+1; run; *** now fix up diffs dataset; *** merge bygroup1zz onto diffs to avoid mismatched diffs and means; data pdtempzzz; set pdtempmzz; by bygroup1zz; if first.bygroup1zz; keep dothiseffectzz bygroup1zz bygroup &slice; run; proc sort data=pdtempzz ; by bygroup &slice; data pdtempzz; merge pdtempzz (in=have) pdtempzzz; by bygroup &slice; if have; ***Delete any pdiffs information that compares across slices; ***compared factor levels must match on all slice variables; discardzz=0; if dothiseffectzz then do; %let ii=1; %let var=%scan(&slice,1); %do %while(%length(&var) ne 0); %let var2=_&var; %if %length(&var2)>32 %then %let var2=%substr(&var2,1,32); if &var ne &var2 then discardzz=1; %let ii=%eval (&ii+1); %let var=%scan(&slice,&ii); %end; if discardzz then delete; end; drop discardzz ; run; %end; **** if means data set has single means (eg 0 df) then sort these to the bottom so they do not merge with the msgrp letter output; proc sort data=pdtempmzz; by &bylist &slice; data pdtempmzz; set pdtempmzz; by &bylist &slice ; drop dothiseffectzz bygroup1zz; bygroup=bygroup1zz; ** final store of new group labels; **slicing is being done, so may have slice groups with just one level; if dothiseffectzz >0 and first.&lastslicevar and last.&lastslicevar then df0=1; run; %end; ***sort single means to bottom, and get data back to original bygroup order; proc sort data=pdtempmzz; by df0 bygroup ; data pdtempzz; set pdtempzz; ** save new by group labels; bygroup=bygroup1zz; drop bygroup1zz; run; %if &printdebugzz=1 %then %do; proc print data=pdtempmzz; title3 'Means data set ready'; run; proc print data=pdtempzz; title3 'Diffs data set ready for IML'; run; title3 ; %end; *************************************************************************; *** ready to process for differences within each effect/slice/bygroup ***; proc iml worksize=&worksize; reset nolog fw=7; printdebug=0; alpha=α use pdtempmzz; **for reading later; **** create mean separation output dataset with length 200; temp=j(1,&numlet,'0'); msgroup=rowcatc(temp); ADJUSTMENT=' '; create msgrpzz var{msgroup bygroup lsmrank ADJUSTMENT}; **** create indexes of effect and by group locations; *** For all useful variable names, read in levels; test='a'; ii=1; use pdtempzz; varlist= "&bylistzz &slice &varlistzz"; value='a'; ii=1; do while (value ^= '') ; value=scan(varlist,ii); if value ^= '' then do; *** the BY variables are not guaranteed to be character, *** so convert them if necessary; read all var value into hold; if type(hold)='N' then level=level||char(hold); else level=level||hold; free hold; end; ii=ii+1; end; if printdebug=1 then print varlist level; if ncol(level)=0 then do; file log; put "NOTE: No variables found for use in &pname."; dataerr=1; end; else dataerr=0; if dataerr ^= 1 then do; call change(level,'','-'); level=rowcatc(level); idx=1; dim=nrow(level); if printdebug=1 then print dim level; ***search down for number of comparisons in each section; ***read number of rows involving first mean to get number of means, then calculate number of comparisons; byby=0; do jj=1 to dim; first=level[jj,1]; byby=byby+1; **go to end of comparisons with mean 1; kk=jj; flag=1; do while(flag=1); kk=kk+1; if(kk > dim) then flag=0; else if (level[kk,1] ^= first) then flag=0; end; num=kk-jj+1; idx=idx || idx[1,byby] + num; jj=jj-1+num*(num-1)/2; ** skip to next section; end; free level; end; if printdebug=1 then print idx byby; ** BIG BB loop through rows of prob data ** subsetting out block dealing with each effect; pptr=1; **points to where probs start for current means; do iigroup = 1 to byby; dim= idx[1,iigroup+1]-idx[1,iigroup]; nn= dim*(dim-1)/2; **********************************************************; **for sorting letters need descending order, and antiranks; setin pdtempmzz; range=idx[1,iigroup] : idx[1,iigroup+1]-1 ; read point range var {ESTIMATE} into lsmcur; range=idx[1,iigroup]; read point range var{bygroup} into bygroup; ***just to put into output dataset; **stupid rank function fails on missing values; **so must temporarily make them non missing; test=lsmcur[><,]-1.e-30; locmiss=loc(lsmcur=.); kk=ncol(locmiss); if kk>0 then lsmcur[locmiss,]=test; lsmrnk=dim+1-rank(lsmcur); if kk>0 then lsmcur[locmiss,]=.; lsmarnk=lsmrnk; lsmarnk[lsmrnk,]=(1:(dim))`; if printdebug=1 then print pptr nn; **********************************************************; **** get prob file data for these means. _adjp_ contains the probs, no matter what adjust method; setin pdtempzz; range=pptr:pptr+nn-1; read point pptr var {ADJUSTMENT} into ADJUSTMENT; read point range var {ADJP} into data; pptr=pptr+nn; if printdebug=1 then print data; *** put p values into matrix; p = j(dim,dim,0); kk=1; do ii=1 to dim-1; do jj=ii+1 to dim; if data[kk,1]=. then p[jj,ii]=1; else p[jj,ii] = data[kk,1]; p[ii,jj]=p[jj,ii]; **fill in upper triangle for next sort; kk=kk+1; end;end; *** sort matrix by lsm value, so high mean gets first letter; temp=p; p[,lsmrnk]=temp; temp[lsmrnk,]=p; p=temp; free temp; if nn>&numlet then maxlet=&numlet; **memory use limit; else maxlet=nn+1; group = j(dim, maxlet, 0); members=j(dim,1,0); if printdebug=1 then print p dim data; gcode=1; ngroup=1; do ii=1 to dim; kk=0; flag=0; do jj=ii+1 to dim; * go down row, find group members ; if p[jj,ii] > alpha then do; * jj and ii are the same ; * check jj against members ; do mm=1 to kk ; ll=members[mm,1]; if jj>ll then test1=p[jj,ll]; else test1=p[ll,jj]; if test1<0 then test1=-test1; if(test1 < alpha) then goto jmp0; * need new group ; end; jmp0: if mm=kk+1 then do; do mm=ii+1 to dim; if mm=jj then mm=mm+1; *skip jj (on diagonal); if mm>dim then go to jmp2; if jj>mm then test1=p[jj,mm]; else test1=p[mm,jj]; if test1 > alpha && -p[mm,ii] > alpha then do; * previous grouped mean mm may belong in this group ; * so check if already in and current members; * dont conflict ; do ll=1 to kk; nn=members[ll,1]; if nn=mm then goto jmp1; if nn0 then p[jj,ll]=-p[jj,ll]; end; else do; if p[ll,jj]>0 then p[ll,jj]=-p[ll,jj]; end; end; group[jj,ngroup]=gcode; kk=kk+1; members[kk,1]=jj; end; else flag=1; end; end; if(kk=0) then do; * no members ; do jj=1 to ngroup until (group[ii,jj] ^= 0) ; end; * not in a group yet, so set flag ; if(jj=ngroup+1) then kk=kk+1; end; if(kk^=0) then do; * need to set current mean ; group[ii,ngroup]=gcode; ngroup=ngroup+1; gcode=gcode+1; if ngroup > &numlet then do; ** number of letters needed exceeded maximum; jj=dim; ii=dim; **stop loops this way to avoid warnings; iigroup=byby; dataerr=1; call symput('error','1'); end; end; if(flag^=0) then ii=ii-1; * need another group for this mean; end; if dataerr=0 then do; **skip below if error; ngroup=ngroup-1; group=group[,1:ngroup]; ***** this section just takes the groups identified by numbers above and converts numbers to letters. This depends on the ASCII character definitions, eg. 64 value below is what gets capital letters; *** write out letters; kk=nrow(group); do ii=1 to kk; gc='';nsect=1; do jj=1 to ngroup; mm=group[ii,jj]; if mm > 0 then do; ** blanks are 0, do not do them; sect=floor((mm-1)/26); *** 26 letters in alphabet; offset=mm-sect*26; sect=sect+1; if sect > nsect then do; nsect=sect; gc=gc||"("||char(sect)||")"; end; gc=gc||byte(64+offset); end; end; lsmrank=lsmarnk[ii,1]; msgroup=rowcatc(gc); ** save letters, by group and sort info; append var {msgroup bygroup lsmrank ADJUSTMENT}; end; end; **dataerr; end; ** for the big bb loop over effect sections; quit; %if &error=1 %then %do; %put ERROR: PDMIX800 terminated due to exceeding NUMLET limit (&numlet).; %put ERROR: Either increase limit, or use LSD values in LOG to identify differences.; data msgrpzz; set msgrpzz; ***delete incomplete letters; msgroup=''; run; %end; **** put group letters back in original lsm order; **** they were sorted so largest mean gets letter A; proc sort data=msgrpzz; by bygroup lsmrank; %if &printdebugzz=1 %then %do; proc print data=msgrpzz; run; %end; **** merge letters with means and print ****; data msgrpzz; merge pdtempmzz msgrpzz; label msgroup='Letter Group'; if ESTIMATE=. or DF=0 then do; **do not print for missing means or no data; msgroup=''; end; %if %upcase(&mixfmt)=NO %then %do; format _all_; %end; run; *******************************************************************; **** before printing, add the lsdvalues; proc sort data=pdtempzz; by &bylist &slice; proc means noprint data=pdtempzz; by &bylist &slice ; var STDERR ; output out=lsdvalzz n=numcomp; run; data lsdvalzz; merge pdtempzz lsdvalzz; by &bylist &slice ; if df>0 then do; if upcase(substr(adjustment,1,3))='LSD' then critt=tinv( (1-&alpha/2),DF); if upcase(substr(adjustment,1,3))='BON' then critt=tinv( 1-&alpha/(2*numcomp), DF); if upcase(substr(adjustment,1,5))='SIDAK' then do; prob=exp( log(1-&alpha/2) /numcomp ); critt=tinv( prob , DF); end; if upcase(substr(adjustment,1,7))='SCHEFFE' then do; numdf=-1+(sqrt(1+8*numcomp)+1)/2; critt=sqrt(numdf*finv(1-&alpha,numdf,DF)); end; if upcase(substr(adjustment,1,5))='TUKEY' then do; numdf=(sqrt(1+8*numcomp)+1)/2; ** number of treatments; critt=probmc('RANGE', . , 1-&alpha,DF,numdf); *put critt; critt=critt/sqrt(2); **adjust for tukey needing sd of mean, not diff; end; SigDiffValue=stderr*critt; end; run; proc means noprint data=lsdvalzz; by &bylist &slice ; id adjustment; var SigDiffValue ; output out=lsdvalavgzz mean=avgsigdiff max=maxsigdiff min=minsigdiff; run; data lsdvalavgzz; set lsdvalavgzz; label ; drop _type_ back; back=-1; format minsigdiff maxsigdiff avgsigdiff best7. ; %if %length(&slice)>0 %then put "&transtypezz " "&yvarzz " &bylist &slice adjustment 'average value=' avgsigdiff +back ', min=' minsigdiff +back ', max=' maxsigdiff ; %else put "&transtypezz " "&yvarzz " &bylist adjustment 'average value=' avgsigdiff +back ', min=' minsigdiff +back ', max=' maxsigdiff ; ; run; ******** print mean separation ************; proc sort data=msgrpzz; by &bylistzz bygroup effect; run; %if %upcase(&sort)=YES %then %do; proc sort data=msgrpzz; by &bylistzz bygroup EFFECT descending ESTIMATE; %end; data msgrpzz; set msgrpzz; ** drop working variables before printing; drop df0 lsmrank; %if %upcase(&test0)=NO %then %do; drop tvalue probt df; %end; run; %if &printmeans=YES %then %do; proc print data=msgrpzz label ; by effect adjustment bygroup notsorted; label bygroup=' Set' adjustment=' Method'; run; %end; *** restore user options; proc optload data=optionszzz; run; %skip: %mend pdmix800; %macro pdglm(pname,lname,sort=no,alpha=.05,worksize=1,test0=NO, mixfmt=yes,numlet=200); *** check arguments; %local dsid chk3 error1 error; %let error=0; %if %length(&lname)=0 %then %let error=1; %if %sysfunc(exist(&lname)) %then %do; %let dsid=%sysfunc(open(&lname,I)); %let chk3=%sysfunc(varnum(&dsid,LSMEAN)); %if &chk3=0 %then %let error=2; %let chk3=%sysfunc(varnum(&dsid,EFFECT)); %if &chk3=0 %then %let error=2; %let dsid=%sysfunc(close(&dsid)); %end; %else %let error=1; %if &error>0 %then %do; %if &error=1 %then %put WARNING: Dataset &lname does not exist.; %if &error=2 %then %put WARNING: Dataset &lname was not made by proc glm.; %end; %let error1=&error; %let error=0; %if %length(&pname)=0 %then %let error=1; %if %sysfunc(exist(&pname)) %then %do; %let dsid=%sysfunc(open(&pname,I)); %let chk3=%sysfunc(varnum(&dsid,_1)); %if &chk3=0 %then %let error=3; %let chk3=%sysfunc(attrn(&dsid,nobs)); %if &chk3=0 %then %let error=2; %let dsid=%sysfunc(close(&dsid)); %end; %else %let error=1; %if &error>0 %then %do; %if &error=1 %then %put WARNING: Dataset &pname does not exist.; %if &error=2 %then %put WARNING: There are no observations in dataset &pname.; %if &error=3 %then %put WARNING: Dataset &pname was not made by proc glm.; %end; %if (&error or &error1) %then %do; %put NOTE: PDGLM800 terminated due to errors in input values.; %goto skip; %end; data _null_; *** numerical value check only possible in data step; if &alpha < 0.0 or &alpha > 1.0 then call symput('error','3'); run; %if &error %then %do; %put PDGLM800 terminated due to errors in input values.; %if &error=3 %then %put Alpha can only have values between 0 and 1.; %goto skip; %end; *****************************************************; proc sort data=&lname; by dependent effect; proc sort data=&pname; by dependent effect; proc iml worksize=&worksize; reset nolog fw=7; alpha=α use &lname; **for reading later; **** create mean separation output dataset with length 200; temp=j(1,&numlet,'0'); msgroup=rowcatc(temp); _adjust_=' '; create msgrpzz var{msgroup bygroup lsmrank }; **** create indexes of effect and by group locations; test='a'; ii=1; *** get unique list of all names used in effects; *** have to read varnames directly since labels may be truncated; dsid=open("&pname",'i'); use &pname; dataerr=0; ***use rowname to id groups of means. ; ***Largest rowname gives size of diff matrix to read in; ***idx is created to look like {1 4 11} if there are 3 and 7 means, so has start index of each group; read all var {ROWNAME} into lvl; level=num(lvl); free lvl; dim=nrow(level); idx=loc(level=1); byby=ncol(idx); idx=idx||(dim+1); ***print level,idx,byby; ** BIG BB loop through rows of prob data ** subsetting out block dealing with each effect; pptr=1; **points to where probs start for current means; do bygroup = 1 to byby; dim= idx[1,bygroup+1]-idx[1,bygroup]; nn= dim*(dim-1)/2; **for sorting letters need descending order, and antiranks; setin &lname; range=idx[1,bygroup] : idx[1,bygroup+1]-1 ; read point range var {LSMEAN} into lsmcur; **stupid rank function fails on missing values; **so must temporarily make them non missing; test=lsmcur[><,]-1.e-30; locmiss=loc(lsmcur=.); kk=ncol(locmiss); if kk>0 then lsmcur[locmiss,]=test; lsmrnk=dim+1-rank(lsmcur); if kk>0 then lsmcur[locmiss,]=.; lsmarnk=lsmrnk; lsmarnk[lsmrnk,]=(1:(dim))`; **** get prob matrix data for these means.; setin &pname; vlist=compress(char(1:dim)); vlist=concat('_',vlist); range=idx[1,bygroup] : idx[1,bygroup+1]-1 ; ***notavail read point pptr var {_mstech_} into _adjust_; read point range var vlist into p; **** pptr=pptr+nn; *** sort matrix by lsm value, so high mean gets first letter; temp=p; p[,lsmrnk]=temp; temp[lsmrnk,]=p; p=temp; free temp; if nn>&numlet then maxlet=&numlet; **memory use limit; else maxlet=nn+1; group = j(dim, maxlet, 0); members=j(dim,1,0); gcode=1; ngroup=1; do ii=1 to dim; kk=0; flag=0; do jj=ii+1 to dim; * go down row, find group members ; if p[jj,ii] > alpha then do; * jj and ii are the same ; * check jj against members ; do mm=1 to kk ; ll=members[mm,1]; if jj>ll then test1=p[jj,ll]; else test1=p[ll,jj]; if test1<0 then test1=-test1; if(test1 < alpha) then goto jmp0; * need new group ; end; jmp0: if mm=kk+1 then do; do mm=ii+1 to dim; if mm=jj then mm=mm+1; *skip jj (on diagonal); if mm>dim then go to jmp2; if jj>mm then test1=p[jj,mm]; else test1=p[mm,jj]; if test1 > alpha && -p[mm,ii] > alpha then do; * previous grouped mean mm may belong in this group ; * so check if already in and current members; * dont conflict ; do ll=1 to kk; nn=members[ll,1]; if nn=mm then goto jmp1; if nn0 then p[jj,ll]=-p[jj,ll]; end; else do; if p[ll,jj]>0 then p[ll,jj]=-p[ll,jj]; end; end; group[jj,ngroup]=gcode; kk=kk+1; members[kk,1]=jj; end; else flag=1; end; end; if(kk=0) then do; * no members ; do jj=1 to ngroup until (group[ii,jj] ^= 0) ; end; * not in a group yet, so set flag ; if(jj=ngroup+1) then kk=kk+1; end; if(kk^=0) then do; * need to set current mean ; group[ii,ngroup]=gcode; ngroup=ngroup+1; gcode=gcode+1; if ngroup > &numlet then do; ** number of letters needed exceeded maximum; jj=dim; ii=dim; **stop loops this way to avoid warnings; bygroup=byby; dataerr=1; call symput('error','1'); end; end; if(flag^=0) then ii=ii-1; * need another group for this mean; end; if dataerr=0 then do; **skip below if error; ngroup=ngroup-1; group=group[,1:ngroup]; ***** this section just takes the groups identified by numbers above and converts numbers to letters. This depends on the ASCII character definitions, eg. 64 value below is what gets capital letters; *** write out letters; kk=nrow(group); do ii=1 to kk; gc='';nsect=1; do jj=1 to ngroup; mm=group[ii,jj]; if mm > 0 then do; ** blanks are 0, do not do them; sect=floor((mm-1)/26); *** 26 letters in alphabet; offset=mm-sect*26; sect=sect+1; if sect > nsect then do; nsect=sect; gc=gc||"("||char(sect)||")"; end; gc=gc||byte(64+offset); end; end; lsmrank=lsmarnk[ii,1]; msgroup=rowcatc(gc); ** save letters, by group and sort info; append var {msgroup bygroup lsmrank }; end; end; **dataerr; end; ** for the big bb loop over effect sections; quit; %if &error=1 %then %do; %put ERROR: PDGLM800 terminated due to exceeding NUMLET limit.; %goto skip; %end; **** put group letters back in original lsm order; **** they were sorted so largest mean gets letter A; proc sort data=msgrpzz; by bygroup lsmrank; **** if means data set has single means (eg 0 df) then sort these to the bottom so they do not merge with the msgrpzz output; proc means data=&lname noprint; by dependent effect; var lsmean; output out=tempzz n=nummeans; run; data tempzz; set tempzz; if nummeans le 1 then df0=1; else df0=0; run; data tempzz; merge &lname tempzz; by dependent EFFECT; proc sort; by df0; **** merge letters with means and print ****; data msgrpzz; merge tempzz msgrpzz; drop lsmrank df0 _type_ _freq_ nummeans; label msgroup='LetterGroup'; label LSMean='LSMean'; if LSMEAN=. then do; **do not print for missing means; msgroup=''; end; %if %upcase(&test0)=NO %then %do; drop ProbT PROBTDiff MissPattern LSMeanNumber; %end; %if %upcase(&mixfmt)=NO %then %do; format _all_; %end; run; %if %upcase(&sort)=YES %then %do; proc sort data=msgrpzz; by bygroup effect descending LSMEAN; %end; proc print data=msgrpzz label ; by bygroup EFFECT notsorted; run; %skip: %mend; **PDGLM; ************************************************************************* ************************************************************************* ************************************************************************* *************************************************************************; %macro design(design,seed=0,rep=0,treat=0,popsize=0,block=1,square=1,row=0, col=0,iblock=0,blocksize=0,period=0, factors=0, min=, max=); %local totn type; %let design= %upcase(&design); *** blank call prints out help; %if %length(&design)=0 and &treat=0 and &row=0 and &factors=0 and &rep=0 %then %do; %put %nrstr( %Design generates the following designs. Possible options are listed.); %put Completely Randomized Design: CRD, treat=, rep=; %put Random Sample: ,popsize=, rep=; %put Randomized Block Design: RBD, block=, treat=, rep=; %put Latin Square: LSD, row=; %put Multiple Square Latin Square: MSLSD, square=, row=; %put Crossover: XOVR, col=, row=; %put Switchback: SWI, period=, col=, treat=; %put Incomplete Block (resolvable): rIBD, block=, iblock=, treat=, blksize=; %put IBD (unresolvable): IBD, iblock=, treat=, blksize=; %put Central Composite Design: CCD, factors=, min=, max=; %goto skip; %end; *** attempt to decode design from options; %if %length(&design)=0 %then %do; %let design=CRD; %if &factors>0 %then %let design=CCD; %if &block >1 %then %let design=RBD; %if &row ne 0 |&col ne 0 %then %let design=LSD; %if &blocksize ne 0 %then %let design=IBD; %if &period ne 0 %then %let design=SWI; %if &seed=0 %then %do; **plan does not allow 0 seed; data _null_; seed=0; call ranuni(seed,xx); call symput('seed',seed); run; %end; %if &design=SWI %then %do; %if &period le 2 %then %do; %let type=ERROR; %put ERROR: Switchbacks must have at least 3 periods.; %end; %end; %end; %if &design =CRD %then %do; %if &popsize ne 0 %then %do; proc plan seed=&seed; title2 "Random sample of &rep units from population of size &popsize"; factors expunits=&rep of &popsize; run; %end; %else %do; data designzz; do treatment=1 to &treat; do rep=1 to &rep; rannum=ranuni(&seed); output; end;end; run; proc sort; by rannum; data designzz; set designzz; expunit=_n_; run; proc print noobs; title2 "Random assignment of &treat treatments to &rep reps each"; var expunit treatment rep; run; %end; %end; %if &design=RBD %then %do; %let totn=%eval(&rep*&treat); proc plan seed=&seed; factors block=&block ordered expunit=&totn random/noprint; output out=designzz; run; proc sort; by block; data designzz; set designzz; by block; ***rerandomize plan just to be sure; retain ran1; retain seed &seed; if first.block then call ranuni(seed,ran1); call ranuni(seed,ran2); drop block; run; proc sort; by ran1 ran2; data designzz; set designzz; by ran1; retain block hold 0; ***convert expunit to treat,rep labels, then renumber; if first.ran1 then do; block+1; hold=0; end; treatment=ceil(expunit/&rep); rep=expunit-&rep*(treatment-1); hold+1; expunit=hold; run; proc tabulate; title2 "Random assignment of &rep reps of &treat treatments within &block blocks"; class block expunit; var treatment; tables expunit,block*treatment=' '* mean='Treatment'* format=6.0 /condense; run; %end; %if &design=LSD or &design=XOVR or &design=MSLSD %then %do; %if &row=0 %then %let row=&col; %if &treat=0 %then %let treat=&row; %if &col<&row %then %let col=&row; %if &square=1 %then title2 "Latin Square Design for &treat treatments"; %else title2 "Latin Square Design for &square squares of &treat treatments"; ; %if &col>&row %then %do; data _null_; square=ceil(&col/&row); call symput('square',square); run; title2 "Crossover Design "; %end; %do ii=1 %to □ proc plan seed=&seed; factors row=&row ordered col=&row ordered /noprint; treatments treatment=&treat cyclic; output out=zzzz row random col random treatment ; quit; data zzzz; set zzzz; square=ⅈ run; %if &ii=1 %then %do; data designzz; set zzzz ; run; %end; %else %do; data designzz; set designzz zzzz; run; %end; ***spin seed a bit; data _null_; seed=&seed; do ii=1 to 100; call ranuni(seed,xx); end; call symput('seed',seed); run; %end; %if &col>&row %then %do; proc sort data=designzz; by square col row; data designzz; set designzz; by square col; retain rannum seed; if _n_=1 then seed=&seed; if first.col then call ranuni(seed,rannum); drop col; run; proc sort; by rannum; data designzz; set designzz; by rannum; retain col 0; if first.rannum then col=col+1; run; proc tabulate; class col row; var treatment; tables row,col*treatment=' '* mean='Treatment'* format=6.0 /condense; run; %end; %else %do; proc sort data=designzz; by square row col; proc tabulate; class square col row; var treatment; tables square*row,col*treatment=' '* mean='Treatment'* format=6.0 /condense; run; %end; %end; %if &design=SWI %then %do; %if &treat = 0 %then %let treat=2; **fake value if not given; data designzz; retain col subj 0; subj=&col; if subj<1 then subj=&treat; **fake value if not given; do while (col lt subj); do ii=1 to &treat; one=ii; do jj=ii+1 to &treat; two=jj; *** generate design points for a subject; col+1; period=1; treatment=one; output; do while (period lt &period); period+1; treatment=two; output; if period lt &period then do; period+1; treatment=one; output; end; end; col+1; period=1; treatment=two; output; do while (period lt &period); period+1; treatment=one; output; if period lt &period then do; period+1; treatment=two; output; end; end; end; end;end; run; data designzz; set designzz; by col; retain rannum seed; if _n_=1 then seed=&seed; if first.col then call ranuni(seed,rannum); drop col; run; proc sort; by rannum; data designzz; set designzz; by rannum; retain col 0; if first.rannum then col=col+1; run; title2 "Switchback Design with &treat treatments in &period periods"; proc tabulate; class period col; var treatment; tables period,col*treatment=' '*mean='Treatment'*[format=6.0] /condense; run; %end; %if &design=IBD or &design=RIBD %then %do; ***if one complete block, do a general incomplete block; *** iblock must be the total number of incomplete blocks; %if &block=1 %then %do; %let nobs=%eval(&blocksize*&iblock); data zzz; *do iblock=1 to &iblock; do treatment=1 to &treat; output; *end; end; run; ods exclude all; proc optex data=zzz seed=&seed; class treatment; model treatment; blocks structure= (&iblock.) &blocksize. ;; *generate n=&nobs method=m_federov; ods listing select blockdesignefficiencies; output out=designzz number=1 blockname=iblock; * put best design in dataset designzz; run; ods exclude none; proc sort; by iblock; data designzz; set designzz ; by iblock; retain seed ran1; if _n_=1 then seed=&seed; if first.iblock then call ranuni(seed,ran1); ran2=ranuni(0); drop iblock; run; proc sort; by ran1 ran2; data designzz; set designzz; by ran1; retain iblock expunit 0; if first.ran1 then do; iblock=iblock+1; expunit=0;end; expunit+1; run; proc tabulate; title2 'Incomplete Block Design'; class iblock expunit; var treatment; tables expunit,iblock*treatment=' '*mean='Treatment'*[format=6.0] /condense; run; proc freq data=designzz; title2 "A check on treatment frequencies"; tables treatment/list; run; %end; %else %do; **** else do a resolvable incomplete block, **** with iblock the number of small blocks in a rep; %let nobs=%eval(&blocksize*&block*&iblock); data zzz; do block=1 to █ do iblock=1 to &iblock; do treatment=1 to &treat; output; end;end;end; run; ods exclude all; proc optex data=zzz seed=&seed; class iblock treatment block; model block iblock block*iblock treatment block*treatment; generate n=&nobs method=m_federov ; output out=designzz number=1 ; * put best design in dataset zzzz; run; ods exclude none; proc sort; by block iblock; data designzz; set designzz; by block iblock; retain seed ran1 ran0; if _n_=1 then seed=&seed; if first.block then call ranuni(seed,ran0); if first.iblock then call ranuni(seed,ran1); call ranuni(seed,ran2); drop block iblock; run; proc sort; by ran0 ran1 ran2; data designzz; set designzz; by ran0 ran1; retain block iblock expunit 0; if first.ran0 then do; block=block+1; iblock=0; end; if first.ran1 then do; iblock=iblock+1; expunit=0;end; expunit+1; run; proc tabulate; title2 'Resolvable Incomplete Block Design'; class block iblock expunit; var treatment; tables expunit,block*iblock*treatment=' '*mean='Treatment'*[format=6.0] /condense; run; proc freq; tables block*treatment/noprint out=zzz; title2 "Check treatment frequencies (are equal in resolvable design)"; run; proc freq data=zzz; tables treatment*count/list; run; %end; %end; %if &design=CCD %then %do; data designzz; nfact=1; name=scan("&factors",nfact); do while(name ne ''); nfact=nfact+1; name=scan("&factors",nfact); end; call symput('nfact',compress(nfact-1)); run; *** loop through factors; %if &nfact=1 %then %put ERROR: Central Composite Design must have at least 2 factors.; %else %do; data designzz; set designzz; array names &factors; array min min1-min&nfact; array max max1-max&nfact; if length("&min")>0 then do; do ii=1 to &nfact; min[ii] = 1*scan("&min",ii); if min[ii]=. then min[ii]=0; end; end; else do ii=1 to &nfact; min[ii]=0; end; if length("&max")>0 then do; do ii=1 to &nfact; max[ii] = 1*scan("&max",ii); if max[ii]=. then max[ii]=1; end; end; else do ii=1 to &nfact; max[ii]=1; end; names[1]= min[1]; output; names[1]= max[1]; output; run; %do ii=2 %to &nfact; data designzz; set designzz; array names &factors; array min min1-min&nfact; array max max1-max&nfact; names[&ii]= min[&ii]; output; names[&ii]= max[&ii]; output; run; %end; *** now create points on sphere; data zzzz; set designzz(obs=1); array names &factors; array min min1-min&nfact; array max max1-max&nfact; ***for each factor, go +- to sphere, set all others to avg; do ii=1 to &nfact; names[ii]=(min[ii]+max[ii])/2; end; do ii=1 to &nfact; radius=(max[ii]-min[ii])/2; dist=sqrt(2)*radius; avg=(min[ii]+max[ii])/2; names[ii]= avg-dist; output; names[ii]= avg+dist; output; names[ii]=avg; end; run; data designzz; set designzz zzzz; run; *** now create reps; data zzzz; set designzz(obs=1); if &rep=0 then reps=6; else reps=&rep; array names &factors; array min min1-min&nfact; array max max1-max&nfact; do ii=1 to &nfact; avg=(min[ii]+max[ii])/2; names[ii]=avg; end; do ii=1 to reps; output; end; data designzz; set designzz zzzz; run; proc print data=designzz; run; %end; %end; title2 ; %skip: %mend design; ********************************************************************************* ********************************************************************************* ********************************************************************************* *********************************************************************************; %macro reg(dset,yvarlist,xxmod,opt= ,by=, deg=1, class=, lof=NO ,plottype=gplot, indvar=, transtype=, transvalue=0, diagnostics=ON, command=,commands=); %local var1zz var2zz bystatzz numinf myps numxvar lofyes lofclass lofterm lofmodel purerror puredf varselect whatmodel xxmod glmmod yvar yvarnum t_ clistzz maxyaxis stepyaxis; %global printdebugzz; ************ check for syntax errors that need immediate stop; ************* check datasets; %let serror=N; %if %length(&dset)=0 %then %do; %let serror=Y; %put ERROR: Dataset name must be first argument.; %end; %if %sysfunc(exist(&dset))=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not exist.; %end; %if %length(&yvarlist)=0 %then %do; %let serror=Y; %put ERROR: Dependent variable(s) must be given.; %end; %if &serror = Y %then %do; %put %nrstr( %REG will not be run.); %goto quitstop1; %end; %let serror=N; data _null_; length xvar $ 1024; transtype=upcase("&transtype"); if transtype in('' 'LOG' 'SQRT' 'FROM_PH' 'LOG10' 'ARSINSQRT' 'ARCSINSQRT' 'POWER' 'BOXCOX') then error='N'; else do; error='Y'; put "ERROR: Unsupported transtype option, &transtype."; end; if index("&opt","noint")>0 and %length(&xxmod)=0 then do; error='Y'; put "ERROR: No model was specified."; end; xvar=trim("&dset"); dsid=open(xvar,'I'); if dsid=0 then do; error='Y'; put "ERROR: Dataset &dset does not exist." ; end; else do; ii=1; xvar=scan("&xxmod",ii,' |*()@'); do while(xvar>''); ***does X variable exist?; if anydigit(xvar) > 1 then do; **skip @2 syntax; chk3=varnum(dsid,xvar); if chk3<1 then do; error='Y'; put "ERROR: Model variable " xvar "not in dataset."; end; end; ii+1; xvar=scan("&xxmod",ii,' |*()@'); end; ii=1; xvar=scan("&yvarlist",ii); do while(xvar>''); ***does Y variable exist?; chk3=varnum(dsid,xvar); if chk3<1 then do; error='Y'; put "ERROR: Dependent variable " xvar "not in dataset."; end; ii+1; xvar=scan("&yvarlist",ii); end; ii=1; xvar=scan("&class",ii); do while(xvar>''); ***does class variable exist?; chk3=varnum(dsid,xvar); if chk3<1 then do; error='Y'; put "ERROR: Class variable " xvar "not in dataset."; end; ii+1; xvar=scan("&class",ii); end; end; call symput('serror',error); run; %if &serror = Y %then %do; %put %nrstr( %REG will not run.); %goto quitstop1; %end; proc printto log=user.dsetzz; *** turn off log by dumping to temp file; run; proc optsave out=optionszz; ** save user option settings; run; options nonotes nomprint; data optionszz; set optionszz; if optname in ('NOTES' 'MPRINT'); ***only save options that are changed; run; proc printto ; ** usual logging back on; run; **** set by processing and temp dataset; data dsetzzz; set &dset; OrigObsNumzz=_n_; ***store original order; run; %let bystatzz=; %if %length(&by) >0 %then %do; proc sort data=dsetzzz; by &by; %let bystatzz= by &by ; %end; *********get model information **********; %let whatmodel=blank; %let varselect=NO; %let lofreg=; %let dummyzz=; %let lofmodzz=; %let xvar1zz=; %let xvar2zz=; data _null_; length xxmodel xxclass xxreg $ 2048; **** pull out variable selection option; find=index(upcase("&opt"),"SELECTION"); if find>0 then do; stmt=substr("&opt",find); option=upcase(scan(stmt,2,'= ')); call symput("varselect",compress(option)); end; **** get unique list of reg vars in model; xxmodel=propcase(symget('xxmod'),' /\=+-*'||'09'x); xxclass=propcase(symget('class')); **** set flag for dummy reg; if xxclass > ' ' then call symput("Dummyzz", "yes"); xxreg=''; xx=scan(xxmodel,1); ii=1;bad=1; do while (xx ne ''); find=indexw(xxclass,xx); if find>0 then do; bad+1; **dont count class vars; end; else do; if indexw(xxreg,xx)=0 then do; xxreg=compbl(xxreg||' '||xx); * name=compress('var'||ii-bad+1||'zz'); * call symput(name,compress(xx)); end; else bad+1; **already found; end; ii=ii+1; xx=scan(xxmodel,ii); end; call symput("xxreglist", xxreg); call symput("numxvar",compress(ii-bad)); *** may need first two vars for plotting; xx=scan(xxreg,1); if xx>'' then do; call symput("xvar1zz",compress(xx)); xx=scan(xxreg,2); if xx>'' then call symput("xvar2zz",compress(xx)); end; run; **** decode LOF information; %if %upcase(&lof)=YES %then %do; data _null_; length xx1 xx2 xx3 xx4 xx $ 2048; length xxmodel $ 2048; xxmodel=symget('xxmod'); drop xxmodel; call symput('lofyes','yes'); xx=scan(xxmodel,1); ii=1; xx1=''; xx2=''; xx4=''; kkzz=0; do while (xx ne ''); **check if xx is in class=, so not used here; if indexw("&class",xx," *()")=0 then do; xx3=compress('lof'||xx); if indexw(xx1,xx3)=0 then do; **add if unique; kkzz+1; **counts reg variables; xx4=compbl(xx4||' '||xx); *list of reg vars; xx1=compbl(xx1||' '||xx3); xx2=compress(xx2||'*lof'||xx); end; end; ii=ii+1; xx=scan(xxmodel,ii); end; call symput('lofreg',xx4); call symput('lofclass',xx1); xx=compress("&class"); if xx > ' ' then xx1=compress(substr(xx2,2)||'*'||translate("&class",'*',' ')); else xx1=compress(substr(xx2,2)); call symput('lofterm', xx1); xx=xx4; if kkzz=1 and "&whatmodel" ne "dummy" then do ii=2 to ° xx=compress(xx||'|'||xx4); end; else xx=xxmodel; call symput('lofmodel',xx); run; data dsetzzz; set dsetzzz; **** create class variables for each regressor; array lzz &lofclass; array xzz &lofreg; do over lzz; lzz=xzz; end; run; %end; data _null_; *** flag response surface model, and create model for GLM containing interactions; length glmmod $ 2048; length xxmodel $ 2048; xxmodel=symget('xxmod'); if &numxvar>1 and °=2 then call symput('whatmodel' ,'RespSurf'); if &numxvar=1 and ° >1 then call symput('whatmodel' ,'Polynomial'); xx=compbl(xxmodel); xx=translate(trim(xx),'|',' '); glmmod=xx; do jj=2 to ° glmmod=compbl(glmmod||'|'||xx); end; call symput('glmmod',glmmod); *** get names and model expressions ***; ii=1; xx=scan(xxmodel,1); zz1=compress('lof'||xx); do while(xx ne ''); var=compress('var'||ii||'zz'); call symput(var,xx); ii=ii+1; xx=scan(xxmodel,ii); if xx ne '' then do; zz1=compress(zz1||'*lof'||xx); end;end; zz2=translate(zz1,' ','*'); call symput('lofmodzz',zz1); **needs to be lofx*lofw*lofz; call symput('clistzz',zz2); **needs to be lofx lofw lofz; run; %if &numxvar>1 and %length(&class)=0 and &varselect=NO and &lofyes ^= yes %then %let opt= &opt collin vif stb pcorr2;; %if &printdebugzz=1 %then %do; %put ModelType= &whatmodel LOF=&lofyes Dummy=&Dummyzz VariableSelection=&varselect; %put glmmodel &glmmod; %put lofmodzz &lofmodzz; %put xxmod &xxmod; %put clistzz &clistzz; %put lofmodel &lofmodel; %put lofclass &lofclass; %put lofterm &lofterm; %put lofreg &lofreg; %put reg var list = &xxreglist; %put number x vars = &numxvar; %put xvar1zz xvar2zz &xvar1zz &xvar2zz; %put Model Options= &opt; %end; **************************************************************************; ***** run entire macro on each dep variable*****; %let yvarnum=1; %let yvar=%scan(&yvarlist,&yvarnum); %do %until (&yvar=); **if this yvar run fails, make sure old datasets are not used; proc datasets nodetails nolist; delete tempzz residzz misszz red1zz; quit; *********************************; ********create transform*********; %let transtypezzz= &transtype; **for titles; %if %length(&transtype) > 0 %then %do; %let t_=tzz; %if %upcase(&transtype)=BOXCOX %then %do; ods exclude all; proc glm data=dsetzzz ; &bystatzz; %if &class > ' ' %then class &class;; model &yvar = &xxmod / &opt; output out=residzz r=resid; run; proc means noprint data=residzz; var resid; output out=tempzz min=minresid; run; data tempzz; set tempzz; zzzz=1; run; data residzz; set residzz; zzzz=1; run; data residzz; merge residzz tempzz; by zzzz; resid=resid-minresid+1; drop _type_; run; ods listing; %if &transvalue=0 %then %let transvalue= -3 to 3 by 0.1; proc transreg data=residzz details pbo ; title2 'Box-Cox search for optimal power transformation'; ods output boxcox = tempzz ; %if &class > ' ' %then model boxcox(resid / convenient lambda=&transvalue) = class(&class)| identity(); %else model boxcox(resid / convenient lambda=&transvalue) = identity(&xxmod);; ods exclude details; run; data tempzz; set tempzz end=aa; format _all_; retain flagzz 0; if CI='*' and flagzz=0 then do; flagzz=1; **found lower CL; call symput("minzz",lambda); end; if flagzz=1 and CI='' then do; flagzz=2; **found upper CL; call symput("maxzz",lambda); end; if CI='<' then call symput("optzz",lambda); if aa and flagzz ne 2 then call symput("maxzz",lambda);**if CI goes to max; run; proc gplot; &symb1zz; plot loglike*lambda/href=&minzz &optzz &maxzz vaxis=axis1 haxis=axis2; run;quit; %let check4=0; data tempzz; set tempzz; where CI='<'; if _n_>0 then do; call symput('check4',4); call symput('optpow',lambda); end; run; %if &check4=0 %then %do; %put Optimal transformation not found. Rerun with transvalue wider than &transvalue.; %goto quitstop; %end; ***set to optimum transformation found; %let transtype=POWER; %let transvalue= &optpow; %end; data dsetzzz; set dsetzzz; trtypezz=upcase("&transtype"); valuezz=1*&transvalue; drop trtypezz valuezz; if trtypezz='LOG' then do; tzz&yvar=log(&yvar+valuezz); end; if trtypezz='SQRT' then do; tzz&yvar=sqrt(&yvar+valuezz); end; if trtypezz='FROM_PH' then do; tzz&yvar=10**(valuezz-&yvar); end; if trtypezz='LOG10' then do; tzz&yvar=log10(&yvar+valuezz); end; if trtypezz in('ARSINSQRT' 'ARCSINSQRT') then do; if valuezz=0 then tzz&yvar=arsin(sqrt(&yvar)); else tzz&yvar=arsin(sqrt(&yvar/valuezz)); end; if trtypezz='POWER' then do; tzz&yvar=&yvar**valuezz; call symput("transtypezzz", compress("Power("||valuezz||")")); end; run; %end; %else %let t_=; ; *********** main analysis ************************; %if &lofyes= yes %then %do; ***********************************************************************; *************** response surface with lack of fit *********************; %if &whatmodel=respsurf %then %do; proc means noprint data=dsetzzz; var &var1zz &var2zz; output out=zzz min=var1min var2min max=var1max var2max; run; data misszz; set zzz; drop _type_ _freq_ var1min var2min var1max var2max; do &var1zz=var1min to var1max; do &var2zz=var2min to var2max; &t_.&yvar=.; output; end;end; run; data tempzz; set dsetzzz misszz; proc sort; by &xxmod; title2 "Response surface regression of &xxmod explaining &transtypezzz &yvar"; options mprint notes; proc rsreg data=dsetzzz ; &bystatzz; model &t_.&yvar = &xxmod/ nocode lackfit; &command &commands run; options nomprint nonotes; proc glm data=tempzz; &bystatzz; ods exclude all; class &lofclass ; model &t_.&yvar=&glmmod@2 &lofterm /ss1 &opt ; ods output overallanova=tempzzz; run; quit; title3 'Use these tests if LOF was significant'; proc print data=tempzzz; run; data tempzzz; set tempzzz; if source='Error' then do; call symput('purerror',compress(ms) ); modelterms= ° * &numxvar + &numxvar * (&numxvar-1)/2; ddf= repeat(left(df),modelterms); call symput('puredf',ddf ); end; run; proc mixed noprofile data=tempzz; &bystatzz; title3 'Use these parameter estimate results if LOF was significant'; model &t_.&yvar= &glmmod@2 /solution int ddf=&puredf ; parms (&purerror)/noiter; ods exclude all; ods output solutionf=residzz; run; proc print data=residzz; ods listing; run; proc glm data=tempzz; &bystatzz; title3 'Analysis without LOF for diagnostics'; model &t_.&yvar =&glmmod@2 /&opt ; output out=residzz p=PredictedValue stdp=StdErrMeanPred stdi=StdErrIndPred r=Residual stdr=StdErrResidual u95=UpperCL l95=LowerCL u95m=UpperCLMean l95m=LowerCLMean student=StudentResidual cookd=CooksD dffits=Dffits; run;quit; ods listing; ***************************************; *** plots; %if %upcase(&plottype) ^= GPLOT %then %do; proc plot data=residzz; &bystatzz; title3 'Contour plot of first 2 variables response surface'; plot &var1zz*&var2zz=predictedvalue /contour=5; run; %end; %if %upcase(&plottype)=GPLOT %then %do; proc g3grid data=residzz out=tempzz; &bystatzz; grid &var1zz*&var2zz=PredictedValue / naxis1=21 naxis2=21 join; run; proc gcontour data=tempzz; &bystatzz; title3 'Contour plot of first 2 variables response surface'; plot &var1zz*&var2zz=PredictedValue / ; run;quit; proc g3d; &bystatzz; plot &var1zz*&var2zz=PredictedValue /grid; run;quit; %end; data residzz; set residzz; where &t_.&yvar ne .; *** after plotting, dump filler obs; format _all_; observation=_n_; if cooksd>2 then pinfluence=50; else pinfluence=0; if dffits>2 then pinfluence=pinfluence+50; run; %end; %else %do; *********LOF single, multiple, polynomial, dummy**********; options mprint notes; proc glm data=dsetzzz; &bystatzz; %if &Dummyzz= yes %then %do; title2 "Lack of Fit Dummy Regression model for &transtypezzz &yvar"; %end; %else %do; %if °=1 %then %do; title2 "Lack of Fit Multiple Regression, &xxmod -> &transtypezzz &yvar";; %end; %else %do; title2 "Lack of Fit Polynomial Model, &xxmod -> &transtypezzz &yvar";; %end; %end; class &lofclass &class; model &t_.&yvar=&&lofmodel &lofterm /ss1 &opt ; &command &commands ods output overallanova=tempzzz; run; quit; options nomprint nonotes; ***** check if this ran, abort if not; %let chk4=0; data _null_; set tempzzz; if _n_>0 then do; call symput("chk4","3"); stop; end; run; %if &chk4=0 %then %do; %put ERROR: Analysis of &yvar FAILED. Check statement, model and/or dep variables and rerun.;; %goto quitstop; %end; data tempzzz; set tempzzz; if source='Error' then do; call symput('purerror',compress(ms) ); ddf= repeat(left(df),°); call symput('puredf',ddf ); end; run; data _null_; **fix intercept options for mixed; if indexw("&opt", "noint")=0 then call symput("mixopt","int"); else call symput("mixopt","&opt"); run; proc mixed noprofile data=dsetzzz; &bystatzz; title3 "Parameter estimates with correct std. errors based on pure error"; class &class; model &t_.&yvar=&&lofmodel /solution &mixopt ddf=&puredf ; parms (&purerror)/noiter; ods exclude all; ods output solutionf=residzz; run; proc print data=residzz; ods listing; run; proc glm data=dsetzzz; &bystatzz; title3 'Analysis without lack of fit for R-square, diagnostics...'; class &class; model &t_.&yvar=&&lofmodel /ss1 ss3 &opt ; output out=residzz p=PredictedValue stdp=StdErrMeanPred stdi=StdErrIndPred r=Residual stdr=StdErrResidual u95=UpperCL l95=LowerCL u95m=UpperCLMean l95m=LowerCLMean student=StudentResidual cookd=CooksD dffits=Dffits; ods exclude parameterestimates; ods exclude nobs; run; quit; data residzz; set residzz; format _all_; observation=_n_; if cooksd>2 then pinfluence=50; else pinfluence=0; if dffits>2 then pinfluence=pinfluence+50; run; %end; *******************************end of LOF ***********************; *****************************************************************; %end; %else %do; %if &whatmodel=respsurf %then %do; ********************************************************************; ********* response surface ********************************; ****create grid of values for prediction surface; proc means noprint data=dsetzzz; var &var1zz &var2zz; output out=zzz min=var1min var2min max=var1max var2max; run; data misszz; set zzz; drop _type_ _freq_ var1min var2min var1max var2max step1 step2; step1=(var1max-var1min)/10; step2=(var2max-var2min)/10; do &var1zz=var1min to var1max by step1; do &var2zz=var2min to var2max by step2; &t_.&yvar=.; output; end;end; run; data tempzz; set dsetzzz misszz; proc sort; by &xxmod; title2 "Response surface regression of &xxmod explaining &transtypezzz &yvar"; options mprint notes; proc rsreg data=dsetzzz ; &bystatzz; model &t_.&yvar = &xxmod/ nocode ; &command &commands run; options nomprint nonotes; proc glm data=tempzz; &bystatzz; ***run glm for diagnostics, no need for any results; ods exclude all; model &t_.&yvar =&glmmod @2 ; &command &commands output out=residzz p=PredictedValue stdp=StdErrMeanPred stdi=StdErrIndPred r=Residual stdr=StdErrResidual u95=UpperCL l95=LowerCL u95m=UpperCLMean l95m=LowerCLMean student=StudentResidual cookd=CooksD dffits=Dffits; run;quit; ods listing; data residzz; set residzz; format _all_; observation=_n_; if cooksd>2 then pinfluence=50; else pinfluence=0; if dffits>2 then pinfluence=pinfluence+50; run; *** contour plot; %if %upcase(&plottype) ^= GPLOT %then %do; proc plot data=residzz; &bystatzz; title3 'Contour plot of first 2 variables response surface'; plot &var1zz*&var2zz=PredictedValue /contour=5; run; %end; %if %upcase(&plottype)=GPLOT %then %do; proc g3grid data=residzz out=tempzz; &bystatzz; grid &var1zz*&var2zz=PredictedValue / naxis1=21 naxis2=21 join; run; proc gcontour data=tempzz; &bystatzz; title3 'Contour plot of first 2 variables response surface'; plot &var1zz*&var2zz=PredictedValue ; run;quit; proc g3d; &bystatzz; plot &var1zz*&var2zz=PredictedValue /grid; run;quit; %end; data residzz; set residzz; where &t_.&yvar ne .; *** after plotting, get rid of filler; run; %end; %else %do; %if °>1 or &Dummyzz=yes %then %do; ******************************************************; ********* Plain Polynomial ***************************; options mprint notes; proc glm data=dsetzzz; &bystatzz; %if °>1 %then %do; title2 "Polynomial Regression of &xxmod@° -> &transtypezzz &yvar"; model &t_.&yvar=&glmmod @ ° /ss1 ss3; %end; %if &Dummyzz=yes %then %do; ******************************************************; ********* Dummy Regression ***************************; title2 "Dummy Regression explaining &transtypezzz &yvar"; class &class; model &t_.&yvar=&xxmod/solution &opt; %end; &command &commands output out=residzz p=PredictedValue stdp=StdErrMeanPred stdi=StdErrIndPred r=Residual stdr=StdErrResidual u95=UpperCL l95=LowerCL u95m=UpperCLMean l95m=LowerCLMean student=StudentResidual cookd=CooksD dffits=Dffits; ods exclude nobs; run;quit; options nomprint nonotes; data residzz; set residzz; format _all_; observation=_n_; if cooksd>2 then pinfluence=50; else pinfluence=0; if dffits>2 then pinfluence=pinfluence+50; run; %end; %else %do; ************************************************************; ********* simple or multiple regression*************************; %if &numxvar=1 %then title2 "Simple Linear Regression of &xxmod -> &transtypezzz &yvar"; %else %if &numxvar< 7 %then title2 "Multiple Regression of &xxmod -> &transtypezzz &yvar"; %else title2 "Multiple Regression of &numxvar X variables -> &transtypezzz &yvar";; %if &varselect^=NO %then %do; ***************variable selection section; %let opt= CP &opt; **always give Cp statistics; data _null_; **set axis for cp plot; maxyaxis= 5*&numxvar; stepyaxis=floor(&numxvar/2); call symput("maxyaxis",maxyaxis); call symput("stepyaxis",stepyaxis); run; options notes mprint; proc reg data=dsetzzz; &bystatzz; model &t_.&yvar = &xxmod/ &opt; &command &commands plot cp.*np. / chocking=red cmallows=blue nomodel nostat vaxis=0 to &maxyaxis by &stepyaxis; ods output subsetselsummary=cpzzzz; run; quit; options nonotes nomprint; %if &varselect=CP %then %do; data cpzzzz; set cpzzzz; where cp2 then flag=flag+1; end; keep pinfluence ; pinfluence=100*flag/&numinf; run; data residzz; merge dsetzzz predzz(keep=StdErrIndPred) residzz red1zz; format _all_; run; %end; %end; %end;%end; ***all analysis done***; ***************************************************************; ****** check if residuals created, abort if not; %let chk4=0; data _null_; set residzz; if _n_>0 then do; call symput("chk4","3"); stop; end; run; %if &chk4=0 %then %do; %put ERROR: Analysis of &yvar FAILED. Check statement, model and/or dep variables and rerun.;; %goto quitstop; %end; ***************************************************************; %if &diagnostics=ON and &varselect=NO %then %do; ********diagnostic output common to all models ****************; proc print noobs; where observation <11 or pinfluence>50 or abs(studentresidual)>2 or cooksd>3; title3 "Residuals, etc. for first 10, PLUS problem observations"; run; %if %length(&var1zz)>0 %then %do; %if %upcase(&plottype) ^= GPLOT %then %do; proc plot nolegend; &bystatzz; title3 'Graph of observed data (o), predicted line (p)'; title4 'and 95% CI for MEAN(m) and Individual(i) prediction'; plot predictedvalue*&var1zz='p' (lowerclmean upperclmean)*&var1zz='m' (lowercl uppercl)*&var1zz='i' &t_.&yvar * &var1zz='o' /overlay; run; proc plot; &bystatzz; title3 "Graph of observed data against predicted values"; %if %length(&class)>0 %then plot &t_.&yvar * predictedvalue=%scan(&class,1) ; %else plot &t_.&yvar * predictedvalue='p' ;; run; %end; %if %upcase(&plottype)=GPLOT %then %do; proc sort data=residzz; by &by &var1zz; run; proc gplot ; &bystatzz; &symb4zz; title3 'Graph of observed data, predicted line'; title4 'and 95% CI for Mean and Individual prediction'; plot predictedvalue*&var1zz (lowerclmean upperclmean)*&var1zz (lowercl uppercl)*&var1zz &t_.&yvar * &var1zz /overlay vaxis=axis1 haxis=axis2; run;quit; %if %length(&class)>0 %then %do; ***dummy regression plots; proc gplot; &bystatzz; &symb5zz; title3 "Graph of predicted lines"; plot predictedvalue*&var1zz=%scan(&class,1) /vaxis=axis1 haxis=axis2 ; run;quit; proc gplot; &bystatzz; &symb6zz; title3 "Graph of observed data against predicted values"; plot &t_.&yvar * predictedvalue=%scan(&class,1) /vaxis=axis1 haxis=axis2 ; run;quit; %end; %else %do; proc gplot; &bystatzz; &symb8zz; title3 "Graph of observed data against predicted values"; plot &t_.&yvar * predictedvalue /vaxis=axis1 haxis=axis2 ;; run;quit; %end; %end;%end; %if %length(&xxmod)>0 %then %do; title3 'Residual Plot against X variable'; %if %upcase(&plottype) ^= GPLOT %then %do; %let myps=%sysfunc(getoption(PAGESIZE)); %if &myps lt 40 %then proc plot;; %if 40 le &myps lt 60 %then proc plot vpercent=50;; %if &myps ge 60 %then proc plot vpercent=33;; &bystatzz; plot residual * (&xxmod)/vref=0; run; %end; %if %upcase(&plottype)=GPLOT %then %do; proc gplot; &bystatzz; &symb9zz; plot residual * (&xxmod)/vref=0 vaxis=axis1 haxis=axis2 ; run;quit; %end; %end; proc sort data=residzz; by &by OrigObsNumzz; **insure original data order; &symb0zz; **symbol definitions for univariate; proc univariate plot normal plotsize=18; &bystatzz; title3 'Overall check on normality'; ods exclude quantiles testsforlocation; ods exclude basicmeasures moments; ods exclude fitquantiles goodnessoffit; ods exclude parameterestimates; var residual; %if %upcase(&plottype)=GPLOT %then %do; probplot /normal(mu=est sigma=est w=3) waxis=3 href=50 vref=0 square lhref=1 lvref=1; qqplot /normal(mu=est sigma=est w=3) waxis=3 href=0 vref=0 square lhref=1 lvref=1; histogram /normal(mu=est sigma=est w=3) waxis=3 wbarline=3 ; %end; run; ods listing; %end; **diagnostics=off; %let yvarnum=%eval(&yvarnum+1); ****next y variable; %let yvar=%scan(&yvarlist,&yvarnum); %end; ***yvar loop; %quitstop: proc optload data=optionszz; run; %quitstop1: %mend reg; ****************************************************************************** ****************************************************************************** ****************************************************************************** ******************************************************************************; %macro orthpoly(trtvals, numobs=); title2 "Orthogonal Polynomial Coefficients"; Proc IML; reset fw=12 fuzz noname nolog noautoname; x = {&trtvals}; ntrt = ncol(x); start='Contrast'; end=';'; trt='Treat'; quote="'"; %if %length(&numobs)>0 %then n = {&numobs}; %else n=j(1,ntrt,1);; polynomials={'Linear', 'Quadratic', 'Cubic', 'Quartic', 'Quintic', '6th','7th','8th','9th','10th','11th','12th','13th','14th'}; c = t(orpol(x, ntrt,n)); c = c#n; ***attempt to account for weights; div=abs(c+(abs(c)<1e-13)); div=div[,><] * j(1,ntrt,1); c=c/div; rowlab=""; collab=j(1,ntrt,""); do ii=2 to ntrt; label=rowcatc(quote||polynomials[ii-1,1]||quote); print start label trt (c[ii,]) [colname=collab rowname=rowlab] end; end; print "Checks that contrasts coefficients sum to zero", (c[2:ntrt,+]); quit; %mend orthpoly; /* **run this to get a table of orthogonal polynomial coefficients for balanced experiments.; proc iml; reset fw=12; reset autoname nocenter; do trt=2 to 10; x=1:trt; c = t(orpol(x)); c=round(c[2:trt,],1.e-12); **attempt to get integers by dividing by smallest nonzero number; **does not work when the smallest coefficient is not 1; div=abs(c+(abs(c)=0)); div=div[,><] * j(1,trt,1); c=c/div; print 'Col1 is # Treat, Col2 is polynomial power, followed by coefficients'; treat=j(trt-1,1,trt); power=t(1:trt-1); print (treat|| power|| c); end; quit; ***another way to get orthpolys, but does not allow unequal weighting; data one; array yyy y1-y5; do rep=1 to 3; do over yyy; yyy=rannor(0); output; end;end; run; proc glm; class treat; model y1-y5 = ; repeated treat (10 20 30 40 50) polynomial /printm; run;quit; */ ****************************************************************************** ****************************************************************************** ****************************************************************************** ******************************************************************************; %macro variogram(dset,yvar,xcoor, ycoor, lagdist=, lagmax=, nugget=0,range=10,sill=50, model=, maternv=1, plotmaxvar=1.e90, mround=.1); *********check arguments; %let serror=N; %if %length(&dset)=0 | %length(&yvar)=0 | %length(&xcoor)=0 | %length(&ycoor)=0 %then %do; %let serror=Y; %put ERROR: 4 arguments are required, in the order dataset, dep. variable, x coordinate, y coordinate.; %end; %else %do; %if %sysfunc(exist(&dset))=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not exist.; %end; %else %do; %let dsid=%sysfunc(open(&dset,I)); %let chk3=%sysfunc(varnum(&dsid,&yvar)); %if &chk3=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not contain dependent variable &yvar.; %end; %let chk3=%sysfunc(varnum(&dsid,&ycoor)); %if &chk3=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not contain coordinate variable &ycoor.; %end; %let chk3=%sysfunc(varnum(&dsid,&xcoor)); %if &chk3=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not contain coordinate variable &xcoor.; %end; %let chk3=%sysfunc(attrn(&dsid,nobs)); %if &chk3=0 %then %do; %let serror=Y; %put ERROR: Dataset &dset does not have any observations; %end; %let dsid=%sysfunc(close(&dsid)); %end; %end; %if &serror = Y %then %do; %put ERROR: Specification errors caused Variogram to not run.; %goto fail; %end; *****initial run****************************; %if %length(&lagmax)=0 %then %do; proc gplot data=&dset ; &symb7zz; plot &xcoor * &ycoor=6 ; title2 'Scatter Plot to make sure data covers region of interest'; run ; quit; **** Produce an initial histogram of inter-pair distances ****; proc variogram data=&dset outdistance = outdistzz ; compute novariogram nhc=15; coordinates xc=&ycoor yc=&xcoor ; var &yvar ; run ; **** Chart of initial histogram ****; proc gchart data=outdistzz ; &symb7zz; vbar LB / discrete vref=30 maxis=axis3 raxis=axis4 freq freq=count ; title2 'Distribution of Pairwise Distances' ; title3 "Lag distance = &lagdist, Lag Max = &lagmax"; run ; quit; %end; %else %do; ***************************************subsequent runs************; ***if lagdist is not set, check if previous outdistzz is available to compute a value; ***Otherwise, just use 1. Issue Error anyway; %if %sysevalf(&lagdist<=0) %then %do; %if %sysfunc(exist(outdistzz)) %then %do; data outdistzz; set outdistzz; by LB; if last.LB then do; lagd=UB/&lagmax; call symput("Lagdist",lagd); end; %end; %else %do; %let lagdist=1; %end; %put ERROR: Lag Distance must be given, lagdist=. A value of &lagdist is assumed.; %end; %let lagdisti=%sysevalf(&lagdist/2); ** variogram interprets bin size differently; %let numbars= %sysevalf(&lagmax/&lagdist,ceil); %let lagmaxi=%eval(&lagmax+1); ************************************************************** **** Compute the variogram, given the histogram interval ****; **** and maximum distance. **************************************************************; **** Chart of current histogram ****; proc variogram data=&dset outdistance = outdistzz ; compute novariogram maxlag=&lagmaxi lagdistance=&lagdisti nhc=&numbars lagtol=0 ; coordinates xc=&ycoor yc=&xcoor ; var &yvar ; run ; data outdistzz; set outdistzz; if LB > &lagmax * &lagdist then delete; if count=0 then count=1; run; proc gchart data=outdistzz ; &symb7zz; vbar LB / discrete vref=30 freq=count freq maxis=axis3 raxis=axis4; title2 'Distribution of Pairwise Distances' ; title3 "Lag distance = &lagdist, Lag Max = &lagmax"; run ; quit; *******variogram calculations; proc variogram data=&dset outvar = outvarzz ; compute lagdistance=&lagdisti maxlag=&lagmaxi robust ; coordinates xc=&ycoor yc=&xcoor ; var &yvar ; run ; **** Add in theoretical semivariogram for plotting ****; data outvarzz ; set outvarzz ; cnug=&nugget; crang=⦥ cvar=&sill; model=upcase("&model"); model1=substr(model,1,3); if model1='' then modtype=0; if model1 in ('GAU') then modtype=1; if model1 in ('EXP') then modtype=2; if model1 in ('SPH') then modtype=3; if model1 in ('LIN' ) then do; if model='LINEAR' then modtype=4; if model='LINLOG' then modtype=14; end; if model1 in ('MAT') then do; if model='MATERN' then modtype=5; if model='MATHSW' then modtype=15; end; if model1 in ('POW') then modtype=6; if modtype=. then put "ERROR: Spatial model not recognized"; if _n_=1 then do; *** fill in zero distance; hold=distance; distance=0; vari=cnug; %if %length(&model)=0 or %upcase(&model)=SPHERE %then %do; vtype='Spherical'; output; %end; %if %length(&model)=0 or %upcase(&model)=EXP %then %do; vtype = 'Exponential'; output ; %end; %if %length(&model)=0 or %upcase(&model)=GAUSS %then %do; vtype = 'Gaussian'; output ; %end; distance=hold; end; if modtype in (0,3) then do; if distance le (crang) then vari = cnug + cvar*(1.5*distance/(crang)-.5*(distance/(crang))**3); else vari=cnug+cvar; vtype='Spherical'; output; end; if modtype in (0,2) then do; vari = cnug + cvar*(1-exp(-distance/crang)) ; vtype = 'Exponential'; output ; end; if modtype in (0,1)then do; vari = cnug + cvar*(1-exp(-distance*distance/(crang*crang))) ; vtype = 'Gaussian'; output ; end; if modtype= 6 then do; vari = cnug + cvar*(1-crang**distance) ; vtype = 'Power'; output ; end; if modtype=4 then do; vari = cnug + cvar*(crang*distance) ; vtype = 'Linear'; output ; end; if modtype=14 then do; vari = cnug + cvar*(crang*log(distance)) ; vtype = 'LinLog'; output ; end; if modtype=5 then do; vari = cnug + cvar*(1-(.5*distance/crang)**matrenv * 2*ibessel(2,maternv,0)*(distance/crang)/gamma(maternv)); vtype = 'Matern'; output ; end; if modtype=15 then do; vari = cnug + cvar*(1-(distance*sqrt(maternv)/crang)**matrenv * 2*ibessel(2,maternv,0)*(2*distance*sqrt(maternv)/crang)/gamma(maternv)); vtype = 'MatHSW'; output ; end; ****actual data; vari = variog ; vtype = '1regular' ; output ; vari = rvario ; vtype = '0robust' ; output ; run ; **** Plot all the semivariograms ****; proc gplot data=outvarzz ; *****NOTE WHERE********; where vari<&plotmaxvar; &symb7zz; plot vari*distance=vtype / vaxis=axis2 haxis=axis1 vzero hzero; title2 "Theoretical and Sample Semivariogram for &yvar" ; title3 "Nugget=&nugget Range=&range Sill=&sill"; run ; quit; %end; %fail: %mend variogram; ****************************************************************************** ****************************************************************************** ****************************************************************************** UPOWER************************************************************************; /* ---------------------------------------------------------------------- FOR INFORMATION AND LATEST VERSION: http://www.bio.ri.ccf.org/UnifyPow ---------------------------------------------------------------------- UnifyPow.sas, macro version, 2002.08.17a 2002 Copyright (c) by Ralph G. O'Brien, PhD Department of Biostatistics and Epidemiology Cleveland Clinic Foundation Cleveland, OH 44195 Voice: 216-445-9451 robrien@bio.ri.ccf.org This single file contains all components of UnifyPow. You simply put it in an appropriate directory on your system and %include it in your SAS run. It is distributed so that it runs in the way I have been demonstrating in my workshops. --------------------------------------------------- EXPERIENCED SAS USERS MAY CUSTOMIZE UnifyPow OUTPUT --------------------------------------------------- All results from each UnifyPow problem are stored in a temporary SAS data set called PowData. Knowing this, experienced SAS users may easily customize their output by merging results from two or more problems and by using their own PROC TABULATE or SAS/GRAPH code. The examples.sas file I distribute should have one or two examples of this. I recommend that you first examine the structure of PowData by just seeing what it holds. ---------------------- UnifyPow LEGAL NOTICES ---------------------- THIS SOFTWARE IS MADE AVAILABLE "AS IS". UnifyPow is a trademark of Ralph G. O'Brien. No commercial use of this trademark may be made without prior written permission of Ralph O'Brien. All UnifyPow software and its included text and accompanying documentation are Copyright 1997 by Ralph G. O'Brien. Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee to Ralph O'Brien is hereby granted, provided that these legal notices appear in all copies and supporting documentation, that the name "UnifyPow" is retained, and that the names of Ralph O'Brien and the Cleveland Clinic Foundation are not used in advertising or publicity pertaining to distribution of the software without the specific, written prior permission of Ralph O'Brien. Although the above trademark and copyright restrictions do not convey the right to redistribute derivative works, Ralph O'Brien encourages unrestricted distribution of patches or ancillary code which can be applied to or used in conjunction with Ralph O'Brien's distribution. If this software is modified for local use, please denote this on all modified versions of the software by appending the letter "L" to the current version number and by noting the changes in the code itself and in the associated documentation. RALPH G. O'BRIEN AND THE CLEVELAND CLINIC FOUNDATION DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, WITH REGARD TO THIS SOFTWARE, INCLUDING WITHOUT LIMITATION ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, AND IN NO EVENT SHALL RALPH G. O'BRIEN OR THE CLEVELAND CLINIC FOUNDATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA, OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, TORT (INCLUDING NEGLIGENCE) OR STRICT LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. ----------------------------------------- REPORTING PROBLEMS AND MAKING SUGGESTIONS ----------------------------------------- UnifyPow's computations are checked thoroughly using multiple methods: comparing its results to those obtained by using other software, to those in published tables and examples, and to results obtained from Monte Carlo simulation. No true bugs have yet been reported to me about UnifyPow's numerical accuracy, but no software can be said to be totally free of problems. I really do appreciate knowing if you encounter difficulties or have suggestions for improvements. Most of the additions to UnifyPow come about this way. On the other hand, I cannot promise to address everyone's queries, especially those that are mostly related to consulting advice. I correspond best via email. Do not modify this code without obtaining written permission from me. If changes are needed for some purpose, have the wisdom and courtesy to consult me. --------------- CITING THE WORK --------------- If you use this work, please cite it--something like ... using UnifyPow, a module for the SAS System (O'Brien, 1998). This reference is O'Brien, RG (1998), "A Tour of UnifyPow: A SAS Macro for Sample-Size Analysis," Proceedings of the 23rd SAS Users Group International Conference, Cary NC: SAS Institute, 1346-1355. Another key reference for this work is: O'Brien, RG and Muller, KE (1993), "Unified Power Analysis for t Tests through Multivariate Hypotheses," in Edwards, EK. (Ed.), Applied Analysis of Variance in the Behavioral Sciences, New York: Marcel Dekker, pp. 297-344. Actually, that chapter covers freeware called OneWyPow.sas and other %include modules that are now unified in UnifyPow.sas, which offers a simpler interface and a far larger set of methods. --------------------------------------- OBTAINING FREEWARE SAS CODE AND UPDATES --------------------------------------- UnifyPow is freeware distributed via http://www.bio.ri.ccf.org/UnifyPow a website at the Department of Biostatistics and Epidemiology at the Cleveland Clinic Foundation. Register! Downloading new versions periodically ensures that you are getting the latest version I feel is safe for public distribution. At the website, you may register your name and email address, so that I can notify you of new releases. */ /* ---------------------------- SOURCE CODE for tables macro ---------------------------- */ %macro tables; /* All kinds of tables for displaying results. Easy to customize. */ %if &DoTables = no %then %do; %let DoTables = yes; %goto donetabl; %end; /* The tables are constructed so that they will handle either "Power" or "NTotal" (TotalN or TotalPairs or PairsTotal) statements. */ %if &ResltVar = power %then %do; %let SpecVar = NTotal; %let result = %str(Power="Power"*mean=" "); %let spec = %str(Ntotal="Total N"); %let format = 9.4; %end; %if &ResltVar = NTotal %then %do; %let SpecVar = NomPower; %let result = %str(Ntotal="Total N"*mean=" "); %let spec = %str(NomPower="Minimum Power"); %let format = 9.0; %end; %if &TablType = tPow %then %do; /* all alphas in one table, does not give parent */ proc tabulate format = &format order=data; class alpha EffcTitl testtype &SDorCV &SpecVar scenario; format alpha &AlphaFmt; format NomPower &PowerFmt; var &ResltVar; table scenario="Scenario:"* EffcTitl="Effect:", alpha= "Alpha" * testtype="Type ", &SDorCV="&SDType" * &spec * &result /rtspace=35; %end; /* tpow */ %if &TablType = FPow %then %do; /* condensed tables for F tests */ proc tabulate format = &format order=data; class alpha EffcTitl testtype &SDorCV &SpecVar scenario; format alpha &AlphaFmt; format NomPower &PowerFmt; var &ResltVar; table scenario="Scenario:", EffcTitl="Test" * alpha="Alpha" * testtype="Type", &SDorCV="&SDType" * &spec * &result /rtspace=30; %end; /* TablType = FPow */ %donetabl: run; %mend tables; %macro ChkNull (ChkVar, Endxx); /* Macro to check for null scenario (effect size = 0). If so, sets power = alpha and skips regular computation. Required to trap cases with no power, so that FindN will work OK. */ if &ChkVar = 0 then do; if not(FindingN) then power = alpha; if FindingN then do; if NulMsgOn then put // "============================================================" / "NOTE: There appears to be a situation in which the effect is" / " a null one, thus no Ntotal will achieve desired power." / " Ntotal has been set to missing (.)." / "============================================================" //; Ntotal = .; NulMsgOn = 0; end; /* FindingN */ go to &Endxx; end; /* ChkVar */ %mend ChkNull; %macro CkCntrst (Cmat, Nrows, Ncols); do irowCC = 1 to &Nrows; sumCC = 0; do icolCC = 1 to &Ncols; sumCC = sumCC + &Cmat{irowCC, icolCC}; end; if abs(sumCC) gt .01 then put // "WARNING: Contrast row " irowCC " does not sum to 0.00."; end; %mend CkCntrst; %macro instruct (); **AMS changed so instructions saved to sas dataset, not external file; %global dfeformzz basiczz PrgmLinzz; %let dfeformzz=0; ***AMS initialize/reset these; %let basiczz=1; data upowerzz; infile DATALINES missover ; length statment $ 256; retain lincnt 0; input ; statment=left(_infile_); keep statment; **AMS find ErrorDF statement. need formula as macro; check=upcase(trim(statment)); if index(check,"ERRORDF") then do; basic=scan(check,2, " "); ptr=indexw(check,basic)+length(basic); formula=substr(check,ptr); call symput('basiczz',trim(basic)); call symput('dfeformzz',trim(formula)); end; lincnt=lincnt+1; call symput("PrgmLinzz",compress(lincnt)); %mend instruct; /* ++++ Begin UPw05_MtrxMacros segment v. 2002.07.01 ++++ */ /* Set of matrix routines for the data step: matDup ....... C = A %matAmnB(A, NrowA, NcolA, B, C) matDiagI ...... C = diag(vecA). Makes diagonal matrix from reciprocals of elements of a vector %matDiagI(vecA, dim_vecA, C) matAB ........ C = A*B %matAB(A, NrowA, NcolA, B, NcolB, C) C may replace A, not B matABt ....... C = A*B` %matABt(A, NrowA, NcolA, B, NrowB, C) C may replace A, not B. matAtB ....... C = A`*B %matAtB(A, NrowA, NcolA, B, NcolB, C) C may NOT replace A or B. matAtA ....... C = A`*A %matAtA(A, NrowA, NcolA, C) C may NOT replace A. matAv ....... C = A*vecB, where v is a 1-dimensional array %matAv(A, NrowA, NcolA, vecB, C) C may replace A. matAinv ...... C = inv(A) %matAinv(A, dimA, C, detA) C may replace A matInput ...... inputs matrix using UnifyPow syntax %matInput (NewMatrx, Nrows, Ncols) CkCntrst checks whether rows of coefficients sum to 0.00 %CkCntrst (C, Nrows, Ncols) */ %macro matDiagI(d, dim_d, A); /* Creates an dim_d x dim_d diagonal matrix, A, based on the reciprocals of the elements in the vector d. */ do i_rowA = 1 to &dim_d; do i_colA = 1 to &dim_d; if i_rowA = i_colA then &A{i_rowA, i_colA} = 1/&d{i_rowA}; else &A{i_rowA, i_colA} = 0; end; end; %mend matDiagI; %macro matAB(A, NrowA, NcolA, B, NcolB, AB); /* Multiplies an NrowA x NcolA matrix by an NcolA x NcolB matrix Returns AB, an NrowA x NcolB matrix. AB may replace A. Must declare TempVec{max NcolB} in the main section. */ do i_rowA = 1 to &NrowA; do i_colB = 1 to &NcolB; TempVec{i_colB} = 0; do i_colA = 1 to &NcolA; TempVec{i_colB} = TempVec{i_colB} + &A{i_rowA, i_colA}*&B{i_colA, i_colB}; end; end; do i_colB = 1 to &NcolB; &AB{i_rowA, i_colB} = TempVec{i_colB}; end; end; %mend matAB; %macro matAv(A, NrowA, NcolA, v, Av); /* Multiplies NrowA x NcolA matrix by NcolA-element vector. Returns Av, an NrowA x 1 matrix. Av may replace A. */ do i_rowA = 1 to &NrowA; temp = 0; do i_colA = 1 to &NcolA; temp = temp + &A{i_rowA, i_colA}*&v{i_colA}; end; &Av{i_rowA, 1} = temp; end; %mend matAv; %macro matABt(A, NrowA, NcolA, B, NrowB, ABt); /* Multiplies an NrowA x NcolA matrix by the transpose of an NrowB x NcolA matrix. Returns ABt, an NrowA x NrowB matrix. ABt may replace A. */ do i_rowA = 1 to &NrowA; do i_rowB = 1 to &NrowB; tempVec{i_rowB} = 0; do i_colA = 1 to &NcolA; tempVec{i_rowB} = tempVec{i_rowB} + &A{i_rowA, i_colA}*&B{i_rowB, i_colA}; end; end; do i_colA = 1 to &NrowB; &ABt{i_rowA, i_colA} = tempVec{i_colA}; end; end; %mend matABt; %macro matAtB(A, NrowA, NcolA, B, NcolB, AtB); /* Multiplies the transpose of an NrowA x NcolA matrix by an NrowA x NcolB matrix Returns AtB, an NcolA x NcolB matrix. AtB may NOT replace A. */ do i_colA = 1 to &NcolA; do i_colB = 1 to &NcolB; &AtB{i_colA, i_colB} = 0; do i_rowA = 1 to &NrowA; &AtB{i_colA, i_colB} = &AtB{i_colA, i_colB} + &A{i_rowA, i_colA}*&B{i_rowA, i_colB}; end; end; end; %mend matAtB; %macro matAinv(Z, dimZ, Zinv, detZ); /* Z is square matrix of dimension dimZ. Returns inv(Z) and det(Z). Z is not altered, unless Zinv replaces Z, which is OK. */ %matDup(&Z, &dimZ, &dimZ, TempZ); %matAinv2(TempZ, &dimZ, &Zinv, &detZ); %mend matAinv; %macro matAinv2(A, dimA, Ainv, detA); /* A is square matrix of dimension dimA. Returns inv(A) and det(A). Ainv may NOT replace A. A is altered. Need to declare vv{max dimA} and indx{max dimA}. Algorithm taken from Press WH, et. al. Numerical recipes in FORTRAN: the art of scientific computing, 2nd Ed., Cambridge University Press, 1986, pp 34-41. */ tiny = 1.0E-12; do i = 1 to &dimA; do j = 1 to &dimA; &Ainv{i,j} = 0; end; &Ainv{i,i} = 1; end; /* LU decomposition: adapted from ludcmp subroutine in Numerical Recipes */ do i = 1 TO &dimA; maxA = 0; do j = 1 TO &dimA; if abs(&A{i,j}) > maxA then maxA = abs(&A{i,j}); end; if maxA = 0 then do; put "ERROR: A matrix singularity has suspended computations."; stop; end; vv{i} = 1.0/maxA; end; d = 1; do j = 1 to &dimA; do i = 1 to j-1; sum = &A{i,j}; do k = 1 to i-1; sum = sum - &A{i,k}*&A{k,j}; end; &A{i,j} = sum; end; maxA = 0; do i = j to &dimA; sum = &A{i,j}; do k = 1 to j-1; sum = sum - &A{i,k}*&A{k,j}; end; &A{i,j} = sum; dum = vv{i}*abs(sum); if dum >= maxA then do; max_i = i; maxA = dum; end; end; if j ne max_i then do; do k = 1 to &dimA; dum = &A{max_i, k}; &A{max_i,k} = &A{j,k}; &A{j,k} = dum; end; d = -d; vv{max_i} = vv{j}; end; indx{j} = max_i; if &A{j,j} = 0 then &A{j,j} = tiny; if j ne &dimA then do; dum = 1.0/&A{j,j}; do i = j+1 to &dimA; &A{i,j} = &A{i,j}*dum; end; end; end; /* j = 1 to dimA */ /* Compute determinant to see A if is singular. To understand this, read in Numerical Recipes. */ &detA = d; do i = 1 to &dimA; &detA = &detA*&A{i,i}; end; if abs(&detA) < tiny then do; put // "ERROR: A singular or near-singular matrix was encountered" / " when computing an inverse."; &detA = 0; do irowAinv = 1 to &dimA; do icolAinv = 1 to &dimA; &Ainv{irowAinv, icolAinv} = .; end; end; EndAinv = 1; end; if not(EndAinv) then do; /* functions as go to */ do m = 1 to &dimA; /* LU backsubstitution: adapted from lubksb subroutine in Numerical Recipes */ ii = 0; do i = 1 to &dimA; LL = indx{i}; sum = &Ainv{LL,m}; &Ainv{LL,m} = &Ainv{i,m}; if ii ne 0 then do j = ii TO (I-1); sum = sum - &A{i,j}*&Ainv{j,m}; end; else if sum ne 0 then ii = I; &Ainv{i,m} = sum; end; do i = &dimA to 1 by -1; sum = &Ainv{i,m}; do j = i+1 to &dimA; sum = sum - &A{I,j}*&Ainv{j,m}; end; &Ainv{i,m} = sum/&A{i,i}; end; end; end; /* endAinv */ %mend matAinv2; %macro matDup(OldMatrx, NumRows, NumCols, NewMatrx); /* OldMatrix (NumRows x NumCols) is duplicated as NewMatrx. */ do irowSM = 1 to &NumRows; do icolSM = 1 to &NumCols; &NewMatrx{irowSM, icolSM} = &OldMatrx{irowSM, icolSM}; end; end; %mend matDup; %macro matInput (NewMatrx, Nrows, Ncols); link GetMatrx; &Nrows = Nrows; &Ncols = Ncols; %matDup(Mvalues, &Nrows, &Ncols, &NewMatrx); %mend matInput; /* ++++ End UPw05_MtrxMacros segment ++++ */ %macro TroubMsg; put // "WARNING: Despite extensive testing, the iterative method" / " to find NTtotal for a specified power has failed" / " to converge. NTotal was set to missing." // " Please report this to" / " Ralph O'Brien" / " robrien@bio.ri.ccf.org"; Ntotal = .; power = .; go to &Cntinu; %mend TroubMsg; %macro FindN (GetPow, Cntinu); /* Secant method to find Ntotal such that power(Ntotal) = NomPower. Starting values are carefully found to ensure convergence. GetPow is the statement label to begin power computation. Cntinu is the statement label to go to after convergence. */ if Step = 1 then do; FindingN = 1; MinPossN = NumGrps + 1; Ntotal = MinPossN; if "&dfeformzz" ne '0' then do; **AMS; if &basiczz>1 then Ntotal=(NumGrps+1)*&basiczz; **AMS; x=ntotal/(&basiczz*NumGrps); **AMS; rxfull=ntotal-(&dfeformzz); ****** make sure Ntotal is large enough; **AMS; do while (rxfull>ntotal-1); **AMS; Ntotal=Ntotal+&basiczz; **AMS; x=ntotal/(&basiczz*NumGrps); **AMS; rxfull=ntotal-(&dfeformzz); **AMS; end; **AMS; MinPossN=ntotal; end; Step = 2; go to &GetPow; end; /* Step 1 */ if Step = 2 then do; if (Ntotal = MinPossN) and (power > NomPower) then do; put // "WARNING: Power exceeds " NomPower / " using smallest possible NTotal: " MinPossN / " Are your specifications correct?"; Ntotal = .; ***above may replace this? FindingN=0; **AMS stop search; go to &Cntinu; end; Diff = power - NomPower; if Diff < 0 then do; Ntotal = 4*Ntotal; **AMS steps of 4, not 2 should be faster; OldDifLo = Diff; if "&dfeformzz" ne '0' then do; **AMS; x= ntotal/(&basiczz*NumGrps); **AMS; rxfull=ntotal-(&dfeformzz); end; go to &GetPow; end; NtotHi = Ntotal; DiffHi = Diff; NtotLo = Ntotal/4; **AMS; DiffLo = OldDifLo; Ntotal = NtotLo + (NtotHi - NtotLo)/(1 - DiffHi/DiffLo); if not(Ntotal > 0) then do; %TroubMsg; end; Step = 3; if "&dfeformzz" ne '0' then do; **AMS; Ntotal = ceil(NtotLo + (NtotHi - NtotLo)/(1 - DiffHi/DiffLo)); **AMS; x= ntotal/(&basiczz*NumGrps); **AMS; rxfull=ntotal-(&dfeformzz); end; go to &GetPow; end; /* Step 2 */ if Step = 3 then do; Diff = power - NomPower; **AMS stop if within 1 of sample size. changed to .000049 because an example stopped too early; if abs(Diff)<.000049 or (NtotHi-NtotLo)= 1 then do; **AMS; if diff<0 then Ntotal=NtotHi; FindingN = 0; go to &Cntinu; end; if Diff < 0 then do; NtotLo = Ntotal; DiffLo = Diff; end; else do; NtotHi = Ntotal; DiffHi = Diff; end; Ntotal = NtotLo + (NtotHi - NtotLo)/(1 - DiffHi/DiffLo); if not(Ntotal > 0) then do; %TroubMsg; end; if "&dfeformzz" ne '0' then do; *** binary search needed because df change power nonlinearly; Ntotal = floor(NtotLo + (NtotHi - NtotLo)/2); **AMS binary search; **AMS; x= ntotal/(&basiczz*NumGrps); **AMS; rxfull=ceil(ntotal-floor(&dfeformzz)); end; go to &GetPow; end; /* Step 3 */ %mend FindN; /* ++++ End UPw04_FindNMacros segment ++++ *//* /* ++++ Begin UPw08_Declares segment v. 2002.08.17a ++++ */ %global ProbType TablType SDType TblMmnts ; %global WriteBox ResltVar format DoTables ; %global SDorCV ; %global AlphaFmt PowerFmt ; %let WriteBox = 0; /* MAKE MACRO: Remove commenting around next statement. */ %macro UPower(); /* Activate for macro version. */ /* Set overall maximum sizes for UnifyPow problems. */ %let MxGrpMes = 20; /* Max number of groups or measures */ %let MaxSDs = 20; /* Max number of specs for SDs or similar parameters */ %let MaxAlphs = 10; /* Max number of alpha specs */ %let MaxNs = 100; /* Max number of Ntotals or powers */ %let ProbType = To be defined; %let SDorCV = SD; %let SDType = Std Dev; %let DltaPowr = .02; %let WriteBox = %eval(&WriteBox+1); %let DoTables = yes; %let SDname = Standard Deviation; %let AlphaFmt = 6.3; %let PowerFmt = 6.3; data PowDatazz; file print; if _n_ = 1 then do; if &WriteBox = 1 then do; put /; put "--------------------------------------------------------------------"; put "| UnifyPow 2002.08.17a 2002 Copyright (c) by Ralph G. O'Brien |"; put "| For information, see http://www.bio.ri.ccf.org/UnifyPow |"; put "| Report problems to robrien@bio.ri.ccf.org |"; put "--------------------------------------------------------------------"; put " *** This version modified to create UPower "; end; put // "Verbatim specifications and processing notes" / "============================================" / ; end; keep scenario; keep alpha power NomPower SD NTotal EffcTitl dfH testtype tails; keep rXfull ; array value {&MxGrpMes} _temporary_; /* temporary parameter values */ array mu {&MxGrpMes} _temporary_; array Cmat {&MxGrpMes,&MxGrpMes} _temporary_; array Cmu{20,1} Cmu1-Cmu20; array invCWCT{20,20} inCWC1-inCWC400; array indx{20} indx1-indx20; array VV{20} VV1-VV20; array TempMat {&MxGrpMes,&MxGrpMes} _temporary_; array TempVec {&MxGrpMes} _temporary_; array TempZ {&MxGrpMes,&MxGrpMes} _temporary_; /* for %matAinv */ array w {&MxGrpMes} _temporary_; array wDesign {&MxGrpMes} _temporary_; array alphaV {&MaxAlphs} _temporary_; array SDV {&MaxSDs} _temporary_; array NtotalV {&MaxNs} _temporary_; array NomPowrV {&MaxNs} _temporary_; array Mvalues {&MxGrpMes, &MxGrpMes} _temporary_; length title $78; /* temporary variable for ReadTitl link */ length Scenario $78; length EffcTitl $40; /* effect title, up to 40 characters */ length vChar $20; length keyword $25; length TestType $11; format NomPower 7.5; **AMS so fits table; format alpha 4.3; length CurInput $15; length OldInput $15; length statment $78; ***AMS statements from %instruct; EndJstIn = 0; /* END statement last thing read */ NewProb:; /* New Problem */ /***** set defaults on specifications *****/ /* indicator variables: 0=no, 1=yes */ AlphaIn = 0; /* alpha levels input yet */ CntrstIn = 0; /* Has CONTRASTS statement been read */ DoAdjstN = 1; /* Adjust found N to make it conform to cell weights */ DoTails = 3; /* Do both 1- and 2-tailed tests (1 + 2 = 3) */ EndOFile = 0; FindingN = 0; /* currently interating to find Ntotal */ NewAlfTl = 0; /* new Alpha or Tails has just been set */ NmParmIn = 0; /* Number of parameters statement in */ NoOveral = 0; /* skip overall test */ NtotalIn = 0; /* total N input yet */ NullValu = 0; /* null value for one- and two-group tests */ NulMsgOn = 1; /* If null is true, print message; only does one time */ NumGrps = 1; /* Number of groups */ PowerIn = 0; /* power input yet */ ScenarIn = 0; /* Scenario title has been input */ SDIn = 0; /* standard deviations input yet */ ValidPrb = 0; /* Valid problem statement input yet */ wIn = 0; /* weights input yet */ /* ++++ End UPw08_Declares segment ++++ */ /* ++++ Begin UPw09_ProbPars1 segment v. 2002.07.01 ++++ */ /* Comment lines may be used anyplace. There are three kinds: Block of comment lines. /# Block of comments lines, at most 78 chars per line. If any contain ";" then use DATALINES4 statement and follow all UnifyPow input with ";;;;". #/ Single comment lines. // Comments lines, at most 78 chars per line. // Another comment line. // See note just above about using ";". // Either block comments or // comments may be used anyplace. Follow end of UnifyPow statement with open period ( . ) and add comment. Examples: pi .80 . Mystic Michelle claims 80% success rate. alpha .05 .0167 . 0.05/3 = 0.0167 for Bonferroni protection. */ curline=0; /* Read in problem and scenario statements */ CurInput = "ProbStmt"; link EchoInpt; EndJstIn = 0; if not(ScenarIn) then scenario = statment; keyword = upcase(scan(statment,1,' ') ); /* Parse the END statement */ if index(keyword,'END') then do; EndJstIn = 1; go to NewProb; end; /* Parse the mu and the mu:logNormal statements */ /* The "mu" statement leads to regular ANOVA (Normal-theory) tests on means. Unless blocked by a "NoOverall" command, UnifyPow automatically tests the overall hypothesis that all G means are equal. "Contrast" statements let users test any contrast of the form C*mu = 0, where C is a rank(C) x G contrast matrix and mu is the G x 1 vector of means. The "null" statement applies to one-group and two-group tests: Ho: mu_1 = NullValu Ho: mu_1 - mu_2 = NullValu ---- */ if (keyword='MU' or index(keyword,'MEAN') ) then do; ValidPrb = 1; call symput('TablType','FPow'); link GetValus; NumGrps = count; rXfull = NumGrps; if NumGrps le 2 then call symput('TablType','tPow'); call symput('ProbType','mu'); call symput('SDType','Standard Deviation'); do i = 1 to NumGrps; mu{i} = value{i}; end; go to NextSpec; end; if not(ValidPrb) then do; put // "ERROR: Invalid problem statement."; stop; end; /* ++++ Begin UPw12_OthrPars segment v. 2002.07.01 +++++ */ NextSpec: CurInput = "OthrStmt"; link EchoInpt; keyword = upcase(scan(statment,1,' ') ); if index(keyword,'ERRORDF') then do; ****AMS already scanned this; go to NextSpec; end; /* parse the WEIGHT statement */ /* Note: Weights are relative. They are summed and converted to fractions. Thus use WEIGHT .333 .666 or, equivalently, WEIGHT 1 2 This is not the same as WEIGHT .333 .667 which may cause problems, especially when finding NTotal given a Power specification. */ if index(keyword,'WEIGHT') then do; link GetValus; wIn = 1; NumWght = count; if NumGrps ne NumWght then do; put // "ERROR: Improper number of WEIGHTS specified."; stop; end; SumWts = 0; do i = 1 to NumWght; w{i} = value{i}; SumWts = SumWts + w{i}; end; do i = 1 to NumWght; w{i} = w{i}/SumWts; wDesign{i} = w{i}; end; go to NextSpec; end; /* parse the ALPHA and TAILS statements */ link CkAlfTls; if NewAlfTl then do; go to NextSpec; end; /* parse the standard deviations statements: SD or SIGMA */ if (keyword = 'SD') or (keyword = 'SIGMA') then do; link GetValus; SDIn = 1; NumSD = count; do i = 1 to NumSD; SDV{i} = value{i}; if not(SDV{i} > 0) then do; put // "ERROR: A standard deviation is not positive."; stop; end; end;; go to NextSpec; end; /* SD parser */ /* parse the TotalN or NTotal (total sample size) statement */ if (index(keyword,'TOTAL') gt 0) then do; call symput('ResltVar','power'); link GetValus; NtotalIn = 1; NumNtotl = count; do i = 1 to NumNtotl; NtotalV{i} = value{i}; if not(NtotalV{i} > 0) or not(floor(NtotalV{i}) = NtotalV{i}) then do; put // "ERROR: Total sample size is not correct."; stop; end; end; NumPower = 1; NomPowrV{1} = 0; go to NextSpec; end; /* parse the Power statement */ if keyword = 'POWER' then do; call symput('ResltVar','NTotal'); link GetValus; PowerIn = 1; NumPower = count; MinPower = 1; do i = 1 to NumPower; NomPowrV{i} = value{i}; if NomPowrV{i} > .999 then call symput('PowerFmt','6.4'); MinPower = min(of MinPower NomPowrV{i}); if AlphaIn and not(MaxAlpha < NomPowrV{i} < 1) then do; put // "ERROR: Power must be between alpha and 1."; stop; end; end; NumNtotl = 1; NTotalV{1} = 1; go to NextSpec; end; /* parse the NoOverall statement */ if index(keyword,'NOOVERALL') then do; NoOveral = 1; go to NextSpec; end; /* parse the NoAdjustN statement */ if index(keyword,'NOADJUSTN') then do; DoAdjstN = 0; go to NextSpec; end; /* parse the NoTables statement */ if index(keyword,'NOTABLE') then do; call symput('DoTables','no'); go to NextSpec; end; /* parse the CONTRAST statement */ if index(keyword,'CONTRAST') then do; CntrstIn = 1; go to CheckIt; end; put // "ERROR: " keyword $ "statement not understood."; stop; EOFfound: if EndJstIn then go to AllDone; EndOFile = 1; if CurInput = "ProbStmt" then do; put // "ERROR: End of UPower statements was reached too soon."; stop; end; /* ++++ End UPw12_OthrPars segment +++++ */ /* ++++ Begin UPw13_ChkParms segment v. 2002.07.01 ++++ */ if CurInput = "OthrStmt" then go to CheckIt; if CurInput = "GetMatrx" then go to EndGtMat; if CurInput = "muContrast" then go to ContPowr; if OvAlDone then go to AllDone; CheckIt: /* Check to see if parameters are in */ if NtotalIn and PowerIn then do; put // "ERROR: Cannot use both Ntotal and Power statements."; stop; end; if not(NtotalIn or PowerIn) then do; put // "ERROR: Missing Ntotal or Power statement for this problem."; stop; end; if AlphaIn = 0 then do; AlphaIn = 1; NumAlpha = 1; alphaV{1} = .05; put // "NOTE: No ALPHA statement. Default is alpha = 0.05 only."; end; if wIn = 0 then do; /* force balanced design */ wIn = 1; if NumGrps > 1 then put / "NOTE: No WEIGHTS statement. Default is balanced design."; do i = 1 to NumGrps; w{i} = 1/NumGrps; wDesign{i} = w{i}; end; end; if NtotalIn then do; do i = 1 to NumGrps; do j = 1 to NumNtotl; if abs(round(w{i}*NtotalV{j}) - w{i}*NtotalV{j}) gt .003*round(w{i}*NtotalV{j}) then do; put // "WARNING: Total sample sizes and cell weights produce"; put " cell sizes that are not integers."; put " Group: " i " NTotal: " NtotalV{j}; end; end; end; end; /* Shouldn't get to here unless it is a mu, mu:logNormal, PairedMu, PairedMu:logNormal, Wilcoxon, or mu(P) problem. */ /* compute power for overall tests of means */ if NumGrps = 2 and NoOveral = 0 then do; put / "Comparing 2 groups on location"; dfH = 1; PrimSSHe = w{1}*w{2}*(mu{1} - mu{2} - NullValu)**2; EffcTitl = 'Compare two means'; if not( NoOveral) then link GetPower; go to cntrasts; end; /* NumGrps = 2 do */ if NumGrps > 2 then do; put / "ANOVA testing on " NumGrps "independent means."; put //; if NoOveral then go to cntrasts; EffcTitl = 'Compare all means'; dfH = NumGrps - 1; /* standard overall anova */ mubar = 0; do i = 1 to NumGrps; mubar = mubar + w{i}*mu{i}; end; PrimSSHe = 0; do i = 1 to NumGrps; PrimSSHe = PrimSSHe + w{i}*(mu{i} - mubar)**2; end; link GetPower; end; /* NumGrps > 2 do */ /* ++++ End UPw16_Compute3_mu segment ++++ */ /* ++++ Begin UPw17_Compute4_MVmu segment v. 2002.08.07 ++++ */ cntrasts:; /* This section also handles SSH and CHI**2 effects statements */ if not(CntrstIn ) then go to AllDone; if CntrstIn then OvAlDone = 1; /* Perform C*mu = 0 general contrasts (without proc iml) */ if CntrstIn then do; NewCont: /* begin contrast or effects specification */ link EchoInpt; link ReadTitl; EffcTitl = title; if CntrstIn then do; %matInput (Cmat, dfH, NColTemp); if NColTemp ne NumGrps then do; put // "ERROR: Contrast has incorrect number of coefficients."; stop; end; if dfH = NumGrps then put // "WARNING: The contrast has as many rows as there are groups."; if dfH gt NumGrps then do; put // "ERROR: Contrast has too many rows."; stop; end; %CkCntrst (Cmat, dfH, NumGrps); CurInput = "muContrast"; keyword=scan(statment,1,' '); /* don't do this, no new TAILS or ALPHA allowed; link CkAlfTls; put 'NewAlfTl=' NewAlfTl keyword 'dfH=' dfH cmu1 cmu2 cmu3 cmu4; if not(NewAlfTl) then do; curline+1; set upowerzz point=curline; if _error_ then goto EOFfound; end; */ ContPowr: link PrimSSHC; link GetPower; if EndOFile then go to AllDone; else go to NewCont; end; /* CntrstIn and not(MVmuProb) */ end; /* if CntrstIn */ AllDone: if EndJstIn then go to NewProb; else stop; ************************************************************************************; /* Various link functions in no particular order */ CkAlfTls: NewAlfTl = 0; /* link to parse the ALPHA and TAILS statements */ if keyword = 'ALPHA' then do; NewAlfTl = 1; link GetValus; AlphaIn = 1; NumAlpha = count; MaxAlpha = 0; do i = 1 to NumAlpha; alphaV{i} = value{i}; if alphaV{i} le .001 then call symput('AlphaFmt','6.4'); if not(0 < alphaV{i} < 1) then do; put "Improper alpha: " alphaV{i}; stop; end; if PowerIn and not(0 < alphaV{i} < MinPower) then do; put // "ERROR: Alpha must not exceed power."; stop; end; if (alphaV{i} gt .20) then put // "WARNING: " alphaV{i} "is a large alpha level."; MaxAlpha = max(of MaxAlpha alphaV{i}); end; end; /* ALPHA */ /* parse the TAILS statement */ if index(keyword,'TAIL') or index(keyword,'SIDE') then do; NewAlfTl = 1; link GetValus; if count > 2 then go to ErrTAILS; do i = 1 to count; if not((value{i} = 1) or (value{i} = 2) or (value{i} = 3)) then go to ErrTAILS; end; if (count = 1) then DoTails = value{1}; if (count = 2) then do; if (value{1} + value{2} = 3) then DoTails = 3; else go to ErrTAILS; end; return; ErrTAILS: put // "ERROR: Invalid TAILS statement."; stop; end; /* TAILS */ return; /* ChAlfTls */ GetPower:; /* link for power for F, t, chi-square */ /* If DoChiSq = 1, then do chi-square & Z instead of F & t */ do iSD = 1 to NumSD; SD = SDV{iSD}; PrimLmda = PrimSSHe/(SD**2); /* Primary noncentrality */ do ialpha = 1 to NumAlpha; alpha = alphaV{ialpha}; do i_Ntotal = 1 to NumNtotl; Ntotal = NtotalV{i_Ntotal}; if "&dfeformzz" ne '0' then do; **AMS; x= ntotal/(&basiczz*NumGrps); **AMS; rxfull=ceil(ntotal-floor(&dfeformzz)); end; do i_Power = 1 to NumPower; NomPower = NomPowrV{i_Power}; /* compute power for F */ testtype = 'Regular F '; if dfH = 1 then testtype = '2-sided t'; tails = 2; if PowerIn then do; Step = 1; %FindN(GetPow02, Cntinu02); end; GetPow02: lambda = Ntotal*PrimLmda; if (-.0001 < lambda < 0) then lambda = 0; %ChkNull (lambda, End02); dfE = Ntotal - rXfull; if dfE < 1 then do; put // "ERROR: NTotal =" Ntotal " is too small for this problem."; stop; end; fcrit = finv(1-alpha, dfH, dfE, 0.0); link F_TRAP; if trap = 1 then power = .9999; else power = SDF('F', fcrit, dfH, dfE, lambda); if FindingN then do; %FindN(GetPow02, Cntinu02); Cntinu02: if NTotal ne . then do; link Adjust_N; go to GetPow02; end; end; End02: /* power for F */ link OutPower; /* if dfH = 1, compute power for one-sided tests */ if dfH = 1 and DoTails ne 2 then do; tails = 1; testtype = '1-sided t'; if PowerIn then do; Step = 1; %FindN(GetPow03, Cntinu03); end; GetPow03: %ChkNull (PrimLmda, End03); lambda = Ntotal*PrimLmda; if (-.0001 < lambda < 0) then lambda = 0; dfE = Ntotal - rXfull; if dfE < 1 then do; put // "ERROR: NTotal =" Ntotal " is too small for this problem."; stop; end; tcrit = tinv(1-alpha, dfE, 0); link T_TRAP; if TRAP = 1 then power = .9999; else power = SDF('t', tcrit, dfE, sqrt(lambda)); if FindingN then do; %FindN(GetPow03, Cntinu03); Cntinu03: if NTotal ne . then do; link Adjust_N; go to GetPow03; end; end; End03: end; /* 1-sided t */ link OutPower; end; /* power loop */ end; /* Ntotal loop */ end; /* alpha loop */ end; /* SD loop */ return; /* GetPower */ /* ++++ End UPw18_Links1 segment ++++ */ GetValus: /* a link routine */ vChar = ""; count = 0; do i = 1 to dim(value); value{i} = .; end; /* Value 1 */ vChar=scan(statment,count+2,' '); if vChar = "" then do; return; end; link OKvalue; /* Value 2 */ vChar=scan(statment,count+2,' '); if vChar = "" then do; return; end; if upcase(vChar) = "TO" then link DoToBy; else do; link OKvalue; GetValue: vChar=scan(statment,count+2,' '); if vChar = "" then do; return; end; link OKvalue; go to GetValue; end; return; /* GetValus */ OKvalue: /* a link subroutine */ In = 1; count = count + 1; value{count} = 1*vChar; return; /* OKValue */ DoToBy: /* a link subr to process [min] TO [max] BY [incr] */ MaxValue= 1*scan(statment,count+3,' '); if MaxValue = . then link ToByErr; ivChar=scan(statment,count+4,' '); if upcase(vChar) ne "BY" then link ToByErr; incr=1*scan(statment,count+5,' '); if incr = . then link ToByErr; MakeVals: NxtValue = value{count} + incr; if NxtValue le 1.00001*MaxValue then do; count = count + 1; value{count} = NxtValue; go to MakeVals; end; return; /* DoToBy */; ToByErr: put // "ERROR: in [min] TO [max] BY [incr] specification."; stop; return; /* ToByErr */ /* ++++ End UPw19_Links2 segment ++++ */ /* ++++ Begin UPw22_Links5 segment v. 2002.07.01 ++++ */ /* Round the power result. (The tail check is left over from a former way to handle the TAILS statement and is probably no longer needed.) */ OutPower: if (DoTails ne 3) and (dfH = 1) and (tails ne DoTails) then return; ****if power > .999 then power = .999; ****AMS modify****; ****else power = round(power,.001); output; if PowerIn and ((power - NomPower) > &DltaPowr) then do; end; return; /* OutPower */ Adjust_N: /* Increase Ntotal so that it conforms with WEIGHTS or ForceN specification. Not done if NTotal statement is used or if NoAdjustN is specified. */ NTotal = ceil(NTotal); if DoAdjstN = 0 then return; Check_N: do i_Grps = 1 to NumGrps; if abs(round(wDesign{i_Grps}*Ntotal) - wDesign{i_Grps}*Ntotal) > .01 then do; Ntotal = Ntotal + 1; go to Check_N; end; end; if "&dfeformzz" ne '0' then do; **AMS if Ntotal changes, need to redo these; **AMS; x= ntotal/(&basiczz*NumGrps); **AMS; rxfull=ceil(ntotal-floor(&dfeformzz)); end; return; /* Adjust_N (1) */ return; /* Adjust_N (2) */ PrimSSHC: /* Computes PrimSSHe for C*mu contrasts, where muhat_j has variance (sigma**2)/(Ntotal*w{j}). */ %matAv(Cmat, dfH, NumGrps, mu, Cmu); %matDiagI(w, NumGrps, TempMat); %matABt(TempMat, NumGrps, NumGrps, Cmat, dfH, TempMat); %matAB(Cmat, dfH, NumGrps, TempMat, dfH, invCWCt); %matAinv(invCWCt, dfH, invCWCt, det); %matAtB(Cmu, dfH, 1, invCWCt, dfH, TempMat); %matAB(TempMat, 1, dfH, Cmu, 1, TempMat); PrimSSHe = TempMat{1,1}; return; F_trap:; /* ********************************************************************* Switching to SDF functions eliminated old SAS bug problems, but * there is still a need check for lambdas that are < 0 due to * trancations and rounding. There is also no harm in keeping this * check, so I left it in rather than hunt down and eliminate all the * calls to it. * ********************************************************************* */ crit = fcrit; trap:; trap = 0; a1 = dfH*crit/(dfH+lambda); a2 = 2*(dfH + 2*lambda)/(9*(dfH + lambda)**2); a3 = 2/(9*dfE); SJz = ((1 - a3)*a1**(1/3) - (1 - a2))/sqrt(a2 + a3*a1**(2/3)); IF SJz < -3.80 then trap = 1; return; /* F_trap */ T_trap:; /* a check on whether delta is too high for PROBT due to SAS bugs */ crit = tcrit**2; dfH = 1; link trap; return; /* T_trap */ GetMatrx:; /* Link function to read a matrix. Mvalues{} has Nrows rows and Ncols columns. Matrix in UnifyPow commands has this form: 11 12 13 14 > 21 22 23 24 > 31 32 33 34 So ">" means "another row coming." This input would set Nrows to 3 and Ncols to 4. Uses GetValus and EchoInpt link functions. Follow running of this with %matDup(Mvalues, Nrows, Ncols, NewMatrx) where "NewMatrx" is the name of the desired matrix. */ OldInput = CurInput; CurInput = "GetMatrx"; GetRow1: link GetValus; if count = 0 then do; link EchoInpt; go to GetRow1; /* look for values on next line */ end; Nrows = 1; Ncols = count; do icolGM = 1 to Ncols; MValues{Nrows, icolGM} = value{icolGM}; end; /* Get values for next row */ NextRow: ***peek at next line; curline+1; if curline>&PrgmLinzz then do; statment=''; goto EOFfound; end; set upowerzz point=curline; keyword=scan(statment,1,' '); if keyword ne ">" then do; curline=curline-1; go to DoneGtMt; end; put statment; **echo valid continuation line; Nrows = Nrows + 1; link GetValus; if count ne Ncols then do; put // "ERROR: Length of rows is not identical."; stop; end; do icolGM = 1 to Ncols; MValues{Nrows, icolGM} = value{icolGM}; end; go to NextRow; DoneGtMt: ptr=1; CurInput = OldInput; EndGtMat: return; /* GetMatrx */ EchoInpt: /* Link function to check input line for a comment and echo it out to the output file. */ /* Check whether the statement is blank. If so, it puts a blank line into the output and continues. Also check whether it is a single comment line. Format is // Comment line, at most 78 chars per line. */ ReadLine: curline+1; if curline>&PrgmLinzz then do; statment=''; goto EOFfound; end; set upowerzz point=curline; put statment ; if trim(statment)="" or index(left(statment), "//")=1 then do; put; go to ReadLine; end; /* Print comment blocks. Format is /# Comments lines, at most 78 chars per line. If any contain ";" then use DATALINES4 statement and follow all UnifyPow input with ";;;;". #/ */ if index(left(statment),"/#") = 1 then do; if index(statment,"#/") then go to ReadLine; NxtComnt: curline+1; set upowerzz point=curline; if _error_ then goto EOFfound; put statment ; if index(statment,"#/") then go to ReadLine; go to NxtComnt; end; /* block comment */ return; /* EchoInpt */ ReadTitl: /* Link function to read title between double quotes ("). Puts value in variable called title. Variable TitleLng is the length of title, not counting " marks. */ delimter=compress("'"||'"'); title=scan(statment,1,delimter); if title='' then do; put // 'ERROR: Expecting title enclosed in quotes'; stop; end; TitleLng=findc(statment,delimter); ***find first quote; TitleLng=findc(statment,delimter,TitleLng+1); ***find second quote; statment="title "||substr(statment,titlelng+1); return; /* ++++ End UPw23_Links6 segment ++++ */ /* MAKE MACRO: Remove commenting signs around next 3 lines. */ run; * Activate for macro version; %tables; * Activate for macro version; %mend UPower; * Activate for macro version; ***************************************************************************************************; %macro SimPower(dset, yvar, class=, fixed=, random=, ntotal=, alpha=.05, varcomp=, contrasts=, iter=1000, seed=0); proc printto log=user.dsetzz; *** turn off log entries by dumping to temp file; run; proc optsave out=optionszz; ** save user option settings; run; options nonotes nomprint; ods noresults; data optionszz; set optionszz; if optname in ('NOTES' 'MPRINT'); ***only save options that are changed; run; proc printto ; run; proc datasets nolist; delete aovzzz contzzz; run;quit; %let ptype=1; ************ ANOVA power; ods exclude all; %do iter=1 %to &iter; **** class must be ordered so nested terms are at end, cross classified at begin; proc sort data=&dset out=workzz; by &class; *** then extract fixed effects from class, and generate variance for remaining; %let rlist= rep sample; data workzz; set workzz; by notsorted &rlist; retain seed &seed; retain varzz1 varzz2; ***need one for each random; retain vczz1-vczz2; drop varzz1 - varzz2; array rrr &rlist; array vvv varzz1-varzz2; **corresponding effects; array ccc vczz1-vczz2; **corresponding user var comps; if _n_=1 then do; do ii=1 to 2; **need max; ccc[ii]= 1*scan("&varcomp",ii); end; end; if first.rep then do; *****need to know first name; call rannor(seed,varzz1); call rannor(seed,varzz1); end; ***start with mean and residual; call rannor(seed,reszz); call rannor(seed,reszz); yyzzz = &yvar + reszz* ccc[2]; ***need max here and next line; *** add random effects; do ii=1 to 2-1; yyzzz= yyzzz + vvv[ii]*ccc[ii]; end; run; proc mixed; class &class ; model yyzzz = &fixed; random &random ; &contrasts ods output tests3=aovzz contrasts=contzz; run; proc datasets; append base=aovzzz data= aovzz; append base=contzzz data=contzz; run;quit; %end; ***summarize iterations; ods exclude none; data contzzz; set contzzz; rename label=effect; run; data aovzzz; set aovzzz contzzz; if probf le &alpha then sig=1; else sig=0; run; proc sort; by effect; proc means n mean std; by effect; var sig; run; %if &ptype=2 %then %do; ****************** regression power; data one; do iter=1 to 10000; do nobs=1 to 20; **x=nobs/4; ** divide to give appropriate SDX; **or optimum design is all points at extremes; if nobs le 10 then x = 1; else x=3; junk=rannor(0); junk=rannor(0); y = 5 + 1.5*x + rannor(0)*3; output; end;end; run; ods exclude all; proc reg; by iter; model y=x; ods output anova=xxx; ods output fitstatistics=xx; ods output parameterestimates=x; run; ods exclude none; data xx; set xx; if label2='R-Square'; rsquare=nvalue2; run; data x; set x; if variable='x'; if probt le .05 then sig05=1; else sig05=0; if probt le .01 then sig01=1; else sig01=0; run; data x; merge x xx; proc means; var sig05 sig01 rsquare estimate; run; %end; %if ptype=3 %then %do; ****************** contingency power; %end; ods results; proc optload data=optionszz; run; %mend;