comp.soft-sys.sas - The SAS statistics package.
> Date: Wed, 29 Dec 2004 12:51:05 -0800
> From: Dale McLerran <XXXX@XXXXX.COM>
> Subject: Re: Reducing a matrix in IML iteratively
>
> --- "Tonidandel, Scott" <XXXX@XXXXX.COM> wrote:
>
> > Dale,
> >
> > Thanks for the helpful suggestions. I think the macro is the way to
> > go because this code is part of a larger simulation study that may
> > have as many as 8 X-variables. I think I am going to go with the
> > second macro you suggested below but this led to a few additional
> > questions (I hope you don't mind this additional imposition -- if so
> > my apologies). The first question has to do with the scope of my
> > problem while the other two stem more from my unfamiliarity with
> > combining macros and IML.
> >
> > 1) I was originally thinking about reducing the entire correlation
> > matrix and then pulling the parts out of that reduced matrix that I
> > need (Sxx, Syy, Sxy). But, I liked the idea you presented to focus
> > on a piece of the original correlation matrix (Sxx in this case)
> > and reduce that. But this leaves the additional step of me having
> > to reduce the Syy, and Sxy matrices as well. Is there an easy way
> > you would recommend doing that in the context of your macro?
>
> The Sxy matrix can be subset using the index vector. Just
> employ Sxy[index,] and also Syx[,index]. I was under the
> impression that all Y variables would be included in the
> multivariate R-square computation, so that no subscripting would
> be needed for Syy. If that is incorrect, then you will need
> additional loops which form an index for the columns of the
> multivariate response.
>
> > 2) As I said earlier this is actually a piece of a larger simulation
> > where I am generating data with various numbers of predictor
> > variables and criterion variables so this will be nested within a
> > larger do loop.
> > In my larger program I have a scalar variable
> > called numpred which is indexed from 2 to 8 so the &columns
> > variable in your macro will need to take on these values. But,
> > the macro will not let me put numpred as an argument b/c it treats
> > it as text. I am not very familiar with combining macros and IML so
> > how would I make this numpred variable an argument in the macro?
>
> If I understand correctly, you have code something like the
> following:
>
>   do numpred=1 to 8;
>     do iter=1 to 1000;
>       <generate X data with numpred columns>
>       <generate Y data with numcrit columns>
>       <obtain multivariate R-square of Y with every combination of X>
>     end;
>   end;
>
> Now, employing the code that I posted yesterday, you would need
> to have NUMPRED inner do loops in order to obtain the multivariate
> R-square for every possible combination of the columns of X.
> But the number of loops cannot depend on the IML variable NUMPRED.
> Thus, the code that I sent yesterday will not work.
>
> Here is a different solution which will work for the problem
> described above. We know that there are ((2**NUMPRED)-1)
> combinations of the predictor variables that must be considered.
> Now, if we loop from 1 to ((2**NUMPRED)-1) and convert the loop
> index value to binary, then we will have a set of values that
> are constructed as
>
>   do loop                    reversed
>   value     binary value     binary value
>     1         00000001         10000000
>     2         00000010         01000000
>     3         00000011         11000000
>     4         00000100         00100000
>    ...           ...              ...
>
> Note that I have represented the binary variable with eight digits
> which is the maximum value for the index variable NUMPRED. Now,
> we can treat the columns of the reversed binary value as indicators
> for the columns of X that we should select for the indexed
> combination. Below is code which constructs the reversed binary
> value, and from that constructs an index for the columns of X
> that should be selected.
>
>   proc iml;
>   do numpred=1 to 4;
>     do i=1 to ((2**numpred)-1);
>       Xcols = reverse(putn(i,"binary8."));
>       do j=1 to numpred;
>         xj=num(substr(Xcols,j,1));
>         if j=1 then index=xj*j;
>         else index=index || xj*j;
>       end;
>       index = loc(index);
>       print numpred Xcols index;
>     end;
>   end;
>   quit;
>
> <stuff removed>
>
> HTH,
>
> Dale
>
> =====
> ---------------------------------------
> Dale McLerran
> Fred Hutchinson Cancer Research Center
> mailto: XXXX@XXXXX.COM
> Ph: (206) 667-2926
> Fax: (206) 667-5977
> ---------------------------------------

Dale,

I like the trick of looping over all possible combinations using the
binary numbers; however, it does bring some overheads with it,
specifically the character string manipulation and then building up
the index matrix by repeated concatenation with itself. Below is some
IML code for enumerating combinations that I have taken from an old
program that calculates multiple-case influence diagnostics. For me
this has two advantages: first, the subsets of different size are
kept separate, and second, it should be a lot more efficient when the
number of combinations rises into the thousands.

  proc iml;
  reset noname;

  /* IML module to loop over all combinations of k things from n.
     Subsets are enumerated in increasing subset size from k1 to k2. */
  start comblist(n,k1,k2);
    do k=k1 to k2;
      /* number of combinations: n choose k */
      ncomb=round(exp(lgamma(n+1)-lgamma(k+1)-lgamma(n-k+1)));
      index=1:k;
      index[k]=k-1;      /* so that the first increment yields 1:k */
      last=(n-k+1):n;    /* largest value allowed in each position */
      do s=1 to ncomb;
        j=k;
        /* find the rightmost position that can still be incremented */
        do while (index[j]=last[j]);
          j=j-1;
        end;
        index[j]=index[j]+1;
        /* reset the positions to its right to consecutive values */
        do i=j+1 to k;
          index[i]=index[i-1]+1;
        end;
        print index [format=3.0];
      end;
    end;
  finish;

  run comblist(8,1,8);
  quit;

Kind regards,

Ian.

Ian Wakeling
Qi Statistics.
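
The two overheads mentioned above (character conversion and repeated concatenation) could probably also be avoided while keeping the binary-counter idea. What follows is only a minimal sketch, not code from either post: it assumes the bits of the loop index can be extracted arithmetically with FLOOR and MOD, and the names pow2 and bits are made up for illustration. numpred is fixed at 4 here purely as an example value.

```sas
proc iml;
numpred = 4;                        /* illustrative value only              */
pow2 = 2##(0:(numpred-1));          /* row vector 1 2 4 8 (powers of two)   */
do i = 1 to 2##numpred - 1;
   /* bits[j] = 1 when column j of X belongs to combination i;              */
   /* e.g. i=5 gives bits = 1 0 1 0                                         */
   bits  = mod(floor(i/pow2), 2);
   index = loc(bits);               /* selected column numbers, e.g. 1 3    */
   print i bits index;
end;
quit;
```

As with the reversed binary string, the least significant bit corresponds to the first column of X, so the subsets come out in the same order. Unlike the comblist module, subsets of different sizes are still interleaved.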