sas >> how to create correlation matrix in SAS

by xinwei » Tue, 08 Jul 2008 04:17:31 GMT

I have a dataset containing 20 subjects and thousands of variables. I
wonder what SAS procedure I can use to calculate the correlation matrix of
these variables?
These variables are continuous and I may want to collapse them into binary
ones. Is it possible to calculate the correlation matrix (chi-square p-value)
for lots of binary variables?


by art297 » Tue, 08 Jul 2008 07:32:48 GMT


Since your question raises a number of further questions, it cannot be
answered with a simple, one-line answer.

First, since you have continuous variables, why make them binary and need
chi-square? Keep them as they are and use Pearson product-moment
correlations. That you can do, easily, with PROC CORR.

Second, you do realize that with only 20 subjects and over a thousand
correlations (or chi-squares for that matter), many of them will appear to
be "real" correlations when, in fact, one would expect many of them
to look real through chance alone.

Thus, if you use .05 as your alpha, you would expect about 50 of every 1,000
truly null relationships to show a falsely significant correlation.
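
To put rough numbers on this, here is a small sketch of the arithmetic. The figure of 1,000 variables is only an assumed round number standing in for "thousands", and the calculation assumes every null hypothesis is true. With 1,000 variables there are 499,500 distinct pairs, so at alpha = .05 roughly 24,975 of them would be expected to look significant by chance.

```sas
/* Sketch: expected number of chance "significant" correlations.
   p = 1000 is an assumed round figure, not taken from the poster's data. */
data _null_;
  p     = 1000;               /* assumed number of variables        */
  alpha = 0.05;               /* significance level                 */
  pairs = p * (p - 1) / 2;    /* distinct pairwise correlations     */
  expected_false = alpha * pairs;
  put pairs= expected_false=; /* written to the SAS log             */
run;
```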

I'm sure that others can point out even more problems with what you are


by Paige Miller » Tue, 08 Jul 2008 20:21:34 GMT

PROC CORR does what you want.

I see no statistical reason to take continuous variables and turn them
into binary variables. You lose information that way.

Paige Miller
paige\dot\miller \at\ kodak\dot\com

by peterflomconsulting » Tue, 08 Jul 2008 21:13:09 GMT

Xin Wei < XXXX@XXXXX.COM > wrote:

PROC CORR is what you want. If you want to make a file with *just* the correlations, and no simple statistics, and no p-values, then try this:

proc corr data = dsname noprob nosimple;
var varlist;
run;

As to collapsing your variables --- it's possible, but it is almost always a bad idea.

PROC CORR will give you p-values by default.
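
If the goal is a SAS data set rather than printed output, one option is the OUTP= option of PROC CORR. The sketch below uses SASHELP.CLASS and three of its variables purely as a stand-in for your own data; the output data set names corr_all and corr_only are made up for the example.

```sas
/* Sketch: write the Pearson correlations to a data set with OUTP=.
   SASHELP.CLASS and its variables stand in for your own data. */
proc corr data=sashelp.class outp=corr_all noprint;
  var height weight age;
run;

/* OUTP= also stores the means, standard deviations and N;
   keep only the correlation rows. */
data corr_only;
  set corr_all;
  where _type_ = 'CORR';
  drop _type_;
run;

proc print data=corr_only;
run;
```

With thousands of variables the resulting data set will have thousands of rows and columns, so printing it in full is rarely useful.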



Peter L. Flom, PhD
Statistical Consultant
www DOT peterflom DOT com

by sudip.memphis » Tue, 08 Jul 2008 22:26:37 GMT

If you need to calculate correlations for binary variables only, then you can use

PROC FREQ, with the PLCORR option on the TABLES statement (for a 2x2 table it reports the tetrachoric correlation).

For more details you can look at this website
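
As a minimal sketch of that suggestion: the example below dichotomizes two variables from SASHELP.CLASS and requests both the chi-square test and the tetrachoric correlation. The data set, the new variable names, and the cutpoints are all chosen only for illustration.

```sas
/* Sketch: dichotomize two continuous variables, then request the
   chi-square test and the tetrachoric correlation. The cutpoints
   (62 and 100) are arbitrary, chosen only for illustration. */
data class_bin;
  set sashelp.class;
  tall  = (height > 62);    /* 1 if taller than 62, else 0  */
  heavy = (weight > 100);   /* 1 if heavier than 100, else 0 */
run;

proc freq data=class_bin;
  tables tall*heavy / chisq plcorr;
run;
```

Each pair of variables needs its own two-way table, so with thousands of variables this approach produces a very large amount of output.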

Similar Threads

1. SAS/IML code for regression with correlation matrix.

Dear all,

I am looking for SAS/IML code that computes a linear regression from
correlation matrix input. The IML manual has a regression module, but it is
for raw data. I'd really appreciate it if you could share the code with
me, or show me where I can find it.


2. Re-arrange columnar values into a correlation matrix

3. Need a more efficient way of simulating a correlation matrix

Hi. I'm preparing to run a Monte Carlo simulation in which I'm looking at
the impact of different correlation matrices on power. Rather than hand
enter a lot of different cor matrices, I would like them to be
generated by code. The following code appears to do the trick. Basically,
it creates a lower half matrix, transposes it, and sums the two in Proc
Iml. However, I doubt this is the most efficient way of producing a cor
matrix. I was wondering if anyone could come up with a more efficient way of
doing the same thing. I thought of using a two-dimensional array to do
everything in a single data step but don't know how to set up the code.
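
One possible way to set up the two-dimensional array idea is sketched below. It is not the poster's code: the data set name CorMat, the macro variables v, lo and hi, and the seed are all made up for the example. Setting hi equal to lo gives compound symmetry; otherwise the off-diagonal values are random draws between the two bounds. A matrix filled this way is symmetric but is not guaranteed to be positive definite.

```sas
%let v  = 4;     /* matrix dimension (illustrative)                     */
%let lo = 0.1;   /* lower bound for off-diagonal values (illustrative)  */
%let hi = 0.3;   /* upper bound; set hi = lo for compound symmetry      */

data CorMat;
  array r[&v, &v] _temporary_;   /* full matrix held in memory */
  array m[&v] V1-V&v;            /* one output row             */
  call streaminit(12345);        /* arbitrary seed             */

  /* Fill the lower triangle and mirror it into the upper triangle */
  do i = 1 to &v;
    r[i, i] = 1;
    do j = 1 to i - 1;
      r[i, j] = &lo + (&hi - &lo) * rand('uniform');
      r[j, i] = r[i, j];
    end;
  end;

  /* Write the matrix out, one matrix row per observation */
  do i = 1 to &v;
    do j = 1 to &v;
      m[j] = r[i, j];
    end;
    output;
  end;
  drop i j;
run;

proc print data=CorMat;
run;
```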

Also, I need to be able to generate several types of cor matrices. I'm
simulating standardized variables, so the covariance structure =
correlation structure.

(1) Compound symmetry (in my code you get this by setting min=max. This
represents my fixed cor matrix.)
(2) Unstructured (set type to 'U'. This represents my random cor matrix.
In my code, I can generate random cor values bounded by a lower and upper

I would also like to be able to simulate a couple of cor structures that
represent time series data. My understanding is that an AR(1) cor
structure represents data collected at discrete points in time whereas
spatial power represents data collected continuously. Can anyone help with
the following:

(3) First-order autoregressive AR(1). If I understand the structure, the
covariance elements are composed of diagonal bands of rho raised to the
power of abs(row-col).
(4) Spatial power SP(POW)(c). Here it seems like there is a constant rho
that is raised to a power, which is allowed to vary for each covariance
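
The structures listed above can be sketched in PROC IML as follows. This is only an illustration under stated assumptions: the dimension, the value of rho, and the time points tp are hypothetical, and the spatial power matrix is taken to be rho raised to the distance between time points, which is how PROC MIXED defines SP(POW).

```sas
proc iml;
  v   = 4;      /* matrix dimension (illustrative)              */
  rho = 0.5;    /* common correlation parameter (illustrative)  */

  /* Compound symmetry: 1 on the diagonal, rho everywhere else */
  cs = j(v, v, rho) + (1 - rho) * I(v);

  /* AR(1): rho raised to |row - col| */
  idx  = 1:v;
  dist = abs(repeat(t(idx), 1, v) - repeat(idx, v, 1));
  ar1  = j(v, v, rho) ## dist;

  /* Spatial power: rho raised to the distance between time points */
  tp = {0 1 2.5 4};                 /* hypothetical, unequally spaced times */
  n  = ncol(tp);
  d  = abs(repeat(t(tp), 1, n) - repeat(tp, n, 1));
  sp = j(n, n, rho) ## d;

  print cs, ar1, sp;
quit;
```

With equally spaced time points one unit apart, the spatial power matrix reduces to the AR(1) matrix.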


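/* Note: several statements were lost from the posted code. Lines marked
   "assumed" are reconstructions, not the original author's statements. */
%macro Matrix(v=, min=, max=, type=);
  /* Build the lower triangle in long form: one observation per cell */
  data Matrix;
    step = 2*(&max - &min) / (&v * (&v - 1));
    do row = 1 to &v;
      do col = 1 to &v;
        if row >= col then do;
          if row = col then do;
            r = 0.5;              /* assumed: the two halves sum to 1 */
          end;
          else do;
            if upcase(&type) = 'U' then
              do until (r >= &min and r <= &max);
                x = ranuni(0);    /* assumed: random draw */
                r = x;            /* assumed */
              end;
            else do;
              n + 1;              /* assumed: evenly spaced values */
              r = &min + n*step;  /* assumed */
            end;
          end;
        end;
        else r = 0;               /* assumed: upper triangle left at 0 */
        output;                   /* assumed */
      end;
    end;
    drop x;
  run;

  /* Lower triangle in wide form: one observation per matrix row */
  proc transpose data=Matrix out=M2(drop=row _name_);
    var r;
    by row;
  run;

  /* Transpose again to get the upper triangle */
  proc transpose data=M2 out=M3(drop=_name_);
  run;

  /* Sum the two halves to get the full symmetric matrix */
  proc iml;
    use m2;
    read all into m2;
    use m3;
    read all into m3;
    Matrix = m2 + m3;             /* assumed */
    create Matrix from Matrix;
    append from Matrix;           /* assumed */
  quit;

  /* Rename Col1..Colv to V1..Vv */
  data Matrix;
    format Col1-Col&v 10.2;
    retain Col1-Col&v;
    set Matrix;
    %do i = 1 %to &v;
      rename Col&i = V&i;
    %end;
  run;
%mend Matrix;

%Matrix(v=4, min=.1, max=.1, type='Y');

proc print data=Matrix;
run;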

4. Generating an unstructured positive definite correlation matrix

5. correlation scatterplot Matrix

I am a novice SAS user and have run into a problem creating a scatter plot
matrix of the Longley data, the kind where all the charts are shown in a
6x6 or 7x7 grid. I have tried a number of commands but am not having much
luck.
There are 7 variables including the response variable:
x1, ..., x6 and Y.
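
One procedure that may do this is PROC SGSCATTER, which is available in SAS 9.2 and later. The sketch below assumes the Longley data are already in a data set called longley with variables named y and x1 through x6; those names are assumptions, not taken from the post.

```sas
/* Sketch: 7x7 scatter plot matrix of the response and six predictors.
   The data set LONGLEY and the variable names y, x1-x6 are assumed. */
proc sgscatter data=longley;
  matrix y x1 x2 x3 x4 x5 x6 / diagonal=(histogram);
run;
```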

Q2: Also, how do you get the code when working from the window-based
(point-and-click) side of SAS, as compared to using the editor? I have not
used the point-and-click area for analysis. I use the editor window, but
when I search for help, most of the information I find is about
window-based SAS. I am sure I could do the above in window-based SAS, but
then how do I access the code?
Thanks for your help.

6. Correlation matrix from DoE

7. Specifying a residual correlation matrix

Hi all,

I wonder if it is possible to specify a tailor-made residual
correlation matrix (R) to be used in a mixed model analysis (PROC MIXED

The reason for the question is that we have observations on a number of
individuals (one obs per individual) and we are interested in
estimating the effects of some (fixed) explanatory variables on the
outcome. However, we know that these individuals are relatives and
would like to account for this similarity. The idea was then to force
SAS to use an R-matrix that we have defined and that shows the
relationship between them, since we believe that e is not
ND(0,I*sigma_e), but rather ND(0,R*sigma_e). I believe this is pretty
straightforward MME, but I am not sure that SAS can handle it.

All and any ideas are welcome!


8. Computing Contemporaneous Cross-Correlation Matrix