sas >> nonparametric tests for difference in means

by mtolea » Thu, 28 Jun 2007 09:27:55 GMT

I understand that the following nonparametric tests: Wald-Wolfowitz runs
test, the Mann-Whitney U test, and the Kolmogorov-Smirnov two-sample test
can be used to compare two independent groups on the mean value for some
variable, when this variable is not normally distributed.

Could someone help me with SAS code for one (or more) of these tests? All
I need is the two means, their standard errors and the p value comparing the

Thank you,


by EvilPettingZoo97 » Thu, 28 Jun 2007 09:34:59 GMT

On Wed, 27 Jun 2007 21:27:55 -0400, Magda Tolea < XXXX@XXXXX.COM >


You will want to check out the documentation on PROC NPAR1WAY.


proc npar1way data=sashelp.class ;
class sex ;
var age ;
run ;
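
For the specific request (two group means, their standard errors, and a p-value), a minimal sketch might pair PROC MEANS with PROC NPAR1WAY. It reuses the sashelp.class example above; the WILCOXON and EDF options request the Wilcoxon rank-sum (Mann-Whitney) and Kolmogorov-Smirnov tests, and the means and standard errors are descriptive only, not what those tests compare.

```sas
/* Descriptive statistics per group: n, mean, standard error, median */
proc means data=sashelp.class n mean stderr median;
    class sex;
    var age;
run;

/* Wilcoxon rank-sum (Mann-Whitney) and Kolmogorov-Smirnov two-sample tests */
proc npar1way data=sashelp.class wilcoxon edf;
    class sex;
    var age;
run;
```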



by MTOLEA » Thu, 28 Jun 2007 10:04:21 GMT

Thank you, Ken.


Magdalena Tolea
University of Maryland, Baltimore
DEPM/Gerontology PhD Program
MSTF room 311A
tel: 410-706-4046



by davidlcassell » Fri, 29 Jun 2007 12:59:06 GMT

In addition to the fine advice you have already received about PROC
NPAR1WAY, let me add some thoughts.

Tests like the Mann-Whitney U do not test the mean. They test the
median. There can be an important difference. Even when the data
are not perfectly normally distributed, the ordinary t-test can have more
power than Mann-Whitney, so think about this. If the data are really far
from normal, then that t-test is not appropriate, but your mean and median
may be so far apart that referring to this as a test of the mean value may
be *massively* misleading.
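
One rough way to see the gap between mean and median is to run both tests on deliberately skewed data. The sketch below assumes SAS 9 or later (for RAND and STREAMINIT) and uses simulated lognormal values; the data set name, seed, sample size, and shift are all made up for illustration.

```sas
/* Simulated skewed data: two groups of 30 lognormal values (illustrative only) */
data skewed;
    call streaminit(2007);                 /* arbitrary seed */
    do group = 'A', 'B';
        do i = 1 to 30;
            /* group B is shifted upward by 0.5 on the log scale */
            x = exp(rand('normal') + 0.5*(group = 'B'));
            output;
        end;
    end;
    drop i;
run;

/* Mean and median can differ noticeably for skewed data */
proc means data=skewed n mean stderr median;
    class group;
    var x;
run;

/* Ordinary two-sample t-test on the raw values */
proc ttest data=skewed;
    class group;
    var x;
run;

/* Wilcoxon rank-sum (Mann-Whitney) test on the same data */
proc npar1way data=skewed wilcoxon;
    class group;
    var x;
run;
```

Comparing the PROC MEANS output with the two p-values gives a feel for how far a "test of the mean" and a rank-based test can diverge on data like these.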

The other tests that you mention are a lot farther away from a test of the
mean value. K-S and tests like it test whether the entire distribution
of sample A looks like that of sample B. They can have exactly the same
mean and still be vastly different according to their EDFs. Wald-Wolfowitz
doesn't even do that much. It is a test of randomness of sequence based
on runs in the data stream.
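
The point about K-S can be illustrated with two simulated groups that share a mean but differ in spread. This is only a sketch: it assumes SAS 9 or later, and the data set name, seed, and standard deviations are invented for the example.

```sas
/* Two groups with the same mean (0) but different standard deviations */
data spread;
    call streaminit(2007);                 /* arbitrary seed */
    do i = 1 to 100;
        group = 'A'; x = rand('normal', 0, 1); output;
        group = 'B'; x = rand('normal', 0, 3); output;
    end;
    drop i;
run;

/* The t-test compares the two means */
proc ttest data=spread;
    class group;
    var x;
run;

/* EDF requests the Kolmogorov-Smirnov two-sample test, which compares whole distributions */
proc npar1way data=spread edf;
    class group;
    var x;
run;
```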

Could you explain why you listed these three tests, when at least two
of them are drastically inappropriate for your stated purpose?



by davidlcassell » Fri, 29 Jun 2007 13:01:05 GMT

XXXX@XXXXX.COM quasi-evilly replied:

True, if the data are actually independent observations.

The more I think about Magda's reference to Wald-Wolfowitz,
the more I wonder if these are really independent data.


Similar Threads

1. Inconsistency between multiple test by Dunn and nonparametric one-way anova - Kruskall-Wallis test

2. Dependent sample difference in mean test

I have two dependent samples with different numbers of observations.  I
need to know whether the means of the two samples are statistically
different from each other.

My sample_1 has approximately 800,000 observations.  Sample_2 has
approximately 130,000 observations.

I have run a regression on sample_1 to generate coefficients.  I then
"fit" the coefficients from sample_1 to the characteristics of sample_2
observations.  This gives me a predicted value for sample_2 based on
sample_1 coefficients.  I then calculate a residual by subtracting each
sample_2 observation actual value from the predicted value (predicted
from the sample_1 coefficients applied to the sample_2

Then I take the mean of the residuals from sample_2.

I repeat the process in the opposite, i.e., I run a regression on
sample_2, get coefficients, then fit the coefficients from sample_2
to the sample_1 characteristics.  This generates a predicted value,
which I subtract from each sample_1 actual - this generates the
sample_1 residuals.  I then take the mean sample_1 residual.

I expect the sample_1 and sample_2 residuals to be of opposite sign.  I
need to test the difference in the mean residuals.  I have two
dependent samples (of residuals) and I have very different sample sizes
(of residuals).

I can make the assumption that they are perfectly negatively correlated
and proceed with a t-test.  Then assume that they are perfectly
uncorrelated and proceed with a t-test.  This will give me a range of
t-stats for my test.

But, I was hoping someone could help me with a stronger (or more
direct) test.  I'm afraid the range won't give strong enough results.

So, this is a statistical theory question instead of a direct SAS


3. Testing Difference in Means for 30 cases

4. how to test difference in difference?

Hi all,

I have the following dataset, where Y is the dependent variable, X1 is a continuous independent variable, and X2 is a dummy for group A (=1) versus group B (=0)

Y       X1        X2

I would like to conduct the following test with standard error reported:

{(average y of group A if X1 increases by 0.1 - average y of group A if X1 increases by 0.5) - (average y of group B if X1 increases by 0.1 - average y of group B if X1 increases by 0.5)},

How should I do this?

Thanks in advance.
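
One possible approach, assuming a linear model with an interaction, y = b0 + b1*X1 + b2*X2 + b3*X1*X2, is to note that the quantity in braces reduces to (0.1 - 0.5)*b3 = -0.4*b3, which an ESTIMATE statement in PROC GLM can report with its standard error. The sketch below runs on simulated data, since the original data were not posted; the data set name, coefficients, and seed are made up.

```sas
/* Simulated data (illustrative only): x2 = 1 for group A, 0 for group B */
data did;
    call streaminit(2007);                 /* arbitrary seed */
    do i = 1 to 200;
        x2 = (i > 100);
        x1 = rand('uniform');
        y  = 1 + 2*x1 + 0.5*x2 + 1.5*x1*x2 + rand('normal');
        output;
    end;
    drop i;
run;

/* x2 is left out of CLASS so that x1*x2 is a single slope-difference parameter */
proc glm data=did;
    model y = x1 x2 x1*x2 / solution;
    /* (0.1 - 0.5) times the interaction coefficient */
    estimate 'DiD A vs B' x1*x2 -0.4;
run;
quit;
```

If the relationship between Y and X1 is not linear, this simplification no longer holds and the contrast would have to be built from predicted values instead.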



5. How to get p-values for nonparametric multiple test by Dunn

6. Walsh's nonparametric test

In answering a lister's posting concerning "Grubb's Test for Outliers" David L. Cassell wrote:

"There are nonparametric tests, like Walsh's procedure, which would be preferable under most conditions (except, of course, real normality with precisely one outlier)."

Could you enlighten on how to implement such a test? Thanks a lot!


7. RE : nonparametric Dunnett test in SAS

8. Re-2: Inconsistency between multiple test by Dunn and nonparametric one-way anova - Kruskall-Wa

Hello Holger!

Looks like you are in the same position that I am. I like statistics and
use it very often for my research work as well as for my colleagues.
However, I often have problems, since I am not a stats man. :(

About the program you sent me: I hope it's a good approach, since the
SAS guys provided this example in their Online Documentation. However,
as you already pointed out on SAS-L, it's very tedious to copy and paste p-values
from the Wilcoxon rank sum test to new data. I have created a macro which does
this job for you. I have attached it below. If you find any errors, please let me
know. I'm also sending this mail to SAS-L so anyone can use this macro.

With regards, Gregor

On Friday 21 May 2004 18:24,  XXXX@XXXXX.COM  wrote:
> Dear Gregor,
> I promised you to tell you what I found out about Bonferroni-Holms
> correction. I had a talk with a professional statistician, but we did not
> talk about this item. There were so many other problems. It's hard to find
> a good statistician, who is nearby and has time enough for discussion about
> all the things I would like to know. I know one old statistician, who worked
> as Professor at university, but as many methods we use are developed in the
> last 20 years, I fear it will be no good idea to ask him for some advice.
> If you get more information about how to conduct Bonferroni-Holms
> correction (I really hope that the program I sent to you is OK) I would be
> glad, if you could let me know.
> Dear Gregor, I wish you a very nice weekend!
> Best regards
> Holger

Lep pozdrav / With regards



 Macro name: wilcoxon_multiple

 Purpose: Tests pairwise differences between three or more independent
           groups by means of nonparametric Wilcoxon rank sum or
           Mann-Whitney U test and applies Bonferroni-Holms correction
           for multiple comparison.

 Written by: Gregor GORJANC
             gregor< mrcina< bfro< uni-lj< si
             Animal Science Department
             Biotechnical faculty
             University of Ljubljana
             Groblje 3
             SI-1230 Domzale

 Current version: $Id:,v 1.1 2004/05/24 10:38:06 gregor Exp gregor $

 SAS version: SAS v8.2 on linux and windows XP platform

 Method: Read data and perform pairwise test between groups with npar1way
          procedure. Raw p-values are stored in temporary table, which is
          used for multtest procedure to apply correction for multiple
          comparison.

         %wilcoxon_multiple ( data =
                             , var =
                             , class =
                             , out =
                             , where =
                             , clean = );

 Required parameters: data - SAS data set
                      var - variable name
                      class - class/group effect name

 Optional parameters: out - name string for created tables, default is
                             wilcoxon_multiple
                      where - SAS and SQL where clause for matching
                               specific records
                      clean - remove temporary tables at the end, [yes|no]
                               default is yes

 Sub-macros called: none

 Data sets created: wilcoxon_multiple_l - tests are presented in rows i.e.
                                            k*(k-1)/2 rows for k groups
                    wilcoxon_multiple_c - multiple comparison table i.e.
                                           k*k table for k groups

 Limitations: Group names should not be longer than 10 characters.

 Notes: Idea taken from multtest examples in SAS OnlineDoc. However, I
         do not know if this procedure is correct from statistical point
         of view.

 History: Look at changelog at the end of file.

 Sample macro call:

data example;
    input y genotype $;
    datalines;
1 LW
1 LW
3 LW
2 LW
2 SL
2 SL
4 SL
3 SL
5 DU
7 DU
9 DU
3 DU
;
run;

* Without optional parameters;
%wilcoxon_multiple ( data = example
                   , var = y
                   , class = genotype
                   , out =
                   , where =
                   , clean = );

* With optional parameters;
%wilcoxon_multiple ( data = example
                   , var = y
                   , class = genotype
                   , out = genotype_y
                   , where = y > 1
                   , clean = no);


%macro wilcoxon_multiple ( data =
                         , var =
                         , class =
                         , out = wilcoxon_multiple
                         , where =
                         , clean = yes);

    %* --- Echo --- ;

    %* --- Default values for some optional parameters --- ;
    %if %length(&out)=0 %then %let out=wilcoxon_multiple;
    %if %length(&clean)=0 %then %let clean=YES;

    %* --- Table cleanup variable - separate names with commas! --- ;
    %let tables=;

    %* --- Get groups, pairs, ... --- ;
    %* Number of groups and groups;
    proc sql noprint;
        SELECT COUNT(DISTINCT &class)
        INTO :num
        FROM &data
        %if %length(&where) ne 0 %then WHERE &where;
        ;
        CREATE TABLE tmp1 AS
            SELECT DISTINCT &class
            FROM &data
            %if %length(&where) ne 0 %then WHERE &where;
            ORDER BY &class;
        %let tables=tmp1;
    quit;
    %* Number the groups so that pairs can refer to them by id;
    data tmp1;
        set tmp1;
        id=_N_;
    run;
    %* Pairs;
    proc sql;
        CREATE TABLE &out._l (
        test_num NUM,
        &class.1 CHAR(10),
        &class.1n NUM,
        &class.2 CHAR(10),
        &class.2n NUM,
        test CHAR(21),
        raw_p NUM
        );
        %let test_num=0;
        %do i=1 %to &num;
            %do j=%eval(&i+1) %to &num;
                %let test_num=%eval(&test_num+1);
                INSERT INTO &out._l (test_num, &class.1n, &class.2n)
                    VALUES (&test_num, &i, &j);
            %end;
        %end;
        UPDATE &out._l AS a SET &class.1 = (
            SELECT &class
            FROM tmp1 AS b
            WHERE A.&class.1n=B.id);
        UPDATE &out._l AS a SET &class.2 = (
            SELECT &class
            FROM tmp1 AS b
            WHERE A.&class.2n=B.id);
    quit;
    data &out._l;
        set &out._l;
        test=compress(&class.1||'-'||&class.2,' ');
    run;

    %* --- Wilcoxon rank sum or Mann-Whitney U test --- ;
    %do i=1 %to &test_num;
        %* Data for &i test: records of the two groups in this pair;
        proc sql;
            CREATE TABLE tmp2 AS
                SELECT A.&var, A.&class
                FROM &data AS a, &out._l AS b
                WHERE B.test_num=&i AND
                      (B.&class.1=A.&class OR
                       B.&class.2=A.&class)
                      %if %length(&where) ne 0 %then AND &where;
                ;
        quit;
        ods output wilcoxontest=tmp3;
        proc npar1way data=tmp2 wilcoxon;
            var &var;
            class &class;
        run;
        %* Store the two-sided p-value of the test in &out._l;
        proc sql;
        UPDATE &out._l AS a SET raw_p = (
           SELECT nValue1
           FROM tmp3 AS b
           WHERE Name1='P2_WIL')
           WHERE A.test_num=&i;
        quit;
    %end;
    proc sql;
        CREATE TABLE tmp4 AS
            SELECT test_num, raw_p
            FROM &out._l;
    quit;
    %let tables=&tables, tmp2, tmp3, tmp4;

    %* --- Bonferroni-Holms correction --- ;
    proc multtest pdata=tmp4 holm hoc fdr out=tmp4 noprint;run;
    data &out._l;
        merge &out._l tmp4;
        by test_num;
    run;

    %* --- Create multiple comparison table --- ;
    proc sql;
        CREATE TABLE &out._c (
        &class CHAR(10),
        %do i=1 %to &num;
            &class.&i NUM,
        %end;
        id NUM
        );
        INSERT INTO &out._c (&class, id)
            SELECT &class, id
            FROM tmp1;
        %* Fill the upper triangle: row i, column j holds the adjusted p-value;
        %do i=1 %to &num;
            %do j=%eval(&i+1) %to &num;
                UPDATE &out._c AS a SET &class&j = (
                   SELECT stpbon_p
                   FROM &out._l AS b
                   WHERE B.&class.1n=&i AND
                         B.&class.2n=&j)
                   WHERE A.id=&i;
            %end;
        %end;
        ALTER TABLE &out._c DROP id;
    quit;

    %* --- Add n and means to &out._l --- ;
    %* n and mean describe the first group of each pair;
    data &out._l;
        set &out._l (keep=test_num &class.1 &class.2 raw_p stpbon_p);
        n=.;
        mean=.;
    run;
    proc sql;
        UPDATE &out._l AS a SET n = (
            SELECT COUNT(&var)
            FROM &data AS b
            WHERE A.&class.1=B.&class
            %if %length(&where) ne 0 %then AND &where;
            );
         UPDATE &out._l AS a SET mean = (
             SELECT AVG(&var)
             FROM &data AS b
             WHERE A.&class.1=B.&class
             %if %length(&where) ne 0 %then AND &where;
             );
    quit;

    %* --- Print the results --- ;
    title " Multiple comparison between &class for &var ";
    proc print data=&out._l;run;
    proc print data=&out._c;run;

    %* --- Cleanup --- ;
    %if %upcase(&clean)=YES %then %do;
    proc sql;
        DROP TABLE &tables;
    quit;
    %end;

%mend wilcoxon_multiple;

 $Log:,v $
 Revision 1.1  2004/05/24 10:38:06  gregor
 Initial revision

* ends here;
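
The correction step at the heart of the macro can be tried on its own. The sketch below feeds three made-up raw p-values (not results from any real data) to PROC MULTTEST with the same options the macro uses; the data set names rawp and adjp are hypothetical, and the input variable must be called raw_p.

```sas
/* Made-up raw p-values for three pairwise tests (illustrative only) */
data rawp;
    input test :$21. raw_p;
    datalines;
DU-LW 0.03
DU-SL 0.20
LW-SL 0.45
;
run;

/* HOLM, HOC and FDR add adjusted p-values to the OUT= data set */
proc multtest pdata=rawp holm hoc fdr out=adjp noprint;
run;

/* stpbon_p holds the Holm (step-down Bonferroni) adjusted p-values */
proc print data=adjp;
run;
```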