### sas >> QR: Bonferroni correction in multiple linear regression?

Hi,
In one-way ANOVA with a categorical variable X of more than three categories (for example, five), we usually use the Bonferroni correction in order to test the equality of the means of the Y variable.

If, instead of ANOVA, I use the multiple linear regression Y = X, do I need to do the Bonferroni correction here also?

Many thanks


### sas >> QR: Bonferroni correction in multiple linear regression?

>>> "adel F." < XXXX@XXXXX.COM > 6/6/2006 6:37 am >>> wrote
<<<
in one way anova , we usually use the Bonferroni correction , if we
have a categorical variable X, of more than 3 categories, for example
5, in order to test the equality of means of the Y variable.

If instead of anova, I use the multiple linear regression Y= X, do I
need to do the Bonferroni correction here also?
>>>

How to correct for multiple comparisons is a complex question that
depends on many things, some of them philosophical. But it cannot
depend on the choice between ANOVA and regression, since the two are
the same: both are instances of the general linear model, in matrix
form

Y = XB + e

Now, often, one uses the word 'regression' when the X is continuous,
ANOVA when the X is categorical, and ANCOVA when the X are a mix of
categorical and continuous. But that's nomenclature, not statistics.
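The equivalence can be checked numerically. Here is a minimal sketch with numpy and made-up group data (all numbers are illustrative): the one-way ANOVA F statistic and the F statistic from regressing Y on an intercept plus dummy columns come out the same.

```python
import numpy as np

# made-up data: a categorical X with k = 3 groups, outcome Y
groups = [np.array([2.1, 2.5, 1.9, 2.3]),
          np.array([3.0, 3.4, 2.8, 3.2]),
          np.array([1.5, 1.8, 1.2, 1.6])]
y = np.concatenate(groups)
n, k = len(y), len(groups)
grand = y.mean()

# one-way ANOVA: F = (SS_between / (k-1)) / (SS_within / (n-k))
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# the same model as a regression Y = XB + e:
# intercept plus k-1 dummy columns for the categories
labels = np.repeat(np.arange(k), [len(g) for g in groups])
X = np.column_stack([np.ones(n)] +
                    [(labels == i).astype(float) for i in range(1, k)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
f_reg = (((fitted - grand) ** 2).sum() / (k - 1)) / \
        (((y - fitted) ** 2).sum() / (n - k))

print(f_anova, f_reg)  # identical up to floating-point error
```

The dummy-coded regression reproduces the group means as fitted values, so its model and residual sums of squares are exactly the ANOVA between- and within-group sums of squares.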

HTH, but if you want more specific advice, I suggest writing back to the list.

Peter

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)
http://cduhr.ndri.org
www.peterflom.com

### sas >> QR: Bonferroni correction in multiple linear regression?

When performing ANOVA or multiple regression, one should never adjust
the alpha level of the omnibus test. These analyses test for effects of
some independent continuous or categorical variable(s) on some
dependent variable(s) as a whole, without pointing to significant
differences between particular pairs of groups in the case of three or
more groups. This actually is a single test, not a series of multiple
tests. You may have been far too conservative with your alpha levels in
this case.

Only if one performs post hoc analyses or a priori tests between pairs
of groups (e.g. t-tests), which involve more than one separate test,
does some alpha adjustment apply, because of the increased probability
of coincidentally significant results. Bonferroni is quite common, and
quite conservative too: your effects have to be strong to come out
significant. Other adjustment methods are available, of which the one
proposed by Keppel is quite liberal. Keppel applies a Bonferroni-like
adjustment only in the case of circular dependence of sets of tests
between groups.

Search with google on [alpha bonferroni keppel partitioning multiple].

http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0403C&L=sas-l&P=R3884
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0602B&L=sas-l&P=R42081
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0603C&L=sas-l&P=R17218

Regards - Jim.


### sas >> QR: Bonferroni correction in multiple linear regression?

No.

We don't.

When we do the omnibus hypothesis test

H0: mu_1 = mu_2 = mu_3 = ... = mu_k

we have a *single* hypothesis test and a *single* F statistic to
evaluate the whole thing, so there is no correction for multiple
comparisons.

It doesn't look like you should be doing Bonferroni anywhere.

I don't recommend it, even when you *do* have a setting for multiple
comparisons. For instance, suppose you have 42 treatments and you want
to compare all the treatment *differences* to see which ones differ
from zero. That's 42 * 41 / 2 = 861 differences to compare, unless you
have a much smaller list of _a_priori_ tests to make. Now you need some
sort of adjustment for the multiple comparisons. I don't recommend
either of Jim's suggestions; I like Tukey, or Ryan-Einot-Gabriel-Welsch.
And if you test against a control, then you have different sets of
comparisons, and different methods.
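For scale: the Bonferroni route in that 42-treatment example would test each pairwise difference at alpha divided by the number of comparisons, which illustrates how conservative it gets. A quick sketch:

```python
from math import comb

k = 42                  # treatments, as in the example above
m = comb(k, 2)          # number of pairwise differences: 42 * 41 / 2
alpha = 0.05
per_test = alpha / m    # Bonferroni per-comparison level
print(m, per_test)      # 861 tests, each at roughly 0.000058
```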

HTH,
David

Hi,

Sorry to bring up a very simple question.
I have a variable y and other independent variables x1, x2, ..., xn.
Now I need to apply the multiple linear regression model

y = a1*x1 + a2*x2 + ... + an*xn

to forecast future values of y.

Can anybody tell me in SAS what procedure I should use?

Thanks a lot.

Fred
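In SAS this kind of model is usually fit with PROC REG (or PROC GLM). As a language-neutral illustration of the fit-then-forecast step Fred describes, here is a least-squares sketch in numpy with made-up data and coefficients (everything here is invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

# made-up training data generated from y = 2*x1 - 1*x2 + noise
X = rng.normal(size=(50, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=50)

# least-squares estimates of a1, a2 in y = a1*x1 + a2*x2
a, *_ = np.linalg.lstsq(X, y, rcond=None)

# forecast y at new x values
x_new = np.array([[1.0, 2.0]])
print(a, x_new @ a)  # estimates land near (2, -1)
```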

> I have a binary dependent variable Y(0,1) and three independent
> variables X(1,2), Z(1,2,3) and T(1,2,3,4).
> I know that it is suitable to consider a logistic model for Y and
> X, Z, T.
>
> But I would like to use a linear regression model for some purposes.
> Could you please give me any suggestions on how to do this in SAS:
> how can I include dummy variables, how can I choose my reference
> levels for X, Z and T, and how do I include interaction terms?

He got some very good advice from David Cassell (David always gives good advice):
<<<
No matter how you recode your dummy variables, you cannot get around
the fact that your dependent variable Y is a 0/1 variable. You can NEVER
get the crucial underlying assumptions for simple linear regression. And
without those assumptions, you cannot get valid results!

So don't do this.

Why can't you use logistic regression? Write back to SAS-L (not to me
personally) and tell us why you want simple linear regression (well,
with just dummy variables it's an analysis of variance model) and why
you can't use PROC LOGISTIC to get what you want.
>>>

Then Youssef replied
<<<
What if I use a linear probability model with the proportion of Y=1 as
the dependent variable? That is, for each distinct combination of
categories of my independent variables, the numerator would be the
number of people with Y=1 (e.g. with X=1, Z=1, T=1) and the denominator
the total number of people in that combination (X=1, Z=1, T=1), and so
on. Would this proportion work as a dependent variable? Would it do a
better job than a simple linear model with Y(0,1)?
>>>

First, you have not answered David's questions.  Why do you want to use
linear regression?  Why do you not want to use logistic regression?
It's sort of rude to ask for advice and then ignore it.

Second, what is it you are trying to do? What are these variables?

Third, no, using proportions as a DV in linear regression is not
appropriate.  Whether it's 'better' than using 0,1 responses is not
really the point.  It's like asking whether it's better to use a hammer
or a crowbar to screw things in.  What you want is a screwdriver.  The
screwdriver here is logistic regression - at least, that's the
screwdriver based on what you've said.  There are at least two reasons
you ought not use proportions as a DV in linear regression:  1)
proportions are bounded by 0 and 1, while linear regression assumes the
DV goes from negative infinity to positive infinity.  SOMETIMES this
doesn't need to be literally true, but this isn't one of those times.
2)  Linear regression assumes homoscedasticity.  Proportions as DVs
don't have this.

Peter

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)
www.peterflom.com