by luaburto » Fri, 23 Apr 2004 00:00:13 GMT

Hello guys.
A bunch of questions about Pearson correlation.
1.- Can I calculate the Pearson correlation between a numerical
(continuous) variable and a binary (dummy, flag) variable?
2.- Is it valid?
3.- Does it give sensible results?
4.- Is there another measure/test/procedure to understand the influence
of a continuous variable on a binary variable?

Thanks in advance

by Art Kendall » Fri, 23 Apr 2004 00:25:39 GMT

from the glossary in Cohen
"Point biserial The product-moment correlation between a dichotomous
correlation and a continuous (scale) variable."

Cohen, Jacob, et al (2003) Applied multiple regression/correlation
analysis for the behavioral sciences, third edition. Lawrence Erlbaum
Associates, Mahwah, NJ.
ISBN 0-8058-2223-2
LoC HA31.3 .A67.2003

Cohen et al. is an excellent reference on correlation and regression.


by jim clark » Fri, 23 Apr 2004 00:27:33 GMT





To appreciate the above, run the following SPSS job.

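DATA LIST FREE / dich cont.
BEGIN DATA
0 1 0 2 0 3 0 4 0 5
1 3 1 4 1 5 1 6 1 7
END DATA.
* Pearson r between the dichotomy and the continuous variable.
CORRELATIONS /VARIABLES = dich cont /STATISTICS = DESCRIPTIVES.
* Independent-samples t test on the same data.
T-TEST GROUPS = dich(0 1) /VARIABLES = cont.
* Simple regression in each direction.
REGRESSION /VARIABLES = dich cont /DEPENDENT = cont /METHOD = ENTER.
REGRESSION /VARIABLES = dich cont /DEPENDENT = dich /METHOD = ENTER.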
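
For readers without SPSS at hand, the comparison between the correlation and the t test can be sketched in plain Python. This is only an illustration on the ten cases above; the helper names (`pearson_r`, `pooled_t`) are made up for this sketch and are not part of any library.

```python
import math

# Data from the SPSS job above: dich is the 0/1 flag, cont the continuous score.
dich = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cont = [1, 2, 3, 4, 5, 3, 4, 5, 6, 7]

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Ordinary product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pooled_t(group0, group1):
    """Equal-variance independent-samples t statistic (group1 minus group0)."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = mean(group0), mean(group1)
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))

r = pearson_r(dich, cont)
n = len(dich)
# t statistic for testing that the correlation is zero, with n - 2 df.
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_groups = pooled_t(
    [c for d, c in zip(dich, cont) if d == 0],
    [c for d, c in zip(dich, cont) if d == 1],
)

print(f"r = {r:.4f}")
print(f"t from r = {t_from_r:.4f}, t from group means = {t_groups:.4f}")
```

The two t statistics agree, so the significance test for r and the equal-variance t test give the same p value on 8 degrees of freedom.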

Left for another reader.

Best wishes

James M. Clark
Department of Psychology
University of Winnipeg
Winnipeg, Manitoba
CANADA

by Bruce Weaver » Fri, 23 Apr 2004 02:39:29 GMT

Art & Jim have responded to points 1-3. (Art meant to say
correlation between a dichotomous *variable*, not a
dichotomous correlation.)

You mean the binary variable is an outcome (or dependent)
variable? How about logistic regression then?

The SPSS output will include a table of regression
coefficients with a column headed EXP(B). This gives you the
odds ratio associated with a one-unit increase in your
continuous variable. (For some variables you'll want to
change the scaling so you get the odds ratio associated with a
change larger than one unit. E.g., for age, you might want
the OR for a 5- or 10-year change.)
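
To make the EXP(B) idea concrete, here is a rough sketch in plain Python rather than SPSS. It fits a one-predictor logistic regression by Newton-Raphson and then converts the slope to odds ratios. The function name `fit_logistic` is made up for the example, and the data are just the ten cases from Jim's job, which is far too few for a real analysis.

```python
import math

# Illustrative data only: x is the continuous predictor, y the binary outcome.
x = [1, 2, 3, 4, 5, 3, 4, 5, 6, 7]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

def fit_logistic(x, y, iterations=25):
    """Fit logit(P(y=1)) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1 / (1 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2x2.
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system for the update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_logistic(x, y)
odds_ratio_1 = math.exp(b1)      # what SPSS labels EXP(B): OR per one-unit increase
odds_ratio_5 = math.exp(5 * b1)  # OR per five-unit increase

print(f"B = {b1:.4f}, EXP(B) = {odds_ratio_1:.4f}")
print(f"OR for a 5-unit change = {odds_ratio_5:.4f}")
```

The OR for a k-unit change is exp(k*B), i.e. EXP(B) raised to the power k. Dividing the predictor by k before fitting gives the same number directly in the EXP(B) column.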

For help with logistic regression in SPSS, see the textbook
examples here:


by Richard Ulrich » Fri, 23 Apr 2004 13:46:15 GMT

On Thu, 22 Apr 2004 14:39:29 -0400, Bruce Weaver < XXXX@XXXXX.COM >

[snip, detail]

Then there are the graphical presentations.
box-and-whisker plots
line plots (or scattergram with the two)
two histograms

by dhg » Sat, 24 Apr 2004 23:50:25 GMT

On Thu, 22 Apr 2004 11:27:33 -0500, jim clark < XXXX@XXXXX.COM >

Nice illustration. You can also add

cont BY dich
/DESIGN = dich .

to illustrate that the eta-squared is the square of the r value.
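
A quick numerical check of that identity, sketched in plain Python rather than SPSS, using the ten cases from Jim's job. Eta-squared is computed here as the between-groups sum of squares over the total sum of squares; the variable names are my own.

```python
import math

dich = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cont = [1, 2, 3, 4, 5, 3, 4, 5, 6, 7]

# Split the continuous scores by group.
groups = {}
for d, c in zip(dich, cont):
    groups.setdefault(d, []).append(c)

grand_mean = sum(cont) / len(cont)

# One-way ANOVA sums of squares.
ss_total = sum((c - grand_mean) ** 2 for c in cont)
ss_between = sum(
    len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
    for scores in groups.values()
)
eta_squared = ss_between / ss_total

# Pearson r between the 0/1 flag and the continuous variable.
mean_d = sum(dich) / len(dich)
sxy = sum((d - mean_d) * (c - grand_mean) for d, c in zip(dich, cont))
sxx = sum((d - mean_d) ** 2 for d in dich)
r = sxy / math.sqrt(sxx * ss_total)

print(f"eta-squared = {eta_squared:.4f}, r squared = {r ** 2:.4f}")
```

With only two groups the between-groups variation is exactly what the linear relationship with the 0/1 flag captures, which is why the two numbers coincide.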

by Eric Bohlman » Tue, 27 Apr 2004 06:19:33 GMT

XXXX@XXXXX.COM (Luis Aburto) wrote in


What do you mean by "valid"?

The result will be a function of the mean difference between the two groups
defined by the dichotomy, as well as the proportions accounted for by the
two groups. Testing its significance is *exactly* the same as doing a t-
test for the mean difference between the groups.

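Specifically, if X is the binary variable and Y is the continuous variable
then r=sqrt(P1*(1-P1))*(Ybar1-Ybar0)/Sy where P1 is the proportion of
observations where X is 1, Ybar1 and Ybar0 are the mean values of Y
conditional on X=1 and X=0, and Sy is the standard deviation of Y over all
observations (computed with N, not N-1, in the denominator). In other words,
it's a standardized mean difference, like Cohen's d, scaled by a factor
involving the group proportions; the difference is that Cohen's d divides
by the pooled within-group standard deviation rather than by Sy.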
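
The formula can be checked numerically. The following is a small sketch in plain Python, using the ten cases from Jim Clark's SPSS job as illustrative data; it assumes Sy is the standard deviation computed with N in the denominator, and it also prints Cohen's d (based on the pooled within-group SD) for comparison.

```python
import math
from statistics import mean, pstdev

# Illustrative data: x is the 0/1 variable, y the continuous one.
x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [1, 2, 3, 4, 5, 3, 4, 5, 6, 7]

y0 = [b for a, b in zip(x, y) if a == 0]
y1 = [b for a, b in zip(x, y) if a == 1]
p1 = len(y1) / len(y)

# The formula from the post. pstdev divides by N, which is the
# assumption under which the identity is exact.
r_formula = math.sqrt(p1 * (1 - p1)) * (mean(y1) - mean(y0)) / pstdev(y)

# Ordinary Pearson r computed from its definition, for comparison.
mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
r_pearson = sxy / math.sqrt(
    sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
)

# Cohen's d standardizes by the pooled within-group SD instead.
n0, n1 = len(y0), len(y1)
ss_within = sum((v - mean(y0)) ** 2 for v in y0) + sum((v - mean(y1)) ** 2 for v in y1)
cohens_d = (mean(y1) - mean(y0)) / math.sqrt(ss_within / (n0 + n1 - 2))

print(f"r (formula) = {r_formula:.4f}, r (Pearson) = {r_pearson:.4f}")
print(f"Cohen's d   = {cohens_d:.4f}")
```

The first two values agree. Cohen's d is larger because it leaves the between-group spread out of its denominator.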

If you're treating the continuous variable as a predictor and the binary
variable as an outcome, then logistic regression is the usual approach.