In PROC MIXED, you can specify the E option on the LSMEANS statement

to see the actual contrasts used to compute the LSMEANS. A

useful feature.

Also in PROC MIXED, you can specify, for example

repeated/local=exp(x1 x2);

and estimate the dispersion effects of a model which has log-linear

variance.

Now I understand why, when you specify the REPEATED statement as above,

the LSMEANS have different standard errors compared to

when you do not use the REPEATED statement. I cannot understand

why the LSMEANS have different estimates when you use the REPEATED

statement above, compared to when you don't use that REPEATED

statement, especially since the E option on the LSMEANS statement

gives the exact same contrasts in both cases.

Can someone explain to me why this REPEATED statement changes the

LSMEANS estimates? Thanks!

Paige,

In a nutshell, when you specify that the error structure has local

exponential effects, then you end up fitting a weighted regression

model with weights which are not uniform for all observations.

Parameter estimates for weighted and unweighted regressions have

the same expectations but not the same point estimates in finite

samples.
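
To make the point concrete, here is a minimal sketch in Python (not SAS) of a balanced 2x2 layout fitted with a main-effects model, once unweighted and once weighted. The cell values, the weights, and the names `weighted_fit` and `lsmeans_a` are all hypothetical, chosen only for illustration. The weights are treated as known and inversely proportional to the variance, whereas PROC MIXED estimates the variance parameters itself.

```python
def weighted_fit(X, y, w):
    """Solve the weighted normal equations (X'WX) b = X'Wy, with W = diag(w)."""
    p = len(X[0])
    # Augmented matrix [X'WX | X'Wy]
    A = [[sum(wi * row[j] * row[k] for wi, row in zip(w, X)) for k in range(p)]
         + [sum(wi * row[j] * yi for wi, row, yi in zip(w, X, y))]
         for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(p):
        pivot_row = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[pivot_row] = A[pivot_row], A[i]
        pivot = A[i][i]
        A[i] = [v / pivot for v in A[i]]
        for r in range(p):
            if r != i:
                factor = A[r][i]
                A[r] = [vr - factor * vi for vr, vi in zip(A[r], A[i])]
    return [A[j][p] for j in range(p)]


def lsmeans_a(beta):
    """LSMeans for A = 0 and A = 1, averaging equally over the two levels of B."""
    mu, a, b = beta
    return (mu + 0.5 * b, mu + a + 0.5 * b)


# Hypothetical data: one response per cell (A, B) of a balanced 2x2 design
cells = [(0, 0, 10.0), (0, 1, 12.0), (1, 0, 14.0), (1, 1, 20.0)]
X = [[1.0, float(a), float(b)] for a, b, _ in cells]  # intercept, A, B
y = [resp for _, _, resp in cells]

# Hypothetical weights (1 / variance), log-linear in A and B
weights = [1.0, 2.0, 3.0, 6.0]

unweighted = lsmeans_a(weighted_fit(X, y, [1.0] * len(y)))
weighted = lsmeans_a(weighted_fit(X, y, weights))

print("unweighted LSMeans for A:", unweighted)
print("weighted LSMeans for A:  ", weighted)
```

The same coefficients (1, 0, 0.5) and (1, 1, 0.5) are applied to both fits, yet the unweighted LSMeans match the raw means of A (11 and 17) while the weighted ones come out near 10.5 and 17.17. In this sketch the gap arises because the main-effects model does not reproduce the cell means exactly, so the weights change how the lack of fit is distributed.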

Does this clarify the issue?

Dale

---------------------------------------

Dale McLerran

Fred Hutchinson Cancer Research Center

mailto: XXXX@XXXXX.COM

Ph: (206) 667-2926

Fax: (206) 667-5977

---------------------------------------

Dale, yes it clarifies the issue somewhat, and I certainly believe

you, as you have demonstrated "mad skillz" (as my teenagers would say)

in this area.

But I want to ask a few more details. The Mixed Model is formulated

as:

y = XB + Zu + e

and the variance of u is the matrix G and the variance of e is the

matrix R; and u and e are independent.

I was thinking that the least squares means depends only on X and B

and Z and u. The reason I bring this up is that when you request

PROC MIXED to estimate the dispersion effects of a log-linear variance

model, you are changing the value of the matrix R. As I see it, changing the

estimate of R does not affect the estimates of B, which in turn

determine the least squares means. Where am I wrong?

On May 23, 7:21 am, Paige Miller < XXXX@XXXXX.COM > wrote:

Paige,

You are close to the answer. The estimable functions generated by the

E option only depend on X. There is a good explanation of them in the

help file for GLM in a description of the type III sums of squares.

You get different answers when you use the repeated statement because

the estimates of B, u, and e change since you have specified a

different covariance matrix for the multivariate normal. However, the

X matrix is not changed by the covariance matrix specification, so you

get the same estimable functions from the E option.
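
In the notation of the model above (y = XB + Zu + e), the standard mixed-model result can be sketched as follows, with G and R treated as known for the moment:

```latex
V = Z G Z' + R, \qquad
\hat{B} = (X' V^{-1} X)^{-} X' V^{-1} y, \qquad
\widehat{\text{LSMean}} = L \hat{B}
```

Here L is a row of coefficients of the kind the E option prints. It is built from X alone, while \hat{B} depends on R through V. PROC MIXED substitutes estimates of G and R, so a different REPEATED specification changes \hat{B}, and hence L\hat{B}, even though L is unchanged.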

Mark

On May 23, 8:46 am, XXXX@XXXXX.COM wrote:

DING DING DING!

That was the sound of the light bulb turning on over Paige's head. I

understand now. Thanks.

But this leads to a practical problem. Seems to me that you can choose

to receive Least Squares Means that make sense to me (since I have a

balanced data set, I expect the LS Means to equal the cell means)

without being able to estimate the dispersion effects, or I can choose

to estimate the dispersion effects and get LSMeans that don't seem

intuitive and would be hard to explain.

Is that the way you see it?

On Wed, 23 May 2007 06:08:45 -0700, Paige Miller < XXXX@XXXXX.COM >

wrote:

/dragging out soapbox sound

/climbs onto soapbox

There are some days when I wish the term Least Squares Mean wasn't so

ingrained into our thinking. This would be one of them. We're a fair bit

away from the OLS solution, so these particular best linear unbiased

estimates aren't "least squares" at all. When considering the dispersion

effects, they aren't even means. They are BLU estimates of central

tendency, that we happen to call LSMeans. I have to remind myself of this

continually. They aren't really means.

/notices that noose is firmly attached around neck

/hopes no one kicks soapbox back into corner

Steve Denham

Mathematical Biologist

Monsanto Co.

On May 23, 11:52 am, XXXX@XXXXX.COM (Steve Denham)

This is a much better way of saying what I was thinking when I made my

last post, and directly addresses my concern about using the LSMEANS

while estimating dispersion effects. Thank you.

I shall come to your virtual defense if necessary.

-- Steve Denham < XXXX@XXXXX.COM > wrote:

I have to disagree here. Weighted least squares estimates are

every bit an OLS solution. The weighted ls estimation problem can

be written as

bhat = inv(X'WX)*(X'WY)

Let U = sqrt(W)*X,

Z = sqrt(W)*Y

Then

bhat = inv(U'U)*(U'Z)

which is the usual OLS estimation problem.
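
The equivalence can be checked numerically. The following Python sketch (not SAS) uses a two-column design (intercept and one predictor) and a diagonal W. The data, the weights, and the function name `solve_normal_equations` are hypothetical. It assumes the transformation scales each row of X and each element of Y by the square root of its weight, so that U'U = X'WX and U'Z = X'WY.

```python
import math


def solve_normal_equations(rows, resp, weights=None):
    """Return inv(X'WX) * X'Wy for a two-column X and diagonal W (default W = I)."""
    if weights is None:
        weights = [1.0] * len(rows)
    # Elements of the symmetric 2x2 matrix X'WX and of the vector X'Wy
    a11 = sum(w * r[0] * r[0] for w, r in zip(weights, rows))
    a12 = sum(w * r[0] * r[1] for w, r in zip(weights, rows))
    a22 = sum(w * r[1] * r[1] for w, r in zip(weights, rows))
    c1 = sum(w * r[0] * v for w, r, v in zip(weights, rows, resp))
    c2 = sum(w * r[1] * v for w, r, v in zip(weights, rows, resp))
    det = a11 * a22 - a12 * a12
    return ((a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)


# Hypothetical data and weights
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]
w = [1.0, 1.0, 1.0, 4.0]
X = [(1.0, xi) for xi in x]

# Weighted least squares solved directly
direct = solve_normal_equations(X, y, w)

# Ordinary least squares on the transformed data U = sqrt(W) X, Z = sqrt(W) Y
U = [(math.sqrt(wi) * r[0], math.sqrt(wi) * r[1]) for wi, r in zip(w, X)]
Z = [math.sqrt(wi) * yi for wi, yi in zip(w, y)]
transformed = solve_normal_equations(U, Z)

print("direct WLS:        ", direct)
print("OLS on transformed:", transformed)
```

Both calls return the same intercept and slope up to rounding error, which is the sense in which a weighted fit is an OLS fit on rescaled data.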

I have to disagree again. Weighted means are every bit as much

means as unweighted means. If the weights are inversely proportional

to the variance of the response, then we have optimal estimates

of the true mean.

Yes, this is a point of agreement. But just what are these BLU

estimates of central tendency? How do we interpret them?

Unless otherwise constructed, the LSMeans are estimates of what

the mean would be for each level of categorical predictor variable

A given that all other categorical variables have a completely

balanced distribution and that all continuous variables are

observed at their mean values. To the extent that there is a

population in which this assumption can be expected to hold true,

then the LSMeans are estimates of the expectation across the

various levels of A in that population.

Sometimes it is not at all reasonable to believe that there is

such a population. The LSMeans can be computed under other

distributions of the categorical variables if you specify the

OM option. When you specify the OM option, the LSMeans will be

much more like the observed means when the data are not balanced.

If your design matrix has no continuous variable and the categorical

variables are balanced at all levels of A, then the LSMeans will

be identical to raw means. However, if there are continuous

variables in the data set (as Paige must have since he is using

the local=exp(X1 X2) option), then LSMeans will almost never be

the same as the raw mean, even if all categorical predictor

variables are balanced.
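
The effect of a continuous covariate can be illustrated with a small Python sketch (not SAS). It assumes an unweighted one-way model with a single covariate and a common slope, and evaluates each LSMean at the overall mean of the covariate. The data and the function name `ancova_lsmeans` are hypothetical.

```python
def ancova_lsmeans(groups):
    """For each group return (raw mean, LSMean at the overall covariate mean).

    groups maps a label to a list of (x, y) pairs; a common slope is assumed.
    """
    n_total = sum(len(obs) for obs in groups.values())
    x_overall = sum(x for obs in groups.values() for x, _ in obs) / n_total
    # Per-group means of the covariate and of the response
    means = {
        g: (sum(x for x, _ in obs) / len(obs), sum(y for _, y in obs) / len(obs))
        for g, obs in groups.items()
    }
    # Pooled within-group slope
    sxy = sum((x - means[g][0]) * (y - means[g][1])
              for g, obs in groups.items() for x, y in obs)
    sxx = sum((x - means[g][0]) ** 2
              for g, obs in groups.items() for x, _ in obs)
    slope = sxy / sxx
    # Shift each raw mean along the fitted line to the overall covariate mean
    return {
        g: (means[g][1], means[g][1] + slope * (x_overall - means[g][0]))
        for g in groups
    }


# Hypothetical data: equal group sizes, but different covariate means
data = {
    "a1": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "a2": [(3.0, 5.0), (4.0, 7.0), (5.0, 9.0)],
}
result = ancova_lsmeans(data)
for label, (raw, lsmean) in result.items():
    print(label, "raw mean:", raw, "LSMean:", lsmean)
```

With these made-up numbers the raw means are 4 and 7 while the LSMeans are 6 and 5. The LSMean matches the raw mean only when a group's covariate mean equals the overall covariate mean, or when the fitted slope is zero.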

Dale

First, I don't have continuous predictors; all of my X variables are

categorical, and if I used the notation X1 X2 and it confused you, my

apologies.

More importantly, I am looking for some guidance. I do want estimates

of the LSMeans in each cell. I also want to estimate a log-linear

variance model, but if the cost of doing so is to get LSMeans that I

cannot explain and don't look right to scientists who can

independently compute the mean of their data in each cell, then maybe

I don't want the log-linear variance. Can I get reasonable looking

LSMeans and log-linear variance? It appears not. So as I see things

now, my choice is simply to pick one or the other (reasonable looking

LSMeans or log-linear variance) and live with that.

Is that how you see things? Do I have other choices? Does it make

sense to compute the LSMeans in one invocation of PROC MIXED, and the

variance estimates in a second invocation of PROC MIXED?

Thanks for any advice you might have.
