sas >> predicted value from "proc mixed"

by chunling_lu » Wed, 06 Jul 2005 22:42:45 GMT

Dear all,

Does the predicted value generated by "proc mixed" include or exclude the random effects? I am always confused...

Y = XB + random effects + errors
Predicted Y = XB
or Predicted Y = XB + random effects

???

Thanks very much.

Chunling


DavidL Cassell < XXXX@XXXXX.COM > wrote:
XXXX@XXXXX.COM replied:
>How about this:
>
>If the input dataset has any missing, I don't want to import it, if
>there's no missing, I need to import the dataset. A global flag as you
>mentioned will do it.

Much clearer. Thank you. Now I have some more questions:

[1] If you don't import the file, then what do you do next?

[2] If you *do* import the file, then what do you do next?

[3] Why does even one missing value make the whole file invalid?
Why isn't there some provision for dealing with a missing value or
two (or ten, or...)?

[4] What are you using to do the conversion from SPSS?

Thanks for clarifying,
David

sas >> predicted value from "proc mixed"

by davidlcassell » Thu, 07 Jul 2005 07:37:33 GMT



Well, it depends on what you're trying to predict in PROC MIXED.

If you want to predict a mean at a given point (the OUTPREDM= option), then
you're getting Y = XB (where B is the beta-hat vector). Output values do not
incorporate the EBLUP values for ZG. Standard errors are based only on the
covariance matrix for B.

If you want to predict a single predicted value (the OUTPRED= option), then
you're getting Y = XB + ZG (where G is the gamma-hat matrix and ZG is the EBLUP
for Z*gamma). Standard errors are based on a quadratic form incorporating B
and G both.

HTH,
David
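
A minimal sketch of requesting both kinds of predictions in a single run, for readers who want to compare them. The data set name mydata and the variables y, x, and id are hypothetical; only the OUTPRED= and OUTPREDM= options are taken from the reply above.

```sas
/* Hypothetical names: mydata, y, x, id. Replace with your own. */
proc mixed data=mydata;
   class id;
   model y = x / solution
                 outpred=pred_indiv    /* XB + ZG: fixed part plus EBLUPs */
                 outpredm=pred_mean;   /* XB only: fixed-effects mean     */
   random intercept / subject=id solution;
run;

/* Each output data set holds the input variables plus Pred and
   StdErrPred (among others), one row per input observation. */
proc print data=pred_indiv(obs=5); run;
proc print data=pred_mean(obs=5);  run;
```

The standard errors in each data set belong to that data set's own predictions, so they should be read together and not mixed across the two.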

sas >> predicted value from "proc mixed"

by chunling_lu » Thu, 07 Jul 2005 22:36:54 GMT

Dear David,

Thanks very much for the clarification.

I would like to get a predicted value for each individual, so I guess OUTPRED= will do that. Since that predicted value includes the random effects, if I want the predicted value without the random effects I would have to compute (predicted value - random effects), where the random effects for each individual are also produced by the model. Is that correct?

Chunling





sas >> predicted value from "proc mixed"

by davidlcassell » Fri, 08 Jul 2005 08:39:47 GMT


XXXX@XXXXX.COM replied:

If all you want is the point estimate using XB without ZG, then use
the OUTPREDM= option and get the predicted means. Then ignore
the standard errors that come with it, since they are for predicted
means and not single predicted points.

If you want the individual predicted value, you ought to use OUTPRED=
and leave the predicted value alone. Subtracting off the random effects
as you suggest will give you a number which has no relationship with
the standard errors that come with the point estimates.

HTH,
David

Similar Threads

1. predicted values in proc mixed

2. Inverse Predicted values from proc mixed

Hello everyone,

I was wondering if there is an easy way to get the inverse predicted values
from a mixed model; that is, the values of the independent
variable (X) that correspond to given predicted values of the dependent variable (Y).

Thanks,

Tony

3. proc mixed - predicted values the same

4. MIXED & GLIMMIX: Predicted values, class & subj variables

I am working on a project using the GLIMMIX procedure to estimate a
random coefficients Logit model.  BUT, I am struggling to understand
some theoretical concepts and SAS programming under the MIXED/GLIMMIX
procedure.  I have some questions I was hoping to get help with.

1A. Theoretical Question: if you estimated a random coefficients model
(say with PROC MIXED) would you expect the mean of the predicted values
BY SUBJECT to equal the mean of the dependent variable BY SUBJECT?

For example, using data from SAS's example 41.5 for the MIXED procedure:

-------------------------------

/* Intercept-only fixed part; Monthc enters as a random effect within Batch */
proc mixed data=rc;
 class Batch Monthc;
 model Y =  / s outp=predicted;
 random Monthc / sub=Batch s;
run;

proc sort data=predicted;
 by Batch;
run;

/* Means of Y and Pred within each Batch */
proc summary data=predicted;
 where Y~=.;
 by Batch;
 var Y Pred;
 output out=testst1 mean= ;
run;

proc print data=testst1;
run;

/* Means of Y and Pred over all observations */
proc means data=predicted;
 where Y~=.;
 var Y Pred;
run;

----------------------------

I found that the mean of Y and mean of Pred over all observations was
the same, but that the mean of Y and mean of Pred within each Batch were
NOT the same.  Why would that be? Under what conditions would you expect
them to be the same?
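
One way to see why within-batch means can differ is the simplest special case, a random-intercept model. This is a simplified illustration only: it is not the model fitted above, and the symbols below are introduced just for this sketch.

```latex
% Random-intercept model (assumed for illustration only):
%   y_{ij} = \mu + b_i + e_{ij},  b_i \sim N(0, \sigma_b^2),  e_{ij} \sim N(0, \sigma^2),
%   with n_i observations in batch i and batch mean \bar{y}_i.
\[
  \hat{b}_i = k_i\,(\bar{y}_i - \hat{\mu}),
  \qquad
  k_i = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2 / n_i} \in [0, 1)
\]
\[
  \frac{1}{n_i}\sum_{j} \hat{y}_{ij}
  = \hat{\mu} + k_i\,(\bar{y}_i - \hat{\mu})
\]
```

Because k_i is below 1 whenever the residual variance is positive, the mean of the predictions in batch i is pulled from the batch mean of Y toward the overall estimate; the two coincide only when the batch mean already equals the overall estimate or when k_i approaches 1.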

1B.  In the non-linear world of GLIMMIX, I find that the mean of Y and
Pred over ALL observations is not the same.  Is this due to the
non-linear nature (would one expect this theoretically) or is this
likely a programming problem?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2. SAS programming question:  I am trying to figure out what happens
when you include a categorical variable in the random statement.  In the
code based on example 41.5 pasted above is there an equivalent, using
dummy variables, to the "random Monthc" part of the program?  (Note: In
the example Monthc takes on 6 values: 0, 1, 3, 6, 9, 12.)

What I am struggling with is how PROC MIXED comes up with an estimated
coefficient for the fixed effect (the intercept) AS WELL AS estimates of
random effects for each value of Monthc and each Batch (6 Monthc values
* 3 Batch values = 18 estimated coefficients).  If I were doing this via
dummy variables, I would think I would have to leave a dummy out and
hence not have all those estimated coefficients.  (Note: on page 2089
of the manual it says, in reference to the opening example: "The CLASS
statement instructs PROC MIXED to consider (variables listed in the
CLASS statement) as classification variables.  Dummy (indicator)
variables are, as a result, created corresponding to all of the distinct
levels of (variables listed in the CLASS statement).")

I tried to recreate example 41.5 putting 6 dummies (month00 month01
month03 month06 month09 month12) in the random statement instead of
Monthc.  SAS smartly excluded one of them.   (Though I'm not sure why
they picked Month06 to exclude.)

----------------------------

data rc2;
 set rc;
 /* One 0/1 indicator per month value (0, 1, 3, 6, 9, 12) */
 month00 = (month = 0);
 month01 = (month = 1);
 month03 = (month = 3);
 month06 = (month = 6);
 month09 = (month = 9);
 month12 = (month = 12);
run;

/* Same model as before, with the indicators replacing Monthc */
proc mixed data=rc2;
 class Batch ;
 model Y =  / s outp=predicted2;
 random month00 month01 month03 month06 month09 month12 / sub=Batch s;
run;

----------------------------

Any help or reference to understand what happens to classification
variables in the random statement in PROC MIXED and GLIMMIX would be
helpful.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3. (Related to question 1) In the model that I am interested in, when I
take the mean of the predicted values and the mean of the dependent
variable over all observations, the mean of the predicted values is less
than the mean of the dependent variable.  Also, when I take the means
within subjects (for me subjects are states), the mean of the predicted
values for each subject (state) is *less* than the mean of the
dependent variable for each subject (state).

Am I doing something wrong?  Why does this happen?

My code:

  proc glimmix data=t21.hlmALL03 method=MMPL ;
   class sdtype sevetype CLtype FIPS;
   model IEPinc =   / dist=binomial link=logit SOLUTION;
   random  sdtype CLtype sevetype  / sub=FIPS SOLUTION G;
   NLOPTIONS tech=nrridg;
   weight ORIGWT;
   /* Predictions on the probability scale, including the random effects */
   output out=preddata PREDICTED(ilink blup)=pred;
 run;

 proc means data= preddata;
   var IEPinc pred;
   weight ORIGWT;
 run;

INFO on my Project given below.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4. About the project:

I am trying to get estimates, for each state, of the relationship
between a student's characteristics and whether or not he/she is
included in an assessment.  I intend to apply the coefficients of the model
to a subsequent year of data to do a decomposition
(Oaxaca-Blinder / Findley for this non-linear case) FOR EACH STATE of
the portion of change in inclusion rates that is due to changes in
student population in the state and the portion that is due to all
other factors.   I could estimate separate logits for each state, since
I am really just interested in state-by-state changes, but particular
SDtypes are not common, so a state may have 1 observation for SDtype=4
in the first year and 10 in the year I am applying the model to.  It was
thought that combining the estimation in a multi-level model would help
me estimate coefficients for states with 0 or few observations for a
particular type (say SDtype=4) by drawing on data from states with many
observations.  It is thought that states have some similarities in how
they handle different types of students but also some differences, and
that the differences are systemic state-wide.

IEPinc = 0 if not included; 1 if included on the assessment
Sdtype has 13 values
Sevetype has 4 values
CLtype has 3 values
FIPS is the state identification variable

Note also that students are nested within schools within states, but I
ignore, perhaps incorrectly, the within-school nesting.  The weights
are given to make the students within each state representative of that
state.

Comments on my modeling of the situation in "proc glimmix" above would
be very welcome.  Specifically:

4a.  Is this the right way to get separate coefficients by state?

4b. Do I need to include "sdtype CLtype sevetype" in the MODEL
statement?  I want separate estimates by state so I left out the fixed
effects portion.

4c. Do I need to include "Intercept" in the RANDOM statement?  I left it
out because it caused the G matrix to not have full rank.  The estimated
coefficients come out the same.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Thanks for reading and thanks for all the help people give on this
list... the archives have always been a great resource for me.

Sami

5. proc mixed. group trajectory of the predicted dependent

6. Predicted values and CI's in proc glm

Hi All, 

 

Does anyone know if it is possible to calculate predicted values AND their
confidence intervals for an individual with known values of the explanatory
variables, using proc glm?

 

In proc genmod it is possible to do this as follows:

1)       Extend the dataset by one or more row(s) corresponding to the values
that you want to predict.

2)       Create a weight variable, such that the original data are given a
weight of 1, and the new variable a weight of zero, so that only the original
data is used in the model fitting calculations.  When predicted values are
asked for, proc genmod gives predicted values for the original data and the
new lines.

 

I tried this in proc glm, but the values for the confidence interval of the
predicted values of my new lines are just listed as missing, with a note that
'observation was not used in this analysis'.

 

I have created some dummy data: - the last two lines of the data are the ones
I want to predict values and confidence interval of the y variable for, but
not to take into account when fitting the model.  

 

data MadeUp;
   input y x1$ x2$ weight;
   datalines;
10 y n 1
15 n y 1
12 y n 1
10 y y 1
11 n y 0
12 y n 0
;
run;

proc glm data=MadeUp;
   class x1 x2;
   weight weight;
   model y = x1 x2 / cli;
run;

 

To put this into context, I have data on fruit yield (y) with several factors
that predict growth (x1 - x4).  Each of x1-x4 has a cost associated with
it, and I will go on to calculate the profit and CI for profit based on
predicted yield and the cost of having each factor (i.e. calculate the
economic optimum).

 

I'm running SAS 8.2 on Windows XP.  Any help would be much appreciated.

 

Thanks, 

 

Mark

 

7. AW: predicted values in proc genmod

8. PROC NLMIXED predicted values (zero inflated Poisson)

Referencing much-appreciated prior posts by Dale McLerran, I fit a zero
inflated Poisson model using:

proc nlmixed data=mmf.model;
  by groupvar;
  ETA_PROB = BP_0 + BP_1*z1;
  p_0 = exp(eta_prob)/(1 + exp(eta_prob));
  ETA_LAMBDA = B0+ B1*x1+B2*x2+B3*x3+
      B4*x4+B5*x5+B6*x6+B7*x7;
  lambda = exp(eta_lambda);
  if y=0 then prob = p_0 + (1-p_0)*exp(-lambda);
  if y=0 then loglike = log(prob);
  else loglike = log(1-p_0) + y*log(lambda) -lambda - lgamma(y+1);
  model y ~ general(loglike);
run;

and now I need to plot the predicted values of y versus the observed
values of y. Can these predicted values of y be output somehow from PROC
NLMIXED (as a colleague, who is not available for questioning, indicated)?
I can't seem to find a way. Otherwise, I think that what I need to do is
compute for each observation "p_0_hat" and "lambda_hat" (so, predicted
values of p_0 and lambda, respectively) and then compute y_hat = (1-
p_0_hat)*lambda_hat. Is that right? Thank you!
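
A possible way to get those values straight from the procedure is the PREDICT statement of PROC NLMIXED, which evaluates an expression at the fitted parameters for every observation and writes it to a data set. The sketch below repeats the model from the post and adds only the PREDICT line; the output data set name zip_pred is hypothetical.

```sas
/* Model statements copied from the post; only PREDICT is new. */
proc nlmixed data=mmf.model;
  by groupvar;
  ETA_PROB = BP_0 + BP_1*z1;
  p_0 = exp(eta_prob)/(1 + exp(eta_prob));
  ETA_LAMBDA = B0+ B1*x1+B2*x2+B3*x3+
      B4*x4+B5*x5+B6*x6+B7*x7;
  lambda = exp(eta_lambda);
  if y=0 then prob = p_0 + (1-p_0)*exp(-lambda);
  if y=0 then loglike = log(prob);
  else loglike = log(1-p_0) + y*log(lambda) -lambda - lgamma(y+1);
  model y ~ general(loglike);
  /* Expected count under the zero-inflated Poisson: (1 - p_0) * lambda.
     zip_pred is a hypothetical data set name. */
  predict (1-p_0)*lambda out=zip_pred;
run;
```

The output data set should carry the input variables along with the predicted value in a variable named Pred, so y and Pred can be plotted against each other within each level of groupvar.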