### sas >> how to generate multivariate random data from a given

hi,

I want to generate multivariate random data from a given distribution that
is neither multivariate normal nor Student's t.
In particular, the idea comes from the paper by Clayton et al. (1985), Journal of
the Royal Statistical Society, Series A.
Does anybody have any suggestions? Thanks

Jeff

```For instance, how do I randomly select one value from (1,2,3,4)? I know ranuni
samples from U(0,1).
Thanks
```

```Hi,

I need to create random, dummy data in the following scenarios.  Where I
believe I have a solution, I'll post it, but am open to improvements.

1.  Uniform distribution of contiguous numeric values (eg. 1,2,3,4,5)
var = int(ranuni(0)*5)+1;  where 5 = number of elements, 1 = starting value

data _null_;
do i=1 to 20;
var = int(ranuni(0)*5)+1;
put var=;
end;
run;

2.  Uniform distribution of non-contiguous numeric values (eg. 9,3,7,4,5)
array list{5} _temporary_ (9,3,7,4,5);
var = list{int(ranuni(0)*dim(list))+1};

data _null_;
array list{5} _temporary_ (9,3,7,4,5);
do i=1 to 20;
var = list{int(ranuni(0)*dim(list))+1};
put var=;
end;
run;

Note:  this approach is generic, and could also be used for #1.

3.  Uniform distribution of either contiguous or non-contiguous character
values

Use similar approach to #2

data _null_;
array list{5} $ _temporary_ ("S","C","O","T","B");
do i=1 to 20;
var = list{int(ranuni(0)*dim(list))+1};
put var=;
end;
run;

4.  Non-uniform distribution of boolean values (desire a weighting toward
one or the other)

var = (ranuni(0) > .8);  where we desire 80% of hits to be 0, 20% to be 1
this assumes 0 based numbering, use an offset if
using 1 (or n) based numbering

reverse the comparison if we want 80% of hits to be 1, i.e.

var = (ranuni(0) < .8);

data _null_;
do i=1 to 20;
var = (ranuni(0) > .8);
put var=;
end;
run;

=======================================

Here is where I'm stuck for ideas...

5.  Non-uniform distribution of multiple values, weighting desired for one
item, remaining percentage spread amongst other values.

eg. (1,2,3,4,5), desire 80% of hits on 4, 20% of hits spread between 1,2,3,5

Perhaps this is a good approach???

data _null_;
array list{4} _temporary_ (1,2,3,5);
do i=1 to 40;
if (ranuni(0) < .8) then
var = 4;
else
var = list{int(ranuni(0)*dim(list))+1};
put var=;
end;
run;

6.  Non-uniform distribution of multiple values, weighting desired for
multiple items, remaining percentage spread amongst other values.

eg. (1,2,3,4,5), desire 30% of hits on 2, 20% of hits on 4, rest of hits
spread between 1,3,5

I'm stuck on the best approach on this one.  Any good ideas?

The solution needs to run within a data step, as this algorithm would be
part of a larger data step.

Any input, esp. on #5 and #6, is appreciated.

Thanks,
Scott

P.S.:  The final solution would be a macro that would get the values of a
format and create this code.  For example (pseudocode and untested):

proc format;
value code (NOTSORTED)
9 = "Code 1"
3 = "Code 2"
7 = "Code 3"
4 = "Code 4"
5 = "Code 5"
;
run;

data testdata;
attrib code1 length=8 format=code.;
attrib code2 length=8 format=code.;
do pt=10011001 to 10011010;
code1=%dummy_data(code);  * uniform distribution across uncoded format
values ;
code2=%dummy_data(code,wval=7,wpct=.8);  * 80% of values are 7,
remainder are spread across rest of values ;
output;
end;
run;

This would likely involve creating a proc format cntlout dataset, using
%sysfunc to open that dataset, build some macro variables, and generate the
appropriate SAS code.  Of course, the generated SAS code must be
syntactically correct for a data step (i.e. cannot invoke a procedure, etc).

```
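
One possible way to handle #5 and #6 inside a single data step is to keep a second array of probabilities alongside the list of values and walk the cumulative sum until it passes a uniform draw. This is only a sketch, not a tested solution: the `prob` array and the variables `u`, `cum` and `k` are names made up for the illustration, and the weights are the ones from example #6 (30% on 2, 20% on 4, the remaining 50% split evenly over 1, 3 and 5).

```sas
/* Sketch: weighted draw by cumulative-probability lookup.        */
/* prob, u, cum and k are illustrative names, not from the post.  */
data _null_;
  array list{5} _temporary_ (1 2 3 4 5);
  /* 30% on 2, 20% on 4, remaining 50% split over 1, 3 and 5 */
  array prob{5} _temporary_ (0.16667 0.30 0.16667 0.20 0.16666);
  do i = 1 to 40;
    u = ranuni(0);
    cum = 0;
    /* stop at the first position whose cumulative weight exceeds u */
    do k = 1 to dim(list);
      cum = cum + prob{k};
      if u < cum then leave;
    end;
    /* rounding guard: if nothing matched, fall back to the last value */
    if k > dim(list) then k = dim(list);
    var = list{k};
    put var=;
  end;
run;
```

Case #5 (80% on 4, the rest spread over 1,2,3,5) would be the same code with `prob` set to (0.05 0.05 0.05 0.80 0.05). The built-in RANTBL function, called as `rantbl(0, p1, ..., pn)`, returns an index of this kind directly and may be simpler when the probabilities can be written out as arguments.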

```Dear All,

I have a dataset with a hierarchical structure: individuals nested within countries. I have four continuous outcomes, Y1, Y2, Y3 and Y4; the pairwise correlations between them range from 0.15 to 0.56.

I have two continuous and four categorical independent variables.

I first used a univariate random-effects model, i.e. a separate mixed model for each Yi.

Only one of my four outcomes, Y2, showed a country effect, which explained 5% of the variance of Y2; the other three showed no significant country effect, as judged by the p-value of the random-effect solution.

I then used proc mixed in a multivariate framework, analysing the four outcomes (Y1,Y2,Y3,Y4) in one single model, using the code kindly provided by our friend Dale.

I ran out of memory with the full multivariate model, so I considered 6 separate bivariate random-effects models, (Y1,Y2), (Y1,Y3), ..., (Y3,Y4), to take the pairwise correlation of the outcomes into account.

I have noticed some differences in the fixed-effect estimates between the different bivariate models, and between these and the univariate models (proc mixed run on each outcome separately).

My question is how to interpret the results of these 6 bivariate models and the separate univariate mixed models, and what the best way is to take into account both the correlation of the outcomes and the hierarchy of my design.