sas >> a very crude macro on calculating factorial..

by npunnani » Fri, 16 Jul 2004 23:57:16 GMT

Hello all,

I am trying to write a macro that takes in a number and calcualtes the
factorial from 1 to n in the ascending order. I have written it and it
is executing properly but i need suggestions to achieve it in lesser
number of steps. I am a beginner in SAS and so not very good in
building proper logic into the code.can number of loops be reduced?

any suggestions would be of great help .

Thanks

Rao.



/*-------Macro to calculate the factorial from 1 to n in the ascending
order-----*/

dm log 'clear';
dm output 'clear';
%macro factorial(n); /*---parameter to the macro----*/
data dd;
do i = &n to 1 by -1;
j = 1;
m = i;
k = 1;
do until (m<=1);
j = j * m;
k = m - 1;
m = k;
end;
output;
end;
drop m k ;
run;
proc sort;
by i;
run;
proc print noobs label;
label i = 'Integer'
j = 'factorial';
run;
%mend factorial;


sas >> a very crude macro on calculating factorial..

by yhuang » Sat, 17 Jul 2004 00:10:27 GMT


On Fri, 16 Jul 2004 10:57:16 -0500, nageswara rao punnani



Why not use SAS provided a function FACT():

50 data _null_;
51 a=fact(5);
52 put a=;
53 run;

a=120

Kind regards,

Ya Huang



sas >> a very crude macro on calculating factorial..

by sashole » Sat, 17 Jul 2004 02:26:02 GMT

ao,

Of course Ya is absolutely right, and the FACT function will do the job with
[almost] no programming at all [and before FACT(N) became available, there
had been GAMMA(N+1)]. However, it you intend the coding of factorial as an
exercise in the SAS programming, it is as good as another, and in this case,
a few points might be due.

Your gut feeling is right - you only need a single loop. Moreover, by
altering its direction from reverse to forward, you can make your factorials
naturally ordered the way you wish and thus render PROC SORT unnecessary,
all of which simplifies the code. Further, I find it usually more convenient
to not include a procedure to display data together with code producing
data in the same macro. Just make the macro output the file you specify,
then you will have the liberty to display it by a means of your choice - and
in SAS, there are many more than PRINT. In other words,

%macro fact (n= , out=) ;
data fact ;
retain n 0 fact 1 ;
output ;
do n = 1 to &n ;
fact = fact * n ;
output ;
end ;
run ;
%mend fact ;

%fact (n = 18, out = fact) ;

proc SQL ;
select n format=2.
, fact format=18. label = 'n!'
from fact
;
quit ;

Which prints:

n n!
----------------------
0 1
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
10 3628800
11 39916800
12 479001600
13 6227020800
14 87178291200
15 1307674368000
16 20922789888000
17 355687428096000
18 6402373705728000

Note that 18! is just as much as you can fit into the SAS integer precision.
Also, 170! is the maximal factorial (~1E306) you can fit in a SAS number
under Windows (less under z/OS, because there the exponent is shorter).

As you see, in this particular case, the availablility of FACT function does
not simplify code way too much; ite would change the Data step stuffing to

do n = 1 to &n ;
fact = fact (n) ;
output ;
end ;

Moreover, it is easy to see that computer-wise, this code is actually less
efficient, for each time FACT() is called, its guts have to recalculate the
entire product instead of just multiplying by the next N. Of course, the
difference is negligible and virtually impossible to notice, unless just for
the sake of curiosity we recompute a factorial enough times (from N = 1 to
170, 1 million times below):

04 %let n = 170 ;
05 data _null_ ;
06 do repeat = 1 to 1e6 ;
07 fact = 1 ;
08 do n = 1 to &n ;
09 fact = fact * n ;
10 end ;
11 end ;
12 run ;
NOTE: DATA statement used (Total process time):
real time 4.87 seconds
cpu time 4.87 seconds

13 data _null_ ;
14 do repeat = 1 to 1e6 ;
15 do n = 1 to &n ;
16 fact = fact (n) ;
17 end ;
18 end ;
19 run ;
NOTE: DATA statement used (Total process time):
real time 1:33.79
cpu time 1:33.54

Kind regards,
----------------
Paul M. Dorfman
Jacksonville, FL
----------------



a very crude macro on calculating factorial..

by Richard A. DeVenezia » Sat, 17 Jul 2004 02:54:20 GMT




Type in haste no doubt.
data fact ;
should be
data &out ;


If I had a hat, I would have to tip it to Paul.

FACT is a function that, in Windows, accepts only 172 distinct inputs: The
integers [0...171].
Q for SI: Why does the function do an internal looping calculation instead
of a table lookup to return the factorial value ?
Last I checked, factorial(I) is a constant over time :) The 172 constants
should be computed at SAS system build time.

I am too impatient to wait out a mere 1.5 minutes, but here is a visual you
can use to examine the O() of the looping inside FACT().
Increase i & N upper limits if you are more patient than I. Increase by
step if you are less impatient.

data timing;
do N = 1 to 40 by 2;
t0 = datetime();
do i = 1 to 5e6;
fact = fact(N);
end;
t1 = datetime();
e = t1-t0;
put n= z2. e= z10.6;
output;
end;
run;

symbol1 i=join v=dot;

proc gplot data=timing;
plot e*n;
run;

--
Richard A. DeVenezia
http://www.devenezia.com




a very crude macro on calculating factorial..

by stringplayer_2 » Sat, 17 Jul 2004 05:40:57 GMT

-- "Richard A. DeVenezia" < XXXX@XXXXX.COM > wrote:

Good point. A table lookup would seem very appropriate for the
FACT function.


I expected to see E increasing linearly with N if additional
looping increased execution time. In fact, the plot of E is flat
over N on my system (SAS 8.12 (TS2M0), Windows 2000 Workstation
service pack 3, 1GB RAM, 2.4 Ghz). I added some additional values
of N from 41 to 171 by 10, and even for large N the execution time
for the FACT function was constant. I take it that you observe
behavior in which E increases with N. Is that correct?

I also ran Paul's code with macro variable N=10 and N=100. Note
that Paul's code computes every factorial from 1! to &N!, so that
the number of factorial computations when &N=100 is 10 times the
number of factorial computations when &N=10. Paul's sequential
algorithm for computing the factorials should scale by a factor
of 10. If the FACT function employs looping from 1 to K, when
computing K!, then the FACT function should not scale by a factor
of 10. Rather, the ratio of execution times should be approximately
112 times longer to compute 1! through 100! than it would be to
compute 1! through 10! since there would be 5050 loops for N=100
and 45 loops for N=10. On the other hand, if the FACT function
employs a table lookup, that should scale (approximately) by 10
since we have to look up 10 times as many values.

My log is shown below. Regardless of whether we employ the FACT
function or code a factorial algorithm, the time that it takes to
execute 1!, 2!, ..., 100! 1E6 times is ten times the length of time
that is required to compute 1!, 2!, ..., 10!.

77 %let n = 10 ;
78 data _null_ ;
79 do repeat = 1 to 1e6 ;
80 fact = 1 ;
81 do n = 1 to &n ;
82 fact = fact * n ;
83 end ;
84 end ;
85 run ;

NOTE: DATA statement used:
real time 0.32 seconds
cpu time 0.32 seconds


86
87 data _null_ ;
88 do repeat = 1 to 1e6 ;
89 do n = 1 to &n ;
90 fact = fact (n) ;
91 end ;
92 end ;
93 run ;

NOTE: DATA statement used:
real time 6.51 seconds
cpu time 6.35 seconds


94
95
96 %let n = 100 ;
97 data _null_ ;
98 do repeat = 1 to 1e6 ;
99 fact = 1 ;
100 do n = 1 to &n ;
101 fact = fact * n ;
102 end ;
103 end ;
104 run ;

NOTE: DATA statement used:
real time 2.90 seconds
cpu time 2.82 seconds


105
106 data _null_ ;
107 do repeat = 1 to 1e6 ;
108 do n = 1 to &n ;
109 fact = fact (n) ;
110 end ;
111 end ;
112 run ;

NOTE: DATA statement used:
real time 1:02.67
cpu time 1:01.20


The factorial algorithm which Paul coded is faster than the FACT
function for computing every factorial value from 1! through N!,
no doubt about that. However, sequential processing
as Paul has coded it should be fast compared to a table lookup
if we have to compute every term from 1! to N!. With a table
lookup, we have to traverse the tree for every new value passed
to the FACT function. We can't take advantage of knowlege of the
value of (K-1)! when computing K! as is done in Paul's factorial
algorithm.

Below is one more proof that the FACT function probably is employing
a lookup table, rather than looping from 1 to K when computing K

a very crude macro on calculating factorial..

by sashole » Sat, 17 Jul 2004 14:53:09 GMT

ale,

Calling 1E7 a billion must have been the end of a pretty strenuous day :-),
but I am grateful, anyway, since if it WERE the billion, my patience waiting
for those tests to finish would be stretched to a personally unbearable
limit. More to the point, I am not sure you have me convinced SAS are using
a sort of table look-up. Or, if they indeed do, then they have not chosen
the right one, the is the naked truth.

Fancy that SAS used a table lookup to implement FACT. Its argument can
realistically range from 0 to 170, which means that the most straightforward
way of implementing the lookup is key-indexed search, right? No one in the
right mind would implement such search as a TREE, or even as a
general-purpose hash (for key-indexed search is by far the fastest hash).
First, you prepare a lookup table:

data fact ;
n = 0 ; fact = 1 ; output ;
do n = 1 to 170 ;
fact = fact (n) ;
output ;
end ;
run ;

Then, when you compute a factorial, you load the key-indexed table F and
search it (while the speed of search is pure O[1]):

204 %let k = 100 ;
205 data _null_ ;
206 array f [0 : 170] _temporary_ ;
207 do until (z) ;
208 set fact end = z ;
209 f [ n ] = fact ;
210 end ;
211 do repeat = 1 to 1e7 ;
212 fact = f [&k] ;
213 end ;
214 put fact= ;
215 run ;

fact=9.332622E157
NOTE: There were 170 observations read from the data set WORK.FACT.
NOTE: DATA statement used (Total process time):
real time 0.25 seconds
cpu time 0.25 seconds

That is how fast (at least) table look-up *should* run. Of course, if it
were coded in the underlying software, I would expect it to run much faster.
Now let us look at the times returned by the FACT function:

216 data _null_ ;
217 do repeat = 1 to 1e7 ;
218 fact = fact (&k) ;
219 end ;
220 put fact= ;
221 run ;

fact=9.332622E157
NOTE: DATA statement used (Total process time):
real time 5.45 seconds
cpu time 5.45 seconds

There are several ways ot explain the results:

1) SAS has FACT() do all the multiplications every time FACT(N) is to be
computed, only, having been coded in the underlying software, it does it
much faster that the same method coded in the SAS Language, which really
shows at K>10.

2) SAS have chosen to use a maladroit table look-up algorithm. Suppose they
use an algorithm similar to a format:

295 data cntlin ;
296 retain fmtname 'fact' type 'n' ;
297 do start = 1 to 170 ;
298 label = put (fact (start), rb8.) ;
299 output ;
300 end ;
301 hlo = 'o' ;
302 label = '.' ;
303 output ;
304 run ;

305 proc format cntlin = cntlin ;
306 run ;

307 %let k = 100 ;
308 data _null_ ;
309 do repeat = 1 to 1e7 ;
310 fact = input (put (&k, fact.), rb8.) ;
311 end ;
312 put fact= ;
313 run ;
fact=9.332622E157
NOTE: DATA statement used (Total process time):
real time 5.71 seconds
cpu time 5.70 seconds

314 data _null_ ;
315 do repeat = 1 to 1e7 ;
316 fact = fact (&k) ;
317 end ;
318 put fact= ;
319 run ;
fact=9.332622E157
NOTE: DATA statement used (Total process time):
real time 5.45 seconds
cpu time 5.45 seconds
-------------------------------------

Well, judging from the similarity of the times,

a very crude macro on calculating factorial..

by Richard A. DeVenezia » Sat, 17 Jul 2004 22:46:36 GMT

ale McLerran wrote:


First, determine a looping baseline. The times shown by this step are
hopefully a constant overhead in three following steps:

-----
data timing;
do N = 0 to 40 ;
t0 = datetime();
do i = 1 to 5e6;
end;
t1 = datetime();
e = t1-t0;
put n= z3. e= z10.6 fact=;
output;
end;
keep N e fact;
stop;
run;

symbol1 i=join v=dot;
proc gplot data=timing;
plot e*n;
run;quit;
-----
Looks like ~.047s per 5e5 steps. An occasional blip to .062. Blips probably
associated with CPU scheduling by OS. In Windows, use Task Manager to
change sas.exe priority to a higher setting did not give faster runtimes.



Going back to original complaint, an impatient examination of FACT times in
a tight loop led to the statement "Why doesn't FACT use a lookup?"

Impatience certainly has its downfalls. Whilst noodling about on other
things I extended my run from limit N of 20 to 170.
The first 20 showed a linear increase in 'e' (from .65s to 1.45s per 5e5
calls) and a bouncy flat for the remaining 150 (roughly a mean of 1.2s per
5e5 calls).
One would be tempted to say some sort of non-looping technique is being used
to return a value from FACT.

So why bouncy or initial linear ? Can't say.
If FACT is in inlined one would not expect the observed, so I might guess
FACT resides (on my system) in a dll that has to be managed by the operating
system which itself might be performing some blackhat supervision and
caching. The cost for the 'behind the curtain' operation is ~1.15s per 5e5
calls (1.20 - .05[overhead])

-----
data timing;
do N = 0 to 170 ;
t0 = datetime();
do i = 1 to 5e6;
fact = fact(N);
end;
t1 = datetime();
e = t1-t0;
put n= z3. e= z10.6 fact=;
output;
end;
keep N e fact;
stop;
run;

symbol1 i=join v=dot;
proc gplot data=timing;
plot e*n;
run;quit;
-----
O(1+k) ?



A slight change, moving the FACT(N) calculation outside the inner loop shows
a 'lack of optimization' by the DATA Step compiler. The compiler does not
to move expression thats are constant within one loop into the outer loop.
If the compiler recognized the fact that the argument to FACT() in the 'i'
loop is not changing, it could make the assignment outside the 'i' loop. If
it further recognized that nothing is occurring in the 'i' loop the loop
could be reduced to a simple i=<loop limit>. DO loops certainly can be
optimized in these ways since the step and limit cannot be changed once the
loop is started. DO WHILE and DO UNTIL however are much tougher since their
limit expressions are evaluated at each step.

-----
data timing;
do N = 0 to 170 ;
t0 = datetime();
fact = fact(N);
do i = 1 to 5e6;
end;
t1 = datetime();
e = t1-t0;
put n= z3. e= z10.6 fact=;
output;
end;
keep N e fact;
stop;
run;

symbol1 i=join v=dot;
proc gplot data=timing;
plot e*n;
run;quit;
-----
As expected, this is essentially identical to looping baseline.



Now a version that does use looping to calculate a FACT each time.
Limit N to 40, impatience is a hard habit to break. Besides, this is classic
O(n).
-----
data timing;
do N = 0 to 40 ;
t0 = datetime();
do i = 1 to 5e6;
* loop to calculate fact;
fact = 1;
do j = 1 to N;
fact = fact * j;
end;
end;
t1 = datetime();
e = t1-t0;
put n= z3. e=

Similar Threads

1. FACT function (was a very crude macro on calculating factorial..)

2. Problem with macro: FACTORIAL

I'm trying to write this small macro:

calculate N! Example if N=4  we will have 1*2*3*4=24 =4!

The following macro very well when N is not zero, but when N=0 It doesn't
work.
Can someone help.
Thanks a lot.

%macro FACT(N=);
%*--------------------------------------------------------;
%* Return N! Example if N=4  we will have 1*2*3*4=24 =4!  ;
%*--------------------------------------------------------;
data fac;
 FACT=1;
   %If &N=0 %then   FACT=1;
   %do I=1 %to &N;
      FACT =FACT*&I;
       %end;
proc print label;
 var FACT;
 run;

%mend FACT;

%FACT(N=4);
%FACT(N=0);

3. RE : Problem with macro: FACTORIAL

4. How to compare 2 crude rates from the same population

5. calculating a macro

6. Calculating with macro values (ie comparing to a # in if/then st atement)

Venita,

Could you show some examples of what does not work for you? Please include what is in the macro variable when it does not work.
Your example of : %if &numzeros ne 0 %then %do . . . . . %end;
should work as long as &numzeros contains numeric characters.
The macro processor should automatically do an %eval on it.

Regards,
Dennis Diskin

"DePuy, Venita" < XXXX@XXXXX.COM > wrote:
Hi all -

Hoping you can help me with an easy question that I just seem to be
missing -
I need to use a macro variable in some if/then calculations . .

ie
%if &numzeros ne 0 %then %do . . . . . %end;

%if Probt le &alpha %then %do . . . %end;

You get the idea.
The obvious issue being that macro vars are character. I'm just not
seeing a simple way to get them to numeric.
Trying a input statement gave me an error (although it could be just
me?)
I've tried %eval statements, with no luck (some error about the
operand not being present I think?)

One solution I've found is %if (%substr(&numzeros,1,1) ne 0) %then
%do . . . because numzeros is an integer, so if the first digit is zero,
it's zero . . but I know that's not the best way.

I've got several different versions of this in the program I'm
writing (ie comparing a macro variable to a number) so thought the most
efficient method would be to ask you all what I'm not seeing . . .


Thanks much!
Venita

7. Calculating with macro values (ie comparing to a # in if/then st atement)

8. Calculating with macro values (ie comparing to a # in if/the n st atement)