sas >> Sql FIRST function

by cjsadler » Thu, 21 Aug 2003 22:00:21 GMT

Is there some sort of equivalent in proc SQL for the FIRST function? (In
non-SAS SQL, this function enables you to get the first value of a variable
when using a 'group by' function). Essentially, what I want to accomplish
is this:

Take this dataset:
id vizperiod name viztotal
1 qtr1 Bob 3
1 qtr1 Robert 4
3 qtr1 Jim 2
3 qtr1 James 3

Turn in into this:
1 qtr1 Bob 7
3 qtr1 Jim 5

The ideal SQL would be something like this:
proc sql;
select id, vizperiod, first(name), sum(viztotal) as sumviz
from dataset1
group by id, vizperiod;

Right now, we don't really care that much about the name field. We are
more interested in the summaries and will deal with the varied way that
people's names are entered later. However, for the time being we do need
*some* name value. But there is no FIRST function in SAS. So how do I
deal with the name field? (I'm only curious about whether this is possible
in proc SQL-- I know there's plenty of ways to solve this problem with
regular SAS code)

Thanks,
Chris Sadler

_________________________________________________________________
<b>MSN 8:</b> Get 6 months for $9.95/month
http://join.msn.com/?page=dept/dialup


sas >> Sql FIRST function

by pudding_man » Fri, 22 Aug 2003 05:29:47 GMT


an Old Bean,

Do you really define a "good data structure" as one
in which a new field must be created solely so that
a subsequent program structure (SQL step in this
case) can grab a value from the first record in
a group? :-)

Also, your code doesn't sum the 'viztotal' data.

SQL is good for lots of things. I don't know that
this is one of 'em. I assume Chris knows
what to do with this in a data step. Can Chris
name more than one vendor's SQL implementation
that supports a FIRST function? I don't think
FIRST is ansi ...

Prost,
Puddin'

*******************************************************
*** Puddin' Man *** XXXX@XXXXX.COM *****
*******************************************************;

"Well, mama don't allow no Boogie-Woogie round here
Mama don't allow no Boogie-Woogie round here
Well we don't care what mama don't allow
We're gonna Boogie-Woogie anyhow
Mama don't allow no Boogie-Woogie round here "
- Traditional

Ian Whitlock wrote:

the



sas >> Sql FIRST function

by WHITLOI1 » Fri, 22 Aug 2003 20:31:59 GMT

uddin',

I did miss the fact that he wanted the column summed. However, both the
title and first sentence do indicate a desire to work with an odered file in
SQL. Given those conditions and the way SQL is designed to be order free, I
suggest that adding a variable is essential.

Now how would you pick out the last name or the third name in each group?

XXXX@XXXXX.COM
-----Original Message-----
From: Puddin' Man [mailto: XXXX@XXXXX.COM ]
Sent: Thursday, August 21, 2003 5:30 PM
To: XXXX@XXXXX.COM
Subject: Re: Sql FIRST function


Ian Old Bean,

Do you really define a "good data structure" as one
in which a new field must be created solely so that
a subsequent program structure (SQL step in this
case) can grab a value from the first record in
a group? :-)

Also, your code doesn't sum the 'viztotal' data.

SQL is good for lots of things. I don't know that
this is one of 'em. I assume Chris knows
what to do with this in a data step. Can Chris
name more than one vendor's SQL implementation
that supports a FIRST function? I don't think
FIRST is ansi ...

Prost,
Puddin'

*******************************************************
*** Puddin' Man *** XXXX@XXXXX.COM *****
*******************************************************;

"Well, mama don't allow no Boogie-Woogie round here
Mama don't allow no Boogie-Woogie round here
Well we don't care what mama don't allow
We're gonna Boogie-Woogie anyhow
Mama don't allow no Boogie-Woogie round here "
- Traditional

Ian Whitlock wrote:

the
need


Sql FIRST function

by RHOADSM1 » Sat, 23 Aug 2003 04:01:16 GMT

So exactly what does the initiation involve?? ;-)

I don't think in this context that DISTINCT is required (although I could
always be convinced by a counterexample). MIN(name) is only going to return
a single value, even if multiple rows share that value. I can see that
there would be a problem in other situations, such as if I were filtering on
name=MIN(name) and assuming that I would always get only one row back.

Mike Rhoads
Westat
XXXX@XXXXX.COM

-----Original Message-----
From: Sigurd Hermansen
Sent: Thursday, August 21, 2003 10:59 PM
To: Mike Rhoads; ' XXXX@XXXXX.COM '
Subject: RE: Sql FIRST function


One would probably need to add the DISTINCT qualifier to the SELECT clause
to guard against duplicate minimum values of name in a group.

Based on your solution to Chris' problem, it appears that we'll have to add
you, despite any objections you might raise, to the roster of SQLheads.

Sig
-----Original Message-----
From: Mike Rhoads
To: XXXX@XXXXX.COM
Sent: 8/21/2003 4:17 PM
Subject: Re: Sql FIRST function

Chris,

Since at this point it sounds like any name in the group will "do" (not
necessarily the first), to get a pure SQL solution I would go with

SELECT id, vizperiod, MIN(name) AS nameguess, SUM(viztotal) AS sumviz FROM
dataset1 GROUP BY id, vizperiod;

This won't give you the first name as the data are ordered, but rather the
"first" name in an alphabetical sort. The key to this is the useful fact
that the MIN and MAX aggregate functions (i.e. working across rows) in SQL
can handle character as well as numeric data.

Mike Rhoads
Westat
XXXX@XXXXX.COM

-----Original Message-----
From: Christopher Sadler [mailto: XXXX@XXXXX.COM ]
Sent: Thursday, August 21, 2003 10:00 AM
To: XXXX@XXXXX.COM
Subject: Sql FIRST function


Is there some sort of equivalent in proc SQL for the FIRST function? (In
non-SAS SQL, this function enables you to get the first value of a variable
when using a 'group by' function). Essentially, what I want to accomplish
is this:

Take this dataset:
id vizperiod name viztotal
1 qtr1 Bob 3
1 qtr1 Robert 4
3 qtr1 Jim 2
3 qtr1 James 3

Turn in into this:
1 qtr1 Bob 7
3 qtr1 Jim 5

The ideal SQL would be something like this:
proc sql;
select id, vizperiod, first(name), sum(viztotal) as sumviz
from dataset1
group by id, vizperiod;

Right now, we don't really care that much about the name field. We are
more interested in the summaries and will deal with the varied way that
people's names are entered later. However, for the time being we do need
*some* name value. But there is no FIRST function in SAS. So how do
I
deal with the name field? (I'm only curious about whether this is possible
in proc SQL-- I know there's plenty of ways to solve this problem with
regular SAS code)

Thanks,
Chris Sadler

_________________________________________________________________
<b>MSN 8:</b> Get 6 months for $9.95/month
http://join.msn.com/?page=dept/dialup


Sql FIRST function

by pudding_man » Mon, 25 Aug 2003 04:09:48 GMT

an Whitlock wrote:

in

Hmmmm. Lemme see. "Ordered file". OK ...

I

Hmmmm. Lemme see. "Order free". Wait a minute, heah ... <g>


Well, I have no objection to your adding a var. Particularly if
it solves the poster's dilemma (perhaps it does). In fact, I
have no objection to any kludge employed of necessity to get
some work done. Have been forced to employ a few my po' self
over the years. Hopefully I referred to all of 'em as
k-l-u-d-g-e-s.

My objection was to your reference to the perceived
kludge as "good data structure".

It's one thing to design files per "The Relational Model"
and get lotsa mileage out of SQL applications exploiting
proper data design. It's quite another to take "a collection
of records" and start adding superflous fields so that a
relational language can kludge record-oriented processing
(that it was not really designed to perform). Methinks this
constitutes Bad Practice. Even more so when a good record-
oriented language (i.e. data step programming) is at your
fingertips (obviating the need for superflous fields). Eh?

Do you fancy that Codd and Date (noted Relational Model
pioneers) would've described your modified dataset as "good
data structure"? :-)


If you are inviting po' me to engage in the business of
finagling the proverbial square peg ("Ordered file") into
the equally proverbial round hole ("Order free SQL"), I
decline forthwith. Feel free to demo if you like. Please
to include the desired sum. <g>

As to "Ordered file" with record-oriented programming, I
s'pose I might employ something as pedestrian as:

%let namerec = 3;

data bb(drop = i nm vt);

do i = 1 by 1 until (last.id);
set aa(rename = (name = nm viztotal = vt));
by id;
viztotal = sum(viztotal, vt);
if i = &namerec then name = nm;
end;

if i < &namerec then name = nm;
run;

This of course would be for third name when the count
of recs in a group is >= 3, last name otherwise. Some
of this little pgm structure looks vaguely familiar?

Can somebody please confirm that Chris's FIRST function
is not part of any (old or new) ansi sql standard? Just
to minimize extant mis-info on the list?? :-)

Prosit,
Puddin'

*******************************************************
*** Puddin' Man *** XXXX@XXXXX.COM *****
*******************************************************;

"Men are not final individuals but units in the greater
beast, The Phalanx ... The nature of The Phalanx is not
the sum of the natures of unit-men, but a new individual
having emotions and ends of its own, and these are foreign
and incomprehensible to unit-men."
- John Steinbeck, 1934

(In


Sql FIRST function

by HERMANS1 » Mon, 25 Aug 2003 23:43:14 GMT

y tests under SAS V9 show that you are right. SAS SQL handles the summary
value correctly without the DISTINCT qualifier (which would not affect the
results but might trigger an unnecessary sort of the yield of the query). I
have tended to throw in the DISTINCT qualifier as superstitious behavior due
to lack of understanding of when and where SAS SQL remerges data to
construct a solution. Now I know better.
Sig

-----Original Message-----
From: Mike Rhoads
Sent: Friday, August 22, 2003 4:01 PM
To: XXXX@XXXXX.COM
Subject: Re: Sql FIRST function


So exactly what does the initiation involve?? ;-)

I don't think in this context that DISTINCT is required (although I could
always be convinced by a counterexample). MIN(name) is only going to return
a single value, even if multiple rows share that value. I can see that
there would be a problem in other situations, such as if I were filtering on
name=MIN(name) and assuming that I would always get only one row back.

Mike Rhoads
Westat
XXXX@XXXXX.COM

-----Original Message-----
From: Sigurd Hermansen
Sent: Thursday, August 21, 2003 10:59 PM
To: Mike Rhoads; ' XXXX@XXXXX.COM '
Subject: RE: Sql FIRST function


One would probably need to add the DISTINCT qualifier to the SELECT clause
to guard against duplicate minimum values of name in a group.

Based on your solution to Chris' problem, it appears that we'll have to add
you, despite any objections you might raise, to the roster of SQLheads.

Sig
-----Original Message-----
From: Mike Rhoads
To: XXXX@XXXXX.COM
Sent: 8/21/2003 4:17 PM
Subject: Re: Sql FIRST function

Chris,

Since at this point it sounds like any name in the group will "do" (not
necessarily the first), to get a pure SQL solution I would go with

SELECT id, vizperiod, MIN(name) AS nameguess, SUM(viztotal) AS sumviz FROM
dataset1 GROUP BY id, vizperiod;

This won't give you the first name as the data are ordered, but rather the
"first" name in an alphabetical sort. The key to this is the useful fact
that the MIN and MAX aggregate functions (i.e. working across rows) in SQL
can handle character as well as numeric data.

Mike Rhoads
Westat
XXXX@XXXXX.COM

-----Original Message-----
From: Christopher Sadler [mailto: XXXX@XXXXX.COM ]
Sent: Thursday, August 21, 2003 10:00 AM
To: XXXX@XXXXX.COM
Subject: Sql FIRST function


Is there some sort of equivalent in proc SQL for the FIRST function? (In
non-SAS SQL, this function enables you to get the first value of a variable
when using a 'group by' function). Essentially, what I want to accomplish
is this:

Take this dataset:
id vizperiod name viztotal
1 qtr1 Bob 3
1 qtr1 Robert 4
3 qtr1 Jim 2
3 qtr1 James 3

Turn in into this:
1 qtr1 Bob 7
3 qtr1 Jim 5

The ideal SQL would be something like this:
proc sql;
select id, vizperiod, first(name), sum(viztotal) as sumviz
from dataset1
group by id, vizperiod;

Right now, we don't really care that much about the name field. We are
more interested in the summaries and will deal with the varied way that
people's names are entered later. However, for the time being we do need
*some* name value. But there is no FIRST function in SAS. So how do
I
deal with the name field? (I'm only curious about whether this is possible
in proc SQL-- I know there's plenty of ways to solve this pro

Sql FIRST function

by don.stanley » Tue, 26 Aug 2003 07:10:06 GMT

In this case as several people including I think the poster stated SQL
isn't necessary. However in cases where SQL is necessary for whatever
reason, I often pass a datastep view into the SQL. Then whatever has
to be done in SQL can be, and the extra functionality that the
underlying datastep view can also be used.

I suspect the original code may have been looking for something that
could be used in any SQL, not just SAS. This won't give that. But it
will return the first record in a group for a SAS based SQL.

Don

To solve the problem that was posted using this, use the following
code:

data test ;
input id vizperiod $ name $ viztotal ;
cards ;
1 qtr1 Bob 3
1 qtr1 Robert 4
3 qtr1 Jim 2
3 qtr1 James 3
;
run ;

data vtest / view=vtest;
set test ;
by id ;
if first.id then dummy = 1 ; else dummy = 0 ;
run ;

proc sql ;
select id , vizperiod , name , sum(viztotal) as total
from vtest
group by id , vizperiod
having dummy = 1 ;
quit ;


Similar Threads

1. Antwort: Sql FIRST function

2. SQL min function vs. .FIRST in DataStep

3. Equivalent of the "Rollup" SQL function in SAS (and the CUBE function also)

Hello,

I can't find how to get this result...

I'd like to count the number of different persons in classes, in departments 
and in the whole organization. But one person can be in more than one class 
and in more than one department.

In Oracle's SQL, I would write:
  SELECT year, department, class, COUNT(DISTINCT idnumber) AS nb
  FROM mytable
  GROUP BY ROLLUP(year, department, class);

An example:

id    Year    Dept    Class
-------------------------
1    2000    Sci.     Math
1    2000    Sci.    Math
1    2000    Sci.    Phy
1    2000    Letters    French
2    2000    Sci.    Phy
2    2000    Sci.    Phy
3    2000    Letters    French
3    2000    Letters    Span.

The result would give:
Year    Dept      Class        Nb
------------------------------
2000    Sci        Math         1
2000    Sci        Phy           2
2000    Sci        (total)        2
2000    Letters    French    2
2000    Letters    Span.      1
2000    Letters   (total)       2
2000    (total)     (total)      3

Thanks 


4. Sql-queries in aggregate functions in PROC SQL

5. huge sql join, fast way to sample first?

I have a ginormous sql join job (data comes from 2,3 and 4 tables)

what's the easiest way to first test the data, before actually letting it rip?
the goal would be to first look at a very small piece (maybe thousand rows) but the
actual job is dependent on joining tens of millions of rows from multiple tables

subsetting may also (?) create a condition that nothing is created in output with too
little data in the join to evaluate

ie.
proc sql (obs=1000);
...lots of create, select, where, having and containing statements

or
options obs=1000;
proc sql ;
...lots of create, select, where, having and containing statements

something totally different?

6. selecting the first 2 observation in SQL

7. SQL keep first obs in each group

Hi,

Is there any way to keep the first obs in each group in sql. I am looking
for an sql way of doing below. Thank you.

Cheers,
Chang


proc sort data=sashelp.class out=sorted;
  by sex name;
run;

data firstObs;
  set sorted;
  by sex name;
  if first.sex;
run;

8. SQL "nodupkey" or "first.key" equivalent