sas >> Proc SQL, multiple datepart selection

by Arto Raiskio » Thu, 07 Aug 2003 19:35:36 GMT

having some trouble selecting multiple dates from a sql table

this does not work, it gives all dates while matching to the ID part

Proc Sql;
select * from a
where (234 < id <= 235)
and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd)

odate contains date and time, looking like 01Aug2004:08:12:10:00




sas >> Proc SQL, multiple datepart selection

by WHITLOI1 » Thu, 07 Aug 2003 21:17:18 GMT


Arto,

Examples communicate.

This code works as expected.

data w ;
input id date date9. ;
odate = dhms(date,0,0,0) ;
cards ;
234 1jul2003
235 1aug2003
;

Proc Sql;
select * from w
where (234 < id <= 235)
and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd);
quit ;

Perhaps you should show some data that doesn't.

XXXX@XXXXX.COM

-----Original Message-----
From: Arto Raiskio [mailto: XXXX@XXXXX.COM ]
Sent: Thursday, August 07, 2003 7:36 AM
To: XXXX@XXXXX.COM
Subject: Proc SQL, multiple datepart selection


having some trouble selecting multiple dates from a sql table

this does not work, it gives all dates while matching to the ID part

Proc Sql;
select * from a
where (234 < id <= 235)
and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd)

odate contains date and time, looking like 01Aug2004:08:12:10:00

sas >> Proc SQL, multiple datepart selection

by krishna_db » Thu, 07 Aug 2003 21:49:26 GMT

rto,

The code you gave here is doing exactly what it is supposed to do.
This code selects the records having date 01Aug2003 and id = 235
(considering you have all integer ID's.).

If this is not you want, I guess you need records falling between the dates
01Aug2003 and 05Aug2003.

Here is the code for both scenarios:

proc sql;
create table test
( ID num , ODATE num informat=datetime21. format=datetime21.);
insert into test(id, odate) values(234, 1375330010);
insert into test(id, odate) values(234, 1375830000);
insert into test(id, odate) values(235, 1375730020);
insert into test(id, odate) values(235, 1375630030);
insert into test(id, odate) values(235, 1375630050);
insert into test(id, odate) values(235, 1375530060);
insert into test(id, odate) values(235, 1375330070);
insert into test(id, odate) values(235, 1375330080);
insert into test(id, odate) values(235, 1375230090);
insert into test(id, odate) values(234, 1375430010);
insert into test(id, odate) values(233, 1375530020);
insert into test(id, odate) values(236, 1375630030);
select * from test;
select * from test
where (234 < id <= 235)
and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd)
;
select * from test
where (234 < id <= 235)
and datepart(odate) between '01Aug2003'd and '05Aug2003'd
;
quit;

OUTPUT:

1. All the records in test dataset:

234 01AUG2003:04:06:50
234 06AUG2003:23:00:00
235 05AUG2003:19:13:40
235 04AUG2003:15:27:10
235 04AUG2003:15:27:30
235 03AUG2003:11:41:00
235 01AUG2003:04:07:50
235 01AUG2003:04:08:00
235 31JUL2003:00:21:30
234 02AUG2003:07:53:30
233 03AUG2003:11:40:20
236 04AUG2003:15:27:10

2. Records selected by query [where (234 < id <= 235)
and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd)]:

235 05AUG2003:19:13:40
235 01AUG2003:04:07:50
235 01AUG2003:04:08:00

3. Records selected by query [where (234 < id <= 235)
and datepart(odate) between '01Aug2003'd and '05Aug2003'd]:

235 05AUG2003:19:13:40
235 04AUG2003:15:27:10
235 04AUG2003:15:27:30
235 03AUG2003:11:41:00
235 01AUG2003:04:07:50
235 01AUG2003:04:08:00

Thanks

Krishna




_________________________________________________________________
MSN 8 with e-mail virus protection service: 2 months FREE*
http://join.msn.com/?page=features/virus

sas >> Proc SQL, multiple datepart selection

by Arto Raiskio » Thu, 07 Aug 2003 21:58:22 GMT


"Aaron Moynahan" wrote

sure, I also get it to work if I use a data step. the problem statement was
in the first line of my original post


you see, the data comes from a sql table and when I view the table elements,
that is what the odate contains, including the last zero - it looks the same
when I use SQL Enterprise Manager or SAS to view the table

I can allow it to output the incorrect data (it still pulls the correct ID
variables) and then do a separate data step to filter (one more time) the
correct odate but that seems like wasted resources

sas >> Proc SQL, multiple datepart selection

by Arto Raiskio » Thu, 07 Aug 2003 22:05:48 GMT


"db krishna" wrote

appreciate the reply. I'm baffled by it as I clearly want a certain ID
(sometimes a range) and that part does work fine but I keep consistently
getting _all_ odate values when I only want two or sometimes three separate,
non-continuous date values

I wonder if it has something to do with internal SQL table values or formats
for the odate

sas >> Proc SQL, multiple datepart selection

by Aaron Moynahan » Thu, 07 Aug 2003 22:16:20 GMT

But your sql did work. Here are the results again. I even used the datetime
the way you gave it to us. If the results are not what you want then you
need to change the logic of the sql statement. But it is giving back what
was asked from it.

data temp;
length id odate 8;
input id odate datetime22.2;
datalines;
235 06mar1997:11:27:07.40
234.5 02Aug2003:08:16:15.70
235 01Aug2003:17:21:40.30
235 05Aug2003:02:55:21.20
236 05Aug2003:04:44:78.60
235 07Aug2003:08:13:33.80
;
run;

proc sql;
select id, odate format=datetime22.2
from temp
where (234 < id <= 235)
and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd);
quit;


result

id odate

235 01AUG2003:17:21:40.30
235 05AUG2003:02:55:21.20





was
elements,
same

sas >> Proc SQL, multiple datepart selection

by WHITLOI1 » Thu, 07 Aug 2003 22:33:14 GMT

Arto,

It looks like the database system and possibly the operating system are
important parts of your problem. Are you using the SAS SQL pass through
facility? If so is there a DATEPART function in the data base system? Are
you using some lib engine?

Is it reasonable to expect dates in 2004 in your database? How does your
database store datetime values? How is the system of datetime values
communicated to SAS?

XXXX@XXXXX.COM

-----Original Message-----
From: Arto Raiskio [mailto: XXXX@XXXXX.COM ]
Sent: Thursday, August 07, 2003 9:54 AM
To: XXXX@XXXXX.COM
Subject: Re: Proc SQL, multiple datepart selection


"Ian Whitlock" wrote

actually, I thought I did quite clearly and I can not attach the SQL table
so I posted what the data looks like when viewed from either SQL or from SAS

the odate _does_ contain values like what I posted, so for example if this
is the data

02Aug2004:08:12:10:00
03Aug2004:08:12:10:00
04Aug2004:08:12:10:00
05Aug2004:08:12:10:00
06Aug2004:08:12:10:00

all of these would be incorrectly seleted in the output
....and (datepart(odate) = '01Aug2003'd or datepart(odate) = '05Aug2003'd)

note that I can do a data step from the incorrect table as output from Proc
Sql and this works fine data junk;set fromSql; if (datepart(odate) =
'01Aug2003'd or datepart(odate) = '05Aug2003'd);

remember, this comes from a SQL table, in case that makes a difference

sas >> Proc SQL, multiple datepart selection

by JackHamilton » Fri, 08 Aug 2003 05:09:45 GMT

here are many, many reported problems with SQL WHERE clauses not
working correctly, especially when external databases are involved.

Do you have all the latest patches installed?

This usage note might be relevant:

=====
SN-007826
PROC SQL generates incorrect pass through code when there are multiple
OR'd expressions in the WHERE clause
--------------------------------------------------------------------------------

If you execute a PROC SQL step that performs a query on a table in
a
non-SAS database being accessed via the LIBNAME engine, the SQL
code
that is generated and passed through to the database may be
incorrect.

Specifically, if the WHERE clause contains multiple expressions that
are
OR'd together, parts of the expressions will be omitted. This may
cause
either incorrect results to be returned, or syntax errors in the
dbms.
=====

or this one:

=====
SN-009491
Implicit SQL Passthru may drop important parenthesis from WHERE
condition

--------------------------------------------------------------------------------

Incorrect SQL can be passed to a DBMS when a query contains
multiple
conditions combined with the OR operator. For example, in this
query

Proc sql;
create table x as
select t1.col2 from ora.xtab1 t1, ora.xtab2 t2
where t1.col1 = t2.col12
and
( /*** THIS SET IS LOST ***/
t1.col2 in ('A010','A015','A020','A030')
or
(t1.col2 between 'J001' and 'J002')
) /*** THIS SET IS LOST ***/
;
quit;


The loss of these parenthesis can change the meaning of the query
and
product incorrect results.
=====



Wow, I find I can replicate the problem.

I am using SAS 8.2 on a Win2K PC.
My SQL database is Oracle 8 via ODBC.
I am not using pass-through SQL. The database is referenced via
libref.

When I write query employing the datepart function and another
criterion,
more or less the same as Arto's original, I get all the records that
satisfy the first condition, just as Arto did.
select wonum, closedate, workpriority
from mp2.wo
where (3<workpriority<=4) and
(datepart(closedate)='07mar2000'd or
datepart(closedate)='08mar2000'd);

Playing around with the criteria a bit, I find that the datepart
function
can be used successfully by itself:
where
(datepart(closedate)='07mar2000'd or
datepart(closedate)='08mar2000'd);
returns all the expected values for those dates

but when adding criteria based on other fields, the query fails to
process
the datepart function and instead returns all the rows satisfying the
other
criterion. Order does not matter, nor does the logical operator.

I do succeed when substituting a date comparison without the datepart
function:
select wonum, closedate, workpriority
from mp2.wo
where (3<workpriority<=4)
and closedate lt '09mar2000'd
and closedate ge '08mar2000'd;

Next, when I use the first query to create a local SAS table, then
rerun
the query against this intermediate file, I get the correct result:
create table x as
select wonum, closedate, workpriority
from mp2.wo
where (3<workpriority<=4) and
(datepart(closedate)='07mar2000'd or
datepart(closedate)='08mar2000'd);
select wonum, closedate, workpriority
from x

sas >> Proc SQL, multiple datepart selection

by Aaron Moynahan » Fri, 08 Aug 2003 08:33:31 GMT

You know that not using SQL pass through to Oracle can often be a major
league mistake. May oracle systems are set up as massively parallel systems
operating on multiple nodes. The trick with working with these systems is to
send over a query that takes advantage of this space age architecture and
reduced the data on the Oracle system before downloading it back to SAS. If
you have a stinky Oracle system it is better just to download the data and
work with it in SAS. Morale or the story. Don't be so proud of avoiding SQL
pass through unless you can show that pass through does not yield a
significant performance improvement. Which it does when you work with a
sophisticated database design and architecture.

sas >> Proc SQL, multiple datepart selection

by tom.cross » Fri, 08 Aug 2003 08:43:48 GMT

hanks, Jack.
No, I was not up to date on my patches. Now I am.
I must point out that 'OR' was not the problem.
I'm pretty sure the problem Arto was experiencing derives from combining
the datepart (or any) SAS function with another, simpler condition in the
where clause, using AND.
To demonstate this cleanly, I have taken out the ORs altogether.
My expectation would be that when two conditions are put together with AND,
the result would be the intersection of the sets meeting each condition
separately, and the count should be less than either (or = the smaller).

1 proc sql;
2 *CRITERION 1;
3 create table x as
4 select wonum, closedate, workpriority
5 from mp2.wo
6 where workpriority=3;
NOTE: Table WORK.X created, with 137439 rows and 3 columns.

7 *CRITERION 2 AS DATEPART;
8 create table x as
9 select wonum, closedate, workpriority
10 from mp2.wo
11 where datepart(closedate)='07mar2000'd;
NOTE: Table WORK.X created, with 149 rows and 3 columns.

12 *CRITERION 2 AS DATE RANGE;
13 create table x as
14 select wonum, closedate, workpriority
15 from mp2.wo
16 where closedate lt '08mar2000'd and closedate ge '07mar2000'd;
NOTE: Table WORK.X created, with 149 rows and 3 columns.

17 *BOTH CRITERIA USING DATEPART;
18 create table x as
19 select wonum, closedate, workpriority
20 from mp2.wo
21 where workpriority=3 and
22 datepart(closedate)='07mar2000'd;
NOTE: Table WORK.X created, with 137439 rows and 3 columns.

23 *BOTH CRITERIA USING DATE RANGE;
24 create table x as
25 select wonum, closedate, workpriority
26 from mp2.wo
27 where workpriority=3 and
28 closedate lt '08mar2000'd and closedate ge '07mar2000'd;
NOTE: Table WORK.X created, with 46 rows and 3 columns.

Notice that the two-condition select using datepart (rows 17-21) worked
exactly as Arto reported, while the equivalent query that follows it works
fine. Looks like a bug to me.

-Tom

On Thu, 7 Aug 2003 15:09:45 -0600, Jack Hamilton
< XXXX@XXXXX.COM > wrote:

-----
-----

sas >> Proc SQL, multiple datepart selection

by Arto Raiskio » Fri, 08 Aug 2003 09:56:51 GMT


"Aaron Moynahan" wrote

I'm using Microsoft SQL Server 2000

Similar Threads

1. selecting Proc Sql datepart

2. Converting a datetime field to a date without using datepart() in proc sql

Hi,

Could I have some advice please as I need to convert a datetime field
to a date but I cannot use datepart() and need to use
PROC SQL.

Would it be something like this

substr(to_char(mydatetimefield,n,n,))

Thanks

Lee

3. PROC SQL Duplicate Record Selection

4. Proc SQL returning data not in selection

This code is a tad confusing, looks like fragments, not a cohesive
program :

1) there's no need to create temp table NB, why not use NEWBRN, it's
exactly the same
2) &vars isn't used
3) presumably the data step where gdg&I is create is part of a macro (
loop? ), but the same input dataset is read in each case, is there a
WHERE clause you've not provided?
4) the WHERE statement in the SQL step where MATCHES is created does not
need the IN clause, should be specified:
       where A.P_OFFICE = b.OFFICE
         AND A.TRNFR_DY >. ;
5) proc summary uses RPT, an entirely different dataset than MATCHES
created immediately above it


-----Original Message-----
From:  XXXX@XXXXX.COM  [mailto: XXXX@XXXXX.COM ]
On Behalf Of Jumbo Shrimps, Jr.
Sent: 2007, April, 07 11:30 AM
To:  XXXX@XXXXX.COM 
Subject: Proc SQL returning data not in selection

Step 1. Read in couple hundred office numbers:
and create temp table NB

DATA NEWBRN;
  INFILE NEWBRN DELIMITER=',';
  INPUT OFFICE $ OPENDATE MMDDYY8.;
RUN;

PROC SQL;
     CREATE TABLE NB AS
     SELECT *
     FROM   NEWBRN
      ;

Step 2.  Cycle through bunch of GDG data sets containing
     office data such as sales, deposits, transfers, etc.
using only these variables:
%LET VARS=SALE_DY  SALESREP ACCT
          ORIGIN P_OFFICE BALANCE AMOUNT
          SSN ACCTN OPEN TRNFR_DY TFERIAMT
          ;
DATA GDG&I. ;
  SET IN.A;

This SQL matches the couple hundred list of offices in table NB to the
huge dataset of all offices with a couple hundred variables, (but I'm
only using a dozen) matching by GDG.P_OFFICE in the GDG datasets (OUT.A)
to the list of offices in NB grabbing the variables from the %LET VARS:

 PROC SQL;
      CREATE TABLE MATCHES AS
      SELECT A.*, B.*
      FROM   OUT.A A, NB B
      WHERE  A.P_OFFICE IN (SELECT OFFICE FROM NB)
      AND A.TRNFR_DY >. ;

However, when I run the Proc summary below:

PROC SUMMARY DATA=RPT NWAY;
  CLASS  OFFICE TRNFR_DY;
  VAR BALANCE ACCT TFERIAMT;
  OUTPUT OUT=SUMM   SUM=;
RUN;
I get several hundred offices with summary data, NOT in my read in list
from step 1.

How can that be?
I only want the data that pertains to table "NB"
matched to the huge GDG data sets by
GDG. P_OFFICE to NB.OFFICE.
Could it be
WHERE  A.P_OFFICE IN (SELECT OFFICE FROM NB)
      AND A.TRNFR_DY >. ;
which is spozed to grab all those matching offices where there was a
transfer, but the SQL must be returning all offices that had a non-blank
TRNFR_D regardless of whether the office was in my list.
_______________________________________________________________________

This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations.
Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized.
If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.

Ce courrier ectronique est confidentiel et prot L'expiteur ne renonce pas aux droits et obligations qui s'y rapportent.
Toute diffusion, utilisation ou copie de ce message ou des renseignements qu'il contient par une personne autre que le (les) destinataire(s) digns) est interdite.
Si vous recevez ce courrier ectronique par erreur, veuillez m'en aviser immiatement, par retour de courrier ectronique ou par un autre moyen.

5. multiple observations selection question

6. Multiple row selections in TableViewer (SAS/AF)

Hello,

Does anybody know, is it possible to select multiple rows in
TableViewer control?  I mean not just selecting a sequence of rows
(like when using SHIFT key) but selecting several groups of rows (like
with CTRL). I'm using SAS v9.1.3. Will be grateful for any help.

Thanks,
Konstantin

7. Proc SQL to create multiple sub-tables?

8. pulling multiple observations - PROC SQL or array?

Hello,

Here's my conundrum.  I'm fairly new at SAS so I'm having a difficult time
figuring this one out.  I posted this question before and got several
suggestions to the first part regarding matching tables but the second part
(pulling 10 observations) is still vexing me.  Here's some sample data:

In the first dataset I have a transaction date and a unique company id:

Date           SEDOL

01/02/1985     1234567
01/04/1985     3456789
01/05/1985     4534534


In the second dataset I have a date, SEDOL, and stock price info

                1234567     3456789     4534534
01/02/1985      12          19.3        41
01/03/1985      11          19          40
01/04/1985      11.5        19.1        40
01/05/1985      19          19          42
01/06/1985      14          19          42
01/09/1985      12          19          43
01/10/1985      12.5        19          40
01/11/1985      8           19.3        40
01/12/1985      12          19.9        42
01/13/1985      12          19.8        40
01/15/1985      12          19.1        44
01/16/1985      12          19.4        40
01/17/1985      12          19.2        40
01/18/1985      10          19.1        41

What I'm trying to do is pull the transaction date and SEDOL from the first
dataset and match it to the second dataset and then pull the next 10
observations in the stock price.  In terms of matching the two tables it was
suggested that I use PROC TRANSPOSE and then match using SQL.  Here's the
SQL (ds1 = the first dataset, ds2 = the second dataset) that a listserv
member kindly provided:

proc transpose data=ds2 out=ds2b;
by Date;
run;

* adjust cosmetics and types ;
data ds2c;
    set ds2b;
    SEDOL = input(substr(_name_,2),best.);
    rename col1=value;
    drop _name_;
    attrib _all_ label='';
    format date mmddyy10.;
run;

proc sql;
    create table result as
    select b.*
    from ds1 as a
    join ds2c as b
      on a.SEDOL eq b.SEDOL
     and a.date eq b.date
    ;
quit;

What I'm wondering is if pulling the 10 observations is best done in SQL or
with an array?  Somehow it feels like an array would be the smoother way to
go but either way there's a learning curve for me for both options so any
advice on relevant functions and code structure would be appreciated!