sas >> Concatenate string records using proc sql

by Torben Brandt » Mon, 11 Aug 2003 04:25:55 GMT

Hi,

If I have a dataset like this:
var1 var2
---- ----
1 ab
1 cd
2 ef

Can I then, using Proc Sql, get the result:
var1 var3
---- ----
1 abcd
2 ef

It's like the SUM-function (summarize- or aggregate-function), just for
strings.


PS: Does anyone know a good beginner tutorial to Proc Report?


:o)
Torben


sas >> Concatenate string records using proc sql

by Howard_Schreier » Mon, 11 Aug 2003 10:31:03 GMT


In your example, there are at most two rows with a particular value of
VAR1. Is that the rule, or can there be more?

On Sun, 10 Aug 2003 22:25:55 +0200, Torben Brandt < XXXX@XXXXX.COM >

sas >> Concatenate string records using proc sql

by Chul-Gyu.Baek » Mon, 11 Aug 2003 12:28:39 GMT

I used not proc sql but data step.


data a;
input var1 var2 $;
cards;
1 ab
1 cd
1 ef
2 dd
2 ee
;
run;

proc sort data=a;
by var1;

proc transpose data=a out=b(drop=_name_);
var var2;
by var1;
run;

proc print; run;

data b;
set b;
array col(*) $ _character_;
length var3 $ 50;
i=1;
do until(i gt dim(col));
var3=trim(var3)||trim(col(i));
i=i+1;
end;
keep var1 var3;
run;

proc print; run;



-----Original Message-----
From: Torben Brandt [mailto: XXXX@XXXXX.COM ]
Sent: Monday, August 11, 2003 5:26 AM
To: XXXX@XXXXX.COM
Subject: Concatenate string records using proc sql


Hi,

If I have a dataset like this:
var1 var2
---- ----
1 ab
1 cd
2 ef

Can I then, using Proc Sql, get the result:
var1 var3
---- ----
1 abcd
2 ef

It's like the SUM-function (summarize- or aggregate-function), just for
strings.


PS: Does anyone know a good beginner tutorial to Proc Report?


:o)
Torben

sas >> Concatenate string records using proc sql

by paul_dorfman » Mon, 11 Aug 2003 16:03:03 GMT

>From: Chul-Gyu Baek

< XXXX@XXXXX.COM > ????


Fine, but what do you need TRANSPOSE for? Why not just

data z ;
array c [1 : 99999] $2 _temporary_ ;
do _n_= 1 by 1 until ( last.var1) ;
set a ;
by var1 ;
c [_n_] = var2 ;
end ;
var3 = peekc ( addr(c[1]) , 2* _n_ ) ;
run ;

Kind regards,
===================
Paul M. Dorfman
Jacksonville, FL
===================


_________________________________________________________________
MSN 8 with e-mail virus protection service: 2 months FREE*
http://join.msn.com/?page=features/virus

sas >> Concatenate string records using proc sql

by WHITLOI1 » Mon, 11 Aug 2003 21:43:32 GMT

Torben,

SQL was designed to work with normal data structures; hence de-normalization
is generally a hard thing to code in SQL. Without some restrictions on the
data the code could get quite hard and would probably involve macro to be at
all general.

Now assuming at most two distinct values of the variable to be concatenated
in any one group and no repeated values in any one group, the following code
works.

data w ; input id v :$2. ; cards ;
1 ab
1 cd
2 ef
;

proc sql ;
create view v1 as
select *
from w
group by id
having v = min(v)
;
create table wanted as
select v1.id , v1.v||s.v as var3
from v1 left join w as s
on v1.id = s.id
and s.v > v1.v
;
select * from wanted ;
quit ;

In general you could set aside the minimum values, left join the constructed
set with the remaining minimum values, and repeat this operation until set
of minimum values is empty. Such a plan removes the restriction on the
number of records in a group, but you will still run into trouble with
repeat values. With repeat values the can be no SQL solution without
introducing a sequencing variable to distinguish records with repeat values.

As you can see it much harder than the suggested DATA step solutions. One
of the big advantages to SAS is that it can work well when the data has a
good structure, and it also works much easier than most procedural languages
with poorly structured data because it imposes only a rectangular named
variable structure on SAS data and no limits on external data. In short it
is a bad idea to restrict solutions to SQL when SQL is the wrong tool for
the task.

XXXX@XXXXX.COM

-----Original Message-----
From: Torben Brandt [mailto: XXXX@XXXXX.COM ]
Sent: Sunday, August 10, 2003 4:26 PM
To: XXXX@XXXXX.COM
Subject: Concatenate string records using proc sql


Hi,

If I have a dataset like this:
var1 var2
---- ----
1 ab
1 cd
2 ef

Can I then, using Proc Sql, get the result:
var1 var3
---- ----
1 abcd
2 ef

It's like the SUM-function (summarize- or aggregate-function), just for
strings.


PS: Does anyone know a good beginner tutorial to Proc Report?


:o)
Torben

sas >> Concatenate string records using proc sql

by yhuang » Mon, 11 Aug 2003 23:30:27 GMT

If each id has maximum two records, here is a
brute force method:

data w ; input id v :$2. ; cards ;
1 ab
1 cd
2 ef
;

proc sql;
select distinct id,
case when count(*)=2 then min(v)||max(v) else min(v) end as concatv
from w
group by id
;
---------

id concatv
-------------------
1 abcd
2 ef

If max number is 3, it is still doable. Otherwise, it is really
hard to use proc sql to get what you want. This issue has been
discussed several times before.

Kind regards,

Ya Huang


-----Original Message-----
From: Torben Brandt [mailto: XXXX@XXXXX.COM ]
Sent: Sunday, August 10, 2003 1:26 PM
To: XXXX@XXXXX.COM
Subject: Concatenate string records using proc sql


Hi,

If I have a dataset like this:
var1 var2
---- ----
1 ab
1 cd
2 ef

Can I then, using Proc Sql, get the result:
var1 var3
---- ----
1 abcd
2 ef

It's like the SUM-function (summarize- or aggregate-function), just for
strings.


PS: Does anyone know a good beginner tutorial to Proc Report?


:o)
Torben

sas >> Concatenate string records using proc sql

by Torben Brandt » Tue, 12 Aug 2003 23:17:49 GMT


Thanks to all,

Although I hoped that SQL could do it in an easy way, your solutions
convinced me just to use a data step. I'll go with that.

Torben

sas >> Concatenate string records using proc sql

by HERMANS1 » Wed, 13 Aug 2003 05:03:08 GMT

Just for fun, this query solves the problem you stated:

proc sql;
create table concat as
select t1.var1 as var1,case when t1.var2 < t2.var2
then trim(t1.var2)||t2.var2
else t1.var2
end as var3
from test1 as t1 left join test1 as t2
on t1.var1=t2.var1
group by t1.var1 having length(var3)=max(length(var3))
;
quit;

Unfortunately it does not prove to be very robust. If you add the line '1
bc' to the dataset, for example, it does not give you the same form of
result.

Sig

-----Original Message-----
From: Torben Brandt [mailto: XXXX@XXXXX.COM ]
Sent: Tuesday, August 12, 2003 11:18 AM
To: XXXX@XXXXX.COM
Subject: Re: Concatenate string records using proc sql




Thanks to all,

Although I hoped that SQL could do it in an easy way, your solutions
convinced me just to use a data step. I'll go with that.

Torben

sas >> Concatenate string records using proc sql

by Torben Brandt » Wed, 13 Aug 2003 07:18:33 GMT


As some of you have figured out, I could have more than two rows in each
group. I found another way to solve the problem.

This solution need a column numbering the rows from 1 and up, but that
easy to make.
I still have one problem: Even though I write "else ''" it counts
missing values as a string containing spaces. Does anyone know how I
make that an empty string ''? (Trim leaves the leadning space)


data MyData;
id = 1; var1 = 1; var2 = 'ab'; output;
id = 2; var1 = 1; var2 = 'cd'; output;
id = 3; var1 = 1; var2 = 'ef'; output;
id = 4; var1 = 2; var2 = 'gh'; output;
run;

%macro makeSQL();
%do i = 1 %to %eval(&total-1);
max(case when id = &i then var2 else '' end) ||
%end;
%mend makeSQL;

proc sql;
select max(id) into :total from MyData;
select
%makeSQL
max(case when id = &total then var2 else '' end) as var3
from MyData
group by var1;
quit;


Torben

Similar Threads

1. Collapsing records and concatenating variables using PROC SQL

2. Collapsing records and concatenating variables using PROC

This does what you want for the sample data given, making lots of
possibly unwarranted assumptions, but something tells me it isn't what
your client wants:

=====proc sql;
    create table three as
       select *
       from (
          select
                   key,
                   max(var) as var,
                   case 1
                      when (var=min(var) and var ne max(var)) then
                         resolve('%let hold=' || var || ';')
                      when (var ne min(var) and var ne max(var)) then
                         resolve('%let hold=&HOLD. ' || var || ';')
                      when (var eq min(var) and var eq max(var)) then
                         var
                      else
                         trim(symget('HOLD')) || ' ' || var
                      end as newvar
          from
                   one (sortedby = key var)
          group by
                   key
          )
       group by
                key
       having
                newvar = max(newvar)
       order by
                key, var;
  quit;
=====


Scott Bass wrote:
> Hi,
>
> I would think I would have known how to do this by now, but alas I don't :-
> (
>
> Say I have the following data step:
>
> data one;
>    length key 8 var $1;
>    input key var;
>    cards;
> 1 A
> 1 B
> 1 C
> 2 D
> 2 E
> 3 F
> ;
> run;
>
> data two;
>    set one;
>    by key;
>    length newvar $200;
>    retain newvar;
>    if first.key then newvar = "";
>    newvar = catx(" ",newvar,var);
>    if last.key then output;
> run;
>
> Is there a way to create this dataset using PROC SQL only?  I'd rather
> just use the data step, but the customer would prefer SQL if possible.
>
> Thanks,
> Scott

3. How to concatenates the observations of one dataset using PROC SQL

4. quoted string in PROC SQL - using macro quoting functions

I intend to dynamically use a string in the PROC SQL where clause..I
have dipped my hands into Macro quoting to achieve this and have a few
questions..

given a list of variables defined in %variables= var1 var2 var3;, I
want to execute..

PROC SQL;
select name
from <data>
where name in ('var1','var2','var3');
quit;
run;

To dynamically code the string 'var1','var2',.... I have tried to code
the following macro.

%let count = 3; /* a counter set to 3 because we know there are three
variables at present */

%macro mac1;

%let i = 1;
        %let step = %scan(&variables,&i);
        %let code =;

        %do %while (%length(&step)>0);

                %if &i < &count %then %do;
                        %let code = &code%nrquote(%bquote('))&step%nrquote(%bquote('))%str(,);
                %end;

                %if &i = &count %then %do;
                        %let code = &code%nrquote(%bquote('))&step%nrquote(%bquote('));
                %end;

                %let i = %eval(&i + 1);
                %let step = %scan(&variables,&i);

        %end;
                &code

%mend mac1;

Upon invoking this macro as

PROC SQL
select name
from <data>
where name in (%mac1);
quit;

I get errors..

However, when I look at the log  -- using %put &code, instead of &code
-- I see the exact statements that I want, i.e. 'var1','var2',var3'

Thanks,

Rushi

5. How to load 100000 records at a time into db2 using proc sql

6. Best way to union(concatenate several datasets) Proc SQL Or Proc Append

I need to union(concatenate) 9-10 datasets. Each has same number of
variables and all the variables are same.

I am wondering what is the quickest way to union(concatenate) a big
number of datasets?

Proc SQL can use UNION statement like:

Proc SQL;
(select * from a)
union
(select * from b)
union
(select * from c);
quit;

But can I union 9-10 datasets at one time? Would that take long time?

If I use Proc Append? I can only append one dataset at a time, right?
Would it be more time consuming?

Thanks,

Mark

7. Best way to union(concatenate several datasets) Proc SQL Or

8. Proc SQL vs sort/merge/concatenate