Home » SQL & PL/SQL » SQL & PL/SQL » Update based on grouping (11.0.2.10)
Update based on grouping [message #660521] Mon, 20 February 2017 04:41 Go to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
Hi All,

I have the following case of duplication that I need to resolve.

The following code resolves the problem by adding one sequence to all cases:
create table test_rem_dup
  (
    id number primary key,
    fk_id number,
    v1 char(22),
    v2 char(10)
  );

insert ALL
  into TEST_REM_DUP values (10,100,'ppp','ooo')
  into TEST_REM_DUP values (20,100,'ppp','ooo')
  into TEST_REM_DUP values (30,300,'ccc','')
  into TEST_REM_DUP values (40,300,'cc','')
  into TEST_REM_DUP values (50,300,'cc','k')
  into TEST_REM_DUP values (60,300,'cc','k')
  into TEST_REM_DUP values (70,300,'cc','')
  into TEST_REM_DUP values (80,300,'cc','')
  into TEST_REM_DUP values (90,400,'uu','')
  into TEST_REM_DUP values (100,100,'xx','')
select * from dual;

drop sequence seq_temp;
create sequence seq_temp start with 1;



update TEST_REM_DUP p set v2 = trim(v2) || '*IT' ||  seq_temp.NEXTVAL 
  where (p.ID) in 
    (
      select ID from TEST_REM_DUP p, (select FK_ID, v1, v2 from TEST_REM_DUP  group by FK_ID, v1, v2 having count(*)>1) src 
        where p.FK_ID = src.FK_ID
          and trim(p.V1) = trim(src.V1)
          and (
                ( trim(p.V2) = trim(src.v2) )
                OR
                (trim(p.V2) is null and  trim(src.v2) is null)
              )
    );

However, I want to number all duplicate cases with the same fk_ID seprately such as:
ID	FK_ID	V1	V2
10	100	ppp	ooo*IT1
20	100	ppp	ooo*IT2
30	300	ccc	(null)
40	300	cc	*IT1
50	300	cc	k*IT1
60	300	cc	k*IT2
70	300	cc	*IT2
80	300	cc	*IT3
90	400	uu	(null)
100	100	xx	(null)

Thanks,
Ferro

[Updated on: Mon, 20 February 2017 04:42]

Report message to a moderator

Re: Update based on grouping [message #660523 is a reply to message #660521] Mon, 20 February 2017 05:07 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
SQL> select id, fk_id, v1,
  2         case
  3           when count(*) over (partition by fk_id, v1) = 1 then null
  4           else v2||'*IT'||row_number() over (partition by fk_id, v1, v2 order by id)
  5         end v2
  6  from TEST_REM_DUP
  7  order by 1
  8  /
        ID      FK_ID V1    V2
---------- ---------- ----- ----------
        10        100 ppp   ooo*IT1
        20        100 ppp   ooo*IT2
        30        300 ccc
        40        300 cc    *IT1
        50        300 cc    k*IT1
        60        300 cc    k*IT2
        70        300 cc    *IT2
        80        300 cc    *IT3
        90        400 uu
       100        100 xx

[Updated on: Mon, 20 February 2017 05:09]

Report message to a moderator

Re: Update based on grouping [message #660545 is a reply to message #660523] Mon, 20 February 2017 21:53 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
Thanks Michel,

I tend to forget about the partitioning option every time!

My update query is:
MERGE INTO TEST_REM_DUP p
   USING (
          select id, fk_id, v1,
            case
              when count(*) over (partition by fk_id, v1,v2) = 1 then null
              else trim(v2)||row_number() over (partition by fk_id, v1, v2 order by id)
              end v2_new
          from TEST_REM_DUP
         ) Src
   ON (p.id = Src.id)
   WHEN MATCHED THEN UPDATE SET p.v2 = src.v2_new;

For the sake of knowledge, Is there a way to do it using normal UPDATE clause instead of merge?

Thanks,
Ferro
Re: Update based on grouping [message #660546 is a reply to message #660545] Mon, 20 February 2017 22:34 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_12.1.0.2.0> SELECT * FROM test_rem_dup ORDER BY id
  2  /

        ID      FK_ID V1                     V2
---------- ---------- ---------------------- ----------
        10        100 ppp                    ooo
        20        100 ppp                    ooo
        30        300 ccc
        40        300 cc
        50        300 cc                     k
        60        300 cc                     k
        70        300 cc
        80        300 cc
        90        400 uu
       100        100 xx

10 rows selected.

SCOTT@orcl_12.1.0.2.0> UPDATE test_rem_dup p
  2  SET    p.v2 =
  3  	    (SELECT v2_new
  4  	     FROM   (SELECT id,
  5  			    CASE
  6  			      WHEN COUNT (*) OVER (PARTITION BY fk_id, v1, v2) = 1 THEN NULL
  7  			      ELSE TRIM (v2) || ROW_NUMBER () OVER (PARTITION BY fk_id, v1, v2 ORDER BY id)
  8  			    END v2_new
  9  		     FROM   test_rem_dup) src
 10  	     WHERE  p.id = src.id)
 11  /

10 rows updated.

SCOTT@orcl_12.1.0.2.0> SELECT * FROM test_rem_dup ORDER BY id
  2  /

        ID      FK_ID V1                     V2
---------- ---------- ---------------------- ----------
        10        100 ppp                    ooo1
        20        100 ppp                    ooo2
        30        300 ccc
        40        300 cc                     1
        50        300 cc                     k1
        60        300 cc                     k2
        70        300 cc                     2
        80        300 cc                     3
        90        400 uu
       100        100 xx

10 rows selected.
Re: Update based on grouping [message #660547 is a reply to message #660546] Mon, 20 February 2017 23:02 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
@Barbara
Thanks a lot

@Michel
I also adjusted the query as follows:
WHEN COUNT (*) OVER (PARTITION BY fk_id, v1, v2) = 1 THEN V2 --instead of null 
To preserve V2 value is case not null

Thanks,
Ferro
Re: Update based on grouping [message #660548 is a reply to message #660547] Tue, 21 February 2017 00:54 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Thanks foe the feedback.
Note that MERGE is most often more efficient than update.

Re: Update based on grouping [message #660549 is a reply to message #660548] Tue, 21 February 2017 01:01 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
@Michel

Yes, I usually use MERGE as I also find its syntax more readable and logic.

I have one more question, in case I added this record to my original data:
insert ALL
  into TEST_REM_DUP values (110,100,'ppp','ooo2')
select * from dual;

This way the merge script will add 'ooo2' creating a new duplicate, how can I avoid this knowing that the size of the field I am updating on production is just varchar2(2)?

Thanks,
Ferro
Re: Update based on grouping [message #660551 is a reply to message #660549] Tue, 21 February 2017 01:09 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

I don't understand you are updating a VARCHAR2(2) that contains 'ooo2' which has 4 characters?
And what do you want to avoid? The INSERT or updating the field?

Re: Update based on grouping [message #660552 is a reply to message #660551] Tue, 21 February 2017 01:21 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
Hi Michel,
I want to avoid updating the field V2 with a value that is a duplicate with another value already existing in the table.
What I meant with
Quote:

knowing that the size of the field I am updating on production is just varchar2(2)
is that adding a special character before the number (row_number()) will not work when I apply this solution to production as the field size is limited to 2.

Thanks,
Ferro

Re: Update based on grouping [message #660553 is a reply to message #660552] Tue, 21 February 2017 01:28 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

I don't see, please post a table and example that show the actual case.

Re: Update based on grouping [message #660554 is a reply to message #660553] Tue, 21 February 2017 02:14 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
Sure and sorry for any confusion my words might have caused:

drop table test_rem_dup;

create table test_rem_dup
  (
    id number primary key,
    fk_id number,
    v1 char(22),
    v2 char(2)
  );

insert ALL
  into TEST_REM_DUP values (10,100,'ppp','o')
  into TEST_REM_DUP values (20,100,'ppp','o')
  into TEST_REM_DUP values (30,300,'ccc','')
  into TEST_REM_DUP values (40,300,'cc','')
  into TEST_REM_DUP values (50,300,'cc','k')
  into TEST_REM_DUP values (60,300,'cc','k')
  into TEST_REM_DUP values (70,300,'cc','')
  into TEST_REM_DUP values (80,300,'cc','')
  into TEST_REM_DUP values (90,400,'uu','')
  into TEST_REM_DUP values (100,100,'xx','l')
  into TEST_REM_DUP values (110,100,'ppp','o2')
select * from dual;


update TEST_REM_DUP p set v2 = 
  (
    select v2_new 
      FROM
      (
        select id, fk_id, v1,
                   case
                     when count(*) over (partition by fk_id, v1,v2) = 1 then v2
                     else trim(v2)||row_number() over (partition by fk_id, v1, v2 order by id)
                   end v2_new
            from TEST_REM_DUP
      ) src
    where p.ID = src.ID
  )
;

MERGE INTO TEST_REM_DUP p
   USING (
          select id, fk_id, v1,
            case
              when count(*) over (partition by fk_id, v1,v2) = 1 then V2
              else trim(v2)||row_number() over (partition by fk_id, v1, v2 order by id)
              end v2_new
          from TEST_REM_DUP
         ) Src
   ON (p.id = Src.id)
   WHEN MATCHED THEN UPDATE SET p.v2 = src.v2_new;



ALTER TABLE TEST_REM_DUP ADD CONSTRAINT UNQ_v1_v2 UNIQUE (v1, v2);

The constraint will fail due to:
1 ORA-02299: cannot validate (AFESD2.UNQ_V1_V2) - duplicate keys found

as the newly updated value (o2) created a new duplicate with an already existing value. I am trying to also avoid that.

Thanks,
Ferro
Re: Update based on grouping [message #660555 is a reply to message #660554] Tue, 21 February 2017 02:55 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

What is the purpose of your MERGE/UPDATE? Fix the current data (one time pass) or is it the permanent that should contain the table?
In the later case, why don't you insert the correct data instead?
In the former, how do you handle the case there are more than 9 or 99 duplicated rows?
In addition, define "duplicated rows", in my query I partitioned by "fk_id, v1" (which was I thought the duplicate condition from your first post), you partitioned by "fk_id, v1, v2".
What is the correct case?

In short, post a complete and deterministic specification of your issue.


Re: Update based on grouping [message #660557 is a reply to message #660555] Tue, 21 February 2017 03:16 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
Sure
1- The purpose of the data correction (MERGE/UPDATE) is to eliminate duplicate records and be able to enforce uniqueness on fk_id, V1 and V2 (the constraint I tried to run in the later case is missing fk_id sorry for the typo).
--it should be:
ALTER TABLE TEST_REM_DUP ADD CONSTRAINT UNQ_v1_v2 UNIQUE (fk_id, v1, v2);
2- I am showing the duplication case I have in my production and need to solve as per point 1
3- Cannot, in this case V1 has to change. However, fortunately, this is not the case I am facing in production data where either I have one character filled in V2 with less than 9 duplicates or empty V2 with less than 99 duplicates.
4- Duplicate rows are rows with repeated fk_id, V1 and V2 values.

Thanks a lot for your time,
Ferro
Re: Update based on grouping [message #660558 is a reply to message #660557] Tue, 21 February 2017 03:23 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

In this case, after the first fix, you create the unique constraint as you mentioned, modify the application to insert correct data and there is no more problem.
I fail to see where is your issue now.

Re: Update based on grouping [message #660563 is a reply to message #660558] Tue, 21 February 2017 03:39 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
Dear Michel,

In the later example, the merge/update script will add rows that will create duplication with already existing rows.

The data before running the merge is:
ID	FK_ID	V1	V2
10	100	ppp	o
20	100	ppp	o
30	300	ccc	(null)
40	300	cc	(null)
50	300	cc	k
60	300	cc	k
70	300	cc	(null)
80	300	cc	(null)
90	400	uu	(null)
100	100	xx	l
110	100	ppp	o2

running constraint will fail:
ALTER TABLE TEST_REM_DUP ADD CONSTRAINT UNQ_v1_v2 UNIQUE (fk_id, v1, v2);

ORA-02299: cannot validate (AFESD2.UNQ_V1_V2) - duplicate keys found 

After running the merge statement:
MERGE INTO TEST_REM_DUP p
   USING (
          select id, fk_id, v1,
            case
              when count(*) over (partition by fk_id, v1,v2) = 1 then V2
              else trim(v2)||row_number() over (partition by fk_id, v1, v2 order by id)
              end v2_new
          from TEST_REM_DUP
         ) Src
   ON (p.id = Src.id)
   WHEN MATCHED THEN UPDATE SET p.v2 = src.v2_new;

the data became:

ID	FK_ID	V1	V2
10	100	ppp	o1
20	100	ppp	o2
30	300	ccc	(null)
40	300	cc	1
50	300	cc	k1
60	300	cc	k2
70	300	cc	2
80	300	cc	3
90	400	uu	(null)
100	100	xx	l
110	100	ppp	o2

Now although all previous duplicate cases have been solved, a new one was created between
ID	FK_ID	V1	V2
20	100	ppp	o2
110	100	ppp	o2

and the constraint will fail for the same reason. This is the difference between the former and the later examples. I am trying to avoid this case in my script.

Thanks,
Ferro
Re: Update based on grouping [message #660564 is a reply to message #660563] Tue, 21 February 2017 03:45 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Ah OK, I understand now.
In this case, I should either add a new column instead of updating the original one or use a temporary table, if you don't want to modify the original one and then determine the generated duplicated rows.
You have to specify what you want to do in this specific case.
Anyway, depending on what you want to do, it will most likely be a 2-pass process not a single statement.

[Updated on: Tue, 21 February 2017 03:47]

Report message to a moderator

Re: Update based on grouping [message #660567 is a reply to message #660564] Tue, 21 February 2017 04:14 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
There is no issue with temporary adding a new column to the table and then remove it. This was my first trial but failed because I did not know what to do when I run the merge for the first time and them discover that a duplicate still exists. How can I determine the new key? I couldnt write the SQL that does that.

Thanks,
Ferro
Re: Update based on grouping [message #660571 is a reply to message #660567] Tue, 21 February 2017 04:41 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

You have to define what you want to do if such a case happen.
Do you want to modify the modified row (for instance instead of changing it from "o" to "o2", changing to "o3", then the question is what if "o3" also exists...) or changing the row which originally has "o2" and changing to what?

Re: Update based on grouping [message #660573 is a reply to message #660571] Tue, 21 February 2017 04:43 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
@ Michel
Quote:

then the question is what if "o3" also exists
This is exactly what I dont know how to do. What I want is to select the minimum number (instead of 2) that will make no duplication inside the merge statement, here is where I need help.

Thanks,
Ferro
Re: Update based on grouping [message #660575 is a reply to message #660573] Tue, 21 February 2017 04:51 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

So your option is to change the new o2 to o3 or the first one that currently does not exist?
You could loop until no more duplicates exist. I have to go so can't do it now but the plan is: "after the first is made, loop "incrementing" the modified value until there is no more duplicates" (and so the necessary new column to compare to other old one). The number of rows should decrease at each loop.

[Updated on: Tue, 21 February 2017 04:53]

Report message to a moderator

Re: Update based on grouping [message #660576 is a reply to message #660573] Tue, 21 February 2017 06:35 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
If the numbers are used only to eliminate duplication and it does not matter which id ends up with v2 value of o2 or o3, then you could do it all in one update, by trimming the trailing numbers during comparison and update, as demonstrated below.

SCOTT@orcl_12.1.0.2.0> SELECT * FROM test_rem_dup
  2  /

        ID      FK_ID V1                     V2
---------- ---------- ---------------------- --
        10        100 ppp                    o
        20        100 ppp                    o
        30        300 ccc
        40        300 cc
        50        300 cc                     k
        60        300 cc                     k
        70        300 cc
        80        300 cc
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o2

11 rows selected.

SCOTT@orcl_12.1.0.2.0> UPDATE test_rem_dup p
  2  SET    p.v2 =
  3  	    (SELECT v2_new
  4  	     FROM   (SELECT id,
  5  			    CASE
  6  			      WHEN COUNT (*) OVER (PARTITION BY fk_id, v1, RTRIM(v2,'1234567890 ')) = 1 THEN
  7  				RTRIM(v2,'1234567890 ')
  8  			      ELSE RTRIM(v2,'123456790 ')
  9  				|| ROW_NUMBER () OVER (PARTITION BY fk_id, v1, RTRIM(v2,'1234567890 ') ORDER BY id)
 10  			    END v2_new
 11  		     FROM   test_rem_dup) src
 12  	     WHERE  p.id = src.id)
 13  /

11 rows updated.

SCOTT@orcl_12.1.0.2.0> SELECT * FROM test_rem_dup
  2  /

        ID      FK_ID V1                     V2
---------- ---------- ---------------------- --
        10        100 ppp                    o1
        20        100 ppp                    o2
        30        300 ccc
        40        300 cc                     1
        50        300 cc                     k1
        60        300 cc                     k2
        70        300 cc                     2
        80        300 cc                     3
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o3

11 rows selected.

SCOTT@orcl_12.1.0.2.0> ALTER TABLE test_rem_dup ADD CONSTRAINT unq_v1_v2 UNIQUE (fk_id, v1, v2)
  2  /

Table altered.

[Updated on: Tue, 21 February 2017 06:41]

Report message to a moderator

Re: Update based on grouping [message #660598 is a reply to message #660576] Tue, 21 February 2017 22:11 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
@Barbara

Thanks a lot. I've been thinking of ways to achieve the objective without having to involve a temp table or a new column (to impress Michel for once!).
Your solution works however it changes the original V2 values for non-duplicate columns which is against my requirements.
My objective updating V2 is:
Quote:

What I want is to select the minimum number (instead of 2) that will make no duplication inside the merge statement, here is where I need help
Thanks a lot,
Ferro
Re: Update based on grouping [message #660599 is a reply to message #660598] Tue, 21 February 2017 22:58 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_12.1.0.2.0> SELECT * FROM test_rem_dup
  2  /

        ID      FK_ID V1                     V2
---------- ---------- ---------------------- --
        10        100 ppp                    o
        20        100 ppp                    o
        30        300 ccc
        40        300 cc
        50        300 cc                     k
        60        300 cc                     k
        70        300 cc
        80        300 cc
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o2

11 rows selected.

SCOTT@orcl_12.1.0.2.0> DECLARE
  2    v_seq	NUMBER;
  3    v_id	NUMBER;
  4    v_fk_id	NUMBER;
  5    v_v1	CHAR(22);
  6    v_v2	CHAR(2);
  7  BEGIN
  8    FOR src IN
  9  	 (SELECT id, fk_id, v1, v2
 10  	  FROM	 (SELECT id, fk_id, v1, v2,
 11  			 COUNT (*) OVER (PARTITION BY fk_id, v1, v2) cnt
 12  		  FROM	 test_rem_dup)
 13  	  WHERE  cnt > 1
 14  	  ORDER  BY fk_id, v1, v2, id)
 15    LOOP
 16  	 IF v_id IS NULL OR v_fk_id != src.fk_id OR v_v1 != src.v1
 17  	    OR NVL(v_v2,'XX') != NVL(src.v2,'XX') THEN
 18  	   v_seq := 1;
 19  	 ELSE
 20  	   v_seq := v_seq + 1;
 21  	 END IF;
 22  	 LOOP
 23  	   UPDATE test_rem_dup p
 24  	   SET	  p.v2 = TRIM (p.v2) || v_seq
 25  	   WHERE  p.id = src.id
 26  	   AND	  NOT EXISTS
 27  		  (SELECT t.*
 28  		   FROM   test_rem_dup t
 29  		   WHERE  t.fk_id = p.fk_id
 30  		   AND	  t.v1 = p.v1
 31  		   AND	  t.v2 = TRIM (p.v2) || v_seq);
 32  	   IF SQL%ROWCOUNT > 0 THEN EXIT;
 33  	   ELSE  v_seq := v_seq + 1;
 34  	   END IF;
 35  	 END LOOP;
 36  	 v_id := src.id;
 37  	 v_fk_id := src.fk_id;
 38  	 v_v1 := src.v1;
 39  	 v_v2 := src.v2;
 40    END LOOP;
 41  END;
 42  /

PL/SQL procedure successfully completed.

SCOTT@orcl_12.1.0.2.0> SELECT * FROM test_rem_dup ORDER BY id
  2  /

        ID      FK_ID V1                     V2
---------- ---------- ---------------------- --
        10        100 ppp                    o1
        20        100 ppp                    o3
        30        300 ccc
        40        300 cc                     1
        50        300 cc                     k1
        60        300 cc                     k2
        70        300 cc                     2
        80        300 cc                     3
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o2

11 rows selected.

SCOTT@orcl_12.1.0.2.0> ALTER TABLE test_rem_dup ADD CONSTRAINT unq_v1_v2 UNIQUE (fk_id, v1, v2)
  2  /

Table altered.
Re: Update based on grouping [message #660604 is a reply to message #660575] Wed, 22 February 2017 02:05 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Here's the way I had in mind (I added one more duplicated row):
SQL> alter table test_rem_dup add v3 char(2);

Table altered.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2 V3
---------- ---------- ---------------------- -- --
        10        100 ppp                    o
        20        100 ppp                    o
        30        300 ccc
        40        300 cc
        50        300 cc                     k
        60        300 cc                     k
        70        300 cc
        80        300 cc
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o2
       120        100 ppp                    o

12 rows selected.

SQL> merge into test_rem_dup t1
  2        using (select id,
  3                      case
  4                        when count(*) over (partition by fk_id, v1, v2) = 1 then v2
  5                        else trim(v2)||row_number() over (partition by fk_id, v1, v2 order by id)
  6                      end v3
  7               from TEST_REM_DUP) t2
  8        on (t2.id = t1.id)
  9  when matched then update set t1.v3 = t2.v3
 10  /

12 rows merged.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2 V3
---------- ---------- ---------------------- -- --
        10        100 ppp                    o  o1
        20        100 ppp                    o  o2
        30        300 ccc
        40        300 cc                        1
        50        300 cc                     k  k1
        60        300 cc                     k  k2
        70        300 cc                        2
        80        300 cc                        3
        90        400 uu
       100        100 xx                     l  l
       110        100 ppp                    o2 o2
       120        100 ppp                    o  o3

12 rows selected.

SQL> begin
  2    loop
  3      update test_rem_dup t1
  4      set v3 = rtrim(v3,'1234567890 ')
  5               || to_char(to_number(regexp_substr(v3,'\d+ *$'))+1)
  6      where nvl(v3,'mc') != nvl(v2,'mc')
  7        and exists (select v2 from test_rem_dup t2
  8                    where t2.v3 = t1.v3
  9                      and (  ( t2.id != t1.id and t2.v3 = t2.v2 )
 10                          or ( t2.id < t1.id and t2.v3 != t2.v2 ) ));
 11      exit when sql%rowcount = 0;
 12    end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2 V3
---------- ---------- ---------------------- -- --
        10        100 ppp                    o  o1
        20        100 ppp                    o  o3
        30        300 ccc
        40        300 cc                        1
        50        300 cc                     k  k1
        60        300 cc                     k  k2
        70        300 cc                        2
        80        300 cc                        3
        90        400 uu
       100        100 xx                     l  l
       110        100 ppp                    o2 o2
       120        100 ppp                    o  o4

12 rows selected.

SQL> alter table test_rem_dup drop column v2;

Table altered.

SQL> alter table test_rem_dup rename column v3 to v2;

Table altered.

SQL> ALTER TABLE test_rem_dup ADD CONSTRAINT unq_v1_v2 UNIQUE (fk_id, v1, v2);

Table altered.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2
---------- ---------- ---------------------- --
        10        100 ppp                    o1
        20        100 ppp                    o3
        30        300 ccc
        40        300 cc                     1
        50        300 cc                     k1
        60        300 cc                     k2
        70        300 cc                     2
        80        300 cc                     3
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o2
       120        100 ppp                    o4

12 rows selected.

Now a question raises: "the number in v2 are no more in the same order than id, is this a problem?".
You have to choose: either the order of "v2" is not the same than "id" (in each partition), either you have to modify the original "v2" to keep this order.

[Updated on: Wed, 22 February 2017 02:09]

Report message to a moderator

Re: Update based on grouping [message #660607 is a reply to message #660604] Wed, 22 February 2017 04:49 Go to previous messageGo to next message
OraFerro
Messages: 433
Registered: July 2011
Senior Member
@Barbara
@Michel

Thanks a lot.

@Michel, right you are considering the example. The production table has a date field and a description field that make different meanings to each row. Also in my script instead of dropping V3 I will replace it with IS_AUTO_UPDATED to indicate fields that were changed by IT until user review is completed then will drop it.

Thanks a lot,
Ferro

[Updated on: Wed, 22 February 2017 04:50]

Report message to a moderator

Re: Update based on grouping [message #660608 is a reply to message #660607] Wed, 22 February 2017 04:57 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Note that I did not drop v3, I dropped v2 and renamed v3 to v2 as I built the result into v3 taking v2 as a reference to know if v3 has to be updated or not.

[Updated on: Wed, 22 February 2017 11:03]

Report message to a moderator

Re: Update based on grouping [message #660609 is a reply to message #660608] Wed, 22 February 2017 05:24 Go to previous message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
You can change my script to match your new requirements like:
SQL> alter table test_rem_dup add IS_AUTO_UPDATED number(1);

Table altered.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2 IS_AUTO_UPDATED
---------- ---------- ---------------------- -- ---------------
        10        100 ppp                    o
        20        100 ppp                    o
        30        300 ccc
        40        300 cc
        50        300 cc                     k
        60        300 cc                     k
        70        300 cc
        80        300 cc
        90        400 uu
       100        100 xx                     l
       110        100 ppp                    o2
       120        100 ppp                    o

12 rows selected.

SQL> merge into test_rem_dup t1
  2        using (select id,
  3                      case
  4                        when count(*) over (partition by fk_id, v1, v2) = 1 then v2
  5                        else trim(v2)||row_number() over (partition by fk_id, v1, v2 order by id)
  6                      end v2,
  7                      case
  8                        when count(*) over (partition by fk_id, v1, v2) = 1 then 0
  9                        else 1
 10                      end flag
 11               from TEST_REM_DUP) t2
 12        on (t2.id = t1.id)
 13  when matched then update set t1.v2 = t2.v2, t1.IS_AUTO_UPDATED = t2.flag
 14  /

12 rows merged.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2 IS_AUTO_UPDATED
---------- ---------- ---------------------- -- ---------------
        10        100 ppp                    o1               1
        20        100 ppp                    o2               1
        30        300 ccc                                     0
        40        300 cc                     1                1
        50        300 cc                     k1               1
        60        300 cc                     k2               1
        70        300 cc                     2                1
        80        300 cc                     3                1
        90        400 uu                                      0
       100        100 xx                     l                0
       110        100 ppp                    o2               0
       120        100 ppp                    o3               1

12 rows selected.

SQL> begin
  2    loop
  3      update test_rem_dup t1
  4      set v2 = rtrim(v2,'1234567890 ')
  5               || to_char(to_number(regexp_substr(v2,'\d+ *$'))+1)
  6      where IS_AUTO_UPDATED = 1
  7        and exists (select v2 from test_rem_dup t2
  8                    where t2.v2 = t1.v2
  9                      and (  ( t2.id != t1.id and t2.IS_AUTO_UPDATED = 0 )
 10                          or ( t2.id < t1.id and t2.IS_AUTO_UPDATED = 1 ) ));
 11      exit when sql%rowcount = 0;
 12    end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL> select * from  TEST_REM_DUP order by 1;
        ID      FK_ID V1                     V2 IS_AUTO_UPDATED
---------- ---------- ---------------------- -- ---------------
        10        100 ppp                    o1               1
        20        100 ppp                    o3               1
        30        300 ccc                                     0
        40        300 cc                     1                1
        50        300 cc                     k1               1
        60        300 cc                     k2               1
        70        300 cc                     2                1
        80        300 cc                     3                1
        90        400 uu                                      0
       100        100 xx                     l                0
       110        100 ppp                    o2               0
       120        100 ppp                    o4               1

12 rows selected.
Previous Topic: PLSQL - Update in Loop
Next Topic: Declare a cursor from the return of a procedure?
Goto Forum:
  


Current Time: Thu Mar 28 19:31:41 CDT 2024