Home » SQL & PL/SQL » SQL & PL/SQL » bulk collect problem through dblink (Orale 12c, 12.1.0.2.0)
bulk collect problem through dblink [message #678280] Sat, 23 November 2019 01:12 Go to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
Hello
I need to mape copy of 20milions records from one db to another through db link.
I do this by script below, but it takes a lot time. It didnt run completly after 8 hours.

Does anybody know how to do this much more efficiently ??

Thanks


declare

cursor s_cur is
       Select /*+ parallel(b 8) */ a.* 
        from CCS.CDR@NCC   a, mape_tmp_rid b
             where a.rowid =b.rid;

TYPE emp_tt  IS TABLE OF s_cur%ROWTYPE INDEX BY PLS_INTEGER;                                                                                               
   p_rec emp_tt ;   

BEGIN

 OPEN s_cur;
LOOP FETCH s_cur

BULK COLLECT INTO p_rec LIMIT 50000;
    FORALL i IN 1 .. p_rec.COUNT
      INSERT  /*+ APPEND PARALLEL */ INTO TMP_MAPE_CDR
      (CCS_ACCT_ID, CCS_ACCT_REF_ID,  CDR_TYPE, EXTRA_INFORMATION, RECORD_DATE )
     VALUES    
      ( 
        p_rec(i).CCS_ACCT_ID , 
        p_rec(i).CCS_ACCT_REF_ID,
        p_rec(i).CDR_TYPE , 
        p_rec(i).EXTRA_INFORMATION ,
        p_rec(i).RECORD_DATE 
         );

COMMIT;

    EXIT WHEN s_cur%NOTFOUND;
END LOOP;

CLOSE s_cur;

commit;
end;
Re: bulk collect problem through dblink [message #678281 is a reply to message #678280] Sat, 23 November 2019 01:38 Go to previous messageGo to next message
John Watson
Messages: 8344
Registered: January 2010
Location: Global Village
Senior Member
insert into TMP_MAPE_CDR
       Select a.* 
        from CCS.CDR@NCC   a, mape_tmp_rid b
             where a.rowid =b.rid;
I would not try to parallelize it, not for that many rows. Parallelism may make it slower.

btw, I wish you would not say "record" when you mean "row".
Re: bulk collect problem through dblink [message #678282 is a reply to message #678280] Sat, 23 November 2019 02:15 Go to previous messageGo to next message
Michel Cadot
Messages: 67363
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Db links are not parallelized, you have only ONE tunnel. So, most likely, parallelize will be slower than serial.

In addition, committing inside a loop is a bad practice, slow down the process, may lead to ORA-1555 error and logical/application errors if an Oracle one occurs during the loop.

John's solution is the best one.

Re: bulk collect problem through dblink [message #678285 is a reply to message #678282] Sat, 23 November 2019 03:56 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
I have rowid from remote table CCS.CDR@NCC. And thats for I use it for the better copy.
But stil this solution doesnt work for me right.
Re: bulk collect problem through dblink [message #678286 is a reply to message #678285] Sat, 23 November 2019 04:45 Go to previous messageGo to next message
John Watson
Messages: 8344
Registered: January 2010
Location: Global Village
Senior Member
I wasn't going to mention it, but since you have: Using rowid like that is not a good idea. To put it mildly. However, you have already ignored my first attempt to assist so I expect that you will ignore this too Smile
No problem, it is your database, not mine.
Re: bulk collect problem through dblink [message #678287 is a reply to message #678286] Sat, 23 November 2019 05:10 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
Do you mean that the best way is to use the just simple insert wihout bulk collect procedure ??
Total number of table records is 17milions.
Re: bulk collect problem through dblink [message #678288 is a reply to message #678280] Sat, 23 November 2019 05:36 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2985
Registered: January 2010
Location: Connecticut, USA
Senior Member
Please provide explain plan along with indexes on tables CCS.CDR@NCC, mape_tmp_rid, table row counts and number of rows cursor s_cur is returns.

SY.
Re: bulk collect problem through dblink [message #678289 is a reply to message #678288] Sat, 23 November 2019 05:57 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
total number of MAPE_TMP_RID=17213455

I created index on MAPE_TMP_RID(RID)

Execution Plan
----------------------------------------------------------
Plan hash value: 691313085

-------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  |   250T|   117P|  1600G  (9)|999:59:59 |        |      |
|   1 |  NESTED LOOPS      |                  |   250T|   117P|  1600G  (9)|999:59:59 |        |      |
|   2 |   REMOTE           | CDR       |   291M|   138G|  4191K  (1)| 00:05:28 |    NCC | R->S |
|*  3 |   TABLE ACCESS FULL| MAPE_TMP_RID |   860K|    15M|  5494   (9)| 00:00:01 |        |      |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("A".ROWID=("B"."RID"))

Remote SQL Information (identified by operation id):
----------------------------------------------------

   2 - SELECT ROWID,"CCS_ACCT_ID","CCS_ACCT_REF_ID","BILLING_ENGINE_ID","CALL_ID","CDR_TYPE","E
       XTRA_INFORMATION","RECORD_DATE","SCP_ID","SEQUENCE_NUMBER" FROM "CCS_ADMIN"."CDR" "A"
       (accessing 'NCC.WORLD' )

[Updated on: Sat, 23 November 2019 05:59]

Report message to a moderator

Re: bulk collect problem through dblink [message #678290 is a reply to message #678289] Sat, 23 November 2019 06:14 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2985
Registered: January 2010
Location: Connecticut, USA
Senior Member
And how many rows are in CCS.CDR@NCC? And how many rows cursor returns? And what indexes CCS.CDR@NCC has?

SY.
Re: bulk collect problem through dblink [message #678291 is a reply to message #678290] Sat, 23 November 2019 06:23 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
I have created table MAPE_TMP_RID with ROWID from CCS.CDR@NCC for one partition.
Of course number of records for whole table CCS.CDR@NCC are much more.
And there is not index, just partition by date.
Re: bulk collect problem through dblink [message #678292 is a reply to message #678290] Sat, 23 November 2019 06:32 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2985
Registered: January 2010
Location: Connecticut, USA
Senior Member
In any case, your query reads one row from remote table A, brings it over to local db and checks if that row ROWID exists in local table B. In other word your query pulls whole remote table A to local db. I assume remote table is way larger than local so you want the opposite - push local table B to remote side, apply condition a.rowid =b.rid and pull matching rows only. If so, add driving site hint:

cursor s_cur is
       Select /*+ driving_site(a) */ a.* 
        from CCS.CDR@NCC   a, mape_tmp_rid b
             where a.rowid =b.rid;
SY.

[Updated on: Sat, 23 November 2019 06:32]

Report message to a moderator

Re: bulk collect problem through dblink [message #678294 is a reply to message #678289] Sat, 23 November 2019 06:56 Go to previous messageGo to next message
John Watson
Messages: 8344
Registered: January 2010
Location: Global Village
Senior Member
Is the column mape_tmp_rid.rid a varchar2? That would explain why your index is not being used and you are scanning the table millions of times. Your index would need to be function based, on chartorowid(rid). And as suggested before, do it with a simple SQL statement, not the pl/sql cursor.
Re: bulk collect problem through dblink [message #678295 is a reply to message #678294] Sat, 23 November 2019 07:01 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
tmp_rid.rid is VARCHAR2 (1000 Byte)
Re: bulk collect problem through dblink [message #678296 is a reply to message #678295] Sat, 23 November 2019 07:05 Go to previous messageGo to next message
John Watson
Messages: 8344
Registered: January 2010
Location: Global Village
Senior Member
You have been given lots of advice: by me (confirmed by MC) and by SY. Why do you not actually try some of it?

For me, it is merely an intellectual exercise but for you it is (I assume) a real business problem. Do something!
Re: bulk collect problem through dblink [message #678297 is a reply to message #678287] Sat, 23 November 2019 08:49 Go to previous messageGo to next message
EdStevens
Messages: 1248
Registered: September 2013
Senior Member
mape wrote on Sat, 23 November 2019 05:10
Do you mean that the best way is to use the just simple insert wihout bulk collect procedure ??
Total number of table records is 17milions.
Well, you've already said that your current process takes several hours. So what would it hurt to actually try the suggestions and see for yourself?

A couple of years ago a co-worker brought to me a very similar problem. He was using some whiz-bang MS tool to copy rows from a MSSQL database into an Oracle and it was taking over 4 hours. I quickly threw together a PL/SQL procedure to to exactly what John Watson suggested to you, and the load took 4 minutes. This was for about 750k rows, so a smaller set than yours, but the principle is the same.

[Updated on: Sat, 23 November 2019 08:52]

Report message to a moderator

Re: bulk collect problem through dblink [message #678298 is a reply to message #678294] Sat, 23 November 2019 09:15 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2985
Registered: January 2010
Location: Connecticut, USA
Senior Member
John Watson wrote on Sat, 23 November 2019 07:56
Is the column mape_tmp_rid.rid a varchar2? That would explain why your index is not being used and you are scanning the table millions of times. Your index would need to be function based, on chartorowid(rid). And as suggested before, do it with a simple SQL statement, not the pl/sql cursor.
John, OP stated CCS.CDR@NCC is much larger table and using index on smaller table isn't optimal especially when it involves pooling larger table from remote site to local. Pushing indexed table to remote site will prevent index use. Anyway, pushing mape_tmp_rid to remote site and using hash join would be, IMHO (based on info provided), is most optimal plan.

SY.
Re: bulk collect problem through dblink [message #678300 is a reply to message #678298] Sat, 23 November 2019 11:19 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
Im gonna try what john said, simple use sql insert instead of bulk procedure. I will let you know.
Re: bulk collect problem through dblink [message #678303 is a reply to message #678300] Sat, 23 November 2019 14:22 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
thanks for advice, its works fine, just simple sql query Smile
Re: bulk collect problem through dblink [message #678304 is a reply to message #678303] Sat, 23 November 2019 15:06 Go to previous messageGo to next message
EdStevens
Messages: 1248
Registered: September 2013
Senior Member
mape wrote on Sat, 23 November 2019 14:22
thanks for advice, its works fine, just simple sql query Smile
So, how long did it take, vs. th 7+ hours of your first method?
And what have you learned from this?
Re: bulk collect problem through dblink [message #678316 is a reply to message #678303] Sun, 24 November 2019 08:55 Go to previous messageGo to next message
John Watson
Messages: 8344
Registered: January 2010
Location: Global Village
Senior Member
This sounds like a very nice example of contrasting use of SQL and PL/SQL to achieve the same result. Going through a DB link makes it more interesting, and you might find some nice variations involving index(es) and the choice of driving site.

Can I persuade you to write it up as a blog article? Then you will be famous, Larry will recruit you, and you can move to Redwood Shores.
Re: bulk collect problem through dblink [message #678317 is a reply to message #678316] Sun, 24 November 2019 09:16 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
I've got the same problem, query takes a long time a does not run completly:

insert into TMP_MAPE_CDR
       Select a.* 
        from CCS.CDR@NCC   a, mape_tmp_rid b
             where a.rowid =b.rid;
Re: bulk collect problem through dblink [message #678318 is a reply to message #678280] Sun, 24 November 2019 09:37 Go to previous messageGo to next message
BlackSwan
Messages: 26733
Registered: January 2009
Location: SoCal
Senior Member
mape wrote on Fri, 22 November 2019 23:12
Hello
I need to mape copy of 20milions records from one db to another through db link.

Post EXECUTE PLAN for SELECT statement that correctly identifies the 20 million rows to be moved; as sample below

SELECT COUNT(*) FROM SOURCE_TABLE WHERE <you complete as needed>;

How long does this SELECT statement take to complete?

Final solution needs to run in/on the Source DB & "push" new rows into Target DB
Re: bulk collect problem through dblink [message #678319 is a reply to message #678318] Sun, 24 November 2019 09:40 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
mape_tmp_rid (rid rowid)

Re: bulk collect problem through dblink [message #678320 is a reply to message #678317] Sun, 24 November 2019 09:41 Go to previous messageGo to next message
John Watson
Messages: 8344
Registered: January 2010
Location: Global Village
Senior Member
mape wrote on Sun, 24 November 2019 15:16
I've got the same problem, query takes a long time a does not run completly:

insert into TMP_MAPE_CDR
       Select a.* 
        from CCS.CDR@NCC   a, mape_tmp_rid b
             where a.rowid =b.rid;
So why did you say it worked? You seem incapable of providing information. For example, what is "a long time"? What does "does not run completely" mean? If you refuse to show what you are actually doing and what happens, how can you expect assistance?
Re: bulk collect problem through dblink [message #678321 is a reply to message #678320] Sun, 24 November 2019 10:08 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
I was wrong, it didn't run right. Long time means that never run to the end.
Re: bulk collect problem through dblink [message #678322 is a reply to message #678321] Sun, 24 November 2019 10:12 Go to previous messageGo to next message
BlackSwan
Messages: 26733
Registered: January 2009
Location: SoCal
Senior Member
You can lead some folks to knowledge, but you can't make them think.
I give up trying to push this string.
He is all YOURS!
Re: bulk collect problem through dblink [message #678339 is a reply to message #678322] Mon, 25 November 2019 08:50 Go to previous messageGo to next message
Bill B
Messages: 1969
Registered: December 2004
Senior Member
why not have a mview log on the remote table and setup an mview with fast refresh on your home database. Refresh it once an hour. When you are ready to load the data you can do a final refresh and then copy it into the local table. Another major problem in inserts like this is that the index maintenance can really slow down the load. Can you disable the indexes and then rebuild them. Does this table have to stay available during the load?

[Updated on: Mon, 25 November 2019 12:31]

Report message to a moderator

Re: bulk collect problem through dblink [message #678344 is a reply to message #678339] Tue, 26 November 2019 01:59 Go to previous messageGo to next message
_jum
Messages: 573
Registered: February 2008
Senior Member
Couldn't EXPDP/IMPDP or DBMS_DATAPUMP be an alternative?
Re: bulk collect problem through dblink [message #678383 is a reply to message #678344] Thu, 28 November 2019 10:44 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
I can not use inpdp or datapump, the only one alternative is db link
Re: bulk collect problem through dblink [message #678384 is a reply to message #678383] Thu, 28 November 2019 10:46 Go to previous messageGo to next message
mape
Messages: 262
Registered: July 2006
Location: Slovakia
Senior Member
Bill B, you mean index on the remote table?
Re: bulk collect problem through dblink [message #678387 is a reply to message #678384] Fri, 29 November 2019 07:22 Go to previous message
Bill B
Messages: 1969
Registered: December 2004
Senior Member
No, An MVIEW LOG records all changes to a base table. You can build a materilized view on your local database and by running a scheduled refresh job, oracle will pull over all changes in the remote table into your local Materialized View. This is what is known as a fast refresh. If you schedule the refresh, for example, every 30 minutes it will keep up with the remote files changes and the remote table's information will be on your local machine. Once that is done you have two options.
1) copy the data in the MVIEW into your production table TMP_MAPE_CDR.
2) simply use the MVIEW for whatever task you need to do. By default an MVIEW is read only. If this works for you then great.

An MVIEW can also have indexes applied so you can have fast access. Why run your own refresh from the remote system when oracle will do it for you in the background.

By the way, it is normally NOT a good idea to store rowid's to match to rows in the base table. The rowid's can be trusted only within the same transaction. There are a number of things that can happen that would cause oracle to move a datablock.

A quick example of an MVIEW log follows. This command assums that their is a primary key on the remote table. This command MUST be run on the remote database

CREATE MATERIALIZED VIEW LOG ON CCS.CDR
STORAGE (
BUFFER_POOL DEFAULT
FLASH_CACHE DEFAULT
CELL_FLASH_CACHE DEFAULT
)
NOCACHE
LOGGING
NOPARALLEL
WITH PRIMARY KEY;

[Updated on: Fri, 29 November 2019 07:32]

Report message to a moderator

Previous Topic: Break line as per required attached output.
Next Topic: ORA 01847
Goto Forum:
  


Current Time: Fri Sep 18 06:57:07 CDT 2020