Data pump Process Architecture -Master Table, Worker process

Data Pump jobs use a master table, a master process, and worker processes to perform the work and keep track of the progress. For every Data Pump Export and Data Pump Import job, a master process is created. The master process controls the entire job, including communicating with the clients, creating and controlling a pool of worker processes, and performing logging operations.

Let's see an example.
1)Starting data pump jobs in one session.
expdp schemas=shaik userid=shaik/a dumpfile=shaik_30_04_2009.dmp logfile=arju_20_04_09.log

2)Query from dba_datapump_session to know the data pump job status.


set lines 150 pages 100
col program for a20
col username for a5
col spid for a7
col job_name for a25
select to_char(sysdate,'YYYY-MM-DD HH24:MI:SS') "DATE", s.program, s.sid,
   s.status, s.username, d.job_name, p.spid, s.serial#, p.pid
from v$session s, v$process p, dba_datapump_sessions d
where p.addr=s.paddr and s.saddr=d.saddr;
DATE                PROGRAM                  SID STATUS   USERN JOB_NAME                  SPID    SERIAL#     PID
------------------- -------------------- ------- -------- ----- ------------------------- ------- ------- -------
2009-04-30 16:49:46 expdp.exe                148 ACTIVE   SHAIK SYS_EXPORT_SCHEMA_01      3164         14      16
2009-04-30 16:49:46 ORACLE.EXE (DM00)        144 ACTIVE   SHAIK SYS_EXPORT_SCHEMA_01      5376         20      17

SQL> /

DATE                PROGRAM                  SID STATUS   USERN JOB_NAME                  SPID    SERIAL#     PID
------------------- -------------------- ------- -------- ----- ------------------------- ------- ------- -------
2009-04-30 16:49:50 expdp.exe                148 ACTIVE   SHAIK SYS_EXPORT_SCHEMA_01      3164         14      16
2009-04-30 16:49:50 ORACLE.EXE (DM00)        144 ACTIVE   SHAIK SYS_EXPORT_SCHEMA_01      5376         20      17
2009-04-30 16:49:50 ORACLE.EXE (DW01)        141 ACTIVE   SHAIK SYS_EXPORT_SCHEMA_01      7352       7      20

SQL> /

no rows selected

You see in the above output I ran the same query three times. First one is just after submitting data pump jobs. Second query I ran few seconds after data pump job is submitted. And third one is after data pump job is completed.

In the first output of the query just a master table is created as well as master process(DM00). This master process controls the entire job, including communicating with the clients, creating and controlling a pool of worker processes, and performing logging operations.

In the second output, worker process is created (DW01). Since we did not set any parallelism so by default PARALLEL is 1 (one) and there is only one worker process. If we set PARALLEL to more than 1, then we would see multiple worker process like DW01, DW02 etc. The worker processes are responsible for updating the master table with information on the status (pending, completed, failed) of each object being processed. This information is used to provide the detailed information required to restart stopped Data Pump jobs.

In the third output we see no rows as data pump process is disappeared when the data pump job is completed or it stops.

If you say architecture of data pump job then,
data pump job is initiated after creating Master Table. This table is created at the beginning of a Data Pump operation and is dropped at the end of the successful completion of a Data Pump operation. However if you kill the job by command kill_job in the interactive prompt then the Master Table can also be dropped. If a job is stopped using the stop_job interactive command or if the job is terminated unexpectedly, the Master Table will be retained.

Note that during data pump jobs we can set the keep_master parameter Y which ensure to retain the Master Table at the end of a successful job.

Suppose my datapump job command is,
C:>expdp schemas=shaik userid=shaik/a dumpfile=shaik_30_04_.dmp logfile=shaik_20_04_09.log keep_master=y
Export: Release 10.2.0.1.0 - Production on Thursday, 30 April, 2009 17:11:09

Copyright (c) 2003, 2005, Oracle. All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
Starting "SHAIK"."SYS_EXPORT_SCHEMA_02": schemas=shaik userid=shaik/******** dumpfile=shaik_30_04_.dmp logfile=shaik_20_04_0
9.log keep_master=y

As we set keep_master=y so we can see master table "SHAIK"."SYS_EXPORT_SCHEMA_02" anytime after data pump jobs is completed.

Structure of a master table is shown below.


SQL> desc "SHAIK"."SYS_EXPORT_SCHEMA_02";
Name                                                  Null?    Type
----------------------------------------------------- -------- ------------------------------------
PROCESS_ORDER                                                  NUMBER
DUPLICATE                                                      NUMBER
DUMP_FILEID                                                    NUMBER
DUMP_POSITION                                                  NUMBER
DUMP_LENGTH                                                    NUMBER
DUMP_ALLOCATION                                                NUMBER
COMPLETED_ROWS                                                 NUMBER
ERROR_COUNT                                                    NUMBER
ELAPSED_TIME                                                   NUMBER
OBJECT_TYPE_PATH                                               VARCHAR2(200)
OBJECT_PATH_SEQNO                                              NUMBER
OBJECT_TYPE                                                    VARCHAR2(30)
IN_PROGRESS                                                    CHAR(1)
OBJECT_NAME                                                    VARCHAR2(500)
OBJECT_LONG_NAME                                               VARCHAR2(4000)
OBJECT_SCHEMA                                                  VARCHAR2(30)
ORIGINAL_OBJECT_SCHEMA                                         VARCHAR2(30)
PARTITION_NAME                                                 VARCHAR2(30)
SUBPARTITION_NAME                                              VARCHAR2(30)
FLAGS                                                          NUMBER
PROPERTY                                                       NUMBER
COMPLETION_TIME                                                DATE
OBJECT_TABLESPACE                                              VARCHAR2(30)
SIZE_ESTIMATE                                                  NUMBER
OBJECT_ROW                                                     NUMBER
PROCESSING_STATE                                               CHAR(1)
PROCESSING_STATUS                                              CHAR(1)
BASE_PROCESS_ORDER                                             NUMBER
BASE_OBJECT_TYPE                                               VARCHAR2(30)
BASE_OBJECT_NAME                                               VARCHAR2(30)
BASE_OBJECT_SCHEMA                                             VARCHAR2(30)
ANCESTOR_PROCESS_ORDER                                         NUMBER
DOMAIN_PROCESS_ORDER                                           NUMBER
PARALLELIZATION                                                NUMBER
UNLOAD_METHOD                                                  NUMBER
GRANULES                                                       NUMBER
SCN                                                            NUMBER
GRANTOR                                                        VARCHAR2(30)
XML_CLOB                                                       CLOB
NAME                                                           VARCHAR2(30)
VALUE_T                                                        VARCHAR2(4000)
VALUE_N                                                        NUMBER
IS_DEFAULT                                                     NUMBER
FILE_TYPE                                                      NUMBER
USER_DIRECTORY                                                 VARCHAR2(4000)
USER_FILE_NAME                                                 VARCHAR2(4000)
FILE_NAME                                                      VARCHAR2(4000)
EXTEND_SIZE                                                    NUMBER
FILE_MAX_SIZE                                                  NUMBER
PROCESS_NAME                                                   VARCHAR2(30)
LAST_UPDATE                                                    DATE
WORK_ITEM                                                      VARCHAR2(30)
OBJECT_NUMBER                                                  NUMBER
COMPLETED_BYTES                                                NUMBER
TOTAL_BYTES                                                    NUMBER
METADATA_IO                                                    NUMBER
DATA_IO                                                        NUMBER
CUMULATIVE_TIME                                                NUMBER
PACKET_NUMBER                                                  NUMBER
OLD_VALUE                                                      VARCHAR2(4000)
SEED                                                           NUMBER
LAST_FILE                                                      NUMBER
USER_NAME                                                      VARCHAR2(30)
OPERATION                                                      VARCHAR2(30)
JOB_MODE                                                       VARCHAR2(30)
CONTROL_QUEUE                                                  VARCHAR2(30)
STATUS_QUEUE                                                   VARCHAR2(30)
REMOTE_LINK                                                    VARCHAR2(4000)
VERSION                                                        NUMBER
DB_VERSION                                                     VARCHAR2(30)
TIMEZONE                                                       VARCHAR2(64)
STATE                                                          VARCHAR2(30)
PHASE                                                          NUMBER
GUID                                                           RAW(16)
START_TIME                                                     DATE
BLOCK_SIZE                                                     NUMBER
METADATA_BUFFER_SIZE                                           NUMBER
DATA_BUFFER_SIZE                                               NUMBER
DEGREE                                                         NUMBER
PLATFORM                                                       VARCHAR2(101)
ABORT_STEP                                                     NUMBER
INSTANCE                                                       VARCHAR2(60)

Let's see the number of rows populated inside master table.
SQL> select count(*) from "SHAIK"."SYS_EXPORT_SCHEMA_02";

COUNT(*)
--------
1125

We see in the master table the total rows are 1125. In fact master table contains all data pump log messages. It is used to track the detailed progress information of a Data Pump job - which is more than the log messages. It conatins following information.

- Completed rows of a table.

- Total number of errors during data pump operation.

- Elapsed time for each table to do data pump export/import operation.

- The current set of dump files.

- The current state of every object exported or imported and their locations in the dump file set.

- The job's user-supplied parameters.

- The status of every worker process.

- The state of current job status and restart information.

- The dump file location, the directory name information.

And many other useful information.

Search This Blog

World of Oracle DBA

Data pump Process Architecture -Master Table, Worker process

Comments

Post a Comment

Popular posts from this blog

ORA-00939: too many arguments -when case arguments exceed 255

ORA-04062: timestamp of procedure has been changed

ORA-31655: no data or metadata objects selected for job