Add a new node using ADD_MASTER_DATABASE : Group Quiescing Method

Posted By Sagar Patil

I have 2 databases (repA and repB) which are part of multi master replication cluster server A and Server B. I now need to add another Server C with database name of repC. Consider this as a Live system and require a very minimum downtime.

Pre-Requisites :

Step 1> Check there are no pending X’actions at both sites repA,repB

Step 1> Find if there are any Transactions Currently Being Propagated to a Remote Master

repAA> SELECT /*+ ORDERED */ P.XID "Tran Being Propagated",
(MAX(C.CALLNO) + 1) "Number of Calls in Tran",
(P.SEQUENCE/MAX(C.CALLNO) + 1) * 100 "% Processed Calls"
FROM V$REPLPROP P, DEFCALL C
WHERE P.NAME LIKE '%SLAVE%'
AND upper(P.DBLINK) = 'REPB'
AND C.DEFERRED_TRAN_ID = P.XID
GROUP BY P.XID, P.SEQUENCE;

no rows selected

repBB> SELECT /*+ ORDERED */ P.XID "Tran Being Propagated",
(MAX(C.CALLNO) + 1) "Number of Calls in Tran",
(P.SEQUENCE/MAX(C.CALLNO) + 1) * 100 "% Processed Calls"
FROM V$REPLPROP P, DEFCALL C
WHERE P.NAME LIKE '%SLAVE%'
AND upper(P.DBLINK) = 'REPA'
AND C.DEFERRED_TRAN_ID = P.XID
GROUP BY P.XID, P.SEQUENCE;

no rows selected

Step 2> Listing the Number of Deferred Transactions for Each Destination Master Site

repAA> SELECT DBLINK DEST, COUNT(*) TRANS
FROM DEFTRANDEST D
GROUP BY DBLINK;

no rows selected

repBB> SELECT DBLINK DEST, COUNT(*) TRANS
FROM DEFTRANDEST D
GROUP BY DBLINK;

no rows selected

Step 3> Determining the Total Number of Transactions Queued for Propagation

SELECT COUNT(DISTINCT DEFERRED_TRAN_ID) “Transactions Queued” FROM DEFTRANDEST;

Transactions Queued
——————-
0

repAA>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y NORMAL

repBB>>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y NORMAL
repAA> Alter system set job_queue_processes=0 scope=both;

repBB> Alter system set job_queue_processes=0 scope=both;

repAA> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

repBB>>select sname, master, status from sys.dba_repcat;
SNAMEM STATUS
—————————— – ———
SCOTT_REPGY NORMAL
repAA>cd /opt/oracle/oradata
repAA>ls -l
total 4
drwxr-xr-x 2 oracle dba 1024 Sep 18 16:57 repA
drwxr-xr-x 2 oracle dba 1024 Sep 18 17:08 repB

repAA>ls -l
total 2468686
-rw-r—– 1 oracle dba 1871872 Sep 23 16:47 control01.ctl
-rw-r—– 1 oracle dba 1871872 Sep 23 16:47 control02.ctl
-rw-r—– 1 oracle dba 1871872 Sep 23 16:47 control03.ctl
-rw-r–r– 1 oracle dba 20975616 Sep 23 16:47 drsys01.dbf
-rw-Br–r– 1 oracle dba 143527936 Sep 23 16:47 example01.dbf
-rw-r–r– 1 oracle dba 26218496 Sep 23 16:47 indx01.dbf
-rw-r–r– 1 oracle dba 20975616 Sep 23 16:47 odm01.dbf
-rw-r—– 1 oracle dba 104858624 Sep 23 11:08 redo01.log
-rw-r—– 1 oracle dba 104858624 Sep 23 16:47 redo02.log
-rw-r—– 1 oracle dba 104858624 Sep 22 12:19 redo03.log
-rw-r–r– 1 oracle dba 346034176 Sep 23 16:47 system01.dbf
-rw-r–r– 1 oracle dba 99618816 Sep 18 17:10 temp01.dbf
-rw-r–r– 1 oracle dba 10489856 Sep 23 16:47 tools01.dbf
-rw-r–r– 1 oracle dba 209719296 Sep 23 16:47 undotbs01.dbf
-rw-r–r– 1 oracle dba 26218496 Sep 23 16:47 users01.dbf
-rw-r–r– 1 oracle dba 39981056 Sep 23 16:47 xdb01.dbf
Note : The same timestamp for all datafiles is same & means we have consistent backup set.

*** Prepare repCC instance as a new Master to be added into repication Cluster.

repAA>

cp -Rf repAA repCC
repAA>ls -l
total 6
drwxr-xr-x 2 oracle dba 1024 Sep 18 16:57 repAA
drwxr-xr-x 2 oracle dba 1024 Sep 18 17:08 repBB
drwxr-xr-x 2 oracle dba 1024 Sep 23 16:55 repCC

I have copied initrepB.ora init file as initrepC.ora and changed all datafile paths to repCC. I aos edited job_queue_processes and set it to 0;

repAA> alter database backup controlfile to trace;

Database altered.

repCC>export ORACLE_SID=repCC
repCC>set| grep ORACLE_SID
ORACLE_SID= repCC

$ orapwd file=orapwrepCC password=repcc entries=5

repCC> conn / as sysdba
Connected to an idle instance.

repCC> startup nomount pfile=’/opt/oracle/admin/repC/pfile/initrepC.ora’;
ORACLE instance started.

Total System Global Area 320563864 bytes
Fixed Size 735896 bytes
Variable Size 285212672 bytes
Database Buffers 33554432 bytes
Redo Buffers 1060864 bytes

repCC> show parameter job_queue

NAME TYPE VALUE
———————————— ———– ——————————
job_queue_processes integer 0

repCC> show parameter db_name

NAME TYPE VALUE
———————————— ———– ——————————
db_name string repAA

repCC> CREATE CONTROLFILE set DATABASE “repAA” NORESETLOGS ARCHIVELOG
MAXLOGFILES 50
MAXLOGMEMBERS 5
MAXDATAFILES 100
MAXINSTANCES 1
MAXLOGHISTORY 113
LOGFILE
GROUP 1 ‘/opt/oracle/oradata/repCC/redo01.log’ SIZE 100M,
GROUP 2 ‘/opt/oracle/oradata/repCC/redo02.log’ SIZE 100M,
GROUP 3 ‘/opt/oracle/oradata/repCC/redo03.log’ SIZE 100M
DATAFILE
‘/opt/oracle/oradata/repCC/system01.dbf’,
‘/opt/oracle/oradata/repCC/undotbs01.dbf’,
‘/opt/oracle/oradata/repCC/drsys01.dbf’,
‘/opt/oracle/oradata/repCC/example01.dbf’,
‘/opt/oracle/oradata/repCC/indx01.dbf’,
‘/opt/oracle/oradata/repCC/odm01.dbf’,
‘/opt/oracle/oradata/repCC/tools01.dbf’,
‘/opt/oracle/oradata/repCC/users01.dbf’,
‘/opt/oracle/oradata/repCC/xdb01.dbf’
CHARACTER SET WE8ISO8859P1;
/

repCC>alter database open NORESETLOGS;

Database altered.

repCC> ALTER TABLESPACE TEMPORARY ADD TEMPFILE ‘/opt/oracle/oradata/repCC/TEMPORARY2.dbf’
SIZE 5242880 REUSE AUTOEXTEND OFF; 2

Tablespace altered.

repCC>select instance from v$thread;

INSTANCE
—————-
repAA

At any cost do not drop replication objects using API "EXECUTE Dbms_Repcat.Drop_Master_Repobject('SCOTT', 'EMP', 'TABLE'); "
Just drop replication group - >
repCC>execute dbms_repcat.drop_master_repgroup('SCOTT_REPG');

Problem : At one event Oracle returned error so I used following API to drop Objects before Group.. Don't do this.
repCC>>execute dbms_repcat.drop_master_repgroup('SCOTT_REPG');
BEGIN dbms_repcat.drop_master_repgroup('SCOTT_REPG'); END;
*
ERROR at line 1:
ORA-23353: deferred RPC queue has entries for object group "PUBLIC"."SCOTT_REPG" so drop all individual objects before dropping group.

repCC>> EXECUTE Dbms_Repcat.Drop_Master_Repobject('SCOTT', 'EMP', 'TABLE');
repCC> EXECUTE Dbms_Repcat.Drop_Master_Repobject('SCOTT', 'DEPT', 'TABLE');
PL/SQL procedure successfully completed.

I dropped objects using above API which were then added to repCC administrative queue. When enabled replication ,it dropped objects at repBB.
Also this method failed to move group from QUIESCED to NORMAL condition which could prove a disaster on live system.
The trick here is dropping replication group by keeping DB name as repAA and then renaming database to repCC. Please make sure jobs are set to 0.
Rename repAA instance after dropping replication group to RepCC

repCC>CREATE CONTROLFILE set DATABASE “repCC” RESETLOGS ARCHIVELOG
MAXLOGFILES 50
MAXLOGMEMBERS 5
MAXDATAFILES 100
MAXINSTANCES 1
MAXLOGHISTORY 113
LOGFILE
GROUP 1 ‘/opt/oracle/oradata/repCC/redo01.log’ SIZE 100M,
GROUP 2 ‘/opt/oracle/oradata/repCC/redo02.log’ SIZE 100M,
GROUP 3 ‘/opt/oracle/oradata/repCC/redo03.log’ SIZE 100M
DATAFILE
‘/opt/oracle/oradata/repCC/system01.dbf’,
‘/opt/oracle/oradata/repCC/undotbs01.dbf’,
‘/opt/oracle/oradata/repCC/drsys01.dbf’,
‘/opt/oracle/oradata/repCC/example01.dbf’,
‘/opt/oracle/oradata/repCC/indx01.dbf’,
‘/opt/oracle/oradata/repCC/odm01.dbf’,
‘/opt/oracle/oradata/repCC/tools01.dbf’,
‘/opt/oracle/oradata/repCC/users01.dbf’,
‘/opt/oracle/oradata/repCC/xdb01.dbf’
CHARACTER SET WE8ISO8859P1;
repCC>>alter database rename global_name to repCC;

Database altered.

6. Add new database instance repCC at $ORACLE_HOME/network/admin/listener.ora

LSNRCTL>reload<br>Connectingto(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=bmhp01.uk.pri.o2.com)(PORT=1521)))
The command completed successfully

7. Create fresh DB links from existing master sites to new one and vice versa. Create new links at new database repC. Note repC does have links pointing to repA.

repCC – Should have links for repAA, repBB
repAA – Should have links for repBB, repCC
repBB – Should have links for repAA, repCC

repCC>>alter database rename global_name to repCC;

Database altered.
repAA>conn sys/repa@repaa as sysdba
Connected.
repBB>create public database link repcc using ‘repcc’;

Database link created.

repBB>conn sys/repb@repbb as sysdba
Connected.
repB>create public database link repcc using ‘repcc’;

Database link created.

repBB>conn sys/repc@repcc as sysdba
Connected.

repC>create public database link repbb using ‘repbb’;
create public database link repb using ‘repb’
*
ERROR at line 1:
ORA-02082: a loopback database link must have a connection qualifier

repCC>alter database rename global_name to repCC;

Database altered.

repCC>create public database link repbb using ‘repbb’;

Database link created.

repAA>connect repadmin/repadmin@repaa

repAA>create database link repcc connect to repadmin identified by repadmin;

repBB>connect repadmin/repadmin@repbb

repBB>create database link repcc connect to repadmin identified by repadmin;

repBB>>select count(*) from tab@repcc;

COUNT(*)
———-
0
repCC>connect repadmin/repadmin@repcc

repCC>create database link repbb connect to repadmin identified by repadmin;

repCC>show parameter db_name

NAME TYPE VALUE
———————————— ———– ——————————
db_name string repCC

repCC>>select sname, master, status from sys.dba_repcat;

no rows selected
repAA>>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y NORMAL

repAA>>BEGIN
DBMS_REPCAT.SUSPEND_MASTER_ACTIVITY ( gname => ‘SCOTT_REPG’);
END;
/

PL/SQL procedure successfully completed.
repAA>>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y QUIESCED

repAA>>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
0
repAA>>show user
USER is “REPADMIN”
repAA>> BEGIN
DBMS_REPCAT.ADD_MASTER_DATABASE (
gname => ‘scott_repg’,
master => ‘repcc’,
use_existing_objects => TRUE,
copy_rows => FALSE,
propagation_mode => ‘ASYNCHRONOUS’);
END;
/
PL/SQL procedure successfully completed.

repAA>> SELECT GNAME, substr(DBLINK,1,20), MASTERDEF FROM DBA_REPSITES WHERE MASTER = ‘Y’ AND GNAME NOT IN (SELECT GNAME FROM DBA_REPSITES WHERE SNAPMASTER = ‘Y’) ORDER BY GNAME;

GNAME SUBSTR(DBLINK,1,20) M
—————————— ——————– –
SCOTT_REPG REPAA.US.ORACLE.COM Y
SCOTT_REPG REPBB.US.ORACLE.COM N
SCOTT_REPG REPCC.US.ORACLE.COM N

repAA>>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
9

repBB>>SELECT GNAME, substr(DBLINK,1,20), MASTERDEF FROM DBA_REPSITES WHERE MASTER = ‘Y’ AND GNAME NOT IN (SELECT GNAME FROM DBA_REPSITES WHERE SNAPMASTER = ‘Y’) ORDER BY GNAME;

GNAME SUBSTR(DBLINK,1,20) M
—————————— ——————– –
SCOTT_REPG REPBB.US.ORACLE.COM N
SCOTT_REPG REPAA.US.ORACLE.COM Y
SCOTT_REPG REPCC.US.ORACLE.COM N

repBB>>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
0

repCC>>SELECT GNAME, substr(DBLINK,1,20), MASTERDEF FROM DBA_REPSITES WHERE MASTER = ‘Y’ AND GNAME NOT IN (SELECT GNAME FROM DBA_REPSITES WHERE SNAPMASTER = ‘Y’) ORDER BY GNAME;

GNAME SUBSTR(DBLINK,1,20) M
—————————— ——————– –
SCOTT_REPG REPCC.US.ORACLE.COM N
SCOTT_REPG REPAA.US.ORACLE.COM Y
SCOTT_REPG REPBB.US.ORACLE.COM N

repCC>>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
6

repBB> select count(*) from REPCAT$_REPOBJECT;

COUNT(*)
———-
6
repAA>select count(*) from REPCAT$_REPOBJECT;

COUNT(*)
———-
6
repCC>select count(*) from REPCAT$_REPOBJECT;
COUNT(*)
———-
0

You should wait until the DBA_REPCATLOG view is empty. This view has temporary information that is cleared after successful execution.

Run do_deferred_repcat_admin Manually to push the Admin Queue. Execute the following SELECT statement in another SQL*Plus session to monitor the DBA_REPCATLOG view:

repAA>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
9
repBB>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
0
repCC>SELECT COUNT(*) FROM DBA_REPCATLOG WHERE GNAME = ‘SCOTT_REPG’;

COUNT(*)
———-
6

The easiset method to reoslve admin request is to use grid contorl else Oracle EM. I used it at instances repAA,BB,CC to resolve pending admin requests.

repAA> BEGIN
DBMS_REPCAT.RESUME_MASTER_ACTIVITY (gname => ‘scott_repg’);
END;

repAA>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y NORMAL

repBB>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y NORMAL

repCC>>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y QUIESCED

I had to run dbms_defer_sys.push as well as do_deferred_repcat_admin at repAA/CC to get this group ito normal State

repCC>>select sname, master, status from sys.dba_repcat;

SNAME M STATUS
—————————— – ———
SCOTT_REPG Y NORMAL

repAA

repBB

repCC

Most important Checks after a new database is added into Replication Clustyer

1. Make sure the target (remote) database does have all replication support created for objects.
select sname,oname,type,STATUS from DBA_REPOBJECT minus
select sname,oname,type,STATUS from DBA_REPOBJECT<at>remote_db;

select object_name from dba_objects where object_name like ‘%$RP%’
minus
select object_name from dba_objects<@>remote_db where object_name like ‘%$RP%’;

If above SQL return any values menas there is a replication support missing at those objects at target database. Please use post

2. If you have used SQLPLUS “RECOVER until Change ” to carry recovery at new masterĀ  then DBID would be same for remote database as existing master. This would prove a problem for rman backup.
Please use nid utility to change DBID of target Database

Leave a Reply

You must be logged in to post a comment.

Top of Page

Top menu