Replicating RAC database using RMAN at Remote Server
Here I am duplicating 11g RAC database from one RHEL Server to Another by old 10g method.
I could have used 11g “DUPLICATE TARGET DATABASE TO TARGET_DB FROM ACTIVE DATABASE” which doesn’t need previous rman backup at source. But it may not be a good option for large databases or at places with narrow network bandwidth.
Assumptions Made:
– RAC Clusterware and Database binaries are installed at Destination Nodes
– Clusterware services “crsctl check crs” reported active
PRIMARY site Tasks (Ora01a1,Ora01a2):
- Create FULL RMAN Backup
- Copy backup files from PRIMARY server to New server
- Create pfile from spfile at source RAC
- Copy init.ora from $Primary_Server:ORACLE_HOME/dbs to $New_Server:ORACLE_HOME/dbs
- Copy $Primary_Server:ORACLE_HOME/dbs/password file to $New_Server:ORACLE_HOME/dbs
[oracle@Ora01a1 RAC1]$ scp Ora01a1BKUP.tgz oracle@Node1:/mnt/data
Warning: Permanently added (RSA) to the list of known hosts.
Ora01a1BKUP.tgz 100% 274MB 11.4MB/s 00:24
SQL> show parameter pfile
NAME TYPE VALUE
———————————— ———– ——————————
spfile string /mnt/data/oradata/primary/spfileRAC.oraSQL> show parameter spfile
NAME TYPE VALUE
———————————— ———– ——————————
spfile string /mnt/data/oradata/primary/spfileRAC.oraSQL> create pfile=’/mnt/data/oradata/primary/init.ora’ from spfile;
File created
[oracle@Ora01a1 RAC]$ scp init.ora oracle@Node1:/mnt/data/rman_backups/bkup/init.ora 100% 1612 1.6KB/s 00:00
[oracle@Ora01a1 dbs]$ scp /mnt/data/oradata/primary/orapwRAC oracle@Node1:/mnt/data/rman_backups/bkup orapwRAC 100% 1536 1.5KB/s 00:00
Destination Site Tasks (Node1,Node2)
Create required directories for bdump,adump as well as database mount volumes.
[oracle@Node1]$ grep /mnt initRAC.ora
*.control_files=’/mnt/data/oradata/primary/control01.ctl’,’/mnt/data/oradata/primary/control02.ctl’
*.db_recovery_file_dest=’/mnt/logs/oradata/primary/fast_recovery_area’
*.log_archive_dest_1=’LOCATION=/mnt/logs/oradata/primary/arch’[oracle@Node1]$ mkdir -p /mnt/data/oradata/primary/
[oracle@Node1]$ mkdir -p /mnt/logs/oradata/primary/fast_recovery_area
[oracle@Node1]$ mkdir -p /mnt/logs/oradata/primary/arch“opt” is a local volume for each instance so create directories on both RAC nodes
[oracle@Node1]$ grep /opt initRAC.ora
*.audit_file_dest=’/opt/app/oracle/admin/primary/adump’
*.diagnostic_dest=’/opt/app/oracle’[oracle@Node1]$ mkdir -p /opt/app/oracle/admin/primary/adump
[oracle@Node1]$ mkdir -p /opt/app/oracle[oracle@Node2]$ mkdir -p /opt/app/oracle/admin/primary/adump
[oracle@Node3]$ mkdir -p /opt/app/oracle
Under 11g background trace will be maintained at “$ORACLE_BASE/diag/rdbms”, if required create necessary directories there.
Modify init.ora file ($ORACLE_HOME/dbs/init.ora) and amend/change parameters. I had to comment out “remote_listener” parameter as the serversnames at destination are different.
Copy init.ora at both nodes “Node1,Node2″@$ORACLE_HOME/dbs
[oracle@Node1 dbs]$ cp initRAC.ora initRAC1.ora
[oracle@Node1 dbs]$ echo $ORACLE_SID
RAC1
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 9152860160 bytes
Fixed Size 2234056 bytes
Variable Size 6945769784 bytes
Database Buffers 2181038080 bytes
Redo Buffers 23818240 bytes
[oracle@Node1 dbs]$ rman target / nocatalog
connected to target database: RAC (not mounted)
using target database control file instead of recovery catalog
RMAN> restore controlfile from ‘/mnt/data/rman_backups/bkup/c-4020163152-20110405-01’;
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=588 instance=RAC1 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
output file name=/mnt/data/oradata/primary/control01.ctl
output file name=/mnt/data/oradata/primary/control02.ctl
.
…
………
Finished restore at 05-APR-11
Verify controlfiles are copied at right location
[oracle@Node2 RAC]$ pwd
/mnt/data/oradata/RAC[oracle@Node2 RAC]$ ls -lrt
-rw-r—– 1 oracle oinstall 22986752 Apr 5 16:35 control01.ctl
-rw-r—– 1 oracle oinstall 22986752 Apr 5 16:35 control02.ctlRMAN> alter database mount;
database mounted
RMAN> RESTORE DATABASE;
Starting restore at 05-APR-11
Starting implicit crosscheck backup at 05-APR-11
allocated channel: ORA_DISK_1
allocated channel: ORA_DISK_2
******* This returned errors
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/05/2011 16:36:54
RMAN-06026: some targets not found – aborting restore
RMAN-06023: no backup or copy of datafile 3 found to restore
RMAN was not able to locate backup since backupset was not registered with rman inventory & copied at different location. Let’s catalog backup pieces that were shipped from Primary database.
I have multiple copies of backup files so I used
RMAN> CATALOG START WITH ‘/mnt/data/rman_backups/bkup/’ NOPROMPT;
List of Cataloged Files
=======================
File Name: /mnt/data/rman_backups/bkup/c-4020163152-20110405-04
File Name: /mnt/data/rman_backups/bkup/c-4020163152-20110405-05
File Name: /mnt/data/rman_backups/bkup/db_bk_ub8m91cg7_s3432_p1_t747680263.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ub9m91cg8_s3433_p1_t747680264.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubam91cg9_s3434_p1_t747680265.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubcm91cgi_s3436_p1_t747680274.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubdm91cgi_s3437_p1_t747680274.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubbm91cgi_s3435_p1_t747680274.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubem91ck0_s3438_p1_t747680384.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubfm91ck0_s3439_p1_t747680384.bkp
File Name: /mnt/data/rman_backups/bkup/ctl_bk_ubhm91ck3_s3441_p1_t747680387.bkpRMAN> RESTORE DATABASE;
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /mnt/data/oradata/primary/system01.dbf
channel ORA_DISK_1: restoring datafile 00005 to /mnt/data/oradata/primary/undotbs02.dbf
channel ORA_DISK_1: reading from backup piece /mnt/data/rman_backups/bkup/db_bk_ubdm91cgi_s3437_p1_t747680274.bkp
.
..
channel ORA_DISK_1: piece handle=/mnt/data/rman_backups/bkup/db_bk_ubdm91cgi_s3437_p1_t747680274.bkp tag=TAG20110405T165753
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:26
channel ORA_DISK_2: piece handle=/mnt/data/rman_backups/bkup/db_bk_ubcm91cgi_s3436_p1_t747680274.bkp tag=TAG20110405T165753
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: restore complete, elapsed time: 00:01:26
channel ORA_DISK_3: piece handle=/mnt/data/rman_backups/bkup/db_bk_ubbm91cgi_s3435_p1_t747680274.bkp tag=TAG20110405T165753
channel ORA_DISK_3: restored backup piece 1
channel ORA_DISK_3: restore complete, elapsed time: 00:01:56
Finished restore at 05-APR-11RMAN> recover database;
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=3289
channel ORA_DISK_1: reading from backup piece /mnt/data/rman_backups/bkup/db_bk_ubem91ck0_s3438_p1_t747680384.bkp
channel ORA_DISK_2: starting archived log restore to default destination
channel ORA_DISK_2: restoring archived log
archived log thread=2 sequence=3484
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 04/05/2011 17:17:46
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 3290 and starting SCN of 246447604RMAN> ALTER DATABASE OPEN RESETLOGS;
database opened
Shutdown and restart database RAC1
SQL> shutdown abort;
ORACLE instance shut down.set sqlprompt ‘&_CONNECT_IDENTIFIER > ‘
RAC1> startup;
ORACLE instance started.
Total System Global Area 9152860160 bytes
Fixed Size 2234056 bytes
Variable Size 6945769784 bytes
Database Buffers 2181038080 bytes
Redo Buffers 23818240 bytes
Database mounted.
Database opened.RAC1> set linesize 200;
RAC1> set pagesize 20;
RAC1> select inst_id,substr(member,1,35) from gv$logfile;
INST_ID SUBSTR(MEMBER,1,35)
———- ——————————————————————————————————————————————–
1 /mnt/data/oradata/primary/redo02.log
1 /mnt/data/oradata/primary/redo01.log
1 /mnt/data/oradata/primary/redo03.log
1 /mnt/data/oradata/primary/redo04.log
I can see , INSTANCE 2 REDO log files are not listed so startup RAC2 instance at Node2
[oracle@Node2 dbs]$ echo $ORACLE_SID
RAC2
SQL> set sqlprompt ‘&_CONNECT_IDENTIFIER > ‘
RAC2> startup;
ORACLE instance started.
Total System Global Area 9152860160 bytes
Fixed Size 2234056 bytes
Variable Size 6610225464 bytes
Database Buffers 2516582400 bytes
Redo Buffers 23818240 bytes
Database mounted.
Database opened.
I can now locate REDO files for Instance 1 as well as 2
select inst_id,substr(member,1,35) from gv$logfile;
INST_ID SUBSTR(MEMBER,1,35)
———- ——————————————————————————————————————————————–
2 /mnt/data/oradata/primary/redo02.log
2 /mnt/data/oradata/primary/redo01.log
2 /mnt/data/oradata/primary/redo03.log
2 /mnt/data/oradata/primary/redo04.log
1 /mnt/data/oradata/primary/redo02.log
1 /mnt/data/oradata/primary/redo01.log
1 /mnt/data/oradata/primary/redo03.log
1 /mnt/data/oradata/primary/redo04.log
8 rows selected.
I will carry log switchs to see ARCHIVE files create at Archive Destination “/mnt/logs/oradata/primary/arch”
RAC1 > alter system switch logfile;
System altered.
RAC1 > /
System altered.
RAC2 > alter system switch logfile;
System altered.
RAC2 > /
System altered.
[oracle@Node2 arch]$ pwd
/mnt/logs/oradata/primary/arch
[oracle@Node2 arch]$ ls -lrt
total 5348
-rw-r—– 1 oracle oinstall 777216 Apr 6 10:00 1_10_747681489.arc
-rw-r—– 1 oracle oinstall 4096 Apr 6 10:00 1_11_747681489.arc
-rw-r—– 1 oracle oinstall 4667392 Apr 6 10:00 2_11_747681489.arc
-rw-r—– 1 oracle oinstall 56832 Apr 6 10:01 2_12_747681489.arc
We have some background jobs in this database. I will set them to sleep at both databases for some time
RAC1 > alter system set job_queue_processes=0;
System altered.RAC2 > alter system set job_queue_processes=0;
System altered.
See if there are any alrtlog errors reported at nodes node1/node2 before Registering database with CRS
RAC1> create spfile from pfile;
File created.[oracle@Node1 dbs]$ pwd
/opt/app/oracle/product/11.2/db_1/dbs
-rw-r—– 1 oracle oinstall 3584 Apr 6 10:20 spfileRAC1.ora
Move spfile at a shared clustered location accessible to both Nodes/Instances RAC1/RAC2.
cp spfileRAC1.ora /mnt/data/oradata/primary/spfileRAC.ora
[oracle@(RAC1 or RAC2 ) ]$ df -k
/dev/mapper/System-Opt 20314748 14636172 4630208 76% /opt — Local Storage
NETAPP_Server:/vol/prod_data 52428800 33919456 18509344 65% /mnt/data — Clustered Storage[oracle@RAC1 PROD]$ ls -l /mnt/data/oradata/primary/spfile*
-rw-r—– 1 oracle oinstall 7680 May 10 15:18 spfileRAC.ora
Link individual init files on nodes RAC1/RAC2 to spfile
[oracle@RAC1]$ cd $ORACLE_HOME/dbs
[oracle@RAC1 dbs]$ cat initRAC1.ora
SPFILE=’/mnt/data/oradata/primary/spfileRAC.ora’[oracle@RAC2 dbs]$ cat initRAC2.ora
SPFILE=’/mnt/data/oradata/primary/spfileRAC.ora’
Registering database with CRS
[oracle@Node1 dbs]$ srvctl add database -d RAC -o /opt/app/oracle/product/11.2/db_1 -p /mnt/data/oradata/primary/spfileRAC.ora
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC1 -n Node1
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC2 -n Node2
[oracle@Node2 arch]$ crsstat.sh | grep RAC
ora.RAC.db OFFLINE OFFLINE
Before using services, we must check the cluster configuration is correct
[oracle@Node1 dbs]$ srvctl config database -d RAC
Database unique name: RAC
Database name:
Oracle home: /opt/app/oracle/product/11.2/db_1
Oracle user: oracle
Spfile: /mnt/data/oradata/primary/spfileRAC.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RAC
Database instances: RAC1,RAC2
Disk Groups:
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@Node1 dbs]$ srvctl start database -d RAC
PRCR-1079 : Failed to start resource ora.RAC.db
CRS-5017: The resource action “ora.RAC.db start” encountered the following error:
ORA-29760: instance_number parameter not specified
Solution of the Problem
srvctl is case sensitive. So we need to ensure that instance and database definitions set in spfile/pfile are same case as those in the OCR and as are used in the srvctl commands. I made a mistake here and added “GDPROD1/2” in lowercase “RAC1/RAC2” while creating services.
Before going into solution be sure that ORACLE_SID reflects correct case so that instance can be accessed using SQL*Plus
I will have to remove services created earlier and add them with “UPPERCASE” instance name
[oracle@Node1 dbs]$ srvctl remove database -d RAC
Remove the database RAC? (y/[n]) y
[oracle@Node1 dbs]$ srvctl remove instance -d RAC -i RAC1
PRCD-1120 : The resource for database RAC could not be found.
PRCR-1001 : Resource ora.RAC.db does not exist
[oracle@Node1 dbs]$ srvctl remove instance -d RAC -i RAC2
PRCD-1120 : The resource for database RAC could not be found.[oracle@Node1 dbs]$ srvctl add database -d RAC -o /opt/app/oracle/product/11.2/db_1 -p /mnt/data/oradata/primary/spfileRAC.ora
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC1 -n Node1
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC2 -n Node2
[oracle@Node2 arch]$ crsstat.sh | grep RAC
ora.RAC.db OFFLINE OFFLINE
Moment of TRUTH , start the Database
[oracle@Node1 dbs]$ srvctl start database -d RAC
[oracle@Node1 dbs]$ crsstat.sh | grep RAC
ora.RAC.db ONLINE ONLINE on Node1
[oracle@Node1 ~]$ export ORACLE_SID=RAC1
SQL> set sqlprompt ‘&_CONNECT_IDENTIFIER > ‘
RAC1 > archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /mnt/logs/oradata/primary/arch
Oldest online log sequence 18
Next log sequence to archive 19
Current log sequence 19SQL> set sqlprompt ‘&_CONNECT_IDENTIFIER > ‘
RAC2 > archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /mnt/logs/oradata/primary/arch
Oldest online log sequence 20
Next log sequence to archive 21
Current log sequence 21
Finally have a look at alrtlog for any issues reported
To test session failover , I will create SQLPLUS connection and see if it gets migrated to other node when instance goes down.
SQL> select machine from v$session where rownum <5;
MACHINE
—————————————————————-
Node1
Node1
Node1
Node1Node1 RAC1> shutdown abort;
ORACLE instance shut down.SQL> select machine from v$session where rownum <5;
MACHINE
—————————————————————-
Node2
Node2
Node2
Node2