FAN (Fast Application Notification) & ONS (Oracle Notification Services)

Posted by Sagar Patil


How does the failover mechanism work?

Posted by Sagar Patil


TAF can be verified by querying the Oracle-provided views

Posted by Sagar Patil


EM grid console active only at RAC 1 Instance

Posted by Sagar Patil

Case 1: EM console is running on Node 1. Node 1 is shut down; services fail over to Node 2, but the EM console does not fail over to Node 2.

[oracle@wygora02 ~]$ showcrs
HA Resource                                   Target     State
-----------                                   ------     -----
ora.wygora01.ASM1.asm                         ONLINE     OFFLINE
ora.wygora01.LISTENER_WYGORA01.lsnr           ONLINE     OFFLINE
ora.wygora01.gsd                              ONLINE     OFFLINE
ora.wygora01.ons                              ONLINE     OFFLINE
ora.wygora01.vip                              ONLINE     OFFLINE
ora.wygora02.ASM2.asm                         ONLINE     ONLINE on wygora02
ora.wygora02.LISTENER_WYGORA02.lsnr           ONLINE     ONLINE on wygora02
ora.wygora02.gsd                              ONLINE     UNKNOWN on wygora02
ora.wygora02.ons                              ONLINE     UNKNOWN on wygora02
ora.wygora02.vip                              ONLINE     ONLINE on wygora02
ora.wygprod.db                                ONLINE     ONLINE on wygora02
ora.wygprod.wygprod.cs                        ONLINE     ONLINE on wygora02
ora.wygprod.wygprod.wygprod1.srv              ONLINE     OFFLINE
ora.wygprod.wygprod.wygprod2.srv              ONLINE     ONLINE on wygora02
ora.wygprod.wygprod1.inst                     OFFLINE    OFFLINE
ora.wygprod.wygprod2.inst                     ONLINE     ONLINE on wygora02
emctl start dbconsole
TZ set to GB-Eire
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation.  All rights reserved.
http://wygora01.wyg-asp.com:1158/em/console/aboutApplication
Agent Version     : 10.1.0.4.1
OMS Version       : 10.1.0.4.0
Protocol Version  : 10.1.0.2.0
Agent Home        : /u01/app/oracle/product/10.2.0/db_1/wygora02_wygprod2
Agent binaries    : /u01/app/oracle/product/10.2.0/db_1
Agent Process ID  : 26599
Parent Process ID : 26554
Agent URL         : http://wygora02.wyg-asp.com:3938/emd/main
Started at        : 2008-03-13 15:58:50
Started by user   : oracle
Last Reload       : 2008-03-13 15:58:50
Last successful upload                       : 2008-03-13 16:43:03
Last attempted upload                        : 2008-03-13 16:44:54
Total Megabytes of XML files uploaded so far :     6.40
Number of XML files pending upload           :        1
Size of XML files pending upload(MB)         :     0.00
Available disk space on upload filesystem    :    65.82%
Agent is already started. Will restart the agent
This will stop the Oracle Enterprise Manager 10g Database Control process. Continue [y/n] :y
Stopping Oracle Enterprise Manager 10g Database Control ...
...  Stopped.
Agent is not running.
Starting Oracle Enterprise Manager 10g Database Control ..... started.
-----------------------------------------------------------------
Logs are generated in directory /u01/app/oracle/product/10.2.0/db_1/wygora02_wygprod2/sysman/log
Result: no Grid Console is running.
Case 2: EM console is running on Node 1. Instance 1 is shut down (just the instance, not the server); the EM console keeps working.
[oracle@wygora01 ~]$ showcrs
HA Resource                                   Target     State
-----------                                   ------     -----
ora.wygprod.db                                ONLINE     ONLINE on wygora01
ora.wygprod.wygprod.cs                        ONLINE     ONLINE on wygora02
ora.wygprod.wygprod.wygprod1.srv              ONLINE     OFFLINE
ora.wygprod.wygprod.wygprod2.srv              ONLINE     ONLINE on wygora02
ora.wygprod.wygprod1.inst                     OFFLINE    OFFLINE
ora.wygprod.wygprod2.inst                     ONLINE     ONLINE on wygora02
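
In listings like the ones above, the interesting rows are the resources whose Target is ONLINE while their State is something else (OFFLINE or UNKNOWN). The following is a minimal sketch for pulling those rows out. It assumes that showcrs is a local wrapper script that prints the three-column layout shown here, and the function name crs_not_online is hypothetical.

```shell
# Hypothetical helper: read showcrs-style output on stdin and print every
# resource whose Target is ONLINE but whose State is anything else.
# Header and separator lines are skipped because they do not start with "ora.".
crs_not_online() {
    awk '$1 ~ /^ora\./ && $2 == "ONLINE" && $3 != "ONLINE" { print $1, $3 }'
}

# Possible usage (assumes showcrs is on the PATH):
# showcrs | crs_not_online
```

Each output line holds the resource name followed by its current state, which makes it easy to compare the two cases side by side.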

RAC/CRS/Voting disk failover Tests

Posted by Sagar Patil


How to recover from a Loss of Voting Disk

Posted by Sagar Patil


Loss of Voting Disk

Check where the voting disks are located using “crsctl query css votedisk”

Backup of voting disk : dd if=/dev/raw/votingdisk of=/vmasmtest/BACKUP/VOTING/votingdisk_06_may_07
dd: reading `/dev/raw/votingdisk': No such device or address
305172+0 records in
305172+0 records out
[root@vmractest1 VOTING]# ls -l
total 152744
-rw-r--r-- 1 oracle dba 156248064 May 6 16:40 votingdisk_06_may_07
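
The dd run above prints a read error (apparently on reaching the end of the raw device) but still reports the records copied, so it may be worth confirming that the backup file really matches the device before relying on it. The following is a minimal sketch under stated assumptions: the function name backup_voting_disk is hypothetical, and the comparison only covers as many bytes as the backup file contains.

```shell
# Hypothetical helper: copy a voting disk with dd, then confirm the backup
# matches the source over the length of the backup file.
backup_voting_disk() {
    src="$1"    # e.g. /dev/raw/votingdisk
    dest="$2"   # e.g. a file under the backup directory
    # dd may exit non-zero at the end of a raw device, so its exit status
    # is not trusted here; the comparison below is the actual check.
    dd if="$src" of="$dest" 2>/dev/null
    [ -s "$dest" ] || { echo "backup is empty: $dest" >&2; return 1; }
    size=$(wc -c < "$dest" | tr -d ' ')
    # Compare the first $size bytes of the source against the backup file.
    if head -c "$size" "$src" | cmp -s - "$dest"; then
        echo "backup verified: $size bytes"
    else
        echo "backup differs from source" >&2
        return 1
    fi
}

# Possible usage with the paths from this post:
# backup_voting_disk /dev/raw/votingdisk /vmasmtest/BACKUP/VOTING/votingdisk_06_may_07
```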

Delete voting disks using rm command
Check RAC status “crs_stat -t”
Check the alert log on both instances; each should show that the instance terminated.
Check available backups
Restore Voting Disk

Restore Voting Disk dd if=/vmasmtest/BACKUP/VOTING/votingdisk_06_may_07 of=/dev/raw/votingdisk
305172+0 records in
305172+0 records out

Restart CRS /etc/init.d/init.crs start

Check and Restart all Cluster components
./crsctl check crs
./crsctl query css votedisk
./crsctl start resources
Log in to the database and check that everything is OK

TAF Failover Configuration and Testing

Posted by Sagar Patil

Configure the service on RAC servers for a failover

TNS Client side config

PROD =
  (DESCRIPTION =
    (enable=broken)
    (LOAD_BALANCE = yes)
    (ADDRESS = (PROTOCOL = TCP)(HOST = oravip01.oracledbasupport.com)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = oravip02.oracledbasupport.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = prod)
      (failover_mode=(type=select)(method=basic))
    )
  )

Let’s test a Failover – Connect to an Oracle Instance 1 or 2

[oracle@ora02 ~]$ showcrs
HA Resource                                   Target     State
-----------                                   ------     -----
ora.ora01.ASM1.asm                            ONLINE     ONLINE on ora01
ora.ora01.LISTENER_ora01.lsnr                 ONLINE     ONLINE on ora01
ora.ora01.gsd                                 ONLINE     UNKNOWN on ora01
ora.ora01.ons                                 ONLINE     UNKNOWN on ora01
ora.ora01.vip                                 ONLINE     ONLINE on ora01
ora.ora02.ASM2.asm                            ONLINE     ONLINE on ora02
ora.ora02.LISTENER_ora02.lsnr                 ONLINE     ONLINE on ora02
ora.ora02.gsd                                 ONLINE     UNKNOWN on ora02
ora.ora02.ons                                 ONLINE     UNKNOWN on ora02
ora.ora02.vip                                 ONLINE     ONLINE on ora02
ora.prod.db                                   ONLINE     ONLINE on ora01
ora.prod.prod.cs                              ONLINE     ONLINE on ora02
ora.prod.prod.prod1.srv                       ONLINE     ONLINE on ora01
ora.prod.prod.prod2.srv                       ONLINE     ONLINE on ora02
ora.prod.prod1.inst                           ONLINE     ONLINE on ora01
ora.prod.prod2.inst                           ONLINE     ONLINE on ora02

SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
prod2

[oracle@ora02 ~]$ crs_stop ora.prod.prod2.inst
Attempting to stop `ora.prod.prod2.inst` on member `ora02`
Stop of `ora.prod.prod2.inst` on member `ora02` succeeded.
At this stage the connections are diverted to prod1 instance.

SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
prod1

[oracle@ora02 ~]$ showcrs
HA Resource                                   Target     State
-----------                                   ------     -----
ora.ora01.ASM1.asm                            ONLINE     ONLINE on ora01
ora.ora01.LISTENER_ora01.lsnr                 ONLINE     ONLINE on ora01
ora.ora01.gsd                                 ONLINE     UNKNOWN on ora01
ora.ora01.ons                                 ONLINE     UNKNOWN on ora01
ora.ora01.vip                                 ONLINE     ONLINE on ora01
ora.ora02.ASM2.asm                            ONLINE     ONLINE on ora02
ora.ora02.LISTENER_ora02.lsnr                 ONLINE     ONLINE on ora02
ora.ora02.gsd                                 ONLINE     UNKNOWN on ora02
ora.ora02.ons                                 ONLINE     UNKNOWN on ora02
ora.ora02.vip                                 ONLINE     ONLINE on ora01
ora.prod.db                                   ONLINE     ONLINE on ora01
ora.prod.prod.cs                              ONLINE     ONLINE on ora02
ora.prod.prod.prod1.srv                       ONLINE     ONLINE on ora01
ora.prod.prod.prod2.srv                       ONLINE     OFFLINE
ora.prod.prod1.inst                           ONLINE     ONLINE on ora01
ora.prod.prod2.inst                           OFFLINE    OFFLINE

[oracle@ora02 ~]$ crs_start ora.prod.prod2.inst
Attempting to start `ora.prod.prod2.inst` on member `ora02`
Start of `ora.prod.prod2.inst` on member `ora02` succeeded.

What happens if Server is restarted?

I am connected to the prod2 instance, and rebooting that server migrates my connection to prod1 automatically.

SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
prod2

SQL> select count(*) from
(select * from dba_source union select * from dba_source union select * from dba_source union select * from dba_source union select * from dba_source);
COUNT(*)
----------
292465

SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
prod1

Let’s see how RAC load balancing works. Write a small SQL test script (verify.sql) like the one below.

REM the following query is for TAF connection verification
col sid format 999
col serial# format 9999999
col failover_type format a13
col failover_method format a15
col failed_over format a11
SELECT   sid,
 serial#,
 failover_type,
 failover_method,
 failed_over
 FROM   v$session
 WHERE   username = 'SU';

REM the following query is for load balancing verification
SELECT   instance_name FROM v$instance;
exit

REM We can also combine two queries:
col inst_id format 999
col sid format 999
col serial# format 9999999
col failover_type format a13
col failover_method format a15
col failed_over format a11
SELECT   inst_id,
 sid,
 serial#,
 failover_type,
 failover_method,
 failed_over
 FROM   gv$session
 WHERE   username = 'SU';

REM a simple select to see the distribution of users when testing connection : load balancing
 SELECT   inst_id, COUNT ( * )
 FROM   gv$session
GROUP BY   inst_id;

Write a loop.sh file that opens a number of SQL connections by repeating the two lines below at least 100 times. The Oracle listener load-balances by directing each new connection to the least loaded RAC instance.

nohup sqlplus system/0ra01@failover @verify.sql &
sleep 1
nohup sqlplus system/0ra01@failover @verify.sql &
sleep 1
nohup sqlplus system/0ra01@failover @verify.sql &
sleep 1
nohup sqlplus system/0ra01@failover @verify.sql &
sleep 1
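
Instead of pasting the same two lines 100 times, the repetition can be expressed as a loop. The following is a minimal sketch, not the original loop.sh: the function name open_sessions is hypothetical, and it appends all session output to nohup.out explicitly so the later grep counts still apply. The command to run is passed as arguments, so a harmless stand-in can be used for a dry run.

```shell
# Hypothetical loop.sh replacement: start COUNT background sessions,
# pausing DELAY seconds between them.
open_sessions() {
    count="$1"; delay="$2"; shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Run the given command in the background, collecting its output.
        nohup "$@" >> nohup.out 2>&1 &
        sleep "$delay"
        i=$((i + 1))
    done
    wait   # let every background session finish before returning
}

# Possible usage with the connect string from the lines above:
# open_sessions 100 1 sqlplus system/0ra01@failover @verify.sql
```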

Run loop.sh and note how the connections are shared between the RAC 1 and RAC 2 nodes

[oracle@ora01 scripts]$ grep prod2 nohup.out | wc -l
35
[oracle@ora01 scripts]$ grep prod1 nohup.out | wc -l
41
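
The two grep commands above can be folded into one small helper when more instances are involved. This is a sketch only: the function name count_by_instance is hypothetical, and like the original commands it counts matching lines, so any line that merely mentions an instance name is included.

```shell
# Hypothetical helper: for each instance name given, print the name and the
# number of lines in the log file that contain it.
count_by_instance() {
    log="$1"; shift
    for inst in "$@"; do
        # grep -c prints 0 when nothing matches, which is what we want here.
        printf '%s %s\n' "$inst" "$(grep -c "$inst" "$log")"
    done
}

# Possible usage:
# count_by_instance nohup.out prod1 prod2
```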

 
