DMGR HA: How to back up the WebSphere deployment manager for disaster recovery

Posted By Sagar Patil

By its very nature, WebSphere Application Server Network Deployment is a distributed system spanning many machines. While few things are more stressful and frustrating than an unplanned outage, there are ways you can lessen the impact. The goal of this article is to show how you can back up the deployment manager (DMGR) so that recovery becomes a quick and simple task.

So why would you want to back up the DMGR (deployment manager) configuration?

In an ideal world this would not be necessary, but I run massive distributed environments. Although I have admin access to the systems, other teams also have access to monitor and control WebSphere processes. I often came across issues where something had been changed and the DMGR broke the next time I recycled services, thanks to WebSphere's XML repository approach: it breaks not when you make the change but the next time services are recycled.

I ended up writing backup-dmgr_sh to back up the DMGR on my RHEL boxes, but it does it with a twist.

What is the twist?

To make sure the backup holds a working configuration, I shut down the DMGR services and restart them. I then use the wget command on RHEL to check for a valid response from the DMGR port before making the backup. This way I know the backup is valid and doesn't contain a rogue configuration.
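Before the full script, here is a minimal sketch of that restart, probe, then back up flow. It is only an outline under assumptions: the function name verified_backup and its four arguments are hypothetical, and each step is passed in as a command name so the ordering can be tried without a WebSphere installation. Unlike the real script, it always runs the stop step.

```shell
#!/bin/bash
# Hypothetical outline of the "twist": restart, probe, then back up.
# Arguments are command names for the stop, start, probe and backup steps.
verified_backup() {
    local stop_cmd=$1 start_cmd=$2 probe_cmd=$3 backup_cmd=$4

    "$stop_cmd"   || { echo "stop failed"; return 1; }
    "$start_cmd"  || { echo "start failed"; return 1; }
    # Only back up if the freshly restarted server answers the probe.
    "$probe_cmd"  || { echo "console did not respond, skipping backup"; return 1; }
    "$backup_cmd" || { echo "backup failed"; return 1; }

    echo "backup taken from a configuration that restarted cleanly"
}
```

In the real script the four steps correspond to stopServer.sh, startServer.sh, the wget check against the admin console, and the tar/gzip pipeline.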

Attached is a sample log file

#! /bin/bash
# This shell script backs up a profile on a WebSphere node
# Script tested successfully on 15-Dec-2010
# Script names its log files using the hardware clock rather than the OS date

set -x
## Every shell command will be expanded and printed.

TEE=/usr/bin/tee
[[ ! -x $TEE ]] && TEE=/bin/tee
if [[ ! -x $TEE ]]
then
    echo "$0 will not work without the 'tee(1)' command!"
    exit 1
fi
TEE="$TEE -a"

# Shell's internal field separator
# The default value should be <space><tab><new-line>
IFS=$' \t' # == <space><tab>

# Current date and time
# OS clock
# DATE=`/bin/date +%Y%m%d-%H%M%S`
# Hardware RT chip
# DATE=`/sbin/hwclock --show`
# DATE=`echo $DATE | awk '{print $2 "-" $3 "-" $4 "-" $5}'`
# Use the following to get a well-formatted RT chip date and time
DATE=$(hwclock --show | cut -d ' ' -f 1,2,3,4,5,6,7)
DATE=$(date -d "$DATE" "+%Y%m%d-%H%M%S")

# Log file name for tee(1)
# LOG_FILE may be an empty string, unset, or commented out.
LOG_FILE=/home/was61/`/bin/basename $0`-${DATE}.log

# TMPDIR If set, Bash uses its value as the name of a directory in  which
#    Bash creates temporary files for the shell's use.
#    But we need to assure it exists and has write permissions.
#    The same directory as backup destination, as a last resort.
test -d ${TMPDIR:=/tmp} && test -w $TMPDIR || TMPDIR=/var/tmp
test -d $TMPDIR && test -w $TMPDIR || TMPDIR=`/usr/bin/dirname $LOG_FILE`
# Temp file for wget output
TMP_WGET=${TMPDIR}/$$.wget.tmp

# Diagnostic messages level
# 0 == be silent
# 1 == normal messages
# 2 == additional debug messages
# 3 == yet more messages
DEBUG=3

DMGR=dmgr
# Regexp used by "ps|grep" to find the dmgr process
DMGR_REGEXP='java.*ibm.*websphere.*dmgr'
####DMGR_REGEXP='bin.*httpd'

STOP_COMMAND=/opt/IBM/WebSphere/AppServer/profiles/Profile01/dmgr/bin/stopServer.sh
START_COMMAND=/opt/IBM/WebSphere/AppServer/profiles/Profile01/dmgr/bin/startServer.sh

WGET_URL='https://websphere_node:9043/ibm/console/logon.jsp'
# --spider
#    Wget will not download the pages, just check that they are there
# -T seconds, --timeout=seconds
#    Set the network timeout to seconds seconds.  This is equivalent to
#    specifying --dns-timeout, --connect-timeout, and --read-timeout,
#    all at the same time
# --retry-connrefused
#    Consider "connection refused" a transient error and try again
# -t number, --tries=number
#    Set number of retries to number
# -w seconds, --wait=seconds
#    Wait the specified number of seconds between the retrievals
WGET_OPT="--spider --timeout=10 --retry-connrefused --tries=3 --wait=5 --no-check-certificate"
if [[ $DEBUG -eq 0 ]]
then
    # -q, --quiet
    #    Turn off Wget's output
    WGET_OPT="--quiet "$WGET_OPT
elif [[ $DEBUG -eq 1 ]]
then
    # -nv, --no-verbose
    #    Turn off verbose without being completely quiet (use -q for that),
    #    which means that error messages and basic information still get
    #    printed
    WGET_OPT="--no-verbose "$WGET_OPT
elif [[ $DEBUG -ge 2 ]] # 2+
then
    # -v, --verbose
    #    Turn on verbose output, with all the available data.
    #    The default output is verbose
    # -S, --server-response
    #    Print the headers sent by HTTP servers and
    #     responses sent by FTP servers
    WGET_OPT="--verbose --server-response "$WGET_OPT
fi

# Array declaration: sources for backup,
# may be several files and/or directories
typeset -a BACKUP_SRC=( /opt/IBM/WebSphere/AppServer/profiles/Profile01 )
####typeset -a BACKUP_SRC=(/home/spk/src1 /home/spk/src2)

# Array declaration: backup exclusion patterns
# May be an empty list.
# May contain shell regexp patterns or plain strings,
# entire directory(-ies) exclusion is also possible.
####### Read "info tar", section 6.5 describes tar patterns in detail. ########
#    A PATTERN should be written according to shell syntax, using wildcard
# characters to effect globbing.  Most characters in the pattern stand
# for themselves in the matched string, and case is significant: `a' will
# match only `a', and not `A'.  The character `?' in the pattern matches
# any single character in the matched string.  The character `*' in the
# pattern matches zero, one, or more single characters in the matched
# string.  The character `\' says to take the following character of the
# pattern _literally_; it is useful when one needs to match the `?', `*',
# `[' or `\' characters, themselves.
#    Periods (`.’) or forward slashes (`/’) are not considered special
# for wildcard matches.  However, if a pattern completely matches a
# directory prefix of a matched string, then it matches the full matched
# string: excluding a directory also excludes all the files beneath it.
###############################################################################
# Log files are often very big so I added them in exclude list
typeset -a BACKUP_EXCLUDE=(/opt/IBM/WebSphere/AppServer/profiles/Profile01/Node/logs  /opt/IBM/WebSphere/AppServer/profiles/Profile01/dmgr/logs/dmgr)
####typeset -a BACKUP_EXCLUDE=(*.o *.a Makefile Makefile.am README)

# Backup destination file name (.tgz suffix and timestamp will be appended)
BACKUP_DST=/home/was61/dmgr_bkup

if [[ $DEBUG -ge 2 ]]
then
    TAR_VERBOSE="--verbose"
fi

# Start the log file (tee -a appends; the timestamped file name keeps each run separate)
echo " *************  DMGR & profile backup started (at $(date "+%d-%m-%Y %H:%M")) per OS date ********** " | $TEE $LOG_FILE
echo "----------> The Hardware Clock is ($(/sbin/hwclock)) <---------- " | $TEE $LOG_FILE

################################################################################

found=$(ps -ef | /bin/grep --invert-match grep \
    | /bin/grep --ignore-case "$DMGR_REGEXP")

if [[ -n $found ]]
then # dmgr is running
    if [[ $DEBUG -ge 1 ]]
    then
        echo "Running \"$DMGR\" found." | $TEE $LOG_FILE
        if [[ $DEBUG -ge 2 ]]
        then
            # Quote $found so that one line per matching process is counted
            count=$(echo "$found" | wc -l)
            if [[ $count -ge 2 ]]
            then
                echo "Warning: found $count matching processes." \
                    | $TEE $LOG_FILE
                if [[ $DEBUG -ge 3 ]]
                then
                    echo $'\n'"$found"$'\n' | $TEE $LOG_FILE
                fi
            fi
        fi
        echo "Stopping \"$DMGR\" server..." | $TEE $LOG_FILE
    fi

    if ! $STOP_COMMAND $DMGR
    then
        if [[ $DEBUG -ge 1 ]]
        then
            echo "Cannot stop \"$DMGR\" server." | $TEE $LOG_FILE
        fi
        exit 1
    fi
else    # dmgr is not running
    if [[ $DEBUG -ge 1 ]]
    then
        echo "\"$DMGR\" is not running." | $TEE $LOG_FILE
    fi
fi

if [[ $DEBUG -ge 1 ]]
then
    echo "Starting \"$DMGR\" server..." | $TEE $LOG_FILE
fi

if ! $START_COMMAND $DMGR
then
    if [[ $DEBUG -ge 1 ]]
    then
        echo "Cannot start \"$DMGR\" server." | $TEE $LOG_FILE
    fi
    exit 1
fi

#-------------------------------------------------------------------------------

/usr/bin/wget $WGET_OPT $WGET_URL > $TMP_WGET 2>&1
res=$?
/bin/cat $TMP_WGET | $TEE $LOG_FILE
/bin/rm -f $TMP_WGET
if [[ $res -ne 0 ]]
then
    if [[ $DEBUG -ge 1 ]]
    then
        echo "wget returned $res" | $TEE $LOG_FILE
        echo "Cannot connect to \"$WGET_URL\"" | $TEE $LOG_FILE
    fi
    exit 1
fi

#-------------------------------------------------------------------------------

if [[ $DEBUG -ge 1 ]]
then
    echo "Starting backup..." | $TEE $LOG_FILE
fi

# Separate temporary output files for tar, gzip, exclusion list...
TMP_TAR=${TMPDIR}/$$.tar.tmp
TMP_GZIP=${TMPDIR}/$$.gzip.tmp
echo -n "gzip: " > $TMP_GZIP

if [[ -n $BACKUP_EXCLUDE ]]
then # Build the exclusions file
    TMP_EXCLUDE=${TMPDIR}/$$.exclude.tmp
    TAR_EXCLUDE='--exclude-from'
    echo -n > $TMP_EXCLUDE
    for f in ${BACKUP_EXCLUDE[@]}
    do
        echo $f >> $TMP_EXCLUDE
    done
fi

# Do we need absolute names to be stored in tar?
# Tar complains and strips leading slashes.
# -P, --absolute-names
#    Don't strip leading '/'s from file names
/bin/tar c $TAR_VERBOSE $TAR_EXCLUDE $TMP_EXCLUDE --file - ${BACKUP_SRC[@]} \
    2>$TMP_TAR | /usr/bin/gzip -v9 > $BACKUP_DST-$DATE.tgz 2>>$TMP_GZIP

res=$? # $res is the exit code of the last program in the pipe, i.e. gzip; a tar failure is not detected here.
/bin/cat $TMP_TAR | $TEE $LOG_FILE
/bin/cat $TMP_GZIP | $TEE $LOG_FILE
/bin/rm -f $TMP_TAR
/bin/rm -f $TMP_GZIP
/bin/rm -f $TMP_EXCLUDE

if [[ $res -ne 0 ]]
then
    if [[ $DEBUG -ge 1 ]]
    then
        echo "Backup failed." | $TEE $LOG_FILE
    fi
    exit 1
else
    if [[ $DEBUG -ge 1 ]]
    then
        echo "Backup succeeded." | $TEE $LOG_FILE
    fi
    exit 0
fi

#-------------------------------------------------------------------------------

exit 255
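
One caveat with the tar | gzip pipeline near the end of the script: res=$? only sees gzip's exit code, so a tar failure could still be logged as a successful backup. Below is a minimal sketch, assuming bash, of how the PIPESTATUS array could be used to check both stages. The function name backup_with_status and its arguments are hypothetical and not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: archive the given paths into $1 and report a failure
# if either tar or gzip exits with a non-zero status.
backup_with_status() {
    local dst=$1
    shift

    tar -cf - "$@" 2>/dev/null | gzip -9 > "$dst"
    # PIPESTATUS is overwritten by the next command, so copy it immediately.
    local status=("${PIPESTATUS[@]}")

    if [[ ${status[0]} -ne 0 ]]
    then
        echo "tar failed with exit code ${status[0]}"
        return 1
    fi
    if [[ ${status[1]} -ne 0 ]]
    then
        echo "gzip failed with exit code ${status[1]}"
        return 1
    fi
    echo "archive written"
}
```

It could be called as backup_with_status /home/was61/dmgr_bkup.tgz /opt/IBM/WebSphere/AppServer/profiles/Profile01. Treating every non-zero tar status as fatal is a design choice; on a live profile you may decide to tolerate some tar warnings instead.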
