WebSphere: Plug-in Workload Management Failover
We run a two-node WebSphere 6.x vertical cluster. On a number of occasions we have seen the system go down and come back up in less than 5 minutes.
Investigation:
WebSphere uses the session ID (the JSESSIONID cookie) to route each user session to the JVM that owns it. The plug-in keeps track of the status of the JVMs (up/down/hung). In the situation we had, the HTTP plug-in should have directed user sessions to the other JVM (JVM2 here), but I think it didn’t. To make it do so, we need to configure the “ConnectTimeout” parameter, which forces the plug-in to give up on an unresponsive server and look for another one.
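To make the routing idea concrete, here is a minimal sketch of session affinity with failover. It is not the plug-in's actual code. It assumes the session cookie value carries one or more clone IDs after a colon (matching the CloneID attribute in the plug-in configuration), and the function names, the second clone ID and the up/down table are hypothetical.

```python
def split_session_cookie(cookie_value):
    """Split a JSESSIONID-style value into (session part, [clone IDs])."""
    session_part, *clone_ids = cookie_value.split(":")
    return session_part, clone_ids


def route(cookie_value, servers_by_clone_id, is_up):
    """Pick a server name for a request, or None if nothing is available.

    servers_by_clone_id: dict mapping CloneID -> server name
                         (dict order doubles as the fallback order).
    is_up:               dict mapping server name -> bool.
    """
    _, clone_ids = split_session_cookie(cookie_value)
    # Affinity first: honour the clone IDs carried by the cookie, in order.
    for clone_id in clone_ids:
        server = servers_by_clone_id.get(clone_id)
        if server is not None and is_up.get(server, False):
            return server
    # Failover: fall back to any cluster member currently marked up.
    for server in servers_by_clone_id.values():
        if is_up.get(server, False):
            return server
    return None


# Hypothetical two-member cluster; "10k66dxyz" is a made-up clone ID.
SERVERS = {"10k66djk2": "JVM1", "10k66dxyz": "JVM2"}
```

With both JVMs up, a cookie ending in `:10k66djk2` stays on JVM1. Once JVM1 is marked down, the same cookie is routed to JVM2, which is the behaviour we expected but did not get.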
“ConnectTimeout” makes the plug-in use a non-blocking connect.
Setting ConnectTimeout to zero (the default here) is equivalent to not specifying the ConnectTimeout attribute: the plug-in performs a blocking connect and waits until the operating system times out (on Linux it can take 5-10 minutes for the socket to time out).
ConnectTimeout
The ConnectTimeout attribute of a Server element enables the HTTP plug-in to perform non-blocking connections with a backend cluster member. Non-blocking connections are beneficial when the HTTP plug-in is unable to contact the destination to determine if the port is available or unavailable for a particular cluster member.
If no ConnectTimeout value is specified, the HTTP plug-in performs a blocking connect in which the HTTP plug-in sits until an operating system TCP timeout occurs (as long as 2 minutes depending on the platform) and allows the HTTP plug-in to mark the cluster member unavailable. A value of 0 causes the HTTP plug-in to perform a blocking connect. A value greater than 0 specifies the number of seconds you want the HTTP plug-in to wait for a successful connection. If a connection does not occur after that time interval, the HTTP plug-in marks the cluster member unavailable and fails over to one of the other cluster members defined in the cluster.
Caution: In an environment with busy workload or a slow network connection, setting this value too low could make the HTTP plug-in mark a cluster member down falsely. Therefore, caution should be used whenever choosing a value for ConnectTimeout.
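The difference between a blocking connect and a connect with a timeout can be sketched in a few lines of Python. This only illustrates the behaviour described above; it is not how the plug-in is implemented. The function names and the member tuples are assumptions, and a value of 0 is mapped to a fully blocking connect to mirror the ConnectTimeout semantics.

```python
import socket


def try_connect(host, port, connect_timeout):
    """Return a connected socket, or None if the member cannot be reached.

    connect_timeout == 0 mimics the blocking case: the call waits for the
    operating system's own TCP timeout. Any value > 0 is a limit in seconds.
    """
    timeout = None if connect_timeout == 0 else connect_timeout
    try:
        return socket.create_connection((host, port), timeout=timeout)
    except OSError:  # connection refused, host unreachable, or timed out
        return None


def connect_with_failover(members, connect_timeout):
    """Try each (name, host, port) in order.

    Returns (name, socket, marked_down), where marked_down lists the members
    that could not be reached before a connection succeeded.
    """
    marked_down = []
    for name, host, port in members:
        sock = try_connect(host, port, connect_timeout)
        if sock is not None:
            return name, sock, marked_down
        marked_down.append(name)
    return None, None, marked_down
```

Called with something like `connect_with_failover([("JVM1", "server1.domain.com", 9091), ("JVM2", "server2.domain.com", 9091)], 10)` (the second host is hypothetical), an unresponsive JVM1 costs at most 10 seconds before JVM2 is tried, instead of the minutes an OS-level timeout can take.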
Set the “ConnectTimeout” attribute to an integer value greater than zero to determine how long the plug-in should wait for a response when attempting to connect to a server. A setting of 15 means that the plug-in times out after 15 seconds rather than after the 5-10 minutes imposed by the OS settings. The example below uses 10 seconds:
<Server CloneID="10k66djk2" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="1000" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
<Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
</Server>
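To check which cluster members are still on the blocking default, a small script can scan the plug-in configuration for Server elements whose ConnectTimeout is missing or 0. The following is a rough sketch. The function name, the sample cluster and the second server entry are made up for illustration, and it assumes the servers are defined as direct children of a ServerCluster element.

```python
import xml.etree.ElementTree as ET


def servers_with_blocking_connect(plugin_cfg_xml):
    """Return names of <Server> definitions with ConnectTimeout missing or 0."""
    root = ET.fromstring(plugin_cfg_xml)
    blocking = []
    for cluster in root.iter("ServerCluster"):
        # Only direct children: skips name-only <Server> references that may
        # appear nested elsewhere in the cluster (e.g. in a server list).
        for server in cluster.findall("Server"):
            value = server.get("ConnectTimeout")
            if value is None or value.strip() == "0":
                blocking.append(server.get("Name", "<unnamed>"))
    return blocking


# Hypothetical configuration fragment; only the first server is from the post.
SAMPLE = """
<Config>
  <ServerCluster Name="SampleCluster">
    <Server CloneID="10k66djk2" ConnectTimeout="10" Name="Server1_WebSphere_Appserver">
      <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
    </Server>
    <Server CloneID="10k66dxyz" Name="Server2_WebSphere_Appserver">
      <Transport Hostname="server2.domain.com" Port="9091" Protocol="http"/>
    </Server>
  </ServerCluster>
</Config>
"""
```

Running `servers_with_blocking_connect(SAMPLE)` reports only the second server, since it has no ConnectTimeout attribute and would therefore fall back to the OS timeout.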