Author Topic: Resource Manager HA Configuration (Read 6852 times)

rkd · « **on:** March 04, 2014, 05:10:49 PM »

In my environment we run two RMs on two VMware Linux machines in the same DC, in active-standby mode.

Currently one RM is configured as primary and the other as backup, and the VIP points to the primary only. Whenever a run-mode change happens, the VIP is supposed to point to whichever RM is primary at that time. But most of the time the VIP does not get pointed correctly and calls fail. Genesys asked us to remove the primary-backup configuration, saying it will then work smoothly.

But when I do that, both RMs show as primary in SCS, and the run-mode changes (5150/5151) that initiate the VIP failover scripts no longer happen.

Does anyone use RMs in active-standby mode with a VIP? I would like to know which alarm codes we need to watch to initiate the VIP failover. I have gone through the Genesys voice manuals and nowhere do they give a clear picture of RM failover.

Kubig · « **Reply #1 on:** March 04, 2014, 05:35:58 PM »

I have been using RM in active-standby mode on Linux at several customer environments, and each installation used the bonding mechanism. Since GVP 8.1.5+ I have not seen any issue related to HA. The other thing is that the roles shown in SCI are irrelevant, because the role in SCI does not match the physical role within the RM cluster: in active-standby mode, if the master RM (the one with the higher ID) is running, it will always be active.

Did you check if your reaction scripts were started during the failover of the RMs?

rkd · « **Reply #2 on:** March 04, 2014, 07:41:53 PM »

Thanks for the response.

We are not using bonding failover; simple VIP failover is used here.

During my testing, I brought RM_P down in SCS to check that the VIP would move to RM_B, and it did.

I have created alarm reaction scripts (to initiate VIP failover) for the 5150, 5064 and 5091 alarms.
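For illustration, here is a minimal sketch of the VIP part of such a reaction script, written in Python. The function name `vip_commands`, the interface `eth0` and the /24 prefix length are assumptions, not values from this environment. It only builds the `ip` and `arping` command lines so they can be reviewed; nothing is executed, and which alarm should trigger which action is left to the reaction configuration.

```python
def vip_commands(action, vip, device="eth0", prefix=24):
    """Return the command lines that add or remove a virtual IP on this host.

    device and prefix are assumed defaults; adjust them to the real network.
    """
    if action == "up":
        return [
            ["ip", "addr", "add", f"{vip}/{prefix}", "dev", device],
            # Gratuitous ARP so neighbours learn which host now owns the VIP.
            ["arping", "-U", "-I", device, "-c", "3", vip],
        ]
    if action == "down":
        return [["ip", "addr", "del", f"{vip}/{prefix}", "dev", device]]
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in vip_commands("up", "207.88.61.109"):
        print(" ".join(cmd))
```

Keeping the command construction separate from execution makes it easy to log exactly what a reaction did when a failover goes wrong.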

But when I try to restart RM_P, it does not start up.

I see the following in the logs.

2014-03-04T13:29:46.956 Std 20142 WARN 00000000-00000000 4157245136 09400333 RM Startup
2014-03-04 13:29:46.956 DBUG 00000000-00000000 4157245136 08500000 CCIMFConfigImpl.cpp:3327 RegisterNotification >> Registered handler for attribute [cluster.periodic-data-update].
2014-03-04T13:29:46.961 Std 20007 EROR 00000000-00000000 4157245136 09400201 NIC link detection unavailable - Device [eth1] cannot be initialized - ioctl() call failed [-1]
2014-03-04 13:29:46.961 DBUG 00000000-00000000 4157245136 09400901 RMCommTCPBonding.cxx:80 NIC link detection service unavailble.
2014-03-04 13:29:46.961 DBUG 00000000-00000000 4157245136 09400901 RMCommTCPBonding.cxx:112 636420160 Opening Server socket connection
2014-03-04T13:29:46.961 Std 20003 WARN 00000000-00000000 4157245136 08500000 VGSocket::SocketCreateOS socket not bound
2014-03-04T13:29:46.961 Std 20009 EROR 00000000-00000000 4157245136 09400203 636420160 Could not create the server socket at port 5070
2014-03-04T13:29:46.961 Std 20006 CRIT 00000000-00000000 4157245136 09400101 CCPRM::Initialize failed
2014-03-04T13:29:46.961 Std 20143 WARN 00000000-00000000 4157245136 09400334 RM Shutdown

Yes, the reaction scripts are running, but because of the above error RM_P is not starting up.

Kubig · « **Reply #3 on:** March 05, 2014, 05:37:47 AM »

Are the RMs in HA at the Genesys level? There are no switchover messages in the logs. I strongly recommend using the bonding mechanism for the RM cluster. I know it does not seem to make sense, but trust me, it works more reliably than IP takeover.

According to the log, it seems the NIC parameters on the RM object in CME are misconfigured: it looks like the RM cannot detect the NIC state.
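The two errors in the log (device `eth1` cannot be initialized, and the server socket at port 5070 cannot be created) can be checked outside RM. Below is a minimal Python sketch with hypothetical helper names `interface_exists` and `can_bind`. It uses only the standard `socket` module and is a diagnostic aid, not how RM itself performs these checks.

```python
import socket


def interface_exists(name):
    """True if the OS currently lists a network interface with this name."""
    return name in {ifname for _, ifname in socket.if_nameindex()}


def can_bind(host, port):
    """True if a TCP server socket can be bound to host:port right now.

    A False result covers both "port already in use" and
    "address is not configured on this host".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    print("eth1 present:", interface_exists("eth1"))
    print("port 5070 bindable on all addresses:", can_bind("0.0.0.0", 5070))
```

If the interface named in the RM options is not present on the host, or the bind fails for the configured address, the startup failure is a host or configuration problem rather than an HA problem.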

rkd · « **Reply #4 on:** March 05, 2014, 05:31:46 PM »

RM HA is currently controlled by SCS alarm reactions (5150/5151), and we are trying to change that as per the Genesys recommendation.

Yes, I am trying to convince the client to use bonding failover as well, and have yet to hear back. But we are using only one NIC per host.

I found one error in the NIC configuration, but the manual describes the NIC options only for bonding failover, which makes it hard to correct. I will do more testing today and post the result. Thank you for your response.

rkd · « **Reply #5 on:** March 05, 2014, 08:35:20 PM »

Hi

After I corrected the NIC and port number, I am able to start the application. But I am still not sure whether I am doing the switchover the correct way. I am continuing with VIP failover only until I get a go-ahead from the client.

I have set up alarm reactions to trigger the VIP change for alarms 5064 and 5091. I am not sure this is the correct way to switch over when an RM fails. Please advise. I have added the cluster and gvp options below.

[cluster]
electiontimer=3000
failoverscript=/usr/bin/GCTI/a_LAB_rm_p/bin/NLB.bat
ha-mode=active-standby
heartbeattimer=2000
hotstandby=TRUE
initial-electiontimer=10000
member.1=207.88.61.100:1500
member.2=207.88.61.101:1500
members=1 2
mymemberid=2
networkrecoverytime=5000
virtual-ip=207.88.61.109
virtual-ip-in-via=true

[gvp]
nic.eth0=
nic.eth0:2=
nic.linkattribute=MII Status:
nic.upvalue=up
nics=0
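A quick way to sanity-check options like these before loading them is to parse them and compare the pieces against each other. The Python sketch below is only an illustration: the function name `summarize_cluster` and the checks it performs are assumptions, and the "preferred active" rule simply encodes the earlier remark in this thread that the member with the higher ID is the active one when it is running.

```python
import configparser

SAMPLE = """
[cluster]
ha-mode=active-standby
member.1=207.88.61.100:1500
member.2=207.88.61.101:1500
members=1 2
mymemberid=2
virtual-ip=207.88.61.109
"""


def summarize_cluster(text):
    """Parse a [cluster] section and report basic consistency facts."""
    # Split on '=' only, so values and keys containing ':' stay intact.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    cluster = parser["cluster"]

    member_ids = [int(m) for m in cluster["members"].split()]
    missing = [m for m in member_ids if f"member.{m}" not in cluster]
    members = {m: cluster[f"member.{m}"] for m in member_ids if m not in missing}
    my_id = int(cluster["mymemberid"])
    member_hosts = {addr.rsplit(":", 1)[0] for addr in members.values()}

    return {
        "members": members,
        "missing": missing,                      # IDs listed without a member.N entry
        "my_id": my_id,
        "my_id_listed": my_id in member_ids,
        "preferred_active_id": max(member_ids),  # per the higher-ID remark above
        "vip_collides": cluster["virtual-ip"] in member_hosts,
    }


if __name__ == "__main__":
    for key, value in summarize_cluster(SAMPLE).items():
        print(key, "=", value)
```

Running the same check against the options of both RMs makes it easy to spot a duplicated `mymemberid` or a member entry that is missing on one side.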

Kubig · « **Reply #6 on:** March 06, 2014, 07:24:36 AM »

How are you doing the switchover of the RM?

rkd · « **Reply #7 on:** March 06, 2014, 10:06:31 PM »

In the current environment it is handled by SCS, as the backup RM is assigned as backup in the primary RM's configuration.

I am trying to break this and need the cluster to take care of the switchover.

Is there any way to switch over the RMs other than alarm reactions when I configure them in active-standby mode in a cluster?

And what is the benefit of using bonding failover instead of VIP failover when I have only one NIC in the VMware host?

Kubig · « **Reply #8 on:** March 07, 2014, 07:18:25 AM »

Of course it is handled by SCS, there is no other way :-) My question was whether you were trying to make the switchover via the "switchover" function in SCI, or by stopping/killing the primary RM's process.
The advantage of the bonding mechanism is that it gives a better way to monitor NIC status. It is true that using bonding on a server with only one NIC does not seem to make sense, but it is a well-tested, widely used and recommended mode. As I wrote, I always use this mode and have never seen any issue related to switchover.

rkd · « **Reply #9 on:** March 07, 2014, 04:17:59 PM »

Hey

Thanks. I am sorry that I didn't understand your question. I tested both conditions, killing the process on the host and stopping the RM in SCI. Both conditions were covered by my alarm reaction scripts, which point the VIP to the backup, so it was working fine.

But I am confused about what will happen when SCS loses its connection with the primary RM host, either by losing LCA or by losing the host itself. In that scenario we can't run the VIP-down script on the primary, so how should we handle this?

I am making the necessary changes to do bonding failover now.

Kubig · « **Reply #10 on:** March 10, 2014, 08:24:56 AM »

If you want to cover all possible failure scenarios, you have to deploy an external load balancer such as BIG-IP from F5.
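One way to read this advice: an external device probes each RM itself, so it does not depend on SCS or LCA still being able to reach the failed host. The Python sketch below illustrates that idea only. The names `is_reachable` and `pick_active` are hypothetical, the addresses and port are placeholders, and a plain TCP connect is assumed to be a good enough health signal, which a real load balancer would likely replace with a protocol-level check.

```python
import socket


def is_reachable(host, port, timeout=1.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_active(nodes, probe=is_reachable):
    """Return the first (host, port) in priority order that answers the probe."""
    for host, port in nodes:
        if probe(host, port):
            return (host, port)
    return None  # no node answered


if __name__ == "__main__":
    # Placeholder addresses, listed in priority order (primary first).
    rms = [("192.0.2.10", 5060), ("192.0.2.11", 5060)]
    print("send traffic to:", pick_active(rms))
```

Because the decision is made from outside the RM hosts, a dead primary simply stops answering the probe and traffic moves to the backup without any script having to run on the failed machine.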

Genesys CTI User Forum
