Author Topic: Resource Manager  (Read 3973 times)

Offline smile

  • Sr. Member
  • ****
  • Posts: 286
  • Karma: 6
Resource Manager
« on: December 13, 2012, 10:29:02 PM »
Hello all

How is your RM behaving? I'm about to open my third SR this week... It has crashed at different sites, under different circumstances. Several times it crashed with core dumps, and yesterday tier 3 said they had found the root cause. I hope they really did...
I'm surprised that such an important component has so many serious bugs! Does anyone else have the same trouble with it? What version are you using? Mine is 8.1.502.82 on Linux.
With best regards and so on

Offline Kubig

  • Hero Member
  • *****
  • Posts: 2508
  • Karma: 41
Re: Resource Manager
« Reply #1 on: December 13, 2012, 11:00:23 PM »
We are using RM version 8.1.502.86 on Windows and everything works fine; we have not registered any problems. RM provides all services: voicexml, cpd, media, annc, recordingclient. We prefer and recommend running RM with TCP as the transport protocol. So I am not aware of any bug in our production environment. GVP serves more than 30k calls daily.

We registered a few issues during deployment and the first days of use, but RM has never crashed. The issues were with RM failing to find the configured LRG.

What problems/issues have you registered?
Genesys certified professional consultant (GVP, SIP, GIR and Troubleshooting)

Offline Timur Karimov

  • Sr. Member
  • ****
  • Posts: 415
  • Karma: 2
Re: Resource Manager
« Reply #2 on: December 14, 2012, 03:08:57 AM »
Hi there

My suggestion: check the cache settings. It's very sensitive to the relation between cache size and prompt file size.

WBR Timur

Offline smile

  • Sr. Member
  • ****
  • Posts: 286
  • Karma: 6
Re: Resource Manager
« Reply #3 on: December 14, 2012, 06:14:47 AM »
Do you mean RM or MCP? RM doesn't process prompts and doesn't work with a cache...
With best regards and so on

Offline smile

  • Sr. Member
  • ****
  • Posts: 286
  • Karma: 6
Re: Resource Manager
« Reply #4 on: December 14, 2012, 06:45:18 AM »

Hi Kubig,

Your version looks better, even though there is this:

Resource Manager now no longer terminates unexpectedly while closing a SIP Transaction. Previously, this situation sometimes caused an assert in the stack. (ER# 300007729)

and this fix from the previous release, 8.1.502.85:
Now Resource Manager does not terminate due to the following race condition: one thread detects a socket read failure and closes the socket, while another thread sends a message on the same socket at the same time. (ER# 305054141)

and this...:

Resource Manager (RM) now correctly handles an invalid action that it receives in messages from another RM node in High Availability. Previously, RM could terminate in this situation. (ER# 304433067)

??? Do I need to upgrade?

The "funny" thing is that we had to upgrade from the previous version (I don't remember which one) because of this bug:

Resource Manager now no longer terminates unexpectedly while closing a SIP Transaction. Previously, this situation sometimes caused an assert in the stack. (ER# 300007729)

The funny part is that the SIP Server version installed alongside that RM had another bug: when the MSML service became unavailable and someone put a call on hold, SIP Server terminated unexpectedly. I think you can guess why I know so much about these details...

Well, that is a thing of the past. Today I have the following complaints:

1. The active RM sends new call information to the backup and gets a response before it can update its own transaction record. Engineering believes this is a rare situation, but during the last ~6 months I hit it 3 or 4 times. I think you will see a new release note about it. The bad thing is that RM crashes with a core dump in this case.

2. Sometimes (I don't know why, but the last time it occurred immediately after item 1) the nodes in the RM cluster go out of sync. The interesting thing is that both nodes see each other and send/receive HB messages (I suppose), but there are no inter-node updates with the current status. The nodes may work this way for several days (my observation: 4 days). After that, see item 3.

3. Sometimes (the last time it occurred 4 days after item 2 ;) ) the primary node receives an INVITE and... that's all. No other messages are sent or received. BUT the backup node "thinks" that the primary is in service! Moreover, because of item 2 (or something else), if you kill the primary node, the secondary does not become primary =( Epic fail. During the last 2 days I saw it twice at different sites.

I know it looks strange, and it works well in the lab, but in production it's crazy %)
Looking at release notes with fixes described as "in very rare cases" and "RM can become unstable"... I believe this is not a well-prepared product.
Sorry for the emotions. If I could throw RM away, I would. In 7.6 it was a non-mandatory component... good times.
With best regards and so on
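
Items 2 and 3 above describe a node that still exchanges heartbeats but has stopped sharing status or doing useful work. One generic way to catch that, sketched here in Python purely as an illustration and not as a description of how RM works internally, is to judge a peer by two timestamps instead of one: the last heartbeat and the last inter-node status update. `PeerHealth`, its fields and both timeout values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerHealth:
    """Hypothetical view of an HA peer as seen from the other node."""
    last_heartbeat: float     # time (seconds) of the last heartbeat received
    last_state_update: float  # time (seconds) of the last status update received

    def status(self, now, hb_timeout=5.0, sync_timeout=60.0):
        # No heartbeats at all: the peer is considered down.
        if now - self.last_heartbeat > hb_timeout:
            return "down"
        # Heartbeats arrive but status updates do not: alive, yet stale.
        if now - self.last_state_update > sync_timeout:
            return "out-of-sync"
        return "in-service"
```

With a check like this, a peer that only answers heartbeats would be reported as "out-of-sync" rather than "in-service". On a quiet system a long gap between status updates can be normal, so the second timeout would need tuning.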

Offline Kubig

  • Hero Member
  • *****
  • Posts: 2508
  • Karma: 41
Re: Resource Manager
« Reply #5 on: December 14, 2012, 08:49:32 AM »
We are using an RM version older than 8.1.5x, on the TDA platform (8.1.4x), and we have not registered the same issues as you, but we have not yet deployed RM as an HA pair. That is in progress, so wish me luck :-) I believe that HA "mode" for RM can bring other problems and issues.
As I wrote, we registered some issues where RM returns a 4xx SIP message saying "Cannot allocate LRG for service servicetype". However, I think that after deploying RM in HA I will not be as optimistic as I am now.
Genesys certified professional consultant (GVP, SIP, GIR and Troubleshooting)

Offline smile

  • Sr. Member
  • ****
  • Posts: 286
  • Karma: 6
Re: Resource Manager
« Reply #6 on: December 14, 2012, 03:03:55 PM »
Oh, I'm not alone :D We already have another SR about "Cannot allocate LRG for service servicetype". Genesys said that we probably don't have a recording LRG configured %) LOL, the investigation is in progress...
So have you solved it?
With best regards and so on

Offline Kubig

  • Hero Member
  • *****
  • Posts: 2508
  • Karma: 41
Re: Resource Manager
« Reply #7 on: December 14, 2012, 10:09:47 PM »
Our issue with incorrect allocation of the configured LRG was caused by bad configuration. When we changed the transport protocol from UDP to TCP, we forgot to change all the required sections. RM had a problem monitoring the LRGs, and in the logs I could see that the status of all LRGs was "offline". Since cleaning up the configuration we have not registered the same issue again. But unfortunately I don't think it was the last issue...
Genesys certified professional consultant (GVP, SIP, GIR and Troubleshooting)
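
A half-finished UDP-to-TCP switch like the one described above is the kind of mistake a small consistency check can catch before restart. A minimal sketch in Python, assuming the configuration has already been exported into a plain dictionary of sections; the section names and the option name `transport` are placeholders, not real Genesys option names.

```python
def find_transport_mismatches(sections, expected="tcp"):
    """Return the names of sections whose transport differs from `expected`.

    `sections` maps a section name to a dict of options. Sections without
    a "transport" option (a placeholder name) are skipped.
    """
    mismatches = []
    for name, options in sorted(sections.items()):
        value = options.get("transport")
        if value is not None and value.lower() != expected:
            mismatches.append(name)
    return mismatches


# Hypothetical exported configuration, for illustration only.
config = {
    "proxy": {"transport": "TCP"},
    "monitor": {"transport": "UDP"},  # forgotten during the switch
    "log": {"verbose": "standard"},   # no transport option, skipped
}
print(find_transport_mismatches(config))  # prints ['monitor']
```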

Offline Timur Karimov

  • Sr. Member
  • ****
  • Posts: 415
  • Karma: 2
Re: Resource Manager
« Reply #8 on: December 14, 2012, 11:26:33 PM »
Yep, you're right. I read one thread and answered in another =( I need more sleep and less work. But I still have a problem with MCP in my own project =(
Sorry.
WBR Timur

Offline Timur Karimov

  • Sr. Member
  • ****
  • Posts: 415
  • Karma: 2
Re: Resource Manager
« Reply #9 on: December 14, 2012, 11:32:57 PM »

Quote from smile:
3. Sometimes (the last time it occurred 4 days after item 2 ;) ) the primary node receives an INVITE and... that's all. No other messages are sent or received. BUT the backup node "thinks" that the primary is in service! Moreover, because of item 2 (or something else), if you kill the primary node, the secondary does not become primary =( Epic fail. During the last 2 days I saw it twice at different sites.

We have a similar issue on Windows 2008 x64 with RM version 8.1.502.80. We got this response from Genesys T3: "Looks like it's a SIP Server problem, so we are opening a new SR against SIP and will continue investigating". As for my next question, why service came back after a complete restart of GVP... still waiting for the answer =(

WBR Timur

Offline smile

  • Sr. Member
  • ****
  • Posts: 286
  • Karma: 6
Re: Resource Manager
« Reply #10 on: February 06, 2013, 06:28:06 PM »
One piece of good news today: the new RM 8.1.602.01 has a fix for one of our issues:
The Resource Manager no longer terminates when it synchronizes data with its HA counterpart and the response comes back before the sending RM can update its transaction record. (ER# 314483785)

Work on the other issues is still in progress...
With best regards and so on
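
The race in that release note can be read as an ordering problem: if the sender writes its transaction record only after sending, a fast reply can arrive while no record exists yet. A minimal sketch in Python of the usual remedy (record first, send second, and tolerate unknown replies); this is not Genesys code, and `TransactionTable`, `send_sync` and `handle_response` are hypothetical names.

```python
import threading


class TransactionTable:
    """Hypothetical in-memory table of pending HA sync transactions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}

    def register(self, txn_id, call_info):
        with self._lock:
            self._pending[txn_id] = call_info

    def complete(self, txn_id):
        # Return the stored record, or None if the transaction is unknown.
        with self._lock:
            return self._pending.pop(txn_id, None)


def send_sync(table, txn_id, call_info, transport):
    # Record first, send second: a reply that arrives immediately
    # still finds its transaction record.
    table.register(txn_id, call_info)
    transport(txn_id, call_info)


def handle_response(table, txn_id):
    record = table.complete(txn_id)
    if record is None:
        # Unknown transaction: report it instead of asserting and crashing.
        return "ignored"
    return "synced"
```

Here `transport` is any callable that delivers the message. Even if it triggers `handle_response` before `send_sync` returns, the record is already in the table.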