Lync 2010 Gateway Timeout Call Failures Post CU4 Deployment

I came across this issue some time ago and wanted to write down the particulars so they are easy to find. Hopefully this helps someone else who runs into it.

As the KB article below describes, a 10-second timer was added to the Outbound Routing (OBR) component to allow faster recovery from gateway failures. The timer was introduced with the CU4 update. The Mediation Server expects a session progress message from the gateway before the 10-second timer expires.

http://support.microsoft.com/default.aspx?scid=kb;EN-US;2565378

“This article describes an update that adds 10-second timers to Microsoft Lync Server 2010 Outbound Routing (OBR) components. The timers are used to improve efficiency when the OBR components route Public Switched Telephone Network (PSTN) calls. The timers start when the OBR components send INVITE requests to PSTN calls. If the OBR components do not receive a PSTN gateway response within 10 seconds, the OBR components route the calls to other PSTN gateways.”

For most partner gateways this requirement is not an issue, because they return a session progress message regardless of the actual state of the call. Other qualified gateways (in my case a Cisco ISR) only send a progress message after the PSTN carrier or PBX has acknowledged call progress. For international calls, calls to cell phones and so on, carrier acknowledgement can take more than 10 seconds, and the call fails if the 10-second timer expires first.
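To make the failure condition concrete, here is a minimal sketch of the timing comparison described above. It is only a simplified model, not Lync's actual implementation; the function name `call_outcome` and its return strings are made up for illustration.

```python
def call_outcome(progress_delay_s: float, failover_timeout_ms: int = 10000) -> str:
    """Model whether the gateway's progress message beats the OBR timer.

    progress_delay_s    -- seconds until the gateway sends a progress message
                           (hypothetical input, e.g. measured from a trace)
    failover_timeout_ms -- timer value in milliseconds (10000 = 10 seconds)
    """
    if progress_delay_s * 1000 < failover_timeout_ms:
        return "progress received"
    # The timer fired first, so the call is not completed through this gateway.
    return "timer expired"


if __name__ == "__main__":
    # A gateway that waits 12 seconds for carrier acknowledgement:
    print(call_outcome(12))          # with the default 10-second timer
    print(call_outcome(12, 15000))   # with the timer raised to 15 seconds
```

With a 12-second carrier delay the default timer expires first, while a 15-second timer leaves enough room for the progress message to arrive.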

Solution

You can change the 10-second timer. The configurable parameters are in the file 'OutboundRouting.exe.config'. Changing values in this file is normally not supported, and unless you're experiencing this issue the value should not be altered. The file can be found on the front end server:

C:\Program Files\Microsoft\Lync Server 2010\Server\Core

Change this line:

<add key="FailOverTimeout" value="10000"/>

10000 represents 10 seconds (the value is in milliseconds). Most deployments should be happy with a value of 15-20 seconds. See the example below for 15 seconds:

<configuration>
    <appSettings>
      <add key="FailOverTimeout" value="15000"/>
      <add key="MinGwWaitingTime" value="1"/>
      <add key="MaxGwWaitingTime" value="20"/>
      <add key="FailuresForGatewayDown" value="10"/>
      <add key="FailuresForGatewayLessPreferred" value="25"/>
      <!-- Valid values are between 5 and 600 -->
      <add key="HealthMonitoringInterval" value="300"/>
      <!-- Valid values are between 60 and 3600 -->
      <add key="GatewayStateReportingInterval" value="1800" />
    </appSettings>
</configuration>
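If you would rather script the edit than change the file by hand, the sketch below shows one possible way to do it. It is not a Microsoft tool: the function name `set_failover_timeout` is hypothetical, and it assumes the `FailOverTimeout` line looks like the one shown above. It uses a plain text substitution so the rest of the file, including the comments, is left untouched. Back up the original file before trying anything like this.

```python
import re

# Matches the FailOverTimeout entry as it appears in the example config above.
_FAILOVER_LINE = re.compile(r'(<add\s+key="FailOverTimeout"\s+value=")(\d+)("\s*/>)')


def set_failover_timeout(config_text: str, timeout_ms: int) -> str:
    """Return config_text with the FailOverTimeout value replaced by timeout_ms."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be a positive number of milliseconds")
    new_text, count = _FAILOVER_LINE.subn(
        lambda m: f"{m.group(1)}{timeout_ms}{m.group(3)}", config_text
    )
    if count != 1:
        # Refuse to guess if the key is missing or appears more than once.
        raise ValueError(f"expected exactly one FailOverTimeout entry, found {count}")
    return new_text


if __name__ == "__main__":
    sample = '<appSettings>\n  <add key="FailOverTimeout" value="10000"/>\n</appSettings>'
    print(set_failover_timeout(sample, 15000))
```

To apply it to the real file you would read OutboundRouting.exe.config into a string, pass it through the function, and write the result back. The front end service still needs a restart afterwards.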

You will need to restart the front end service after the change for the new value to take effect.

This is a workaround. Future CUs may change this value back or otherwise alter it, so it's something to be aware of when performing maintenance upgrades.

VoIPNorm
