Managing Network Connection Breakdowns

When network outages or delays reach certain levels, some Sinch Contact Center functionality is lost or hindered. In environments where network connection breaks occur, for example when server connections cross an intercontinental WAN or in high-load situations, adjusting the IpcConnectionTimeout parameter may be necessary.

Network Outage Effects on Signaling

The table below shows how the effects accumulate as a network outage grows longer.

Table 1. Network Outage Effects

Outage in Seconds | Effects
1 | All functions are normal.
2 | Predictive outbound calls start to be abandoned because connected calls cannot be dispatched to agents within the limits set by legislation in many countries.
3 | Delays in voice dispatching start to be noticeable.
10 | Server modules consider connections to other servers lost. Once connections are re-established, some or all ongoing calls and chat discussions can be disconnected when the system tries to recover from the lost state information.
20 | The entire server is considered lost, and the system's high availability mechanisms begin to start processes and server IP addresses on backup servers on both sides of the network outage. If large enough parts of the system are still in contact, the so-called split-brain syndrome is possible. In this case, after the network is up again, certain processes that are already up and possibly serving may be shut down, which may cause loss of active operations. Otherwise, the system resumes normally as described for the 10-second outage.
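The cumulative nature of Table 1 can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name and the shortened effect descriptions are assumptions made for this example and are not part of the product.

```python
# Thresholds (in seconds) and abbreviated effects, taken from Table 1.
OUTAGE_EFFECTS = [
    (2, "Predictive outbound calls start to be abandoned"),
    (3, "Delays in voice dispatching become noticeable"),
    (10, "Server modules consider connections to other servers lost"),
    (20, "Entire server considered lost; high availability failover starts"),
]


def effects_for_outage(seconds: float) -> list[str]:
    """Return every effect whose threshold the outage has reached."""
    reached = [text for threshold, text in OUTAGE_EFFECTS if seconds >= threshold]
    return reached or ["All functions are normal"]


if __name__ == "__main__":
    for duration in (1, 3, 12, 25):
        print(duration, effects_for_outage(duration))
```

Because the effects are cumulative, a 12-second outage reports the 2-, 3-, and 10-second effects together.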

In addition to the general effects above, special situations arise if only part of the network is down. The most important special cases are listed below:

  • If clients (such as CDT) lose connection to the Connection Server (CoS), the server is considered lost after 20 seconds, and upon reconnection the ongoing call is disconnected.

  • If the connection between the database and Agent Server is lost, new logons are not possible because, for security reasons, authentication information is kept only in the database. Existing sessions are not affected, except that operations requiring the database (such as storing script results or creating a new user) fail. Once the connection is re-established, new logons can be processed normally.

  • The effects of network delays and outages between an H.323 or SIP trunk (gateway) and the H.323 Bridge or SIP Bridge are subject to the H.323 and SIP standards respectively. To avoid ill effects, delays on these connections should not exceed a few hundred milliseconds.

Registry Parameters for Connection Breakdown Management

As of SP04 Patch 1, the formerly hardcoded values for managing network connections have been made into parameters that can be adjusted in the registry. Of these parameters, IpcConnectionTimeout may be helpful in environments where network connection breaks occur, for example when server connections cross an intercontinental WAN or in high-load situations.

IpcConnectionTimeout

Timeout in seconds for determining silent loss and connection establishment. The higher the value, the longer the connection breaks that can be tolerated, but the availability of the system may be compromised.

The default value is 10 seconds; do not set the value lower than that. In environments where network disconnections cause continuous trouble, it may be advisable to set the value to 20 and then monitor whether availability remains high enough, that is, whether new requests or calls can be set up. If not, lower the value by a few seconds at a time until you find the optimum level for that environment.
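The stepwise tuning described above can be sketched as a small loop. This is a minimal illustration under stated assumptions: availability_ok is a hypothetical check supplied by the operator (for example, a manual or scripted test that new requests or calls can still be set up), and the step of 2 seconds is an arbitrary reading of "a few seconds". The product does not provide such a function.

```python
from typing import Callable

DEFAULT_TIMEOUT = 10  # documented default; the value must not go below this
START_TIMEOUT = 20    # suggested starting point in troubled environments


def tune_ipc_connection_timeout(
    availability_ok: Callable[[int], bool],
    start: int = START_TIMEOUT,
    step: int = 2,
) -> int:
    """Lower the timeout step by step until availability is acceptable.

    availability_ok(value) is a hypothetical, operator-supplied check that
    returns True when availability is high enough with the given timeout.
    """
    value = max(start, DEFAULT_TIMEOUT)
    while value > DEFAULT_TIMEOUT and not availability_ok(value):
        # Never drop below the documented minimum of 10 seconds.
        value = max(value - step, DEFAULT_TIMEOUT)
    return value
```

In practice each iteration corresponds to changing the registry value, observing the system for a while, and only then deciding whether to lower the value further.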

Enter the parameter under the module's registry key for servers that do peer-to-peer communication with other contact center servers. The registry value can be of type string or DWORD.

Note:

Only Agent Server (AS) and Connection Server (CoS) are able to start using the parameters on the fly; Call Dispatcher, Chat Server, Chat Portal Server, Data Collector, Directory Server, External Terminal Controller, H.323 Bridge, Media Routing Server, Quality Monitoring Server, and SIP Bridge must be restarted before parameter values take effect.

The parameter must be set for both directions. For example, for an AS-CoS connection, it must be set in both AS and CoS.
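As an illustration of setting the same value on both sides of a connection, the following Python sketch generates the text of a Windows .reg file that writes IpcConnectionTimeout as a DWORD under each given key. The function name is an assumption of this example, and the key paths shown are placeholders only; substitute the actual module registry keys of your installation.

```python
def reg_fragment(key_paths: list[str], timeout_seconds: int = 20) -> str:
    """Build .reg file text that sets IpcConnectionTimeout under each key."""
    if timeout_seconds < 10:
        raise ValueError("IpcConnectionTimeout must not be smaller than 10")
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key in key_paths:
        lines.append(f"[{key}]")
        # DWORD values in .reg files are written as 8 hexadecimal digits.
        lines.append(f'"IpcConnectionTimeout"=dword:{timeout_seconds:08x}')
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder key paths, NOT real product paths.
    keys = [
        r"HKEY_LOCAL_MACHINE\SOFTWARE\<vendor>\<Agent Server module key>",
        r"HKEY_LOCAL_MACHINE\SOFTWARE\<vendor>\<Connection Server module key>",
    ]
    print(reg_fragment(keys, 20))
```

Generating one fragment for both keys helps keep the two directions of a connection from drifting apart.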

The following two parameters are related to IpcConnectionTimeout but are intended for internal testing only. Do not change their values in production use:

  • IpcHeartbeatInterval: Interval in seconds. Heartbeats are sent only when no other messages have been sent recently. The default value is 3 seconds.

  • IpcConnectionRetryTime: Time in seconds waited between connection retries. The default value is 5 seconds.
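To show how a heartbeat interval and a connection timeout typically interact, here is a toy Python model of a single peer connection. It is a sketch under assumptions, not the product's implementation: the class, its methods, and the rule that silence for the full timeout means a lost peer are all hypothetical; only the default values come from this document.

```python
from dataclasses import dataclass


@dataclass
class PeerLink:
    """Toy model of one peer connection; all times are in seconds."""

    heartbeat_interval: float = 3.0   # IpcHeartbeatInterval default
    connection_timeout: float = 10.0  # IpcConnectionTimeout default
    last_sent: float = 0.0
    last_received: float = 0.0

    def record_sent(self, now: float) -> None:
        self.last_sent = now

    def record_received(self, now: float) -> None:
        self.last_received = now

    def heartbeat_due(self, now: float) -> bool:
        # A heartbeat is needed only if nothing else was sent recently.
        return now - self.last_sent >= self.heartbeat_interval

    def is_lost(self, now: float) -> bool:
        # Assumed rule: silence for the full timeout means the peer is lost.
        return now - self.last_received >= self.connection_timeout
```

With the defaults, roughly three heartbeats can go missing before the timeout expires, which is one plausible reason to keep the timeout well above the heartbeat interval.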