Monday, November 06, 2006

Cellular-IP Friendly Apps - Retrying Socket Opens

Most industrial applications allow the user to set a slow poll rate, such as one poll every 5 minutes. This allows a user to budget a cell plan at 5 MB per month and be reasonably confident of not going over. Unfortunately, this steady-state poll rate has nothing to do with how often the application tries to open the TCP/IP socket in the first place!

If the remote device is powered down or the TCP socket open fails for any reason, most applications will attempt to reopen the TCP socket continuously. On Ethernet this may make sense; the more frequently the open is retried, the sooner the failed connection will recover. Most Ethernet-based applications will retry opening a TCP socket every 5 to 30 seconds forever. However, on cellular you are paying for all traffic entering the cellular system. It is not Cingular's or Sprint's or Verizon's fault that your remote device is off-line. You will be billed for each and every TCP retry. I have literally seen applications create up to 1000 MB of traffic each day attempting to reopen a TCP socket to an unreachable remote IP. On a 5 MB per month plan, a single day of this means roughly 1 GB of overage, which could easily cost you $1000 or more for the month!

Recommendation: all applications must include a user-settable option to delay attempts to reopen TCP sockets. This value can default to no-delay, but users must be able to set a delay of at least 1 hour between retries. This enables the user to define and stay within their data usage budget regardless of success or failure of the TCP connection.
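As a rough illustration of this recommendation, here is a minimal Python sketch of a socket open with a user-settable retry delay. The function name open_with_retry_delay and its parameters are hypothetical, invented for this example; only socket.create_connection and time.sleep are real standard-library calls. The connect and sleep arguments exist purely so the logic can be exercised without a network.

```python
import socket
import time


def open_with_retry_delay(host, port, retry_delay_s=0.0, connect_timeout_s=20.0,
                          max_attempts=None,
                          connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP socket, waiting retry_delay_s seconds after each failed open.

    retry_delay_s=0 keeps the usual retry-immediately behavior (the suggested
    default); a cellular user might set 300 (5 minutes) or 3600 (1 hour).
    max_attempts=None retries forever. Returns the connected socket, or None
    once max_attempts opens have failed.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect((host, port), timeout=connect_timeout_s)
        except OSError:
            # The open failed (timeout, refused, unreachable, ...).
            if max_attempts is not None and attempt >= max_attempts:
                return None
            # Wait out the user-configured delay before the next open.
            sleep(retry_delay_s)
```

A cellular user might call this as open_with_retry_delay("192.0.2.10", 502, retry_delay_s=3600), where the address is a placeholder and 502 is the standard Modbus/TCP port. An Ethernet user would simply leave retry_delay_s at its default of zero.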

Impact: On Ethernet this should have no direct consequences since the recommended default is no delay. Cellular users must adjust this retry delay to match their data traffic expectations and their cell plan budget.

For example, an application polling 10 Modbus registers every 5 minutes via TCP/IP creates about 198 bytes per poll. At 12 polls per hour this works out to 2376 bytes per hour, or a little under 2 MB per month. This is a very safe poll rate when paying for a 5 MB per month plan.
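The arithmetic behind this budget can be checked with a few lines of Python. The 198 bytes per poll comes from the example above; the 30-day month and the function names are assumptions made for this sketch.

```python
BYTES_PER_POLL = 198        # from the example: 10 Modbus registers over TCP/IP
POLL_INTERVAL_S = 5 * 60    # one poll every 5 minutes


def bytes_per_hour(bytes_per_event, interval_s):
    """Traffic per hour for an event that repeats every interval_s seconds."""
    return bytes_per_event * 3600 / interval_s


def bytes_per_month(hourly_bytes, days=30):
    """Traffic per month, assuming a 30-day month unless told otherwise."""
    return hourly_bytes * 24 * days


hourly = bytes_per_hour(BYTES_PER_POLL, POLL_INTERVAL_S)
monthly = bytes_per_month(hourly)
print(f"{hourly:.0f} bytes/hour, {monthly / 1e6:.2f} MB/month")
# prints "2376 bytes/hour, 1.71 MB/month"
```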

Therefore the desired TCP reconnect scenario should also create no more than 2400 bytes per hour. Consider that a single connection attempt with a 20-second timeout under Windows creates at least 120 bytes of traffic to an off-line remote. Windows sends a 40-byte [SYN] packet, then resends the same 40-byte [SYN] roughly 3 seconds later and again roughly 8 seconds after that. Increasing the timeout to 30 or more seconds adds a fourth 40-byte [SYN] packet, sent about 18 seconds after the third, for 160 bytes per attempt. So forcing an application to attempt only one connection every 5 minutes (12 attempts per hour) will create from 1440 to 1920 bytes of traffic per hour. This will not break our budgeted cell plan.
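The same reasoning can be sketched in Python to test other retry settings against the hourly budget. The 40-byte [SYN] size and the 3 or 4 packets per attempt are the figures given above; the function names and the "shortest safe interval" helper are assumptions added for illustration, not part of any particular application.

```python
SYN_BYTES = 40  # size of one [SYN] packet, per the figures above


def reconnect_bytes_per_hour(syn_count, retry_interval_s):
    """Hourly traffic when one connection attempt starts every retry_interval_s."""
    attempts_per_hour = 3600 / retry_interval_s
    return syn_count * SYN_BYTES * attempts_per_hour


def min_retry_interval_s(syn_count, budget_bytes_per_hour):
    """Shortest retry interval that keeps reconnect traffic within the budget."""
    return syn_count * SYN_BYTES * 3600 / budget_bytes_per_hour


# One attempt every 5 minutes: 3 SYNs (20 s timeout) or 4 SYNs (30+ s timeout)
print(reconnect_bytes_per_hour(3, 300))   # prints 1440.0
print(reconnect_bytes_per_hour(4, 300))   # prints 1920.0

# Shortest interval that stays within 2400 bytes per hour with 4 SYNs
print(min_retry_interval_s(4, 2400))      # prints 240.0
```

Under these figures, any retry interval of 4 minutes or longer keeps a failed connection within the same hourly budget as the steady-state polling.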
