Tuesday, October 13, 2009

Cellular DNS Lookup Cost

I have been working with a customer using two cellular devices with a DDNS provider, so the 'Client/Master' device used a DNS name to connect to the remote 'Server/Slave' device.

Cost #1 - added delay
The first cost they were seeing was added time lag to open a connection.
  • To open a TCP socket between two idle (but connected) cellular devices requires each device to be "unparked" by their cell tower. So the client device would need to request a resource and be assigned it by the local cell tower - this can require 2 to 5 seconds.
  • Next, a DNS query would be sent and a response would be required. Some cellular carrier DNS servers don't seem very peppy, so this could add a few more seconds.
  • Then the server device would receive notice from the cell-tower control channel to request a resource and be "unparked"
  • Finally, after "X" seconds of delay and lag the client device would receive confirmation that the socket is open (or it might have given up too fast already & aborted).
So using a DNS name instead of an IP address might add a few seconds of delay - which might be just enough to cause problems. The customer I am working with has a client tool with a hard-coded MAX timeout of 25 seconds, and it often seems to hit this during the initial socket open. So they have no trouble communicating once the TCP socket is open, but it can be difficult to get the first data packets moving.
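
To see where the seconds actually go, the DNS lookup and the TCP connect can be timed separately. The following is only a rough sketch using the standard Python socket module; the function name timed_open is made up for illustration, and the 25 second default simply mirrors the customer's hard-coded timeout mentioned above.

```python
import socket
import time


def timed_open(host, port, timeout=25.0):
    """Open a TCP socket, timing the DNS lookup and the connect separately.

    Returns (sock, dns_seconds, connect_seconds).
    timed_open is a hypothetical helper, not part of any vendor tool.
    """
    start = time.monotonic()
    # Resolve the name first so the lookup delay is measured on its own.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    dns_seconds = time.monotonic() - start

    # The connect only gets what is left of the overall time budget.
    remaining = max(timeout - dns_seconds, 0.001)
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(remaining)
    start = time.monotonic()
    try:
        sock.connect(addr)
    except OSError:
        sock.close()
        raise
    connect_seconds = time.monotonic() - start
    return sock, dns_seconds, connect_seconds
```

Calling it once with the DNS name and once with the raw IP address should show roughly how much of the 25 seconds the name lookup is eating.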

Cost #2 - added traffic
Of course the DNS query also adds raw traffic. DNS uses UDP, and a query might be 44 bytes and a response 60 bytes. So 104 bytes extra to open a TCP socket (which can be hundreds of bytes by itself).

Is this traffic important? Perhaps. Perhaps Not. Most cell plans today round up traffic hourly, so an extra 104 bytes for DNS and 150 bytes for a Modbus poll/response is still "1K of traffic".

For TCP/IP, the DNS lookup would only occur for a socket open. So regardless of the DNS name Time-To-Live, you might only see this added 104 bytes once per day or month - or anytime you open/reopen a socket.

However, for UDP/IP you MIGHT see this DNS lookup for every packet. For example, a DDNS record from dyndns.org might expire in 60 seconds, so sending one 100-byte UDP packet every 5 minutes could become 204 bytes per 5 minutes. That is 2448 bytes per hour, rounded up to 3K per hour or 72K per day, and NOT the 48K expected (1200 bytes per hour rounded up to 2K per hour).
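
The arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes the plan rounds each hour up to the next whole K, that 1K means 1024 bytes, and that IP/UDP header overhead is ignored; the function name is made up for illustration.

```python
import math

DNS_BYTES = 44 + 60  # DNS query + response sizes quoted above


def billed_kb_per_day(payload_bytes, sends_per_hour, dns_per_send=False):
    """Daily billed traffic in K when every hour is rounded up to a whole K.

    Assumes 1K = 1024 bytes and ignores IP/UDP header overhead.
    """
    per_send = payload_bytes + (DNS_BYTES if dns_per_send else 0)
    bytes_per_hour = per_send * sends_per_hour
    kb_per_hour = math.ceil(bytes_per_hour / 1024)
    return kb_per_hour * 24


# One 100-byte packet every 5 minutes = 12 sends per hour.
print(billed_kb_per_day(100, 12))                     # 48
print(billed_kb_per_day(100, 12, dns_per_send=True))  # 72
```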

So while UDP/IP can save a lot of traffic cost over TCP/IP, if short-lived DNS names are required then some of UDP's savings will be lost.

One solution is a custom application to do indirect DNS lookup. In other words, the application looks up the DNS name itself, then ignores the DNS "Time-to-Live" and uses only the cached IP address UNTIL COMMUNICATION FAILS - then it does a new lookup. This is not how normal operating system DNS caches work - they literally honor the DNS server's time-to-live and would cause a new lookup based on the 60 second time-to-live. Yet this is generally safe for cellular end-devices, since normally the dynamic IP only changes when the cell link is lost, so the remote IP is unlikely to change more than a few times per day.
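
A minimal sketch of such an application-level cache might look like the following. The class name StickyResolver and its methods are invented for illustration; the only real library call is socket.gethostbyname, and the resolver is injectable so the logic can be tried without a network.

```python
import socket


class StickyResolver:
    """Cache a resolved IP, ignore the DNS TTL, re-resolve only after a failure.

    Hypothetical helper for illustration; not an OS or vendor API.
    """

    def __init__(self, hostname, resolve=socket.gethostbyname):
        self.hostname = hostname
        self._resolve = resolve  # injectable so it can be tested offline
        self._ip = None

    def address(self):
        # Only hit DNS when there is no cached address.
        if self._ip is None:
            self._ip = self._resolve(self.hostname)
        return self._ip

    def mark_failed(self):
        # Called by the application when a poll times out or a send fails.
        self._ip = None
```

The application would call address() before each send and mark_failed() whenever the remote stops answering, so the 104 bytes of DNS traffic are paid only when the cached IP has actually gone stale.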

Wednesday, June 10, 2009

Does Cellular 3G help your Rockwell PLC Access?

In short - No.

Cellular data systems are continuously being sub-optimized for the unidirectional fire-hose paradigm of web page viewing, media streaming, image viewing, email download and so on. In other words, you only see the 3G "broadband" performance when you have tens of thousands of bytes to shove in one direction.

Actual field tests of RSLogix 5000 downloading to ControlLogix processors show that using 3G is only a few percent faster than standard technology - plus still probably slower than analog dialup. The problem is caused by RSLogix and Ethernet/IP moving thousands of small transactions in a half-duplex manner. So RSLogix downloads the PLC object by object, confirming success after each object is written. With cellular latency, each of these transactions can take up to a second to complete, so multiply 1 second by 3000 to 8000 transactions and you have your PLC download time.
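
A back-of-the-envelope model shows why raw bandwidth barely matters here. This is only a sketch: the 200 bytes per transaction and the two link speeds are illustrative assumptions of mine, not measured values, and the 1 second round trip is held the same on both links.

```python
def download_seconds(transactions, latency_s, bytes_per_txn, bits_per_second):
    """Total time when each small transaction must finish before the next starts."""
    transfer_s = bytes_per_txn * 8 / bits_per_second
    return transactions * (latency_s + transfer_s)


# Assumed values for illustration only: 200-byte transactions, 1 s round trip.
slow = download_seconds(3000, 1.0, 200, 100_000)     # assumed 100 kbps link
fast = download_seconds(3000, 1.0, 200, 1_000_000)   # assumed 1 Mbps link
print(round(slow / 60, 1), "minutes on the slow link")
print(round(fast / 60, 1), "minutes on the fast link")
print(round(100 * (1 - fast / slow), 1), "percent faster")
```

Under these assumptions a link ten times faster shaves only a percent or two off the download, because nearly all of the time is spent waiting on the round trips.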

3G apologists will claim "Oh, but latency in 3G is much lower than older, coarser technologies."

That may be, but when the rubber meets the road (aka RSLogix5000 downloads), 3G doesn't perform any better. 3G might have other value, but for most people moving data "faster" just means your data usage and the final bill racks up faster as well.

Saturday, June 06, 2009

Cellular to Redundant Rockwell ControlLogix

Can RSLinx access a redundant pair via cellular?

A pair of Rockwell Automation ControlLogix racks with SRM Module and dual ENBTs will share a pair of IP addresses. One IP address is "the primary", and the other IP is "the backup" (if/when it is online). When the ENBTs switch roles, they will issue the requisite Gratuitous ARPs to cause other local Ethernet devices (like a Digi cellular gateway) to update their ARP cache, thus comprehending that the IP-to-MAC address mapping has changed. Thus a NAT/Router forwarding to the primary IP should handle the fail-over with only modest bumps.

Config Details
A user desiring RSLinx (and RSLogix or RSView etc) to access a remote Rockwell ControlLogix (any RA/AB PLC) will be doing what the industry calls "Mobile-Terminated" access. The user needs to arrange a cell plan which offers either a fixed IP address to target, or at least a Dynamic DNS name to target (like tk101.iatips.com or panel2.digi.com). This is NOT what you obtain with an iPhone or personal air-card data plan. Those will have private IPs which only permit outgoing connections - called "Mobile-Originated". That was a buzzword lesson - expect to be asked about those two terms when you ask about cellular data plans!

So once you can arrange your targetable IP or DNS name, you need a cellular router such as one of the Digi Connect WAN family products. My favorite model today is the ConnectPort X4, but that has large memory for Python programming, wireless mesh and other goodies you won't need to link up your Rockwell PLC (Hey, I said it was MY favorite - doesn't mean it has to be yours!)

Note that contrary to folklore or urban legend, all cellular devices need certification to work on a system - even GSM devices. Many small suppliers get around this by including fine print that says the device buyer is responsible for arranging such legalities, and since you (the buyer) don't read such fine print the salesperson will just say "Heck, it's GSM - so it is allowed everywhere world wide!" Deal with this issue as you see fit, but Digi has more formal certs in more countries worldwide than any of the other industrial players.

But back on subject, when the gateway comes up, it is assigned your known IP or DNS name. This is exactly how your home or business DSL/T-line works. Yet when RSLinx tries to talk to the gateway on Ethernet/IP's well-known TCP port of 44818, the gateway will reject the connection as a weird attempt at hacking. You need to instruct the gateway:
  1. to not reject the Ethernet/IP traffic on TCP port 44818
  2. to instead forward it to a local IP on the Ethernet - which would be the IP of your ControlLogix ENBT (or the primary IP of the redundant pair)
The details of how this man-in-the-middle fake-out works are fascinating (to me), but quite a pile of text. If you are interested, this older blog entry goes through the NAT/Router details blow by blow. But bottom line, your ENBT receives the RSLinx packet and needs to have its own Gateway IP set to the Digi gateway's local Ethernet IP address. Free hint: 9 out of 10 guys who call saying "Why can't RSLinx see my PLC through your Digi gateway?" have failed to set the correct Gateway IP in the PLC/ENBT.

So assuming your gateway and PLC are set up correctly, then targeting the RSLinx "Ethernet Devices" driver (with timeouts slowed down to 30 seconds) will cause your PLC's little icon to show up. With RSLinx running, you will create up to 200MB of billable cell traffic per month doing absolutely nothing - so don't leave it active. Note that the "Ethernet/IP Driver" won't work as it requires UDP broadcast, which can't be routed over the Internet.

At this point you'll say "Cool, now can I see my backup ControlLogix or a second PLC?", and the simple answer is "No." One of the realities of RSLinx and AB PLC is that the Ethernet/IP protocol is hard fixed to only the TCP/IP port 44818, and the NAT/Router can only forward TCP port 44818 to a single local PLC. The easy fix would be for Rockwell to change RSLinx to enable adding both an IP/DNS name and TCP port number - then the NAT/Router could forward TCP port 44818 as 44818 to the primary ENBT, TCP 44819 fixed-up to 44818 to the secondary ENBT, TCP 17256 (a random number :-]) fixed up to 44818 to an RSView panel and so on. Because the NAT/Router can restore all traffic to 44818 on the local Ethernet, RSLinx is the only tool needing to change.
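
The port fix-up described above boils down to a small lookup table in the NAT/Router. Here is a sketch of that idea in Python; the local IP addresses are invented placeholders, and translate is a hypothetical helper rather than any real gateway's configuration syntax.

```python
ENIP_PORT = 44818  # Ethernet/IP's well-known TCP port

# Inbound public TCP port -> (local IP, local TCP port).
# The 192.168.1.x addresses are placeholders for illustration only.
FORWARDS = {
    44818: ("192.168.1.10", ENIP_PORT),  # primary ENBT
    44819: ("192.168.1.11", ENIP_PORT),  # secondary ENBT
    17256: ("192.168.1.20", ENIP_PORT),  # RSView panel
}


def translate(public_port):
    """Return (local_ip, local_port) for an inbound TCP port, or None to reject."""
    return FORWARDS.get(public_port)


print(translate(44819))  # ('192.168.1.11', 44818)
print(translate(80))     # None
```

Every local device still sees traffic arriving on 44818; only the tool on the far side needs to be able to target a different port number.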

Will Rockwell ever do this? It would take a programmer half a day to do - then a few weeks to test - then a few months to document and forestall support headaches. So who knows. They might. They might not.

But the bottom line is that a simple cellular NAT/Router can be used to talk to a pair of ControlLogix running in a redundant configuration - you will just be limited to seeing only the primary ENBT and the primary IP address.

Tuesday, April 07, 2009

Cellular Antenna and your PLC

One of my customers is having fun with one of their own customers. My customer uses Digi gateways running a Python application to collect hourly tank levels, which are fetched by cellular once per day. The tanks hold 250 gallons of a chemical additive which is injected at a variable rate into crude oil pipelines. Small battery-powered ultrasonic level sensors (www.massa.com) push the last 8 hourly samples every few hours using wireless IEEE 802.15.4 (aka Zigbee). The end customer's goal is to forecast when the tanks need to be refilled ... and I don't mean just receive an alarm when it is running low, they literally want to plan truck routes in advance.

Fun stuff.

The problem is that one of the pilot sites has bad cell coverage, meaning many days the central system cannot upload the log. Of course the data all uploads eventually, since it is held for over a month. Now, never mind that this system sits down in a dry wash (a small valley), the end-user says "Hah, that's because carrier A*** stinks; we all love carrier S****." My customer does not want to mix carriers - especially since negotiated pricing between the two in question is so different.

So remember that "coverage" for fixed RTU must be viewed differently from "coverage" for mobile uses.

For a mobile device (or user) "coverage" is defined by the probability that a valid carrier connection will be available as the device moves around. Thus the number of towers (etc) is important, and if there is no signal in one spot, then hopefully there will be one a mile or two away.

Unfortunately, our little RTU bolted to a power pole in that gully won't be moving a mile or two ever. So in the end, "coverage" for a fixed device is defined only by the tower(s) which can be seen. So the carrier with the best cell-phone coverage might not be the best carrier to support a particular fixed RTU.

The first step will be to move the RTU up out of the gully, which is easy since the data signals are all wireless and power can be tapped from any of the privately owned power poles. Probably carrier A*** (stinky or not) will work fine once the RTU moves. As plan B, and to make the end-customer feel listened to, a S****-based RTU will be placed on the same pole as the relocated A*** one.

Could they run out a long external antenna? Sure, they could. But why bother? Put the RTU where it has a nice signal. Even in 2- or 3-story buildings we suggest customers put the cellular router up in the roof area and run the Ethernet UTP its 100 meters instead of trying to deal with the signal loss in long antenna cables. We even have customers using 900 MHz Ethernet bridges to link a cellular router placed where it must be placed back to the Ethernet devices which want access.

Worst case, a directional yagi antenna could be used. However, you need to understand that cell towers are routinely turned off without warning. The carriers are truly geared towards mobile users who expect bad signals in some places. Towers (or the active elements) also move. Most of the cell towers you see along the highway work like strip-malls; a company owning the tower and supplying power leases tower space to a mix of carriers. This allows everyone to be flexible.

So prematurely locking down a cellular device to a single tower with a directional antenna can cause future problems since it will not see other weaker towers should the targeted tower be turned off or even moved.