Lynn's Industrial Protocols over IP: 2008

Tuesday, November 25, 2008

Ethernet/IP PCCC Service Codes

Summary: the Rockwell PCCC Service Codes (or how to get a 1761-NET-ENI or Digi One IAP to send specific DF1 DST/SRC bytes)

A customer had a product which acted as an Ethernet/IP originator (client) and needed to talk through a Digi One IAP to feed DF1 into a DH+ gateway. The catch is the DH+ gateway used the DF1 destination byte as the DH+ node address - and so far they could only make the Digi One IAP spit out a destination = 1. This made for pretty boring and unprofitable DH+ design!

(Sorry: I cannot explain "How to write an Ethernet/IP stack" in a single blog entry, so this entry assumes you already have a product speaking Ethernet/IP and just want to tweak how you talk to remote DF1 devices.)

The PCCC object (code: hex 0x67) is not part of the ODVA spec - it is a vendor-specific object used by Rockwell/Allen-Bradley to talk to SLC5, PLC5E and MicroLogix PLC. I won't fully explain it here ... not much to explain; do a Wireshark trace of a ControlLogix talking to a SLC5/05 or MicroLogix 1100 and you'll see all you need to see about the simple object.

The default service code of 0x4B has this form:

Exec PCCC Service	4B
IOI to PCCC Object	02 20 67 24 01
Originator Info	07 03 85 50 4d 41 41
Example PCCC Message	0f 00 5c 00 a2 14 07 89 00 00

The only mystery here should be the "Originator Info". The 07 means a total of 7 bytes (including the 07). 0385 is a unique sequence number which you should change between messages. The 504d4141 is a "CIP Serial Number" which must be unique to your vendor id - most vendors just use the last 4 octets of an Ethernet MAC address.
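
As a minimal Python sketch of the layout just described (not a full Ethernet/IP stack), the request body can be assembled from its four parts. The function name build_exec_pccc and its parameter names are hypothetical; the byte values are the ones from the example above.

```python
# Sketch only: assembles the service 0x4B request body described above.
# build_exec_pccc and its parameter names are illustrative, not a Rockwell API.

PCCC_PATH = bytes.fromhex("20 67 24 01")  # class 0x67 (PCCC object), instance 1


def build_exec_pccc(sequence: bytes, serial: bytes, pccc_message: bytes) -> bytes:
    """Return service code + IOI + Originator Info + PCCC message."""
    if len(sequence) != 2 or len(serial) != 4:
        raise ValueError("expected a 2-byte sequence value and a 4-byte serial number")
    # The first Originator Info byte counts itself plus the six bytes that follow.
    originator = bytes([1 + len(sequence) + len(serial)]) + sequence + serial
    path_words = len(PCCC_PATH) // 2  # IOI size is given in 16-bit words
    return bytes([0x4B, path_words]) + PCCC_PATH + originator + pccc_message


if __name__ == "__main__":
    body = build_exec_pccc(
        bytes.fromhex("03 85"),
        bytes.fromhex("50 4d 41 41"),
        bytes.fromhex("0f 00 5c 00 a2 14 07 89 00 00"),
    )
    print(body.hex(" "))
```

The result still has to be wrapped in whatever encapsulation your existing Ethernet/IP product already uses.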

However, there is NO remote node or slave address info, so the Digi One IAP or 1761-NET-ENI will create a DF1 message for destination=1 and source=0. So back to the customer's question ... how to force a DF1 message with say destination=5 and source=2?

The easiest solution is to switch to service code 0x4C, which has this form:

DH+Like Service	4C
IOI to PCCC Object	02 20 67 24 01
DH+ Like Header	00 00 02 00 00 00 05 00
Example PCCC Message	0f 00 5c 00 a2 14 07 89 00 00

So the "Originator Info" has been swapped with an 8 byte structure of the form AA AA BB XX CC CC DD XX. The "XX" are control bytes of some sort - just leave 0x00. "AA AA" is the Destination Link; "BB" is the Destination node; "CC CC" is the Source Link; "DD" is the Source node.
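
The 8-byte structure can be sketched in Python as below. The function and parameter names are hypothetical, and the byte order of the two-byte link fields is an assumption (it does not matter while the links are zero). Read against this layout, the example header bytes 00 00 02 00 00 00 05 00 carry 0x02 in the BB position and 0x05 in the DD position, so it is worth confirming with a Wireshark trace which position your gateway copies into the DF1 destination byte.

```python
# Sketch only: packs the AA AA BB XX CC CC DD XX structure described above.
# Names are illustrative; little-endian link fields are an assumption.

def build_dhplus_like_header(dest_node: int, src_node: int,
                             dest_link: int = 0, src_link: int = 0) -> bytes:
    """Return the 8-byte header that replaces the Originator Info for service 0x4C."""
    for name, value, limit in (("dest_node", dest_node, 0xFF),
                               ("src_node", src_node, 0xFF),
                               ("dest_link", dest_link, 0xFFFF),
                               ("src_link", src_link, 0xFFFF)):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range: {value}")
    control = 0x00  # the "XX" bytes, left at zero as suggested above
    return (dest_link.to_bytes(2, "little") + bytes([dest_node, control])
            + src_link.to_bytes(2, "little") + bytes([src_node, control]))


if __name__ == "__main__":
    print(build_dhplus_like_header(dest_node=0x02, src_node=0x05).hex(" "))
```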

Voila - both the Digi One IAP and the Rockwell 1761-NET-ENI will now create a DF1 message with destination=5 and source=2. Not too painful, was it.

To my knowledge all Rockwell Ethernet/IP PLC accept either code, and if you'd like to see the service 0x4C in action just set your RSLogix5000 MSG block to use "CIP with Source ID" and the ControlLogix switches from using service 0x4B to using 0x4C.

For completeness, there is a final service code of 0x4D with this form:

Local PCCC Service	4D
IOI to PCCC Object	02 20 67 24 01
Example PCCC Message	0f 00 5c 00 a2 14 07 89 00 00

It is documented in DeviceNet gateway manuals to reduce total byte count - but it also lacks any destination or source info so is of limited use in Ethernet-based systems.

Thursday, August 07, 2008

Creeping Firewalls and Change

Summary: Network infrastructure changes can create unexpected havoc (surprised?)

We had a customer come in with a complaint recently about some old products - a 4-port terminal server (a TS4) running Modbus bridging firmware. Things had gone on smoothly for years, but suddenly they started having problems which power-cycling the units solved ... obviously it's a Digi problem, right?

Well, I'm not going to run through all of the issues, but I'll highlight one which will affect more and more people over time. Remember the old days - first you had dumb dialup modems, then you had to pay a lot for various error-detection standards, and now all analog modems have a big chip which does a hundred different standard things and you CANNOT buy any low-cost analog dial-up modem which is any dumber than state-of-the-art.

Such creep occurs in Ethernet switches also - once hubs were cheap and switches cost a king's ransom. Now even the "cheap" $19 home switches purchased in big-box stores are auto-sense, auto-cross, auto-everything. The same thing happens - once the features are rolled into software or firmware, future revs of the product inherit such features for free ... said another way, it is often cheaper to sell lots of one powerful model than handle the marketing and support costs of a dozen models of various power and feature-set.

Well, this customer had been upgrading their wide-area-network infrastructure - a tangle of routed IP-based leased lines, old analog lines, radio/microwave links and so on, and as part of the improvement the various router/PPP end-points had added stateful awareness of packets ... part of an increase in US government security demands for utilities. After all, when you have hundreds of raw Ethernet ports scattered across rural America protected with little more than a padlock and chain-link fence, one has to consider the possibility of someone trying to 'jack' back into the 100% private network from a remote location.

The issue with such an addition is that routers with firewalls commonly default to a 5-minute TCP life rule. Every stateful firewall creates a table entry for every TCP or UDP activity going "out" from a safer network and into a wilder, less safe network - your home DSL/Cable router does the same thing. NAT (etc) is optional, but bottom line is the firewall needs to understand and agree to the context of every packet trying to pass through.

For a TCP socket this usually means at least 1 packet (data or ACK or Keepalive) must move every 5 minutes to refresh the context and inform the firewall that the TCP socket is still authorized. If no packets are seen, the firewall discards the context and the socket is now ignored and 100% blocked without comment. Of course both TCP peers might think the socket is alive and well, but they will never be able to use it again.

So back to our TS4 off in some remote field - a host computer has 4 TCP sockets open to it through various routed subnets and for whatever reason the host doesn't send any new packets for 6 minutes. The TS4 has no reason to talk since it is connected to Modbus slaves. So 6 minutes later the host issues 4 Modbus requests on 4 sockets, which hit the stateful firewall. The firewall looks up the context (called forwarding or trigger rules in some systems) and discovers there is no existing context for these TCP packets. Well, once upon a time the firewall would just assume these were safe since they came from the safe side, and thus auto-create a new context, yet modern stateful firewalls might NOT do this since it was a common hacker exploit to fool firewalls into moving malformed packets. Thus modern security demands that the first packet of any new TCP socket be a [SYN] packet. All four packets are silently discarded as unknown (note: this is a configurable behavior in most firewalls - to auto-recreate a context for an 'old' TCP socket continuing or to only allow this for new TCP sockets.)

The TS4 never sees the four new requests; the host sees neither an [ACK] nor a [RST] of the TCP socket, so it retries after 3 seconds, then 6 and so on. Eventually it comprehends the TCP socket has died and opens four new TCP sockets. These are allowed through the firewall since they contain the correct TCP state as being new sockets. The TS4 sees the four new socket requests and now has eight (8) sockets connected.

Now you might say "Wait a minute - the four old sockets were closed!". Well, yes - the HOST knows they are dead and closed them, but the TS4 does not know this. Six minutes later, the same is repeated and the TS4 now has 12, then 16, then 20 sockets open. Eventually the TS4 runs out of resources, and the customer's only recovery option is a hard reboot of the TS4. This is of course worse if the host only occasionally takes longer than 5 minutes to poll, since the increase in TS4 sockets could take an unpredictable amount of time to build up.

How to solve this problem?

The first line of defense in the modern IP age is to always, ALWAYS enable TCP keepalives with an aggressive 4 minute 30 second idle time in any product you install. Even if you don't have stateful packet inspection within your IP network today, that doesn't mean the next time someone replaces a router or serial PPP end-point you won't gain one. Unfortunately the Digi products usually default to a 2 hour TCP Keepalive - which is also disabled (for historical reasons). Yet having a 4.5 minute keepalive would have saved the TS4 because the first time the host delayed longer than 4.5 minutes to poll, the TS4 would have sent a TCP keepalive packet back through the stateful firewall, which the host answers, and the firewall's context for this socket is refreshed. Thus when the host does finally send the next Modbus poll after 6 minutes (1.5 minutes after the TCP keepalive exchange) the firewall is satisfied with the packets and forwards them out to the TS4. Everyone is happy. Plus even if the TCP socket has been aborted by the firewall, the TS4's TCP keepalive will be silently discarded and the TS4 will retry and eventually comprehend that the TCP socket is no longer valid. This is the purpose of TCP Keepalive - to allow a TCP peer with no data to move to test (and refresh) the health of an idle connection.

The second possibility is unique to the Digi IA/Modbus application. You can enable a setting referred to as "idletimeout" on incoming client or outgoing server connections. Unlike the TCP Keepalive, which creates traffic and keeps an idle socket open, the idletimeout literally just aborts an idle socket without giving it any chance to prove it is healthy. So setting a 5 minute idle timeout in the TS4 would cause it to just assume any incoming Modbus client (master) connection which has NOT sent a new request is bad. This setting would also have saved the TS4, forcing the four old sockets to be aborted before the TS4 had a chance to build up a herd of dead sockets.
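
If you are writing the host-side application yourself, the keepalive advice above can be sketched in Python as follows. The 270 second idle time is the 4 minute 30 second figure from above; the probe interval and probe count are my assumptions, and the TCP_KEEP* option names are the Linux ones, so each is guarded in case the platform does not expose it.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 270,
                     interval_s: int = 30, probes: int = 4) -> socket.socket:
    """Turn on TCP keepalive with a 4.5 minute idle time where the OS allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # interval_s and probes are illustrative values, not taken from the post.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)  # missing on some platforms
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print("keepalive on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Call it on the socket before or right after connecting to the remote device.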

So understanding TCP Keepalive is a critical skill for all modern industrial Ethernet users. After all, more and more facilities are breaking their floors and functional areas into distinct subnets with stateful firewall protection to keep the Accountants from changing the color of the widgets being produced - and to keep the production engineers from printing on the accounting department's laser printer which uses red ink :-).

A more complete discussion of TCP Keepalive is here (one of dozens of web sites you can find in a web search).

Friday, August 01, 2008

Tunneling Serial Data over Cellular

Question: you have a Windows-based application which expects to talk over normal serial ports - how can you connect to a remote serial device over a cellular-IP link?

Products: works with Digi Connect WAN, WANIA or WANVPN, Digi ConnectPort VPN, X4 or X8

Answer: Our Digi RealPort driver for Windows 2000/XP/Vista dated Nov 2008 now supports a nice low-overhead "UDP: Serial Data Only" mode. This will cost a fraction of what normal TCP-mode RealPort or competitors' TCP redirectors will cost.

Here is a web page explaining how to install RealPort for UDP mode (DDNS names can be used!)
http://digitips.wikispaces.com/Digi+Realport+with+UDP

Here is a web page explaining how to set the WANIA to UDP Sockets.
http://digitips.wikispaces.com/Digi+UDP+Sockets+WAN

I set this up yesterday and have been polling a Modbus/ASCII serial slave on my CPX4 using RealPort and UDP mode. Responses take from 2 to 3 seconds to return … in part because I am polling so slowly that the modem PARKS in between every poll and I wait 30 seconds for a response timeout. I see perhaps 0.25% error / no response timeout but then my SIM doesn’t have the best signal where my product sits.

However, I still believe people need to be realistic – even in the situations where the slave response times out I’d suggest NOT retrying immediately since the retries have a higher than normal probability of failing as well. Missing a few polls a day is better than needlessly paying more for data plans.

Wednesday, July 23, 2008

Evolution of Data Plan Billing

Summary: the big three have moved away from unlimited data, towards limited data.

It is interesting - I once (as in last year) had a talk with a potential partner who'd been at some European conference and was convinced the world was on the verge of low-cost (sub-$20/month) unlimited cellular data plans. We were discussing the creation of report-by-exception tools to reduce SCADA costs, and this partner's strong faith in this belief caused them to eventually bail out of the talks, saying "In a year or two, no SCADA company will care about how much cellular data they use."

Yet as of the summer of 2008 the world of cellular data is moving in the opposite direction. Last year the big three (AT&T/Sprint/Verizon) offered "Unlimited Data" for personal users with the Service Terms listing a VERY narrow list of permitted activities - mainly email and web browsing, with many common things like file download/upload and media-streaming prohibited. So whenever one of the big three would cut off a user for moving too much data on an "unlimited plan", the service provider would fall back on the "You are doing prohibited things, thus impacting our network, thus take your business elsewhere". What a way to cause bad feelings, eh? Note that this change is to CONSUMER plans - machine-to-machine plans have always been limited, priced by the MB/month without rollover, plus with charges for data overages.

Now all three have dropped the price from the $80/month range down to $60/month range ... but added a hard limit of 5GB per month. Isn't free & vigorous market competition wonderful?

Sounds reasonable - 5,000 megabytes of data is a lot, yet this doesn't mean 5GB of data transfer. It means 5GB of metered activity, with many activities I've studied including up to 95% overhead. Thus someone only moving 20-30MB of real data in small packets per month might hit pretty close to their 5GB limit! My experience with normal wide-area-network traffic hints that a real PC user doing simple email and web-browsing once a day would probably move 1-2GB of data before hitting the 5GB total activity limit.

To paraphrase the wireless data service terms for all three:

Data transport is always measured in full kilobytes
Actual transport is always rounded up to next full-kilobyte at "end of session"
Network overhead and resend requests caused by network errors can increase measured kilobytes.
2 of 3 mention always rounding up to nearest kilobyte every hour period.
All warn that you will NOT receive an itemized detail of how your charges are calculated; you will NOT see which services were used or during which time periods the charges were incurred.

So if I send a single 50 byte UDP/IP packet, is that a full session and billed as 1024 bytes? Could be under this language since UDP is 'sessionless'.

Hmm, the term session is pretty ambiguous. Perhaps it means per "time you enable your PC-based cellular data card." That seems likely - plus if you left your device on twenty-four hours a day then the once per hour round-up would catch you.
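
To put a number on the worst-case reading, here is a small Python sketch. It assumes (hypothetically, since the terms are ambiguous) that every packet counts as its own session and is rounded up to the next full kilobyte; the function names are mine.

```python
import math


def billed_bytes(packet_bytes: int, packets: int = 1, unit: int = 1024) -> int:
    """Metered bytes if each packet is its own session, rounded up to a full unit."""
    return packets * math.ceil(packet_bytes / unit) * unit


def overhead_fraction(real_bytes: int, metered_bytes: int) -> float:
    """Share of the metered total that is not the bytes actually sent."""
    return 1.0 - real_bytes / metered_bytes


if __name__ == "__main__":
    metered = billed_bytes(50)  # one 50-byte packet
    print(metered, f"{overhead_fraction(50, metered):.1%}")
```

Under that assumption a single 50 byte packet is metered as 1024 bytes, which is about 95% overhead.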

I'm afraid I haven't offered any new answer here, other than to suggest you understand that low-cost unlimited data plans ARE NOT just around the corner ... at best we left them behind last year and I don't foresee them ever returning. I suppose all three now understand that huge new profits are to be made with these 5GB limits, which will cause many "super-salesman" using their cellular data plan daily to spend an extra $50 to $500 in monthly overage charges.

Friday, July 11, 2008

Lower Cost Cellular to Rockwell AB PLC

I have several customers now working through how to manage cost-effective cellular access to Rockwell PLC such as ControlLogix, CompactLogix, Micrologix 1100 and so on. Unfortunately the most straightforward way to link using Ethernet/IP is fairly costly.

First, a personal recommendation from me – a free tool which I find very useful and think you will too. Today, you can buy 2GB USB flash drives for $15 – if you're old like me, you remember when an entire Windows computer only had 0.020GB of hard drive space! Did you know you can literally install and run many Windows applications from these portable USB drives? This means any Windows computer you plug this USB drive into has your applications, your settings, and your data files. I've used one of these for over a year and it is invaluable - all free open source code too! You can run OpenOffice (which can read/write MSOffice 2003 files and is much faster than the MSOffice 2007 we use at Digi), Firefox web browser, plus a dozen other tools. Take a special look at KeePass, which I use daily from my USB drive to securely hold all of my hundred-plus account names and passwords.

Portableapps.com - What is a Portable App?

Okay, back to work.

Periodic PLC Access from RSLogix

Customers who want to peek into a single PLC at a remote site for an hour or two can use RSLinx to connect to either an IP or DNS name, then see the PLC via cellular. The catch is RSLinx will create from 12MB to 200MB of background traffic per month. So you need to create a new Ethernet Driver (not Ethernet/IP!) JUST for this one-time use, configure in your details, connect and do your work. When you are done you need to turn browsing off, then delete the comm driver. Why not just delete the IP or DNS name? Unfortunately once RSLinx has seen a device, it can be like a bad rash to get rid of it.

Data polling – at Central Office

Customers with an OPC server speaking DF1, CSPv4, or even Ethernet/IP can poll PCCC-type data through a Digi One IAP, which converts the polls into DF1 Radio Modem under UDP. Tests have shown that polling with DF1 Radio Modem every few minutes accomplishes the same data movement as Ethernet/IP at only 5% of the data cost (put another way, Ethernet/IP uses about 20 times as many data bytes). One Digi One IAP can poll up to 60 remote IP or DNS names. If your OPC server can encapsulate DF1 Radio Modem directly into UDP/IP, then you won't need the Digi One IAP to act as your host.

Data Polling – at Remote Site

If you have an AB PLC which speaks DF1 Radio Modem directly, then any Digi cellular router can be configured for UDP Sockets, which shuttles UDP data received to the serial port. Make sure you use the latest Digi firmware so it can just return UDP responses to the last sender without explicit address configuration.
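
The "return to last sender" idea can be sketched in a few lines of Python. This is not the Digi firmware, just an illustration of the behavior; serve_once is a hypothetical name and serial_exchange is a stand-in callable for whatever writes the request to the serial port and reads back the PLC's reply.

```python
import socket


def serve_once(udp_sock: socket.socket, serial_exchange) -> tuple:
    """Shuttle one datagram to the serial side and answer whoever sent it."""
    request, last_sender = udp_sock.recvfrom(2048)
    response = serial_exchange(request)  # stand-in for the real serial port I/O
    if response:
        # No configured peer address: reply to the source of the last datagram.
        udp_sock.sendto(response, last_sender)
    return last_sender


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    client.sendto(b"poll", server.getsockname())
    serve_once(server, lambda req: b"reply to " + req)
    print(client.recvfrom(2048)[0])
    server.close()
    client.close()
```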

If your AB PLC doesn't speak DF1 Radio Modem, or you want to use an Ethernet link, then using a Digi cellular router with Python support allows a simple script which accepts DF1 requests and uses a local Ethernet/IP session to query responses from the PLC's PCCC Object. This Python code even runs on a PC under Windows or Linux. As soon as I have a link or web page explaining how to get and use this code, I'll edit this post to add it here.

Thursday, July 10, 2008

Quick Data Comms to AB PLC

One of my readers was asking for a quick way to talk via Ethernet to a Rockwell AB PLC.

You can actually talk to a ControlLogix by only understanding TWO (2) different packets, each with a response, so four packets I guess. The problem is this uses "UCMM" style communications, which the PLC has very limited resources for. Said another way, the CIP Connected Messaging or I/O production both include an inherent allocation of resources, while the UCMM is designed to be used ONLY to setup such pre-allocated resources.

So yes, you can use the information below to create a literal quick-n-dirty solution, and if you talk more than a few times a second you might start to interfere with other communications to the PLC (which is not a good thing!) However you could treat this as a proof of concept, and then work to do the communications more fully per the ODVA specs.

The PDF of how to read/write to the PCCC Object in a Logix Processor is here.

Friday, May 30, 2008

Public Internet Risk in Common Tools

Two months ago a SCADA customer asked me to enable FTP (File-Transfer-Protocol) on a test RTU they'd sent me to put online for them. It was on a DSL link and although I warned them it was a bad idea they said it would be okay because the RTU had username/password protection and the RTU had nothing important on it.

The punchline is that a few days later the customer sent me an email saying they couldn't FTP into the RTU anymore, so couldn't check the log files.

I looked, and the RTU now had 3 TCP sockets open (all the sockets allocated for FTP on this RTU) to some FTP client at an IP address registered in Korea. All 3 were just slowly walking through a dictionary attack of username/passwords (user bill, pass honda14 ... user billk, pass 12tomes ...) No doubt the IP and FTP client belonged to some university student running kiddy-scripts obtained on the Internet. No doubt the kid probably didn't even care if the odd device he or she had never seen before was not a computer (FTP servers always announce what and who they are when you connect). No doubt this attack wasn't costing them anything, as either the university or parents were paying for the Internet connection - or the IP and connection belonged to some patsy whose home computer had been compromised.

So okay, it was not causing any real harm, except it defeated the purpose of enabling FTP since the end user no longer had access to FTP. I suppose you could call it a denial-of-service attack, yet I'm sure that was NOT the intention of the 'attacker'. The student was probably just hoping to be able to post a message on some forum saying 'I hacked a computer at this IP address in the USA, and here is the FTP user name and password I created for you to access.' The fact that the RTU only contained a dozen binary log files would be irrelevant.

Do I have a moral to this story? Hmm, not really - other than industrial users have to understand that what is NOT interesting to them might be interesting to others for very different reasons.

If this user had allowed me to change the FTP port to some random value like TCP port 38207, then it is very likely this particular student would NOT have found it. Since true port-scans are so easy to detect, the student's tool probably had a list of a few dozen TCP ports commonly used by FTP servers, then it would randomly try them over a few weeks at any single target IP.

The same story could be true for web servers on TCP port 80 (or 8000 or 8080). Digi ships our cellular products with the web server enabled on port 80 because that is what customers expect. Sure, they add a username and password, but what will happen to their data plan bill if their 3MB per month plan moves 3 gigabytes because some scripts are trying dictionary attacks on the login of the home web page?

I've always had to mention that 3GB part (meaning a $500+ bill for a month) since every time I tell such users not to leave the web server set up on port 80, the industrial customer's answer is invariably '... it'll be okay because the unit has username/password protection and it has nothing worth trying to see on it ... '

Thursday, May 22, 2008

Cellular to Wireless Zigbee

An interesting new market we are moving into is cellular access to wireless mesh (Zigbee as example). In some sense it's a supply-chain dream come true. Imagine you supply a product to customers and EVERYONE (except the customer's IT department) wants you to be able to see what the stock level is and auto-schedule deliveries.

The customer benefits because they can treat the product as a 'utility' - turn the tap and there it is.

You benefit because you can minimize emergency truck-rolls - no more angry, panicked customers demanding you send a truck over two-thirds empty because someone forgot to schedule a special delivery because the customer needed to use 60% more product for two days. Of course as a supplier the cost of the truck, fuel, and driver are critical parts of your margin/profit. You desire to only send out full trucks which return gracefully empty!

So we are now working with several of the largest chemical suppliers in the world to enable:

drop in a powered cellular unit at the customer site
drop in powered or battery tank sensors
log levels hourly, for the supplier to upload daily (reduces cellular data charges). The supplier uses this as their 'secret-sauce', their own proprietary value-add to predict when trucks need to roll to maximize efficiency
enable alarm call-out if the levels hit unexpected low-low levels

How this works varies by supplier. The one I'm working with is using Modbus/TCP to pull up the logs daily. Some other suppliers are having SMTP clients push emails back to the supplier with XML formatted reports. The next supplier I might work with wants the binary logs to be compressed (ZIP'd) and then pushed upstream once a day by FTP, where their accounting system will convert from binary to XML to import and issue bills on product usage per MINUTE.

Of course key to all of this is the wireless drop-in-network concept. The supplier doesn't want to invest thousands of dollars pulling wires through SOMEONE ELSE'S PLANT - especially when the supplier's contract might end in a few months.

Wireless sensors aren't new; cellular data access isn't new; supply-chain systems which auto-detect product levels aren't new. What is new here is the merger of many technologies which reduce infrastructure costs, and thus increase ROI.

Friday, March 14, 2008

PCCC Protected Typed Logical Write with Mask

Someone asked about the "DF1 Supplement for SLC500" from 1995 which was online at ab.com for a short period, then was pulled - probably because it is a very poor quality optical scan. However, I had it ... then lost it ... then found it stashed away on one of my ftp sites.

So here is the original AB PDF (which I downloaded from ab.com last year) Grab it here while it survives.

The masked write is on page 11. The first data-word is the mask, and all data following has the SAME mask applied. So [0x0001,0x0000] would clear the LSBit and so on. It doesn't offer mask-data pairs - just one mask and N words. Thus this command is mainly for use with 1 word element writes ... unless you have three or four consecutive N-file words you wish to have the same mask applied to.
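
The effect of the one shared mask can be modeled in Python as below. This sketch assumes a 1 bit in the mask means "take this bit from the data word" and a 0 bit means "leave the existing bit alone", which is consistent with the [0x0001,0x0000] example; the function name is hypothetical and this models the result in the PLC, not the DF1 frame itself.

```python
def apply_masked_write(current_words, mask, data_words):
    """Model what N consecutive 16-bit words look like after a masked write."""
    mask &= 0xFFFF
    # Same mask for every word: keep unmasked bits, replace masked bits.
    return [((old & ~mask) | (new & mask)) & 0xFFFF
            for old, new in zip(current_words, data_words)]


if __name__ == "__main__":
    # Mask 0x0001 with data 0x0000 clears only the least significant bit.
    print([hex(w) for w in apply_masked_write([0xFFFF], 0x0001, [0x0000])])
```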

Friday, March 07, 2008

Optimizing Modbus for Cellular

Goal: lower your data costs

How nice it would be if you could take your Ethernet applications and just move them to cellular (or satellite). Well, of course you can ... but you'll pay through the nose for this.

At the moment I'm in the process of creating intelligent cellular gateways accessible by Modbus/UDP (aka Modbus/TCP form in UDP) which support data logging, report-by-exception and other cost-saving goodies.

A few Facts:

You are charged for all IP, TCP, and UDP overhead - the cellular system moves your TCP/IP packet as raw payload encapsulated in mobile-IP or other transports. So to them the 40-52 byte TCP/IP header is NOT DISTINGUISHED from your data. ( My Blog entry on this )
Thus Modbus/UDP (aka Modbus/TCP in UDP/IP) will save you from 60 to 90% of your data costs. It is a single one-shot request followed by a single one-shot response. In contrast TCP/IP might require up to 400 bytes of socket open & close overhead, plus TCP acknowledge packets.
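
A rough Python sketch of where the savings come from, for a single "read holding registers" poll. The assumptions are mine: IPv4 and TCP headers without options (20 bytes each), an 8-byte UDP header, the 7-byte Modbus/TCP header, and one pure TCP acknowledge per poll. The open_close_overhead parameter is where you would plug in the socket open & close cost mentioned above if you reconnect for every poll.

```python
IP_HDR, UDP_HDR, TCP_HDR, MBAP_HDR = 20, 8, 20, 7  # bytes, no header options assumed


def modbus_read_adu_sizes(registers: int) -> tuple:
    """(request, response) sizes for a read-holding-registers exchange."""
    request = MBAP_HDR + 5                   # function + address + count
    response = MBAP_HDR + 2 + 2 * registers  # function + byte count + data
    return request, response


def udp_poll_bytes(registers: int) -> int:
    request, response = modbus_read_adu_sizes(registers)
    return (IP_HDR + UDP_HDR + request) + (IP_HDR + UDP_HDR + response)


def tcp_poll_bytes(registers: int, pure_acks: int = 1,
                   open_close_overhead: int = 0) -> int:
    request, response = modbus_read_adu_sizes(registers)
    return ((IP_HDR + TCP_HDR + request) + (IP_HDR + TCP_HDR + response)
            + pure_acks * (IP_HDR + TCP_HDR) + open_close_overhead)


if __name__ == "__main__":
    for regs in (1, 10, 100):
        print(regs, udp_poll_bytes(regs), tcp_poll_bytes(regs),
              tcp_poll_bytes(regs, open_close_overhead=400))
```

The smaller the poll and the more often the TCP socket is reopened, the larger the gap between the two.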

So if you are concerned about cost, you should first make sure your DATA POLLING can be done with Modbus/UDP - not TCP. No sweat if you need to use Modbus/TCP to reprogram or monitor your RTU or PLC short-term, just make sure your 24/7 repetitive data polling is via UDP. ( Is UDP reliable enough? )

So I've defined a few extensions (read as 'heresy' to many). Full Details are Here at iatips.com:

I allow use of the full Modbus/TCP header - so one can read 500 registers in a single request. This greatly reduces charges for header overhead
I allow returning LESS data than requested when the context is appropriate. This saves having to pay for data padding, plus not having to poll a status register to see how much logged data is waiting
I allow use of data compression like ZLIB. Bottom line, 'ZIP' compression of small data sucks, but since I can return 500 registers (1K bytes) ZLIB starts showing value.
I allow packing multiple Modbus-ADU below a single header, which (unlike pipelining) signals the gateway to return multiple responses in a single packet.
I am looking at support for simple AES encryption, not as true 'security', but as a good-enough means to support Modbus without making development difficult.
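
The ZLIB point is easy to try with Python's standard zlib module. This is only a sketch: the compressed/uncompressed flag and the sample register values are hypothetical and are not the wire format defined at iatips.com.

```python
import struct
import zlib


def pack_registers(values) -> bytes:
    """Pack 16-bit register values big-endian, as Modbus carries them."""
    return struct.pack(">%dH" % len(values), *values)


def maybe_compress(raw: bytes, level: int = 9) -> tuple:
    """Return (compressed?, payload), keeping the raw bytes when zlib does not help."""
    packed = zlib.compress(raw, level)
    if len(packed) < len(raw):
        return True, packed
    return False, raw


if __name__ == "__main__":
    small = pack_registers([1, 2, 3])
    # Hypothetical slowly changing log data: 500 registers = 1000 bytes.
    large = pack_registers([1000 + (i // 50) for i in range(500)])
    for raw in (small, large):
        flag, payload = maybe_compress(raw)
        print(len(raw), "->", len(payload), "compressed" if flag else "sent raw")
```

Small responses come back unchanged because the zlib framing alone is bigger than the data, while a 500-register block of repetitive log data shrinks considerably.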

If you are interested in the details, I have more discussion on my wiki site.

Lost My SIM

Well, I've been quiet for a while - lost my "free development SIM" as part of Cingular's reorg into AT&T. Thus my collection of PLC with free demo access are off-line. It is interesting to support cellular without a SIM ... not that my boss is ignoring this, but there are 'plans in the works' which involve several companies ... plans which just keep moving out.

However, I am working on Cellular gateways with intelligence now. Goal is to allow a Modbus client/master to come in once per day and upload time-stamped logs; plus if certain events occur use Modbus to call for help.