Monday, January 29, 2007

Do you need special Industrial Ethernet hardware?

This is a bit off topic, but someone asked me about “Industrial-Grade” Ethernet designs. Well, I’m not a hardware expert, but I have better-than-average exposure to issues of grounding and surge management. Plus I’ve been involved in many detailed discussions with engineers from Rockwell and other industrial suppliers about how things may break Ethernet devices. The bottom line is that standard Ethernet designs are quite robust - the main thing I see hurting careless designers is PCB vias or tracks violating the 1500v isolation of the Ethernet magnetics. This means a customer actually testing high-voltage noise or surges could see arcing across the isolation barrier, which causes product reboots or even failure.

The $9.95 Mindset:
Any detailed discussion about “special Ethernet for Industry” first starts with the fact that customers can buy 10/100 computer Ethernet adapters from any big-box store for $9.95. So users have this perception that the increment for Ethernet is small and cheap. While they may not expect you to sell Ethernet products for only $10 more than serial products, they won’t be happy to hear that your Ethernet product is $300 more. I will tie this together below, but the bottom line is that the closer you can match your Ethernet hardware to the market norm, the lower your overall costs will be.

Who Pays for Extra Software Work?
Of course, that $9.95 computer Ethernet card doesn’t include:
  1. Microsoft’s ROI on TCP/IP and network stack work
  2. the OPC server cost to add Ethernet drivers in place of serial drivers
  3. the tool vendor's need (ie: your need) to rewrite serial-based tools to become network-based tools.
Plus don’t forget that the customers are NOT really just asking for “Ethernet” – Customer X will want web pages for configuration, Customer Y will want SNMP for remote management, and Customer Z will want strong 128-bit AES encryption suitable for general US-Government usage. So unlike the $9.95 PC card, you must hope your customers help pay for a whole lot of added, ongoing engineering work.

What is the Market Supply Sweet-Spot?
Go online and look at the cost of hard-drives – a 300GB (300,000MB) drive is in the $75 range, while an old 20MB drive (Meg, not Gig) costs about $140. We all understand this oddity – there is high demand for 300GB drives and virtually no demand for old legacy 20MB drives needed for repair. Even trying to buy a 40GB (Gig) drive today is hard. The market has what people call a “sweet spot” – a range of product features and capacity which is the cheapest and easiest to buy. Product builders trying to use components that are better than (or even worse-than) the market sweet-spot have disproportionally higher costs than builders using components in sync with the market sweet-spot.

The same thing happens for Ethernet components – for example buying magnetics rated at 1500v isolation (normal IEEE commercial spec) is very cheap while trying to source magnetics with 2500v isolation can cost an order of magnitude more. So while your company could define a number of electrical improvements for an “Industrial Ethernet” interface, you have to weigh this against the added cost and supply headaches of buying against the grain – of ignoring the gigantic “sweet-spot” for commercial-grade Ethernet components that enable creation of that $9.95 PC Ethernet adapter.

What is Your Manufacturing Sweet-Spot?
Just as the world market has a sweet-spot, so does your own in-house production; just ask your purchasing department. Adding Ethernet is NOT just the cost of adding a few new chips - the NIC, MAC/PHY, magnetics, and RJ45 connector. You may need to upgrade your whole basic hardware design away from a simple 8-bit CPU with 64KByte of memory to a 16 or 32-bit CPU with several MByte of memory. For example, Digi’s basic Device Server platform has a 32-bit CPU, 4MB flash, and 8MB RAM. Few of our products really need this much horsepower, but putting for example 8MB of RAM into all products is cheaper given purchasing logistics and reliability of supply than buying a mix of 2, 4, and 8MB chips. In fact, today we are looking at the cost tradeoff in shifting the basic design from 8MB to 16, 32, or even 64MB. Yes, 16MB (or 64MB) will cost more than 8MB, but given some products need 16MB (or 64MB) there are both tangible and intangible benefits to moving a larger volume of products up the curve to retain supply-chain advantages. This is especially true of FLASH and RAM chips which frequently suffer feast-and-famine availability cycles.

All small companies quickly learn – often the hard way – that during market shortages, it is the small-volume purchases that get last delivery. During a chip famine, low-volume purchasers will NOT be able to buy sufficient chips at any price to maintain their production. The higher your volume of a part, the lower your price and, perhaps more importantly, the more reliable your supply. So when you start to add Ethernet products and reduce sales of non-Ethernet products, you may find you need to upgrade the CPU design of some of your non-Ethernet products to gain or retain reliability of parts supply.

How Robust is Commercial-Grade Ethernet?
So far I have been saying that trying to create special Ethernet hardware for industry may be costly and not very cost-effective. Worse, your average commercial-grade Ethernet is already very robust when compared to RS-232, RS-485 or USB serial. Ethernet uses a transformer-isolated signal with differential pairs, plus has nice, low-level, hardware-supported error detection. Given the high signal frequency, low signal voltage and isolation transformer, trying to add extra surge protection greatly complicates product ground design and weakens the signal, shortening the supported cable length below the 100m length customers have etched within their minds. So trying to boost your Ethernet spec for an industrial design gives questionable gain for the extra cost and lost profits. Plus your customers won’t likely perceive a market differentiation that they are willing to pay for if you say you have better isolation, etc.

A Note on Shielded Ethernet Cables:
Many industrial users start out assuming STP (Shielded Twisted Pair) is better than UTP (Unshielded Twisted Pair) for Ethernet. Oddly enough, STP has proven a bit like ABS brakes in private automobiles; despite lots of hoopla about saving lives when the US government forced ABS brakes into cars, insurance industry records continue to show it has had no measurable impact in real world road deaths. It seems while an expert driver can be helped immensely by ABS, your average idiot or careless driver still reacts to skidding situations in ways ABS brakes cannot fix.

The same appears true for STP cables and Ethernet. I have seen many discussions where industrial users tried STP cables and found the system only works reliably when they lay temporary UTP cables across the weld-shop floor! I suspect the main problem is that traditional IT groups have used and measured STP success in terms of preventing Ethernet cable emissions from affecting other equipment. This is not the same as using STP to prevent external interference from affecting the Ethernet signal. So even ignoring the old truisms that a floating shield is worse than no shield, and that a shield grounded at both ends creates a ground loop and is worse than no shield, it appears that only experts and a very detailed system design result in STP Ethernet working better than UTP. My recommendation is to use optical fiber whenever you really worry about noise interfering with UTP Ethernet.

Vibration and RJ45:
Field tests of RJ45 connectors have shown them to be very bad in areas of high vibration. This is actually very easy to see for yourself - take any RJ45 connector with pins facing down and wiggle it up and down. What happens? That little finger-catch / lock acts as a pivot point and you are actually scrubbing the gold-flash contacts of the connector against the socket contacts. Metal-against-Metal; guess the result. Tests on industrial robot arms have shown even high-quality gold-plated RJ45 connectors self-destruct in months or even weeks. If you expect vibration, better look for alternative connectors - such as any of the many (way too many) IP67 locking designs.

Industry and CAT 5, 5e, and 6:
Another interesting twist to the commercial evolution of Ethernet is that tests of bulk cable show CAT 5 is likely the best for industrial use where the noise rejection properties of the twisted pair (differential signal) are desired. This is because - so I have been told - one of the tradeoffs IEEE allowed for CAT 5e and 6 is to allow less consistency within the wires of a pair. After all, few Ethernet systems ever see serious external interference, so things which improved speed outweighed things which reduced noise rejection in abnormally high noise conditions. Several large automation companies tried to bring up the idea of a CAT 5i with IEEE which emphasised better noise rejection and special jacket plastic, however ... it appears it went nowhere. If the big computer, networking, and cable vendors don't see the value, it cannot happen through IEEE.

Been quiet 2 weeks - had some domain fun

Sorry I have not posted in 2 weeks - I was trying to move between hosters. This took 2 weeks longer than it should have and while the DNS info was in limbo I didn't want to risk breaking the site with partial updates. I am trying a new supplier with better web stats and email SPAM filtering at the SMTP level - I don't want to even see the 300-400 junk email a day I get from "known SPAM servers" on black-lists.

Hint: if you have a personal or private domain hosted by a 3rd party, don't have the same firm manage your domain record. It should have only taken 48 hours to change a DNS record, but my web hoster was trying to save me from myself (ie: my moving my DNS record away from an active 6-year-old host account). I suppose there was also some self-interest in my hoster erring on the side of caution.

With the various scams, general lack of IP knowledge in most customers, and the waiting periods to RESTORE incorrectly changed DNS info I don't blame my hoster for being careful, however I wish they'd have sent me email notifications that they wanted confirmation. I wasted the first 6 days doing 2 online requests to change. It was only after the 2nd request produced no result in 72 hours that I started calling them by phone to push the issue.

Friday, January 12, 2007

Cellular Costs - bytes you pay for each month

Sadly, we are about to enter one of the dark-arts of cellular usage ... what are you actually billed for. Given the 50 page voice cell phone bill my family gets each month, one would NOT think the cell phone companies lacked the ability to explain - let alone document - what they charge data users for! It is not that one cannot get a verbal answer from cellular providers' engineers; one can get too many different answers.

However, there are some facts we can know.

An example: Modbus/RTU via TCP/IP, one poll per 10 minutes
Let us build up an example. Start with a customer named Joe who plans to poll 10 words of data every 10 minutes, or 4320 polls per month. Under Modbus/RTU this would be 8 bytes in the request and 25 bytes in the response. So Joe starts with the wonderful view that he'll only be moving 143K per month and maybe one of those $3.95/month plans for half-a-meg will fit nicely.
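The 8-byte and 25-byte figures fall out of the Modbus/RTU frame layout for a register read. Here is a minimal Python sketch of that arithmetic, assuming the poll is a function 3 (Read Holding Registers) request, which the example above does not actually state; the constant and function names are just illustrative.

```python
# Field widths, in bytes, of a Modbus/RTU register-read exchange.
ADDRESS = 1      # slave address
FUNCTION = 1     # function code (assumed: 3, Read Holding Registers)
START = 2        # starting register number
COUNT = 2        # number of registers requested
BYTE_COUNT = 1   # size of the data field in the response
CRC = 2          # CRC-16 trailer


def request_len() -> int:
    """Bytes on the wire for one read request."""
    return ADDRESS + FUNCTION + START + COUNT + CRC


def response_len(registers: int) -> int:
    """Bytes on the wire for a response carrying 16-bit registers."""
    return ADDRESS + FUNCTION + BYTE_COUNT + 2 * registers + CRC


def polls_per_month(minutes_between_polls: int, days: int = 30) -> int:
    """Number of polls in a month of the given length."""
    return days * 24 * 60 // minutes_between_polls


polls = polls_per_month(10)
raw_bytes = polls * (request_len() + response_len(10))
print(polls, request_len(), response_len(10), raw_bytes)
```

Changing the register count or the poll interval shows quickly how the raw payload scales before any TCP/IP overhead is added.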

Sorry to throw some cold water on Joe's euphoria, but Joe still must pay for the TCP and IP header overhead. After all, the cell data network is in effect "tunneling" his TCP/IP and Modbus/RTU and so treats even the TCP and IP headers as billable payload. So Joe needs to consider that 4320 round-trip polls per month results in 8640 TCP/IP data packets and potentially another 8640 TCP acknowledge packets. Perhaps half of these TCP acknowledge packets will be merged with the TCP/IP data packet returning the Modbus/RTU response ... but then again maybe they won't. So to keep it simple and budget safe, Joe should assume worst case and that all 8640 TCP acknowledgements travel alone. Assuming each IP header is 20 bytes and each TCP header is another 20 bytes (they may be 28 if you use Linux), this amounts to another 17,280 times 40 bytes or 691K bytes (0.7MB) JUST for the theoretical TCP/IP overhead. Joe is up to 834K per month now - clearly a 1MB/mo or larger plan is required.

Ok, wait a second ... now why did I say "theoretical TCP/IP overhead"? Because in reality Joe will end up moving more TCP/IP traffic than the 4320 polls strictly require. The first extra overhead will be from premature TCP retransmissions. The high variable latency of cellular means Joe will see from 2% to 10% retransmissions, and since cellular is very reliable, each retransmission will result in duplicate TCP acknowledgements as well. Sticking to worst case, budget-safe assumptions, Joe should budget about 10% or 100K per month for premature TCP retransmissions. So now Joe is up to 934K per month.

However, there is yet another overhead Joe should budget for - TCP Keepalive probes to detect loss or death of the TCP socket. Without this, one end of the connection could go away and the other end would never know and never recover the socket resource. Since wide-area-networking is involved, Joe also needs to assume at least one intermediate device will abort and discard the TCP context if idle more than 5 minutes. Given Joe polls every 10 minutes, he'll need at least one TCP keepalive exchange between each poll. Each TCP keepalive exchange consists of another 40 plus 40 bytes, so we are talking 4320 x 80 bytes or another 346K of billable traffic. This puts Joe up to 1.28MB of billable traffic to move 143K of Modbus traffic.
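For anyone writing their own poller, here is a rough Python sketch of turning on TCP keepalive so a probe goes out before that assumed 5-minute idle limit. SO_KEEPALIVE is widely available; the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT tuning options are platform-specific, so the sketch only sets them where Python exposes them. The 240-second idle value and the function name are my own illustrative choices, not something from a Digi product.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 240,
                     interval_s: int = 30, probes: int = 4) -> None:
    """Turn on TCP keepalive, tuning the timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below are not present on every operating system.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idle time before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


poll_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(poll_socket)
```

Every probe and its reply is billable, so the idle time should be as long as the intermediate devices will tolerate and no shorter.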

Now, why not close and reopen the socket? Yes, that is an option, but each TCP close and reopen generates about 320 bytes - not including TCP retransmissions. So Joe can either pay for 346K worth of TCP Keepalive or 1.38MB of TCP socket thrashing, which brings his monthly total to 1.28MB or 2.32MB respectively.

So Joe is up near 1.5MB per month just to move his 10 registers of data once per 10 minutes, and this doesn't include any time he checks the web interface of his cellular device for status (say another 200-500K per access), nor does it include any on-demand HMI data access screens which trigger other Modbus/RTU polls. These could easily create many MB of traffic per month and require careful, mindful behavior by Joe and his colleagues to control costs. One careless person can easily drive the cellular bill up by hundreds of dollars in a month!

  • Raw Modbus/RTU data = 143K per month
  • Basic TCP/IP headers to move and acknowledge data = 691K per month
  • Estimated 10% premature retransmission = 100K per month
  • One TCP Keepalive exchange between 10 minute polls = 346K per month
  • Overall, Joe should expect at least 1.5MB per month and I'd suggest he budget for 3MB or even 5MB. This puts him up into the $20/month cell plan range.
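To play with the assumptions, the budget above can be put into a few lines of Python. This is only a sketch of the arithmetic in this post: the class and field names are made up, the 100,000-byte retransmission allowance is the rounded "about 10%" figure used above, and the 320 bytes per close-and-reopen is the rough estimate quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class PollBudget:
    polls: int = 4320                 # one poll per 10 minutes, 30 days
    request_bytes: int = 8            # Modbus/RTU request
    response_bytes: int = 25          # Modbus/RTU response, 10 registers
    header_bytes: int = 40            # IP (20) + TCP (20) per packet
    retransmit_budget: int = 100_000  # rounded ~10% allowance from the post

    def raw(self) -> int:
        return self.polls * (self.request_bytes + self.response_bytes)

    def headers(self) -> int:
        # Worst case: 2 data packets plus 2 stand-alone ACKs per poll.
        return self.polls * 4 * self.header_bytes

    def keepalive(self) -> int:
        # One keepalive probe and one reply between polls.
        return self.polls * 2 * self.header_bytes

    def reopen(self, bytes_per_cycle: int = 320) -> int:
        # Alternative to keepalive: close and reopen the socket each poll.
        return self.polls * bytes_per_cycle

    def total_with_keepalive(self) -> int:
        return (self.raw() + self.headers()
                + self.retransmit_budget + self.keepalive())

    def total_with_reopen(self) -> int:
        return (self.raw() + self.headers()
                + self.retransmit_budget + self.reopen())


joe = PollBudget()
print(joe.raw(), joe.headers(), joe.keepalive(), joe.total_with_keepalive())
```

Halving the poll interval roughly doubles every line except the fixed allowance, which is the quickest way to see where a plan size gets blown.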

Thursday, January 11, 2007

WiFi 802.11b Modbus Bridge - TCP to serial

- Want to query a movable Modbus slave on a conveyor?
- Want to roll a Modbus Master HMI around on a cart or AGV?
- Want to query your outdoor oxygen tanks without running a cable through the wall?

I was a bit surprised, yet pleased, to see our embedded team selected to port the Modbus Bridge function into the full line of low-end Digi Connect products. The latest 82001220_H2.bin firmware for the small, low-cost Digi Connect Wi-SP includes Digi's basic Modbus bridge functionality which allows network Modbus masters to share serial Modbus RTU and Modbus ASCII slaves. It is similar to the function of the Digi One IA and a subset of the Digi One IAP.

Protocol and transport combinations supported:
  • Modbus/TCP
  • Modbus/TCP via UDP/IP
  • Modbus/RTU via serial, TCP/IP or UDP/IP
  • Modbus/ASCII via serial, TCP/IP or UDP/IP

So for example, a Modbus/TCP OPC server could query a portable Modbus/RTU serial slave; such as a machine cell moved around as production mixes require. Or a serial Modbus/RTU master on a repair cart could wirelessly query network-based Modbus PLC and test devices.
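To make the Modbus/TCP-to-Modbus/RTU case concrete, here is a small Python sketch of the frame conversion any such bridge has to do: drop the 7-byte MBAP header and add a CRC-16 going one way, check and strip the CRC and rebuild the header going the other. This is based on the public Modbus frame formats only; it is not the Digi firmware, and the function names are mine.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """CRC-16 as used by Modbus/RTU (initial 0xFFFF, polynomial 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def tcp_to_rtu(frame: bytes) -> bytes:
    """Convert a Modbus/TCP frame into a Modbus/RTU frame."""
    if len(frame) < 8:
        raise ValueError("frame too short for MBAP header plus function code")
    _transaction, protocol, length, unit = struct.unpack(">HHHB", frame[:7])
    if protocol != 0:
        raise ValueError("not a Modbus/TCP frame")
    pdu = frame[7:7 + length - 1]     # length field counts unit id + PDU
    body = bytes([unit]) + pdu
    # The RTU CRC is sent low byte first.
    return body + crc16_modbus(body).to_bytes(2, "little")


def rtu_to_tcp(frame: bytes, transaction: int) -> bytes:
    """Convert a Modbus/RTU frame into a Modbus/TCP frame."""
    # Running the CRC over a good frame, CRC bytes included, yields zero.
    if len(frame) < 4 or crc16_modbus(frame) != 0:
        raise ValueError("bad CRC or short frame")
    unit, pdu = frame[0], frame[1:-2]
    return struct.pack(">HHHB", transaction, 0, len(pdu) + 1, unit) + pdu


# Read 10 holding registers from unit 1, transaction id 1.
tcp_request = bytes.fromhex("00010000000601030000000A")
rtu_request = tcp_to_rtu(tcp_request)
print(rtu_request.hex())
```

A real bridge also has to remember the transaction id while the serial slave answers, since the RTU frame has nowhere to carry it.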

The full datasheet for the Digi Connect Wi-SP has the details, but here is a feature summary:

  • Field selectable RS-232, RS-422, RS-485
  • Flexible 9-30vdc power supply (serial NOT isolated from power ground)
  • Standard 802.11b with WPA2/802.11i security
  • SSL/TLS strong security support
  • Digi RealPort support for comm-port redirection
  • Software development tools available for custom firmware development

Saturday, January 06, 2007

Real Numbers (Part 2) - Modbus/RTU over cellular

My previous post covered Modbus/RTU polls once per 30 seconds. Here is a second set of results for 24 hours with polling once every 60 seconds.
  • Of 1440 polls, only 3 failed responses with a 30 sec timeout.
  • Fastest poll was 451 msec round-trip (1/2 sec)
  • Slowest poll was 10,407 msec round-trip (10.4 sec)
  • Statistical average of successful polls was 1748 msec
  • In chart below, white dots are round-trip times and black line is the moving average over a 15-minute period.
  • Notice a few interesting trends; such as the very fast response around 11am and again around 3am the next morning.
[Chart: round-trip times over 24 hours]
(Click here to download ZIP file with Excel and OpenOffice spreadsheet of times)
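If you want to draw the same kind of trend line from your own poll log, a moving average takes only a few lines of Python. This sketch assumes one successful sample per minute, so a window of 15 samples covers 15 minutes; timed-out polls are assumed to have been filtered out first, and the numbers in the example call are only placeholders, not the logged data.

```python
from collections import deque


def moving_average(samples, window):
    """Return the running mean of the last `window` samples at each point."""
    recent = deque(maxlen=window)   # oldest sample drops off automatically
    averages = []
    for value in samples:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


# Placeholder round-trip times in msec, one per minute.
round_trips = [451, 1748, 10407, 1600, 1500]
print(moving_average(round_trips, window=15))
```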

Friday, January 05, 2007

Real Numbers - Modbus/RTU over cellular

I'm working on a fuller set of numbers, but here is a real-world example of Modbus/RTU poll-response times over cellular.

I am running a Modbus/TCP poller (ModScan32), which polls a Digi PortServer TS4H by Ethernet, and the TS4H in turn is using Modbus/RTU-in-TCP/IP via the Digi corporate backbone to poll a Digi Connect WAN on Cingular GSM with a Twido PLC on the serial port.

Why use the TS4H here? Why not go directly from a Windows computer to cellular? Well, three reasons:
1) The Digi TCP/IP stack is much more graceful over cellular than Windows' TCP/IP stack - the Windows stack retries too hard and wastes bandwidth that some patience would save. With Windows you'll commonly see 2 to 5 percent retransmissions which - given how cellular is VERY reliable - ends up doing nothing but create duplicated TCP acknowledgments you must pay for. This is actually a weakness in the design of TCP/IP; which proponents claim is SOLVED by TCP/IP as-is. TCP allows hosts to auto-adjust timing behavior to match real-world performance. Unfortunately, the standard TCP algorithms keep timing too close to the "average" behavior which wastes CASH over cellular links with high and variable latency.
2) I am testing a slave timeout algorithm in the Modbus Bridge code for the Digi One IAP and TSx family related to "stale" responses arriving after the slave timeout. This is a common weakness in Modbus/RTU hosts which assume either a timely response or NO response.
3) The Digi Modbus Bridge keeps nice timing statistics such as min/max/average round-trip delay and most Windows tools do NOT.
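To illustrate point 1, here is a Python sketch of the standard TCP retransmission timer calculation from RFC 6298 (smoothed RTT plus four times the RTT variance). It is the textbook algorithm, not the Windows or Digi stack, and real stacks differ in their minimum timer and other details. The sample round-trip times are made up to resemble a cellular link: a steady run near 1.5 seconds followed by one 9-second straggler.

```python
class RtoEstimator:
    """Retransmission timeout estimator following RFC 6298."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4           # variance multiplier

    def __init__(self, min_rto_ms: float = 1000.0, granularity_ms: float = 1.0):
        self.min_rto_ms = min_rto_ms
        self.granularity_ms = granularity_ms
        self.srtt = None
        self.rttvar = None
        self.rto = min_rto_ms

    def update(self, rtt_ms: float) -> float:
        """Feed one RTT measurement and return the new timeout in msec."""
        if self.srtt is None:
            # First measurement seeds both estimates.
            self.srtt = rtt_ms
            self.rttvar = rtt_ms / 2
        else:
            # The variance is updated using the SRTT from before this sample.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt_ms))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_ms
        self.rto = max(self.min_rto_ms,
                       self.srtt + max(self.granularity_ms, self.K * self.rttvar))
        return self.rto


estimator = RtoEstimator()
for _ in range(10):
    timeout = estimator.update(1500.0)   # ten steady samples
straggler_ms = 9142.0
print(round(timeout), straggler_ms > timeout)
```

After a steady run the timer settles close to the average, so a late but perfectly healthy response shows up after the sender has already retransmitted, and you pay for both copies.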

So for example, my Master polls once per 30 seconds. This means the GSM modem in the Digi Connect WAN maintains a constant data slot allocation with the cell tower. After 370 polls, 352 have had a round-trip of 2500 msec or less and 18 polls have had a round-trip above 2500 msec (ie: I have seen 18 timed out requests with a "stale response" arriving AFTER the timeout period - behavior I am investigating).

The Digi PortServer TS4H telnet trace includes this info:
01:38:15 IA INFO: mbrtu:s02 complete rsp min:467 avg:1565 max:9142 msec

This means the fastest poll took only 467 msec, the slowest took 9142 msec (over 9 seconds!) and the moving average round-trip time is about 1.5 seconds. So the minimum round-trip was only one-third of the average, while the maximum round-trip was nearly six times the average. Since every poll is exactly the same, one would NEVER see such variance on a direct RS-232 or RS-485 serial line. I'd be surprised if a PLC would have even a +/- 10% variance in response times. This is one of the problems with using off-the-shelf tools with cellular - the vendors have just NOT designed the tools for such variation in response performance.
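If you capture the telnet trace to a file, the statistics are easy to pull out with a script. A small Python sketch follows; the regular expression is guessed from the single trace line shown above, so other firmware versions may format it differently.

```python
import re

TRACE = "01:38:15 IA INFO: mbrtu:s02 complete rsp min:467 avg:1565 max:9142 msec"

# Pattern inferred from the one sample line; treat it as an assumption.
STATS = re.compile(r"min:(\d+) avg:(\d+) max:(\d+) msec")


def parse_stats(line: str):
    """Return min/avg/max round-trip times in msec, or None if absent."""
    match = STATS.search(line)
    if match is None:
        return None
    low, avg, high = (int(group) for group in match.groups())
    return {"min": low, "avg": avg, "max": high}


stats = parse_stats(TRACE)
min_ratio = stats["min"] / stats["avg"]
max_ratio = stats["max"] / stats["avg"]
print(stats, round(min_ratio, 2), round(max_ratio, 2))
```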

As a side note, the polls being less than 40-50 seconds apart is of interest here because the cell tower (with 2.5G GSM/CDMA) will take away the data slots from the modem if it has been idle too long - the time varies but is often in the 40-50 second range. When this happens, the modem is still connected to the tower but requires some control traffic to be reassigned bandwidth to move data. So using a poll rate slower than this idle time would shift the average round-trip time up. Once cellular systems make the move to 3G, this "idle period" will drop to a few seconds only, meaning telemetry systems may perform much WORSE in the new faster networks.

Wednesday, January 03, 2007

Rockwell Bridging - Ethernet to DF1

Question: We have a MicroLogix 1500 with only 1 serial port. What Digi product can we use to enable Ethernet access from RSLinx or a HMI display?

The Digi One IAP allows bridging AB protocols:
- Ethernet/IP Master (such as ControlLogix) can query DF1 PLC
- CSPv4 Master (such as RSLinx, PLC5E or SLC5/05) can query DF1 PLC
- DF1 encapsulated in TCP/IP (such as OPC server) can query serial DF1 PLC
- Modbus Master (TCP, RTU, or ASCII) can query DF1 PLC as-if a Modbus slave.

All of these can function concurrently, as the serial port is moving pure DF1.

The Digi One IAP cannot support DH485 because (like ProfiBus) the token rotation is too fast to be encapsulated over Ethernet successfully.

The general Rockwell Bridging is discussed in this PDF file:

Quick comparison of Digi One IAP to the 1761-NET-ENI:
Product                                    Digi One IAP             1761-NET-ENI
Ethernet/IP to DF1 FullDuplex              YES                      YES
CSPv4 (PLC5E protocol) to DF1 FullDuplex   YES                      no
Modbus to DF1 FullDuplex                   YES                      no
DF1 encapsulated in TCP/IP or UDP/IP       YES                      no
DF1 by port redirection                    YES                      YES
Supports DF1 Radio Modem                   YES (next release "H")   no
Maximum active Masters/Peers               64                       4
Configuration by Web                       YES                      no
Configuration by Telnet or SSH             YES                      no

Basically, the Digi One IAP does everything the 1761-NET-ENI does (plus much more) except:
  • The Digi One IAP does NOT handle CIP encapsulated within DF1, which is required only for RSLinx to CompactLogix RS-232 port
  • The Digi One IAP does NOT have emails triggered by PLC MSG blocks
  • RSLinx won't talk Ethernet/IP through the Digi One IAP - it will talk "Ethernet Driver" fine. This is *NOT* related to the existence of an EDS file. RSLinx talks via the 1761-NET-ENI because it is hard-coded to treat the ENI as special. There is no EDS file information which RSLinx examines to function with the ENI.