Problem with ping or connect to WIZ810MJ

I have a device using WIZ810MJ. On most networks it works fine. But when connected to a network which uses VRRP (Virtual Router Redundancy Protocol) I am having trouble pinging or connecting to the WIZ810MJ. If the ping is sent from a computer connected to the same switch as the WIZ810MJ, the ping works fine. But if the ping is sent from another computer on the network on a different subnet, the WIZ810MJ receives the ping and sends a reply, but the reply is not received. I determined via Wireshark captures that the WIZ810MJ is sending the ping reply to the WRONG MAC ADDRESS. Because the ping originates on a different subnet, the reply must be sent via the gateway. The WIZ810MJ has the gateway address set to 10.70.152.7. The MAC address associated with that gateway IP address is 00-00-5e-00-01-82. But the WIZ810MJ is sending the ping reply to MAC address ec-e5-55-67-d0-0d. This is NOT the MAC address of the gateway; it is the MAC address of the switch the WIZ810MJ is connected to. Upon further testing I discovered that other devices connected to the same switch, and having the same gateway IP address, could be pinged just fine. And these other devices (not using a Wiznet module) were sending their ping reply to the MAC address of the gateway (00-00-5e-00-01-82).

So why is the WIZ810MJ sending the ping reply to the wrong MAC address? It seems to do it ONLY on this particular network which is using Virtual Router Redundancy Protocol (and I know nothing about VRRP). And attempts to connect also fail because the reply to the connect request is also sent to the wrong MAC address.

Also, can anyone tell me if it is possible to get in touch with an actual knowledgeable technical support person from Wiznet? Sending an email to their support_team@wiznettechnology.com email address was utterly useless.

Can I get the captured file?

Also, Korea is now in a major holiday period (New Year’s Day), so the response may be delayed.

Seems like you must learn about this protocol. Wikipedia says:

A virtual router must use 00-00-5E-00-01-XX as its Media Access Control (MAC) address. The last byte of the address (XX) is the Virtual Router IDentifier (VRID), which is different for each virtual router in the network. This address is used by only one physical router at a time, and it will reply with this MAC address when an ARP request is sent for the virtual router’s IP address.
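As a quick illustration of that addressing rule, here is a minimal Python sketch that builds the virtual MAC from a VRID and, going the other way, extracts the VRID from a MAC. The function names are made up for this example. The gateway MAC reported in this thread, 00-00-5e-00-01-82, corresponds to VRID 0x82 (130).

```python
from typing import Optional

# Fixed first five bytes of an IPv4 VRRP virtual router MAC address.
VRRP_IPV4_PREFIX = (0x00, 0x00, 0x5E, 0x00, 0x01)


def vrrp_virtual_mac(vrid: int) -> str:
    """Return the virtual router MAC for a VRID in the range 1-255."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    return "-".join(f"{octet:02x}" for octet in (*VRRP_IPV4_PREFIX, vrid))


def vrid_from_mac(mac: str) -> Optional[int]:
    """Return the VRID if mac is a VRRP virtual MAC, otherwise None."""
    octets = tuple(int(part, 16) for part in mac.replace(":", "-").split("-"))
    if len(octets) == 6 and octets[:5] == VRRP_IPV4_PREFIX and octets[5] != 0:
        return octets[5]
    return None


print(vrrp_virtual_mac(0x82))              # gateway MAC from this thread
print(vrid_from_mac("00:00:5e:00:01:82"))  # VRID of that gateway
print(vrid_from_mac("ec:e5:55:67:d0:0d"))  # the switch MAC is not a VRRP MAC
```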

In your case W5100 does not use the virtual router’s MAC address, and to identify what is going on you must share the capture log as Ricky requested.

Guessing: there may be conflicting ARP responses, one saying that the gateway/router is ec-e5-55-67-d0-0d and another saying that it is 00-00-5e-00-01-82. While Windows and Linux PC drivers may know that they should use the virtual router’s special MAC address, W5100 may not know about it (or about VRRP in general).

Please post the Wireshark log here so that we can have a look.

The Wireshark capture file is attached. Pings from a local computer (same subnet) are in frames 1-5816. These local pings work fine. Pings from a remote computer (different subnet) start at frame 5817. These pings do not work because the reply is being sent to the wrong MAC address. The recorder containing the WIZ810MJ device being pinged is 10.70.154.81. The IP address of the gateway is 10.70.152.7. The MAC address of the gateway is 00-00-5e-00-01-82. The MAC address of the switch the recorder is connected to is ec-e5-55-67-d0-0d.

Unfortunately, this Wireshark capture does not include the ARP requests or replies. Those happened well before the capture was started. I will attempt to obtain another Wireshark capture containing the ARP data, but the recorder is located at a customer’s site and access to that site is highly restricted. It may be a few days before I can get that Wireshark data.
wireshark1.zip (206.9 KB)

Looking at pings 5817/5818 and 5935/5936.

The request came from ec-e5-55-67-d0-0d to 00-08-dc-01-18-96, and the reply went from 00-08-dc-01-18-96 to ec-e5-55-67-d0-0d.

Where’s the problem? The requester (ec-e5-55-67-d0-0d) got its request fulfilled. The question is why the requester that W5100 responds to does not go on to forward the response to the original sender.

Request has source address of 0a.22.07.f0 (10.34.7.240) and destination address 0a.46.9a.51 (10.70.154.81), and response has source address of 0a.46.9a.51 (10.70.154.81) and destination address of 0a.22.07.f0 (10.34.7.240). All IP header fields are properly set in both packets.

It seems the issue is in the first routing device W5100 is connected to, which is not forwarding the response to 10.34.7.240.

Eugeny, I’ll tell you where the problem is.
My recorder with Wiznet module is IP 10.70.154.81, MAC address 00:08:dc:01:18:96
My subnet mask is 255.255.252.0
My gateway address is 10.70.152.7 (the MAC address of that gateway is 00:00:5e:00:01:82)
I am connected to an ethernet switch with MAC address of ec:e5:55:67:d0:0d

The ping request came from 10.34.7.240. Of course the source MAC address was ec:e5:55:67:d0:0d, because that is the MAC address of the ethernet switch I am connected to. The ping reply has to go back to 10.34.7.240, which is not on my subnet. The reply has to be sent via the gateway. The gateway is 10.70.152.7, and the MAC address of the gateway is 00:00:5e:00:01:82. The reply needs to be sent to 00:00:5e:00:01:82, but it was sent to ec:e5:55:67:d0:0d instead. You see, the switch is not the gateway. The ping reply got sent to the switch, not the gateway. Of course the switch did not forward the response to 10.34.7.240. The response has to be sent to the GATEWAY to get forwarded.
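The forwarding decision described here can be sketched in a few lines of Python using the standard `ipaddress` module. This shows how a conventional IP host picks the next hop; it is not a description of the W5100 internals, and the function name `next_hop_ip` is made up for the example. The addresses are the ones from this thread.

```python
import ipaddress


def next_hop_ip(local_ip: str, netmask: str, gateway_ip: str, dest_ip: str) -> str:
    """Return the IP whose MAC should be resolved via ARP for dest_ip."""
    # strict=False lets us pass a host address plus mask and get its network.
    network = ipaddress.ip_network(f"{local_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dest_ip) in network:
        return dest_ip      # on-link: deliver directly to the destination
    return gateway_ip       # off-link: hand the packet to the gateway


# Recorder settings from this thread, replying to the remote pinger.
print(next_hop_ip("10.70.154.81", "255.255.252.0", "10.70.152.7", "10.34.7.240"))
```

With a mask of 255.255.252.0 the recorder's subnet is 10.70.152.0/22, so 10.34.7.240 is off-link and the next hop is the gateway at 10.70.152.7.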

The ping response will be sent to the device on the network which requested it. If the ping is routed, I guess the switch must substitute the source address with its own so that it receives the reply, and then forward it further according to the originator IP address in the IP header.

In order for W5100 to send ping reply to 00:00:5e:00:01:82, this MAC address must be present in SRC field of the Ethernet header when ICMP packet is received by the W5100.

I am attaching two more Wireshark capture files. For each of the captures I started with the recorder (containing the WIZ810MJ) powered off. The Wireshark capture was started, then the recorder was powered on. This lets me see every ethernet frame sent or received by the WIZ810MJ.

For the capture file in wireshark2.zip I had NTP time sync enabled in the recorder, so at powerup it requested time from the time server at 10.8.85.48 (not on my subnet). Capture frames 2887-2890 show an ARP request by my data recorder, followed by an ARP reply identifying the MAC address of the gateway (10.70.152.7) as 00:00:5e:00:01:82. Then an NTP time request is sent to 10.8.85.48 via the gateway at 00:00:5e:00:01:82. An NTP reply is received from 10.8.85.48. This is perfect. Everything worked exactly as expected.

For the capture file in wireshark3.zip I had disabled NTP time sync, and tried to ping the recorder (from a computer not on my subnet) after the recorder had booted up. Capture frames 3005-3006, 3124-3125, 3270-3271 and 3407-3408 show the ping requests and replies. The ping requests all came from 10.34.7.240. Of course the source MAC address was ec:e5:55:67:d0:0d because that is the MAC address of the switch I am connected to. But all the ping replies were sent back to ec:e5:55:67:d0:0d (the switch) instead of to the gateway (IP 10.70.152.7, MAC 00:00:5e:00:01:82). Note also the complete absence of any ARP from the WIZ810MJ! It never even tried to look up the MAC address of the gateway, even though the ping clearly came from an IP address outside of my subnet.

I think the behavior shown in wireshark3.zip is clearly wrong. Is there anything that can be done about this?
wireshark2.zip (31.8 KB)
wireshark3.zip (38.4 KB)

It seems W5100 blindly sends ICMP replies back along the same route the request arrived on, and I do not see why that should NOT work.

Which gateway IP address have you configured in the W5100?

Why does W5100 receive packets with the source MAC address set to ec:e5:55:67:d0:0d, when it should receive them with the MAC address 00:00:5e:00:01:82 (packets arriving from outside the subnetwork segment through the router)?

You stated it in the first post; even if the ping implemented in W5100 is so dumb that it just sends the ICMP reply back without thinking about subnets, TCP/IP must work properly anyway. In your case even TCP/IP does not work, which means there’s some configuration issue.

Take the time to draw a full diagram of the path from the ICMP packet originator to the W5100, stating all IP and MAC addresses, and then check whether you have configured the gateway IP address in the W5100 properly.

It would also help to track the ARP packets when W5100 tries to identify the MAC address of the configured gateway. Which MAC address is supplied for the gateway?

To review, when I attach my recorder (using WIZ810MJ) to my customer’s network and try to ping it from a remote location (not on the same subnet), all pings fail. The ping requester never sees a ping reply. However, when I replace my recorder with a different device the customer has (it happens to be an ethernet to serial port bridge), that device can be pinged successfully. Both devices (my recorder and the ethernet to serial port bridge) are set to the SAME IP address (10.70.154.81) and the SAME gateway address (10.70.152.7). Of course, only one of the devices is connected to the network at a time (and they were connected to the same port on the switch, using the same ethernet cable).

Bottom line, pings to the ethernet to serial port bridge device work fine. Pings to the recorder (using WIZ810MJ) fail every time.

I have already posted a Wireshark capture file (wireshark3.zip) showing data to and from the WIZ810MJ with pings failing. For comparison purposes, I captured Wireshark data to and from the ethernet to serial port bridge device (set to IP 10.70.154.81, gateway 10.70.152.7). That Wireshark capture file is here:
wireshark4.zip (33.5 KB)
The Wireshark capture was started prior to turning on the ethernet to serial port bridge so I could see everything, including any ARP requests.

At frame 357 there is a ping request. There was no reply because the unit under test was not turned on yet.

At frame 775 there was another ping request. Frame 776 had a ping reply, but sent to the wrong MAC address. I presume the unit under test was not yet fully initialized.

At frames 1167 and 1167 there was another ping request with a reply sent to the wrong MAC address.

At frame 1607 there was another ping request, but no reply (unit under test still initializing?)

At frame 2013 there is another ping request. This time the unit under test was initialized, and an ARP request was done seeking the MAC address of the gateway. The ARP reply identified the MAC address of the gateway (IP 10.70.152.7) as 00:00:5e:00:01:82. At frame 2016 the ping reply was sent to IP 10.34.7.240 using the correct MAC address of the gateway (00:00:5e:00:01:82). This ping reply was received by the remote computer.

Thereafter there is a series of ping requests/replies, with all of the replies being properly received by the remote computer. Note that in each case the ping request comes from IP address 10.34.7.240, MAC address ec:e5:55:67:d0:0d (the MAC address of the switch), but the ping reply is sent to IP address 10.34.7.240 using destination MAC address 00:00:5e:00:01:82 (the properly resolved MAC address of the gateway). This is 100% proper behavior, and the pings (and any remote connection requests) work perfectly.

But once again I must emphasize that neither pings nor remote connection requests work with the WIZ810MJ on this network. Why is that? It is because the WIZ810MJ is not resolving the MAC address of the gateway. It is making an invalid assumption that the MAC address where the reply should be sent is simply the MAC address of the request. This is NOT TRUE. The MAC address where the reply should be sent is the MAC address of the gateway, which must be resolved using ARP. The WIZ810MJ isn’t even bothering to do an ARP request at all to get the MAC address of the gateway.

So why is the MAC address of the gateway different from the source MAC address of the received ping or connection request? I’m not 100% sure, but I’ll bet you dollars to donuts that it is because this network is using VRRP (Virtual Router Redundancy Protocol). Once again, this is a customer’s network (a very large network at that), and I am not allowed to modify its configuration in any way. I am simply expected to make my product (using WIZ810MJ) work on his network.

Sadly, there seems to be a serious flaw in the WIZ810MJ with regard to implementation of the gateway.

For starters, I would like a detailed explanation from an engineer at Wiznet about how the gateway is implemented and under what conditions the resolved gateway address is and is not used.

Help! What can I do to make my device work on my customer’s network?

Actually, I suspected it works this way, but I could be wrong.

Everything you say sounds logical and clear. However, I still have a doubt. Looking at the wireshark3 log, I do not see any ARP requests or responses from/to W5100. The chip normally sends ARP requests to the configured router asking for its MAC address, at least when there is some activity directed at it (W5100). We MUST see at least several ARPs from 00-08-dc-01-18-96. I see zero of them, and that is not normal.

Where is Wireshark installed? Please draw the connection (wiring) diagram.

Edit: to get a proper log of what is going on between W5100 and the network, you must put a PC (Windows or Linux) running Wireshark in bridge mode between W5100 and the router. Depending on where your current Wireshark is located, it may not see the ARP exchange between W5100 and the router at ec-e5-55-67-d0-0d it is connected to.

Usual wiring is as follows:
WIZ810MJ<---------------->Switch (connects to other devices and rest of network)

For Wireshark captures, wiring is as follows
WIZ810MJ<------->| HUB  |<---->Switch (connects to other devices and rest of network)
Laptop<--------->|______|

I connected an ethernet hub (not a switch, a true old-fashioned hub) between the WIZ810MJ and the switch. The laptop running Wireshark was also connected to the hub, so the ethernet interface on the laptop is able to see ALL the traffic to and from the WIZ810MJ.

You are correct, there are NO ARP requests in wireshark3 data. And I believe this is unquestionably WRONG. There should be an ARP request to get the MAC address of the gateway, but there isn’t any!

Now here is the really interesting part. If I enable NTP time requests in my recorder, I see an ARP request to get the gateway MAC address as soon as the recorder powers up (see wireshark2 data for an example of ARP and NTP at powerup). But if I then try to ping the recorder from a remote computer it fails. The ping replies are sent to the MAC address of the switch, NOT the MAC address of the gateway. Never mind that the WIZ810MJ did an ARP request to get the MAC address of the gateway (it did this to send the NTP request), so the WIZ810MJ does know the MAC address of the gateway. The ping reply is still sent to the WRONG MAC ADDRESS (the MAC address of the switch instead of the MAC address of the gateway).

My conclusion is that the WIZ810MJ (i.e. W5100) does not properly implement the gateway. It is using the MAC address of the received ethernet frame (as opposed to the resolved MAC address of the gateway) when replying to IP addresses that are not on the current subnet.

Do you agree with my analysis? The question is, why is it doing this, and can it be fixed?

I hope it does not filter the ARP messages. The best way, as I said in my previous reply, is to set up this “laptop” with two Ethernet connectors as a bridge and monitor both sides: the W5100 side and the side looking at the switch.

That is not so straightforward. W5100 has a TCP/IP stack implemented in it, but it is a good question how it is driven. Let me explain. W5100 can operate at the level of its own implementation of sockets, and can respond to ping requests by itself (without intervention of the host/driver). However, a software/OS driver can manage the chip differently: set up its socket 0 in MACRAW mode, use its own software implementation of sockets, and respond to pings itself. At the MACRAW level the driver has control over the MAC layer data in the packets, and defines source and destination using MAC addresses rather than IP addresses.

It is a good question how you drive W5100 in your implementation and who is responding: W5100 using its built-in stack, or a driver. It may be the software driving W5100 that is not operating properly, not W5100 itself.

At this point, having read information in this thread, I am inclined to think that

  1. either you did not configure the gateway IP address properly in the common register set;
  2. or the software driver is operating incorrectly;
  3. or W5100 really has a bug. What we see is a real problem, but I am sure that if there were such a bug, people in different setups would have noticed it much earlier.

Re: 1 - can you dump gateway setting directly from the register set of W5100 to ensure it is set properly? Ideally dump all registers, if you have access to them through the software;
Re: 2 - give information how W5100 is being driven - which OS is running, who was developing drivers, and at which level W5100 is used;
Re: 3 - I think this situation needs to be reproduced in lab.
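For the register dump in point 1, a small decoding helper can make the raw bytes easier to check. The Python sketch below assumes the common register layout from the W5100 datasheet (GAR at 0x0001, SUBR at 0x0005, SHAR at 0x0009, SIPR at 0x000F) and a dump that starts at address 0x0000. The function name is hypothetical, and the sample bytes are constructed from the settings quoted in this thread; they are not a real dump from the device.

```python
import ipaddress

# Common register offsets, assumed from the W5100 datasheet.
GAR, SUBR, SHAR, SIPR = 0x0001, 0x0005, 0x0009, 0x000F


def decode_common_registers(dump: bytes) -> dict:
    """Decode gateway, mask, MAC and IP from a dump starting at 0x0000."""
    if len(dump) < SIPR + 4:
        raise ValueError("dump too short to contain SIPR")
    return {
        "gateway": str(ipaddress.IPv4Address(dump[GAR:GAR + 4])),
        "subnet_mask": str(ipaddress.IPv4Address(dump[SUBR:SUBR + 4])),
        "mac": ":".join(f"{b:02x}" for b in dump[SHAR:SHAR + 6]),
        "ip": str(ipaddress.IPv4Address(dump[SIPR:SIPR + 4])),
    }


# Illustrative bytes only: MR, then GAR, SUBR, SHAR, SIPR as set in this thread.
sample = bytes([0x00,
                10, 70, 152, 7,
                255, 255, 252, 0,
                0x00, 0x08, 0xDC, 0x01, 0x18, 0x96,
                10, 70, 154, 81])
print(decode_common_registers(sample))
```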

To answer your questions/concerns:

  1. Yes, I have a “backdoor” debug monitor in my application which allows directly accessing the memory mapped registers in the W5100. Yes, I have looked at all of the settings. Yes, the gateway is configured correctly. Please see the Wireshark data in wireshark2.zip. There is an ARP request to obtain the MAC address of the gateway. The MAC address of the gateway was resolved properly for the NTP time request (done via UDP). If the gateway was not configured properly, this could not have been done. And since the ARP request was captured by Wireshark, clearly the hub is NOT filtering out ARP requests. (Hubs cannot filter out ARP requests, otherwise networks built using hubs would not work).

  2. I am driving the W5100 using driver software based on the sample driver code from Wiznet. Basically I only modified a few hardware-access-related things to make it work efficiently with my STM32 Arm Cortex processor. I am using the socket-based interface, with the W5100 handling TCP and UDP protocols (different sockets use different protocols), and the W5100 handles all pings internally.

  3. I think the W5100 (and also the W3150A and the W5300) have a serious bug: improper handling of the gateway. In cases where traffic from a different subnet is routed through a physical router, the MAC address of that router can be used to forward replies back to the gateway. But in the case of VRRP, the MAC address of the gateway will NOT be the same as the MAC address of the device physically sending the frames to the W5100. Think about it. In the wireshark3 data the W5100 is clearly replying to pings that come from a remote computer on a different subnet, yet there is NO ARP whatsoever, and the W5100 ends up sending the reply to something other than the gateway’s MAC address.

We have used NM7010B+, WIZ810MJ and WIZ830MJ modules in various generations of our products for many years. They are all failing in the same way on this particular network.

Yes, I agree that engineers at Wiznet need to investigate this situation in their lab. How do I make that happen?

Thank you Eugeny for your input as always!
I’ll have this matter looked at as priority after the Chinese New Year.

Best,
Jake

Keith,

As mentioned above, I’ll have our staff review this case after the Chinese New Year.
I’m sorry that it is unfortunate timing.

Best,
Jake

I conclude the following:

  • if W5100 originates the packet, it sends an ARP request for the router and the packet proceeds to the right node on the network. This happens in log #2 with the NTP request;
  • if W5100 is not the originator of the packet but the responding device, it just responds to the sending MAC without checking whether the target node is outside the local subnetwork, assuming it got the request from the gateway and not from an intermediate device.
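The two behaviors can be put side by side in a short Python sketch. This is only a model of what the captures in this thread appear to show, not W5100 firmware logic; the function names and the dictionary standing in for an ARP cache are made up for the illustration.

```python
from typing import Dict, Optional


def mirrored_reply_mac(request_src_mac: str) -> str:
    """Reply to whatever MAC the request frame came from."""
    return request_src_mac


def routed_reply_mac(next_hop_ip: str, arp_cache: Dict[str, str]) -> Optional[str]:
    """Reply to the resolved MAC of the next hop; None means ARP is needed first."""
    return arp_cache.get(next_hop_ip)


# Values taken from this thread.
switch_mac = "ec:e5:55:67:d0:0d"      # source MAC seen on the ping request
gateway_ip = "10.70.152.7"            # next hop for the off-subnet pinger
arp_cache = {gateway_ip: "00:00:5e:00:01:82"}

print(mirrored_reply_mac(switch_mac))            # what the WIZ810MJ capture shows
print(routed_reply_mac(gateway_ip, arp_cache))   # what the other device does
```

On a network where the forwarding device uses the same MAC for both directions the two functions return the same address, which would explain why the difference only shows up on this VRRP network.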

I did a basic setup, and see that there is no ARP request before responding to the ping request, though my pinging device is on the same subnet.

In any case I assume you are looking for a quick fix (rather than replacing the chip or loading firmware into it, if that is possible at all). I see two possible ways:

  1. remove the intermediary device, connecting W5100 directly to the gateway router (through the simple hub you use, which does not substitute the MAC address of the originator);
  2. set up this intermediary device as a router/gateway for W5100. That will require exploring and changing this device’s settings. I cannot advise anything here, as I have no idea what the device (ec:e5:55:67:d0:0d) is. The vendor seems to be Hirschmann Automation, but I do not see this MAC address in Wireshark. By the way, there are several devices on the network whose MAC addresses start with ec:e5:55.

I feel I need to make a few things clear here.

  1. It is not just a single data recorder that is failing. My customer has no fewer than 40 data recorders (all using Wiznet modules) attached to his network. All of the recorders are failing. The failures started several years ago when the customer made some major upgrades to his network. Among other things, Nortel switches were replaced with Hirschmann switches. Apparently the VRRP protocol was also implemented at that time. I do not know all of the details about changes made to the network.

  2. The customer’s network is very large, and covers a large geographic area. The network, among other things, is part of a safety-critical control system. I am not allowed to touch, or reconfigure, ANY portion of the network. The only thing I can touch is the CAT5 cable connecting to the data recorder, and that can be done only under the direct supervision of the customer’s IT personnel.

  3. The customer expects these data recorders to just work. They are presently the ONLY pieces of equipment attached to his network that are NOT working. The customer is NOT happy. My company is likely to lose all future business with this customer if this problem is not resolved.

  4. Any “quick fix” involving any change to how our recorder is connected to the network, or any reconfiguration of the network is not possible.

I need a real, ultimate fix. The Wiznet modules need to be repaired (via a firmware update, which is likely not possible) or replaced with new modules that implement a fix to the gateway problem. I expect all Wiznet modules my company currently has in inventory to be replaced with properly functioning modules. I do not expect all modules in the field to be replaced at no charge, since they are likely out of warranty (however, replacing all the modules (approximately 40) at this one site sure would be a nice gesture on the part of Wiznet).

I did some research and tried to see how it works, and I think we are still in the dark. Looking at the Wireshark logs does not give much insight into the physical configuration of your network (no wiring diagram, as I asked for before), and we have no information on the W5100 register configuration. For example:

  1. why does a packet coming to W5100 from outside have some other switch’s MAC address (ec-e5-55-67-d0-0d), and not the VRRP router MAC address (00-00-5e-00-01-82) in the first place?
  2. if we assume that W5100 functions properly, you will see this behavior when you configure SUBR to 255.0.0.0 (or any other value) such that 0a.22.07.f0 and 0a.46.9a.51 appear on the same subnetwork.

Thus again, please provide a wiring diagram with the port IP address and MAC address identification, and a dump of the W5100 common and socket registers.

I did some further testing:

Configured W5100 with IP=192.168.1.2, mask=255.255.255.252, GW=192.168.1.1
Configured PC with IP=192.168.1.51, mask 255.255.255.0, GW=192.168.1.1
There’s another PC in bridge mode between them with Wireshark running.

In this configuration the PC and W5100 appear on different subnetworks. Ping works fine:

  • requesting PC sends request to bridge to its front-end MAC address;
  • bridge forwarding it to W5100 MAC address from its back-end MAC address;
  • W5100 replying to bridge’s back-end MAC address;
  • and bridge forwarding to original requester from its front-end MAC address.

During this process W5100 does NOT issue any ARPs, thus it does not differentiate between its own and a foreign subnetwork; it just sends the reply back.
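For reference, here is a small Python sketch of how each side of this test setup classifies the other, using only the addresses and masks listed above and the standard `ipaddress` module. The helper name `sees_as_local` is made up for the example.

```python
import ipaddress


def sees_as_local(own_ip: str, own_mask: str, other_ip: str) -> bool:
    """True if other_ip falls inside the subnet implied by own_ip/own_mask."""
    network = ipaddress.ip_network(f"{own_ip}/{own_mask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


# W5100 side: 192.168.1.2 with mask 255.255.255.252 looking at the PC.
print(sees_as_local("192.168.1.2", "255.255.255.252", "192.168.1.51"))
# PC side: 192.168.1.51 with mask 255.255.255.0 looking at the W5100.
print(sees_as_local("192.168.1.51", "255.255.255.0", "192.168.1.2"))
```

With these masks the W5100 treats the PC as off-subnet, while the PC treats the W5100 as on-subnet, so the two views are not symmetric in this test.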

We are one step away from calling this behavior a bug, and one thing must be clarified first: is there any RFC stating that a network device MUST send packets to the gateway when the destination IP address in the IP header identifies a host in another subnetwork?

In the situation we are troubleshooting, unless stated otherwise, I consider W5100’s behavior acceptable. The device which re-routes the ping request and uses MAC address ec-e5-55-67-d0-0d must be capable of forwarding the received ping reply back along the path the request came through.

The fact that other devices work in this circumstance by sending replies to 00-00-5e-00-01-82 does not mean that this behavior is required per RFC (though it is very useful in this circumstance).

In general, my custom device using W5100 has caught many issues in customer network configurations, because W5100 implements quite a minimal subset of the standard, while Windows and Linux PC drivers tend to implement more features covering different network configurations, and thus do not expose network configuration issues.