WIZ830MJ socket status 0x10

#1

I am using the WIZ830MJ module, which uses a W5300. Most of the time I am listening for a TCP/IP connection and am in the LISTEN state. When I need to connect to a server I first make sure no connection is up (still in LISTEN state), then I do a close, followed by a connect to establish an outgoing TCP/IP connection. I then wait for the socket status register to indicate either 0x17 (SOCK_ESTABLISHED) to indicate the connection has been established, or 0x00 (SOCK_CLOSED) to indicate the connection could not be established (refused or timed out, for example). Most of the time this works very well. However, every once in a while my system locks up and I cannot communicate with it over the ethernet port. This happens rarely, and it has taken me a while to figure out what is going on. It turns out that when my system locks up the socket status register (S0_SSR1 at address 0x209) has the value 0x10, and it stays that way forever until I power cycle my system.

What is socket status 0x10? This is not documented in the W5300 datasheet. What does it mean? How does it get into this state? What should I do to recover from this state?

#2

Hello,


0x10 means the SOCK_SYNSENT state. After the WIZ830MJ sends a SYN packet, the socket is presumed to remain in the SYNSENT state because the server has answered the SYN with neither an RST packet nor a SYN/ACK packet.

#3

Hello,

Or it could be the situation described in the Errata Sheet below.
See page 3.

#4

I don’t know where you got your information about socket status 0x10 being SOCK_SYNSENT, but that is wrong. On page 78 of the W5300 datasheet Version 1.3.3 (or page 75 of the W5300 datasheet Version 1.2.2) it clearly states that 0x15 is state SOCK_SYNSENT.

The Errata sheet looks more promising, but I am still confused about why the Errata sheet should be talking about socket status register values of 0x10 and 0x11 when neither of these values is mentioned at all in the W5300 datasheet. So I looked at my driver code I use to control the W5300. My driver code was derived directly from the sample driver code provided by Wiznet. I looked at my close() function, and found that the fix suggested in the Errata sheet was already in my code. This is not surprising, since the code came from Wiznet.

So, in spite of the Errata sheet fix already being in my code, my system is still locking up from time to time with a socket status register value unexpectedly stuck permanently at a value of 0x10.

Where do I go from here? I am still stuck.

#5

I checked the RTL code. 0x10 is SYN_SENT. Of course, 0x15 is also SYNSENT. As soon as the SYN is transmitted, it changes from 0x15 to 0x10.

#6

So what you are saying is that the Wiznet documentation is incomplete and inaccurate. I can believe that. But that still leaves me with a system with driver code that already has implemented the “fix” suggested in the errata sheet you referenced, but my system is still occasionally locking up with the socket status register stuck (apparently permanently) at 0x10. My question remains unanswered: Where do I go from here? What can I do to fix it?

#7

As I said, 0x10 is the SYN SENT state. Under normal circumstances, a timeout occurs because no response is received from the server and becomes a closed state . Or it receives a SYN ACK from the server and becomes an established state. So we decided that we did not need to present 0x10 to the document. Your situation is not normal. You should capture the traffic with Wireshark to see what happens when the WIZ830MJ sends a SYN.

#8

If you send the packet capture file by e-mail, I will check it.
becky@wiznet.io

#9

You say “So we decided that we did not need to present 0x10 to the document.”, yet on pages 78-80 of the W5300 datasheet (version 1.3.3) Wiznet did choose to document the following “temporary status” states of the socket status register: 0x15 SOCK_SYNSENT, 0x16 SOCK_SYNRECV, 0x18 SOCK_FIN_WAIT, 0x1B SOCK_TIME_WAIT, 0x1D SOCK_LAST_ACK, 0x01 SOCK_ARP. I cannot imagine why you would fail to document 0x10.
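For logging purposes, the status values quoted in this thread can be collected into a small lookup. This is only a sketch: the function name `sock_status_name` is made up for illustration, the table covers only the values mentioned in this thread (not the full datasheet list), and the label for 0x10 reflects what has been said here, since the datasheet gives it no name.

```c
#include <stdint.h>

/* Maps a socket status register value to a printable name.
 * Only the values quoted in this thread are listed; anything
 * else is reported as unknown. sock_status_name is a
 * hypothetical helper, not part of the Wiznet driver. */
const char *sock_status_name(uint8_t ssr)
{
    switch (ssr) {
    case 0x00: return "SOCK_CLOSED";
    case 0x01: return "SOCK_ARP";
    case 0x10: return "0x10 (undocumented)";  /* no datasheet name */
    case 0x15: return "SOCK_SYNSENT";
    case 0x16: return "SOCK_SYNRECV";
    case 0x17: return "SOCK_ESTABLISHED";
    case 0x18: return "SOCK_FIN_WAIT";
    case 0x1B: return "SOCK_TIME_WAIT";
    case 0x1D: return "SOCK_LAST_ACK";
    default:   return "UNKNOWN";
    }
}
```

Logging the decoded name whenever the status changes would at least make a stuck 0x10 stand out in field logs.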

You say “Under normal circumstances, a timeout occurs because no response is received from the server and becomes a closed state . Or it receives a SYN ACK from the server and becomes an established state.” Yes, that is what I expect. I call connect(), then wait for socket status to change to either SOCK_ESTABLISHED (indicating the connection was successful) or SOCK_CLOSED (indicating the connection failed because it was refused or timed out). This is what I expect to happen: either the connection is established or the socket is closed.

Is this the correct way to handle a connection (call connect(), then wait for SOCK_ESTABLISHED or SOCK_CLOSED)? It seems to be what is documented.

Yet sometimes I wait forever, getting neither SOCK_ESTABLISHED nor SOCK_CLOSED. Instead I see socket status 0x10 persisting forever. When this happens, my system is locked up and needs to be power-cycled.
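To make the failure mode concrete, here is a rough sketch of the wait loop with an upper bound added, so that a status that never reaches 0x17 or 0x00 is reported instead of blocking forever. Everything named here is hypothetical: `wait_for_connect`, the `read_ssr_fn` callback, and the `scripted_ssr` stand-in (which replays a list of values in place of the real register) are illustration only, and a real driver would more likely bound the wait with a hardware timer than with a poll count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback that returns the current socket status value. */
typedef uint8_t (*read_ssr_fn)(void *ctx);

enum connect_result { CONNECT_OK, CONNECT_FAILED, CONNECT_STUCK };

/* Polls the status until SOCK_ESTABLISHED (0x17) or SOCK_CLOSED (0x00)
 * appears, giving up after max_polls reads so the caller can recover. */
enum connect_result wait_for_connect(read_ssr_fn read_ssr, void *ctx,
                                     uint32_t max_polls)
{
    assert(read_ssr != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        uint8_t ssr = read_ssr(ctx);
        if (ssr == 0x17) return CONNECT_OK;      /* SOCK_ESTABLISHED */
        if (ssr == 0x00) return CONNECT_FAILED;  /* SOCK_CLOSED */
        /* Any other value (0x15, 0x10, ...) counts as still in progress. */
    }
    return CONNECT_STUCK;
}

/* Stand-in for the hardware: replays a fixed sequence of status
 * values and then keeps repeating the last one. */
struct scripted_ssr {
    const uint8_t *values;
    size_t len;
    size_t pos;
};

uint8_t scripted_read(void *ctx)
{
    struct scripted_ssr *s = ctx;
    uint8_t v = s->values[s->pos];
    if (s->pos + 1 < s->len) s->pos++;
    return v;
}
```

With this shape, a `CONNECT_STUCK` result gives the application a place to log the event and attempt whatever recovery is available, rather than hanging until a power cycle.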

I would be happy to provide you with a Wireshark capture of what happens when the WIZ830MJ sends a SYN, but I have a problem doing that. Most of the time it works as expected, giving me SOCK_ESTABLISHED (or in rare cases SOCK_CLOSED if the server is not available). I have attempted to duplicate the socket status 0x10 problem in my lab, and after months of testing have seen the problem appear exactly one time. But that doesn’t mean this is not a serious problem. My customer has hundreds of these systems installed in the field operating on a live network, and failures are happening at a rate of many failures per week. It is not practical for me to do Wireshark captures on all of the customer’s systems. Even if I wanted to, I would not be allowed to because the systems are operating on the customer’s live network, and I am not allowed to connect anything to that live network. The network is used for critical safety systems.

I think a better approach would be for you (or a more knowledgeable programmer) to review the code and figure out how the socket status can be stuck forever at 0x10. Then perhaps you or someone else can suggest a workaround that I can implement in my software to get around the bug in the WIZ830MJ. Clearly the workaround in the errata sheet does not fix the problem. I am asking for help. Can anyone at Wiznet help me?

#10

Hi~ @microaide

An Sn_SR value of 0x10 means that the SYN packet is being sent.
An Sn_SR value of 0x15 means that the SYN packet has been sent.

Normally you cannot see the value 0x10 because it is an internal state value.
But if a timeout occurs on a TCP socket or the socket is operated incorrectly, the value can sometimes become visible.

Could you please check your code?
After issuing a command (for example SEND), you have to confirm that Sn_CR has returned to 0.
If possible, could you please send the source code to @becky?
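The Sn_CR check can be sketched roughly as below. This is not the Wiznet driver code: `issue_command`, the `sock_cr_io` callbacks, and the `fake_cr` stand-in register are all hypothetical names used for illustration, and the poll limit is an added assumption so the check itself cannot hang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access hooks for one socket's command register. */
struct sock_cr_io {
    void    (*write_cr)(void *ctx, uint8_t cmd);
    uint8_t (*read_cr)(void *ctx);
    void    *ctx;
};

/* Writes a command to Sn_CR, then polls until the register reads 0.
 * Returns false if it has not cleared after max_polls reads. */
bool issue_command(const struct sock_cr_io *io, uint8_t cmd,
                   uint32_t max_polls)
{
    io->write_cr(io->ctx, cmd);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (io->read_cr(io->ctx) == 0) return true;
    }
    return false;
}

/* Stand-in register for trying the loop without hardware: it holds
 * the written command for a set number of reads, then reads as 0. */
struct fake_cr {
    uint8_t  value;
    uint32_t reads_until_clear;
};

void fake_write_cr(void *ctx, uint8_t cmd)
{
    struct fake_cr *r = ctx;
    r->value = cmd;
}

uint8_t fake_read_cr(void *ctx)
{
    struct fake_cr *r = ctx;
    if (r->reads_until_clear == 0) r->value = 0;
    else r->reads_until_clear--;
    return r->value;
}
```

A caller would treat a `false` return as a sign that the command was never accepted, and should not go on to poll the socket status as if it had been.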

When this state occurs, I can offer two ways to recover:

  1. System reset
  2. The UDP-based method in the Errata sheet (which @becky mentioned)

thanks,
irina
BR