Difficulty making client TCP connection.

I have an embedded app running on an STM32F107 that opens 4 TCP server sockets and handles the incoming and outgoing data. This works fine and has been in the field for a while.

I have been asked to add the ability to connect to these boxes from other boxes rather than from a browser. I tried using the remaining 4 sockets to open client connections, but the connection attempt always returns a timeout. Here is the relevant code:

Server setup is via this routine:

int32_t Process_TCP(uint8_t sn, void* ring_in_buf, void* ring_out_buf, uint16_t port)
{
  int32_t ret;
  uint16_t size = 0, sentsize=0;
  uint8_t* buf = gETH_RECV_BUF;
#ifdef _ETHERNET_DEBUG_
  uint8_t destip[4];
  uint16_t destport;
#endif


  switch(getSn_SR(sn))
  {
    case SOCK_ESTABLISHED :
      if(getSn_IR(sn) & Sn_IR_CON){
#ifdef _ETHERNET_DEBUG_
        getSn_DIPR(sn, destip);
        destport = getSn_DPORT(sn);
        myprintf("%d:Connected - %d.%d.%d.%d : %d\r\n",sn, destip[0], destip[1], destip[2], destip[3], destport);
#endif
        setSn_IR(sn,Sn_IR_CON);
      }
      if((size = getSn_RX_RSR(sn)) > 0){
#ifdef _ETHERNET_DEBUG_
        myprintf("%d:Received - %d bytes\r\n",sn, size);
#endif
        if(size > ETH_RECV_BUF_SIZE) size = ETH_RECV_BUF_SIZE;
        if(size > RingBuffer_GetFree((RINGBUFF_T*)ring_in_buf)) 
          size = RingBuffer_GetFree((RINGBUFF_T*)ring_in_buf);
        ret = recv(sn, buf, size);
         if(ret <= 0) 
          return ret;
        RingBuffer_InsertMult((RINGBUFF_T*)ring_in_buf, buf, size);
      } 
      size = RingBuffer_GetCount((RINGBUFF_T*)ring_out_buf);
      if(size > ETH_RECV_BUF_SIZE) size = ETH_RECV_BUF_SIZE;
      size = RingBuffer_PopMultNoRemove((RINGBUFF_T*)ring_out_buf, buf, size);
      sentsize = 0;
      if(size > 0)
      {
        if (1)
        ret = send(sn,buf+sentsize,size-sentsize);
        if(ret < 0){
          close(sn);
          return ret;
        }
        sentsize += ret; // Don't care SOCKERR_BUSY, because it is zero.
      }
      if(sentsize > 0){
        RingBuffer_RemoveMult((RINGBUFF_T*)ring_out_buf, sentsize);
#ifdef _ETHERNET_DEBUG_
        myprintf("%d:Sent - %d bytes\r\n",sn, sentsize);
#endif
      }
      return 0;

    case SOCK_CLOSE_WAIT :
#ifdef _ETHERNET_DEBUG_
      myprintf("%d:CloseWait\r\n",sn);
#endif
      if((ret=disconnect(sn)) != SOCK_OK) 
        return ret;
#ifdef _ETHERNET_DEBUG_
      myprintf("%d:Socket closed\r\n",sn);
#endif
      break;

    case SOCK_INIT :
#ifdef _ETHERNET_DEBUG_
      myprintf("%d:Listen, TCP server, port [%d]\r\n",sn, port);
#endif
      if( (ret = listen(sn)) != SOCK_OK) 
        return ret;
      break;

    case SOCK_CLOSED:
#ifdef _ETHERNET_DEBUG_
      myprintf("%d:TCP server start\r\n",sn);
#endif
      if((ret=socket(sn, Sn_MR_TCP, port, 0x00)) != sn)
        return ret;
#ifdef _ETHERNET_DEBUG_
      myprintf("%d:Socket opened\r\n",sn);
#endif
      break;

    default:
      break;
  }
  return 1;
}

This works fine, bringing the socket from closed to listening to sending and receiving.

When I try to open a client, it fails. Here is the code (explanation following the code):

void ClientSockets(uint8_t sn2, void* ring_out_buf)
{
  uint16_t size = 0, sentsize=0;
  uint8_t* buf = gETH_RECV_BUF;
  uint8_t destip[4];
  uint16_t port = ports[sn2];
  int32_t ret;

  size = RingBuffer_GetCount((RINGBUFF_T*)ring_out_buf);
  if(size > ETH_RECV_BUF_SIZE) size = ETH_RECV_BUF_SIZE;
  size = RingBuffer_PopMultNoRemove((RINGBUFF_T*)ring_out_buf, buf, size);
  sentsize = 0;
  while(size != sentsize)
 if(size > 0)
 {
    uint8_t res;
    if (!SocketMade[sn2])
    {
      getSIPR(destip); // get own IP address
      destip[3] ^= 1; // this will be the partner.  Connect to partner as client
      uint8_t ss = getSn_SR(sn2);
      if (ss == SOCK_CLOSED)
      {
        res = socket(sn2, Sn_MR_TCP, 0xC000, 0x00);
#ifdef _ETHERNET_DEBUG_
        myprintf("%d:Opened socket %d on port %d\r\n",sn2, sn2, port);
#endif
      }
      if (res == sn2)
      {
        res = connect(sn2,destip,port);
      }
      if (res == SOCK_OK)
      {
#ifdef _ETHERNET_DEBUG_
        myprintf("Connection succeeded\r\n");
#endif
        SocketMade[sn2] = true;
      } else
      {
#ifdef _ETHERNET_DEBUG_
        myprintf("Connection failed error code = %x\r\n", res);
#endif  
      }
      ret = send(sn2,buf+sentsize,size-sentsize);
      if(ret < 0)
      {
        close(sn2);
        return;
      }
      sentsize += ret; // Don't care SOCKERR_BUSY, because it is zero.
    }
    if(sentsize > 0)
    {
      RingBuffer_RemoveMult((RINGBUFF_T*)ring_out_buf, sentsize);
    }
#ifdef _ETHERNET_DEBUG_
    myprintf("%d:Sent - %d bytes\r\n",sn2, sentsize);
#endif
    return;
  }
}

The destination is always the IP address that “pairs” one box with another. For example, 192.168.1.90 will always try to talk to 192.168.1.91 and vice versa. They will open two sockets to converse between them. While debugging, I have removed the line

" destip[3] ^= 1; // this will be the partner. Connect to partner as client"

thus trying to send to myself.
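The pairing rule can be checked on its own. Below is a minimal sketch, assuming the pairing is done purely by flipping the lowest bit of the last octet as in the line quoted above; `partner_ip` is a hypothetical helper name, not something from the original code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: derive the partner address by flipping the lowest
 * bit of the last octet, so .90 <-> .91, .92 <-> .93, and so on. */
static void partner_ip(const uint8_t own[4], uint8_t partner[4])
{
    memcpy(partner, own, 4);   /* same network, same first three octets */
    partner[3] ^= 1u;          /* even address pairs with the next odd one */
}
```

Note that this only pairs an even last octet with the odd one directly above it, so 192.168.1.91 and 192.168.1.92 would not be partners.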

The line which opens the socket has gone through several forms, this is the last one I tried:

    res = socket(sn2, Sn_MR_TCP, 0xC000, 0x00);

sn2 is usually 4 in my case, sometimes 5. Sockets 0-3 are used by the 4 server sockets; 4 and 5 are trying to send to sockets 1 and 2 (socket 0 is for control, socket 3 is currently not used). Like the server sockets, the client socket is TCP. I have tried both 0xc000 (all ports) and setting a specific port here, with 0 for the flags, as I want a blocking socket.

This seems to always succeed, with the value of res being sn2. However, the call to

    res = connect(sn2,destip,port);

always fails, with 0xF3 (timeout) as the return code.
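Since `res` is declared as `uint8_t` in the code above, a negative error code shows up as a large unsigned byte when printed with `%x`. The sketch below shows how such a byte could be read back as a signed code. The value -13 for the timeout error is an assumption chosen to match the 0xF3 reported here, and the `EX_` and helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed value, chosen to match the 0xF3 "timeout" code reported above. */
#define EX_SOCKERR_TIMEOUT (-13)

/* Reinterpret a status byte that was stored in a uint8_t as a signed code. */
static int8_t as_signed_status(uint8_t raw)
{
    /* Explicit two's-complement conversion, portable across compilers. */
    return (raw >= 0x80u) ? (int8_t)((int)raw - 256) : (int8_t)raw;
}

/* Return 1 if the stored byte corresponds to the assumed timeout code. */
static int is_timeout(uint8_t raw)
{
    return as_signed_status(raw) == EX_SOCKERR_TIMEOUT;
}
```

Declaring `res` as a signed type in the first place would avoid the need for this conversion.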

I can connect to the server socket from, say, TeraTerm, so I know the problem is with the client socket.

Please edit your post so that code looks properly.

Reusing same local port # and remote port # within several minutes may cause issues when connecting (new session may be treated as old one by router and router may respond with RST); consider using some random local socket #, or use anysocket++ (or whatever it is defined in the library).
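One way to avoid reusing the same local port is to hand out a different one for each connection attempt. This is only a sketch: the range 0xC000 to 0xFFFF is an assumption (the usual dynamic port range), and `next_local_port` and the `EX_` names are hypothetical, not part of the library.

```c
#include <stdint.h>

/* Assumed range: the dynamic/private ports, 49152 (0xC000) to 65535. */
#define EX_PORT_FIRST 0xC000u
#define EX_PORT_LAST  0xFFFFu

static uint16_t ex_next_port = EX_PORT_FIRST;

/* Return a different local port on each call, wrapping within the range. */
static uint16_t next_local_port(void)
{
    uint16_t port = ex_next_port;

    ex_next_port = (ex_next_port == EX_PORT_LAST)
                       ? (uint16_t)EX_PORT_FIRST
                       : (uint16_t)(ex_next_port + 1u);
    return port;
}
```

The result would then be passed as the port argument when opening the client socket, for example `socket(sn2, Sn_MR_TCP, next_local_port(), 0x00)`.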

Did you set gateway IP address properly? In server mode it does not matter much (as I know), in client mode it is vital because W5500 sends ARP requests to gateway to know other device’s MAC address, and if gateway does not respond properly connect process times out.

Hello Eugeny, thank you for the quick response. I’ll answer in parts:

“Please edit your post so that code looks properly.”

I tried to find a way, but the only editing control I found doesn’t seem to let me edit the text, only the title.

“Reusing same local port # and remote port # within several minutes may cause issues when connecting (new session may be treated as old one by router and router may respond with RST); consider using some random local socket #, or use anysocket++ (or whatever it is defined in the library).”

These are new sessions, not old ones. The code opens the first sockets (server sockets) when it comes up. When the device has something to send on one of the sockets, it tries to open a client socket for the first time, and they all fail.

“Did you set gateway IP address properly? In server mode it does not matter much (as I know), in client mode it is vital because W5500 sends ARP requests to gateway to know other device’s MAC address, and if gateway does not respond properly connect process times out.”

I’ll double check. There is not really a gateway here, as the connections are local (192.168.1.xx via a hub connecting them). If this is an issue, please explain what I might need to do to implement it.

Thanks,

Steve

Found the editor…

This is very good news. Connect a Windows or Linux PC to this hub and run Wireshark on it. You will be able to capture all packets on the segment and see what is going on.

When the W5500 is given the connect command, it tries to resolve the remote device’s IP address to a MAC address by sending an ARP broadcast. I think there must be some rules about which devices respond and how, but I think that if anyone on the network knows the MAC address for that IP address (in its cache), it will respond to the request.

I am not sure, but it may be that the W5500 has no ARP cache and does not respond to the ARP request (even for its own IP/MAC combination?). As no other device knows the MAC address corresponding to the IP address, connecting one W5500 to another W5500 would then fail…

You can confirm or disprove this by looking at the Wireshark capture.

I was wrong about this. I have just set a W5100 up with gateway address 0.0.0.0 and tried to connect to a local PC. The W5100 sent a broadcast, the PC responded, and the connection was successful.

And vice versa: I configured the PC with a fixed IP address and no gateway IP address, the PC sent an ARP request, and the W5100 responded with its MAC address.

I am afraid we need a log from Wireshark to see exactly what is going on.

Again thanks -

I will be able to do this Thursday morning. Until then I have only one box and I’m testing it against itself (setting the IP address to myself, so I send from my clients to one of my own servers - and I’m guessing that it won’t go out on the wire at all).

Hi Eugeny,

Here is what I’m seeing:

I have the client box set at IP 192.168.2.91. This is the one I’m debugging. I have the server box set at IP 192.168.2.90. My PC is at 192.168.2.95, running both Wireshark and, later on, TeraTerm.

At first, I had the gateway set to a nonexistent 192.168.2.1. I saw that the call to “connect” generated this in Wireshark:
535 789.643971 AsustekC_15:05:76 Broadcast ARP 42 Who has 192.168.2.1? Tell 192.168.2.95

I am not sure why it was asking for the reply to go to the PC, but I know that nothing will answer from the nonexistent gateway. So I set the gateway to 0.0.0.0 on the client box. It now generates this:

248 296.352602 Wiznet_ab:cd:ee Broadcast ARP 60 Who has 192.168.2.90? Tell 192.168.2.91

However, to this there is no response.

The PC is able to connect to the server box. When I use TeraTerm to telnet to port 5001 on the server box, this happens:

302 422.122989 192.168.2.95 192.168.2.90 TCP 66 58442 → 5001 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM=1
303 422.123146 192.168.2.90 192.168.2.95 TCP 60 5001 → 58442 [SYN, ACK] Seq=0 Ack=1 Win=2048 Len=0 MSS=1460

And the connection is made, and I can send data to the server box. There does not seem to be anything doing ARP.

The lines of code I have for making the client connection are, again,

    res = socket(sn2, Sn_MR_TCP, 0xC000, 0x00);

followed by

    res = connect(sn2,destip,port);

sn2 is either 4 or 5.
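For comparison with the server routine, the client side can also be driven from the socket status, so that `connect` is only issued once the socket has actually been opened. The following is a sketch of that decision step only. The status values are assumptions mirroring the W5500 Sn_SR codes and should be checked against the library header; the `EX_` and `ACT_` names and `client_next_action` are hypothetical.

```c
#include <stdint.h>

/* Assumed status values, mirroring the W5500 Sn_SR codes; check them
 * against the library's own header before relying on them. */
enum {
    EX_SOCK_CLOSED      = 0x00,
    EX_SOCK_INIT        = 0x13,
    EX_SOCK_ESTABLISHED = 0x17,
    EX_SOCK_CLOSE_WAIT  = 0x1C
};

typedef enum {
    ACT_OPEN,        /* call socket()           */
    ACT_CONNECT,     /* call connect()          */
    ACT_TRANSFER,    /* call send() / recv()    */
    ACT_DISCONNECT,  /* call disconnect()       */
    ACT_WAIT         /* transitional, try later */
} client_action_t;

/* Map the current socket status to the next step a TCP client should take. */
static client_action_t client_next_action(uint8_t status)
{
    switch (status) {
    case EX_SOCK_CLOSED:      return ACT_OPEN;
    case EX_SOCK_INIT:        return ACT_CONNECT;
    case EX_SOCK_ESTABLISHED: return ACT_TRANSFER;
    case EX_SOCK_CLOSE_WAIT:  return ACT_DISCONNECT;
    default:                  return ACT_WAIT;
    }
}
```

In use, the caller would pass `getSn_SR(sn2)` and act on the result once per pass through the main loop, much as `Process_TCP` does for the server sockets.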

Hi Eugeny,

Changed the parameter in the call to socket from 0xC000 to the port used to send on (sending from port 5001 or 5002 to port 5001 or 5002, port to port), and it now seems to be working. I’ll have to continue testing, but now the client connection is succeeding.

Thanks,

Steve

It seems the first step is to find out the gateway’s MAC address.

And if the gateway is not configured, or as a second step, it just asks everyone for the destination’s MAC address.

I suspect the PC may have performed ARP earlier if you were connecting before, so the MAC-IP address entry is cached in the PC’s ARP table. The W5x00 does not have this table, so it explicitly sends ARP before any connect request.
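For reference, the conventional IPv4 rule for choosing which address to resolve with ARP is: the destination itself if it is on the local subnet, otherwise the gateway. Whether the W5x00 follows exactly this rule is an assumption that this thread does not confirm; the sketch below only illustrates the conventional rule, and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if a and b are on the same subnet under the given mask. */
static int same_subnet(const uint8_t a[4], const uint8_t b[4],
                       const uint8_t mask[4])
{
    for (int i = 0; i < 4; i++) {
        if ((a[i] & mask[i]) != (b[i] & mask[i]))
            return 0;
    }
    return 1;
}

/* Pick the address to resolve with ARP: the destination when it is
 * on-link, otherwise the configured gateway. */
static void arp_target(const uint8_t own[4], const uint8_t mask[4],
                       const uint8_t gw[4], const uint8_t dest[4],
                       uint8_t out[4])
{
    memcpy(out, same_subnet(own, dest, mask) ? dest : gw, 4);
}
```

Under this rule, 192.168.2.91 with mask 255.255.255.0 connecting to 192.168.2.90 would ask for 192.168.2.90 directly, regardless of the gateway setting.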

Is this the source port of the client? If you change from 0xc000 to 5000 and it starts working, then most probably something is blocking packets with a local port # of 0xc000. This is a guess; Wireshark should show what is going on in the network when the client tries to connect with local port 0xc000 or 5000.