Acceptable performance of the W5500

I have been writing a small TCP server to run on an STM32F4 with the W5500:

  • CPU: 168 MHz
  • SPI: 21 MHz
  • single socket 0

I have a small TCP client on Linux that sends traffic to the W5500 and receives the replies:

while(1) {
send 1460 bytes
wait_to_receive reply (1450 bytes from W5500)
}

I can manage to transfer around 5 Mbps both ways.

The problem I’m having is that it can’t sustain this for very long. I’ve seen it fail after 10,000+ packets sent/received; sometimes it goes all the way up to 180,000+ packets before I get a watchdog timeout (set at 2000 ms) on the STM32, which resets everything. If I’m lucky, the socket just closes and the transfer stops without crashing the MCU/W5500. In that case, I read the following values from the status registers:

IR = 0x00
SIR = 0x01
Sn_IR = 0x14

If the MCU doesn’t crash, I can get the stats that I constantly capture

In the image above, I was able to transfer ~90,000 pkts and ~130 Mbytes before the TCP connection broke.

Everything is interrupt driven, with a “state machine” to service the W5500. I have a counter that counts the number of times an IRQ is triggered before the previous IRQ is finished, and it seems to stay at zero, i.e. the STM32 is fast enough to service the Rx/Tx interrupts before new data arrives from the Linux client.

I’m not suspecting a code issue since it varies with each traffic test. I’m more inclined to believe the W5500 is overheating causing it to hang or something but there is no way for me to check since the chip is not accessible for a “touch monitoring” with my finger.

I also tried putting a delay in the Linux TCP client to slow it down to maybe 500 Kbps, and it still crashes sometimes after a few hours.

SIR = 0x01 means an interrupt is pending for socket 0, and Sn_IR = 0x14 is SEND_OK (0x10) plus RECV (0x04). You have received data that you probably did not read.
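For anyone decoding these values by hand, here is a minimal sketch of how the Sn_IR byte breaks down. The bit masks are the ones listed in the W5500 datasheet; the helper name `sn_ir_describe` is made up for this post and is not part of any WIZnet library.

```c
#include <stdint.h>
#include <stdio.h>

/* Sn_IR bit masks as listed in the W5500 datasheet. */
#define SN_IR_CON     0x01u
#define SN_IR_DISCON  0x02u
#define SN_IR_RECV    0x04u
#define SN_IR_TIMEOUT 0x08u
#define SN_IR_SEND_OK 0x10u

/* Hypothetical helper: writes the names of the set flags into out,
 * separated by '|', and returns how many flags were written. */
static int sn_ir_describe(uint8_t sn_ir, char *out, size_t out_len)
{
    static const struct { uint8_t mask; const char *name; } flags[] = {
        { SN_IR_SEND_OK, "SEND_OK" },
        { SN_IR_TIMEOUT, "TIMEOUT" },
        { SN_IR_RECV,    "RECV"    },
        { SN_IR_DISCON,  "DISCON"  },
        { SN_IR_CON,     "CON"     },
    };
    int count = 0;
    size_t used = 0;

    if (out_len > 0)
        out[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if ((sn_ir & flags[i].mask) == 0)
            continue;
        int n = snprintf(out + used, out_len - used, "%s%s",
                         count ? "|" : "", flags[i].name);
        if (n < 0 || (size_t)n >= out_len - used)
            break;                      /* output buffer too small */
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Calling it with 0x14 and a 32-byte buffer reports two flags, SEND_OK and RECV, which matches the reading above.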

You wrote a lot of text which originally confused me, but after reading it several times I see that your real problem is that you most probably did not get the RECV interrupt on the INT pin. It happens intermittently, which explains why you have no pattern to catch it.

I propose that you revise the ISR code, especially the interrupt flag management logic. When you enter the ISR, you must clear the flag you are about to process and not touch it again until you exit the ISR; the ISR will be called again if the flag got set during your previous ISR processing. Start with this. There are more advanced techniques, but they need more processing power and carry more risk.

I did not realize I can clear the Sn_IR_RECV bit before doing all the READ_RECEIVE_DATA routine

I currently have it after updating the RD0 and RD1 read pointer registers. I have now moved the clearing of Sn_IR_RECV to right after reading the Sn_IR register. That clears the bit before I retrieve the received data from the W5500, which can take a long time in computer terms.
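To illustrate why the ordering matters, here is a toy model in C. It is not driver code: `mock_socket` and all the `mock_*` and `service_*` functions are invented for this sketch, and the only W5500 fact it relies on is that RECV is bit 0x04 of Sn_IR. It simulates a packet that arrives while the previous one is still being read out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SN_IR_RECV 0x04u

/* Toy model of one socket (an assumption for illustration only). */
struct mock_socket {
    uint8_t  sn_ir;       /* pending interrupt flags */
    unsigned rx_pending;  /* packets waiting in the RX buffer */
};

static void mock_packet_arrives(struct mock_socket *s)
{
    s->rx_pending++;
    s->sn_ir |= SN_IR_RECV;
}

static void mock_clear_ir(struct mock_socket *s, uint8_t mask)
{
    s->sn_ir &= (uint8_t)~mask;
}

/* Drains one packet; if race is true, a new packet lands mid-read. */
static void mock_read_one(struct mock_socket *s, bool race)
{
    if (s->rx_pending > 0)
        s->rx_pending--;
    if (race)
        mock_packet_arrives(s);
}

/* Old order: read the data first, clear RECV afterwards. */
static void service_clear_late(struct mock_socket *s, bool race)
{
    mock_read_one(s, race);
    mock_clear_ir(s, SN_IR_RECV);   /* also wipes the flag of the new packet */
}

/* New order: clear RECV right after reading Sn_IR, then read the data. */
static void service_clear_early(struct mock_socket *s, bool race)
{
    mock_clear_ir(s, SN_IR_RECV);
    mock_read_one(s, race);         /* a new packet sets RECV again */
}

/* True when data is waiting but no flag is left to announce it. */
static bool mock_is_stuck(const struct mock_socket *s)
{
    return s->rx_pending > 0 && (s->sn_ir & SN_IR_RECV) == 0;
}
```

In this model, `service_clear_late` with a racing packet ends with data in the buffer and no RECV flag, while `service_clear_early` leaves RECV set for the new packet.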

I really don’t have anything in my ISR; most of the processing is in the loop. It’s pretty straightforward:

ISR (falling edge trigger of IRQ pin) {
set W5500_flag
}

while(1) {
if(W5500_flag) {
W5500_flag = false;
stateM = true;
starting_state = 1;
}

if(stateM)
{
 switch(starting_state)...
case xxx...
case finish:
    stateM = false;
}
}

Then stop using interrupts at all. Poll the interrupt register or the receive size register instead. This way you will save on interrupt management time.
But anyway, the main point here is that you clear the interrupt condition before working on its consequences. If you clear the flag only after you have finished with the current interrupt, you risk clearing a new interrupt as well and getting stuck.
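If you go the polling route, a minimal sketch of reading the receive size register could look like this. The W5500 datasheet recommends reading Sn_RX_RSR repeatedly until two consecutive reads agree, because the 16-bit value can change between the two byte reads. The callback type `read_rsr_fn`, the function `rx_rsr_stable` and the `fake_rsr` test double are assumptions made up for this example; substitute your own SPI register read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback that returns one 16-bit read of Sn_RX_RSR. */
typedef uint16_t (*read_rsr_fn)(void *ctx);

/* Reads until two consecutive values match, giving up after max_tries
 * extra reads and returning the most recent value in that case. */
static uint16_t rx_rsr_stable(read_rsr_fn read_rsr, void *ctx,
                              unsigned max_tries)
{
    uint16_t prev = read_rsr(ctx);

    for (unsigned i = 0; i < max_tries; i++) {
        uint16_t cur = read_rsr(ctx);
        if (cur == prev)
            return cur;
        prev = cur;
    }
    return prev;
}

/* Test double standing in for the SPI read: replays a fixed sequence
 * and keeps returning the last entry once the sequence is exhausted. */
struct fake_rsr {
    const uint16_t *seq;
    size_t len;
    size_t pos;
};

static uint16_t fake_read_rsr(void *ctx)
{
    struct fake_rsr *f = ctx;
    uint16_t v = f->seq[f->pos];

    if (f->pos + 1 < f->len)
        f->pos++;
    return v;
}
```

In the main loop you would call `rx_rsr_stable` on every pass and start the receive routine whenever the result is nonzero.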

I made a few changes to my routines, and that greatly improved the stability of the program:

  • clearing the Sn_IR_RECV flag before retrieving the received data from the W5500 Rx buffer
  • on the last state of my “state machine”, I read the IRQ pin once more, and if still LOW, I set a flag to trigger the state machine on the next loop. I figured reading the IRQ input pin is faster than doing an SPI read of the SIR/IR registers
  • in one of my timer routines (the one that feeds the watchdog every second), I inserted code to query the IRQ pin and, if it is LOW, check whether the state machine is active. If it’s active, I just ignore it. But if the pin is LOW and the state machine is not active, I set a flag to trigger it

I’m keeping the INT in case I want to put the MCU to sleep and just wake it to feed the watchdog or a signal from W5500 to receive/transmit data

So far in my testing, the only “crashes” I’m seeing now are triggered by the MCU watchdog. But I have a feeling that has something to do with using an AliExpress W5500 module. I’m ordering a couple of WIZ850io modules from Mouser shortly.

If the IRQ is active, the ISR will automatically be called again. What is the reason for spending time reading it?

Shouldn’t you set up the MCU to wake from sleep when it gets the interrupt?

The STM32 is set up to detect the falling edge of the W5500 IRQ pin. The ISR just sets a flag:

ISR(falling_edge_of_W5500_INT_pin)
{
	clear_stm32_irq_flag;

	w5500_flag = true;
}

Most of the work is done in the state machine loop

while(true)
{
	if(w5500_flag) {
		w5500_flag = false;
		state_machine = true;
	}
	
	if(state_machine)
	{
		switch (states)
		{
			case 1:
			case 2:
			case 3:
			....
			case finished:
				state_machine = false;
				
				if(w5500_int_pin == low) {
					w5500_flag = true;
				}
			default:
		}
	}
}

All of the W5500 INT flags are handled (cleared) pretty much right after querying the Sn_IR register. The last query of w5500_int_pin is there to detect whether the INT pin went low again after the flags were cleared but before the state machine finished.
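Here is a toy model of why I keep the final pin check with a falling-edge trigger. It assumes the INT pin stays low for as long as any flag is pending, so an event that lands between reading Sn_IR and clearing the flags produces no new edge. `mock_line`, `mock_raise`, `mock_clear` and `run_pass` are names invented for this sketch, not real driver or HAL calls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: INT pin is low while any Sn_IR bit is set (assumption). */
struct mock_line {
    uint8_t sn_ir;       /* pending flags */
    bool    pin_low;     /* current level of the INT pin */
    bool    w5500_flag;  /* set by the falling-edge ISR */
};

static void mock_update_pin(struct mock_line *m)
{
    bool low = (m->sn_ir != 0);

    if (low && !m->pin_low)
        m->w5500_flag = true;   /* falling edge: the ISR would run here */
    m->pin_low = low;
}

static void mock_raise(struct mock_line *m, uint8_t mask)
{
    m->sn_ir |= mask;
    mock_update_pin(m);
}

static void mock_clear(struct mock_line *m, uint8_t mask)
{
    m->sn_ir &= (uint8_t)~mask;
    mock_update_pin(m);
}

/* One pass of the state machine. late_event models a flag (different
 * from the ones already pending) that is raised after Sn_IR was read
 * but before the flags are cleared, so the pin never returns high. */
static void run_pass(struct mock_line *m, uint8_t late_event,
                     bool recheck_pin)
{
    m->w5500_flag = false;
    uint8_t snapshot = m->sn_ir;   /* flags this pass will handle */

    mock_raise(m, late_event);     /* pin is already low: no new edge */
    mock_clear(m, snapshot);       /* late_event is still pending */

    if (recheck_pin && m->pin_low)
        m->w5500_flag = true;      /* the "finished" state pin check */
}
```

With RECV (0x04) pending and SEND_OK (0x10) arriving late, the pass without the recheck ends with the pin low and `w5500_flag` false, so nothing would trigger the state machine again. With the recheck, the flag is set and the next loop picks up the pending event.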

But I have a bigger issue to fix. I did a true loopback test from the Linux client and checked the data received from the W5500:

Linux → send 1460 bytes → received by W5500
W5500 replies with same 1460 bytes → Linux compares rx data with tx data

and it turns out I’m getting corrupted (mismatched) data every now and then
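The comparison on the Linux side can be sketched roughly like this. The helper names `fill_pattern` and `first_mismatch` and the pattern itself are assumptions for illustration, not my actual client code. Seeding the payload with the packet number makes it easier to tell a corrupted byte from a reply that belongs to a different packet.

```c
#include <stddef.h>
#include <stdint.h>

/* Fills buf with a pattern that depends on the packet sequence number,
 * so a stale or swapped reply does not compare equal by accident. */
static void fill_pattern(uint8_t *buf, size_t len, uint32_t seq)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)(seq + i);
}

/* Returns the offset of the first differing byte, or len if the
 * received data matches what was sent. */
static size_t first_mismatch(const uint8_t *tx, const uint8_t *rx,
                             size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (tx[i] != rx[i])
            return i;
    }
    return len;
}
```

Logging the packet number together with the first mismatching offset should show whether the corruption clusters at particular offsets or is spread randomly across the 1460 bytes.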

I’m planning to put the STM32 to sleep and have the W5500 INT pin wake it up if there is some W5500 related tasks to attend to