W5500 stops receiving UDP Art-Net packets after random time

I am designing an LED Lighting product using W5500 to receive Art-Net. (Art-Net is a UDP protocol with packets of 534 bytes). The controller is ESP32. I am using the interrupt pin of the W5500 to trigger reading of packets.

Everything works fine until we send a large number of Art-Net packets together (e.g. a burst of 24 packets). The W5500 then stops receiving UDP data after a short time (30 seconds to 2 minutes). The interrupt pin stops triggering. The W5500 can still be read but reports that no data has been received. I’m running the SPI bus at 32 MHz. I have the W5500 set to a single socket with a buffer size of 16 KB. If I read back the W5500 configuration registers, everything still seems to be set correctly.
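
For a sense of scale, here is a rough sketch of how much RX buffer a burst needs. It assumes the 534-byte packet size and 16 KB buffer quoted above, and it assumes the 8-byte per-datagram header that the W5500 datasheet describes for UDP mode. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption (W5500 datasheet, UDP receive): each datagram in the RX buffer
// is preceded by an 8-byte header holding the peer IP address (4 bytes),
// peer port (2 bytes) and payload length (2 bytes).
constexpr std::uint32_t kUdpInfoHeaderBytes = 8;

// Bytes one datagram occupies in the socket RX buffer.
inline std::uint32_t rxBytesPerDatagram(std::uint32_t payloadBytes) {
    return payloadBytes + kUdpInfoHeaderBytes;
}

// Whole datagrams that fit in an initially empty RX buffer.
inline std::uint32_t datagramsThatFit(std::uint32_t bufferBytes,
                                      std::uint32_t payloadBytes) {
    assert(bufferBytes > 0);
    return bufferBytes / rxBytesPerDatagram(payloadBytes);
}
```

With these figures each datagram takes 542 bytes, a 24-packet burst takes 13008 bytes, and roughly 30 datagrams fit in 16384 bytes. A single burst therefore only fits if the buffer is close to empty when the burst starts.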

The only way to restore operation is to reset the W5500, then it will work correctly again for a short time.

ESP32 is using Arduino framework with the standard ethernet library. I’m not sure how to debug this further as the problem seems to be inside the W5500.

I have made a little progress with this issue. It seems to relate to UDP data arriving faster than it is being read by the MCU. I would expect the W5500 to drop packets if there is not enough buffer space, but in fact it appears to stop receiving data altogether.

Could Wiznet please advise what should happen if more packets are received than available buffer space?

If I place the UDP reading code inside the interrupt routine, and stay in the interrupt routine until all packets have been read, then the W5500 will successfully receive 32 consecutive packets of 572 bytes. However this causes other problems with watchdog resets in the ESP32 as other tasks are starved of time.
If I play nice with the RTOS and put the UDP reading code in its own high-priority task, released by a semaphore from the interrupt routine, then the W5500 locks up in less than a minute.

From your description of your architecture: the interrupt pin will activate when data is received, provided you have configured the mechanism properly. In the ISR you must read all the data received so far, not just one packet. The gray area is data arriving just before or while you clear the interrupt flag: you risk clearing the new interrupt request. To prevent this, in the ISR first clear the interrupt flag in the W5500, then process the data, then exit the ISR. If a new request arrives while the ISR is processing data, the interrupt will be raised again.

Hi, thanks for the reply. I am reading all the data that has been received. If I stay in the ISR until all data is read, this can cause a watchdog reset or idle task reset on the ESP32, so I was attempting to trigger a high-priority task to read the data, using a semaphore from the ISR. However, this delays reading the data, and the delay causes the W5500 to stop receiving further UDP data.

In the case of data arriving during interrupt flag being cleared, I expected that the W5500 should still show that some data is waiting to be read from the buffer if it is polled afterwards.

The strange thing is that W5500 says no data is in the buffer, and does not receive any more UDP packets from the network until it is reset. If the W5500 internal buffer fills, could that cause it to stop receiving UDP and return a data count of 0?

It will return in RSR as much data as it has. It should not return 0 if buffer is full.

I suggest you consider using polling rather than interrupts.

I have tried polling (both polling the RX_RSR register and polling the INT pin) but it’s too slow to respond, at least using the Arduino ESP32 framework.

What I am trying to understand, is what is happening to the W5500 when it stops receiving the broadcast UDP data. All its registers are still set correctly but it just ignores the packets. INT pin is high (inactive) and RX_RSR reads 0. It is definitely related to the density of packets on the network, if I reduce the amount of data being sent then it works fine.

Make a dump of all the common registers and the socket registers of all sockets (see datasheet), both during normal operation and during lock-up.

I have finally got back to looking at this after having to work on a different project, and have added code to print out the W5500 registers.

What I said above is slightly incorrect: when the W5500 stops, RX_RSR for socket 0 shows that data is present, but the interrupt does not trigger. According to the datasheet, if I try to clear the SnIR register while data is still present in the buffer, the SnIR flags should remain set. However, it seems to be possible (in some situations, not all the time) to clear the flag while data is still waiting in the buffer.

The register settings look like this after the W5500 has stopped receiving UDP. (Only Socket 0 is in use)
IR:0 IMR:0 SIR:0 SIMR:FF SnMR:2,0,0,0,0,0,0, SnCR:0,0,0,0,0,0,0, SnIR:0,0,0,0,0,0,0,0 SnIMR:4,0,0,0,0,0,0,0 RSR(0):3E12 SnSR(0):22 SnPORT(0):1936 SnRXBUFSIZE(0):10
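
To make the dump easier to read, here is a small sketch that decodes the socket 0 values. The bit and status constants (RECV = bit 2 of Sn_IR, 0x22 = open UDP socket) follow the W5500 datasheet; the struct and helper names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kSnIrRecv = 0x04;  // Sn_IR / Sn_IMR bit 2: RECV
constexpr std::uint8_t kSockUdp  = 0x22;  // Sn_SR value for an open UDP socket

// Socket 0 values taken from the posted dump (all values are hexadecimal).
struct Socket0Dump {
    std::uint8_t  snMR;     // 0x02: UDP mode
    std::uint8_t  snIR;     // 0x00: no interrupt flags set
    std::uint8_t  snIMR;    // 0x04: only RECV is unmasked
    std::uint8_t  snSR;     // 0x22: socket open in UDP mode
    std::uint16_t rxRSR;    // bytes waiting in the RX buffer
    std::uint16_t port;     // local UDP port
    std::uint8_t  rxBufKB;  // RX buffer size in KB
};

inline Socket0Dump postedDump() {
    return Socket0Dump{0x02, 0x00, 0x04, 0x22, 0x3E12, 0x1936, 0x10};
}

// Free space left in the RX buffer, in bytes.
inline std::uint32_t freeRxBytes(const Socket0Dump& d) {
    return d.rxBufKB * 1024u - d.rxRSR;
}

// True for the stalled state described above: data is waiting but the
// RECV flag that would drive the INT pin is clear.
inline bool dataPendingWithoutRecvFlag(const Socket0Dump& d) {
    return d.snSR == kSockUdp && d.rxRSR > 0 && (d.snIR & kSnIrRecv) == 0;
}
```

Decoded, RSR 0x3E12 is 15890 bytes waiting in a 16384-byte buffer, port 0x1936 is 6454 (the Art-Net port), and only 494 bytes are free. That is less than one more datagram of the size quoted earlier, which may be why no further packets are accepted, although the dump alone does not prove that.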

Where did you find that? I think Sn_IR is a register, not a logic cell. This means the W5500 sets it when a specific condition occurs, and you clear it when you process that condition. There must be a case of concurrent access (by the W5500 and the driving software), and that is why it is very important to process ALL the data received before exiting the ISR, so that newly arriving data raises the interrupt.

This is written in the W5100 datasheet (page 29), which has more detail than the W5500 datasheet, but I believe the operation is the same:

Sn_IR bit 2 = RECV : It is set as ‘1’ whenever W5100 receives data. And it is also set as ‘1’
if received data remains after execute CMD_RECV command.

W5500 datasheet just says “This is issued whenever data is received from a peer”. But I have verified that bit 2 of Sn_IR remains set and the INT pin remains activated if you clear the Sn_IR register when some data remains in the RX buffer, so I think the statement for W5100 does also apply to W5500.

I do ensure that all data is read from W5500 before leaving ISR, but I think it must still be possible for data to arrive during the clear instruction and I think this is when the problem occurs. The ArtNet UDP data is arriving very rapidly and I am not sure how this situation can be avoided.

I think it means that if you had some data in the buffer and issued the RECV command, it will set the interrupt flag even if nothing new was received. This is logical: after you enter the ISR, you clear the interrupt flag, read the pointers, get all the data from the buffer, and then issue the RECV command. That RECV command will raise the interrupt if you did not take everything from the buffer (the chip received some data while you were reading the existing data), or if new data has arrived while you were working with the buffer.

Right, that makes sense. So the correct procedure in the ISR would seem to be -
1-Clear the Sn_IR flag
2-Read the data, send RECV
3-If Sn_IR flag sets again, Repeat from (1)
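
The three steps above can be sketched against a mock chip, so the ordering can be checked on a PC. MockW5500 and everything in it is hypothetical: it only models the behaviour discussed in this thread (RECV is set when data arrives and is set again by the RECV command if data remains), and it is not a real driver API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

constexpr std::uint8_t kSnIrRecv = 0x04;  // Sn_IR bit 2: RECV

// Hypothetical stand-in for socket 0 of the chip; not a real driver.
struct MockW5500 {
    std::deque<std::size_t> rxQueue;  // payload sizes of queued datagrams
    std::uint8_t snIR = 0;
    int arrivalsDuringRead = 0;       // datagrams that arrive mid-read

    void receive(std::size_t payloadBytes) {
        rxQueue.push_back(payloadBytes);
        snIR |= kSnIrRecv;
    }
    std::size_t rxRSR() const {
        std::size_t total = 0;
        for (std::size_t p : rxQueue) total += p + 8;  // 8-byte UDP info header
        return total;
    }
    // Sn_IR bits are cleared by writing 1 to them.
    void clearIR(std::uint8_t bits) { snIR &= static_cast<std::uint8_t>(~bits); }

    // Read one datagram and issue RECV; the flag is set again if data remains.
    void readOneAndRecv() {
        assert(!rxQueue.empty());
        rxQueue.pop_front();
        if (arrivalsDuringRead > 0) {  // simulate traffic during the SPI read
            --arrivalsDuringRead;
            receive(534);
        }
        if (!rxQueue.empty()) snIR |= kSnIrRecv;
    }
};

// Steps 1 to 3: clear the flag, drain the buffer, repeat while RECV is set.
inline std::size_t serviceRecvInterrupt(MockW5500& chip) {
    std::size_t packetsRead = 0;
    do {
        chip.clearIR(kSnIrRecv);                // step 1
        while (chip.rxRSR() > 0) {              // step 2
            chip.readOneAndRecv();
            ++packetsRead;
        }
    } while (chip.snIR & kSnIrRecv);            // step 3
    return packetsRead;
}
```

In this model, queuing three datagrams and letting two more arrive during the reads makes serviceRecvInterrupt return 5, with RX_RSR at 0 and the RECV flag clear afterwards. The real chip may not behave exactly like the mock.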

I will test this.

In step 3 you exit the ISR, and if the IR bit is set, the ISR is simply invoked again. There is no need to loop within the ISR and spend time re-reading IR.

Unfortunately, on ESP32 entering and exiting an ISR is extremely slow. The only way it can keep up with the incoming data is to stay in the ISR and check whether more data has arrived.

You have another option: do not use an ISR, but poll instead.

I may be doing this wrong, but ESP32 uses FreeRTOS. Even if I dedicate one of the processor cores to nothing but a task polling the W5500, it misses data due to switching to the “Idle” task. Using an ISR is the only way I can read quickly enough to not miss data.

Thanks for your suggestions with this problem.

External Interrupt Latency - ESP32 Forum

I have changed to polling the W5500 using a tight loop with ESP32 watchdog disabled. All other tasks are running on the other core of the ESP32. I am polling RX_RSR to check if any data is in the buffer.
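
One detail worth noting when polling RX_RSR: it is a 16-bit register read over SPI while the chip may be updating it, so drivers commonly re-read it until two consecutive reads agree. Below is a minimal sketch of that idea. The readRaw callback is a hypothetical stand-in for whatever function performs the actual 16-bit register read; it is not a call from any specific library.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical hook: performs one raw 16-bit read of Sn_RX_RSR over SPI.
using ReadReg16 = std::function<std::uint16_t()>;

// Re-read the register until two consecutive reads return the same value,
// so a value captured halfway through an update is never used.
inline std::uint16_t readStableRsr(const ReadReg16& readRaw) {
    std::uint16_t previous = readRaw();
    for (;;) {
        const std::uint16_t current = readRaw();
        if (current == previous) {
            return current;
        }
        previous = current;
    }
}
```

A polling loop would call readStableRsr on each pass and drain the buffer whenever the result is non-zero. Note that this sketch loops until the value settles; a production version might cap the number of retries.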

This seems to work reliably. I have not been able to get the interrupt system to work reliably. Wherever in the code I reset the interrupt flag, every few minutes the interrupt pin ends up inactive while data remains in the buffer. It only happens if I really thrash the Ethernet link by sending 32+ universes of Art-Net data continuously.

Again, thanks for your suggestions which have helped me to persevere with this problem.
