W5500 buffer address clarification

I’m writing a W5500 driver from scratch.

The documentation for the W5500 does not explain the buffering concept well.

It appears to be a circular buffer for which the driver code is responsible for maintaining the write address. I assume the chip maintains the read address. Is this correct?

Reading the Sn_TX_WR register, is the value an absolute address, or relative to an internally-calculated base address for the socket?
In other words, if I write a 0 to Sn_TX_WR, will that always select the first byte of the buffer, no matter what socket?

If I write Sn_TX_WR with an address beyond the range for that socket, is it automatically wrapped back to the start of the buffer?
In other words, does the driver code need to AND the address with a mask before writing it to make sure it does not go beyond the top of the buffer?

If this is the case, then I presume the mask will change depending on the size specified for the buffer?
Then by default the mask would be 0x7ff for a 2k byte buffer?


Refer to the W5100 datasheet; it explains these concepts very well.

Aye. Ya there is a lot of extra information there. Thanks.

Though I don’t think there is anything clear about the explanation in the 5100 sheet.

In the 5100 datasheet, there is mention of physical addresses, and a memory map. I don’t see any physical addresses in the 5500 sheet. In the 5500 sheet it says the 16k RAM blocks are in a 64k memory space. But at what base address?

In the 5100 sheet, they are saying basically you take the size of the block, and subtract 1 to get the mask. This makes perfect sense of course. And it appears they are saying that the base address for each socket is the sum of all the sizes of the lower-numbered sockets. That makes sense too. Is that right?
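To make that reading of the 5100 sheet concrete, here is a minimal sketch of the arithmetic as I understand it. The function names and the `region_base` parameter are mine, not WIZnet's; the actual base of the TX or RX region has to come from the datasheet's memory map, and buffer sizes are assumed to be powers of two.

```c
#include <stdint.h>

/* Mask for a power-of-two buffer: size minus one (2 KB -> 0x07FF). */
static uint16_t buf_mask(uint16_t size_bytes)
{
    return (uint16_t)(size_bytes - 1u);
}

/* Base of socket n: region base plus the sizes of all lower-numbered sockets. */
static uint16_t buf_base(uint16_t region_base, const uint16_t *sizes, unsigned n)
{
    uint16_t base = region_base;
    for (unsigned i = 0; i < n; i++) {
        base = (uint16_t)(base + sizes[i]);
    }
    return base;
}

/* Physical address = base + (pointer & mask). */
static uint16_t buf_phys(uint16_t base, uint16_t mask, uint16_t pointer)
{
    return (uint16_t)(base + (pointer & mask));
}
```

With four 2 KB sockets, socket 2 would start 4 KB above the region base, and a pointer value of 0x0805 would land 5 bytes into its buffer.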

I can reverse engineer it, though there are two simple facts here which were apparently hard to explain by someone for whom English is a second language.

I keep wondering if there is any need to use the Sn_TX_WR register. Why not just keep track of the pointer in local RAM? But I imagine it has to do with how the chip manages the transmission of that data. It should be able to maintain the pointers itself based upon the number of bytes that are written to the queue, but I am imagining that would have required more transistors and so was left to the firmware.


I think it is made this way for compatibility with applications designed for earlier devices that allowed random parallel access to their buffers (e.g. W5100).

I noticed the 5500 sheet says that Sn_TX_RD increments up to Sn_TX_WR after the transmission is complete. So yes, the chip is not counting the bytes coming in, it just writes them into RAM and then you have to tell it where the end is. So therefore 1. there is no reason ever to read Sn_TX_RD as at the end of a cycle Sn_TX_RD == Sn_TX_WR and 2. one could keep a copy of Sn_TX_WR in local RAM, and avoid the register read at the beginning of the cycle. That is, assuming one clearly understood the pointer math involved and so could duplicate the math in firmware.

The 5100 sheet has code that demonstrates how to calculate the physical addresses. But the 5500 sheet does not refer to the 5100 sheet, so we are just assuming that it is the same.

The 5500 sheet explains (poorly) a scheme which seems to indicate you don’t need a physical address. It seems to be saying that you just treat the pointer as a 16-bit unsigned and it automagically keeps it within the range of the particular socket’s buffer. Or maybe not. Maybe it’s saying all the buffers are in a 64k memory space and adding 1 to 0xffff gives you 0. The reference is figure 20, which has arrows pointing all over the place and Lord knows what it means.

Quoting page 32:
“The Socket n TX Buffer Block allocated in 16KB TX memory is buffer for saving data
to be transmitted by host. The 16bits Offset Address of Socket n TX Buffer Block has
64KB address space ranged from 0x0000 to 0xFFFF, and it is configured with
reference to ‘Socket n TX Write Pointer Register (Sn_TX_WR)’ & ‘Socket n TX Read
Pointer Register(Sn_RX_RD)’. However, the 16bits Offset Address automatically
converts into the physical address to be accessible in 16KB TX memory such as Figure
20. Refer to ‘Chapter 4.2’ for Sn_TX_WR & Sn_TX_RD.”

I have not coded the W5100 or W5500 buffer management myself for a very long time. I actually did that only for the W3100 in the early years.
But the W5500 now has advanced hardware buffer management to reduce the load on the MCU of calculating addresses and moving memory content around. The SPI interface of the W5500 is also much more efficient than that of the W5100.
So the W5100 documentation is not that helpful for the W5500 here.
Maybe you can just re-engineer a little and have full understanding quite quickly by starting here:
e.g. scroll down to → ESTABLISHMENT: Check send data / Send process

On the page linked there is this pseudo code

Sn_TX_WR += send_size;

That code has no masking, and no consideration of the offset for the particular socket (as the W5100 code does). It’s just a 16-bit unsigned being incremented and allowed to wrap around to 0.
I think the answer must be that on the 5500, masking and base offset are taken care of by the chip. That makes sense given there is never any reference to physical base addresses. It is plausible, as it would be easy to do in hardware: it’s just base + (index & mask).
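As a sketch of that hypothesis (not confirmed chip behaviour), the driver side would only ever do the 16-bit add, and the chip would presumably apply the mask itself. `TX_BUF_SIZE` and both function names are made up for illustration, and the 2 KB size is the default assumed earlier in the thread.

```c
#include <stdint.h>

#define TX_BUF_SIZE 2048u  /* assumed per-socket buffer size (2 KB) */

/* Driver side, as in the pseudocode: Sn_TX_WR += send_size. */
static uint16_t advance_ptr(uint16_t ptr, uint16_t len)
{
    return (uint16_t)(ptr + len);   /* wraps from 0xFFFF back to 0x0000 */
}

/* Chip side (presumed): position inside the socket's own buffer. */
static uint16_t offset_in_buffer(uint16_t ptr)
{
    return (uint16_t)(ptr & (TX_BUF_SIZE - 1u));
}
```

So a pointer of 0xFFF0 advanced by 0x20 becomes 0x0010, and a pointer of 0x0810 would address byte 0x0010 of a 2 KB buffer.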

I don’t see any reason one could not maintain a copy of Sn_TX_WR in RAM and avoid reading it. It’s just a 16-bit unsigned (a “short” I believe, for most if not all compilers); you then just add the size of the data each time you write. I suppose it’s possible there could be errors that would put the RAM copy out of sync with the register, though that is highly unlikely with SPI. There could just as likely be errors in reading Sn_TX_WR via SPI.
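A rough sketch of what keeping the copy in RAM might look like. The two `chip_*` functions are hypothetical stand-ins for real SPI accessors; here they only record what would have been sent, so the sketch runs on a PC. The shadow is assumed to be initialised from a single register read after the socket is opened.

```c
#include <stdint.h>

/* Mock state standing in for the chip (hypothetical, for illustration). */
static uint16_t mock_sn_tx_wr;     /* pretend Sn_TX_WR register */
static uint16_t mock_last_offset;  /* offset used for the last buffer write */

static void chip_write_tx_buffer(uint16_t offset, const uint8_t *data, uint16_t len)
{
    (void)data;
    (void)len;
    mock_last_offset = offset;     /* a real driver would clock the data out here */
}

static void chip_write_sn_tx_wr(uint16_t value)
{
    mock_sn_tx_wr = value;
}

/* Driver-side shadow of Sn_TX_WR, kept in local RAM. */
static uint16_t tx_wr_shadow;

static void send_packet(const uint8_t *data, uint16_t len)
{
    chip_write_tx_buffer(tx_wr_shadow, data, len);  /* data goes at the old pointer */
    tx_wr_shadow = (uint16_t)(tx_wr_shadow + len);  /* Sn_TX_WR += send_size */
    chip_write_sn_tx_wr(tx_wr_shadow);              /* tell the chip where the data ends */
    /* the SEND command would be issued here */
}
```

The register is written on every send but never read back, which is the saving being discussed.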

For reference I’ve put my flowchart on the web below:

Every rectangle is a W5500 register read or write. Which it is depends on the “r” or “w” in the corner. Rectangles attached to each other are combined writes (either by DMA, or by BEN, which is an SPI mode on my micro with a 16-byte deep SPI FIFO, so fairly asynchronous for short commands).

The scheme is for simple UDP-based packet sending. Receive and send use distinct sockets, and send is typically broadcast to a port. There can be multiple send sockets. Interrupts are configured so that the only interrupt is the receive interrupt of the socket used for the receive process.

The 4 KB receive buffer rotates through the W5500’s 16-bit address space. I typically poll RSR on interrupt, calculate whether there is new data, and then process it. If the head and tail of the buffer are more than 2k apart, I use the procedure marked as “discard” to synchronize the pointers again.

Send is self contained and does not rely on discard.

I’ve got a 16-bitter, so the 16-bit integers used for the pointer values wrap around in the same way as on the W5500, and I only have to watch for wraparound when I calculate the new number of bytes.
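For anyone on a wider MCU, the wraparound-safe byte count can be sketched like this. The function names and the `limit` parameter are illustrative and not taken from the flowchart; the cast back to `uint16_t` is what makes the subtraction behave like the 16-bit hardware pointers even where `int` is 32 bits.

```c
#include <stdint.h>

/* Bytes between tail and head, correct across the 0xFFFF -> 0x0000 wrap. */
static uint16_t bytes_pending(uint16_t head, uint16_t tail)
{
    return (uint16_t)(head - tail);
}

/* Nonzero if head and tail have drifted further apart than limit bytes,
 * i.e. the case where the pointers would need to be resynchronized. */
static int needs_resync(uint16_t head, uint16_t tail, uint16_t limit)
{
    return bytes_pending(head, tail) > limit;
}
```

For example, with the tail at 0xFFF0 and the head at 0x0010 there are 0x20 bytes pending, well within a 2k limit.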

A round trip (receive a 10-20 byte packet and reply to it) takes about 100us, of which about 15us are really CPU time. ( 60MHz/MIPS dspic33EPxxxMU8xx)

Wow, that flowchart is a thing of beauty. I will need to study it in detail. I appreciate your help. I am trying to avoid the migraine that goes with having to reverse engineer chip behaviour. Done it too many times already. Thanks

Note that it is packet based (broadcast) UDP only. No DHCP (but I guess that could be added), no TCP, so no HTTP etc.

We use it to communicate between boards (and a PC) in machine control. Usually only the PC talks to the outside world, so there is no need for anything else. The current approach has good realtime characteristics (it never holds up the CPU for more than 10us, and typically about 5us at a time), and is easy.

Hello! Could you re-upload the flowchart for sending a UDP packet? The link is not working.

I uploaded the png again.

Oh, thanks a lot!