W5500 Transmission Buffer Issue

Hi.

I am facing an issue using the W5500 chip attached to an STM32F429I-DISC1 board.

I am sending MQTT packets (one connect packet, then two publish packets in a loop) to an MQTT broker running on my machine through the W5500 Ethernet component. I am using the EthernetLib for this.

Everything works correctly for a while, but after a certain point the MQTT messages sent to the broker become corrupted and the broker closes the connection. The main code sends one connect packet and then two publish packets inside a loop. I hardcoded the packets in the code to help diagnose the problem.

As you can see, there is a Connect Command [Malformed Packet] entry that marks the point where the MQTT packets become corrupted, although the real Connect Command had already been sent earlier (not shown in the image).

It seems that the TX buffer data starts to overlap, for some unknown reason.

Questions:

Can this be related to clock issues? The STM32 clock tree is configured as follows:

and the SPI Configuration is:

static void MX_SPI1_Init(void)
{

  /* USER CODE BEGIN SPI1_Init 0 */

  /* USER CODE END SPI1_Init 0 */

  /* USER CODE BEGIN SPI1_Init 1 */

  /* USER CODE END SPI1_Init 1 */
  /* SPI1 parameter configuration*/
  hspi1.Instance = SPI1;
  hspi1.Init.Mode = SPI_MODE_MASTER;
  hspi1.Init.Direction = SPI_DIRECTION_2LINES;
  hspi1.Init.DataSize = SPI_DATASIZE_8BIT;
  hspi1.Init.CLKPolarity = SPI_POLARITY_LOW;
  hspi1.Init.CLKPhase = SPI_PHASE_1EDGE;
  hspi1.Init.NSS = SPI_NSS_SOFT;
  hspi1.Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_32;
  hspi1.Init.FirstBit = SPI_FIRSTBIT_MSB;
  hspi1.Init.TIMode = SPI_TIMODE_DISABLE;
  hspi1.Init.CRCCalculation = SPI_CRCCALCULATION_DISABLE;
  hspi1.Init.CRCPolynomial = 10;
  if (HAL_SPI_Init(&hspi1) != HAL_OK)
  {
    Error_Handler();
  }
  /* USER CODE BEGIN SPI1_Init 2 */

  /* USER CODE END SPI1_Init 2 */

}

The main code is:

  // RESET W5500
  HAL_GPIO_WritePin(GPIOC, GPIO_PIN_8, GPIO_PIN_RESET);
  HAL_Delay(600);
  HAL_GPIO_WritePin(GPIOC, GPIO_PIN_8, GPIO_PIN_SET);
  HAL_Delay(20);

  // Setup Buffers
  uint8_t rx_tx_buff_sizes[] = {2, 2, 2, 2, 2, 2, 2, 2};
  wizchip_init(rx_tx_buff_sizes, rx_tx_buff_sizes);

  // Define callbacks
  reg_wizchip_cs_cbfunc(&SPIChipSelect, &SPIChipUnselect);
  reg_wizchip_spi_cbfunc(&SPIReadByte, &SPIWriteByte);

  // Open TCP Socket
  socket(0, Sn_MR_TCP, 32000, 0);
  uint8_t addr[] = { 192, 168, 1, 1 };
  connect(0, addr, 1883);

	 const uint8_t connectCommand[] = "\x10\x25\x00\x04\x4d\x51\x54\x54\x05\x02\x01\xf4\x00\x00\x18\x56" \
	 "\x61\x63\x75\x75\x6d\x5f\x30\x30\x3a\x30\x38\x3a\x44\x43\x3a\x37" \
	 "\x35\x3a\x43\x30\x3a\x37\x44";


	 const uint8_t connectedStatus[] = "\x30\x8c\x01\x00\x37\x41\x63\x6d\x65\x2f\x48\x61\x72\x64\x77\x61" \
			"\x72\x65\x2f\x30\x30\x3a\x30\x38\x3a\x44\x43\x3a\x37\x35\x3a\x43" \
			"\x30\x3a\x37\x44\x2f\x53\x74\x61\x74\x75\x73\x2f\x43\x6f\x6e\x6e" \
			"\x65\x63\x74\x69\x6f\x6e\x53\x74\x61\x74\x75\x73\x00\x7b\x0a\x22" \
			"\x6d\x65\x73\x73\x61\x67\x65\x54\x79\x70\x65\x22\x3a\x20\x22\x52" \
			"\x65\x70\x6f\x72\x74\x56\x61\x6c\x75\x65\x22\x2c\x0a\x22\x6e\x61" \
			"\x6d\x65\x22\x3a\x20\x22\x43\x6f\x6e\x6e\x65\x63\x74\x69\x6f\x6e" \
			"\x53\x74\x61\x74\x75\x73\x22\x2c\x0a\x22\x76\x61\x6c\x75\x65\x22" \
			"\x3a\x20\x22\x43\x6f\x6e\x6e\x65\x63\x74\x65\x64\x22\x0a\x7d";


	 const uint8_t chamberPressureData[] ="\x30\x89\x01\x00\x36\x41\x63\x6d\x65\x2f\x48\x61\x72\x64\x77\x61" \
			"\x72\x65\x2f\x30\x30\x3a\x30\x38\x3a\x44\x43\x3a\x37\x35\x3a\x43" \
			"\x30\x3a\x37\x44\x2f\x53\x74\x61\x74\x75\x73\x2f\x43\x68\x61\x6d" \
			"\x62\x65\x72\x50\x72\x65\x73\x73\x75\x72\x65\x00\x7b\x0a\x22\x6d" \
			"\x65\x73\x73\x61\x67\x65\x54\x79\x70\x65\x22\x3a\x20\x22\x52\x65" \
			"\x70\x6f\x72\x74\x56\x61\x6c\x75\x65\x22\x2c\x0a\x22\x6e\x61\x6d" \
			"\x65\x22\x3a\x20\x22\x43\x68\x61\x6d\x62\x65\x72\x50\x72\x65\x73" \
			"\x73\x75\x72\x65\x22\x2c\x0a\x22\x76\x61\x6c\x75\x65\x22\x3a\x20" \
			"\x33\x32\x2e\x32\x30\x30\x31\x30\x30\x30\x0a\x7d";

  // SEND MQTT Connect Command
  WriteAll(connectCommand, sizeof(connectCommand) - 1);

  HAL_Delay(250);

  while (1)
  {
    // MQTT Publish Message #1
    WriteAll(connectedStatus, sizeof(connectedStatus) - 1);

    // MQTT Publish Message #2
    WriteAll(chamberPressureData, sizeof(chamberPressureData) - 1);
  }

I also wrote these auxiliary functions:

void WriteAll(const uint8_t* data, const size_t dataSize)
{
    uint8_t* ptr = (uint8_t*)data;
    size_t remaining = dataSize;

    // Keep calling send() until the whole packet has been accepted
    while (remaining > 0) {
        auto result = send(0, ptr, remaining);

        if (result > 0) {
            remaining -= result;
            ptr += result;
        } else {
            //UART_Printf("Send result: %d\n", result);

            // Halt here on any send error
            while (true)
            {
            }
        }
    }
}

void SPIChipSelect()
{
	HAL_GPIO_WritePin(GPIOC, GPIO_PIN_7, GPIO_PIN_RESET);
}

void SPIChipUnselect()
{
	HAL_GPIO_WritePin(GPIOC, GPIO_PIN_7, GPIO_PIN_SET);
}

uint8_t SPIReadByte()
{
    uint8_t byte;
    HAL_SPI_Receive(&hspi1, &byte, 1, 10000);
    return byte;
}

void SPIWriteByte(uint8_t byte)
{
    HAL_SPI_Transmit(&hspi1, &byte, 1, 10000);
}

The wiring is the following:

PC8 -> W5500 RESET PIN
PC7 -> W5500 CS PIN
PA5 -> W5500 SCLK PIN
PA6 -> W5500 MISO
PA7 -> W5500 MOSI

PA5/PA6/PA7 are associated with the SPI1 module.

I estimate the probability that this is caused by bad clocking or some other configuration problem as low. It needs to be investigated further, and patterns must be found.

How long is this "while"? Is the period reproducible between failures, and after power-on?
How much data has been sent when the issue appears? Is it always the same amount after a full system power-up? If you look at the raw data in Wireshark, is it a total mess, or can you recognize parts of valid data that are wrongly arranged? Did you check on the MCU side that you are really using the correct pointer to the data when the issue starts to appear? Is the data at that pointer what you expect it to be at that time?
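One way to answer the last two questions on the MCU side is to record a checksum of each hardcoded packet once and re-check it right before every send. The following is only a sketch: `packet_guard`, `guard_sum`, `guard_arm` and `guard_ok` are hypothetical helper names, not part of the EthernetLib, and the rotating sum is just a cheap overwrite detector, not a CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard around one outgoing packet buffer. */
typedef struct {
    const uint8_t *data;  /* buffer that will be passed to send() */
    size_t len;           /* number of bytes in the packet */
    uint32_t sum;         /* checksum recorded when the guard was armed */
} packet_guard;

/* Rotating additive checksum; cheap, and good enough to notice
 * accidental overwrites of the buffer. */
static uint32_t guard_sum(const uint8_t *data, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++) {
        s = (s << 1) | (s >> 31);  /* rotate left by one bit */
        s += data[i];
    }
    return s;
}

/* Record pointer, length and checksum of a known-good buffer. */
static packet_guard guard_arm(const uint8_t *data, size_t len)
{
    packet_guard g = { data, len, guard_sum(data, len) };
    return g;
}

/* Returns 1 if the buffer still matches the recorded checksum, 0 otherwise. */
static int guard_ok(const packet_guard *g)
{
    return guard_sum(g->data, g->len) == g->sum;
}
```

Arming one guard per packet before the loop and calling `guard_ok` before each `WriteAll` would show whether the source data is still intact at the moment the capture starts to show malformed packets.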

I will put some trace output in the code to try to answer your questions. Meanwhile, should I be concerned about the TX write pointer when the updated pointer crosses the 64 KB address boundary? I read in the datasheet that if an overflow occurs when updating the TX write pointer, the carry is ignored and the pointer is updated with the lower 16 bits. I know that this is handled by the EthernetLib, and I am assuming that the implementation there is correct. Is that right?

Is it also safe to assume that when the SEND interrupt flag is set, the data has been fully delivered to the peer? In my case the MQTT broker is the peer.
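The wrap behaviour of the TX write pointer described above can be modelled in a few lines. This is a minimal sketch, not the EthernetLib implementation: `tx_wr_advance` and `tx_buf_offset` are hypothetical names, and the mapping of the pointer into the socket buffer by masking is an assumption that only holds for power-of-two buffer sizes such as the 2 KB configured here.

```c
#include <stdint.h>

/* Hypothetical model of advancing the 16-bit TX write pointer.
 * The sum is truncated to 16 bits, so the carry out of bit 15 is
 * dropped and the pointer wraps modulo 65536. */
static uint16_t tx_wr_advance(uint16_t tx_wr, uint16_t len)
{
    return (uint16_t)(tx_wr + len);
}

/* Assumed mapping of the pointer into the socket TX memory:
 * with a power-of-two buffer size, only the low bits select the byte. */
static uint16_t tx_buf_offset(uint16_t tx_wr, uint16_t buf_size)
{
    return (uint16_t)(tx_wr & (buf_size - 1u));
}
```

For example, advancing a pointer of 0xFFF0 by 0x20 bytes gives 0x0010, and with a 2 KB buffer the pointers 0x0810 and 0x0010 both map to offset 0x0010.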

That's why I asked how much data has been sent before the issue happens. If it happens long after the buffer has wrapped (e.g. after 32 kB with the default socket memory configuration of 2 kB), then it is not a TX buffer pointer wrapping or socket data addressing issue. If you get it after sending about 2 kB of data, then the first thing to check is the TX pointers.
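To place a failure relative to these boundaries, the total byte count at the moment of failure can be converted into wrap counts. The sketch below assumes the TX write pointer starts at 0 when the socket is opened; `wrap_counts` and `count_wraps` are hypothetical names used for illustration only.

```c
#include <stdint.h>

/* Number of wraps that have happened after a given amount of data. */
typedef struct {
    uint32_t buffer_wraps;   /* times the socket TX buffer has wrapped */
    uint32_t pointer_wraps;  /* times the 16-bit TX pointer has wrapped */
} wrap_counts;

/* bytes_sent: total bytes handed to send() since the socket was opened.
 * buf_size:   socket TX buffer size in bytes (2048 for the 2 KB setup). */
static wrap_counts count_wraps(uint32_t bytes_sent, uint32_t buf_size)
{
    wrap_counts w;
    w.buffer_wraps = bytes_sent / buf_size;
    w.pointer_wraps = bytes_sent / 65536u;
    return w;
}
```

With the hardcoded packets from the question (39 bytes for the connect packet, then 143 + 140 = 283 bytes per loop pass), the first 2 KB boundary is crossed during the 8th pass and the 64 KB boundary during the 232nd pass, so a failure that shows up near either of those passes would point at pointer handling.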