Hardfault in xQueueGenericReceive() at certain optimisation levels on STM32 ARM Cortex M4, FreeRTOS v9.0.0

Hi, I'm in need of some help. I have a FreeRTOS project generated from CubeMX (v4.25.0) targeting an STM32L432. The project is generated for, compiled in and debugged in Atollic TrueStudio, so the tool chain is gcc.

When I compile with optimisation level -O2 or -O3 I get hard faults in xQueueGenericReceive(). If I use other levels (-O0, -O1, -Og, -Os, -Ofast) then it runs fine. According to the fault analyser in Atollic TrueStudio, the PC when the hard fault occurs is on this line in queue.c, function xQueueGenericReceive():

~~~
for( ;; )
{
    taskENTER_CRITICAL();
    {
        const UBaseType_t uxMessagesWaiting = pxQueue->uxMessagesWaiting; /* <-- HARD FAULT HERE */
~~~

A screenshot of the disassembly from the Atollic Fault Analyser shows the error location to be the following instruction:

~~~
ldr r6, [r4, #56]
~~~

This causes a precise bus fault because r4 contains address 0x320000, so the instruction is trying to access address 0x320038.

I have spent the last few days reading about hard faults and debugging, but I'm still a bit lost, being new to STM32. I have followed the advice on the FreeRTOS site about Cortex-M3/M4 processors and interrupt priority, and about debugging hard faults. I have configASSERT defined:

~~~
#define configASSERT( x ) if ((x) == 0) { taskDISABLE_INTERRUPTS(); __asm volatile("BKPT #01"); for( ;; ); }
~~~

My interrupt config is generated by CubeMX as follows:

~~~
#define configPRIO_BITS 4
#define configLIBRARY_LOWEST_INTERRUPT_PRIORITY 15
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 5
#define configKERNEL_INTERRUPT_PRIORITY (configLIBRARY_LOWEST_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )
#define configMAX_SYSCALL_INTERRUPT_PRIORITY (configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )
~~~

My application is not using any interrupts. It is configured without preemption, and has 3 tasks that wait on a FreeRTOS queue, blocking indefinitely until there is something to process. The 3 tasks are:

1) An application task running a state machine. It waits on a queue that contains state machine events.
2) A message processing task for handling communications from a PC application. The PC app is not being used when the fault occurs, so this task is simply blocked.
3) A task processing bytes from the GPS UART. This is running when the fault occurs.

The GPS UART is configured for DMA to a circular buffer, and the buffer is emptied in the idle task hook, where all bytes are read out and pushed into the FreeRTOS queue. There are 2 timers running; their callbacks push timer-expired events into the state machine queue. The state machine logic starts the timers as appropriate.

At the time of the hard fault, the application task (HSM) is spinning in a loop waiting for a GPS fix. It checks the fix flag and, if there is no fix, calls vTaskDelay(). It seems that the error is somehow related to this delay causing a context switch. The same happens if I replace the delay with a taskYIELD().

Please can someone help me debug this further? I'm really stuck.

Thanks,
Stuart
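
The arithmetic in the post above can be checked on a host PC. The following is only a sketch: the macro values are copied from the post, while `effective_address()` and `check_post_values()` are hypothetical helpers introduced for illustration and are not part of FreeRTOS.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the FreeRTOSConfig.h excerpt quoted in the post. */
#define configPRIO_BITS 4
#define configLIBRARY_LOWEST_INTERRUPT_PRIORITY 15
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 5
#define configKERNEL_INTERRUPT_PRIORITY \
    ( configLIBRARY_LOWEST_INTERRUPT_PRIORITY << ( 8 - configPRIO_BITS ) )
#define configMAX_SYSCALL_INTERRUPT_PRIORITY \
    ( configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << ( 8 - configPRIO_BITS ) )

/* Hypothetical helper: the address read by `ldr rX, [base, #offset]`. */
uint32_t effective_address( uint32_t base, uint32_t offset )
{
    return base + offset;
}

/* Hypothetical self-check of the numbers quoted in the post. */
void check_post_values( void )
{
    /* With 4 priority bits, library priorities are shifted into the top nibble. */
    assert( configKERNEL_INTERRUPT_PRIORITY == 0xF0 );
    assert( configMAX_SYSCALL_INTERRUPT_PRIORITY == 0x50 );

    /* r4 = 0x320000 plus the #56 offset gives the faulting address. */
    assert( effective_address( 0x320000u, 56u ) == 0x320038u );
}
```

Under these assumptions the faulting address 0x320038 is consistent with r4 holding 0x320000, and the shifted priority values come out as 0xF0 and 0x50.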


Curious. Can you tell me the FreeRTOS and GCC versions you are using?


Richard, FreeRTOS is v9.0.0. It is what CubeMX spits out. I have not compared it to the "official" download. gcc is the Atollic version:

~~~
./arm-atollic-eabi-gcc.exe --version
arm-atollic-eabi-gcc.exe (GNU Tools for ARM Embedded Processors (Build 17.03)) 6.3.1 20170215 (release) [ARM/embedded-6-branch revision 245512]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
~~~


I will see if I can get the same GCC version and try it myself. Can you look at the assembly and see how r4 is getting set? It might give a clue as to why it seems to hold the wrong value when it is used.


Richard, I compared the CubeMX-generated FreeRTOS source code against the official release and did not find any significant differences. You can download the tool chain (complete IDE) for free from the Atollic website; ST have bought them and now make the tool available for STM32 for free: https://atollic.com/resources/download/

BTW, I'm using the Windows version. Here is the disassembly for the -O2 optimised version. Things seem to have been moved around a bit...

~~~
1238 {
xQueueGenericReceive:
0x08010d28: stmdb sp!, {r4, r5, r6, r7, r8, r9, r10, lr}
0x08010d2c: sub sp, #16
0x08010d2e: str r2, [sp, #4]
1244 configASSERT( pxQueue );
0x08010d30: cmp r0, #0
0x08010d32: beq.w 0x8010e5a <xQueueGenericReceive+306>
1245 configASSERT( !( ( pvBuffer == NULL ) && ( pxQueue->uxItemSize != ( UBaseType_t ) 0U ) ) );
0x08010d36: cmp r1, #0
0x08010d38: beq.w 0x8010e9c <xQueueGenericReceive+372>
0x08010d3c: mov r4, r0
0x08010d3e: mov r9, r3
0x08010d40: mov r8, r1
1248 configASSERT( !( ( xTaskGetSchedulerState() == taskSCHEDULER_SUSPENDED ) && ( xTicksToWait != 0 ) ) );
0x08010d42: bl 0x8011868
0x08010d46: cbnz r0, 0x8010d60 <xQueueGenericReceive+56>
0x08010d48: ldr r5, [sp, #4]
0x08010d4a: cbz r5, 0x8010d62 <xQueueGenericReceive+58>
0x08010d4c: mov.w r3, #80 ; 0x50
0x08010d50: msr BASEPRI, r3
0x08010d54: isb sy
0x08010d58: dsb sy
1248 configASSERT( !( ( xTaskGetSchedulerState() == taskSCHEDULER_SUSPENDED ) && ( xTicksToWait != 0 ) ) );
0x08010d5c: bkpt 0x0001
0x08010d5e: b.n 0x8010d5e <xQueueGenericReceive+54>
0x08010d60: movs r5, #0
1258 taskENTER_CRITICAL();
0x08010d62: bl 0x8010434
1260 const UBaseType_t uxMessagesWaiting = pxQueue->uxMessagesWaiting;
0x08010d66: ldr r6, [r4, #56] ; 0x38
1401 portYIELD_WITHIN_API();
0x08010d68: ldr.w r10, [pc, #332] ; 0x8010eb8 <xQueueGenericReceive+400>
1371 prvLockQueue( pxQueue );
0x08010d6c: movs r7, #0
1264 if( uxMessagesWaiting > ( UBaseType_t ) 0 )
0x08010d6e: cmp r6, #0
0x08010d70: bne.n 0x8010dfc <xQueueGenericReceive+212>
1343 if( xTicksToWait == ( TickType_t ) 0 )
0x08010d72: ldr r3, [sp, #4]
0x08010d74: cmp r3, #0
0x08010d76: beq.w 0x8010e7e <xQueueGenericReceive+342>
1351 else if( xEntryTimeSet == pdFALSE )
0x08010d7a: cbnz r5, 0x8010d82 <xQueueGenericReceive+90>
1355 vTaskSetTimeOutState( &xTimeOut );
0x08010d7c: add r0, sp, #8
0x08010d7e: bl 0x80117b0
1365 taskEXIT_CRITICAL();
0x08010d82: bl 0x8010478
1370 vTaskSuspendAll();
0x08010d86: bl 0x801138c
1371 prvLockQueue( pxQueue );
0x08010d8a: bl 0x8010434
0x08010d8e: ldrb.w r3, [r4, #68] ; 0x44
0x08010d92: cmp r3, #255 ; 0xff
0x08010d94: it eq
0x08010d96: strbeq.w r7, [r4, #68] ; 0x44
0x08010d9a: ldrb.w r3, [r4, #69] ; 0x45
0x08010d9e: cmp r3, #255 ; 0xff
0x08010da0: it eq
0x08010da2: strbeq.w r7, [r4, #69] ; 0x45
0x08010da6: bl 0x8010478
1374 if( xTaskCheckForTimeOut( &xTimeOut, &xTicksToWait ) == pdFALSE )
0x08010daa: add r1, sp, #4
0x08010dac: add r0, sp, #8
0x08010dae: bl 0x80117d0
0x08010db2: cmp r0, #0
0x08010db4: bne.n 0x8010e40 <xQueueGenericReceive+280>
1918 taskENTER_CRITICAL();
0x08010db6: bl 0x8010434
1920 if( pxQueue->uxMessagesWaiting == ( UBaseType_t ) 0 )
0x08010dba: ldr r3, [r4, #56] ; 0x38
0x08010dbc: cmp r3, #0
0x08010dbe: bne.n 0x8010e2e <xQueueGenericReceive+262>
1929 taskEXIT_CRITICAL();
0x08010dc0: bl 0x8010478
1382 if( pxQueue->uxQueueType == queueQUEUE_IS_MUTEX )
0x08010dc4: ldr r3, [r4, #0]
0x08010dc6: cmp r3, #0
0x08010dc8: beq.n 0x8010e6e <xQueueGenericReceive+326>
1397 vTaskPlaceOnEventList( &( pxQueue->xTasksWaitingToReceive ), xTicksToWait );
0x08010dca: ldr r1, [sp, #4]
0x08010dcc: add.w r0, r4, #36 ; 0x24
0x08010dd0: bl 0x801168c
1398 prvUnlockQueue( pxQueue );
0x08010dd4: mov r0, r4
0x08010dd6: bl 0x8010974
1399 if( xTaskResumeAll() == pdFALSE )
0x08010dda: bl 0x801139c
0x08010dde: cbnz r0, 0x8010df0 <xQueueGenericReceive+200>
1401 portYIELD_WITHIN_API();
0x08010de0: mov.w r3, #268435456 ; 0x10000000
0x08010de4: str.w r3, [r10]
0x08010de8: dsb sy
0x08010dec: isb sy
0x08010df0: movs r5, #1
1258 taskENTER_CRITICAL();
0x08010df2: bl 0x8010434
1260 const UBaseType_t uxMessagesWaiting = pxQueue->uxMessagesWaiting;
0x08010df6: ldr r6, [r4, #56] ; 0x38
1264 if( uxMessagesWaiting > ( UBaseType_t ) 0 )
0x08010df8: cmp r6, #0
0x08010dfa: beq.n 0x8010d72 <xQueueGenericReceive+74>
1270 prvCopyDataFromQueue( pxQueue, pvBuffer );
0x08010dfc: mov r1, r8
0x08010dfe: mov r0, r4
1268 pcOriginalReadPosition = pxQueue->u.pcReadFrom;
0x08010e00: ldr r5, [r4, #12]
0x08010e02: bl 0x801094c
1272 if( xJustPeeking == pdFALSE )
~~~


OK, so I was able to trap it in the debugger. Basically, r4 gets loaded from r0, which is the first argument to xQueueGenericReceive(), that is, QueueHandle_t xQueue. If I add the following line to the start of xQueueGenericReceive()

~~~
if (xQueue == 0x320000) __asm volatile("BKPT #01");
~~~

then the debugger stops just before the hard fault is generated. But if I go back in the stack trace, the call has a valid address for xQueue:

~~~
Stack: Thread #1 (Suspended : Signal : SIGTRAP:Trace/breakpoint trap)
xQueueGenericReceive() at queue.c:1,239 0x8010e62
gps_task() at gps.c:196 0x80136f6
uxListRemove() at list.c:238 0x8010374
~~~

Looking at gps_task(), it contains the following block of code:

~~~
for (;;)
{
    uint8_t newbyte;
    BaseType_t status = xQueueReceive(ggpsuartqueuehandle, &newbyte, portMAX_DELAY);
    if (pdTRUE == status)
    {
~~~

Disassembled, this gives:

~~~
196 BaseType_t status = xQueueReceive(ggpsuartqueuehandle, &newbyte, portMAX_DELAY);
0x080136e6: movs r3, #0
0x080136e8: mov.w r2, #4294967295
0x080136ec: add.w r1, r7, #11
0x080136f0: ldr r0, [r4, #0]
0x080136f2: bl 0x8010d28
197 if (pdTRUE == status)
0x080136f6: cmp r0, #1
0x080136f8: bne.n 0x80136e6 <gps_task+22>
~~~

After the load register instruction at 0x080136f0, ldr r0, [r4, #0], r0 = 0x320000, but r4 = 0x2000000a. So why is r0 wrong?
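
The breakpoint on one specific bad value can be generalised into a range check on the handle. This is only a sketch under stated assumptions: the 64 KB SRAM window starting at 0x20000000 is assumed for illustration (check the linker script for the real part), and `looks_like_sram_pointer()` and `check_observed_values()` are hypothetical helpers, not FreeRTOS APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed SRAM window, for illustration only: 64 KB at 0x20000000. */
#define ASSUMED_SRAM_BASE 0x20000000u
#define ASSUMED_SRAM_SIZE ( 64u * 1024u )

/* Hypothetical helper: true if addr falls inside the assumed SRAM window,
 * which is where a dynamically created queue would normally live. */
bool looks_like_sram_pointer( uint32_t addr )
{
    return ( addr >= ASSUMED_SRAM_BASE ) &&
           ( ( addr - ASSUMED_SRAM_BASE ) < ASSUMED_SRAM_SIZE );
}

/* Hypothetical self-check using the register values quoted in the post. */
void check_observed_values( void )
{
    assert( !looks_like_sram_pointer( 0x320000u ) );   /* the bad handle in r0 */
    assert( looks_like_sram_pointer( 0x2000000au ) );  /* the address held in r4 */
}
```

On the target, a check like this could be placed in front of the xQueueReceive() call, for example inside a configASSERT(), so that a corrupted handle is caught in the calling task rather than deep inside the kernel.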


Reading this on my phone at the moment so can't read the code properly. Will look in more detail tomorrow. If you get a chance, post the disassembly of the code at that point too.


Thanks Richard. I think the disassembly you asked for is already there. It seems something is corrupting the queue handle passed into xQueueReceive(). I have added a watchpoint on it, and the only times it gets touched are at init time and when the queue is created. Yet stepping through in assembly clearly shows it is wrong. It works for a while and then suddenly goes wrong. I will try another compiler tomorrow, the AC6 System Workbench one. I'm baffled.


I have tried the AC6 tool chain v2.4 (http://www.openstm32.org/HomePage). It uses the same compiler version (6.3.1, but a slightly later build) and also generates the hard fault.

~~~
./arm-none-eabi-gcc --version
arm-none-eabi-gcc.exe (GNU Tools for ARM Embedded Processors 6-2017-q2-update) 6.3.1 20170620 (release) [ARM/embedded-6-branch revision 249437]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
~~~


Thanks for your help, Richard. I found the problem, and as I expected it was in my code. I found an array overrun that was clobbering the queue handle. It seems that the optimisation settings changed what got clobbered when the overrun happened. It was just coincidence that it always clobbered the queue handle at levels -O2 and -O3.
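
For anyone chasing a similar overrun, one common technique is to put a guard word between a buffer and the data it might clobber. The sketch below is not the code from this project: the struct layout, the names `guarded_state_t`, `guarded_state_init()`, `guarded_state_intact()` and `simulate_write()`, and the guard value are all hypothetical, and the overrun is simulated on a host in a well-defined way rather than by a real out-of-bounds write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_WORD 0xDEADBEEFu

/* Hypothetical layout: member order inside a struct is fixed, so the guard
 * always sits between the buffer and the handle at every optimisation level. */
typedef struct
{
    uint8_t  rx_buffer[ 8 ];
    uint32_t guard;
    void    *queue_handle;
} guarded_state_t;

void guarded_state_init( guarded_state_t *s, void *handle )
{
    memset( s->rx_buffer, 0, sizeof s->rx_buffer );
    s->guard = GUARD_WORD;
    s->queue_handle = handle;
}

/* True while nothing has written past the end of rx_buffer into the guard. */
bool guarded_state_intact( const guarded_state_t *s )
{
    return s->guard == GUARD_WORD;
}

/* Simulates a writer that stores len bytes starting at rx_buffer. Going
 * through an unsigned char pointer to the whole struct keeps the simulation
 * well-defined, as long as len stays inside the struct. */
void simulate_write( guarded_state_t *s, uint8_t value, size_t len )
{
    unsigned char *bytes = ( unsigned char * ) s;
    size_t start = offsetof( guarded_state_t, rx_buffer );

    assert( start + len <= sizeof *s );
    memset( bytes + start, value, len );
}
```

Writing 8 bytes leaves the guard intact, while writing 10 bytes spills into it, so a periodic check of `guarded_state_intact()` (for example in a configASSERT()) would flag the overrun before the handle is used.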