Richard:Bug in the LPC2000 port of FreeRTOS ?

Hello, A colleague of mine was able to isolate the following problem: In queue.c, line 489 (“xQueueGenericSend”),
/* If there was a task waiting for data to arrive on the
queue then unblock it now. */
if( listLIST_IS_EMPTY( &( pxQueue->xTasksWaitingToReceive ) ) == pdFALSE )
{
    if( xTaskRemoveFromEventList( &( pxQueue->xTasksWaitingToReceive ) ) == pdTRUE )
    {
        /* The unblocked task has a priority higher than
        our own so yield immediately.  Yes it is ok to do
        this from within the critical section - the kernel
        takes care of that. */
        portYIELD_WITHIN_API();
    }
}
When “portYIELD_WITHIN_API” is a SWI, the interrupt line of the scheduler (note that before this code there is a call to “taskENTER_CRITICAL”) is _NEVER_ enabled UNLESS some API within the target task is called that invokes taskEXIT_CRITICAL(). We have seen this happen, ending up with a task hogging the processor with the interrupt line of the hardware timer disabled!
Our incomplete solution was to enable the ISR line in the “vPortYieldProcessor” function, but we did not take the reference counting into consideration.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Before Cortex-M took over, ARM7 was one of the most popular FreeRTOS targets, and this used to come up frequently (e.g. http://www.freertos.org/FreeRTOS_Support_Forum_Archive/March_2009/freertos_taskYIELD_in_critical_section_3119904.html).  It is a long time since I have had to explain it, but needless to say this is not a bug but an essential part of the design for processor cores that perform yields using synchronous, unmaskable software interrupts. Each task has its own interrupt nesting count.  Switching to a task that was not in a critical section will result in interrupts becoming enabled again while that task is running.  Switching back to the task that was in a critical section when it yielded will result in interrupts becoming disabled again until the task exits the critical section it originally entered.  This is essential because it is (possibly) going to need atomic access to the queue (semaphore, mutex, whatever) when it starts running and already assumes it is in a critical section.  Were it to resume with interrupts enabled, that would be a bug. Generally it is best not to mess with the core kernel code without asking first. The code is extremely robust, although obviously not infallible. Regards.
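The per-task nesting count described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: SimTask and every sim_* name are hypothetical and are not part of FreeRTOS, and the real port keeps the count in a port-layer variable that is saved and restored with the task context rather than in a structure like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the part of a task's context that matters here.
   This is not the real FreeRTOS task control block. */
typedef struct
{
    uint32_t critical_nesting; /* per-task critical nesting count */
} SimTask;

static SimTask *sim_current; /* the task that is "running" in the model */

/* In this model interrupts are enabled only while the running task's
   nesting count is zero. */
bool sim_interrupts_enabled( void )
{
    return sim_current->critical_nesting == 0;
}

void sim_enter_critical( void )
{
    sim_current->critical_nesting++;
}

void sim_exit_critical( void )
{
    /* Exiting more often than entering would be a usage error. */
    assert( sim_current->critical_nesting > 0 );
    sim_current->critical_nesting--;
}

/* A yield makes another task current; that task's own count, and therefore
   its own interrupt state, comes with it. */
void sim_switch_to( SimTask *next )
{
    sim_current = next;
}
```

In this model, a task that calls sim_enter_critical() and is then switched out leaves the next task running with interrupts enabled, and finds interrupts disabled again when it is switched back in.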

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard, I still have not inspected the link you provided but thank you for it.
We understand this behaviour and think it is a good thing – of course!
However, in order to fail the system (my previous post refers to the same situation) we only needed to add one empty task –  as in really empty – and the system would end up in that task with the scheduler disabled. In order to “release” the scheduler, we can poll a queue with time 0. Is this by design?
Thanks again for your time and attention!

Richard:Bug in the LPC2000 port of FreeRTOS ?

However, in order to fail the system (my previous post refers to the same situation) we only needed to add one empty task –  as in really empty – and the system would end up in that task with the scheduler disabled. In order to “release” the scheduler, we can poll a queue with time 0. Is this by design?
Do you mean with the scheduler suspended?  That is, with the variable uxSchedulerSuspended set to a non-zero value in tasks.c.  Or do you mean with interrupts disabled, which was implied in your previous post? Are you following the guidelines provided here:
http://www.freertos.org/FAQHelp.html ? Regards.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard, I meant that the system ends up with the interrupt line itself disabled. I think I comply with all the FAQ recommendations.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Is the task completely empty, as in
void vATask( void )
{
    for( ;; );
}
When is the task created?  Before the scheduler starts, or after? What is the priority of the task relative to configMAX_PRIORITIES and relative to other tasks? What is the value of ulCriticalNesting in portISR.c (if you are using GCC) when the task is running? Do you ever manipulate the interrupt flags yourself, other than through calls to taskENTER_CRITICAL() and taskEXIT_CRITICAL()? Regards.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard, All tasks are created before the scheduler is started. The priority of the empty task is equal to that of all other tasks (4/8). I am using the Keil compiler, but I don’t know the value of that variable (not at work now…). We only use the functions you mentioned to manipulate the status of the scheduler. But Richard, we see it happening clearly: the function I posted an excerpt from above disables the scheduler, then yields the processor while it is off, following the logic above. I would expect that the new task (the empty one) would then have the scheduler enabled during the context switch by its own private status, but that does not happen. The scheduler remains disabled forever! I would give you the code but it is a commercial product…if you don’t have an idea, maybe I can attempt to write a tiny example program. It will require at least one empty task and some simple signalling in such a way that the logic above is triggered. If I am successful, is there a way to deliver the test to you? I hope we won’t get that far though…

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard, I am having a better look at the code. Once “portYIELD_WITHIN_API()” above is called, the following happens: since #define portYIELD_WITHIN_API portYIELD and #define portYIELD() vPortYield(), vPortYield in portASM.s is called. This function triggers SVC 1. SVC 1 is the function “vPortYieldProcessor” of portASM.s, which saves and restores the context of the switched tasks. But nowhere is the scheduler enabled if the reference counter hits 0! And this is the reason for the problem, unless I missed something.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Getting the code to me is not a problem, and if you are using Keil, then it should run in their simulator too which will make it easier.  However, I hope that won’t be necessary. When you are in the office, look at the file Source/portable/RVDS/ARM7_LPC21xx/portmacro.inc.  It contains the portRESTORE_CONTEXT macro that runs at the end of the yield function.  In that macro, note the line that has the comment “Get the SPSR from the stack.” The interrupt state is held in the PSR (bit 7, the I bit, where 0 means IRQs are enabled).  When the task is created, the initial PSR that is put on the task’s stack is set to 0x1f (System mode, with interrupts enabled).  Step through the end of the yield function in the debugger when the task starts for the first time.  When the instruction “LDMFD LR!, {R0}” (on the same line as the comment) executes it should load this 0x1f from the task’s clean stack into R0, then executing the following line “MSR SPSR_cxsf, R0” should load this into the PSR register, ensuring interrupts are enabled the first time the task runs.  If you are seeing any value other than 0x1f coming from the stack at that point then the task stack is corrupt before it even starts, and I suspect this is what is happening if the task really is empty.  Calling any function that enters then exits a critical section, such as the queue function you mention, will cause interrupts to become enabled, so that part of what you see is explainable.  This is because the critical nesting count will be incremented when the critical section is entered and decremented when the critical section is exited, and having a zero nesting count will result in interrupts becoming enabled. Please let me know what you find. Regards.
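As a rough aid for the debugger check described above, the 0x1f value can be decoded on a host machine. This is a minimal sketch: the macro and function names are made up for illustration and are not FreeRTOS identifiers. The bit positions are those of the ARM7 program status register (mode in bits 4..0, I bit in bit 7, F bit in bit 6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ARM7 program status register fields (names are illustrative only). */
#define PSR_MODE_MASK    0x1Fu        /* bits 4..0: processor mode       */
#define PSR_MODE_SYSTEM  0x1Fu        /* System mode                     */
#define PSR_I_BIT        ( 1u << 7 )  /* set = IRQ masked, clear = IRQ on */
#define PSR_F_BIT        ( 1u << 6 )  /* set = FIQ masked, clear = FIQ on */

/* The initial PSR value the post says is placed on a new task's stack. */
#define INITIAL_TASK_PSR 0x1Fu

bool psr_irq_enabled( uint32_t psr )
{
    return ( psr & PSR_I_BIT ) == 0;
}

bool psr_is_system_mode( uint32_t psr )
{
    return ( psr & PSR_MODE_MASK ) == PSR_MODE_SYSTEM;
}

/* Mirrors the manual check: a clean initial PSR is System mode with IRQs on. */
void psr_self_check( void )
{
    assert( psr_is_system_mode( INITIAL_TASK_PSR ) );
    assert( psr_irq_enabled( INITIAL_TASK_PSR ) );
}
```

A value such as 0x9f read from the stack at that point would decode as System mode with IRQs masked, which is not what a freshly created task should start with.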

Richard:Bug in the LPC2000 port of FreeRTOS ?

I will report my findings on Monday! It has been a while since I delved into the ARM7 architecture, but you are right…
Thanks again, I will post what I find.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Good morning Richard, As far as I can tell, the controller ends up in the newly added task with the interrupt line of T0 disabled while the interrupt mask I in CPSR is enabled (value 0) for that task.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard, Based on what you posted above and the code, I cannot find anywhere in the code where the interrupt line of the scheduler’s hardware timer is enabled. Yes, the reference counter is restored correctly but if the scenario above is followed – i.e., a forced context switch using an SVC while the scheduler is disabled for the task that is left – the interrupt line is not actively enabled/disabled before the target task is switched in. Again: the reference counter is restored, but the underlying interrupt line of the processor does not seem to be adjusted according to the latest status of the target task.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Wait a second.
We disable/enable interrupts using the interrupt line of the hardware timer that drives the kernel.
I think the kernel assumes synchronization occurs strictly using CPSR…
I will change it and see what happens!
If that is the problem, it is worth mentioning in the manual….

Richard:Bug in the LPC2000 port of FreeRTOS ?

One more comment.
If my previous statement is the root cause, then every time one enters a critical section, all interrupts will be disabled rather than only the interrupt source of the scheduler. That may not be necessary.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Ok – that is a completely different matter to anything discussed so far then because T0 (presumably used to create the tick interrupt) is an LPC2000 peripheral, and not part of the ARM7 core.
Based on what you posted above and the code, I cannot find anywhere in the code where the interrupt line of the scheduler’s hardware timer is enabled.
Can you tell me if, up to that point, the tick interrupt was firing and the xTickCount variable in tasks.c is non-zero? The peripheral timer is configured in prvSetupTimerInterrupt() in port.c, which is called from xPortStartScheduler(). When the timer generates an interrupt (assuming configUSE_PREEMPTION is set to 1) vPreemptiveTick() in portISR.s executes to service the interrupt.  The interrupt is cleared in that function (you will see the comments “; Clear the timer event” and “; Acknowledge the interrupt”).
Yes, the reference counter is restored correctly but if the scenario above is followed – i.e., a forced context switch using an SVC while the scheduler is disabled for the task that is left –
By reference counter, I presume you mean the critical nesting counter. By scheduler disabled, I presume you mean in a critical section (disabling the scheduler has a different meaning in FreeRTOS). I don’t know what you mean by “the task that is left” though.  Do you mean the task that is being switched in, or the task being switched out?
the interrupt line is not actively enabled/disabled before the target task is switched in.
In one post before your last post, you said the global interrupts were enabled, so you are seeing that what you are saying you cannot see happen has actually happened.  The global interrupt state is in the PSR register that is saved and restored – so by swapping the PSR register as you swap tasks you are enabling and disabling interrupts – there is not a separate instruction to do this, which is probably why you cannot see it. If a task is swapped out with interrupts disabled, the PSR register is saved as part of that task’s context (with the interrupt enable bit in the register).  If the task being swapped in has interrupts enabled in its copy of PSR, then interrupts are enabled when its own PSR copy is moved into the PSR register.  Then, when the original task is eventually swapped back in, its saved PSR register is copied into the real register and, because it was saved with the interrupts disabled and the bit is in the PSR register, interrupts are once again disabled.
Again: the reference counter is restored, but the underlying interrupt line of the processor does not seem to be adjusted according to the latest status of the target task.
Maybe I’m not understanding you because two posts ago you said the interrupt enable line was being re-enabled, but the interrupt of T0 was disabled? Regards.

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard,
Can you tell me if, up to that point, the tick interrupt was firing and the xTickCount variable in tasks.c is non-zero?
It is indeed non-zero.
I don’t know what you mean by “the task that is left” though.  Do you mean the task that is being switched in, or the task being switched out?
I meant the task that is switched out.
Maybe I’m not understanding you because two posts ago you said the interrupt enable line was being re-enabled, but the interrupt of T0 was disabled?
I see no surprises when it comes to the global interrupt bit field and the task interrupt status. Inspecting the context switch code, the scheduler relies on the status of the CPSR of the task to be switched in. So using the interrupt line on the VIC of the hardware timer that drives the kernel is the wrong way to synchronize? In other words, portENTER_CRITICAL MUST manipulate the I bit of CPSR? Is this correct?

Richard:Bug in the LPC2000 port of FreeRTOS ?

Our posts have crossed over somewhat.
We disable/enable interrupts using the interrupt line of the hardware timer that drives the kernel.
The kernel does not do that.  So, if you are doing that yourself, then once again I am confused by what you are saying, because some time ago I asked if you manipulated interrupt enables in any way other than using the taskENTER_CRITICAL() and taskEXIT_CRITICAL() macros, and you said no? T0 is under the control of the kernel, and should not be reconfigured in any way. Regards.
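Why masking the timer line behaves differently from a critical section can be sketched with a small host-side model. Everything here is hypothetical: the sim_* names are not FreeRTOS or LPC2000 identifiers, and the timer bit position is chosen only for illustration. The point it tries to show is that the PSR is swapped with each task's context, while a peripheral interrupt enable is global state that a context switch never touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_PSR_I_BIT     ( 1u << 7 )  /* core IRQ mask bit in the PSR          */
#define SIM_TIMER_LINE    ( 1u << 4 )  /* illustrative bit for the tick timer   */

/* Hypothetical per-task context: only the saved PSR matters for this model. */
typedef struct
{
    uint32_t saved_psr;
} SimTaskCtx;

static uint32_t sim_psr = 0x1Fu;               /* core register, per-task via context */
static uint32_t sim_vic_enable = SIM_TIMER_LINE; /* peripheral register, global       */

void sim_mask_irq_in_psr( void )   { sim_psr |= SIM_PSR_I_BIT; }
void sim_unmask_irq_in_psr( void ) { sim_psr &= ~SIM_PSR_I_BIT; }
void sim_mask_timer_line( void )   { sim_vic_enable &= ~SIM_TIMER_LINE; }

/* Save the outgoing task's PSR and load the incoming task's PSR.
   The peripheral enable register is deliberately left untouched, because
   it is not part of any task's context. */
void sim_context_switch( SimTaskCtx *out, const SimTaskCtx *in )
{
    assert( out != 0 && in != 0 );
    out->saved_psr = sim_psr;
    sim_psr = in->saved_psr;
}

/* The tick can only fire if both the core and the peripheral allow it. */
bool sim_tick_can_fire( void )
{
    return ( ( sim_psr & SIM_PSR_I_BIT ) == 0 ) &&
           ( ( sim_vic_enable & SIM_TIMER_LINE ) != 0 );
}
```

In this model, masking through the PSR is undone for the next task by the context switch, whereas a masked timer line stays masked in every task until some code explicitly enables it again, which matches the symptom reported in this thread.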

Richard:Bug in the LPC2000 port of FreeRTOS ?

Richard,
T0 is under the control of the kernel, and should not be reconfigured in any way.
I guess this is the underlying cause of all of this.
I am sorry, I did not understand your question at the time, but as you can see above I realized what was wrong a couple of minutes ago.
Thanks a lot for the time and effort! Great product, by the way.