Quantum Leaps, LLC, 3452 South Court, Palo Alto, CA 94306, www.state-machine.com
February 27, 2007
In this issue:
Understanding ARM Cortex-M3
The new ARM Cortex-M3 architecture has lately been getting a lot of attention. (See, for instance, the IAR white paper "Choosing an ARM processor ARM7 vs Cortex-M3", or just search the web for "Cortex-M3".) While most of the writing about the new ARM architecture presents a high-level overview of features, perhaps with some comparisons and market forecasts, you will be much better informed if you look at actual implementations of everyday tasks on Cortex-M3. For instance, how do you do preemptive multitasking on Cortex-M3? How do you use the Wait-For-Interrupt sleep mode? How do you implement a system clock tick or a software trace dump? All of these questions and more are answered in the new Quantum Development Kits (QDKs) for Cortex-M3. Quantum Leaps has recently released QDKs for QP/C and QP/C++, as well as QDKs-nano for QP-nano, each in two versions: one for the RealView compiler and one for the IAR ARM compiler. All these QDKs come with detailed manuals in PDF, and all have been validated with the EKC-LM3S811 evaluation kit from Luminary Micro, the first silicon vendor of Cortex-M3 devices.
Perhaps the most interesting part of the QDKs for Cortex-M3 is the implementation of the QK preemptive kernel. As you'll see, a single-stack, preemptive kernel implements in software exactly the same prioritization algorithm as the new Nested Vectored Interrupt Controller (NVIC) integrated into the ARMv7 architecture. In other words, QK extends to the task level the NVIC prioritization scheme used for nested interrupts.
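To make the analogy with the NVIC more concrete, here is a minimal sketch in C of a single-stack, run-to-completion scheduler. It is not the QK source code: task_register(), task_post(), schedule(), ready_set, curr_prio, and the demo tasks are all hypothetical names chosen for illustration, and a real kernel would lock interrupts around every access to the shared variables. The point it illustrates is that a task runs only when its priority exceeds the priority currently being serviced, and that it runs as a nested function call on the same stack, much as the NVIC nests a higher-priority interrupt on top of a lower-priority one.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TASKS 8u                 /* hypothetical: priorities 1..8, 0 = idle */

typedef void (*TaskHandler)(void);

static TaskHandler task_table[MAX_TASKS + 1u]; /* handlers indexed by priority */
static uint8_t ready_set;            /* bit (p-1) set => priority p is ready   */
static uint8_t curr_prio;            /* priority being serviced (0 = idle)     */

int trace[16];                       /* demo only: order in which code ran     */
unsigned trace_len;

static void record(int id) {
    if (trace_len < 16u) {
        trace[trace_len++] = id;
    }
}

/* Register a run-to-completion handler at the given priority. */
void task_register(uint8_t prio, TaskHandler handler) {
    assert((prio >= 1u) && (prio <= MAX_TASKS));
    task_table[prio] = handler;
}

/* Return the highest ready priority, or 0 when nothing is ready. */
static uint8_t highest_ready(void) {
    uint8_t p;
    for (p = MAX_TASKS; p > 0u; --p) {
        if ((ready_set & (uint8_t)(1u << (p - 1u))) != 0u) {
            return p;
        }
    }
    return 0u;
}

/* Run every ready task whose priority exceeds the one being serviced.
 * Each task is an ordinary nested call, so one stack serves all tasks. */
void schedule(void) {
    uint8_t const saved = curr_prio;
    uint8_t p;
    while ((p = highest_ready()) > saved) {
        ready_set &= (uint8_t)~(1u << (p - 1u));
        curr_prio = p;
        task_table[p]();             /* runs to completion, may be preempted */
    }
    curr_prio = saved;               /* resume the preempted priority level  */
}

/* Make a task ready; it preempts at once only if its priority is higher. */
void task_post(uint8_t prio) {
    assert((prio >= 1u) && (prio <= MAX_TASKS) && (task_table[prio] != 0));
    ready_set |= (uint8_t)(1u << (prio - 1u));
    schedule();
}

/* Demo tasks (hypothetical). */
static void high_task(void) { record(30); task_post(2u); record(31); }
static void mid_task(void)  { record(20); }
static void low_task(void)  { record(10); task_post(3u); record(11); }

/* Run the demo and return the number of trace entries recorded. */
unsigned run_demo(void) {
    trace_len = 0u;
    ready_set = 0u;
    curr_prio = 0u;
    task_register(1u, low_task);
    task_register(2u, mid_task);
    task_register(3u, high_task);
    task_post(1u);
    return trace_len;
}
```

In this sketch, calling run_demo() from main() starts the low-priority task, which is preempted immediately by the high-priority task it posts. The medium-priority task posted from the high-priority task has to wait until the high-priority task completes, and the low-priority task resumes only after that.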
Quantum Kernel (QK) Supports Generic Co-processors
Starting from version 3.2.04, QK supports a generic extended context switch for various co-processors, such as hardware floating point co-processors or other hardware accelerators. C/C++ compilers typically support various co-processors, such as FPUs, in that they generate instructions for these hardware accelerators. However, the compiler typically assumes that any interrupt necessarily returns to the exact point of preemption and therefore the C/C++ compiler generally does NOT perform the co-processor context saving and restoring in the interrupts.
Being a preemptive kernel, QK in general does not return to the original interrupted task, but rather performs asynchronous preemption if a higher-priority task becomes ready to run as a result of the interrupt. This violates the assumption of the C/C++ compiler, and therefore requires special care to save and restore the co-processor context upon asynchronous preemption. For best performance, maximum flexibility, and minimal stack usage, the implementation of the extended context switch is divided into two parts. The first part is handled in the interrupt exit (typically the QK_ISR_EXIT() macro).
The following listing shows the QK_ISR_EXIT() macro for the x86 processor with the x87 math co-processor (FPU). The interrupt exit sequence begins with disabling all interrupts (1), which is immediately followed by writing the End-Of-Interrupt (EOI) command to the interrupt controller (2). The QK priority is then restored from the temporary stack variable pin_ (passed as the argument to the macro). The rest of the exit sequence is executed only if the interrupt has preempted a task (i.e., the saved priority pin_ does not exceed QF_MAX_ACTIVE). If, additionally, the preempted task is using the co-processor (which is recorded in the osObject__ bitmask of this task), the extended QK scheduler is invoked (3). Otherwise, the regular QK scheduler is called (4).
#define QK_ISR_EXIT(pin_) do { \
    /* (1) */ disable(); \
    /* (2) */ outportb(0x20, 0x20); \
    QK_currPrio_ = (pin_); \
    if ((pin_) <= QF_MAX_ACTIVE) { \
        if ((QF_active_[pin_]->osObject__ & QK_FPU_THREAD) != 0) { \
            /* (3) */ QK_scheduleExt_(); \
        } \
        else { \
            /* (4) */ QK_schedule_(); \
        } \
    } \
} while (0)
The extended scheduler QK_scheduleExt_() saves and restores the co-processor context around launching any task that asynchronously preempts the interrupted task (3). To do this, QK_scheduleExt_() allocates the co-processor context on the stack, so the extended scheduler call uses MORE of the stack than the regular scheduler. However, if the co-processor is not used by a particular task, the extended context switch is not necessary, and the regular QK scheduler is adequate (4). In this case the extended co-processor context switch is not performed and the co-processor context does not even need to be allocated on the stack.
The QK implementation of the extended context switch is much more efficient than what is possible in a traditional multi-stack kernel. In QK, you pay for the extended context only in asynchronous preemptions. The synchronous task-to-task preemptions are regular C-function calls, which don't need any extra attention from the kernel. In contrast, in a traditional kernel you pay the full price of the extended context switch every time. Additionally, QK allows you to be selective and perform the extended context switch only for tasks that actually use the co-processor, rather than for all tasks.
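The difference between the two schedulers can be illustrated with a small host-side simulation in C. This is only a sketch under stated assumptions, not the QK implementation: the "co-processor" is a plain global structure, and the names CoprocContext, coproc, USES_COPROC, schedule_regular(), schedule_extended(), isr_exit(), and demo_preempt() are hypothetical. On real hardware the save and restore would use the co-processor's own instructions (for example, saving and reloading the x87 state) instead of a structure copy.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the co-processor register file (assumption for this sketch). */
typedef struct {
    double reg[8];
} CoprocContext;

CoprocContext coproc;                /* simulated co-processor registers */

#define USES_COPROC 0x01u            /* hypothetical per-task flag */

typedef void (*Task)(void);

static Task pending_task;            /* task made ready by the "interrupt" */

/* Regular scheduler: launch the pending task with no extra context work. */
static void schedule_regular(void) {
    if (pending_task != 0) {
        Task const t = pending_task;
        pending_task = 0;
        t();
    }
}

/* Extended scheduler: the saved context occupies stack space only here,
 * that is, only when a co-processor task is asynchronously preempted. */
static void schedule_extended(void) {
    CoprocContext const saved = coproc;   /* save on the stack */
    schedule_regular();
    coproc = saved;                       /* restore for the preempted task */
}

/* Interrupt exit: choose the scheduler from the preempted task's flags. */
void isr_exit(uint8_t preempted_task_flags) {
    if ((preempted_task_flags & USES_COPROC) != 0u) {
        schedule_extended();
    }
    else {
        schedule_regular();
    }
    assert(pending_task == 0);       /* the pending task has been run */
}

/* Demo: a preempting task that overwrites a co-processor register. */
static void clobbering_task(void) {
    coproc.reg[0] = -1.0;
}

/* Returns what the preempted task sees in reg[0] after the interrupt. */
double demo_preempt(uint8_t preempted_task_flags) {
    coproc.reg[0] = 3.5;             /* intermediate result of the task  */
    pending_task = clobbering_task;  /* the ISR readied another task     */
    isr_exit(preempted_task_flags);
    return coproc.reg[0];
}
```

In this simulation, demo_preempt(USES_COPROC) returns the original value 3.5, because the extended scheduler restored the context, whereas demo_preempt(0) returns the clobbered value. The second case is safe only when the preempted task really does not use the co-processor, which is exactly the per-task choice described above.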
QK is described in Section 4 of the "QP Programmer's Manual".
Download QP with the QK extended context switch for the x87 FPU.
Quantum Kernel (QK) Supports Thread-Local Storage (TLS)
Starting from version 3.2.04, QK supports "Thread-Local Storage" (TLS), which is used, for example, in the Newlib C-runtime library. The concept of TLS, as it is used in Newlib's reentrancy model, might not be obvious at first glance, so here is a short description of how it works (see the ESD article "Embedding with GNU: Newlib"). Once you know the details, it will be clear how to set it up properly in your system. Newlib declares one _reent structure and aims the global _impure_ptr pointer at it during initialization, so everything starts out correctly for situations where only one thread of execution is in the library at a time. To facilitate multiple contexts, you must take two additional steps: you must provide one _reent structure for each execution thread (active object in QF), and you must move _impure_ptr between these structures during context switches. QK supports the TLS concept and provides a context-switch hook QK_TLS(), which is invoked every time a different task priority is processed. The following code fragment from the QK port to the Altera Nios II processor defines the macro QK_TLS() for re-assigning Newlib's _impure_ptr during context switches:
#define QK_TLS(tls_) \
if ((tls_) != (void *)0) { \
_impure_ptr = (struct _reent *)(tls_); \
} else ((void)0)
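The effect of moving the pointer can be tried out on a host machine without Newlib. The following sketch in C is an illustration only: struct Reent, impure_ptr, TLS_SWITCH(), lib_set_error(), run_task(), and the Worker objects are hypothetical stand-ins for struct _reent, _impure_ptr, QK_TLS(), a library routine that writes per-thread state, the kernel's task launch, and active objects. Unlike the kernel, run_task() simply restores the previous pointer when the task step finishes, which is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Newlib's struct _reent (assumption for this sketch). */
struct Reent {
    int last_error;
};

struct Reent global_reent;                  /* default, shared context  */
struct Reent *impure_ptr = &global_reent;   /* stand-in for _impure_ptr */

/* Same shape as the QK_TLS() macro: switch only when a TLS is provided. */
#define TLS_SWITCH(tls_) \
    if ((tls_) != (void *)0) { \
        impure_ptr = (struct Reent *)(tls_); \
    } else ((void)0)

/* A "library" routine that stores state through the current pointer. */
void lib_set_error(int code) {
    assert(impure_ptr != NULL);
    impure_ptr->last_error = code;
}

/* An object that owns its own thread-local storage. */
typedef struct {
    struct Reent tls;
    int id;
} Worker;

Worker worker_a;
Worker worker_b;

/* Run one step of a task in its own context, then restore the old one.
 * Passing NULL means the task does not use the library's TLS at all. */
void run_task(void *tls, int code) {
    struct Reent *const prev = impure_ptr;
    TLS_SWITCH(tls);
    lib_set_error(code);
    impure_ptr = prev;
}
```

For example, after run_task(&worker_a.tls, 11), run_task(&worker_b.tls, 22), and run_task(NULL, 33), each worker holds its own error code, while the call that passed NULL wrote into the shared default structure.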
While the QK_TLS() macro will move the _impure_ptr, you are responsible for allocating the _reent structure in each active object that actually uses the Newlib facilities. In other words, you have the option of not allocating the _reent structure and not performing the _impure_ptr re-assignment for those active objects that don't use the Newlib. The QDPP sample application for Nios II processor provides an example of using the TLS selectively for Philosopher active objects, but not for the Table active object. The Philosopher active objects use the TLS as follows (see file
class Philosopher : public QActive {
private:
    struct _reent tls_;   // thread-local storage (TLS) for Newlib
    uint8_t num_;         // number of this philosopher
    QTimeEvt timeEvt_;    // to timeout thinking or eating
    . . .
};
Next, the pointer to the TLS storage is passed to the framework during the invocation of the QActive::start() function:
void philosopherStart(uint8_t n, uint8_t prio,
                      QEvent const *qSto[], uint32_t qLen)
{
    Q_REQUIRE(n < N);
    _impure_ptr = &l_philo[n].tls_;   // initialize the TLS for Newlib
    _REENT_INIT_PTR(_impure_ptr);
    TableEvt ie;                      // initialization event
    ie.philNum = n;
    l_philo[n].start(prio,
                     qSto, qLen,
                     &l_philo[n].tls_,   // pointer to the TLS
                     0,
                     (QEvent *)&ie);
}
In contrast, the Table active object does not use the TLS, and simply passes the NULL pointer to the QActive::start() method (see
QP integrated with emWin from SEGGER
Latest Development Kits