"Fly 'n' Shoot" GameQP/C Certification Kit

The main principle of low-power design for software is to keep the hardware in the most appropriate low-power sleep mode for as long as possible. Most commonly, the software enters a low-power sleep mode from the idle callback (a.k.a. "idle hook"), which is called when the software has nothing left to do and is waiting for an interrupt to deliver more work. The QP/C and QP/C++ Real-Time Embedded Frameworks (RTEFs) support the idle callback in all of the built-in real-time kernels, such as the non-preemptive QV kernel, the preemptive non-blocking QK kernel and the preemptive blocking QXK kernel. Also, such an idle callback is provided in all 3rd-party traditional RTOS kernels that QP/C/C++ have been ported to.

Remarks: Design for low-power is a broad subject that requires a holistic system approach to achieve a really long battery-powered operation. This example covers only certain software-related aspects of the problem.

Most real-time systems, including traditional RTOSes and RTEFs, require a periodic time source called the system clock tick to keep track of time delays, timeouts, and QP::QTimeEvt "time events" in case of the event-driven QP frameworks. The system clock tick is typically a periodic interrupt that occurs at a predetermined rate, typically between 10Hz and 1000Hz.

While the system clock tick is very useful, it also has the unfortunate side effect of taking the processor out of a low power state as many as 1000 times per second regardless if real work needs to be done or not. This effect can have a significant negative impact on the power efficiency of the system.

Additional power dissipation caused by the system clock tick

The Tickless Mode

Some real-time kernels use the low-power optimization called the "tickless mode" (a.k.a. "tick suppression" or "dynamic tick"). In this mode, instead of indiscriminately making the clock tick fire with a fixed period, the kernel adjusts the clock ticks dynamically, as needed. Specifically, after each clock tick the kernel re-calculates the time for the next clock tick and then sets the clock tick interrupt for the earliest timeout it has to wait for. So for example, if the shortest wait the kernel has to attend to is 300 milliseconds into the future, then the clock interrupt will be set for 300 milliseconds.

This approach maximizes the amount of time the processor can remain asleep, but requires the kernel to perform the additional work to calculate the dynamic tick intervals and to program them into the hardware. This additional bookkeeping adds complexity to the kernel, is often non-deterministic and, most importantly, extends the time CPU spends in the high-power active mode and thus eliminates some of the power gains the "tickless mode" was supposed to bring.

Also, the "tickless mode" requires a more capable hardware timer that must be able of being reprogrammed for every interrupt in a wide dynamic range and also must accurately keep track of the elapsed time to correct for the irregular (dynamic) tick intervals. Still, "tickless mode" often causes a drift in the timing of the clock tick.

Multiple Tick Rates

For the reasons just mentioned, the QP™ Real-Time Embedded Frameworks don't provide the "tickless mode". Instead, the QP™ frameworks support multiple clock tick rates, which can be turned on and off, as needed. The QP™ frameworks also provide methods for finding out when a given clock tick rate is not used, which allows the idle callback inside the application to shut down the given clock tick rate and also to decide which sleep mode to use for the CPU and the peripherals.

The support for multiple static clock tick rates is much simpler than the "dynamic tick", and essentially does not increase the complexity of the kernel (because the same code for the single tick rate can handle other tick rates the same way). Also, multiple static tick rates require much simpler hardware timers, which can be clocked specifically to the desired frequency and don't need particularly wide dynamic range. For example, 16-bit timers or even 8-bit timers are completely adequate.

Yet the multiple clock rates can deliver similar low-power operation for the system, while keeping the QP framework much simpler and easier to certify than "tickless" kernels. The rest of this example explains how to use the multiple static clock rates in QP/C/C++ and shows how to leverage this feature to achieve low-power software design.

The Low-Power Code

The low-power example is located in QP/C and QP/C++ distributions, in the directory with the following structure:

qpc|qpcpp/                    // QP/C/C++ installation directory
 +-examples/                  // QP/C/C++ examples directory (application)
 | +-arm-cm/                  // QP/C/C++ examples for ARM Cortex-M
 | | +-low-power_ek-tm4c123gxl/ // Low-Power example on the EK-TM4C123GLX board
 | | | +-qk/      //----------- Projects for the preemptive QK kernel
 | | | | +-arm/               // ARM-KEIL toolchain
 | | | | |   low-power-qk.uvprojx // uVision project
 | | | | +-gnu/               // GNU-ARM toolchain
 | | | | |   Makefile         // Makefile for building the project
 | | | | +-iar/               // IAR-ARM toolchain
 | | | |     low-power-qk.eww // IAR EW-ARM workspace
 | | | |     bsp.c            // BSP for the QK kernel
 | | | +-qv/      //----------- Projects for the non-preemptive QV kernel
 | | | | +-arm/               // ARM-KEIL toolchain
 | | | | |   low-power-qv.uvprojx // uVision project
 | | | | +-gnu/               // GNU-ARM toolchain
 | | | | |   Makefile         // Makefile for building the project with GNU-ARM
 | | | | +-iar/               // IAR-ARM toolchain
 | | | |     low-power-qk.eww // IAR EW-ARM workspace
 | | | |     bsp.c|.cpp       // BSP for the QV kernel
 | | | +-qxk/     //----------- Projects for the dual-mode QXK kernel
 | | | | +-arm/               // ARM-KEIL toolchain
 | | | | |   low-power-qxk.uvprojx // uVision project
 | | | | +-gnu/               // GNU-ARM toolchain
 | | | | |   Makefile         // Makefile for building the project
 | | | | +-iar/               // IAR-ARM toolchain
 | | | |    low-power-qxk.eww // IAR EW-ARM workspace
 | | | |    bsp.c|.cpp        // BSP for the QxK kernel
 | | | |    xblinky1.c|.cpp   // eXtended thread for the QXK kernel

Note: To focus the discussion, this example uses the EK-TM4C123GXL evaluation board (a.k.a. TivaC LaunchPad) with Cortex-M4 MCU. However, the general demonstrated principles apply to most modern MCUs.

EK-TM4C123GXL (TivaC LaunchPad)

Behavior
The the low-power example illustrates the use of two clock tick rates to toggle the LEDs available on the EK-TM4C123GXL board. After the application code is loaded to the board, the Green-LED starts blinking once per two seconds (a second on and a second off), while the Red-LED lights up and stays on. If no buttons on the board are pressed, the Green-LED stops blinking after 4 times. The Red-LED shows the idle condition, where the system is in a sleep mode.

When your press the SW1-Button, the Green-LED starts blinking as before, but additionally, the Blue-LED starts blinking rapidly for 13 times (1/10 of a second on and 1/10 off).

So, depending when the SW1 switch is pressed, you can have only Green-LED blinking, or both green and blue blinking at different rates. The Red-LED appears to be on all the time.

Note: Actually, the Red-LED is also turned off for very brief moments, but this is imperceptible to the human eye. Instead, the Red-LED appears to be on all the time, which corresponds to the application being mostly idle.

The behavior just described is designed for the slow human interaction with the application. However, for more precise measurements with a logic analyzer, it is more convenient to speed up the application by factor of 100. This speed up can be achieved by editing the bsp.h header file:

// The following ticks-per-second constants determine the speed of the app.
// The default (#if 1) is the SLOW speed for humans to see the blinking.
// Change the #if 1 into #if 0 for FAST speed appropriate for logic analyzers.
//
#if 0
    #define BSP_TICKS0_PER_SEC   2U
    #define BSP_TICKS1_PER_SEC   20U
#else
    #define BSP_TICKS0_PER_SEC   200U
    #define BSP_TICKS1_PER_SEC   2000U
#endif
. . .

The following logic analyzer trace shows the behavior of the low-power application at the faster time scale. The explanation section immediately following the trace explains the most interesting points:

low-power application after pressing the SW1 button

[0] The plot shows the logic-analyzer traces of the following signals (from the top): SW1 button (active low), Blinky1 (Blue-LED), Blinky0 (Green-LED) and Idle (Red-LED). The Idle callback turns the Red-LED on when the system enters the low-power sleep mode and it turns the Red-LED off when the system is active.

[1] At this time the SW1 button is depressed, which triggers an interrupt that activates both the slow and the fast clock tick rates. The clock tick interrupts trigger toggling of the Blinky0 (Green-LED) at the slower tick rate-0 and Blinky1 (Blue-LED) at the faster tick rate-1.

[2] The Blinky1 (Blue-LED) stops toggling after 13 cycles. At this point also the Idle callback turns the fast tick rate-1 off.

[3] The Blinky0 (Green-LED) stops toggling after 4 cycles.

[4] The Idle callback turns the slow tick rate-0 off. The system enters the low-power sleep mode without any clock ticks running and remains in this state until the SW1 button is pressed again.

Remarks: The Idle line (Red-LED) goes down twice as fast as the changes in the state of the Green-LED or the Blue-LED. This is because the application uses timeouts of 2 clock ticks for toggling these lines instead of just one clock tick, which then could be slower. This is not quite optimal for the energy dissipation (as the CPU is woken up twice as often as it needs to be), but it illustrates more accurately how the fixed clock rates work as well as the one-tick delay to enter the low-power sleep mode from the idle callback.

State Machines

The versions of this low-power example for the QK and QV kernels use two active objects Blinky0 and Blinky1, which toggle the Green-LED at the slow tick rate 0, and the Blue-LED at the fast tick rate 1, respectively. The state machines of the Blinky0 and Blinky1 active objects are shown below:

state machines Blinky0 and Blinky1 active objects

[0] The Blinky0 state machine, in the entry action to the "active" state, calls BSP_tickRate0_on() to turn the tick rate-0 on. This is done before arming the time event me->timeEvt0 that uses the clock rate-0.

[1] Similarly, the Blinky1 state machine, in the entry action to the "active" state, calls BSP_tickRate1_on() to turn the tick rate-1 on. This is done before arming the time event me->timeEvt1 that uses the clock rate-1.

Extended Thread (QXK)
The version of this low-power example for the QXK kernel uses one active object Blinky0 (with the state machine shown above), but instead of the Blinky1 active object, the QXK version uses an eXtended thread (QXThread) called XBlinky1, with the code shown below:

    #include "qpc.h"
    #include "low_power.h"
    #include "bsp.h"
 
    // local objects -------------------------------------------------------------
    static void XBlinky1_run(QXThread * const me);
 
    // global objects ------------------------------------------------------------
    QXThread XT_Blinky1;
    QXSemaphore XSEM_sw1;
 
    //............................................................................
    void XBlinky1_ctor(void) {
        QXThread_ctor(&XT_Blinky1,
                      &XBlinky1_run, // thread routine
                      1U); // associate the thread with tick rate-1
    }
 
    //............................................................................
    static void XBlinky1_run(QXThread * const me) {
        bool isActive = false;
 
        (void)me; // unused parameter
        for (;;) {
            if (!isActive) {
[0]             QXSemaphore_wait(&XSEM_sw1, QXTHREAD_NO_TIMEOUT);
                isActive = true;
            }
            else {
                uint8_t count;
[1]             BSP_tickRate1_on();
                for (count = 13U; count > 0U; --count) {
                    BSP_led1_on();
                    QXThread_delay(1U);
                    BSP_led1_off();
                    QXThread_delay(1U);
                }
                isActive = false;
            }
        }
    }

[0] The XBlinky1 extended thread emulates the states with the isActive flag. When the flag is not set (meaning that the system is not active), the thread waits (and blocks) on the global semaphore XSEM_sw1.

[1] After the semaphore is signalled (from the GPIO interrupt in the BSP), the XBlinky1 extended thread calls BSP_tickRate1_on() to turn the tick rate-1 on. This is done before later calling QXThread_delay() function, which in case uses the clock rate-1.

The Idle Callback (QK/QXK)

The most important functionality in this low-power example is implemented in the idle callback located in the BSP (Board Support Package). The idle callback QK_onIdle() for the preemptive QK kernel and the idle callback QXK_onIdle() for the QXK kernel are almost identical and are explained in this section.

[0] void QXK_onIdle(void) {
 
[1]     QF_INT_DISABLE();
[2]     if (((l_activeSet & (1U << SYSTICK_ACTIVE)) != 0U) // rate-0 enabled?
[3]         && QF_noTimeEvtsActiveX(0U))  // no time events at rate-0?
        {
            // safe to disable SysTick and interrupt
[4]         SysTick->CTRL &= ~(SysTick_CTRL_TICKINT_Msk | SysTick_CTRL_ENABLE_Msk);
[5]         l_activeSet &= ~(1U << SYSTICK_ACTIVE); // mark rate-0 as disabled
        }
[6]     if (((l_activeSet & (1U << TIMER0_ACTIVE)) != 0U) // rate-1 enabled?
[7]         && QF_noTimeEvtsActiveX(1U))  // no time events at rate-1?
        {
            // safe to disable Timer0 and interrupt
[8]         TIMER0->CTL  &= ~(1U << 0); // disable Timer0
[9]         TIMER0->IMR  &= ~(1U << 0); // disable timer interrupt
[10]        l_activeSet &= ~(1U << TIMER0_ACTIVE); // mark rate-1 as disabled
        }
[11]    QF_INT_ENABLE();
 
[12]    GPIOF->DATA_Bits[LED_RED] = 0xFFU; // turn LED on, see NOTE2
[13]    __WFI(); // wait for interrupt
[14]    GPIOF->DATA_Bits[LED_RED] = 0x00U; // turn LED off, see NOTE2
    }

[0] The idle callback for QK and QXK are called from the idle loop with interrupts enabled.

[1] Interrupts are disabled to access the shared bitmask l_activeSet, which stores the information about active clock rates and peripherals. This bitmask is shared between the idle callback and the application-level threads.

[2] If the SYSTICK timer is active (source of the tick-0 rate)

[3] The QF_noTimeEvtsActiveX(0) function is called to check whether no time events are active at the clock rate-0.

Remarks: The QF_noTimeEvtsActiveX() function is designed to be called from a critical section, which is the case here.

[4] If both of these conditions hold, it is safe to turn the clock rate-0 off, which is done here.

[5] The bit indicating that SYSTICK timer is active is cleared in the l_activeSet bitmask.

[6] Similarly, the bit corresponding to TIMER0 is checked in the l_activeSet bitmask.

[7] The QF_noTimeEvtsActiveX(1) function is called to check whether no time events are active at the clock rate-1.

[8-9] If both of these conditions hold, it is safe to turn the clock rate-1 off, which is done here.

[10] The bit indicating that TIMER0 timer is active is cleared in the l_activeSet bitmask.

[11] Interrupts are enabled, so the following code is no longer inside critical section

[12] The Red-LED is turned ON right before entering the low-power sleep mode

[13] The __WFI() instruction stops the CPU and enters the low-power sleep mode

[14] The Red-LED is turned off after waking up from the sleep mode

The Idle Callback (QV)
The idle callback QV_onIdle() for the non-preemptive QV kernel is slightly different, because it is called with interrupts disabled. The following listing shows the complete QV_onIdle() callback, with the most important points explained in the section below:

[0] void QV_onIdle(void) { // CAUTION: called with interrupts DISABLED
 
[1]     if (((l_activeSet & (1U << SYSTICK_ACTIVE)) != 0U) // rate-0 enabled?
[2]         && QF_noTimeEvtsActiveX(0U))  // no time events at rate-0?
        {
            // safe to disable SysTick and interrupt
            SysTick->CTRL &= ~(SysTick_CTRL_TICKINT_Msk | SysTick_CTRL_ENABLE_Msk);
            l_activeSet &= ~(1U << SYSTICK_ACTIVE); // mark rate-0 as disabled
        }
        if (((l_activeSet & (1U << TIMER0_ACTIVE)) != 0U) // rate-1 enabled?
            && QF_noTimeEvtsActiveX(1U))  // no time events at rate-1?
        {
            // safe to disable Timer0 and interrupt
            TIMER0->CTL  &= ~(1U << 0); // disable Timer0
            TIMER0->IMR  &= ~(1U << 0); // disable timer interrupt
            l_activeSet &= ~(1U << TIMER0_ACTIVE); // mark rate-1 as disabled
        }
 
        GPIOF->DATA_Bits[LED_RED] = 0xFFU; // turn LED on, see NOTE2
[3]     QV_CPU_SLEEP(); // atomically go to sleep and enable interrupts
        GPIOF->DATA_Bits[LED_RED] = 0x00U; // turn LED off, see NOTE2
    }

[0] The idle callback for QV is called from the QV event-loop with interrupts disabled.

[1] The l_activeSet bitmask is tested right away, because interrupts are already disabled

[2] The QF_noTimeEvtsActiveX(0) function is called to check whether no time events are active at the clock rate-0.

Remarks: The QF_noTimeEvtsActiveX() function is designed to be called from a critical section, which is the case here.

[3] The QV_CPU_SLEEP() macro enters low-power sleep mode with interrupts still disabled. This port-specific macro is designed to re-enable interrupts atomically with entering the sleep mode.

"Fly 'n' Shoot" GameQP/C Certification Kit