Hello and welcome to the Modern Embedded Systems Programming course. I'm Miro Samek, and in this lesson, I go back to the foreground background architecture, also known as the "superloop." This time, I focus on combining it with the low-power sleep modes of the CPU. If you're new to the Quantum Leaps channel and the Modern Embedded Systems Programming course, I introduced the foreground/background architecture in lesson #21. However, I still need to explain the use of low-power sleep modes in the "superloop" because it is tricky, and many developers, including myself in the past, get it wrong. The need to explain these aspects came up while I planned future lessons about the various ways of executing event-driven Active Objects, so this lesson lays some groundwork for the next one. However, the subject of today's lesson is important in its own right, especially because the "superloop," also known as the "bare metal" firmware structure, is applied in such a wide range of products, many of them battery-operated. So, let's begin today with the basic principles of low-power design. The power dissipated by general-purpose digital electronics consists of static dissipation due to current leakage and dynamic dissipation due to switching. Dynamic dissipation typically dominates and is caused by charging and discharging internal parasitic capacitances as the circuit switches from zero to one and one to zero. Perhaps a useful mental model of the process is that switching from zero to one is like filling a bucket (capacitance) with water (electric charge), and switching from one to zero is like tossing the water out (discharging the capacitor). This happens at every clock tick, meaning at megahertz frequencies. The amount of wasted power is proportional to how many such buckets are filled and how frequently that happens. That's why modern microcontrollers allow the software to turn the clock signal on and off for various peripherals. 
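To put the bucket analogy into numbers, here is the usual first-order textbook model of CMOS power dissipation. It is not taken from the lesson itself, and the symbols are assumptions introduced only for this illustration: \alpha is the fraction of the "buckets" that actually switch per clock tick, C is the total switched capacitance, V_{DD} is the supply voltage, f is the clock frequency, and I_{leak} is the leakage current.

```latex
P_{\text{total}} = P_{\text{static}} + P_{\text{dynamic}}, \qquad
P_{\text{static}} \approx I_{\text{leak}} \, V_{DD}, \qquad
P_{\text{dynamic}} \approx \alpha \, C \, V_{DD}^{2} \, f
```

In this model, gating the clock to a peripheral (or to the CPU) drives the effective f for that block to zero, so its dynamic term vanishes and only the leakage term remains.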
However, to achieve really low power consumption, the clock signal must be turned off for the CPU itself, which is called a low-power sleep mode. This poses an interesting problem because the CPU stops executing instructions, so the software is no longer in control. The CPU must thus be woken up by external means, typically an interrupt. Which brings us to the foreground/background software architecture, where the foreground consists of interrupts, and the background consists of the endless loop inside the main function. As usual, the interrupts should be short and fast, and the bulk of the work should be performed in the background loop. But now, the background loop must also enter the CPU sleep mode. The critical design decision is how exactly the background loop should detect that the CPU can be safely stopped in a low-power sleep mode. Well, many people ask this question. Here, for example, is a post on the STM32 community forum: "How to work with interrupts and low power modes when using the HAL?" HAL means "Hardware Abstraction Layer," and in the context of STM32, it implies the "superloop" architecture. The person asking this question provides his sample code, which is based on the global bitmask of flags, where each bit is associated with a task to be performed in the background loop. The flags are set in the foreground, like here in the callback functions invoked from the interrupt service routines. The advantage of having all the task flags grouped together in a single bitmask is that the background loop can quickly check whether all the flags are cleared, meaning there are no tasks to perform. In that case, the background loop suspends some peripherals and enters sleep mode. This stops the code execution until an interrupt occurs. When the CPU wakes up, the background loop resumes and checks which individual flags are set. For all set bits, the loop calls the associated handler functions and also clears the bits to indicate that the tasks are done. 
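The forum design described above can be sketched roughly as follows. This is a host-buildable approximation, not the poster's actual code: the flag names, the counters, and the function names are all hypothetical, one pass of the endless loop is pulled out into superloop_pass() so it can be exercised in isolation, and go_to_sleep() only counts calls instead of stopping a CPU. The accesses to the shared bitmask are deliberately left unprotected here, exactly as in the starting-point design.

```c
#include <stdint.h>

/* Hypothetical task flags: one bit per background task. */
#define TASK1_FLAG (1U << 0)
#define TASK2_FLAG (1U << 1)

/* Global bitmask: set in the foreground (ISRs), cleared in the background. */
static volatile uint32_t isr_flags;

/* Host-side counters standing in for real side effects. */
static unsigned task1_runs;
static unsigned task2_runs;
static unsigned sleeps;

static void go_to_sleep(void) { ++sleeps; }     /* real target: enter sleep mode */
static void task1(void)       { ++task1_runs; } /* first background task */
static void task2(void)       { ++task2_runs; } /* second background task */

/* What the interrupt callbacks would do: only mark the task as ready. */
void on_button1(void) { isr_flags |= TASK1_FLAG; }
void on_button2(void) { isr_flags |= TASK2_FLAG; }

/* One pass of the background loop; main() would call this in for (;;). */
void superloop_pass(void) {
    if (isr_flags == 0U) {          /* no flags set: nothing to do */
        go_to_sleep();              /* execution stops until an interrupt */
    }
    if ((isr_flags & TASK1_FLAG) != 0U) {
        isr_flags &= ~TASK1_FLAG;   /* clear the bit: task is being handled */
        task1();
    }
    if ((isr_flags & TASK2_FLAG) != 0U) {
        isr_flags &= ~TASK2_FLAG;
        task2();
    }
}
```

Grouping the flags in one word is what makes the "anything to do?" check a single comparison against zero.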
This looks like a reasonable starting point, so let's actually implement this design and run it on your TivaC LaunchPad. For this, you can copy the original project for lesson #21 about the "superloop" and rename it to lesson #52. Here, let me quickly explain the new structure of the companion code downloads, which are now self-contained and include all the dependencies and projects for various boards, not just the TivaC LaunchPad. For example, many code downloads come with projects for the STM32 NUCLEO-C031C6 board. The work to support this Cortex-M0+ board, and perhaps others as well, is ongoing, but the downloads are gradually updated and maintained. But going back to the code for this lesson, let's get into the tm4c-keil directory and double-click on the µVision project "lesson". Now, let's copy the low-power "superloop" pseudocode from the ST discussion forum and paste it into the main.c file in your µVision project. Since this is just pseudocode, I need to make it more specific and improve its formatting in the process. The initialization is board-specific, so it'll be a call to BSP_init(). Now, inside the "superloop", all steps related to entering the sleep mode will be accomplished in the BSP_goToSleep() function. In the task activation section, the first task will, for example, deploy an airbag. The second task will, for example, engage an Anti-Lock Braking System. Once I know what BSP facilities will be needed, I must declare them in the bsp.h header file. Now, in the bsp.c source file, I must provide the board-specific implementation of all these facilities plus the Interrupt Service Routines (ISRs). The interrupts will be triggered by the onboard buttons, which are connected to the GPIOF port. Specifically, pressing SW1 will activate task-1 and pressing SW2 will activate task-2. I should not forget to provide the definitions of the button pins and, most importantly, the global bitmask of task flags.
The BSP_init() function needs to be extended to initialize the GPIOF pins for the buttons, and it also needs to configure and enable the GPIOF interrupt. The BSP_deployAirbag() function will turn on the onboard red LED to emulate airbag deployment. Similarly, the BSP_engageABS() function will turn on the onboard blue LED to emulate the ABS engagement. Finally, the most interesting function for today, BSP_goToSleep(), will put the CPU to sleep by means of the ARM Cortex-M WFI instruction: Wait-for-Interrupt. This is equivalent to the original function HAL_PWR_EnterSLEEPMode() with the PWR_SLEEPENTRY_WFI argument. Additionally, the go-to-sleep function will count the number of its invocations using a global counter, which will be helpful for debugging. This would be all for now, so let's try to build the project... Alright, so let's run it! So here is my arrangement of the various debugger views. I have CPU registers, source code, disassembly window, the call stack, and watch view with two critical variables: the global isr_flags bitmask and the goToSleep counter. My first test is to just run the code free. You can see that the goToSleep counter has immediately incremented, but only once. When I break into the code, I find it inside the BSP_goToSleep() function called from main(). This, plus the value of the goToSleep counter, means that the CPU was indeed stopped at the Wait-For-Interrupt instruction. If the CPU had not been stopped, the goToSleep counter would be in the millions. I can repeat it, but every time I find the code stopped at WFI inside BSP_goToSleep(). Now, I add a breakpoint in the GPIOF interrupt handler and let the code run. Next, I press the SW1 button on the board. My breakpoint is immediately hit, so I step through the code to verify that the task-1 flag is set in the global isr_flags bitmask. I keep stepping to find out where the interrupt returns and what happens inside the background "superloop."
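The go-to-sleep function with its debugging counter might look roughly like the sketch below. The counter name goToSleep_ctr and the build flag BSP_TARGET_CORTEX_M are assumptions made for this illustration, and the inline-assembly form of WFI assumes a GCC- or Clang-style compiler. Without the flag, the wait compiles to nothing, so the same file also builds and runs on a desktop machine.

```c
#include <stdint.h>

/* BSP_TARGET_CORTEX_M is a hypothetical build flag for this sketch.
 * On the target it selects the real Wait-for-Interrupt instruction;
 * on a host build the wait is a no-op so the logic can still be run. */
#ifdef BSP_TARGET_CORTEX_M
#define WAIT_FOR_INTERRUPT() __asm volatile ("wfi")
#else
#define WAIT_FOR_INTERRUPT() ((void)0)
#endif

/* Global invocation counter, meant to be watched in the debugger. */
volatile uint32_t goToSleep_ctr;

void BSP_goToSleep(void) {
    ++goToSleep_ctr;        /* count every attempt to enter sleep mode */
    WAIT_FOR_INTERRUPT();   /* target: CPU clock stops until an interrupt */
}
```

On the target, a counter that advances by one per button press (rather than by millions per second) is the evidence that the CPU really stops at WFI.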
Indeed, the loop continues from goToSleep to checking the global bitmask. It finds the task-1 flag, clears it, and calls the deployAirbag function. That function turns on the red LED and returns. The task-2 flag is not set, so the loop wraps around and goes back to sleeping. I can repeat it, and it works as before. Now, I remove the breakpoint and run the code free. Pressing the SW1 button increments the goToSleep counter, but sometimes by more than one, which is expected due to the bouncing of the mechanical switch. Now, I break into the code and reset the target to start over. Pressing SW1 turns on the red LED, and pressing SW2 turns on the blue LED, which verifies that my second task, engageABS(), also works as expected. So far, so good. What's not to like? Well, for starters, if you watched lesson #20 about race conditions, you should immediately realize that this code is chock full of them. Specifically, the isr_flags bitmask is shared between the background loop and interrupts, so every time it is accessed, you have a potential race condition. But in lesson #20, you also learned that you can avoid such race conditions by making sure that every access to the shared variable occurs within a critical section, that is, with interrupts disabled. So, let's try to apply this to the background loop, whereas the functions for interrupt disabling and enabling will be defined in the BSP. Starting from the top of the loop, you disable interrupts right before testing the shared isr_flags bitmask. Now, it seems obvious that you must enable interrupts before going to sleep because only interrupts can wake the CPU up. You will see why this is problematic in a minute, but I've done something like that in the early days of my career, and I keep seeing this published in books, so it must be popular in the field and therefore requires explanation. But for now, let's press on with your immediate goal of eliminating race conditions.
So, after goToSleep() returns, you disable interrupts to proceed to the testing of the shared variable in the next step. Please note that in case the if-branch wasn't taken, interrupts remain disabled, so all possible execution paths are covered. You repeat the same pattern of enabling interrupts before calling the task function and disabling them after it returns to proceed to the next task. Of course, you keep clearing the task bit inside the critical section. After the last task, you enable interrupts so that they can be serviced before looping back to the top of your "superloop." Now, you still need to add the new BSP function prototypes to the bsp.h header file. And for the implementation, in bsp.c, you define the interrupt disabling and enabling functions by means of the intrinsic functions __disable_irq() and __enable_irq(), respectively. The project builds cleanly, so let's see how this works. The first order of business is to check that this version works at least as well as the previous. So, you run the code free and press the SW1 button. Red LED means that the first deployAirbag() task was called. Now, you press SW2, and the blue LED lights up, proving that the second engageABS() task was also called. When you break into the code, you can see that it sits inside goToSleep, and the isr_flags bitmask is cleared, which means that no tasks are ready. So far, so good. But now, let's investigate deeper. Reset the target to start from scratch and set a breakpoint at the first test of the shared isr_flags variable. When you run the code and stop at the breakpoint, verify that the PRIMASK register is 1, which means that interrupts are disabled. Now, set a second breakpoint at the beginning of the GPIOF interrupt handler and move the first breakpoint to the intDisable function just after goToSleep(). Press the SW1 button. This emulates interrupt occurring at this precise point in the code. I mean, the button press can occur at any time, so why not here. 
When you run the code, the first breakpoint hit is inside the interrupt. This makes sense because the interrupt was serviced immediately after interrupts got enabled in the superloop. Step inside the ISR to see that it sets the task-1 flag and continue to see what happens next. And... nothing else happens. In particular, the breakpoint downstream from sleeping is NOT hit. Indeed, when you break into the code, you find it sleeping in the goToSleep function, but the isr_flags bitmask is still 1, which means that task-1 is ready to run. Only now, when you continue, you hit the first breakpoint, and when you continue again, you see the red LED light up. All this is really bad because it literally means that the CPU fell asleep at the switch. Specifically, the airbag was not deployed on time, and instead, the CPU was put to sleep until some other completely unrelated interrupt would wake it up. The mechanism of this failure is quite simple. If the interrupt occurs during the first critical section, it will be serviced immediately after interrupts are re-enabled. However, due to its simplistic, non-preemptive nature, the "superloop" is committed to going to sleep no matter what, and so it does. So, let's now discuss the ways you can fix the problem. It turns out that on the Cortex-M CPUs this "superloop" structure can be salvaged by replacing the Wait-for-Interrupt instruction with Wait-for-Event. As described in the "Definitive Guide to ARM Cortex-M3/M4 Processors" by Joseph Yiu, WFE is conditional and relies on some internal flags maintained by the Cortex-M CPU. To work around some hardware defects in early Cortex-M3s, Joseph Yiu recommends applying WFE in a more elaborate sequence. That's why you see this sequence in the STM32 implementation of the HAL_PWR_EnterSLEEPMode() function referenced earlier in this video. However, I will not investigate this solution any further because it only applies to ARM Cortex-M. 
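The failure mechanism just described (the interrupt is serviced right after interrupts are re-enabled, and the loop then sleeps anyway) can be reproduced on a desktop machine with a very small model. Everything in this sketch is a hypothetical stand-in: int_disabled models PRIMASK, irq_pending models an interrupt latched while masked, and go_to_sleep() merely records whether the loop went to sleep with work still pending.

```c
#include <stdbool.h>
#include <stdint.h>

#define TASK1_FLAG (1U << 0)

static volatile uint32_t isr_flags; /* shared task-ready bitmask */
static bool int_disabled;           /* host model of PRIMASK */
static bool irq_pending;            /* interrupt latched while masked */
static unsigned missed_wakeups;     /* sleeps entered with a task ready */

static void isr(void) { isr_flags |= TASK1_FLAG; }

static void int_disable(void) { int_disabled = true; }

static void int_enable(void) {
    int_disabled = false;
    if (irq_pending) {        /* a pending interrupt fires once unmasked */
        irq_pending = false;
        isr();
    }
}

/* A button press is serviced at once, or latched if interrupts are masked. */
void button_press(void) {
    if (int_disabled) { irq_pending = true; } else { isr(); }
}

static void go_to_sleep(void) {
    /* Nothing is pending any more, so a real CPU would stop here
     * until some other, unrelated interrupt arrives. */
    if (isr_flags != 0U) { ++missed_wakeups; }
}

/* Top of the flawed loop: interrupts are enabled BEFORE sleeping. */
void flawed_loop_top(bool press_in_window) {
    int_disable();
    if (isr_flags == 0U) {                     /* test says: nothing to do */
        if (press_in_window) { button_press(); } /* arrives after the test */
        int_enable();    /* ISR runs here and sets the task-1 flag...   */
        go_to_sleep();   /* ...but the loop is committed to sleeping    */
        int_disable();   /* task checks would follow from here */
    }
}
```

The window is only a few instructions wide on real hardware, which is why the problem tends to escape casual testing and shows up only occasionally in the field.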
In most other CPUs, enabling interrupts before entering low-power sleep mode can't be made to work safely in a "superloop" architecture. On the other hand, virtually all CPUs, including ARM Cortex-M, support entering low-power sleep modes atomically with interrupts still DISABLED. Indeed, several years ago, in 2007, I surveyed various embedded microcontrollers, many outdated by now, in my article "Use an MCU's low-power modes in foreground/background systems." The motivation for that article was precisely to avoid the problem caused by enabling interrupts before going to sleep. My old article is not alone, of course, as the various microcontroller vendors also publish techniques for entering sleep mode with interrupts still disabled. One notable example is the MSP430 microcontroller family from Texas Instruments, one of the industry's lowest-power designs. The application notes and code examples for MSP430 always enter the sleep mode with interrupts disabled and re-enable interrupts atomically when going to sleep. This way of implementing low-power modes in the "superloop" is also recommended for Cortex-M. Indeed, let's go back to the STM32 support forum and find the original question I used at the beginning of this video. The question has an answer, where an ST expert provides the recommended code structure for a low-power "superloop." The expert-recommended architecture differs in some essential aspects from what you have at this point, so let's copy, paste, and adapt it for your project. The recommended top of the "superloop" is essentially the same, except that your version hides the sleep transition in the BSP_goToSleep() function, which is entered with interrupts disabled and re-enables interrupts after the Wait-For-Interrupt instruction. However, the recommended task-processing code does not follow directly, but rather is executed in the else-branch. It also differs in other aspects, so let's adapt it.
The main difference here is that there is only one critical section, where you are supposed to select only the task to execute and clear the task bit in the bitmask. Regarding the selection of the task to execute, this is an ideal use case for a pointer to function, which I will name 'task'. If you don't remember what pointers-to-function are, they were used extensively in lesson #39 about the optimal state machine implementation. Since you're supposed to choose only one task, you need to put the other selection in the else-branch. Now, regarding the task execution, I use the explicit syntax with pointer de-referencing to make it absolutely clear that this is a call via the pointer named "task." But I still need to define the "task" variable and the typedef for its task_handler type. Finally, I need to do something in case none of the if-branches are taken because this will leave the "task" pointer-to-function uninitialized, so the subsequent task execution will almost certainly cause a crash. This should never happen in a correctly running scheduler that you've just built, so that is an ideal place to use an assertion. If you are unfamiliar with embedded assertions, they were covered in lessons #47 and #48. Here, I just call the assertion handler already defined in the BSP. This should be all, so let's try to build the project. All right, I still need to provide the assertion handler prototype in bsp.h. And I have to delete the superfluous for-ever directive. No errors and no warnings. So, let's test this... As usual for today, the first order of business is to verify that it works in a nominal case. I just run the code free and press the SW1 button: red LED, and then the SW2 button: additional blue LED. So far, so good. Now, let's look deeper. Reset the target to start over and set a breakpoint inside the critical section at the top of the "superloop." Run the code free and, when it hits the breakpoint, verify that PRIMASK is set, meaning interrupts are disabled.
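Put together, the structure with a single critical section and a pointer-to-function could look something like the sketch below. It is a host-buildable approximation rather than the exact code from the forum answer or the lesson project: the BSP functions are replaced by counting stand-ins, the flag names and the handler name on_assert_failed() are hypothetical, and one pass of the endless loop is extracted into superloop_pass(). Because the stand-in assertion handler returns (a real one typically would not), the pointer is initialized to NULL and checked before the call.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*task_handler)(void);   /* type of a task function */

#define TASK1_FLAG (1U << 0)
#define TASK2_FLAG (1U << 1)

static volatile uint32_t isr_flags;   /* set of tasks ready to run */
static unsigned airbag_calls, abs_calls, sleep_calls, assert_calls;

/* Host stand-ins for the BSP facilities named in the lesson. */
static void BSP_intDisable(void)   { /* target: disable interrupts */ }
static void BSP_intEnable(void)    { /* target: enable interrupts */ }
static void BSP_goToSleep(void)    { ++sleep_calls; } /* WFI, then enable */
static void BSP_deployAirbag(void) { ++airbag_calls; }
static void BSP_engageABS(void)    { ++abs_calls; }
static void on_assert_failed(void) { ++assert_calls; } /* hypothetical name */

/* One pass of the loop; main() would call this inside for (;;). */
void superloop_pass(void) {
    BSP_intDisable();                     /* the only critical section */
    if (isr_flags == 0U) {
        BSP_goToSleep();  /* entered with interrupts disabled and
                           * returns with interrupts enabled */
    }
    else {
        task_handler task = NULL;
        if ((isr_flags & TASK1_FLAG) != 0U) {
            isr_flags &= ~TASK1_FLAG;     /* clear inside critical section */
            task = &BSP_deployAirbag;
        }
        else if ((isr_flags & TASK2_FLAG) != 0U) {
            isr_flags &= ~TASK2_FLAG;
            task = &BSP_engageABS;
        }
        BSP_intEnable();                  /* end of critical section */

        if (task != NULL) {
            (*task)();                    /* explicit call via the pointer */
        }
        else {
            on_assert_failed();           /* no task selected: scheduler bug */
        }
    }
}
```

Selecting only one task per pass means the loop returns to the top, and to the sleep decision, after every task, and the order of the if/else-if chain acts as a fixed task priority.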
Set a breakpoint in the GPIOF interrupt handler and at the WFI instruction in BSP_goToSleep(). Only now, press the SW1 button and run the code free. This time, the first breakpoint hit is at the WFI instruction and, quite specifically, NOT in the interrupt handler because interrupts are still disabled. Move the breakpoint from WFI to the next instruction downstream. When you run the code free, your new breakpoint is hit, and the interrupt is still not called because interrupts are only now about to be enabled. When you run the code free again, you finally hit the breakpoint in the interrupt handler. Step through to verify that it sets the task-1 flag and watch where it returns. Eventually, you end up at the top of your "superloop," with interrupts disabled. The isr_flags bitmask is not zero, meaning some of your tasks are ready to run. You skip going to sleep, find that the task-1 flag is set, clear it, and select the BSP_deployAirbag() task function to execute. Finally, you enable interrupts and call the task function. Alright, so this is the generally recommended "superloop" structure that incorporates low-power sleep modes. It took a while to build, but, as the ST expert mentioned in his post, what you have just built is quite a powerful cooperative scheduler. The task functions can be activated (by setting the corresponding bit in the global bitmask) not just from interrupts, but also from other tasks. Therefore, the name "isr_flags" is a bit of a misnomer. A better name would be "task_flags" or perhaps "ready_set" because this bitmask represents the set of all tasks that are ready to run. In fact, let me just refactor the code by globally replacing the name. Alright, so this is all regarding the general "superloop" structure. However, in the remaining few minutes of this lesson, I still need to address the Cortex-M-specific issue of disabling interrupts selectively with the BASEPRI register rather than indiscriminately with PRIMASK.
BASEPRI is only available in ARMv7-M and higher architectures, such as M3 and higher, and is not available in Cortex-M0/M0+. Here, I must defer again to Joseph Yiu's "Definitive Guide," but basically, BASEPRI disables interrupts only up to the specified priority level and does not affect higher-priority interrupts, which run with the so-called "zero latency." Please note, however, that the high-priority interrupts are not allowed to access any resources used by your "superloop" or the lower-priority interrupts. So, let's apply the BASEPRI interrupt-disabling strategy to today's code. Fortunately, all necessary changes will be confined to the BSP, as it should be. First, you define the BASEPRI priority cutoff level. The level 0x3F should work for any ARMv7-M CPU with 3 or more bits of interrupt priority. At this point, it is also convenient to define the CMSIS priority cutoff, useful for setting the interrupt priorities with CMSIS functions. Now, you re-implement interrupt-disable by setting the BASEPRI register to the previously defined cutoff value. Interrupt-enable becomes now setting BASEPRI back to zero. Now, most interestingly for today, the transition to sleep mode will need to combine the use of PRIMASK and BASEPRI registers to take advantage of the special properties of PRIMASK described by Joseph Yiu. Specifically, the goToSleep() function is now entered with BASEPRI set, but PRIMASK clear. So, before the WFI instruction, you must set PRIMASK and clear BASEPRI. After the WFI you clear PRIMASK, as before. The last set of changes involves the interrupts because once you apply the BASEPRI critical section policy, it is crucial to explicitly set the priority of the interrupts that interact with your scheduler to the cutoff level or numerically higher. If you don't do this, the interrupts will not respect your critical sections, and you'll have race conditions. 
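The register choreography described above can be checked on a desktop machine by modeling BASEPRI and PRIMASK as plain variables. This is only a sketch under stated assumptions: the helper names set_basepri(), set_primask(), and wfi() are hypothetical stand-ins (on a real target the corresponding CMSIS intrinsics would be __set_BASEPRI(), __disable_irq()/__enable_irq(), and __WFI()), the macro names are made up for this example, and NVIC_PRIO_BITS assumes a device that implements 3 priority bits. The model's wfi() just records the register values at the moment the CPU would stop.

```c
#include <stdint.h>

#define BASEPRI_CUTOFF   0x3FU  /* cutoff level quoted in the lesson */
#define NVIC_PRIO_BITS   3U     /* assumption: 3 priority bits implemented */
/* The same cutoff as a CMSIS priority number (for setting ISR priority). */
#define CMSIS_PRI_CUTOFF (BASEPRI_CUTOFF >> (8U - NVIC_PRIO_BITS))

static uint32_t basepri;         /* host model of the BASEPRI register */
static uint32_t primask;         /* host model of the PRIMASK register */
static uint32_t basepri_at_wfi;  /* snapshot taken when the CPU stops */
static uint32_t primask_at_wfi;

static void set_basepri(uint32_t v) { basepri = v; }
static void set_primask(uint32_t v) { primask = v; }
static void wfi(void) {          /* model: record state instead of stopping */
    basepri_at_wfi = basepri;
    primask_at_wfi = primask;
}

/* Critical section entry: mask interrupts at the cutoff priority and
 * numerically higher; more urgent interrupts keep running. */
void BSP_intDisable(void) { set_basepri(BASEPRI_CUTOFF); }

/* Critical section exit: BASEPRI of zero masks nothing. */
void BSP_intEnable(void)  { set_basepri(0U); }

/* Entered with BASEPRI set and PRIMASK clear. */
void BSP_goToSleep(void) {
    set_primask(1U);  /* set PRIMASK first, so no unmasked gap opens */
    set_basepri(0U);  /* clear BASEPRI, so any interrupt can end WFI */
    wfi();            /* sleep with PRIMASK set */
    set_primask(0U);  /* clear PRIMASK: the pending ISR runs now */
}
```

With 3 priority bits, the cutoff 0x3F corresponds to CMSIS priority 1, so under these assumptions only priority 0 would remain "zero latency" and every interrupt that touches the scheduler's flags would need priority 1 or numerically higher.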
Also, now that your interrupts can have different priorities, they can preempt each other, so you need to protect the shared "task_set" with a critical section as well. And this concludes this lesson about using low-power sleep modes in the ubiquitous "superloop" architecture SAFELY. It turns out that adding sleep modes required creating a small cooperative scheduler, which will be useful in the next lesson. Finally, please note that the requirement for entering the sleep mode atomically applies only to the simple "superloop" and does not apply when you use a more advanced architecture, such as a preemptive kernel or a preemptive RTOS. The main difference between a preemptive kernel and a "superloop" is that the kernel enters the low-power sleep mode from the lowest-priority idle thread. This idle thread is not committed to entering sleep mode because it gets immediately preempted by any threads that might become ready to run. If you like this channel and would like to see more lessons like that, please give this video a like and subscribe to stay tuned. You can also visit state-machine.com/video-course for the class notes and project file downloads. Finally, all the projects are also available on GitHub in the Quantum Leaps repository "modern embedded programming course." Thanks for watching!