Hello and welcome to the Modern Embedded Systems Programming course. My name is Miro Samek, and in this lesson, you'll learn about "debugging by printf" as the most common software tracing technique. Today you'll see how to implement software tracing with printf on your TivaC LaunchPad board, and you'll also explore some shortcomings of this primitive technique. In any real-life project, getting the code written, compiled, and successfully linked is only the first step. The system still needs to be debugged, tested, and optimized. So far in this course, the primary tool for that was the single-step debugger, where you set breakpoints and, after a breakpoint is hit, you inspect registers, variables, and memory. In case of any problem, the biggest question is always: "How did I get there?" The call stack might provide some help as to which functions were called in which order, but the information is minimal. In event-driven systems, specifically, the call stack is of less value because event handlers return after each event, thus removing any trace of their execution from the call stack. In the end, the biggest problem with a single-step debugger is that it stops the running system and completely changes its real-time behavior. It's like trying to study a living organism by first killing it. What you really need is a method to observe the live interactions within the software running at full speed or close to full speed. For that, most software developers turn to the tried and true printf debugging technique. In that method, printf statements are placed throughout the code, which is called "instrumentation," so that the code itself reports what it is doing. Such "Debugging by printf" (or sprintf, snprintf, and similar functions) is the most primitive of a class of techniques collectively called "Software Tracing." You saw an example of this technique in lesson #42, where the hierarchical state machine produced a trace of all actions it has executed. But that was done on a host computer. Today, I'd like to show you how to do it on your embedded board. For that, let's go back a few lessons to lesson-41. Copy that directory and rename it to lesson-45. Get inside the lesson-45 directory and double-click on the project "lesson" to open it in the uVision IDE. To remind you quickly what happened back in lesson 41, you have created a hierarchical state machine for a TimeBomb active object. The state machine was triggered by the button presses and time events. It turned the LEDs on and off to emulate various actions. Your job for today is to instrument the code such that it will also produce a trace output in real-time about the performed actions. Given this problem statement, any C programmer would typically turn to the standard C library function "printf". This function requires the header file. printf() takes a format string as the first parameter, which it copies to the output, but also performs percent-substitutions. For example, format string printf("LED-%s is %d\n", "red", 1); will produce the output "LED-red is 1" new-line, in which all % signs are substituted with the values of parameters following the format. OK, so let's put such printf() instrumentation in all functions for the LEDs. And also, in the SysTick_Handler ISR, to trace the button presses. All right, so let's just try to build the project. Well, what do you know! All this compiles and links error- and warning-free! But when you try to run it... it gets stuck at a hardcoded breakpoint somewhere inside sys_open, before even reaching the main function. Well, it turns out that the printf() facility requires a more elaborate configuration. Since printf() is a C-library function, the first step is to enable MicroLIB, which is a very lightweight version of the standard C library specifically designed for embedded microcontrollers. So, let's see if that helps. This time around, you reach the main() function, which is significant progress. But when you attempt to run the code, you still hit a hardcoded breakpoint, but this time in fputc(), which is another C library function called from printf. So, what's going on with these hardcoded breakpoints, anyway? Well, it seems that the MicroLIB library provided with the Keil uVision toolset implements all standard functions. But the library cannot fully define hardware-specific functions. So instead of not defining such functions like fputc() at all, the library apparently provides dummy implementations with hard-coded breakpoints. The solution is to provide the definitions of such functions in your application code. Then, due to the linking rules for libraries, which you've learned in lesson 14 about the startup code, the linker will take your functions instead of the versions from the library. Alright, so let's just provide an empty definition of fputc(). Per the C documentation, the fputc() function takes the character to send and a FILE pointer. Upon success, it returns the character it sends. You'll still need to decide where to send the character, but let's ignore it for now. When you build and load the code now, set a breakpoint inside your new fputc() function. Run the code, and after hitting the breakpoint, you can see on the call stack that fputc() is called from the printf_core(), which is called from your TimeBomb_wait4button and ultimately from main(). Also, when you keep running the code and hitting the breakpoint, you can see that your fputc() function receives different characters. All this looks very reasonable. Finally, when you remove the breakpoint and run the code free, you can see that your TimeBomb works as before, although it still does not produce any trace output. So, now, you need to finish the job with fputc() and finally send the characters somewhere where you can see the output. For that, you have a few options: Your first option is to send the output to the debugger. Specifically, the Cortex-M processor has a built-in ITM module, which stands for Instrumentation Trace Macrocell. The ITM is specifically designed for supporting the "printf-style debugging". Unfortunately, ITM requires a more advanced hardware debugger probe than the Stellaris ICDI installed on your TivaC LauchPad board. The hardware debugger doesn't need to be expensive, just a little more modern. For example, the ST-Link debugger provides tracing capability based on the ITM. In fact, I've adapted the TimeBomb example for the very inexpensive STM32 NUCLEO-L152RE board equipped with the ST-Link hardware debugger. Let me quickly demonstrate the ITM option with the NUCLEO project, which will be included in the project downloads for this lesson. After this, I'll go back to TivaC to explore other options for sending the tracing output. The NUCLEO project uses exactly the same main.c with the TimeBomb hierarchical state machine, but obviously a bit different bsp.c implementation than TivaC. However, the BSP for NUCLEO has the same tracing instrumentation with printfs. The essential difference is that the fputc() implementation calls ITM_SendChar() to output the character to the Instrumentation Trace Macrocell. The other difference is the ST-Link hardware debugger setup, where you need to enable tracing and correctly set the core clock because leaving it at the default value won't work. For that, you could set a breakpoint at QF_onStartup() and run the debugger. Once the breakpoint is hit, you point to the SystemCoreClock CMSIS variable and read its value. Now, go back to the trace setup and convert the SystemCoreClock value to Megaherz. You need to copy the result into the Core Clock field in the trace setup. Also, as you are at it, make sure that "Ignore packets with no SYNC" is unchecked. Alright, so finally, you can start the debugger again and open the view: Serial Windows-Debug (printf) Viewer. Now, when you continue, the window shows the first printout from the top-most initial transition. When you press the button on the NUCLEO board, you get the real-time trace of the TimeBomb activity. Alright, so that was a quick demonstration of the printf debugging with the ITM. While the programming was relatively simple, the method requires an appropriate hardware debugger probe, such as ST-Link, which is external hardware to the microcontroller. Also, you must run the code under a software debugger, which then displays the output. For these reasons, often a better solution is to make the microcontroller produce the tracing output by itself, without any external hardware, which leads us back to the TivaC LauchPad project. Here, specifically, you need to implement fputc() using a piece of communication hardware already present in your microcontroller. This peripheral is called UART, which stands for Universal Asynchronous Receiver and Transmitter. I don't want to go into too much detail about how UART works, but it uses just one Tx pin to transmit data. The transmission happens by modulating the voltage level on the Tx pin. The speed of the transmission, called the baud rate, is determined upfront and must match the rate expected by the receiver within 1% or so. For example, if the transmitter uses baud rate 115,200 bits per second, the serial receiver must also be configured to 115,200 bits per second. Your TivaC has eight such independent UART peripherals, but UART0 is special because it is already connected to the USB cable and shows as a virtual COM port on the host PC. This means that you won't need any additional wiring to access the serial data transmitted through UART0. The presence of the virtual COM port is, actually, one of the main reasons why I selected the TivaC LaunchPad board for this video course some years ago. So, here is the code for your fputc() to send a character to the UART0: First, you must wait as long as the UART0 is busy by polling the BUSY bit in the UART FR register. And next, you need to write the byte into the UART0 Data Register. But while sending bytes is easy, the UART requires some non-trivial initialization. Here is the code. As the comments explain, first you need to enable the clock gating for the UART0 peripheral and the GPIO-A, which controls the TX pin of the UART. Next, you need to properly configure the UART0 Tx and Rx pins. And finally, you configure the UART for the desired baud rate and 8-N-1 operation, which means 8-bit data, no-parity, and one stop bit. The baud rate as well as other constants, need to be defined before they are used. You still need to actually call uart_init() from BSP_init(), and you need to provide the prototype of the function at the top of the file. Alright, let's check how this works... Now, I will specifically NOT go into debugger, but only upload the code to the board because the UART does not need the debugger support as ITM did previously. But instead, you need to receive the printf output via the virtual Serial port. To see that, you can open the Windows Device Manager, where in the section Port (COM & LPT), you should see Stellaris Virtual Serial Port. On my machine, the COM port number is 3, but it might be different on your PC. Now, you will need a serial terminal utility to actually show the printf output. Many such terminals are available for download. I happen to have here a simple and freeware terminal called Termite from CompuPhase. A link to the download will be provided in the video description. Termite typically detects the virtual serial port and connects to it, but you can also adjust the settings. Here, the important value is the baud rate, which as you remember, was configured to 115200 in your TivaC. Now, when you press the reset button on the board, you can see the first printout from the top-most initial transition. This is awestome! Now, when you press the SW1 button on the board, you can see the sequence of blinks and then boom! Hey, it works! Let's compare it to the trace produced by the ITM on the NUCLEO board. As you can see, they are... identical! So, here you have it: software tracing with printf for your TivaC LauchPad board. But this is just a first cut because, for real-life use, you can still significantly improve the design. For starters, you should be able to deactivate and activate the tracing in your code easily. In a minute, I will explain the reasons why you might NOT want to have instrumentation like that in your code permanently, but now let's just see what I mean by deactivating the instrumentation. Because I don't mean deleting the instrumentation once you are done debugging since really, you are never quite done, are you. Chances are that you might need the instrumentation for the future maintenance of the code. Therefore, I mean keeping the instrumentation but activating and deactivating it easily. A rather naive approach is to surround all snippets of code pertaining to the instrumentation with #ifdef SPY / #endif conditionals. Then you can activate all these parts of the code simultaneously just by defining the macro SPY or deactivate all of them by commenting out the macro definition. This would even work, but it is tedious because you must not forget to surround *every* call to the printf() instrumentation with the #ifdef / #endif pair. But a far better and much more maintainable design is to define the tracing instrumentation itself as preprocessor macros, and make the macro definitions conditional. Here is how it works: As before, you start with #ifdef SPY, but now you define the MY_PRINTF() macro to call printf(). A little tricky part here is that printf() takes a variable number of parameters or no parameters at all. Luckily, the C preprocessor supports the so-called variadic macros defined with the ellipsis, and then the special ## VA_ARGS construct is used to pass the variadic parameters to printf(). In case the tracing should be disabled, the macro MY_PRINTF() is defined as zero to emulate the return value from printf(). This typically generates no code. Similarly, MY_PRINTF_INIT() is defined as (void)0, which also generates no code. With these definitions, you can now replace all instances of printf() function call with the MY_PRINTF() macro. The code supporting the printf() directly needs to be also conditionally compiled. And finally, the UART initialization needs to be replaced with the MY_PRINTF_INIT() macro. The first compilation, after the changes, builds the code with the printf instrumentation disabled because the macro SPY is not defined. But when the macro SPY is defined, the code is built with the printf instrumentation enabled. Interestingly, now you can improve and generalize the code even further. For example, in case you wanted to add instrumentation in other files, such as your main.c with the TimeBomb state machine, you wouldn't want to repeat the same definitions of the instrumentation macros there. But you don't have to if you put the snippet of code into a separate header file. Before you do, however, let's refactor the function name uart0_init() into printf_init() because in other embedded boards, the initialization might involve some other peripheral than UART0. OK, so now you can put the printf-related stuff in the my_printf.h header file not forgetting the usual inclusion guards. With that done, you can include the new header file in all modules that you wish to instrument. But wait a minute! The SPY macro is now defined only in the bsp.c file, and is not defined in main.c, which is inconsistent. The remedy is to move the SPY macro definition to the compiler options, where it will be applied consistently to all compilation units. However, all this is still not the ultimate solution. The most professional way of incorporating software tracing into your project is through the concept of a Build Configuration. A Build Configuration is a collection of settings and source code packaged together and having a distinct name. For example, so far you've been building your code for ease of debugging, and so you've been using the dbg build configuration. But maybe the settings for debugging would not be optimal for the final release of the software, so you could define a separate release configuration. Most professional tools, including Keil uVision, support separate build configurations, so today I will quickly show you how to create a "spy" build configuration optimized for software tracing. You start from the top-menu Project | Manage | Project Items. Alternatively, you can click on the "Manage Project Items" button in the toolbar. In the Project Targets column, you click on the New item and type the name "spy" for the build configuration. Click "Set as Current Target" and close the dialog box. Now, you can adjust the settings for the "spy" configuration. In the Output tab, you still have the "dbg" folder inherted from the dbg configuration. You need to change it to the "spy" folder, which you create on disk. A similar situation occurs in the Listing tab, but this time you just use the already created "spy" folder. In the compiler settings, you obviously keep the SPY macro definition because it enables the tracing instrumentation in the code. Now you can finally build your spy configuration. You can also adjust the other debug build configuration where you don't want the tracing instrumentation. You disable the tracing by removing the SPY macro definition. Again, you can rebuild your debug configuration completely separately from the spy configuration. Which leads us to the last segment of this lesson, where I'd like to discuss the benefits and costs of printf-style debugging. The technique is immensely popular, and almost everybody uses it. For example, this is essentially the only way of debugging in the Arduino world. Printf-style tracing is convenient and requires relatively little code in the application. For example, to get printf working on a different board with a different communication mechanism, you only need to modify these two simple functions. It was even less code for the ITM debugging. But at the same time, printf is about the most expensive software tracing technique you can possibly come up with. Not many developers even realize how much codespace printf adds to the final binary image, so let's take a quick look. Here I compare the map file from the debug build configuration... against the map file from the spy build configuration. If you forgot what a map file is, I introduced it back in lesson #14, where you learned about the embedded build process. So, here in the map files, the most interesting is the "Image component sizes" section, where you can see that the library contribution increased from 124 bytes in the debug configuration to 676 bytes in the spy configuration. This means that printf adds some 552 bytes of code. And this is only for the simplest string and integer format specifications. If you used floating point format anywhere in the code, like here for example... the size of printf code would baloon to over 3,700 bytes, which means that printf would contribute over 3.5 KB to the total image. To put it in perspective, the whole QP/C framework contributes only about 2.5 KB of code and read-only data. If you remember from the previous lessons, the QP framework provides hierarchical state machines, active objects, time events, and an RTOS kernel. Still, all of this is dwarfed by the complexity of printf. But the big code footprint is not the only cost of printf. A much bigger problem is the overhead in the execution time. I mean, all software tracing techniques are intrusive to some degree, but printf is particularly bad. To see just how bad, I'll use the 5-dollar logic analyzer that I've introduced in lesson #43. To better observe the timing, I've added two more GPIO test pins to the code. The first GIPIO-D pin 0 is used in the SysTick ISR, where it is turned on upon the entry and turned off upon the exit from the ISR. The second GPIO-D pin 1 is used in a similar way to time the execution of the fputc() function. So now let's load this code to the board use everything you've got: the logic analyzer and the serial terminal for printf tracing. Set the analyzer to trigger when you press the SW1 button and start the trace. Reset the board and watch the first printout on the terminal. Now, press the SW1 button. You can see the printouts on the terminal and the several milliseconds of captured logic analyzer trace. Interestingly, notice the new Boom! trace added in the TimeBomb state machine. Regarding the logic analyzer, trace D4 shows the activity of the SysTick ISR, which fires periodically, but one execution time is much longer. This is when the printf trace is produced, and it takes 0.78 milliseconds. That needs to be compared to the normal case of only... 2 microseconds for the whole ISR. This means that printf extends the time by almost 400 times. Trace D5 shows the activity inside the fputc() function, which is even worse. This is because it shows all 3 printf messages produced. All of this takes some 3.2 milliseconds Actually, here you can count the individual bytes transmitted, whereas each byte takes 87 microseconds to send. Overall, this is pretty bad. The printf formatting to ASCII characters is expensive and produces a lot of bytes, which takes a lot of time to send out. But worse of all is that all that formatting and transmitting clogs the time-critical paths through the code. In the next lesson, you'll see a much smarter software tracing approach, far less intrusive, by orders of magnitude compared to the printf-debugging. Stay tuned! If you like this channel, please give this video a like and subscribe to stay tuned. You can also visit state-machine.com/video-course for the class notes and project file downloads. Finally, all the projects are also available on GitHub in the Quantum Leaps repository "modern embedded programming course." Thanks for watching!