Welcome to the Modern Embedded Systems Programming course. My name is Miro Samek and in this lesson I'll continue the subject of the startup code. Today, you will learn about the embedded software build process so that you can replace the generic vector table from the IAR library with your own. As usual, let's get started with making a copy of the previous "lesson13" project and renaming it to "lesson14". If you are just joining the course, you can download the previous projects from state-machine.com/quickstart. Get inside the new "lesson14" directory and double-click on the workspace file to open the IAR toolset. If you don't have the IAR toolset, go back to "lesson0". To quickly summarize what happened so far: at the end of the last lesson you encountered an important data structure at address 0 in ROM. This turned out to be the "vector table" of your ARM Cortex-M processor. The vector table you saw, however, turned out to be incomplete compared to the vector table described in the datasheet for your specific TM4C microcontroller. This was because you were using the default startup code from the IAR library. In this lesson, you will replace this generic vector table with your own. But to understand how the replacing would work, you need to step back and take a deeper look at the embedded software build process. So, here is a diagram that shows the main steps of building your embedded project. First, I'd like to make sure, however, that you understand that all these steps are performed by tools such as the IAR Workbench on a desktop computer, called the host machine, even though the produced program is for a completely different computer, such as your LaunchPad board, called the target machine. This aspect, called cross-development, is very characteristic for embedded systems. It just doesn't make sense to run the compiler and linker on the small embedded target machine, such as the LaunchPad board. 
This is also in stark contrast with software development for the desktop computers, where you typically both develop and run the software on the same machine (so the host is also the target). This type of software development is often called "native development". But going back to the embedded build process, the source files, such as main.c and delay.c, are fed to the C-language compiler, which turns them into the so-called object files main.o and delay.o. Next, all object files from the project, together with any standard and other libraries, as well as the linker script, are fed to the linker, which combines them into the final program. At this point, most textbooks would simply tell you that an object file contains "relocatable" machine code that is not directly executable, because it is not yet committed to any specific address in memory. It is the job of the linker to combine all the objects, resolve the cross-module references, and fix the addresses. I could also leave it at that, but this course is about looking under the hood, so I think it is worthwhile to see how object files are organized and what "relocatable code" really means. In your project, object files are located in the sub-directory Debug\Obj. Here, among others, you can find delay.o and main.o. If you open one of them in a text editor, you would see mostly garbage, because it is a binary file. However, even viewed as text, you should recognize that the file appears to contain distinct sections. Also, the first few ASCII characters of the file spell out 'E', 'L', 'F'. This is the indicator of the ELF file format, which stands for "Executable and Linkable Format", also known as "Extensible Linking Format". ELF is not the only format for object files, but it is one of the most popular formats used by modern development tools. So, I think it is a good idea for you to get somewhat familiar with ELF files and the tools that allow you to inspect them. 
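If you would rather check the ELF signature programmatically than eyeball it in a text editor, here is a minimal sketch in C for the host machine. It relies on the fact that an ELF file starts with the byte 0x7F followed by the characters 'E', 'L', 'F'. The function name is_elf_file is made up for this illustration and is not part of any toolchain.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: checks whether a file begins with the ELF magic.
 * Returns 1 if it does, 0 if it does not, and -1 if the file can't be opened.
 */
int is_elf_file(const char *path) {
    /* every ELF file starts with 0x7F 'E' 'L' 'F' */
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };
    unsigned char header[4];

    FILE *f = fopen(path, "rb"); /* binary mode matters on Windows */
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(header, 1, sizeof header, f);
    fclose(f);

    /* a file shorter than 4 bytes can't be ELF */
    if (n != sizeof header) {
        return 0;
    }
    return memcmp(header, magic, sizeof magic) == 0 ? 1 : 0;
}
```

For example, calling is_elf_file("Debug\\Obj\\main.o") from a small host program should return 1 for the object files discussed here, assuming the path is given relative to your project directory.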
For example, the IAR toolchain comes with the command-line tool called ielfdumparm.exe, which you can use to dump the contents of the binary ELF file as human-readable text. Another popular tool for this is the objdump utility from the GNU Binutils, which you can also use on the ELF files generated by IAR, because ELF is a standard format. So, here for example I invoke the IAR ielfdumparm utility in the Windows command prompt. With most such programs, you can see a quick help by launching the program without any parameters. From the short help, you can see that the option --all allows you to dump all the sections of the ELF file. Because the listing is quite long, I capture the output into a text file main.txt. Let's open the file in the IAR IDE. As you can see, the ELF file indeed contains several sections, many of which you have encountered in lesson 13, such as .data, .bss, and .text, for initialized data, uninitialized data, and code, respectively. The remaining sections hold the symbolic information for the linker and a lot of sections contain debug information for the debugger. At this point, it should become clear that you should never use the size of the object file to assess the code size generated from a given .c source file, because the actual machine code is only a small part among many other parts in an object file. The only reliable source of information about the code size of various modules is the linker map file, which you encountered in the previous lesson 13. Now, let's take a look at the final image c.out produced by the linker. Again, if you quickly open it as text, you can see the 'ELF' signature at the top, so this is also an ELF file. This means that you can also dump the content of the c.out file using the IAR ielfdumparm utility, like so. So now let's open the human-readable dump of the final image and compare it with the dump of the main.o object file by putting them side by side. 
Let me scroll the final image to the .text16 section, which contains the code of the main function. By the way, the ELF dump utility is one of the quickest ways to see the disassembly of the generated code, without necessarily loading the program into the target and inspecting the disassembly view. You can use either the dump of a specific object file or the final image. When you compare the ELF dumps, you can see that most instructions are exactly the same in main.o as they are in the final image c.out. However, interestingly, some instructions have a different encoding. For example, the 32-bit BL instruction, by which main calls the delay function is encoded as 0xf7ff 0xfffe in the object file and 0xf000 0xf820 in the final image. What's going on here? Well, BL is a PC-relative instruction, meaning that in order to branch to the delay function the Program Counter (PC) will be incremented by the signed immediate offset encoded in the instruction itself. The problem is that the object file does not know where the delay function will end up in memory, so the BL opcode in the object file contains a generic offset 0x7fffffe. This offset is then fixed by the linker, after the linker decides where the delay function will be in memory with respect to main. But this does not end here. The linker needs to fix not just addresses of functions, but of variables as well. For example, the actual constant addresses of the variables p1, w, t, p2, and w2 are not known at compile time either. So, if you look in the constant pool in the section ??main_0 that follows immediately after the main function code, you can see that all these addresses are zero in the object file. So here again, when compared with the final ELF image, the linker had to fix this constant pool, after figuring out where to put the variables p1, w, t, and so on in the data section. 
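To see for yourself what the linker changed in the BL instruction, you can decode the branch offset from the two halfwords by hand. The following is a minimal sketch in C for the host machine, assuming the standard Thumb-2 BL encoding (encoding T1 in the ARMv7-M Architecture Reference Manual). The function name thumb2_bl_offset is made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: decodes the signed byte offset of a Thumb-2 BL
 * instruction from its two 16-bit halfwords (assumes encoding T1).
 * The branch target is: address of the BL instruction + 4 + offset.
 */
int32_t thumb2_bl_offset(uint16_t hw1, uint16_t hw2) {
    uint32_t s     = (hw1 >> 10) & 1u;  /* sign bit            */
    uint32_t imm10 = hw1 & 0x3FFu;      /* upper offset bits   */
    uint32_t j1    = (hw2 >> 13) & 1u;
    uint32_t j2    = (hw2 >> 11) & 1u;
    uint32_t imm11 = hw2 & 0x7FFu;      /* lower offset bits   */

    /* I1 = NOT(J1 XOR S), I2 = NOT(J2 XOR S) */
    uint32_t i1 = (~(j1 ^ s)) & 1u;
    uint32_t i2 = (~(j2 ^ s)) & 1u;

    /* offset = S:I1:I2:imm10:imm11:'0' (25 bits) */
    uint32_t imm = (s << 24) | (i1 << 23) | (i2 << 22)
                 | (imm10 << 12) | (imm11 << 1);

    /* sign-extend from 25 to 32 bits */
    if (s != 0u) {
        imm |= 0xFE000000u;
    }
    return (int32_t)imm;
}
```

With the two encodings quoted above, thumb2_bl_offset(0xf7ff, 0xfffe) evaluates to -4 for the placeholder in the object file, and thumb2_bl_offset(0xf000, 0xf820) evaluates to 0x40 for the final image. So only the offset field differs between the two files, while the rest of the opcode stays the same.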
From all these examples, I hope you start to appreciate the job of the linker and see for yourself what "relocatable code" really means. You can also understand now that the linker must be specific to the target processor, because it must "know" the instructions and how to fix them at the binary opcode level. This means, for example, that a linker designed for the x86 processor of your PC cannot be used to link programs for the ARM processor, even though all these tools might be using the ELF file format. You need both a compiler and a linker for the same processor. OK, so now that you understand what kind of code is generated into the object files by the compiler, and perhaps more importantly, in which ways the object files are still incomplete, let's talk a bit about how the linker resolves the cross-module references to functions and global variables. First, you need to realize that every object provides symbols that it exports, meaning that they are defined in the object and can be used by other objects. For example, main.o exports the symbols p1, w, t, p2, w1, w2, and main. An object might also have imported symbols, that is, symbols that it needs but does not define. For example, main.o imports the function delay, because it calls it, yet does not define it. Now, resolving inter-dependencies means that the linker must match all imported references to the exported references. As the linker works on one object file at a time, it internally uses two lists: a list of exported symbols, and a list of undefined symbols in all the objects encountered so far. For example, at the very beginning of the linking process of your project, the exported list is empty and the undefined list contains only one symbol: __vector_table. This latter symbol is specific to the IAR toolset and might be different for other toolsets. The general rule is that all object files directly included in the project, such as main.o and delay.o, are always linked into the final image. 
Because of this, the order of linking does not matter for those files, but let's assume that the first will be main.o. As the linker processes this object file, it adds all the exported symbols to the exported list and also takes every symbol imported by the object file, such as delay, and tries to find it in the exported list. If the symbol is not found, as in this case, the linker adds it to the undefined list. So after processing the main.o object, the undefined list contains __vector_table and delay. Next is the delay.o object file. This file exports delay, which is added to the exported list and at the same time removed from the undefined list, because it is now known and resolved. The delay.o file does not import anything else, so the undefined list contains only the __vector_table symbol. At this point there are no more object files in your project, yet the undefined list is still not empty, so the linker proceeds to look through the standard libraries. Libraries are simply bundled collections of object files. However, the linking rules are different for libraries than for objects included directly. The critical difference is that objects from a library are added to the final image only if they contain symbols in the undefined list. Otherwise they are not added at all. It turns out that the __vector_table symbol is found to be exported by the object file vector_table_M.o in the IAR library rt7M_tl.a. How do I know this? Well, you can find it inside the linker map file, by searching for "__vector_table". You find that symbol in the Entry List section, where you can see that it comes from the vector_table_M.o object. Next, you search the map file for this object name, and you find it again in the Module Summary section, under the module rt7M_tl.a, which you recognize as a library by the extension .a (archive). However, it turns out that the object file vector_table_M.o has also its own imported symbols, such as __iar_program_start, BusFault_handler, and so on. 
So at this point, the linker applies another rule for linking libraries, which is to keep searching all object files in the current library for the undefined symbols. So again, as you can find out from the map file, __iar_program_start is found in the cstartup_M.o object in the same library. This object has some more imported symbols of its own, such as __iar_data_init3, __iar_zero_init3, __low_level_init, and some others. Interestingly, __iar_program_start also imports the main function, which is resolved immediately, because it is in the exported list and in fact, it is defined in your own object file main.o. The other undefined symbols are resolved by the library linking rule, because they are exported by the other objects in the standard library. An interesting observation from linking libraries is that the object files in libraries typically contain only one function or one variable, as opposed to containing a whole bunch of them. This fine granularity of objects ensures that only stuff actually needed is taken from the library. If, on the other hand, the objects contained multiple functions and variables, all this would be linked in, and so you would bloat the final image unnecessarily. So, if you ever develop your own libraries, remember to make objects small and nimble. Ideally, you should define only one function or one global variable per module. But going back to your project, the linker eventually resolves all the references from the standard libraries, the undefined list becomes empty and the linking process ends. Of course, another possible outcome is that the linker runs out of all objects and libraries, yet the undefined list still contains some symbols. In this case, you get a linker error and a dump of all the unresolved references still present in the undefined list. Typically to fix such errors you need to add an object or a library. However, sometimes you might need to change the order of libraries. 
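The two-list bookkeeping described so far can be sketched as a small simulation in C for the host machine. This is only a toy model of the rules explained in this lesson, not how the IAR linker is actually implemented. The import lists are simplified, and the library objects marked "hypothetical" are invented for the illustration, because the lesson does not name the objects that export those symbols.

```c
#include <string.h>

#define MAX_SYMS  32  /* capacity of each symbol list (sketch-sized)     */
#define MAX_NAMES 8   /* exports/imports per object, NULL-terminated     */
#define MAX_OBJS  8   /* maximum number of objects in the toy library    */

typedef struct {
    const char *name;
    const char *exports[MAX_NAMES];
    const char *imports[MAX_NAMES];
} Object;

typedef struct {
    const char *names[MAX_SYMS];
    int count;
} SymList;

static int find(const SymList *l, const char *s) {
    for (int i = 0; i < l->count; ++i) {
        if (strcmp(l->names[i], s) == 0) return i;
    }
    return -1;
}

static void add(SymList *l, const char *s) {
    if (find(l, s) < 0 && l->count < MAX_SYMS) l->names[l->count++] = s;
}

static void drop(SymList *l, const char *s) {
    int i = find(l, s);
    if (i >= 0) l->names[i] = l->names[--l->count];
}

/* Link one object: its exports become known, its unknown imports are queued */
static void link_object(const Object *o, SymList *exported, SymList *undefined) {
    for (int i = 0; i < MAX_NAMES && o->exports[i] != NULL; ++i) {
        add(exported, o->exports[i]);
        drop(undefined, o->exports[i]);
    }
    for (int i = 0; i < MAX_NAMES && o->imports[i] != NULL; ++i) {
        if (find(exported, o->imports[i]) < 0) add(undefined, o->imports[i]);
    }
}

static int exports_undefined(const Object *o, const SymList *undefined) {
    for (int i = 0; i < MAX_NAMES && o->exports[i] != NULL; ++i) {
        if (find(undefined, o->exports[i]) >= 0) return 1;
    }
    return 0;
}

/* Library rule: take an object only if it exports an undefined symbol,
 * and keep rescanning the library until no more objects are pulled in.
 * Returns the number of objects taken from the library.
 */
static int link_library(const Object *lib, int n,
                        SymList *exported, SymList *undefined) {
    int used[MAX_OBJS] = { 0 };
    int pulled = 0;
    int progress;
    do {
        progress = 0;
        for (int i = 0; i < n && i < MAX_OBJS; ++i) {
            if (!used[i] && exports_undefined(&lib[i], undefined)) {
                link_object(&lib[i], exported, undefined);
                used[i] = 1;
                ++pulled;
                progress = 1;
            }
        }
    } while (progress);
    return pulled;
}

/* Runs the toy link; returns the number of symbols left undefined */
int simulate_link(int *pulled_from_library) {
    static const Object project[] = {
        { "main.o",  { "p1", "w", "t", "p2", "w1", "w2", "main" }, { "delay" } },
        { "delay.o", { "delay" }, { NULL } },
    };
    static const Object library[] = { /* imports simplified for the sketch */
        { "vector_table_M.o", { "__vector_table" }, { "__iar_program_start" } },
        { "cstartup_M.o", { "__iar_program_start" }, { "main", "__low_level_init" } },
        { "low_level_init.o (hypothetical)", { "__low_level_init" }, { NULL } },
        { "unused.o (hypothetical)", { "some_unused_function" }, { NULL } },
    };
    SymList exported  = { { NULL }, 0 };
    SymList undefined = { { NULL }, 0 };

    add(&undefined, "__vector_table"); /* the initial undefined symbol */
    for (int i = 0; i < 2; ++i) {      /* project objects: always linked */
        link_object(&project[i], &exported, &undefined);
    }
    *pulled_from_library = link_library(library, 4, &exported, &undefined);
    return undefined.count;
}
```

In this toy setup, simulate_link leaves the undefined list empty and pulls three of the four library objects into the image. The fourth one is never taken, because nothing on the undefined list asks for it. If you removed delay.o from the project array, the symbol delay would stay on the undefined list, which corresponds to the linker error case.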
Occasionally, you might even have to specify some libraries more than once in the linking order, to resolve circular inter-dependencies among them. I don't have the time here to present you an example of circular dependencies among libraries, but in the class notes to this video, I provide a web link to a good article about linking programs with libraries. Finally, in the last few minutes of this lesson let's try to apply the knowledge of the build process to the task at hand, which is to replace the generic vector table from the IAR library with your own. Well, how about you define the symbol __vector_table in your own object module linked directly into your project? This module then would be linked in BEFORE any standard IAR libraries, so the library version of the vector table won't be used. This is exactly what you need. So, the plan is to add another C file to your project, which I will name startup_tm4c.c, because it will be specific to your TM4C microcontroller. But wait a minute, how can a C module accomplish anything at startup time, when the machine is not yet ready to execute C? I mean, the stack pointer is not set up yet, the initialized data is not copied from ROM to RAM, and the .bss section is not cleared yet either. So, yes, you are absolutely right to raise such objections. And in fact, the startup code for most other processors can't be written in C and requires you to use assembly language. However, the ARM Cortex-M has been specifically designed to reduce the need for low-level assembly programming. But even for Cortex-M, the startup code will require some non-standard language extensions and of course you will have to be careful not to assume any initialization of the .data section or clearing the .bss section. With these caveats, let's start coding the startup_tm4c.c file. The main objective is to define the global array named __vector_table, with the layout you saw at the very beginning of this lesson in the disassembly view. 
So, the first attempt to provide your own vector table is to define it as an int array and initialize it explicitly at the point of definition, as you've learned in the previous lesson. For starters, let's skim over the initialization by just typing a couple of zeros to make the compiler happy, and instead let's focus first on the biggest challenge here, which is the proper placement of the vector table in memory. To see exactly where the linker has put the original vector table from the library, open the linker map file. As you can see, the default vector table was in the .intvec section at address zero in ROM. To understand how this section is defined, you need to open the linker script file project.icf. I've mentioned the linker script already in Lesson 10, where you saw how to adjust the size of the stack. Today, you also already saw the linker script in the diagram of the embedded build process. The purpose of the linker script is to tell the linker where to place the various merged program sections in the address space. So, let's open the linker script file project.icf. The part of the script delimited by the special comments is controlled by the Linker Configuration File editor. You can open this editor through the "project options" dialog box. The location of the .intvec section is controlled in the first tab of this editor. The second tab controls the start and end of ROM as well as the start and end of RAM. Finally, the third tab controls the size of stack and heap. Let me slightly change the size of the stack to show you how it gets updated in the linker script file. But going back to the .intvec section, it is defined in the "place" command at the bottom of the linker script. The section is read-only and is placed at the address specified by the variable ICFEDIT_intvec_start. All this has important implications for your startup file, because you need to make sure that your own vector table is placed in this special .intvec linker section. 
Unfortunately, there is no standard C syntax to place variables in specific sections. But IAR provides an extension, which looks as follows: You can specify the section by an "at" sign (@) followed by the section name in double quotes. Make sure that the map file is visible and press F7 to build the project. As you can see, the compiler happily accepted your new startup file, but the linker map file looks strange. The .intvec section is not at address zero as before, but rather has been pushed down to the RAM region. I wonder if you can tell why? Well, the problem is that your vector table is a variable that can change, so the compiler can't put it into the read-only memory. To force the vector table array into the ROM, you need to define the vector table as a constant. For this, the C language provides the keyword "const". The use of the "const" keyword in C deserves a whole separate lesson, but syntactically it is very similar to the "volatile" keyword that I explained in lesson 5. Similar to "volatile", the keyword "const" can be placed either before or after the type. Again, just like with "volatile", I recommend placing "const" after the type. When you build the project and check the linker map file now, the vector table is indeed placed in ROM and at the correct address 0. Also, if you look at the object module that provides the vector table, it turns out to be your file startup_tm4c.o. So congratulations, you have achieved the objective of this lesson. Please note, however, that at this point the vector table is not initialized correctly, so the code won't work. In fact, I just found out that the code might hang up your debugger and prevent you from programming your board. Therefore, just in case, please initialize your vector table with the following safe values. This concludes the lesson about the embedded software build process and replacing the default vector table. 
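For reference, the definition developed in this lesson can be sketched roughly as follows. The IAR branch uses the @ extension and the "const" placement discussed above. The GNU-style branch and the IN_INTVEC macro are assumptions added only so that the sketch also compiles with other compilers; they are not part of the lesson. The two zeros are the placeholders typed "to make the compiler happy" and are NOT the safe values mentioned above.

```c
/* startup_tm4c.c (sketch): a constant vector table placed in ".intvec" */

#if defined(__ICCARM__)
    /* IAR extension from this lesson: '@' followed by the section name */
    #define IN_INTVEC @ ".intvec"
#elif defined(__GNUC__) && defined(__ELF__)
    /* assumption: GNU-style section attribute, not covered in the lesson */
    #define IN_INTVEC __attribute__((section(".intvec")))
#else
    /* unknown compiler: no special placement (table goes wherever) */
    #define IN_INTVEC
#endif

/* "const" after the type, so that the table can live in ROM */
int const __vector_table[] IN_INTVEC = {
    0, /* placeholder only: replace with the safe values from the lesson */
    0  /* placeholder only: do not flash the board with these zeros      */
};
```

Without "const", the same definition would describe a modifiable variable, which is why the linker map showed the section in the RAM region on the first attempt.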
In the next lesson, you will learn how to properly initialize the vector table with the correct stack pointer and all interrupts available in your microcontroller. If you like this channel, please subscribe to stay tuned. You can also visit state-machine.com/quickstart for the class notes and project file downloads.

---
Links used in this lesson:
Article "Library order in static linking": http://eli.thegreenplace.net/2013/07/09/library-order-in-static-linking
Course web-page: http://www.state-machine.com/quickstart
YouTube playlist of the course: http://www.youtube.com/playlist?list=PLPW8O6W-1chwyTzI3BHwBLbGQoPFxPAPM