Welcome to the Embedded Systems Programming course. My name is Miro
Samek and in this lesson I'll continue the subject of functions in C.
Today you'll learn how functions allow you to split your program into
separate files, you'll write your first recursive function, and you'll
learn about the ARM Application Procedure Call Standard (AAPCS).

As usual, let's start with making a copy of the previous "lesson8"
project and renaming it to "lesson9". If you are just joining the
course, you can download the previous projects from
state-machine.com/quickstart.

Get inside the new "lesson9" directory and double-click on the workspace
file to open the IAR toolset. If you don't have the IAR toolset, go back
to "lesson0".

Let me very quickly remind you what has happened so far. In the last
lesson, you created a function, delay(), to busy-wait for the specified
number of loop iterations.

Here is the declaration of the delay() function, also known as the
prototype, and here is the definition of the delay() function, which
contains the actual code.

The delay() function is then called in two places in your main program.
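
For reference, the three pieces just described might look roughly like
this. It is only a sketch: the parameter name, the loop body, the
iteration counts, and the blink_once() wrapper (standing in for the loop
in your main program) are my assumptions, and the LED register accesses
are left out so that the code can run anywhere.

```c
/* Declaration (prototype): name, argument type, and return type only. */
void delay(int iter);

/* Definition: the actual code. The loop body is an assumed
 * reconstruction of the busy-wait from the previous lesson. */
void delay(int iter) {
    int volatile counter = 0;  /* volatile keeps the loop from being optimized away */
    while (counter < iter) {
        ++counter;
    }
}

/* Hypothetical stand-in for the blinking code in main():
 * one definition of delay(), called in two places. */
void blink_once(void) {
    /* turn the LED on here (hardware access omitted) */
    delay(1000);
    /* turn the LED off here (hardware access omitted) */
    delay(500);
}
```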

The fact that a single function can be called multiple times, instead of
repeating the same code verbatim, was your main motivation for using
functions in the first place. But functions allow you to achieve an even
greater feat. Functions enable you to split the program into multiple
files rather than having to keep all the code in the main file, which,
for any non-trivial program, would quickly become like a kitchen sink.

So, the first thing I want you to do today is to move the delay function
to its own file.

First, create a new file by clicking the "New document" tool button.
Next, cut-and-paste the delay() function into the new file. And finally,
save the file as delay.c into your project directory.

At this point, the file exists on the disk, but it is not yet part of
the project.

You need to add the file to the project by right-clicking on the project
and choosing "Add" and then "Add delay.c" from the popup menu.

When you compile your project at this stage by pressing F7, you get an
error that the delay() function has been defined without a prototype.
You can easily make this error go away by copying the prototype from
main.c to delay.c.

But this is a very lousy fix, because now the repeated prototypes can
very easily become different, if you change one and forget about the
other. This is just one more example of violating the DRY principle
(Don't Repeat Yourself), which I hope you remember from the last lesson.

The right solution is to place the prototype itself in a separate file,
which you could then include in all files that call the delay() function.

As before, create a new document and cut-and-paste the delay() function
prototype into it. Save the file as the delay.h header file.

Now you can include the delay.h header file in main.c and in delay.c
instead of repeating the prototype code.

The final touch you can add to the delay.h header file is the protection
against multiple inclusion. Such automatic protection is useful, because
header files can include other header files, which can easily lead to a
situation where the same header file is included more than once.

In fact, most header files provided in various libraries, such as the
lm4f.h header file, contain this sort of protection against multiple
inclusion.

This is achieved by means of the C pre-processor as follows. At the top
of the file you have an #if-not-defined pre-processor directive followed
by the mangled name of the file. There are many conventions of mangling
the name, here for example you see two underscores pre-pended and
appended to the capitalized file name. The idea here is that this macro
is NOT defined initially, so the pre-processor WILL go beyond the
#ifndef directive the first time.

However, the next line defines this macro, so that it IS defined from
now on. Therefore, should the header file be included again, the
preprocessor will NOT go past the #if-not-defined directive and the body
of the file will be skipped up to the matching #endif directive.

So, now let's apply this technique to your delay.h header file. Copy and
paste the top two lines and change the mangled file name.

Don't forget to provide the matching #endif directive at the end.
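
To see the guard at work, here is a minimal sketch in which the contents
of delay.h are pasted twice into one file, which is what the
pre-processor effectively sees when the header is included twice. The
macro name __DELAY_H__ follows the two-underscore convention mentioned
above, and the parameter name and loop body are my assumptions. In a real
project you would write #include "delay.h" in main.c and delay.c instead
of pasting the text.

```c
/* ---- first inclusion of delay.h ---- */
#ifndef __DELAY_H__      /* not defined yet, so the pre-processor goes on */
#define __DELAY_H__      /* from now on the macro IS defined */

void delay(int iter);    /* the single, shared prototype */

#endif /* __DELAY_H__ */

/* ---- second inclusion of delay.h: skipped up to the #endif ---- */
#ifndef __DELAY_H__
#define __DELAY_H__
#error "never reached, because __DELAY_H__ is already defined"
void delay(int iter);
#endif /* __DELAY_H__ */

/* ---- delay.c: the definition, checked against the prototype ---- */
void delay(int iter) {
    int volatile counter = 0;
    while (counter < iter) {
        ++counter;
    }
}
```

If you delete the first copy of the header text, the #error line becomes
active and the build stops, which shows that the guard decides what the
compiler gets to see.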

In quick summary, you've just learned about one of the most powerful
features of the C programming language, which is the ability to build
programs from separately compiled source files. This ability, which is
enabled by the use of functions, is critically important, because having
a complete program in just one file is both impractical and
inconvenient. The way a program is organized into files can help you
understand the overall structure and enable the compiler to enforce that
structure. Such modularization is also beneficial for speeding up the
compilation process, because only the changed files need to be re-compiled.

Now let's move on to explore some other properties of functions, such as
the ability to return a value. As an example, consider a function to
calculate the factorial of an integer argument n.

Let's design this function top-down, that is, starting with the
prototype and the use case. The function name is 'fact' and it takes one
unsigned argument 'n'.

The factorial function returns an unsigned value equal to 1*2*3 all the
way up to n.

With the prototype in place, you can write the code that uses the
factorial function, even before you actually define it.

An unsigned volatile variable 'x' will be used to store the values
returned by the factorial function. The variable is volatile, to prevent
the compiler from optimizing it away.

So, now here is how you call a function that returns a value. You can
simply assign the returned value to a variable. At this point, you
should also think about the allowed range of the function arguments.
Mathematically, the lowest value for which factorial is defined is zero.

Actually, strictly speaking, the argument should be an unsigned zero, to
match the unsigned type of the parameter.

A function returning a value can also be used inside an expression
rather than being immediately assigned to a variable.

Finally, you can also just call the function without doing anything
with the return value. This would make sense only if your function has
some meaningful side effects. However, to make clear that you don't care
about the return value, I recommend explicitly casting the return value
to void.

When you try to build your program at this stage, you get an error. But
note that this is _not_ a compiler error. This error is generated by the
linker, which is the next step of building your program, where all the
compilation units are "linked" together.

When you scroll up, you can see that the linking stage failed, because
the function "fact()" has not been found. Obviously, you have not
defined this function yet, so this reason is clear. But I wanted to
point out that this type of error is not detectable by the compiler,
because the compiler cannot know in which file the definition of this
function might reside.

So, let's define the fact() function. Copy the prototype and provide the
braces for the body of the function. Before writing the actual code, you
might want to brush up on your math by writing the mathematical
definition of factorial in a comment. In fact, there are two
definitions: iterative and recursive. For this exercise, you will use
the recursive definition, in which the factorial of zero is 1 and the
factorial of n is n times factorial of n-1 for all n greater than zero.

Translating this into C you get: if n equals unsigned-zero, return the
value unsigned-1. Otherwise, return n times factorial of n-1. So, as you
can see, a function produces the result by means of the return
statements, which are followed by numerical expressions.
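
Putting the pieces together, the prototype, the three ways of calling,
and the recursive definition might look like this. Treat it as a sketch:
the exact arguments and the expression are my assumptions, and the
use_fact() function is a hypothetical stand-in for the code that lives in
your main function.

```c
unsigned fact(unsigned n);  /* prototype: one unsigned argument, unsigned result */

/* Hypothetical stand-in for the calls made from main(). */
unsigned use_fact(void) {
    unsigned volatile x;        /* volatile, so the compiler keeps the stores */

    x = fact(0U);               /* assign the returned value to a variable */
    x = 2U + 3U * fact(1U);     /* use the returned value inside an expression */
    (void)fact(5U);             /* explicitly discard the returned value */

    return x;
}

/* Recursive definition of factorial:
 *   0! = 1
 *   n! = n * (n-1)!   for n > 0
 */
unsigned fact(unsigned n) {
    if (n == 0U) {
        return 1U;
    }
    else {
        return n * fact(n - 1U);
    }
}
```

Note that the prototype alone is enough for the compiler to accept the
calls in use_fact(). The definition is needed only later, by the linker.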

When you build now, both the compilation and linking succeed. So,
congratulations. You've just written your first recursive function that
calls itself.

All right, so now it's finally the time to see how it really works. I'll
run the code on the TI board, to show that it works on real hardware,
but you might just as well use the simulator.

Before stepping into the code, adjust the memory view to see the top of
the stack, which, as you recall from the last lesson, is stored in the
Stack Pointer (SP) register.

To help you keep track of the stack content in the memory view, I will
use an arrow pointing to the current top of the stack.

Stepping into the code, you can see that the call to your factorial
function consists of two instructions. First the argument value is moved
to R0 and then the function is called with the Branch-and-Link (BL)
instruction.

Inside the factorial function, the first thing that the code does is to
push the registers R4 and the LR (link-register) to the stack. By now
you should understand why the LR has to be saved, and this is because
factorial is not a leaf function, so it must preserve the LR clobbered
by the BL instruction when it calls itself.

The very next instruction should give you a clue why saving the R4
register is also necessary. As you perhaps remember from the last
lesson, the first argument to a function is passed in the R0 register,
but inside factorial R0 is re-used to return the value and also as the
argument to the nested factorial call, so the compiler moves R0 to R4.

Here you can see that the function moves the return value (1 in this
case) into the R0 register.

The last instruction of the function is very interesting, because it
kills two birds with one stone. As you know by now, any stack operation
executed in the beginning of a function code must be exactly reversed
before the function returns. So here the function pops the two
registers, which it pushed initially. But the content of the original LR
is popped directly into the Program Counter (PC), which causes the return.

Please notice that the value on the stack that was popped into PC is
actually an odd number 0x49, whereas the PC itself has become an even
number 0x48. I explained why the least-significant bit of
the address is handled specially in the BX instruction, which was used
to return from a leaf function, such as delay(). Here you can see that
the pop to PC instruction is also a special case, and behaves like the
BX instruction.
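
The arithmetic behind these two numbers can be sketched in C. The helper
names below are hypothetical. The sketch only models the rule that bit 0
of the popped value marks Thumb state and is not part of the address that
ends up in the PC.

```c
#include <stdint.h>

/* Hypothetical helper: the address the PC holds after the value
 * 'popped' is loaded into it (bit 0 is cleared). */
uint32_t pc_after_pop(uint32_t popped) {
    return popped & ~(uint32_t)1U;
}

/* Hypothetical helper: non-zero when bit 0 (the Thumb bit) is set. */
int is_thumb(uint32_t popped) {
    return (popped & 1U) != 0U;
}
```

With the value from the stack, pc_after_pop(0x49U) gives 0x48 and
is_thumb(0x49U) is non-zero.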

Finally, here you see again that the function returns the value in R0,
which is then stored on the top of the stack, where the x variable lives.

In the next call to factorial(), you see that factorial calls itself.

The most important observation here is that the next call to factorial
happens before the previous call returns and pops the registers from the
stack, so it nests on top of the stack allocated by the first call.

Here you can see that after the recursive call returns, the value
returned in R0 is multiplied by the original value of the argument n
stored in R4. The result is again stored in R0 to be returned from the
function.

Back in the main function, you can see how the expression is evaluated,
with the factorial value taken from R0.

The last call to factorial of 5 shows six levels of recursive calls. As I
quickly step through the code, please watch the stack building up with
the clearly distinguishable pattern of the decrementing arguments 'n'
and return addresses.

At this point the recursive calling stops and now all these nested calls
will return one at a time.

Again, as I step through this return sequence, please watch how this
causes the stack to unwind.

After the factorial function eventually returns all the way to main, you
can see that the result it produced in R0 is hex 78, which is 120
decimal, which is indeed the factorial of 5.
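
If you don't have the debugger at hand, you can get a rough feel for the
nesting with an instrumented copy of the function. This is just a sketch:
fact_traced(), depth and max_depth are names I made up for illustration,
and the counters only count call levels, not actual stack bytes.

```c
static unsigned depth;      /* current nesting level of fact_traced() */
static unsigned max_depth;  /* deepest nesting level seen so far */

unsigned fact_traced(unsigned n) {
    unsigned result;

    ++depth;                        /* entering one more nested call */
    if (depth > max_depth) {
        max_depth = depth;
    }

    if (n == 0U) {
        result = 1U;                /* the recursion stops here */
    }
    else {
        result = n * fact_traced(n - 1U);
    }

    --depth;                        /* this call is about to return */
    return result;
}
```

After fact_traced(5U) returns 120, max_depth is 6 and depth is back at 0.
Since each level in the walk-through pushed two 32-bit registers (R4 and
LR), six levels correspond to roughly 48 bytes of stack for this one
computation.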

The last subject I'd like to touch upon in this lesson is the function
calling convention. From the discussion so far, I hope you have noticed
that there must be some agreement between the caller of a function and
the function being called. For example, both sides must have an
understanding that the return address will be provided to the function
in the LR register. Also, both parties must agree that the first
argument will be passed in R0 and that the return value will be returned
in R0 as well.

Of course, there are many more such little agreements that all form a
whole formal contract called the ARM Application Procedure Call Standard
(AAPCS). The whole formal document is quite complex and you can find it
online by searching for "AAPCS". Here, I would only like to mention how
the AAPCS assigns the responsibilities for the ARM registers, because
this will be very useful when you learn about handling interrupts on the
ARM processor.

So, the registers R0-R3 are used for passing arguments and returning
values, and together with the scratch register R12 they can be clobbered
by a function. On the other hand, the function must preserve the 8
registers R4-R11. This doesn't mean that a function cannot use R4-R11,
but if it does, the function code must save them on the stack and restore
them before returning. For example, please recall that your factorial
function used R4, but it saved it on the stack first.

This convention allows the caller of the function to use the preserved
registers for values that must survive a function call. Again, recall
that the value of the argument 'n' was stored in R4 exactly to survive
the recursive call to factorial, because it was needed in the final
multiplication.

Speaking of the factorial computation, the recursive implementation that
I've used in this lesson was just for a convenient demonstration of a
deep call sequence, so that you can watch the stack grow and shrink. But
in fact, any such deep call sequences should be avoided in embedded
programming, exactly because they use a lot of precious RAM for the
stack. A much better implementation of factorial would be the iterative
version or, perhaps better yet, a lookup table.
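
Here is a minimal sketch of both alternatives. The names fact_iter() and
fact_lut() are hypothetical, as is my choice to return 0 for arguments
whose factorial does not fit in 32 bits (the largest that fits is 12!).

```c
#include <stdint.h>

/* Iterative version: one stack frame no matter how big n is. */
uint32_t fact_iter(uint32_t n) {
    uint32_t result = 1U;
    while (n > 1U) {
        result *= n;
        --n;
    }
    return result;
}

/* Lookup-table version: 0! through 12!, kept in read-only memory. */
static uint32_t const fact_table[] = {
    1U, 1U, 2U, 6U, 24U, 120U, 720U, 5040U, 40320U,
    362880U, 3628800U, 39916800U, 479001600U
};

uint32_t fact_lut(uint32_t n) {
    if (n < sizeof(fact_table) / sizeof(fact_table[0])) {
        return fact_table[n];
    }
    return 0U;  /* assumed error marker: the result would overflow 32 bits */
}
```

Both versions use a fixed, small amount of stack. The table trades a few
bytes of ROM for an answer with no loop at all.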

This concludes this lesson about modules, functions that return values,
recursive functions, and the ARM Application Procedure Call Standard.

However, we are still not done with functions. In the next lesson, you
will learn more about function arguments, including pointer arguments,
and you'll learn about scope and local stack-based variables. Finally,
you will see what can happen when you overflow the stack.

If you like this channel, please subscribe to stay tuned. You can also
visit state-machine.com/quickstart for the class notes and project file
downloads.

---
Course web-page:
http://www.state-machine.com/quickstart

YouTube playlist of the course:
http://www.youtube.com/playlist?list=PLPW8O6W-1chwyTzI3BHwBLbGQoPFxPAPM