Protothreads versus State Machines


For a number of years I’ve been getting questions regarding protothreads and comparisons to state machines. Here is what I think:

Protothreads in a Nutshell

Protothreads are an attempt to write event-driven code in a sequential way. To do so, protothreads introduce the concept of a “blocking abstraction” into event-driven programming, which is precisely the thing that event-driven programming is trying to get rid of in the first place.

Obviously, to be compatible with event-driven programming, protothreads cannot really block, at least not in the same sense as the traditional threads of a conventional blocking RTOS can. Instead, a protothread is still called for every event and still returns to the caller without really blocking. However, when a given event does not match the “blocking abstraction”, the protothread returns without doing anything and without progressing. Only when the current event matches the “blocking abstraction” does the protothread advance to the next “blocking abstraction”, and then it also returns.

Please note that protothreads allow standard flow-control statements, such as IF-THEN-ELSE and WHILE loops, between any two “blocking abstractions”. Therefore the program feels more “natural” to designers used to the traditional sequential programming style: you simply see the expected sequence of events.
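To make the mechanism concrete, here is a minimal sketch of the “blocking abstraction” in C, in the spirit of the original protothreads (the switch/__LINE__ “local continuation” trick). The macro and function names here are illustrative, not the real pt.h API:

```c
/* A minimal protothread sketch: the "local continuation" is just an int
 * that records which case label (source line) to resume at. */
typedef struct { int lc; } pt_t;

#define PT_INIT(pt)   ((pt)->lc = 0)
#define PT_BEGIN(pt)  switch ((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, cond)               \
    do {                                      \
        (pt)->lc = __LINE__; case __LINE__:   \
        if (!(cond)) return 0; /* "block": return without progressing */ \
    } while (0)
#define PT_END(pt)    } (pt)->lc = 0; return 1 /* thread finished */

int got_a = 0, got_b = 0, done = 0;

/* A protothread that waits for two events in sequence. */
int demo_thread(pt_t *pt) {
    PT_BEGIN(pt);
    PT_WAIT_UNTIL(pt, got_a);   /* only event A advances past this point */
    PT_WAIT_UNTIL(pt, got_b);   /* only event B advances past this point */
    done = 1;
    PT_END(pt);
}
```

The only context that survives between calls is the single `lc` integer; each call either returns immediately (the event did not match) or advances to the next wait point and then returns.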

Protothreads as a Simplification

Protothreads are indeed a simplification, but only for sequential problems, in which only specific sequences of events are valid and all other sequences are invalid. Examples of such problems include the sequence of events for initializing a radio modem (the AT commands and responses).

Protothreads as a Complication

However, protothreads lose their advantage entirely if there are many valid sequences of events. This is actually the most common situation in real-life systems. Typically, even if a problem initially seems sequential, inevitably during the development new valid event sequences are discovered.

State machines can handle multiple sequences of events easily. In fact, state machines get simpler as the sequence of events matters less. In contrast, protothreads get more and more complex as they need to accept more sequences of events.

State Machine Implementations

So, in the end, protothreads are an attempt to replace state machines, which are considered complex and messy by the inventors of the protothread mechanism. But, the “messiness” of state machines is obviously a very subjective statement.

A good state machine implementation technique can remove many (most) accidental complexities of coding state machines. I mean, if a state machine as depicted in a state diagram is simple, and if the code does not reflect this simplicity, the problem is with the implementation technique, not with the inherent complexity of a state machine.

The Bottom Line

The bottom line: good state machine implementation techniques eliminate most reasons for using protothreads. State machines are far more generic and flexible, because they can easily handle multiple sequences of events. In comparison, protothreads are intentionally crippled state machines that transition implicitly from one “blocking abstraction” to another, executing the code in between as the “actions” on such implicit transitions.


21 Responses

  1. As the author of the googlecode version of protothreads, I (not surprisingly) look at it a bit differently. I think there are two senses in which it’s not right to say the protothreads model (or threaded code in general) is good only when specific sequences of events are valid. First, there can be many independent threads running at the same time. So looking at a timeline of a black-box view of processor events — initiating external actions and receiving various external stimuli — the sequence of these events can vary enormously, for the simple reason that the threads can interleave execution in innumerable ways. So, yes, a given thread follows a sequence of events, but the overall system does not.

    Second, even a given thread may not follow a single (“only valid”) sequence of events. Suppose a thread wants to read data from an external device such as a disk; it first checks to see if the data is in a memory cache, and if so it skips the sequence of events “send a read to the disk” and “receive a reply from the disk” but the rest of the sequence is the same.

    I guess where we would agree is that for threads to be useful, you have to be able to break down the system into units of “work in progress” or tasks, and this may not be the case for many embedded systems, which consist conceptually of a single task. But many state machines do fit this multi-task model; there is a per-task control structure that keeps track of the current state, and this structure is shepherded through a sequence of states — not necessarily an entirely fixed sequence, but one making progress toward some goal. Usually, the control structure includes a “state” variable (often a pointer to a function to call when the next event in the sequence occurs, or, equivalently, an integer or enum that controls a C switch statement) that records where the task is along its journey. These tasks are running independently and can all be in different phases of their overall execution.

    The advantage I see of threads is that this “state” variable is generalized. Instead of a simple function pointer, the state is a location /within/ a function (possibly within a bunch of IF or WHILE statements), so it provides much more context. And it’s even more general; the state can consist of a location within function A, which has called B (and B may be called from several places in A), the location within B, which has called C, and the location within C where we are waiting for an external event. That is a lot of context, compared to just a simple function pointer (or a state variable).

  2. I absolutely agree with all the properties of (proto)threads that you enumerate, however I see most of these properties as problematic, whereas you see them as beneficial.

    The biggest problem of threads is that they must *block* to wait in-line for events. The problem is that while a thread is blocked, the thread is not doing any other work and is *not responsive* to other events. Such a thread cannot be easily extended to handle new events, not just because the system is unresponsive, but mostly because the whole structure of the intervening code past the blocking call is designed to handle only the event that it was explicitly waiting for.

    Sure, as you point out, one can use multiple threads to wait for multiple events in parallel. But this leads to an unnecessary, explosive proliferation of threads. Designers, unable to add new events to unresponsive threads, are forced to create new threads, regardless of coupling and cohesion. Often the new feature uses the same data and resources as an already existing feature (such features are called cohesive). But unresponsiveness forces you to add the new feature in a new thread, which requires tricky *synchronization* of the threads and caution with sharing the common data. So semaphores, mutexes, and other such *blocking* mechanisms must be applied, and the vicious cycle tightens. The designer ends up spending most of the time not on the feature at hand, but on managing subtle, hairy, unintended side effects.

    Finally, regarding the generalization of “state variables” in (proto)threads, I think that this is actually a crippling *specialization* of a very powerful and simple concept of “state”.

    A thread preserves the context of computation in the call stack and the program counter. This is done to match the traditional imperative, procedural languages, like C or C++, so that designers used to the sequential way of thinking can continue to apply it to non-sequential event-driven problems. The stack and program counter context of a thread allow you only to progress from one “blocking abstraction” to another. Granted, you can use any number of IFs, ELSEs, WHILEs, and function calls between the “blocking calls”, but then you must use some conditions for your IFs and WHILEs, and these are problematic, as I will try to illustrate below.

    In contrast, the concept of “state” captures only the *relevant* aspects of the system’s history and can easily ignore all irrelevant event sequences. For example, a computer keyboard can be either in a “default” state or in a “shifted” state, depending on whether the Shift key is depressed. The behavior of the keyboard does not depend on how many and which keys have been depressed in the past. The only thing that matters is depressing the Shift key, which causes a transition from the “default” state to the “shifted” state, and releasing the Shift key, which causes the transition from “shifted” back to “default”.

    Now, I challenge you to implement this simple behavior with a (proto)thread. I bet that most people will introduce a flag ‘Shifted’ that will be TRUE or FALSE depending on the state of the Shift key. Then their code will have an IF (Shifted) condition to distinguish the two cases. So, in spite of having all this rich thread context of the whole call stack and program counter, people will still use a flag! (Of course, to model a real keyboard, they will soon introduce other flags, such as CapsLocked, Alt, Ctrl, and others, all tested in convoluted IFs and ELSEs, so the code will quickly degenerate into “spaghetti”.)
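    As a sketch of the alternative, here is how the keyboard behavior might look as a state machine in C, with the current state held as a function pointer instead of a ‘Shifted’ flag (all names here are made up for illustration):

```c
typedef enum { SHIFT_DOWN, SHIFT_UP, KEY_A } event_t;

typedef struct keyboard keyboard_t;
typedef void (*state_fn)(keyboard_t *kb, event_t e);

struct keyboard {
    state_fn state;   /* the current state IS the handler function */
    char last_char;   /* character produced by the most recent key press */
};

void kb_default(keyboard_t *kb, event_t e);
void kb_shifted(keyboard_t *kb, event_t e);

void kb_default(keyboard_t *kb, event_t e) {
    switch (e) {
    case SHIFT_DOWN: kb->state = kb_shifted; break;  /* transition */
    case KEY_A:      kb->last_char = 'a';    break;
    default:         break;   /* irrelevant events are simply ignored */
    }
}

void kb_shifted(keyboard_t *kb, event_t e) {
    switch (e) {
    case SHIFT_UP: kb->state = kb_default; break;    /* transition */
    case KEY_A:    kb->last_char = 'A';    break;
    default:       break;
    }
}

void kb_dispatch(keyboard_t *kb, event_t e) { kb->state(kb, e); }
```

    Irrelevant events fall through the default case, and adding, say, a CapsLock behavior means adding new state-handler functions rather than threading yet another flag through existing IFs.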

    So, the bottom line is that the sequential (procedural) programming model of threads too often leads to violation of the Open Closed Principle (OCP) of a good design. When some aspect of a design follows the OCP, it can be extended by adding new code, rather than modifying existing code.

    Pure event-driven programming (without blocking) naturally partitions the code into small chunks that handle events. State machines partition the code even finer, because you have small chunks that are called only for a specific state-event combination. Such code naturally follows OCP, so you easily can add new functionality *without* changing the other already designed and *tested* code.

    In contrast, code based on threads requires dramatic changes whenever you add a new event. You have generally two options. Either you change the “blocking abstractions” to wait for the new event and you dramatically change the downstream code to handle the new event. Or you introduce a new thread, but then you still need to go to the original thread to provide some *synchronization* mechanism. Either way, you violate the OCP.

  3. Gosh, I am amazed at the degree to which we seem to be missing each other or talking past each other. It could be that we have in mind different kinds of functionality that we are trying to solve. I am the outsider, a guest, on your blog here, and you know the kinds of problems you want to discuss and write about here on your own blog, so I am probably off-base. In other words, for the kinds of problems you are interested in solving, your approach works best; and perhaps (although I probably haven’t convinced you of this yet) the kinds of problems I am interested in solving, threads (or protothreads) works best.

    But let me still try to get across more clearly what I am trying to say, and see if we can reach some agreement. You said when a thread is blocked, it is not doing any work and is not responsive. Well, just because this thread is not doing any work doesn’t mean the system is not doing any work (because of multiple threads). And I don’t *want* this thread to be responsive to anything but the event that constitutes the next step in its multi-step task. For example, a thread has issued a read request to a disk and is waiting for it to complete. I don’t *want* it to be responsive to any other event. This thread can’t make progress on this task until this particular read is complete, and only then it can go on to the next step, possibly involving blocking again, waiting for that particular event (such as a disk write) before going on to yet another step (or maybe now the overall task is done). What I think is nice about threads is that when you block for the next event, you don’t lose your context (the series of call frames); the code is structured as if this event happens instantaneously: you just called read_disk() and when it returns, you have your data; the fact that you had to yield the CPU to other threads during the read is hidden.

    But, again, maybe the kind of system we’re talking about is different. I am supposing there are two kinds of events (I’ll define an event as some kind of interrupt or message from outside the CPU) that can occur. The first kind are /initiation/ events, an event that initiates one of these tasks, which will sequence through a series of steps (whether implemented using a thread or a state machine). So imagine we’re building a server, such as a file server. There are many kinds of requests that we can receive: create a file, delete a file, read a file, write a file, look up a file in a directory, and so on. And we want the ability to process these requests in parallel, such that a file can be written to while another file is being deleted at the same time.

    So — and this is really the point you have emphasized — there has to be a piece of software that is *responsive* to all these many kinds of events, and does not block. Even when using threads, there has to be this kind of top-level *event-driven* message receive handler. That is, even a threaded design has (at least) this one key event-driven aspect to it. And notice that this receive handler can be extended to handle new kinds of events — it’s basically a big switch statement. (New threads may have to be written to handle these new event types.)

    Now, depending on the specific request message that we receive, the receive handler spawns a thread to handle it (unless it’s a simple request that can be handled without blocking). This new thread will be responsible for this one request, from start to finish, step by step, possibly blocking several times along the way, and finally replying to the original requester (on the other side of the network connection), and then the thread exits. (Note too that in more complex systems, a thread may reach a point where it has several sub-tasks that can be performed in parallel; in that case for better performance it can spawn a new thread (or threads) for one (or more) of the sub-tasks, and then synchronize with their completion, before going on to the next step.)

    The events that a thread waits for are the second kind of event — events that only a particular thread cares about, and it’s the *only* event that this thread cares about (right now), such as the completion of a disk read (continuing with the example of a file server).

    Depending on the details of the system, the top-level event-driven receive handler might run even in response to something like a disk read; in other words, it’s like a general-purpose interrupt handler. But in this case, it merely wakes up the thread that is waiting for that particular disk read (so, yes, it does have to be able to identify which thread to wake up, and my version of protothreads provides a very simple way to do that).

    So … are we any closer to being on the same page after that (too-lengthy) exposition?

    1. I am trying to understand which part of this discussion is the old “threads vs. events” debate, and which is specific to proto-threads.

      It seems to me that the idea of a “top-level *event-driven* message receive handler” is specific to proto-threads, because I don’t really see this being the norm in traditional RTOS-based projects. I would count it as a plus for proto-threads.

      What also seems specific to proto-threads is that they actually can share common data, as they are all called from a single “top-level event-driven message handler”, so they all cooperatively run-to-completion (to a “blocking abstraction”) in a single real thread.

      This makes it now possible to spawn different proto-threads to wait on different events in parallel, without worrying about corruption of the common set of data shared among all these proto-threads. This is all good, because it allows programmers to decompose the problem into sequences of events and apply the sequential programming paradigm that we all learned back in school and love.

      But, this paradigm also requires some synchronization among the proto-threads, which I’m not sure that I understand. Consider the following common problem. We want to send a message over a lossy connection and we want to proceed only when we hear a reply to our message. But the connection is unreliable, so we might never receive the expected reply, so we need to protect ourselves with a timeout. Now, we can use one proto-thread to wait for the reply and another to wait for the timeout. If the reply arrives, we need to terminate the proto-thread that blocks on the timeout. Conversely, when we receive the timeout, we assume that messages have been lost in transmission, so we need to terminate the proto-thread waiting for the original reply. How do we do this sort of synchronization? Isn’t it becoming actually more complicated than a single state machine that waits for both events?
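      For comparison, a single state machine that waits for both events needs no inter-thread synchronization at all. A hypothetical sketch in C (the states and events are made up):

```c
typedef enum { EV_SEND, EV_REPLY, EV_TIMEOUT } event_t;
typedef enum { ST_IDLE, ST_WAITING, ST_DONE, ST_FAILED } state_t;

typedef struct { state_t state; } link_t;

void link_dispatch(link_t *l, event_t e) {
    switch (l->state) {
    case ST_IDLE:
        if (e == EV_SEND) l->state = ST_WAITING;  /* message sent, timeout armed */
        break;
    case ST_WAITING:                              /* ONE state, TWO possible exits */
        if (e == EV_REPLY)   l->state = ST_DONE;
        if (e == EV_TIMEOUT) l->state = ST_FAILED; /* reply no longer expected */
        break;
    default:
        break;   /* terminal states ignore all further events */
    }
}
```

      Whichever event arrives first simply moves the machine out of the waiting state; the late-arriving event is then ignored, so nothing needs to be explicitly terminated.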

  4. Thank you both for an interesting discussion. As someone who has programmed both ways and taught both ways, here’s my 2 cents worth.

    First, I don’t think the discussion is so much Protothreads vs. State Machines as it is Threads vs. State Machines. From where I sit (apologies to you, Larry, if I’ve missed something all these years), protothreads are just threads that don’t have a preemptive kernel underneath. Your clever protothread library is thus merely a light-weight alternative to a traditional RTOS. However, other than the elimination of some types of race conditions (the type for which a preemptive programmer would use a mutex), application-level code architecture is the same with threads (a.k.a., RTOS tasks) or protothreads.

    I think it’s a useful abstraction to think of all (proto)threads as having essentially the following simple structure:

        for (;;) {
            // 1. Wait for an event of interest to occur (without burning CPU cycles).
            // 2. Respond appropriately.
        }

    Where the wait step is either a semaphore pend or a message queue pend. (On a conceptual level, a semaphore is really just a special case of a message queue where the message contents are implicit in the act of sending.)

    I believe the above structure is useful when examining your differences of view. In fact, if multi-threaded code is architected so that all events of interest arrive on a single incoming message queue, cooperating (proto)threads have very much in common with cooperating state machines.

    I believe the key point in Miro’s original post is this one:

    [Threads] lose their advantage entirely if there are many valid sequences of events. This is actually the most common situation in event-driven systems. State machines are capable to handle multiple sequences of events [more] easily.

    Larry’s filesystem example is a problem of the more or less “single sequence of events” variety: a request to read a particular file arrives; a translation to drive coordinates is done; a low-level read request is issued; the thread waits; the low-level read completes and an interrupt fires; the ISR triggers the unblocking of the thread (e.g., by a post to a message queue); the thread awakes and sends the data to the requester. Here’s what this might look like in code under the further simplifications that files are always confined to a single disk read and there’s never more than one active request at a time:

        for (;;) {
            // 1. Wait for a read request.
            // 2a. Translate file name to drive coordinates.
            // 2b. Call low-level read function.
            // 1. Wait for a signal from the driver ISR.
            // 2. Grab the data from the buffer and send to original requester.
        }

    Note that we can always fold a transaction sequence like this into the simpler structure in my first code listing if we have both types of messages arrive on the same message queue and insert code to examine the messages according to implicit “sequence of event” expectations:

        for (;;) {
            // 1. Wait for an event to arrive in the message queue.
            switch (p_event->type) {
                case READ_REQUEST:
                    // 2a. Translate file name to drive coordinates.
                    // 2b. Call low-level read function.
                    break;
                case XFER_COMPLETE:
                    // 2. Grab the data from the buffer and send to original requester.
                    break;
            }
        }

    Miro’s technique for handling the same simple sequence is also similar to my first code example, when viewed from the event dispatch loop:

        for (;;) {
            // 1. Wait for an event to arrive in the message queue.
            // 2. Dispatch the event to the active state function, which will respond.
        }

    When the state machines are simple, as in the filesystem example (as presently specified), the choice between threads and state machines is arbitrary. However, as a state machine grows complex, particularly if it becomes suited to the “behavioral inheritance” feature of hierarchical state machines, Miro’s architecture is, IMHO, easier to write and maintain.

    One of the best things about Miro’s Quantum Platform state machine framework is the way that it scales naturally for use as either an alternative to threads or an additional layer of architectural structure on top of (proto)threads. But that’s not at all to suggest that protothreads aren’t also an option worth considering when preemption is not necessary.

    I hope this helps further your discussion and encourage others to jump in.

    1. I just wanted to point out that a thread routine structured around a single blocking point loses most of the benefits of sequential (procedural) programming, which perhaps is not obvious at first glance.

      The most important ability of an RTOS is to be able to block a thread at *any* point. For example, a thread loop can call function foo(), which has complicated if-else logic inside, from which it calls function bar(), which has a complicated while loop inside, which blocks (e.g., by taking a semaphore). After the semaphore unblocks, the execution resumes *exactly* at the blocking call, so we have the whole *context* of function foo() and its complex if-else logic, and function bar() with its while loop, and all the stack-based variables. This is a lot of context, and that’s why RTOSes have to preserve a whole private *stack* for each thread.

      In contrast, when we limit the thread loop to a single, generic blocking call at the top of the loop, we are effectively saying that all code that follows *cannot* block and must quickly return to the main loop. But by returning we automatically lose the whole stack context.

      So, my point is that even though we can use a traditional RTOS for such programming style, we are obviously not using the most important capabilities of the RTOS (for which we are paying in RAM for the stacks and in CPU cycles for complex context switching.)

        1. I do not agree that the ability to block anywhere is “The most important ability of an RTOS”! On the contrary, the ability to do this, although occasionally useful, encourages some really bad programming practices. On this latter point, at least, I am sure we are agreed. The most important thing about an RTOS is that it provides concurrency by allowing an arbitrary set of tasks to execute, each within its own strictly bounded context. That’s regardless of whether we use these tasks to implement one or more state machines (with or without a formal framework) or not. Now that we are being forced into using multicore hardware for all sorts of reasons (most of them bad, in my view), the subject of concurrency becomes even more pertinent, but discussions about how to program concurrent systems are separate from and do not necessarily conflict with discussions about state machines.

        What I regret is that many (most?) RTOSes do not provide easy means to wait for several events at the same time, and to prioritise them. Mine (currently called SKC++) will address this if ever I get around to finishing it. See my blog.

        I’m all for simplicity. In my world, all state information within a task (thread) is held at the top level (the task loop) and is passed to and returned from called functions as necessary. That requires that state information is held as data, as it should be, and not as the convoluted state of some code sequence, way down the tree, which suddenly decides to give way to another task! (Of course, the state information is still typically held on the task’s stack.) The top loop of the task is king and any called functions should, in most cases, do their work without blocking.

        See also my post of 14th June, lexically below this one.

        Our approaches are different, Miro, but I believe our thinking is similar.

          1. I could not agree more with both Michael Barr’s and Peter Bushell’s comments regarding intentionally limiting the use of blocking in RTOS-based designs. In fact, in my earlier post to this blog, “RTOS Without Blocking”, I described exactly how to use an RTOS in this way.

          However, my point in this post is different. I just observe that the bulk of the RTOS code, perhaps as much as 90% of it, is devoted to being able to block anywhere in the code. I mean here the need to maintain a separate private stack context for each task, the complex context switching among those stacks, and all the inter-task synchronization and communication mechanisms, which are based on *blocking*. So, my point is that if 90% of an RTOS is devoted to blocking anywhere in the code, this must be the most important ability of the RTOS.

          Now, the advice to drastically limit the blocking means effectively that 90% of the RTOS should be *avoided*! In other words, this is equivalent to saying that RTOS is *not* the right tool for the job.

          To draw an analogy, using an RTOS to build truly event-driven applications is like using a hammer to pound in the screws. Sure, it can be done, but a hammer is really not the right tool for the job and we are doing a lousy job with it. We need a screwdriver!

  5. This has also been discussed – at length – on LinkedIn. A point which I took from that discussion is that Miro’s Quantum Platform provides a formal, easy-to-use framework for programming an arbitrary hierarchy of state machines, whereas threads provide concurrency (if concurrency is needed), whether standalone or as a basis for such a framework. As you have pointed out, Michael, the two things are complementary rather than comparable. Programming state machines directly to the RTOS (without a framework) can lead to chaos, as Miro emphasises, but it does not necessarily do so. As always, the intelligence, experience and attitude of the programmer ultimately determines the quality of the code.

    I have not yet studied QP in depth, Miro, but I will!

    Meanwhile, I concur with Michael that each task (or thread) should wait at only one point in its loop, where it is responsive to all the events in which it is interested. Unfortunately, many RTOSes do not make it easy to wait for multiple events simultaneously.

    Finally, Michael, if you substituted a call through a table of function pointers for your switch statement in the example you gave, the code would resemble even more closely the code which is naturally generated (whether automatically or manually) from a classic state-transition matrix.
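    That substitution might be sketched as follows; the states, events, and actions here are hypothetical, but the structure is the classic state-transition matrix:

```c
enum { ST_IDLE, ST_BUSY, NUM_STATES };
enum { EV_START, EV_DONE, NUM_EVENTS };

typedef struct { int state; int completed; } ctx_t;
typedef void (*action_fn)(ctx_t *c);

void begin_work(ctx_t *c)  { c->state = ST_BUSY; }
void finish_work(ctx_t *c) { c->state = ST_IDLE; c->completed++; }
void ignore(ctx_t *c)      { (void)c; }   /* event irrelevant in this state */

/* matrix[state][event] -> action (which may also change the state) */
const action_fn matrix[NUM_STATES][NUM_EVENTS] = {
    /* ST_IDLE */ { begin_work, ignore      },
    /* ST_BUSY */ { ignore,     finish_work },
};

void dispatch(ctx_t *c, int event) { matrix[c->state][event](c); }
```

    The dispatch function shrinks to a single table lookup, and the table itself reads off directly from the state-transition matrix in the design documentation.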

  6. It seems to me that embedded systems people should always be including performance and resources when discussing software architectures. If time and/or resources are not limited it is not truly an embedded system but just a custom OS with an “RT” thrown in front of it to impress people.

    The problem with threads is the time and resources that they require. They usually require a stack so that they can store the rather significant stack frame necessary to restore the entire processor context when they next wake up. If you are implementing a truly high-performing system, often this context store/restore process causes way too much latency to be practical. On the other hand, when chosen for the correct kind of task, threads can simplify debugging, as they make the code look more similar to the straight-line code they teach people to write in college or applications programming. Of course, when the process implemented gets complicated, this becomes just as complicated as a state machine to debug.

    The problem with state machines, particularly when one uses event callbacks, is that they can be difficult to debug. It doesn’t take long for the state machine to get so complex that it is difficult for people to resolve bugs without walking through the problem with a source-level debugger or some sort of event-tracking mechanism. On the other hand, state machines can be implemented to use very minimal state-tracking mechanisms. Also, they can be implemented such that things like pipeline stalls, cache line hits, et al. can be controlled for maximum performance. Many truly embedded systems today have limited memory bandwidth and also limited cache, so control can become very important.

    I have had people argue with me that you can overcome the overhead of threading or even preemptive multitasking by processing a lot of events during each context switch. I would argue that you are basically just implementing the state machine inside of your RTOS at that point. Also, while this increases tasks per second it actually increases the latency of other threads.

    Ergo I would say that both approaches have their place. If you have enough memory, then implementing some medium-complexity real-time tasks as threads (where incredibly high performance is not required) can save you debug (and maybe design) time, as long as you have the stack memory available to handle it. If you need maximum performance, then I believe state machines are your best bet.

    1. Even if one neglects the resources (RAM for the stacks and CPU for context switches) required by threads, the biggest disadvantage is *blocking*. A blocked task is unresponsive to any other events, so it is not extensible.

      I’m not sure which state machine implementation techniques you have been using (I’m not familiar with the callback-based implementation), but I would say that with good implementation techniques, state machines are the easiest code to debug of all the alternatives. The toughest code to debug in my experience is deeply nested IF-ELSE “spaghetti”, peppered with gazillions of flags and variables tested in the IFs. The problem is that you never know which convoluted path in the code led to a specific error.

      In contrast, state machine code has much simpler paths through the code. This is because correctly coded state machines work like “spaghetti reducers”. So, you always start with the current *state* (which you can conveniently see in the debugger) and you don’t use gazillions of flags and variables.

      So, to conclude, a good state machine implementation technique can eliminate all these accidental difficulties and makes state machines a clear winner compared to a multitude of threads.

  7. From my point of view, (proto-)threads vs. state machines is also about the fit of the design & execution model. Miro emphasises the need for a good design tool (e.g. state diagrams), while threads seem to be okay with the usual text editors/IDEs.

    The execution model is, as far as I can see, quite distinguishable. Within a thread the execution model is subroutine based: A calls B which calls C which returns to B which returns to A. The call depth is bounded. State machine designs can execute as X transitions to Y transitions to Z, without any “returning” or bounds on stack size. The extra context of where to return is valuable to Larry for his tasks, but I would expect that state machines can also call subroutines when computing. The ability to transition without a stack is possible using “goto” or “trampolines” or “switch” statements, but this is manually tedious without a good design tool. Considering this execution model, functional (and likely non-embedded) languages with tail-recursion guarantees occupy a middle ground of expressiveness: they look like subroutines but can avoid growing the stack of pending returns.
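    As a sketch of the “trampoline” idea mentioned above (all names invented): each state handler returns the next handler instead of calling it, so a chain of transitions X to Y to Z runs in constant stack space.

```c
/* A struct is needed so a function can return a pointer to a function
 * with the same signature (plain function-pointer types can't be
 * self-referential in C). */
typedef struct next { struct next (*fn)(int *counter); } next_t;

next_t state_y(int *counter);
next_t state_z(int *counter);

next_t state_x(int *counter) { ++*counter; return (next_t){ state_y }; }
next_t state_y(int *counter) { ++*counter; return (next_t){ state_z }; }
next_t state_z(int *counter) { ++*counter; return (next_t){ 0 }; }  /* terminal */

/* The trampoline: bounce from state to state without nesting calls. */
int run(next_t start, int *counter) {
    next_t cur = start;
    while (cur.fn != 0) {
        cur = cur.fn(counter);   /* each transition is a return + fresh call */
    }
    return *counter;
}
```

    The loop in run() is the trampoline: every transition is a return followed by a fresh call, so the stack of pending returns never grows, no matter how long the transition chain.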

    The ability to finely define the OCP tasks depends on having an amenable problem to begin with. I would guess that if the problem can be finely defined or needs the non-returning execution, then state machines are a useful design abstraction. If the solution is clearer with threads & subroutines (dispatched by a master event loop), then the flexibility of state machines can be foregone. For some tasks the determining factor may be error and exception handling. It can be tiresome to consistently have great error handling in normal threaded code, whereas state machines may allow simpler reuse of error reporting and recovery code, and the flow of execution after an error may not need to deal with the typical stack.

    1. I think that it is exactly the other way around. State machines are inherently event-driven, and as such they naturally *return* to the caller after processing each event. The main benefit is that a state machine can easily pick up where it left off, because it represents the context as the current *state*.

  8. A few thoughts from a non-embedded systems perspective.

    C# 5.0 will have a feature (async/await) in which you can write proto-thread-style code, which the compiler will rewrite for you into a state machine.

    This has two interesting consequences:

    1. You can mix both approaches relatively easily, e.g. implement simple and mostly linear tasks in a proto-thread style and others as directly implemented state machines. Then you can compose these state machines into bigger state machines, and so build your system in a modular fashion.

    2. You can change your mind about the implementation method (proto-threads vs direct) whenever it makes sense with only relatively local changes. This makes refactoring easier.

    Note that the C# version of proto-thread-style coding is not limited to strictly linear flows. There’s an extensive library for doing things like waiting for (all/one of) n events to happen, so you can build quite complex state graphs with it while retaining the linear look of the code. Not that this is necessarily better than writing the state machine yourself, but it gives you an alternative that may or may not be better for your particular case.

  9. EmbeddedFirmwareEng, you are totally correct that the problem with threads is the time and resources they use. But the protothread model is *much* closer to the event-driven model in these aspects. I pointed out here that my own experiments showed that my version of protothreads is over 100 times as fast as real threads in context switch time (and over 600 times for thread create/destroy).

    Here, I’ll make one more attempt to “prove” why threads (including protothreads) are better than event-driven for some (not all) kinds of problems.

    I hope we can all agree that procedural programming (a procedure or function calls a lower-level function, waiting for it to return before proceeding, so there is a stack and a calling hierarchy) and structured programming (IF and WHILE statements) are two of the all-time great breakthroughs in programming. If you have some kind of complex SORT algorithm to write, you are probably going to write it in that style, rather than as an FSM or in an asynchronous event-driven style. Most of the famous algorithms that have been published are written in a procedural style, and the languages we know and love (such as C) most directly support the procedural model.

    Many of these famous algorithms run completely within the CPU and memory, without interacting with the outside world (other than possibly reading input at the beginning and printing or writing output at the end). But now suppose the algorithm requires the use of outside entities at various points *during* its execution, such as reading and writing a disk, or sending and receiving network messages (remote procedure calls), or think of an offload FFT processor. If you didn’t have to worry about multi-tasking (letting other tasks use the CPU), or if the external entities were infinitely fast, there would be no problem — we would just wait for the reply and continue on our way, even though we’re many layers deep in function calls and structured statements (IF and WHILE).

    But if we do want to give up the CPU (the real-world case), there are two alternatives: we can use threads (or, for better performance, protothreads), or we can rewrite the algorithm in the event-driven model (asynchronous interfaces, callbacks when events occur). I contend that since (by hypothesis) the algorithm was originally written in the procedural model, it is best to leave it in that model.
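    The dilemma can be sketched in a few lines of C (all names here are hypothetical, invented for illustration): in the threaded style the wait sits in place and the stack keeps the algorithm’s context; in the event-driven rewrite the continuation becomes a callback, and the context must be carried explicitly.

```c
/* Hypothetical sketch of the two styles. */

/* Stands in for the slow external entity (disk, network, FFT offload). */
static int sample_reply(int request) {
    return request + 1;
}

/* Threaded style: the reply is awaited in place; `partial` simply
 * stays on the stack while the thread "blocks". */
static int threaded_style(int input) {
    int partial = input * 2;            /* ...deep in the algorithm...  */
    int reply = sample_reply(partial);  /* the thread waits right here  */
    return partial + reply;             /* context is still on the stack */
}

/* Event-driven rewrite: the rest of the algorithm moves into a
 * callback, and `partial` must be saved in an explicit context. */
typedef struct { int partial; int result; int done; } Ctx;

static void start_request(Ctx *ctx, int input) {
    ctx->partial = input * 2;
    ctx->done = 0;
    /* the request is now "in flight"; the event loop will
     * eventually invoke on_reply() with the answer */
}
static void on_reply(Ctx *ctx, int reply) {
    ctx->result = ctx->partial + reply; /* the continuation, by hand */
    ctx->done = 1;
}
```

    With one wait point the rewrite is tolerable; with waits scattered many layers deep in function calls and loops, every enclosing function has to be split and its locals hoisted into context structures, which is exactly the cost the threaded style avoids.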

    1. Your post apparently contradicts established knowledge of the software community. Here are some excerpts from the book “Object-Oriented Modeling and Design” by James Rumbaugh (Prentice Hall, 1991):

      “Event-driven systems permit more flexible patterns of control than procedure-driven systems.”

      “Use an event-driven system for external control in preference to a procedure-driven system whenever possible because mapping from events to program constructs is much simpler and more powerful. Event-driven systems are also more modular and can handle error conditions better than procedure-driven systems.”

  10. Hi Larry,

    you are right, in the “real world” there are problems that are better (in a pragmatic sense) solved with threads than with event-driven code, and your example shows why.

    Don’t wait for Miro to admit that (he indirectly does; that’s why he’s got a preemptive kernel and classic RTOS support in his QP framework). He likes to keep discussions on a theoretical level, which is why he is quoting Rumbaugh. I fully agree with the content of his quotes, btw; it just doesn’t apply to an existing algorithm that is better used in a threaded fashion rather than rewritten to work event-driven.

  11. Hi Miro,

    I want an automatic tool that could convert nested code of this kind:

    (C / C++)

    if (q1 && q2 || ....)
    	if (q3 ...)
    		if (q4 ...)

    to this type of linear code:

    if (q1 && q2 || ....)
    if ((q1 && q2 || ....) && q3)
    if ((q1 && q2 || ....) && q3 && q4)
    if ((q1 && q2 || ....) && q3 && !q4)
    if ((q1 && q2 || ....) && !q3)

    Do you know of such a tool?

    Thank you


  12. Miro wrote: “so that designers used to the sequential way of thinking can continue to apply it to non-sequential event-driven problems.”

    Great article.
    I couldn’t agree more with Miro. I’ve seen the type of code he describes, with pseudo-sequential behavior and tons of flags around. I instead prefer the form:

      state 1{}
      state 2{}
      state 1{}
      state 2{}

    I find it much cleaner and more flexible, without the need for flags or pseudo-formal ‘mutexes’ for concurrency coordination. This usually takes 1 byte for the state and 1 word to track the state time, per state machine. I don’t know of systems nowadays that can’t afford this.
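    That footprint can be seen in a minimal sketch (the blinker and its timing are invented for illustration): one uint8_t for the state, one 16-bit word for the time-in-state, and a switch dispatched on every tick.

```c
#include <stdint.h>

/* Illustrative sketch: a blinker with 1 byte of state and 1 word of
 * state time, dispatched once per tick -- no flags needed. */
enum { STATE_OFF, STATE_ON };

typedef struct {
    uint8_t  state;       /* 1 byte for the current state    */
    uint16_t state_time;  /* 1 word counting ticks in state  */
    int      led;
} Blinker;

static void blinker_tick(Blinker *b) {
    ++b->state_time;
    switch (b->state) {
    case STATE_OFF:
        if (b->state_time >= 3) {   /* stay off for 3 ticks */
            b->led = 1;
            b->state = STATE_ON;
            b->state_time = 0;
        }
        break;
    case STATE_ON:
        if (b->state_time >= 2) {   /* stay on for 2 ticks */
            b->led = 0;
            b->state = STATE_OFF;
            b->state_time = 0;
        }
        break;
    }
}
```

    The current state is always visible in one variable, which is exactly what makes this style easy to inspect in a debugger.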

    Also, (and this boils down to semantics I guess) threads/states/events are not mutually exclusive. In my experience, a state machine can be considered a thread, and any state-transition is an “event”.

    However, I think some programmers mistakenly focus on the events first (flags) and a formal state second, if at all.

  13. I would like to comment on Adam Dunkels’ original “protothread”. “Under the hood” (on his website) explains the C macro trick which turns PT_WAIT_UNTIL(pt, c) into a “switch” statement.

    That’s exactly like a state machine, with the stored line number of the last blocking point serving as the “state” to switch on.

    May I state that Adam Dunkels’ initial implementation of protothreads is syntactic sugar for a state machine. One can think of it as the “switch” statement being hidden rather than hovering at the top.

    Since “c” can be any C expression, the protothread need not be unresponsive: just put tests of states and events into “c”, and it will jump accordingly.
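    For readers who haven’t seen it, the trick can be re-created in a few lines. This is a minimal sketch in the spirit of Dunkels’ pt.h (the real library differs in details): the “state” is the source line number of the last blocking point, stored in a local-continuation variable and used as a case label inside the hidden switch.

```c
/* Minimal re-creation of the switch-based protothread trick. */
typedef struct { int lc; } pt;   /* lc = "local continuation" */

#define PT_BEGIN(p)         switch ((p)->lc) { case 0:
#define PT_WAIT_UNTIL(p, c) do { (p)->lc = __LINE__; case __LINE__: \
                                 if (!(c)) return 0; } while (0)
#define PT_END(p)           } (p)->lc = 0; return 1

/* Example protothread: advances only when `flag` is set, exactly like
 * a two-state machine whose state is the stored line number. */
static int example(pt *p, int flag, int *step) {
    PT_BEGIN(p);
    *step = 1;                /* runs on the first call            */
    PT_WAIT_UNTIL(p, flag);   /* "blocks" (returns) until flag set */
    *step = 2;                /* runs once the condition holds     */
    PT_END(p);
}
```

    Both occurrences of __LINE__ expand at the invocation line, so the assignment and the case label always agree; each call simply switches to the last blocking point and re-tests the condition there.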
