Suppose we have three functions A(), B() and C(), where A() calls B() to do something and then calls C() to do some other thing. This calling activity is shown in the following figure. More precisely, when A() calls B(), A() waits until B() returns. When B() is called, it always starts its execution with its first statement. The similar holds true for the call to C().
However, this type of call/return is not always the best way of using routines. Suppose a system runs three threads A, B and C with the round-robin scheduling policy. In other words, the threads run in the order of A, B, C, A, B, C, A, B, C and so on. The following figure shows the scheduling of these threads. The red arrows are thread executions, doted arrows indicate context switching, and different rounds use different colors.
Thus, A runs for a while, and the CPU is switched to B. Then, B runs for a while, and the CPU switched to C. When C finishes its time quantum, the control switches back to A; however, the restarting point is where A was suspended previously. When B and C restart, the restarting points are where B and C were suspended. If we consider A, B and C as three functions calling each other (simulated by switching CPU), the call brings the control to the statement where the caller was suspended. Routines organized in this way are called coroutines. Coroutines have been used to construct finite automata, parsers and, of course, multithreaded systems!
To show you a simple example, let us only consider two functions working as two coroutines. Click here to download a copy of this program.
#include#include int max_iteration, iter; jmp_buf Main, PointA, PointB; void Ping(void); void Pong(void); void main(int argc, char* argv[]) { max_iteration = abs(atoi(argv[1])); iter = 1; if (setjmp(Main) == 0) Ping(); if (setjmp(Main) == 0) Pong(); longjmp(PointA, 1); } void Ping(void) { if (setjmp(PointA) == 0) longjmp(Main, 1); while (1) { printf("%3d : Ping-", iter); if (setjmp(PointA) == 0) longjmp(PointB, 1); } } void Pong(void) { if (setjmp(PointB) == 0) longjmp(Main, 1); while (1) { printf("Pong\n"); iter++; if (iter > max_iteration) exit(0); if (setjmp(PointB) == 0) longjmp(PointA, 1); } }
The logic of this program requires some explanation.
Once the execution enters function Ping(), a jump buffer PointA is setup and then immediately returns the control to the main program. This is shown in the following figure:
Hmmm, things are getting more interesting now.
Function Pong() follows what we just mentioned. That is, the execution falls through and enters the while loop. In this loop, a new state is saved to jump buffer PointB and long jumps back to Ping(). These activities is shown in the following figure.
In a multithreaded system, we may have many functions running as threads, each of which has an environment in the runtime stack. The thread scheduler just uses a technique similar to the above program to schedule those threads that are in the ready queue.
#include#include int max_iteration; jmp_buf Main; jmp_buf PointPing; jmp_buf PointPong; void Ping(void); void Pong(void); void main(int argc, char* argv[]) { if (argc != 2) { printf("Use %s max-#-of-lines\n", argv[0]); exit(1); } max_iteration = abs(atoi(argv[1])); if (setjmp(Main) == 0) Ping(); if (setjmp(Main) == 0) Pong(); longjmp(PointPing, 1); } void Ping(void) { int i; if (setjmp(PointPing) == 0) { i = 1; longjmp(Main, 1); } while (1) { printf("Ping (%2d) - ", i); i++; if (setjmp(PointPing) == 0) longjmp(PointPong, 1); } } void Pong(void) { int j; if (setjmp(PointPong) == 0) { j = 1; longjmp(Main, 1); } while (1) { printf("Pong (%2d)\n", j); j++; if (j > max_iteration) exit(0); if (setjmp(PointPong) == 0) longjmp(PointPing, 1); } }
Local variables i and j are counters and are initialized when Ping() and Pong() are called. The output of this program with max_iteration set to 4 may look like the following:
Ping ( 1) - Pong ( 2) Ping ( 3) - Pong ( 4)
This output is definitively incorrect, because the while loop in function Pong() should iterate four times, printing out Pong ( 1), Pong ( 2), Pong ( 3), and Pong ( 4). But, we have have Pong ( 2) and Pong ( 4) in the above output. (WARNING: your system may generate a different output.) Moreover, the output from Ping() is also wrong. So, what is the problem?
The problem is in stack memory allocation as we have discussed a couple of times. When the main starts execution and before reaches the first setjmp(), the stack has only one stack frame as shown in Figure (b) below. Then, Ping() is called, and the stack has two frames as shown in Figure (a). Function Ping() uses a longjmp() to bring the control back to the main program, effectively restoring the stack back to the point of the first setjmp() being called (Figure (a)). When the main program calls setjmp() to mark a second return point and then calls Pong(), the stack frame for function Pong() will likely occupy the same location which was allocated for Ping()! Because both functions Ping() and Pong() have only one local variable, they will likely share the same location on the stack. More precisely, it is highly likely that i of Ping() and j of Pong() share the same address. In fact, you can print their addresses to verify this. Therefore, when Ping() prints and increases the value of i, Pong() loses its original value of j and prints the next one!
(a) | (b) | (c) |
To overcome this problem, we need to allocate separate stack space for each function that is running as a thread. Please refer to our mini-project to learn how this can be done easily. Unfortunately, this is usually system-dependent. Check our mini-project for a simple implementation of multiple stacks, on for each thread, on Sun Solaris.