TABLE 1.1
Pthreads | thread | policy, priority | condition variables and mutexes
A system's scheduling facility may allow each thread to run until it voluntarily yields the processor to another thread ('run until block'). It may provide time-slicing, where each thread is forced to periodically yield so that other threads may run ('round-robin'). It may provide various scheduling policies that allow the application to control how each thread is scheduled according to that thread's function. It may provide a 'class scheduler' where dependencies between threads are described so that, for example, the scheduler can ensure that members of a tightly coupled parallel algorithm are scheduled at the same time.
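In Pthreads, an application requests a policy and priority through thread creation attributes. The fragment below is a minimal sketch assuming a system that supports the optional realtime policies: SCHED_RR asks for round-robin time-slicing, while SCHED_FIFO would ask for run-until-block behavior. On many systems either request may fail or require special privileges, so the error check matters.

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

static void *worker(void *arg)
{
    /* The thread's real work would go here. */
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    struct sched_param param;
    pthread_t thread;
    int status;

    pthread_attr_init(&attr);

    /* Use the attribute's policy and priority instead of inheriting
       the creating thread's scheduling settings. */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);

    /* Request round-robin time-slicing; SCHED_FIFO would give
       run-until-block behavior instead. */
    pthread_attr_setschedpolicy(&attr, SCHED_RR);
    param.sched_priority = sched_get_priority_min(SCHED_RR);
    pthread_attr_setschedparam(&attr, &param);

    status = pthread_create(&thread, &attr, worker, NULL);
    if (status != 0) {
        /* Realtime policies may be unsupported or need privileges. */
        fprintf(stderr, "pthread_create failed: %d\n", status);
        exit(1);
    }
    pthread_join(thread, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}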
Synchronization may be provided using a wide variety of mechanisms. Some of the most common forms are mutexes, condition variables, semaphores, and events. You may also use message passing mechanisms, such as UNIX pipes, sockets, POSIX message queues, or other protocols for communicating between asynchronous processes—on the same system or across a network. Any form of communication protocol contains some form of synchronization, because passing data around with no synchronization results in chaos, not communication.
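To make the mutex and condition variable combination concrete, here is a minimal sketch using standard Pthreads calls; the names producer and data_ready are illustrative only. One thread sets a shared flag while holding a mutex and signals the condition variable; the other waits on the condition variable until it sees the flag set.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static int data_ready = 0;                 /* shared state both threads agree on */

static void *producer(void *arg)
{
    pthread_mutex_lock(&mutex);
    data_ready = 1;                        /* change the shared state ...        */
    pthread_cond_signal(&cond);            /* ... and tell any waiter about it   */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

int main(void)
{
    pthread_t thread;

    pthread_create(&thread, NULL, producer, NULL);

    pthread_mutex_lock(&mutex);
    while (!data_ready)                    /* re-test: wakeups can be spurious   */
        pthread_cond_wait(&cond, &mutex);  /* atomically unlocks, waits, relocks */
    pthread_mutex_unlock(&mutex);

    printf("data is ready\n");
    pthread_join(thread, NULL);
    return 0;
}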
The terms
1.3 Asynchronous programming is intuitive ...
'In most gardens,' the Tiger-lily said, 'they make the beds too soft—so that the flowers are always asleep.' This sounded a very good reason, and Alice was quite pleased to know it. 'I never thought of that before!' she said.
If you haven't been involved in traditional realtime programming, asynchronous programming may seem new and different. But you've probably been using asynchronous programming techniques all along. You've probably used UNIX, for example, and the common UNIX shells from sh to ksh have been designed for asynchronous programming, even at the level of the interactive user. You've also been using asynchronous 'programming' techniques in real life since you were born.
Most people understand asynchronous behavior much more thoroughly than they expect, once they get past the complications of formal and restricted definitions.
1.3.1 ... because UNIX is asynchronous
In any UNIX system, processes execute asynchronously with respect to each other, even when there is only a single processor. Yes, until recently it was difficult to write individual programs for UNIX that behaved asynchronously—but UNIX has always made it fairly easy for you to behave asynchronously. When you type a command to a shell, you are really starting an independent program—if you run the program in the background, it runs asynchronously with the shell. When you pipe the output of one command to another you are starting several independent programs, which synchronize between themselves using the pipe.
| Time is a synchronization mechanism.
In many cases you provide synchronization between a series of processes yourself, maybe without even thinking about it. For example, you run the compiler only after you've finished editing the source files; the time ordering you impose is itself the synchronization.
| UNIX pipes and files can be synchronization mechanisms.
In other cases you may use more complicated software synchronization mechanisms. When you type 'ls|more' to a shell to pass the output of the ls command into the more command, you're describing synchronization by specifying a data dependency. The shell starts both commands right away, but the more command can't generate any output until it receives input from ls through the pipe. Both commands proceed concurrently (or even in parallel on a multiprocessor) with ls supplying data and more processing that data, independently of each other. If the pipe buffer is big enough, ls could complete before more ever started; but more can't ever get ahead of ls.
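The same kind of data dependency can be built directly into a program. The sketch below is only an analogy for ls | more, not what the shell actually does: a child process writes into a pipe and the parent reads from it, so the parent can never get ahead of the child because its read blocks until data arrives.

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    char buffer[64];
    ssize_t n;

    pipe(fds);                         /* fds[0] is the read end, fds[1] the write end */

    if (fork() == 0) {                 /* child: plays the role of ls */
        close(fds[0]);
        sleep(1);                      /* dawdle; the parent has no choice but to wait */
        write(fds[1], "some output\n", 12);
        close(fds[1]);
        _exit(0);
    }

    /* Parent: plays the role of more. The read blocks until data arrives. */
    close(fds[1]);
    while ((n = read(fds[0], buffer, sizeof buffer)) > 0)
        fwrite(buffer, 1, (size_t)n, stdout);
    close(fds[0]);
    wait(NULL);
    return 0;
}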
Some UNIX commands perform synchronization internally. For example, the command 'cc -o thread thread.c' might involve a number of separate processes. The cc command might be a 'front end' to the C language environment, which runs a filter to expand preprocessor commands (like #include and #if), a compiler to translate the program into an intermediate form, an optimizer to reorder the translation, an assembler to translate the intermediate form into object language, and a loader to translate that into an executable binary file. As with ls | more, all these programs may be running at the same time, with synchronization provided by pipes, or by access to temporary files.
UNIX processes can operate asynchronously because each process includes all the information needed to execute code. The operating system can save the state of one process and switch to another without affecting the operation of either. Any general-purpose asynchronous 'entity' needs enough state to enable the operating system to switch arbitrarily between such entities. But a UNIX process includes additional state that is not directly related to 'execution context,' such as an address space and file descriptors.
A thread is the part of a process that's necessary to execute code. On most computers that means each thread has a pointer to the thread's current instruction (often called a 'PC' or 'program counter'), a pointer to the top of the thread's stack (SP), general registers, and floating-point or address registers if they are kept separate. A thread may have other things, such as processor status and coprocessor control registers. A thread does not include most of the rest of the state associated with a process; for example, threads do not have their own file descriptors or address space. All threads within a process share all of the files and memory, including the program text and data segments.
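As a small sketch of that sharing (the names are illustrative), the two threads below run the same function and update one global counter, which they must protect with a mutex precisely because it is shared; each thread's loop index, by contrast, lives on that thread's own stack.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_count = 0;        /* one copy, visible to every thread in the process */

static void *worker(void *arg)
{
    int i;                          /* per-thread: lives on this thread's own stack */

    for (i = 0; i < 100000; i++) {
        pthread_mutex_lock(&mutex); /* shared data still needs synchronization */
        shared_count++;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Both threads updated the same variable: 200000. */
    printf("shared_count = %d\n", shared_count);
    return 0;
}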
| Threads are 'simpler' than processes.
You can think of a thread as a sort of 'stripped down' process, lean and mean and ready to go. The system can switch between two threads within a process much faster than it can switch between processes. A large part of this advantage comes from the fact that threads within a process share the address space—code, data, stack, everything.