If two threads write different 8-bit values within the same 32-bit memory unit, the result may be that the last thread to write the memory unit specifies the value of both bytes, overwriting the value supplied by the first writer. Figure 3.8 shows this effect.
FIGURE 3.8
If a variable crosses the boundary between memory units, which can happen if the machine supports unaligned memory access, the computer may have to send the data in two bus transactions. An unaligned 32-bit value, for example, may be sent by writing the two adjacent 32-bit memory units. If either memory unit involved in the transaction is simultaneously written from another processor, half of the value may be lost. This is called 'word tearing,' and is shown in Figure 3.9.
We have finally returned to the advice at the beginning of this section: If you want to write portable Pthreads code, you will always guarantee correct memory visibility by using the Pthreads memory visibility rules instead of relying on any assumptions regarding the hardware or compiler behavior. But now, at the bottom of the section, you have some understanding of why this is true. For a substantially more in-depth treatment of multiprocessor memory architecture, refer to
FIGURE 3.9
Figure 3.10 shows the same sequence as Figure 3.7, but it uses a mutex to ensure the desired read/write ordering. Figure 3.10 does not show the cache flush steps that are shown in Figure 3.7, because those steps are no longer relevant. Memory visibility is guaranteed by passing mutex ownership in steps t+3 and t+4, through the associated memory barriers. That is, when thread 2 has successfully locked the mutex previously unlocked by thread 1, thread 2 is guaranteed to see memory values 'at least as recent' as the values visible to thread 1 at the time it unlocked the mutex.
Time | Thread 1 | Thread 2 |
t | lock mutex (memory barrier) | |
t+1 | write '1' to address 1 (cache) | |
t+2 | write '2' to address 2 (cache) | |
t+3 | (memory barrier) unlock mutex | |
t+4 | | lock mutex (memory barrier) |
t+5 | | read '1' from address 1 |
t+6 | | read '2' from address 2 |
t+7 | | (memory barrier) unlock mutex |
FIGURE 3.10
4 A few ways to use threads
'They were obliged to have him with them,' the Mock Turtle said.
'No wise fish would go anywhere without a porpoise.'
'Wouldn't it, really?' said Alice, in a tone of great surprise.
'Of course not,' said the Mock Turtle. 'Why, if a fish came to me,
During the introduction to this book, I mentioned some of the ways you can structure a threaded solution to a problem. There are infinite variations, but the primary models of threaded programming are shown in Table 4.1.
Pipeline | Each thread repeatedly performs the same operation on a sequence of data sets, passing each result to another thread for the next step. This is also known as an 'assembly line.' |