Commit | Line | Data |
2605996a |
1 | =head1 NAME |
2 | |
2ad6cdcf |
3 | perlthrtut - Tutorial on threads in Perl |
2605996a |
4 | |
5 | =head1 DESCRIPTION |
6 | |
2ad6cdcf |
7 | This tutorial describes the use of Perl interpreter threads (sometimes |
8 | referred to as I<ithreads>) that were first introduced in Perl 5.6.0. In this |
9 | model, each thread runs in its own Perl interpreter, and any data sharing |
10 | between threads must be explicit. The user-level interface for I<ithreads> |
11 | uses the L<threads> class. |
9316ed2f |
12 | |
47f9f84c |
13 | B<NOTE>: There was another older Perl threading flavor called the 5.005 model |
14 | that used the L<Threads> class. This old model was known to have problems, is |
15 | deprecated, and was removed for release 5.10. You are |
2ad6cdcf |
16 | strongly encouraged to migrate any existing 5.005 threads code to the new |
17 | model as soon as possible. |
2a4bf773 |
18 | |
53d7eaa8 |
19 | You can see which threading flavor (if either) you have by |
6eded8f3 |
20 | running C<perl -V> and looking at the C<Platform> section. |
53d7eaa8 |
21 | If you have C<useithreads=define> you have ithreads; if you |
22 | have C<use5005threads=define> you have 5.005 threads. |
23 | If you have neither, you don't have any thread support built in. |
24 | If you have both, you are in trouble. |
2605996a |
25 | |
2ad6cdcf |
26 | The L<threads> and L<threads::shared> modules are included in the core Perl |
27 | distribution. Additionally, they are maintained as separate modules on |
28 | CPAN, so you can check there for any updates. |
2605996a |
29 | |
c975c451 |
30 | =head1 What Is A Thread Anyway? |
31 | |
32 | A thread is a flow of control through a program with a single |
33 | execution point. |
34 | |
35 | Sounds an awful lot like a process, doesn't it? Well, it should. |
36 | Threads are one of the pieces of a process. Every process has at least |
37 | one thread and, up until now, every process running Perl had only one |
38 | thread. With 5.8, though, you can create extra threads. We're going |
39 | to show you how, when, and why. |
40 | |
41 | =head1 Threaded Program Models |
42 | |
43 | There are three basic ways that you can structure a threaded |
44 | program. Which model you choose depends on what you need your program |
2ad6cdcf |
45 | to do. For many non-trivial threaded programs, you'll need to choose |
c975c451 |
46 | different models for different pieces of your program. |
47 | |
48 | =head2 Boss/Worker |
49 | |
2ad6cdcf |
50 | The boss/worker model usually has one I<boss> thread and one or more |
51 | I<worker> threads. The boss thread gathers or generates tasks that need |
c975c451 |
52 | to be done, then parcels those tasks out to the appropriate worker |
53 | thread. |
54 | |
55 | This model is common in GUI and server programs, where a main thread |
56 | waits for some event and then passes that event to the appropriate |
57 | worker threads for processing. Once the event has been passed on, the |
58 | boss thread goes back to waiting for another event. |
59 | |
60 | The boss thread does relatively little work. While tasks aren't |
61 | necessarily performed faster than with any other method, it tends to |
62 | have the best user-response times. |
63 | |
64 | =head2 Work Crew |
65 | |
66 | In the work crew model, several threads are created that do |
67 | essentially the same thing to different pieces of data. It closely |
68 | mirrors classical parallel processing and vector processors, where a |
69 | large array of processors do the exact same thing to many pieces of |
70 | data. |
71 | |
72 | This model is particularly useful if the system running the program |
73 | will distribute multiple threads across different processors. It can |
74 | also be useful in ray tracing or rendering engines, where the |
75 | individual threads can pass on interim results to give the user visual |
76 | feedback. |
77 | |
78 | =head2 Pipeline |
79 | |
80 | The pipeline model divides up a task into a series of steps, and |
81 | passes the results of one step on to the thread processing the |
82 | next. Each thread does one thing to each piece of data and passes the |
83 | results to the next thread in line. |
84 | |
85 | This model makes the most sense if you have multiple processors so two |
86 | or more threads will be executing in parallel, though it can often |
87 | make sense in other contexts as well. It tends to keep the individual |
88 | tasks small and simple, as well as allowing some parts of the pipeline |
89 | to block (on I/O or system calls, for example) while other parts keep |
90 | going. If you're running different parts of the pipeline on different |
91 | processors you may also take advantage of the caches on each |
92 | processor. |
93 | |
94 | This model is also handy for a form of recursive programming where, |
95 | rather than having a subroutine call itself, it instead creates |
96 | another thread. Prime and Fibonacci generators both map well to this |
97 | form of the pipeline model. (A version of a prime number generator is |
98 | presented later on.) |
99 | |
bfce6503 |
100 | =head1 What kind of threads are Perl threads? |
c975c451 |
101 | |
102 | If you have experience with other thread implementations, you might |
103 | find that things aren't quite what you expect. It's very important to |
2ad6cdcf |
104 | remember when dealing with Perl threads that I<Perl Threads Are Not X |
105 | Threads> for all values of X. They aren't POSIX threads, or |
c975c451 |
106 | DecThreads, or Java's Green threads, or Win32 threads. There are |
107 | similarities, and the broad concepts are the same, but if you start |
108 | looking for implementation details you're going to be either |
109 | disappointed or confused. Possibly both. |
110 | |
111 | This is not to say that Perl threads are completely different from |
2ad6cdcf |
112 | everything that's ever come before -- they're not. Perl's threading |
c975c451 |
113 | model owes a lot to other thread models, especially POSIX. Just as |
114 | Perl is not C, though, Perl threads are not POSIX threads. So if you |
115 | find yourself looking for mutexes, or thread priorities, it's time to |
116 | step back a bit and think about what you want to do and how Perl can |
117 | do it. |
118 | |
2ad6cdcf |
119 | However, it is important to remember that Perl threads cannot magically |
8efd9ba4 |
120 | do things unless your operating system's threads allow it. So if your |
2ad6cdcf |
121 | system blocks the entire process on C<sleep()>, Perl usually will, as well. |
c975c451 |
122 | |
2ad6cdcf |
123 | B<Perl Threads Are Different.> |
9316ed2f |
124 | |
cf5baa48 |
125 | =head1 Thread-Safe Modules |
c975c451 |
126 | |
cf5baa48 |
127 | The addition of threads has changed Perl's internals |
c975c451 |
128 | substantially. There are implications for people who write |
2ad6cdcf |
129 | modules with XS code or external libraries. However, since Perl data is |
cf5baa48 |
130 | not shared among threads by default, Perl modules stand a high chance of |
131 | being thread-safe or can be made thread-safe easily. Modules that are not |
132 | tagged as thread-safe should be tested or code reviewed before being used |
133 | in production code. |
c975c451 |
134 | |
135 | Not all modules that you might use are thread-safe, and you should |
136 | always assume a module is unsafe unless the documentation says |
137 | otherwise. This includes modules that are distributed as part of the |
2ad6cdcf |
138 | core. Threads are a relatively new feature, and even some of the standard |
bfce6503 |
139 | modules aren't thread-safe. |
c975c451 |
140 | |
cf5baa48 |
141 | Even if a module is thread-safe, it doesn't mean that the module is optimized |
6eded8f3 |
142 | to work well with threads. A module could possibly be rewritten to utilize |
143 | the new features in threaded Perl to increase performance in a threaded |
144 | environment. |
c975c451 |
145 | |
146 | If you're using a module that's not thread-safe for some reason, you |
cf5baa48 |
147 | can protect yourself by using it from one, and only one, thread. |
148 | If you need multiple threads to access such a module, you can use semaphores and |
149 | lots of programming discipline to control access to it. Semaphores |
150 | are covered in L</"Basic semaphores">. |
9316ed2f |
151 | |
cf5baa48 |
152 | See also L</"Thread-Safety of System Libraries">. |
c975c451 |
153 | |
154 | =head1 Thread Basics |
155 | |
2ad6cdcf |
156 | The L<threads> module provides the basic functions you need to write |
157 | threaded programs. In the following sections, we'll cover the basics, |
c975c451 |
158 | showing you what you need to do to create a threaded program. After |
159 | that, we'll go over some of the features of the L<threads> module that |
160 | make threaded programming easier. |
161 | |
162 | =head2 Basic Thread Support |
163 | |
2ad6cdcf |
164 | Thread support is a Perl compile-time option -- it's something that's |
c975c451 |
165 | turned on or off when Perl is built at your site, rather than when |
166 | your programs are compiled. If your Perl wasn't compiled with thread |
167 | support enabled, then any attempt to use threads will fail. |
168 | |
c975c451 |
169 | Your programs can use the Config module to check whether threads are |
170 | enabled. If your program can't run without them, you can say something |
171 | like: |
172 | |
2ad6cdcf |
173 | use Config; |
174 | $Config{useithreads} or die('Recompile Perl with threads to run this program.'); |
c975c451 |
175 | |
176 | A possibly-threaded program using a possibly-threaded module might |
177 | have code like this: |
178 | |
cf5baa48 |
179 | use Config; |
180 | use MyMod; |
c975c451 |
181 | |
9316ed2f |
182 | BEGIN { |
cf5baa48 |
183 | if ($Config{useithreads}) { |
184 | # We have threads |
185 | require MyMod_threaded; |
2ad6cdcf |
186 | import MyMod_threaded; |
cf5baa48 |
187 | } else { |
2ad6cdcf |
188 | require MyMod_unthreaded; |
189 | import MyMod_unthreaded; |
9316ed2f |
190 | } |
cf5baa48 |
191 | } |
c975c451 |
192 | |
193 | Since code that runs both with and without threads is usually pretty |
194 | messy, it's best to isolate the thread-specific code in its own |
2ad6cdcf |
195 | module. In our example above, that's what C<MyMod_threaded> is, and it's |
c975c451 |
196 | only imported if we're running on a threaded Perl. |
197 | |
8f95bfb9 |
198 | =head2 A Note about the Examples |
199 | |
8f95bfb9 |
200 | In a real situation, care should be taken that all threads are finished |
201 | executing before the program exits. That care has B<not> been taken in these |
2ad6cdcf |
202 | examples in the interest of simplicity. Running these examples I<as is> will |
8f95bfb9 |
203 | produce error messages, usually caused by the fact that there are still |
204 | threads running when the program exits. You should not be alarmed by this. |
8f95bfb9 |
205 | |
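One way to take that care is to join every thread that is still running before the main program ends. The sketch below uses C<create()> and C<join()>, both described in the sections that follow. The C<work()> subroutine and the number of threads are illustrative assumptions, not part of the examples in this tutorial:

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: doubles its argument.
sub work {
    my ($n) = @_;
    return $n * 2;
}

# map evaluates create() in list context, so each thread may return a list.
my @threads = map { threads->create(\&work, $_) } 1 .. 3;

# threads->list() returns every thread that is neither joined nor detached.
my @results;
for my $thr (threads->list()) {
    push(@results, $thr->join());
}

# Sort the values because the order in which threads are listed is not relied on here.
print('Collected: ', join(', ', sort { $a <=> $b } @results), "\n");
```

Once the loop finishes, no thread is left running, so the program can exit without the warnings mentioned above.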
c975c451 |
206 | =head2 Creating Threads |
207 | |
2ad6cdcf |
208 | The L<threads> module provides the tools you need to create new |
9e75ef81 |
209 | threads. Like any other module, you need to tell Perl that you want to use |
2ad6cdcf |
210 | it; C<use threads;> imports all the pieces you need to create basic |
c975c451 |
211 | threads. |
212 | |
2ad6cdcf |
213 | The simplest, most straightforward way to create a thread is with C<create()>: |
c975c451 |
214 | |
0b390a82 |
215 | use threads; |
c975c451 |
216 | |
2ad6cdcf |
217 | my $thr = threads->create(\&sub1); |
c975c451 |
218 | |
0b390a82 |
219 | sub sub1 { |
2ad6cdcf |
220 | print("In the thread\n"); |
c975c451 |
221 | } |
222 | |
2ad6cdcf |
223 | The C<create()> method takes a reference to a subroutine and creates a new |
224 | thread that starts executing in the referenced subroutine. Control |
c975c451 |
225 | then passes both to the subroutine and the caller. |
226 | |
227 | If you need to, your program can pass parameters to the subroutine as |
228 | part of the thread startup. Just include the list of parameters as |
2ad6cdcf |
229 | part of the C<threads-E<gt>create()> call, like this: |
c975c451 |
230 | |
0b390a82 |
231 | use threads; |
bfce6503 |
232 | |
2ad6cdcf |
233 | my $Param3 = 'foo'; |
234 | my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3); |
235 | my @ParamList = (42, 'Hello', 3.14); |
236 | my $thr2 = threads->create(\&sub1, @ParamList); |
237 | my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3)); |
c975c451 |
238 | |
0b390a82 |
239 | sub sub1 { |
240 | my @InboundParameters = @_; |
2ad6cdcf |
241 | print("In the thread\n"); |
242 | print('Got parameters >', join('<>', @InboundParameters), "<\n"); |
c975c451 |
243 | } |
244 | |
c975c451 |
245 | The last example illustrates another feature of threads. You can spawn |
246 | off several threads using the same subroutine. Each thread executes |
247 | the same subroutine, but in a separate thread with a separate |
248 | environment and potentially separate arguments. |
249 | |
2ad6cdcf |
250 | C<new()> is a synonym for C<create()>. |
bfce6503 |
251 | |
c975c451 |
252 | =head2 Waiting For A Thread To Exit |
253 | |
254 | Since threads are also subroutines, they can return values. To wait |
6eded8f3 |
255 | for a thread to exit and extract any values it might return, you can |
2ad6cdcf |
256 | use the C<join()> method: |
c975c451 |
257 | |
0b390a82 |
258 | use threads; |
bfce6503 |
259 | |
2ad6cdcf |
260 | my ($thr) = threads->create(\&sub1); |
c975c451 |
261 | |
2ad6cdcf |
262 | my @ReturnData = $thr->join(); |
263 | print('Thread returned ', join(', ', @ReturnData), "\n"); |
c975c451 |
264 | |
2ad6cdcf |
265 | sub sub1 { return ('Fifty-six', 'foo', 2); } |
c975c451 |
266 | |
2ad6cdcf |
267 | In the example above, the C<join()> method returns as soon as the thread |
c975c451 |
268 | ends. In addition to waiting for a thread to finish and gathering up |
2ad6cdcf |
269 | any values that the thread might have returned, C<join()> also performs |
c975c451 |
270 | any OS cleanup necessary for the thread. That cleanup might be |
271 | important, especially for long-running programs that spawn lots of |
272 | threads. If you don't want the return values and don't want to wait |
2ad6cdcf |
273 | for the thread to finish, you should call the C<detach()> method |
bfce6503 |
274 | instead, as described next. |
c975c451 |
275 | |
2ad6cdcf |
276 | NOTE: In the example above, the thread returns a list, thus necessitating |
277 | that the thread creation call be made in list context (i.e., C<my ($thr)>). |
278 | See L<threads/"$thr-E<gt>join()"> and L<threads/"THREAD CONTEXT"> for more |
279 | details on thread context and return values. |
280 | |
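The effect of the creation context can be seen in a small sketch. The subroutine name C<report_context()> is a hypothetical helper chosen for illustration. It relies on C<wantarray()> reflecting the context in which C<create()> was called:

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: reports the context the thread was created in.
sub report_context {
    return wantarray ? ('list', 'context') : 'scalar context';
}

my ($thr_list) = threads->create(\&report_context);   # list context
my $thr_scalar = threads->create(\&report_context);   # scalar context

my @from_list   = $thr_list->join();
my $from_scalar = $thr_scalar->join();

print("List thread returned: @from_list\n");
print("Scalar thread returned: $from_scalar\n");
```

A thread created in scalar context can hand back only a single value through C<join()>, which is why the earlier example wraps C<$thr> in parentheses.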
c975c451 |
281 | =head2 Ignoring A Thread |
282 | |
2ad6cdcf |
283 | C<join()> does three things: it waits for a thread to exit, cleans up |
c975c451 |
284 | after it, and returns any data the thread may have produced. But what |
285 | if you're not interested in the thread's return values, and you don't |
286 | really care when the thread finishes? All you want is for the thread |
287 | to get cleaned up when it's done. |
288 | |
2ad6cdcf |
289 | In this case, you use the C<detach()> method. Once a thread is detached, |
290 | it'll run until it's finished; then Perl will clean up after it |
c975c451 |
291 | automatically. |
292 | |
0b390a82 |
293 | use threads; |
bfce6503 |
294 | |
2ad6cdcf |
295 | my $thr = threads->create(\&sub1); # Spawn the thread |
296 | |
297 | $thr->detach(); # Now we officially don't care any more |
c975c451 |
298 | |
2ad6cdcf |
299 | sleep(15); # Let thread run for a while |
c975c451 |
300 | |
cf5baa48 |
301 | sub sub1 { |
0b390a82 |
302 | $a = 0; |
303 | while (1) { |
304 | $a++; |
2ad6cdcf |
305 | print("\$a is $a\n"); |
306 | sleep(1); |
0b390a82 |
307 | } |
c975c451 |
308 | } |
309 | |
bfce6503 |
310 | Once a thread is detached, it may not be joined, and any return data |
311 | that it might have produced (if it was done and waiting for a join) is |
c975c451 |
312 | lost. |
313 | |
2ad6cdcf |
314 | C<detach()> can also be called as a class method to allow a thread to |
315 | detach itself: |
316 | |
317 | use threads; |
318 | |
319 | my $thr = threads->create(\&sub1); |
320 | |
321 | sub sub1 { |
322 | threads->detach(); |
323 | # Do more work |
324 | } |
325 | |
c975c451 |
326 | =head1 Threads And Data |
327 | |
328 | Now that we've covered the basics of threads, it's time for our next |
2ad6cdcf |
329 | topic: Data. Threading introduces a couple of complications to data |
c975c451 |
330 | access that non-threaded programs never need to worry about. |
331 | |
332 | =head2 Shared And Unshared Data |
333 | |
2ad6cdcf |
334 | The biggest difference between Perl I<ithreads> and the old 5.005 style |
bfce6503 |
335 | threading, or, for that matter, most other threading systems out there, |
2ad6cdcf |
336 | is that by default, no data is shared. When a new Perl thread is created, |
bfce6503 |
337 | all the data associated with the current thread is copied to the new |
338 | thread, and is subsequently private to that new thread! |
339 | This is similar in feel to what happens when a UNIX process forks, |
340 | except that in this case, the data is just copied to a different part of |
341 | memory within the same process rather than a real fork taking place. |
c975c451 |
342 | |
2ad6cdcf |
343 | To make use of threading, however, one usually wants the threads to share |
bfce6503 |
344 | at least some data between themselves. This is done with the |
2ad6cdcf |
345 | L<threads::shared> module and the C<:shared> attribute: |
bfce6503 |
346 | |
347 | use threads; |
348 | use threads::shared; |
349 | |
2ad6cdcf |
350 | my $foo :shared = 1; |
bfce6503 |
351 | my $bar = 1; |
2ad6cdcf |
352 | threads->create(sub { $foo++; $bar++; })->join(); |
818c4caa |
353 | |
2ad6cdcf |
354 | print("$foo\n"); # Prints 2 since $foo is shared |
355 | print("$bar\n"); # Prints 1 since $bar is not shared |
bfce6503 |
356 | |
357 | In the case of a shared array, all the array's elements are shared, and for |
358 | a shared hash, all the keys and values are shared. This places |
359 | restrictions on what may be assigned to shared array and hash elements: only |
360 | simple values or references to shared variables are allowed - this is |
f3278b06 |
361 | so that a private variable can't accidentally become shared. A bad |
bfce6503 |
362 | assignment will cause the thread to die. For example: |
363 | |
364 | use threads; |
365 | use threads::shared; |
366 | |
2ad6cdcf |
367 | my $var = 1; |
368 | my $svar :shared = 2; |
369 | my %hash :shared; |
bfce6503 |
370 | |
371 | # ... create some threads ... |
372 | |
2ad6cdcf |
373 | $hash{a} = 1; # All threads see exists($hash{a}) and $hash{a} == 1 |
374 | $hash{a} = $var; # okay - copy-by-value: same effect as previous |
375 | $hash{a} = $svar; # okay - copy-by-value: same effect as previous |
376 | $hash{a} = \$svar; # okay - a reference to a shared variable |
377 | $hash{a} = \$var; # This will die |
378 | delete($hash{a}); # okay - all threads will see !exists($hash{a}) |
bfce6503 |
379 | |
380 | Note that a shared variable guarantees that if two or more threads try to |
381 | modify it at the same time, the internal state of the variable will not |
382 | become corrupted. However, there are no guarantees beyond this, as |
383 | explained in the next section. |
c975c451 |
384 | |
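The restriction on references can be worked around by making a shared copy of the data first. The following is a minimal sketch, assuming a version of L<threads::shared> recent enough to export C<shared_clone()>. The C<%config> hash and its keys are made up for illustration:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %config :shared;

# shared_clone() returns a shared copy of the anonymous array,
# so the reference may be stored in a shared hash.
$config{hosts} = shared_clone([ 'alpha', 'beta' ]);

# A change made to the nested array in another thread is visible here.
threads->create(sub { push(@{ $config{hosts} }, 'gamma'); })->join();
print(scalar(@{ $config{hosts} }), " hosts\n");

# Storing a reference to unshared data dies, so trap it with eval.
my $ok = eval { $config{bad} = [ 1, 2 ]; 1 };
print("Unshared reference rejected\n") if !$ok;
```

Copying the data costs time and memory, so for large structures it may be better to declare them shared from the start.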
6eded8f3 |
385 | =head2 Thread Pitfalls: Races |
c975c451 |
386 | |
387 | While threads bring a new set of useful tools, they also bring a |
388 | number of pitfalls. One pitfall is the race condition: |
389 | |
0b390a82 |
390 | use threads; |
c975c451 |
391 | use threads::shared; |
bfce6503 |
392 | |
2ad6cdcf |
393 | my $a :shared = 1; |
394 | my $thr1 = threads->create(\&sub1); |
395 | my $thr2 = threads->create(\&sub2); |
c975c451 |
396 | |
397 | $thr1->join; |
398 | $thr2->join; |
2ad6cdcf |
399 | print("$a\n"); |
c975c451 |
400 | |
bfce6503 |
401 | sub sub1 { my $foo = $a; $a = $foo + 1; } |
402 | sub sub2 { my $bar = $a; $a = $bar + 1; } |
c975c451 |
403 | |
2ad6cdcf |
404 | What do you think C<$a> will be? The answer, unfortunately, is I<it |
405 | depends>. Both C<sub1()> and C<sub2()> access the global variable C<$a>, once |
c975c451 |
406 | to read and once to write. Depending on factors ranging from your |
407 | thread implementation's scheduling algorithm to the phase of the moon, |
2ad6cdcf |
408 | C<$a> can be 2 or 3. |
c975c451 |
409 | |
410 | Race conditions are caused by unsynchronized access to shared |
411 | data. Without explicit synchronization, there's no way to be sure that |
412 | nothing has happened to the shared data between the time you access it |
413 | and the time you update it. Even this simple code fragment has the |
414 | possibility of error: |
415 | |
0b390a82 |
416 | use threads; use threads::shared; |
2ad6cdcf |
417 | my $a :shared = 2; |
418 | my $b :shared; |
419 | my $c :shared; |
0b390a82 |
420 | my $thr1 = threads->create(sub { $b = $a; $a = $b + 1; }); |
c975c451 |
421 | my $thr2 = threads->create(sub { $c = $a; $a = $c + 1; }); |
8f95bfb9 |
422 | $thr1->join; |
423 | $thr2->join; |
c975c451 |
424 | |
2ad6cdcf |
425 | Two threads both access C<$a>. Each thread can potentially be interrupted |
426 | at any point, or be executed in any order. At the end, C<$a> could be 3 |
427 | or 4, and both C<$b> and C<$c> could be 2 or 3. |
c975c451 |
428 | |
bfce6503 |
429 | Even C<$a += 5> or C<$a++> are not guaranteed to be atomic. |
430 | |
c975c451 |
431 | Whenever your program accesses data or resources that can be accessed |
432 | by other threads, you must take steps to coordinate access or risk |
bfce6503 |
433 | data inconsistency and race conditions. Note that Perl will protect its |
434 | internals from your race conditions, but it won't protect you from you. |
435 | |
f3278b06 |
436 | =head1 Synchronization and control |
bfce6503 |
437 | |
438 | Perl provides a number of mechanisms to coordinate the interactions |
439 | between threads and their data, to avoid race conditions and the like. |
440 | Some of these are designed to resemble the common techniques used in thread |
441 | libraries such as C<pthreads>; others are Perl-specific. Often, the |
9e75ef81 |
442 | standard techniques are clumsy and difficult to get right (such as |
bfce6503 |
443 | condition waits). Where possible, it is usually easier to use Perlish |
444 | techniques such as queues, which remove some of the hard work involved. |
c975c451 |
445 | |
446 | =head2 Controlling access: lock() |
447 | |
2ad6cdcf |
448 | The C<lock()> function takes a shared variable and puts a lock on it. |
a6d05634 |
449 | No other thread may lock the variable until the variable is unlocked |
bfce6503 |
450 | by the thread holding the lock. Unlocking happens automatically |
0b390a82 |
451 | when the locking thread exits the block that contains the call to the |
2ad6cdcf |
452 | C<lock()> function. Using C<lock()> is straightforward: This example has |
f3278b06 |
453 | several threads doing some calculations in parallel, and occasionally |
bfce6503 |
454 | updating a running total: |
455 | |
456 | use threads; |
457 | use threads::shared; |
458 | |
2ad6cdcf |
459 | my $total :shared = 0; |
bfce6503 |
460 | |
461 | sub calc { |
2ad6cdcf |
462 | while (1) { |
463 | my $result; |
464 | # (... do some calculations and set $result ...) |
465 | { |
466 | lock($total); # Block until we obtain the lock |
467 | $total += $result; |
468 | } # Lock implicitly released at end of scope |
469 | last if $result == 0; |
470 | } |
bfce6503 |
471 | } |
472 | |
2ad6cdcf |
473 | my $thr1 = threads->create(\&calc); |
474 | my $thr2 = threads->create(\&calc); |
475 | my $thr3 = threads->create(\&calc); |
476 | $thr1->join(); |
477 | $thr2->join(); |
478 | $thr3->join(); |
479 | print("total=$total\n"); |
c975c451 |
480 | |
2ad6cdcf |
481 | C<lock()> blocks the thread until the variable being locked is |
482 | available. When C<lock()> returns, your thread can be sure that no other |
0b390a82 |
483 | thread can lock that variable until the block containing the |
c975c451 |
484 | lock exits. |
485 | |
486 | It's important to note that locks don't prevent access to the variable |
487 | in question, only lock attempts. This is in keeping with Perl's |
488 | longstanding tradition of courteous programming, and the advisory file |
2ad6cdcf |
489 | locking that C<flock()> gives you. |
c975c451 |
490 | |
491 | You may lock arrays and hashes as well as scalars. Locking an array, |
492 | though, will not block subsequent locks on array elements, just lock |
493 | attempts on the array itself. |
494 | |
bfce6503 |
495 | Locks are recursive, which means it's okay for a thread to |
c975c451 |
496 | lock a variable more than once. The lock will last until the outermost |
2ad6cdcf |
497 | C<lock()> on the variable goes out of scope. For example: |
bfce6503 |
498 | |
2ad6cdcf |
499 | my $x :shared; |
bfce6503 |
500 | doit(); |
501 | |
502 | sub doit { |
2ad6cdcf |
503 | { |
504 | { |
505 | lock($x); # Wait for lock |
506 | lock($x); # NOOP - we already have the lock |
507 | { |
508 | lock($x); # NOOP |
509 | { |
510 | lock($x); # NOOP |
511 | lockit_some_more(); |
512 | } |
513 | } |
514 | } # *** Implicit unlock here *** |
515 | } |
bfce6503 |
516 | } |
517 | |
518 | sub lockit_some_more { |
2ad6cdcf |
519 | lock($x); # NOOP |
520 | } # Nothing happens here |
bfce6503 |
521 | |
2ad6cdcf |
522 | Note that there is no C<unlock()> function - the only way to unlock a |
0b390a82 |
523 | variable is to allow the lock to go out of scope. |
bfce6503 |
524 | |
525 | A lock can either be used to guard the data contained within the variable |
526 | being locked, or it can be used to guard something else, like a section |
527 | of code. In this latter case, the variable in question does not hold any |
528 | useful data, and exists only for the purpose of being locked. In this |
529 | respect, the variable behaves like the mutexes and basic semaphores of |
530 | traditional thread libraries. |
c975c451 |
531 | |
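As a sketch of the second use, the variable C<$section_lock> below holds no data and exists only to serialize a section of code. All names here are illustrative assumptions. Each thread writes a pair of log entries, and the lock keeps every pair together:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $section_lock :shared;   # holds no data; exists only to be locked
my @log :shared;

sub record {
    my ($id) = @_;
    lock($section_lock);     # one thread at a time past this point
    push(@log, "begin $id");
    threads->yield();        # invite a context switch while holding the lock
    push(@log, "end $id");
}                            # lock released when record() returns

my @threads = map { threads->create(\&record, $_) } 1 .. 4;
$_->join() for @threads;
print("$_\n") for @log;
```

Without the C<lock()> call, the entries of different threads could interleave, although each individual C<push()> would still leave the shared array intact.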
bfce6503 |
532 | =head2 A Thread Pitfall: Deadlocks |
c975c451 |
533 | |
bfce6503 |
534 | Locks are a handy tool to synchronize access to data, and using them |
c975c451 |
535 | properly is the key to safe shared data. Unfortunately, locks aren't |
f3278b06 |
536 | without their dangers, especially when multiple locks are involved. |
bfce6503 |
537 | Consider the following code: |
c975c451 |
538 | |
0b390a82 |
539 | use threads; use threads::shared; |
540 | |
2ad6cdcf |
541 | my $a :shared = 4; |
542 | my $b :shared = 'foo'; |
543 | my $thr1 = threads->create(sub { |
0b390a82 |
544 | lock($a); |
2ad6cdcf |
545 | sleep(20); |
0b390a82 |
546 | lock($b); |
547 | }); |
2ad6cdcf |
548 | my $thr2 = threads->create(sub { |
0b390a82 |
549 | lock($b); |
2ad6cdcf |
550 | sleep(20); |
0b390a82 |
551 | lock($a); |
c975c451 |
552 | }); |
553 | |
554 | This program will probably hang until you kill it. The only way it |
bfce6503 |
555 | won't hang is if one of the two threads acquires both locks |
c975c451 |
556 | first. A guaranteed-to-hang version is more complicated, but the |
557 | principle is the same. |
558 | |
2ad6cdcf |
559 | The first thread will grab a lock on C<$a>, then, after a pause during which |
bfce6503 |
560 | the second thread has probably had time to do some work, try to grab a |
2ad6cdcf |
561 | lock on C<$b>. Meanwhile, the second thread grabs a lock on C<$b>, then later |
562 | tries to grab a lock on C<$a>. The second lock attempt for both threads will |
bfce6503 |
563 | block, each waiting for the other to release its lock. |
c975c451 |
564 | |
565 | This condition is called a deadlock, and it occurs whenever two or |
566 | more threads are trying to get locks on resources that the others |
567 | own. Each thread will block, waiting for the other to release a lock |
568 | on a resource. That never happens, though, since the thread with the |
569 | resource is itself waiting for a lock to be released. |
570 | |
571 | There are a number of ways to handle this sort of problem. The best |
572 | way is to always have all threads acquire locks in the exact same |
2ad6cdcf |
573 | order. If, for example, you lock variables C<$a>, C<$b>, and C<$c>, always lock |
574 | C<$a> before C<$b>, and C<$b> before C<$c>. It's also best to hold on to locks for |
c975c451 |
575 | as short a period of time as possible to minimize the risks of deadlock. |
576 | |
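The ordering rule can be sketched as follows. The account variables and the C<move_funds()> subroutine are hypothetical. The point is only that both threads take the two locks in the same order, whichever direction the money moves:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $acct1 :shared = 1000;
my $acct2 :shared = 1000;

# A positive amount moves funds from $acct1 to $acct2,
# a negative amount moves them the other way.
sub move_funds {
    my ($amount) = @_;
    lock($acct1);            # always locked first
    lock($acct2);            # always locked second
    $acct1 -= $amount;
    $acct2 += $amount;
}                            # both locks released here

my $thr1 = threads->create(sub { move_funds(10) for 1 .. 100; });
my $thr2 = threads->create(sub { move_funds(-5) for 1 .. 100; });
$thr1->join();
$thr2->join();

print("acct1=$acct1 acct2=$acct2\n");   # acct1=500 acct2=1500
```

Because neither thread ever holds C<$acct2> while waiting for C<$acct1>, the circular wait needed for a deadlock cannot form.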
48b96218 |
577 | The other synchronization primitives described below can suffer from |
bfce6503 |
578 | similar problems. |
579 | |
c975c451 |
580 | =head2 Queues: Passing Data Around |
581 | |
582 | A queue is a special thread-safe object that lets you put data in one |
583 | end and take it out the other without having to worry about |
584 | synchronization issues. Queues are pretty straightforward, and look like |
585 | this: |
586 | |
0b390a82 |
587 | use threads; |
83272a45 |
588 | use Thread::Queue; |
c975c451 |
589 | |
2ad6cdcf |
590 | my $DataQueue = Thread::Queue->new(); |
591 | my $thr = threads->create(sub { |
592 | while (defined(my $DataElement = $DataQueue->dequeue())) { |
593 | print("Popped $DataElement off the queue\n"); |
0b390a82 |
594 | } |
595 | }); |
c975c451 |
596 | |
0b390a82 |
597 | $DataQueue->enqueue(12); |
598 | $DataQueue->enqueue("A", "B", "C"); |
599 | $DataQueue->enqueue(\$thr); |
2ad6cdcf |
600 | sleep(10); |
c975c451 |
601 | $DataQueue->enqueue(undef); |
2ad6cdcf |
602 | $thr->join(); |
c975c451 |
603 | |
2ad6cdcf |
604 | You create the queue with C<Thread::Queue-E<gt>new()>. Then you can |
605 | add lists of scalars onto the end with C<enqueue()>, and pop scalars off |
606 | the front of it with C<dequeue()>. A queue has no fixed size, and can grow |
6eded8f3 |
607 | as needed to hold everything pushed on to it. |
c975c451 |
608 | |
2ad6cdcf |
609 | If a queue is empty, C<dequeue()> blocks until another thread enqueues |
c975c451 |
610 | something. This makes queues ideal for event loops and other |
611 | communications between threads. |
612 | |
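A queue also fits the boss/worker model described earlier. In the sketch below, the main thread acts as the boss and three workers pull numbers from a shared queue. The names, the number of workers, and the use of one C<undef> per worker as a stop signal are all assumptions made for illustration:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $jobs = Thread::Queue->new();
my $total :shared = 0;

# Each worker squares the numbers it dequeues and adds them to $total.
sub worker {
    # Test with defined() so that a job value of 0 does not end the loop.
    while (defined(my $n = $jobs->dequeue())) {
        lock($total);
        $total += $n * $n;
    }
}

my @workers = map { threads->create(\&worker) } 1 .. 3;

$jobs->enqueue(1 .. 10);                 # the boss hands out the work
$jobs->enqueue(undef) for @workers;      # one stop signal per worker
$_->join() for @workers;

print("Sum of squares: $total\n");       # Sum of squares: 385
```

The queue takes care of handing each job to exactly one worker. The shared total still needs its own C<lock()>, because the queue protects only the data passing through it.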
c975c451 |
613 | =head2 Semaphores: Synchronizing Data Access |
614 | |
bfce6503 |
615 | Semaphores are a kind of generic locking mechanism. In their most basic |
fa11829f |
616 | form, they behave very much like lockable scalars, except that they |
bfce6503 |
617 | can't hold data, and that they must be explicitly unlocked. In their |
618 | advanced form, they act like a kind of counter, and can allow multiple |
2ad6cdcf |
619 | threads to have the I<lock> at any one time. |
2605996a |
620 | |
bfce6503 |
621 | =head2 Basic semaphores |
2605996a |
622 | |
2ad6cdcf |
623 | Semaphores have two methods, C<down()> and C<up()>: C<down()> decrements the resource |
8efd9ba4 |
624 | count, while C<up()> increments it. Calls to C<down()> will block if the |
c975c451 |
625 | semaphore's current count would decrement below zero. This program |
626 | gives a quick demonstration: |
627 | |
536bca94 |
628 | use threads; |
0b390a82 |
629 | use Thread::Semaphore; |
bfce6503 |
630 | |
2ad6cdcf |
631 | my $semaphore = Thread::Semaphore->new(); |
632 | my $GlobalVariable :shared = 0; |
2605996a |
633 | |
2ad6cdcf |
634 | my $thr1 = threads->create(\&sample_sub, 1); |
635 | my $thr2 = threads->create(\&sample_sub, 2); |
636 | my $thr3 = threads->create(\&sample_sub, 3); |
2605996a |
637 | |
0b390a82 |
638 | sub sample_sub { |
2ad6cdcf |
639 | my $SubNumber = shift(@_); |
0b390a82 |
640 | my $TryCount = 10; |
641 | my $LocalCopy; |
2ad6cdcf |
642 | sleep(1); |
0b390a82 |
643 | while ($TryCount--) { |
2ad6cdcf |
644 | $semaphore->down(); |
0b390a82 |
645 | $LocalCopy = $GlobalVariable; |
2ad6cdcf |
646 | print("$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n"); |
647 | sleep(2); |
0b390a82 |
648 | $LocalCopy++; |
649 | $GlobalVariable = $LocalCopy; |
2ad6cdcf |
650 | $semaphore->up(); |
0b390a82 |
651 | } |
c975c451 |
652 | } |
6eded8f3 |
653 | |
2ad6cdcf |
654 | $thr1->join(); |
655 | $thr2->join(); |
656 | $thr3->join(); |
2605996a |
657 | |
c975c451 |
658 | The three invocations of the subroutine all operate in sync. The |
659 | semaphore, though, makes sure that only one thread is accessing the |
660 | global variable at once. |
2605996a |
661 | |
bfce6503 |
662 | =head2 Advanced Semaphores |
2605996a |
663 | |
c975c451 |
664 | By default, semaphores behave like locks, letting only one thread |
2ad6cdcf |
665 | C<down()> them at a time. However, there are other uses for semaphores. |
2605996a |
666 | |
6eded8f3 |
667 | Each semaphore has a counter attached to it. By default, semaphores are |
2ad6cdcf |
668 | created with the counter set to one, C<down()> decrements the counter by |
669 | one, and C<up()> increments by one. However, we can override any or all |
6eded8f3 |
670 | of these defaults simply by passing in different values: |
671 | |
672 |     use threads; |
83272a45 |
673 |     use Thread::Semaphore; |
2ad6cdcf |
674 | |
83272a45 |
675 |     my $semaphore = Thread::Semaphore->new(5); |
6eded8f3 |
676 |     # Creates a semaphore with the counter set to five |
677 | |
2ad6cdcf |
678 |     my $thr1 = threads->create(\&sub1); |
679 |     my $thr2 = threads->create(\&sub1); |
6eded8f3 |
680 | |
681 |     sub sub1 { |
682 |         $semaphore->down(5); # Decrements the counter by five |
683 |         # Do stuff here |
684 |         $semaphore->up(5);   # Increments the counter by five |
685 |     } |
686 | |
2ad6cdcf |
687 |     $thr1->detach(); |
688 |     $thr2->detach(); |
6eded8f3 |
689 | |
2ad6cdcf |
690 | If C<down()> attempts to decrement the counter below zero, it blocks until |
6eded8f3 |
691 | the counter is large enough. Note that while a semaphore can be created |
2ad6cdcf |
692 | with a starting count of zero, any C<up()> or C<down()> always changes the |
8efd9ba4 |
693 | counter by at least one, and so C<< $semaphore->down(0) >> is the same as |
694 | C<< $semaphore->down(1) >>. |
2605996a |
695 | |
c975c451 |
696 | The question, of course, is why would you do something like this? Why |
697 | create a semaphore with a starting count that's not one, or why |
c3e59998 |
698 | decrement or increment it by more than one? The answer is resource |
c975c451 |
699 | availability. Many resources that you want to manage access for can be |
700 | safely used by more than one thread at once. |
2605996a |
701 | |
c975c451 |
702 | For example, let's take a GUI driven program. It has a semaphore that |
703 | it uses to synchronize access to the display, so only one thread is |
704 | ever drawing at once. Handy, but of course you don't want any thread |
705 | to start drawing until things are properly set up. In this case, you |
706 | can create a semaphore with a counter set to zero, and up it when |
707 | things are ready for drawing. |
2605996a |
708 | |
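The start-up gate described above can be sketched as follows. This is a
minimal illustration rather than a real GUI program: the C<draw()>
subroutine, the C<$ready> semaphore and the shared C<@log> array are names
made up for this example, and pushing onto C<@log> stands in for actual
set-up and drawing work.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

# Counter starts at zero, so the first down() blocks until someone calls up().
my $ready = Thread::Semaphore->new(0);
my @log :shared;                 # Hypothetical record of what happened

sub draw {
    $ready->down();              # Wait until set-up has finished
    {
        lock(@log);
        push(@log, 'draw');      # Stand-in for real drawing
    }
    $ready->up();                # Let the next drawing thread through
}

my $thr = threads->create(\&draw);

{
    lock(@log);
    push(@log, 'setup');         # Stand-in for real initialization
}
$ready->up();                    # Open the gate
$thr->join();

print(join(' -> ', @log), "\n");
```

Because the counter starts at zero, the child's C<down()> cannot return
before the main thread calls C<up()>, so C<setup> is always logged before
C<draw>.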
c975c451 |
709 | Semaphores with counters greater than one are also useful for |
710 | establishing quotas. Say, for example, that you have a number of |
711 | threads that can do I/O at once. You don't want all the threads |
712 | reading or writing at once though, since that can potentially swamp |
713 | your I/O channels, or deplete your process's quota of filehandles. You |
714 | can use a semaphore initialized to the number of concurrent I/O |
715 | requests (or open files) that you want at any one time, and have your |
716 | threads quietly block and unblock themselves. |
2605996a |
717 | |
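As a rough sketch of such a quota, the following program lets at most two
of six threads into a section at once. The limit C<$MAX_ACTIVE>, the
C<worker()> subroutine and the C<$active> and C<$peak> counters are
assumptions made for this example; C<threads-E<gt>yield()> stands in for
real I/O.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $MAX_ACTIVE = 2;                       # Hypothetical quota
my $slots      = Thread::Semaphore->new($MAX_ACTIVE);
my $active :shared = 0;                   # Threads currently in the section
my $peak   :shared = 0;                   # Highest value $active reached

sub worker {
    $slots->down();                       # Take one slot, or block
    {
        lock($active);                    # Guards both $active and $peak
        $active++;
        $peak = $active if $active > $peak;
    }
    threads->yield();                     # Stand-in for real I/O
    {
        lock($active);
        $active--;
    }
    $slots->up();                         # Give the slot back
}

my @workers = map { threads->create(\&worker) } 1 .. 6;
$_->join() for @workers;

print("Peak concurrency was $peak (limit $MAX_ACTIVE)\n");
```

The recorded peak should never exceed the semaphore's starting count,
however the threads happen to be scheduled.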
c975c451 |
718 | Larger increments or decrements are handy in those cases where a |
719 | thread needs to check out or return a number of resources at once. |
2605996a |
720 | |
8efd9ba4 |
721 | =head2 Waiting for a Condition |
bfce6503 |
722 | |
8efd9ba4 |
723 | The functions C<cond_wait()> and C<cond_signal()> |
724 | can be used in conjunction with locks to notify |
bfce6503 |
725 | co-operating threads that a resource has become available. They are |
726 | very similar in use to the functions found in C<pthreads>. However, |
727 | for most purposes, queues are simpler to use and more intuitive. See |
728 | L<threads::shared> for more details. |
2605996a |
729 | |
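For comparison, here is a minimal sketch of one thread waiting for another
to announce that a resource is ready. The C<$available> flag and the
returned string are made up for this example; C<lock()>, C<cond_wait()> and
C<cond_signal()> come from L<threads::shared>.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $available :shared = 0;     # Hypothetical flag: the condition waited on

my $consumer = threads->create(sub {
    lock($available);
    # cond_wait() releases the lock while waiting and reacquires it
    # before returning. Re-check the flag in case of a spurious wakeup.
    cond_wait($available) until $available;
    return 'resource ready';
});

{
    lock($available);
    $available = 1;
    cond_signal($available);   # Wake one thread waiting on $available
}

my $result = $consumer->join();
print("$result\n");
```

Testing the flag before waiting matters: if the main thread signals before
the consumer reaches C<cond_wait()>, the consumer sees that C<$available>
is already true and does not wait at all.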
536bca94 |
730 | =head2 Giving up control |
731 | |
732 | There are times when you may find it useful to have a thread |
733 | explicitly give up the CPU to another thread. You may be doing something |
734 | processor-intensive and want to make sure that the user-interface thread |
735 | gets called frequently. Regardless, there are times that you might want |
736 | a thread to give up the processor. |
737 | |
2ad6cdcf |
738 | Perl's threading package provides the C<yield()> function that does |
739 | this. C<yield()> is pretty straightforward, and works like this: |
536bca94 |
740 | |
0b390a82 |
741 |     use threads; |
536bca94 |
742 | |
743 |     sub loop { |
2ad6cdcf |
744 |         my $thread = shift; |
745 |         my $foo = 50; |
746 |         while($foo--) { print("In thread $thread\n"); } |
747 |         threads->yield(); |
748 |         $foo = 50; |
749 |         while($foo--) { print("In thread $thread\n"); } |
536bca94 |
750 |     } |
751 | |
2ad6cdcf |
752 |     my $thr1 = threads->create(\&loop, 'first'); |
753 |     my $thr2 = threads->create(\&loop, 'second'); |
754 |     my $thr3 = threads->create(\&loop, 'third'); |
536bca94 |
755 | |
2ad6cdcf |
756 | It is important to remember that C<yield()> is only a hint to give up the CPU; |
536bca94 |
757 | what actually happens depends on your hardware, OS and threading libraries. |
758 | B<On many operating systems, yield() is a no-op.> Therefore it is important |
759 | to note that one should not build the scheduling of the threads around |
2ad6cdcf |
760 | C<yield()> calls. It might work on your platform but it may not work on another |
536bca94 |
761 | platform. |
762 | |
c975c451 |
763 | =head1 General Thread Utility Routines |
764 | |
765 | We've covered the workhorse parts of Perl's threading package, and |
766 | with these tools you should be well on your way to writing threaded |
767 | code and packages. There are a few useful little pieces that didn't |
768 | really fit in anyplace else. |
769 | |
770 | =head2 What Thread Am I In? |
771 | |
2ad6cdcf |
772 | The C<threads-E<gt>self()> class method provides your program with a way to |
bfce6503 |
773 | get an object representing the thread it's currently in. You can use this |
6eded8f3 |
774 | object in the same way as the ones returned from thread creation. |
c975c451 |
775 | |
776 | =head2 Thread IDs |
777 | |
2ad6cdcf |
778 | C<tid()> is a thread object method that returns the thread ID of the |
c975c451 |
779 | thread the object represents. Thread IDs are integers, with the main |
2ad6cdcf |
780 | thread in a program being 0. Currently Perl assigns a unique TID to |
c975c451 |
781 | every thread ever created in your program, assigning the first thread |
8efd9ba4 |
782 | to be created a TID of 1, and increasing the TID by 1 for each new |
2ad6cdcf |
783 | thread that's created. When used as a class method, C<threads-E<gt>tid()> |
784 | can be used by a thread to get its own TID. |
c975c451 |
785 | |
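A short sketch of both calls, assuming it runs as a standalone script so
that the only thread it creates is the first one. The variable names are
made up for this example.

```perl
use strict;
use warnings;
use threads;

my $main_tid = threads->tid();          # Class method: the caller's own TID

my $thr = threads->create(sub {
    my $self = threads->self();         # Object for the current thread
    return $self->tid();
});

my $seen_outside = $thr->tid();         # TID as seen from the parent
my $seen_inside  = $thr->join();        # TID as reported by the child itself

print("main=$main_tid outside=$seen_outside inside=$seen_inside\n");
```

Under the numbering described above, this should report 0 for the main
thread and 1 for the child, whichever side asks.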
786 | =head2 Are These Threads The Same? |
787 | |
2ad6cdcf |
788 | The C<equal()> method takes two thread objects and returns true |
c975c451 |
789 | if the objects represent the same thread, and false if they don't. |
790 | |
2ad6cdcf |
791 | Thread objects also have an overloaded C<==> comparison so that you can do |
c975c451 |
792 | comparison on them as you would with normal objects. |
793 | |
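A small sketch of both forms of comparison. The two threads do nothing
useful here; they exist only so that there are distinct thread objects to
compare, and the variable names are made up for this example.

```perl
use strict;
use warnings;
use threads;

my $thr1 = threads->create(sub { return threads->tid() });
my $thr2 = threads->create(sub { return threads->tid() });

my $same      = $thr1->equal($thr1) ? 1 : 0;   # An object against itself
my $different = $thr1->equal($thr2) ? 1 : 0;   # Two distinct threads
my $via_op    = ($thr1 == $thr1)    ? 1 : 0;   # Overloaded == operator

$_->join() for ($thr1, $thr2);

print("same=$same different=$different via_op=$via_op\n");
```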
794 | =head2 What Threads Are Running? |
795 | |
2ad6cdcf |
796 | C<threads-E<gt>list()> returns a list of thread objects, one for each thread |
c975c451 |
797 | that's currently running and not detached. Handy for a number of things, |
2ad6cdcf |
798 | including cleaning up at the end of your program (from the main Perl thread, |
799 | of course): |
c975c451 |
800 | |
0b390a82 |
801 |     # Loop through all the threads |
2ad6cdcf |
802 |     foreach my $thr (threads->list()) { |
803 |         $thr->join(); |
c975c451 |
804 |     } |
805 | |
bfce6503 |
806 | If some threads have not finished running when the main Perl thread |
807 | ends, Perl will warn you about it and die, since it is impossible for Perl |
2ad6cdcf |
808 | to clean up itself while other threads are running. |
809 | |
810 | NOTE: The main Perl thread (thread 0) is in a I<detached> state, and so |
811 | does not appear in the list returned by C<threads-E<gt>list()>. |
c975c451 |
812 | |
813 | =head1 A Complete Example |
814 | |
815 | Confused yet? It's time for an example program to show some of the |
816 | things we've covered. This program finds prime numbers using threads. |
817 | |
2ad6cdcf |
818 |      1 #!/usr/bin/perl |
819 |      2 # prime-pthread, courtesy of Tom Christiansen |
820 |      3 |
821 |      4 use strict; |
822 |      5 use warnings; |
823 |      6 |
824 |      7 use threads; |
825 |      8 use Thread::Queue; |
826 |      9 |
827 |     10 my $stream = Thread::Queue->new(); |
828 |     11 for my $i ( 3 .. 1000 ) { |
829 |     12     $stream->enqueue($i); |
830 |     13 } |
831 |     14 $stream->enqueue(undef); |
c975c451 |
832 |     15 |
2ad6cdcf |
833 |     16 my $kid = threads->create(\&check_num, $stream, 2); |
834 |     17 $kid->join(); |
c975c451 |
835 |     18 |
836 |     19 sub check_num { |
837 |     20     my ($upstream, $cur_prime) = @_; |
838 |     21     my $kid; |
2ad6cdcf |
839 |     22     my $downstream = Thread::Queue->new(); |
840 |     23     while (my $num = $upstream->dequeue()) { |
841 |     24         next unless ($num % $cur_prime); |
c975c451 |
842 |     25         if ($kid) { |
2ad6cdcf |
843 |     26             $downstream->enqueue($num); |
844 |     27         } else { |
845 |     28             print("Found prime $num\n"); |
846 |     29             $kid = threads->create(\&check_num, $downstream, $num); |
c975c451 |
847 |     30         } |
0b390a82 |
848 |     31     } |
2ad6cdcf |
849 |     32     if ($kid) { |
850 |     33         $downstream->enqueue(undef); |
851 |     34         $kid->join(); |
852 |     35     } |
853 |     36 } |
c975c451 |
854 | |
855 | This program uses the pipeline model to generate prime numbers. Each |
856 | thread in the pipeline has an input queue that feeds numbers to be |
857 | checked, a prime number that it's responsible for, and an output queue |
9e75ef81 |
858 | into which it funnels numbers that have failed the check. If the thread |
c975c451 |
859 | has a number that's failed its check and there's no child thread, then |
860 | the thread must have found a new prime number. In that case, a new |
861 | child thread is created for that prime and stuck on the end of the |
862 | pipeline. |
863 | |
6eded8f3 |
864 | This probably sounds a bit more confusing than it really is, so let's |
c975c451 |
865 | go through this program piece by piece and see what it does. (For |
866 | those of you who might be trying to remember exactly what a prime |
2ad6cdcf |
867 | number is, it's a number that's only evenly divisible by itself and 1.) |
c975c451 |
868 | |
2ad6cdcf |
869 | The bulk of the work is done by the C<check_num()> subroutine, which |
c975c451 |
870 | takes a reference to its input queue and a prime number that it's |
871 | responsible for. After pulling in the input queue and the prime that |
c3e59998 |
872 | the subroutine is checking (line 20), we create a new queue (line 22) |
c975c451 |
873 | and reserve a scalar for the thread that we're likely to create later |
874 | (line 21). |
875 | |
876 | The while loop from line 23 to line 31 grabs a scalar off the input |
877 | queue and checks it against the prime this thread is responsible |
c3e59998 |
878 | for. Line 24 checks to see if there's a remainder when we divide the |
879 | number to be checked by our prime. If there is one, the number |
c975c451 |
880 | must not be evenly divisible by our prime, so we need to either pass |
881 | it on to the next thread if we've created one (line 26) or create a |
882 | new thread if we haven't. |
883 | |
884 | The new thread creation is line 29. We pass on to it a reference to |
885 | the queue we've created, and the prime number we've found. |
886 | |
2ad6cdcf |
887 | Finally, once the loop terminates (because we got a 0 or C<undef> in the |
888 | queue, which serves as a note to terminate), we pass on the notice to our |
6eded8f3 |
889 | child and wait for it to exit if we've created a child (lines 32 to |
2ad6cdcf |
890 | 35). |
c975c451 |
891 | |
2ad6cdcf |
892 | Meanwhile, back in the main thread, we first create a queue (line 10) and |
893 | queue up all the numbers from 3 to 1000 for checking (lines 11-13), |
894 | plus a termination notice (line 14). Then we create the initial child |
895 | thread (line 16), passing it the queue and the first prime: 2. Finally, |
896 | we wait for the first child thread to terminate (line 17). Because a |
897 | child won't terminate until its child has terminated, we know that we're |
898 | done once we return from the C<join()>. |
c975c451 |
899 | |
900 | That's how it works. It's pretty simple; as with many Perl programs, |
901 | the explanation is much longer than the program. |
902 | |
536bca94 |
903 | =head1 Different implementations of threads |
904 | |
905 | Some background on thread implementations from the operating system |
906 | viewpoint. There are three basic categories of threads: user-mode threads, |
907 | kernel threads, and multiprocessor kernel threads. |
908 | |
909 | User-mode threads are threads that live entirely within a program and |
910 | its libraries. In this model, the OS knows nothing about threads. As |
911 | far as it's concerned, your process is just a process. |
912 | |
913 | This is the easiest way to implement threads, and the way most OSes |
914 | start. The big disadvantage is that, since the OS knows nothing about |
915 | threads, if one thread blocks they all do. Typical blocking activities |
2ad6cdcf |
916 | include most system calls, most I/O, and things like C<sleep()>. |
536bca94 |
917 | |
918 | Kernel threads are the next step in thread evolution. The OS knows |
919 | about kernel threads, and makes allowances for them. The main |
920 | difference between a kernel thread and a user-mode thread is |
921 | blocking. With kernel threads, things that block a single thread don't |
922 | block other threads. This is not the case with user-mode threads, |
923 | where the kernel blocks at the process level and not the thread level. |
924 | |
925 | This is a big step forward, and can give a threaded program quite a |
926 | performance boost over non-threaded programs. Threads that block |
927 | performing I/O, for example, won't block threads that are doing other |
928 | things. Each process still has only one thread running at once, |
929 | though, regardless of how many CPUs a system might have. |
930 | |
931 | Since kernel threading can interrupt a thread at any time, it will |
932 | uncover some of the implicit locking assumptions you may make in your |
933 | program. For example, something as simple as C<$a = $a + 2> can behave |
2ad6cdcf |
934 | unpredictably with kernel threads if C<$a> is visible to other |
935 | threads, as another thread may have changed C<$a> between the time it |
536bca94 |
936 | was fetched on the right hand side and the time the new value is |
937 | stored. |
938 | |
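With Perl's ithreads, one way to avoid that race is to take a lock on the
shared variable around the fetch and the store. The following is a sketch
under assumed names (C<$total>, C<add_two()>), not a general recipe:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $total :shared = 0;         # Hypothetical variable visible to all threads

sub add_two {
    for (1 .. 1000) {
        lock($total);          # Held until the end of this loop iteration
        $total = $total + 2;   # No other thread can slip in between
                               # the fetch and the store
    }
}

my @thrs = map { threads->create(\&add_two) } 1 .. 4;
$_->join() for @thrs;

print("Total is $total\n");    # 4 threads * 1000 iterations * 2 = 8000
```

Without the C<lock()>, some updates could be lost and the final total could
come out lower than 8000.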
939 | Multiprocessor kernel threads are the final step in thread |
940 | support. With multiprocessor kernel threads on a machine with multiple |
941 | CPUs, the OS may schedule two or more threads to run simultaneously on |
942 | different CPUs. |
943 | |
944 | This can give a serious performance boost to your threaded program, |
945 | since more than one thread will be executing at the same time. As a |
946 | tradeoff, though, any of those nagging synchronization issues that |
947 | might not have shown with basic kernel threads will appear with a |
948 | vengeance. |
949 | |
950 | In addition to the different levels of OS involvement in threads, |
951 | different OSes (and different thread implementations for a particular |
952 | OS) allocate CPU cycles to threads in different ways. |
953 | |
954 | Cooperative multitasking systems have running threads give up control |
955 | if one of two things happen. If a thread calls a yield function, it |
956 | gives up control. It also gives up control if the thread does |
957 | something that would cause it to block, such as perform I/O. In a |
958 | cooperative multitasking implementation, one thread can starve all the |
959 | others for CPU time if it so chooses. |
960 | |
961 | Preemptive multitasking systems interrupt threads at regular intervals |
962 | while the system decides which thread should run next. In a preemptive |
963 | multitasking system, one thread usually won't monopolize the CPU. |
964 | |
965 | On some systems, there can be cooperative and preemptive threads |
966 | running simultaneously. (Threads running with realtime priorities |
967 | often behave cooperatively, for example, while threads running at |
968 | normal priorities behave preemptively.) |
969 | |
970 | Most modern operating systems support preemptive multitasking. |
971 | |
bfce6503 |
972 | =head1 Performance considerations |
973 | |
2ad6cdcf |
974 | The main thing to bear in mind when comparing Perl's I<ithreads> to other threading |
bfce6503 |
975 | models is the fact that for each new thread created, a complete copy of |
2ad6cdcf |
976 | all the variables and data of the parent thread has to be taken. Thus, |
bfce6503 |
977 | thread creation can be quite expensive, both in terms of memory usage and |
978 | time spent in creation. The ideal way to reduce these costs is to have a |
979 | relatively small number of long-lived threads, all created fairly early |
2ad6cdcf |
980 | on -- before the base thread has accumulated too much data. Of course, this |
bfce6503 |
981 | may not always be possible, so compromises have to be made. However, after |
982 | a thread has been created, its performance and extra memory usage should |
983 | be little different than ordinary code. |
984 | |
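One way to follow that advice is a small pool of workers created up front
and fed through a queue. This is a sketch only: the pool size, the C<$jobs>
and C<$results> queues and the squaring "work" are assumptions made for
this example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $POOL_SIZE = 3;                           # Hypothetical pool size
my $jobs      = Thread::Queue->new();
my $results   = Thread::Queue->new();

# Create the workers first, while the parent has little data to copy.
my @pool = map {
    threads->create(sub {
        while (defined(my $n = $jobs->dequeue())) {
            $results->enqueue($n * $n);      # Stand-in for real work
        }
    });
} 1 .. $POOL_SIZE;

$jobs->enqueue(1 .. 10);
$jobs->enqueue(undef) for 1 .. $POOL_SIZE;   # One stop marker per worker
$_->join() for @pool;

my $sum = 0;
while (defined(my $r = $results->dequeue_nb())) {
    $sum += $r;
}
print("Sum of squares: $sum\n");
```

The three threads are paid for once and then reused for all ten jobs,
instead of creating (and copying data into) a new thread per job.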
985 | Also note that under the current implementation, shared variables |
986 | use a little more memory and are a little slower than ordinary variables. |
987 | |
cf5baa48 |
988 | =head1 Process-scope Changes |
989 | |
990 | Note that while threads themselves are separate execution threads and |
991 | Perl data is thread-private unless explicitly shared, the threads can |
992 | affect process-scope state, affecting all the threads. |
993 | |
994 | The most common example of this is changing the current working |
2ad6cdcf |
995 | directory using C<chdir()>. One thread calls C<chdir()>, and the working |
cf5baa48 |
996 | directory of all the threads changes. |
bdcfa4c7 |
997 | |
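The effect is easy to observe. The sketch below assumes a Unix-like
platform where the working directory is a property of the whole process;
the variable names are made up for this example, and other platforms may
behave differently.

```perl
use strict;
use warnings;
use threads;
use Cwd qw(getcwd);
use File::Spec;

my $before = getcwd();
my $target = File::Spec->rootdir();   # Hypothetical destination directory

# Only the child thread calls chdir() ...
threads->create(sub {
    chdir($target) or die("chdir failed: $!");
})->join();

# ... but the main thread sees the change too.
my $after = getcwd();
print("Main thread: was in $before, now in $after\n");

chdir($before);                       # Put things back
```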
2ad6cdcf |
998 | An even more drastic example of a process-scope change is C<chroot()>: |
cf5baa48 |
999 | the root directory of all the threads changes, and no thread can |
2ad6cdcf |
1000 | undo it (as opposed to C<chdir()>). |
cf5baa48 |
1001 | |
2ad6cdcf |
1002 | Further examples of process-scope changes include C<umask()> and |
c3e59998 |
1003 | changing uids and gids. |
cf5baa48 |
1004 | |
2ad6cdcf |
1005 | Thinking of mixing C<fork()> and threads? Please lie down and wait |
1006 | until the feeling passes. Be aware that the semantics of C<fork()> vary |
a95a5f75 |
1007 | between platforms. For example, some UNIX systems copy all the current |
1008 | threads into the child process, while others only copy the thread that |
2ad6cdcf |
1009 | called C<fork()>. You have been warned! |
cf5baa48 |
1010 | |
2ad6cdcf |
1011 | Similarly, mixing signals and threads may be problematic. |
b03ad8f6 |
1012 | Implementations are platform-dependent, and even the POSIX |
1013 | semantics may not be what you expect (and Perl doesn't even |
2ad6cdcf |
1014 | give you the full POSIX API). For example, there is no way to |
1015 | guarantee that a signal sent to a multi-threaded Perl application |
1016 | will get intercepted by any particular thread. (However, a recently |
1017 | added feature does provide the capability to send signals between |
1018 | threads. See L<threads/"THREAD SIGNALLING"> for more details.) |
b03ad8f6 |
1019 | |
cf5baa48 |
1020 | =head1 Thread-Safety of System Libraries |
1021 | |
1022 | Whether various library calls are thread-safe is outside the control |
1023 | of Perl. Calls often suffering from not being thread-safe include: |
8efd9ba4 |
1024 | C<localtime()>, C<gmtime()>, functions fetching user, group and |
1025 | network information (such as C<getgrent()>, C<gethostent()>, |
1026 | C<getnetent()> and so on), C<readdir()>, |
2ad6cdcf |
1027 | C<rand()>, and C<srand()> -- in general, calls that depend on some global |
cf5baa48 |
1028 | external state. |
80bbcbc4 |
1029 | |
cf5baa48 |
1030 | If the system Perl is compiled in has thread-safe variants of such |
80bbcbc4 |
1031 | calls, they will be used. Beyond that, Perl is at the mercy of |
cf5baa48 |
1032 | the thread-safety or -unsafety of the calls. Please consult your |
80bbcbc4 |
1033 | C library call documentation. |
1034 | |
af685957 |
1035 | On some platforms the thread-safe library interfaces may fail if the |
1036 | result buffer is too small (for example, the user and group databases may |
1037 | be rather large, and the reentrant interfaces may have to carry around |
1038 | a full snapshot of those databases). Perl will start with a small |
1039 | buffer, but keep retrying and growing the result buffer |
1040 | until the result fits. If this limitless growing sounds bad for |
1041 | security or memory consumption reasons you can recompile Perl with |
2ad6cdcf |
1042 | C<PERL_REENTRANT_MAXSIZE> defined to the maximum number of bytes you will |
af685957 |
1043 | allow. |
bdcfa4c7 |
1044 | |
c975c451 |
1045 | =head1 Conclusion |
1046 | |
1047 | A complete thread tutorial could fill a book (and has, many times), |
6eded8f3 |
1048 | but with what we've covered in this introduction, you should be well |
1049 | on your way to becoming a threaded Perl expert. |
c975c451 |
1050 | |
2ad6cdcf |
1051 | =head1 SEE ALSO |
1052 | |
1053 | Annotated POD for L<threads>: |
1054 | L<http://annocpan.org/?mode=search&field=Module&name=threads> |
1055 | |
1056 | Latest version of L<threads> on CPAN: |
1057 | L<http://search.cpan.org/search?module=threads> |
1058 | |
1059 | Annotated POD for L<threads::shared>: |
1060 | L<http://annocpan.org/?mode=search&field=Module&name=threads%3A%3Ashared> |
1061 | |
1062 | Latest version of L<threads::shared> on CPAN: |
1063 | L<http://search.cpan.org/search?module=threads%3A%3Ashared> |
1064 | |
1065 | Perl threads mailing list: |
1066 | L<http://lists.cpan.org/showlist.cgi?name=iThreads> |
1067 | |
c975c451 |
1068 | =head1 Bibliography |
1069 | |
1070 | Here's a short bibliography courtesy of Jürgen Christoffel: |
1071 | |
1072 | =head2 Introductory Texts |
1073 | |
1074 | Birrell, Andrew D. An Introduction to Programming with |
1075 | Threads. Digital Equipment Corporation, 1989, DEC-SRC Research Report |
1076 | #35 online as |
6eded8f3 |
1077 | http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html |
1078 | (highly recommended) |
c975c451 |
1079 | |
1080 | Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A |
1081 | Guide to Concurrency, Communication, and |
1082 | Multithreading. Prentice-Hall, 1996. |
1083 | |
1084 | Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with |
1085 | Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written |
1086 | introduction to threads). |
1087 | |
1088 | Nelson, Greg (editor). Systems Programming with Modula-3. Prentice |
1089 | Hall, 1991, ISBN 0-13-590464-1. |
1090 | |
1091 | Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell. |
1092 | Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1 |
1093 | (covers POSIX threads). |
1094 | |
1095 | =head2 OS-Related References |
1096 | |
1097 | Boykin, Joseph, David Kirschen, Alan Langerman, and Susan |
1098 | LoVerso. Programming under Mach. Addison-Wesley, 1994, ISBN |
1099 | 0-201-52739-1. |
1100 | |
1101 | Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall, |
1102 | 1995, ISBN 0-13-219908-4 (great textbook). |
1103 | |
1104 | Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts, |
1105 | 4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4 |
1106 | |
1107 | =head2 Other References |
1108 | |
1109 | Arnold, Ken and James Gosling. The Java Programming Language, 2nd |
1110 | ed. Addison-Wesley, 1998, ISBN 0-201-31006-6. |
1111 | |
b03ad8f6 |
1112 | comp.programming.threads FAQ, |
1113 | L<http://www.serpentine.com/~bos/threads-faq/> |
1114 | |
c975c451 |
1115 | Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage |
1116 | Collection on Virtually Shared Memory Architectures" in Memory |
1117 | Management: Proc. of the International Workshop IWMM 92, St. Malo, |
1118 | France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer, |
1119 | 1992, ISBN 3-540-55940-X (real-life thread applications). |
1120 | |
5e549d84 |
1121 | Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002, |
1122 | L<http://www.perl.com/pub/a/2002/06/11/threads.html> |
1123 | |
c975c451 |
1124 | =head1 Acknowledgements |
1125 | |
1126 | Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy |
1127 | Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua |
1128 | Pritikin, and Alan Burlison, for their help in reality-checking and |
1129 | polishing this article. Big thanks to Tom Christiansen for his rewrite |
1130 | of the prime number generator. |
1131 | |
1132 | =head1 AUTHOR |
1133 | |
9316ed2f |
1134 | Dan Sugalski E<lt>dan@sidhe.orgE<gt> |
c975c451 |
1135 | |
1136 | Slightly modified by Arthur Bergman to fit the new thread model/module. |
1137 | |
cf5baa48 |
1138 | Reworked slightly by Jörg Walter E<lt>jwalt@cpan.orgE<gt> to be more concise |
2ad6cdcf |
1139 | about thread-safety of Perl code. |
cf5baa48 |
1140 | |
536bca94 |
1141 | Rearranged slightly by Elizabeth Mattijsen E<lt>liz@dijkmat.nlE<gt> to put |
1142 | less emphasis on yield(). |
1143 | |
c975c451 |
1144 | =head1 Copyrights |
1145 | |
bfce6503 |
1146 | The original version of this article appeared in The Perl |
1147 | Journal #10, and is copyright 1998 The Perl Journal. It appears courtesy |
1148 | of Jon Orwant and The Perl Journal. This document may be distributed |
1149 | under the same terms as Perl itself. |
2605996a |
1150 | |
2ad6cdcf |
1151 | =cut |