X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlthrtut.pod;h=8e4e4f6063bd80d173e51a0a74b82ef4b9bc9cda;hb=28b41a8090d259cff9b1dd87c0c53b3c4a31e822;hp=dbc792d6605649d5450099bfa7394648234c51a6;hpb=83272a45226e83bd136d713158e9b44ace2dbc8d;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlthrtut.pod b/pod/perlthrtut.pod index dbc792d..8e4e4f6 100644 --- a/pod/perlthrtut.pod +++ b/pod/perlthrtut.pod @@ -102,81 +102,6 @@ another thread. Prime and Fibonacci generators both map well to this form of the pipeline model. (A version of a prime number generator is presented later on.) -=head1 Native threads - -There are several different ways to implement threads on a system. How -threads are implemented depends both on the vendor and, in some cases, -the version of the operating system. Often the first implementation -will be relatively simple, but later versions of the OS will be more -sophisticated. - -While the information in this section is useful, it's not necessary, -so you can skip it if you don't feel up to it. - -There are three basic categories of threads: user-mode threads, kernel -threads, and multiprocessor kernel threads. - -User-mode threads are threads that live entirely within a program and -its libraries. In this model, the OS knows nothing about threads. As -far as it's concerned, your process is just a process. - -This is the easiest way to implement threads, and the way most OSes -start. The big disadvantage is that, since the OS knows nothing about -threads, if one thread blocks they all do. Typical blocking activities -include most system calls, most I/O, and things like sleep(). - -Kernel threads are the next step in thread evolution. The OS knows -about kernel threads, and makes allowances for them. The main -difference between a kernel thread and a user-mode thread is -blocking. With kernel threads, things that block a single thread don't -block other threads. 
This is not the case with user-mode threads, -where the kernel blocks at the process level and not the thread level. - -This is a big step forward, and can give a threaded program quite a -performance boost over non-threaded programs. Threads that block -performing I/O, for example, won't block threads that are doing other -things. Each process still has only one thread running at once, -though, regardless of how many CPUs a system might have. - -Since kernel threading can interrupt a thread at any time, they will -uncover some of the implicit locking assumptions you may make in your -program. For example, something as simple as C<$a = $a + 2> can behave -unpredictably with kernel threads if $a is visible to other -threads, as another thread may have changed $a between the time it -was fetched on the right hand side and the time the new value is -stored. - -Multiprocessor kernel threads are the final step in thread -support. With multiprocessor kernel threads on a machine with multiple -CPUs, the OS may schedule two or more threads to run simultaneously on -different CPUs. - -This can give a serious performance boost to your threaded program, -since more than one thread will be executing at the same time. As a -tradeoff, though, any of those nagging synchronization issues that -might not have shown with basic kernel threads will appear with a -vengeance. - -In addition to the different levels of OS involvement in threads, -different OSes (and different thread implementations for a particular -OS) allocate CPU cycles to threads in different ways. - -Cooperative multitasking systems have running threads give up control -if one of two things happen. If a thread calls a yield function, it -gives up control. It also gives up control if the thread does -something that would cause it to block, such as perform I/O. In a -cooperative multitasking implementation, one thread can starve all the -others for CPU time if it so chooses. 
- -Preemptive multitasking systems interrupt threads at regular intervals -while the system decides which thread should run next. In a preemptive -multitasking system, one thread usually won't monopolize the CPU. - -On some systems, there can be cooperative and preemptive threads -running simultaneously. (Threads running with realtime priorities -often behave cooperatively, for example, while threads running at -normal priorities behave preemptively.) - =head1 What kind of threads are Perl threads? If you have experience with other thread implementations, you might @@ -202,14 +127,15 @@ system blocks the entire process on sleep(), Perl usually will as well. Perl Threads Are Different. -=head1 Threadsafe Modules +=head1 Thread-Safe Modules -The addition of threads has changed Perl's internals +The addition of threads has changed Perl's internals substantially. There are implications for people who write -modules with XS code or external libraries. However, since the threads -do not share data, pure Perl modules that don't interact with external -systems should be safe. Modules that are not tagged as thread-safe should -be tested or code reviewed before being used in production code. +modules with XS code or external libraries. However, since perl data is +not shared among threads by default, Perl modules stand a high chance of +being thread-safe or can be made thread-safe easily. Modules that are not +tagged as thread-safe should be tested or code reviewed before being used +in production code. Not all modules that you might use are thread-safe, and you should always assume a module is unsafe unless the documentation says @@ -217,17 +143,18 @@ otherwise. This includes modules that are distributed as part of the core. Threads are a new feature, and even some of the standard modules aren't thread-safe. 
-Even if a module is threadsafe, it doesn't mean that the module is optimized +Even if a module is thread-safe, it doesn't mean that the module is optimized to work well with threads. A module could possibly be rewritten to utilize the new features in threaded Perl to increase performance in a threaded environment. If you're using a module that's not thread-safe for some reason, you -can protect yourself by using semaphores and lots of programming -discipline to control access to the module. Semaphores are covered -later in the article. +can protect yourself by using it from one, and only one, thread. +If you need multiple threads to access such a module, you can use semaphores and +lots of programming discipline to control access to it. Semaphores +are covered in L. -See also L. +See also L. =head1 Thread Basics @@ -253,19 +180,19 @@ like: A possibly-threaded program using a possibly-threaded module might have code like this: - use Config; - use MyMod; + use Config; + use MyMod; BEGIN { - if ($Config{useithreads}) { - # We have threads - require MyMod_threaded; - import MyMod_threaded; - } else { - require MyMod_unthreaded; - import MyMod_unthreaded; + if ($Config{useithreads}) { + # We have threads + require MyMod_threaded; + import MyMod_threaded; + } else { + require MyMod_unthreaded; + import MyMod_unthreaded; } - } + } Since code that runs both with and without threads is usually pretty messy, it's best to isolate the thread-specific code in its own @@ -329,39 +256,6 @@ environment and potentially separate arguments. C is a synonym for C. -=head2 Giving up control - -There are times when you may find it useful to have a thread -explicitly give up the CPU to another thread. Your threading package -might not support preemptive multitasking for threads, for example, or -you may be doing something processor-intensive and want to make sure -that the user-interface thread gets called frequently.
Regardless, -there are times that you might want a thread to give up the processor. - -Perl's threading package provides the yield() function that does -this. yield() is pretty straightforward, and works like this: - - use threads; - - sub loop { - my $thread = shift; - my $foo = 50; - while($foo--) { print "in thread $thread\n" } - threads->yield; - $foo = 50; - while($foo--) { print "in thread $thread\n" } - } - - my $thread1 = threads->new(\&loop, 'first'); - my $thread2 = threads->new(\&loop, 'second'); - my $thread3 = threads->new(\&loop, 'third'); - -It is important to remember that yield() is only a hint to give up the CPU, -it depends on your hardware, OS and threading libraries what actually happens. -Therefore it is important to note that one should not build the scheduling of -the threads around yield() calls. It might work on your platform but it won't -work on another platform. - =head2 Waiting For A Thread To Exit Since threads are also subroutines, they can return values. To wait @@ -404,7 +298,7 @@ automatically. $thr->detach; # Now we officially don't care any more - sub sub1 { + sub sub1 { $a = 0; while (1) { $a++; @@ -540,7 +434,7 @@ techniques such as queues, which remove some of the hard work involved. =head2 Controlling access: lock() The lock() function takes a shared variable and puts a lock on it. -No other thread may lock the variable until the the variable is unlocked +No other thread may lock the variable until the variable is unlocked by the thread holding the lock. Unlocking happens automatically when the locking thread exits the outermost block that contains C function. Using lock() is straightforward: this example has @@ -637,13 +531,11 @@ Consider the following code: my $b : shared = "foo"; my $thr1 = threads->new(sub { lock($a); - threads->yield; sleep 20; lock($b); }); my $thr2 = threads->new(sub { lock($b); - threads->yield; sleep 20; lock($a); }); @@ -710,7 +602,7 @@ communications between threads. 
=head2 Semaphores: Synchronizing Data Access Semaphores are a kind of generic locking mechanism. In their most basic -form, they behave very much like lockable scalars, except that thay +form, they behave very much like lockable scalars, except that they can't hold data, and that they must be explicitly unlocked. In their advanced form, they act like a kind of counter, and can allow multiple threads to have the 'lock' at any one time. @@ -722,7 +614,7 @@ count, while up increments it. Calls to down() will block if the semaphore's current count would decrement below zero. This program gives a quick demonstration: - use threads qw(yield); + use threads; use Thread::Semaphore; my $semaphore = new Thread::Semaphore; @@ -741,7 +633,6 @@ gives a quick demonstration: $semaphore->down; $LocalCopy = $GlobalVariable; print "$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n"; - yield; sleep 2; $LocalCopy++; $GlobalVariable = $LocalCopy; @@ -823,6 +714,39 @@ very similar in use to the functions found in C. However for most purposes, queues are simpler to use and more intuitive. See L for more details. +=head2 Giving up control + +There are times when you may find it useful to have a thread +explicitly give up the CPU to another thread. You may be doing something +processor-intensive and want to make sure that the user-interface thread +gets called frequently. Regardless, there are times that you might want +a thread to give up the processor. + +Perl's threading package provides the yield() function that does +this. 
yield() is pretty straightforward, and works like this: + + use threads; + + sub loop { + my $thread = shift; + my $foo = 50; + while($foo--) { print "in thread $thread\n" } + threads->yield; + $foo = 50; + while($foo--) { print "in thread $thread\n" } + } + + my $thread1 = threads->new(\&loop, 'first'); + my $thread2 = threads->new(\&loop, 'second'); + my $thread3 = threads->new(\&loop, 'third'); + +It is important to remember that yield() is only a hint to give up the CPU, +it depends on your hardware, OS and threading libraries what actually happens. +B Therefore it is important +to note that one should not build the scheduling of the threads around +yield() calls. It might work on your platform but it won't work on another +platform. + =head1 General Thread Utility Routines We've covered the workhorse parts of Perl's threading package, and @@ -958,6 +882,75 @@ child has died, we know that we're done once we return from the join. That's how it works. It's pretty simple; as with many Perl programs, the explanation is much longer than the program. +=head1 Different implementations of threads + +Some background on thread implementations from the operating system +viewpoint. There are three basic categories of threads: user-mode threads, +kernel threads, and multiprocessor kernel threads. + +User-mode threads are threads that live entirely within a program and +its libraries. In this model, the OS knows nothing about threads. As +far as it's concerned, your process is just a process. + +This is the easiest way to implement threads, and the way most OSes +start. The big disadvantage is that, since the OS knows nothing about +threads, if one thread blocks they all do. Typical blocking activities +include most system calls, most I/O, and things like sleep(). + +Kernel threads are the next step in thread evolution. The OS knows +about kernel threads, and makes allowances for them. The main +difference between a kernel thread and a user-mode thread is +blocking. 
With kernel threads, things that block a single thread don't +block other threads. This is not the case with user-mode threads, +where the kernel blocks at the process level and not the thread level. + +This is a big step forward, and can give a threaded program quite a +performance boost over non-threaded programs. Threads that block +performing I/O, for example, won't block threads that are doing other +things. Each process still has only one thread running at once, +though, regardless of how many CPUs a system might have. + +Since kernel threading can interrupt a thread at any time, they will +uncover some of the implicit locking assumptions you may make in your +program. For example, something as simple as C<$a = $a + 2> can behave +unpredictably with kernel threads if $a is visible to other +threads, as another thread may have changed $a between the time it +was fetched on the right hand side and the time the new value is +stored. + +Multiprocessor kernel threads are the final step in thread +support. With multiprocessor kernel threads on a machine with multiple +CPUs, the OS may schedule two or more threads to run simultaneously on +different CPUs. + +This can give a serious performance boost to your threaded program, +since more than one thread will be executing at the same time. As a +tradeoff, though, any of those nagging synchronization issues that +might not have shown with basic kernel threads will appear with a +vengeance. + +In addition to the different levels of OS involvement in threads, +different OSes (and different thread implementations for a particular +OS) allocate CPU cycles to threads in different ways. + +Cooperative multitasking systems have running threads give up control +if one of two things happen. If a thread calls a yield function, it +gives up control. It also gives up control if the thread does +something that would cause it to block, such as perform I/O. 
In a +cooperative multitasking implementation, one thread can starve all the +others for CPU time if it so chooses. + +Preemptive multitasking systems interrupt threads at regular intervals +while the system decides which thread should run next. In a preemptive +multitasking system, one thread usually won't monopolize the CPU. + +On some systems, there can be cooperative and preemptive threads +running simultaneously. (Threads running with realtime priorities +often behave cooperatively, for example, while threads running at +normal priorities behave preemptively.) + +Most modern operating systems support preemptive multitasking. + =head1 Performance considerations The main thing to bear in mind when comparing ithreads to other threading @@ -974,23 +967,56 @@ be little different than ordinary code. Also note that under the current implementation, shared variables use a little more memory and are a little slower than ordinary variables. -=head1 Threadsafety of System Libraries +=head1 Process-scope Changes + +Note that while threads themselves are separate execution threads and +Perl data is thread-private unless explicitly shared, the threads can +affect process-scope state, affecting all the threads. + +The most common example of this is changing the current working +directory using chdir(). One thread calls chdir(), and the working +directory of all the threads changes. + +An even more drastic example of a process-scope change is chroot(): +the root directory of all the threads changes, and no thread can +undo it (as opposed to chdir()). + +Further examples of process-scope changes include umask() and +changing uids/gids. + +Thinking of mixing fork() and threads? Please lie down and wait +until the feeling passes. Be aware that the semantics of fork() vary +between platforms. For example, some UNIX systems copy all the current +threads into the child process, while others only copy the thread that +called fork(). You have been warned!
-Whether various library calls are threadsafe is outside the control -of Perl. Calls often suffering from not being threadsafe include: +Similarly, mixing signals and threads should not be attempted. +Implementations are platform-dependent, and even the POSIX +semantics may not be what you expect (and Perl doesn't even +give you the full POSIX API). + +=head1 Thread-Safety of System Libraries + +Whether various library calls are thread-safe is outside the control +of Perl. Calls often suffering from not being thread-safe include: localtime(), gmtime(), get{gr,host,net,proto,serv,pw}*(), readdir(), -rand(), and srand() -- in general, calls that depend on some external -state. +rand(), and srand() -- in general, calls that depend on some global +external state. -If the system Perl is compiled in has threadsafe variants of such +If the system Perl is compiled in has thread-safe variants of such calls, they will be used. Beyond that, Perl is at the mercy of -the threadsafety or unsafety of the calls. Please consult your +the thread-safety or -unsafety of the calls. Please consult your C library call documentation. -In some platforms the threadsafe interfaces may fail if the result -buffer is too small (for example getgrent() may return quite large -group member lists). Perl will retry growing the result buffer -a few times, but only up to 64k (for safety reasons). +On some platforms the thread-safe library interfaces may fail if the +result buffer is too small (for example the user group databases may +be rather large, and the reentrant interfaces may have to carry around +a full snapshot of those databases). Perl will start with a small +buffer, but keep retrying and growing the result buffer +until the result fits. If this limitless growing sounds bad for +security or memory consumption reasons you can recompile Perl with +PERL_REENTRANT_MAXSIZE defined to the maximum number of bytes you will +allow. 
=head1 Conclusion @@ -1042,6 +1068,9 @@ Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts, Arnold, Ken and James Gosling. The Java Programming Language, 2nd ed. Addison-Wesley, 1998, ISBN 0-201-31006-6. +comp.programming.threads FAQ, +L + Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage Collection on Virtually Shared Memory Architectures" in Memory Management: Proc. of the International Workshop IWMM 92, St. Malo, @@ -1065,6 +1094,12 @@ Dan Sugalski E<lt>dan@sidhe.orgE<gt> Slightly modified by Arthur Bergman to fit the new thread model/module. +Reworked slightly by Jörg Walter E<lt>jwalt@cpan.orgE<gt> to be more concise +about thread-safety of perl code. + +Rearranged slightly by Elizabeth Mattijsen E<lt>liz@dijkmat.nlE<gt> to put +less emphasis on yield(). + =head1 Copyrights The original version of this article originally appeared in The Perl @@ -1073,4 +1108,3 @@ of Jon Orwant and The Perl Journal. This document may be distributed under the same terms as Perl itself. For more information please see L and L. -