If you've played around much with threaded Perl, you've probably already discovered that many of the modules available, including some that ship with the base Perl distribution, aren't thread safe. And if you've written any modules that have been released to CPAN, you've probably already gotten mail from someone asking "Is your module thread safe?" This, of course, raises the question "How do I make my module thread safe?", a question that can seem pretty overwhelming, especially if you have no experience with threads.
What we're going to do in this article is show you what you need to do to make your Perl module thread safe. If you correctly implement everything we cover here, your module should be fine. Do be aware that we're talking strictly about making a module thread safe with a minimum of fuss. This is just the first, and easiest, step in taking full advantage of threads.
If your module will be used with Perl 5.005 or higher, internal locking is simple. The lock() function locks a variable if your Perl was built with threads, and is a no-op otherwise. That's the easy part.
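As a minimal sketch of what internal locking looks like, assuming the 5.005-style thread model this article describes, a subroutine can lock the variable it updates for the length of the call. The $counter variable and the next_id() name are made up for illustration.

```perl
use strict;

# Hypothetical module-private state, invented for this sketch.
my $counter = 0;

sub next_id {
    # The lock is held until the enclosing block (the subroutine) exits.
    # On a Perl built without threads this call does nothing.
    lock $counter;
    return ++$counter;
}
```

The lock is released automatically when the subroutine returns, so there is nothing to clean up.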
If your module is going to run on versions of Perl 5.004 and below, things get a bit trickier. The easiest thing to do is put this piece of code at the beginning of the module:
    BEGIN {
        sub fakelock {};
        if ($] < 5.005) {    # $] holds the version number of your Perl
            *lock = \&fakelock;
        }
    }
This code will create a lock() subroutine that does nothing if you're running on a version of Perl below 5.005.
Once you're set with a lock() subroutine, just scatter calls to it throughout your code wherever you need to lock things. The standard locking rules apply, of course, so if you're going to have several blocks that lock multiple variables, you'll want to make sure you lock them in the same order in each block.
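The ordering rule can be sketched as follows. The %cache and @queue variables and both subroutine names are hypothetical; the only point is that every block that needs both locks takes them in the same order.

```perl
use strict;

# Hypothetical module state, invented for this sketch.
my %cache;
my @queue;

sub add_entry {
    my ($key, $value) = @_;
    lock(%cache);    # always first
    lock(@queue);    # always second
    $cache{$key} = $value;
    push @queue, $key;
    return scalar @queue;
}

sub expire_oldest {
    lock(%cache);    # same order as add_entry, never @queue first
    lock(@queue);
    my $key = shift @queue;
    return unless defined $key;
    return delete $cache{$key};
}
```

If one subroutine locked @queue first and the other locked %cache first, two threads could each end up holding one lock while waiting for the other.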
Luckily, Perl provides you with a way to fix this. The answer is to tie your globals. This slows down your code, but since you're probably not accessing the globals that often, the safety tradeoff is worth it. The following code chunk demonstrates one way to do this, tying the two variables $DEBUG and $BEHAVIOR.
    # This should be your real package name
    package MyPackage;
    use Config;

    # Predeclare the variables you want to protect
    use vars qw($DEBUG $BEHAVIOR);

    BEGIN {
        # The inner block keeps this package statement from leaking
        # into the tie calls below.
        {
            # This only needs to be different from your main package name
            # if your main package can be tied to things.
            package MyPackage::ThrSafe;

            sub TIESCALAR {
                my $var;
                my $class = shift;
                return bless \$var, $class;
            }

            sub FETCH {
                my $var = shift;
                lock $var;    # Lock goes up one level of reference
                return $$var;
            }

            sub STORE {
                my ($var, $val) = @_;
                lock $var;
                $$var = $val;
                return $val;
            }
        }

        # Tie the global variables to our threadsafing package
        if ($Config{usethreads}) {
            tie $DEBUG,    'MyPackage::ThrSafe';
            tie $BEHAVIOR, 'MyPackage::ThrSafe';
        }
    }
As you can see from the example, the tie code is very simple; just enough to wrap a lock around the variable access. We also only tie $DEBUG and $BEHAVIOR if we're actually running on a threaded Perl (that is, if $Config{usethreads} is true). And, since the locks are only held for the duration of the subroutines, we don't even need any DESTROY code to clean things up.
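From the caller's side nothing changes once a global is tied. The following sketch uses a hypothetical Demo::ThrSafe class, a compressed stand-in for the tie package above, and ties unconditionally so the effect is visible even on an unthreaded Perl.

```perl
use strict;
use vars qw($DEBUG);

# Hypothetical stand-in for the locking tie class, invented for this sketch.
package Demo::ThrSafe;
sub TIESCALAR { my $class = shift; my $var; return bless \$var, $class }
sub FETCH     { my $var = shift; lock $var; return $$var }
sub STORE     { my ($var, $val) = @_; lock $var; $$var = $val; return $val }

package main;

tie $DEBUG, 'Demo::ThrSafe';

$DEBUG = 2;                       # goes through STORE
my $level   = $DEBUG + 1;         # goes through FETCH
my $is_tied = ref tied $DEBUG;    # class name of the object behind the tie
```

Code that reads or assigns $DEBUG looks exactly as it did before; the locking happens inside FETCH and STORE.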
You may be tempted to do the tying only if the Thread module has actually been used. That's not a safe thing to do, though - our module might have been used before the Thread module, or the Thread module might get loaded in at runtime via do, require, or eval.
Locking Your Code
Sometimes it's more appropriate to lock code rather than data. You might, for example, have a subroutine that updates a configuration file, and the last thing that you want is to have multiple threads running it at once. And it's often much simpler to lock a single subroutine than to lock dozens of variables.
Locking a subroutine is simple. If you're running with Perl 5.005 or higher, make the first line of your subroutine
    use attrs qw(locked);
and Perl will ensure that only one thread is in the subroutine at any one time. If your code might run on older versions of Perl, though, you don't want to do that. Instead, make the first line of the subroutine
    lock(\&subname);
where subname is the name of the subroutine being locked. The use attrs method is slightly faster, but the speed difference isn't that noticeable unless you're doing a lot of subroutine locking.
Once the subroutine is locked, you can be sure that no other thread can enter it until the lock is released. Subroutine locks, by the way, are the only mandatory locks in Perl - when a thread locks a subroutine, Perl enforces that lock and will not let any other thread into that subroutine until the lock is released. While this isn't that big a deal if the subroutine lock is inside the subroutine (like we're talking about here), it can be an issue if you lock the subroutine someplace else.
Once again, Perl 5.005 and higher provide this functionality. All you need to do to get Perl to use method locking rather than subroutine locking is to make this the first line of your subroutine:
    use attrs qw(locked method);
and Perl will automatically use method locking instead of subroutine locking. If this subroutine is called as a method on an object, Perl will lock the object. If called as a static method, Perl locks the whole stash. (The stash, for those not familiar with Perl's guts, is a hash that holds a package's global variables and subroutines.)
This makes duplicating the method locking behavior a bit trickier. The code to do so looks like this:
    package MyPackage::SubPackage;

    sub locked_method {
        my $obj = shift;
        # Lock the object if we got one
        lock $obj if ref($obj);
        # Lock the stash if we didn't
        lock $::{'MyPackage::'}{'SubPackage::'} unless ref($obj);
        # Do your stuff here while the locks are still in scope
    }
You'll need to update the stash lock line depending on what package the subroutine is actually in. While it's possible to determine this at runtime, it's pretty expensive, and Perl's method calls hurt enough as it is.
One thing you'll notice here is that we're getting a lock just on the object or stash. Nothing special is done to match up the subroutine and object, or subroutine and stash. This is consistent with Perl's behavior - entering a locked method for an object or package prevents any other thread from entering a locked method for that object or package.
    package MyPackage;
    my $package_lock;

    sub foo {
        lock $package_lock;
        # Do stuff
    }

    sub bar {
        lock $package_lock;
        # More stuff
    }

    sub baz {
        # Just do stuff without locks
    }
While these shortcomings may be fixed in future releases of Perl (we are, to some extent, limited to the facilities provided by different platforms' threading libraries), right now the only defense against deadlock is careful programming. To avoid deadlocks, follow these rules:
Actually acquiring a lock isn't really a performance killer. What can bite you is when a thread blocks trying to acquire a lock. Blocking and later waking up cost a little bit of time, but more importantly it creates a bottleneck - what folks doing threads call a critical path.
Critical paths are usually bad, especially on multiprocessor machines, since they reduce the level of concurrency in your program. The more threads stuck trying to get into a critical path, the lower the level of concurrency. The lower the level of concurrency, the less well your CPU resources are used. In particularly bad cases, adding an extra CPU can actually decrease performance. Concurrency is your friend!
Don't think that critical paths are only an issue on multiprocessor machines, though. You can cause yourself similar problems on a uniprocessor machine by holding a lock across a blocking system call, so just don't do that.
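One way to avoid holding a lock across a blocking call is to take the lock in a small inner block, copy out what you need, and do the slow work after the block ends. This is only a sketch: the @pending queue and the flush_pending() name are hypothetical.

```perl
use strict;

# Hypothetical work queue, invented for this sketch.
my @pending;

sub flush_pending {
    my ($fh) = @_;
    my @batch;
    {
        lock(@pending);              # held only inside this block
        @batch = splice @pending;    # take everything, leaving the queue empty
    }
    # The write, which may block, happens with no lock held.
    print {$fh} "$_\n" for @batch;
    return scalar @batch;
}
```

Other threads can keep adding to @pending while the write is in progress, because the lock is gone as soon as the inner block exits.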
Hopefully, we've shown that proper locking is both good and reasonably simple.