Seven Useful Uses of local

Mark-Jason Dominus

In my article Coping With Scoping I offered the advice "Always use my; never use local." The most common use for both is to provide your subroutines with private variables, and for this application you should always use my, and never local. But many readers (and the tech editors) noted that local isn't entirely useless; there are cases in which my doesn't work, or doesn't do what you want. So I promised a followup article on useful uses for local. Here they are.

1. Special Variables

my makes most uses of local obsolete. So it's not surprising that the most common useful uses of local arise because of peculiar cases where my happens to be illegal. The most important examples are the punctuation variables such as $", $/, $^W, and $_. Long ago Larry Wall decided that it would be too confusing if you could my them; they're exempt from the normal package scheme for the same reason. So if you want to change them, but have the change apply to only part of the program, you'll have to use local. As an example of where this might be useful, let's consider a function whose job is to read in an entire file and return its contents as a single string:

sub getfile {
   my $filename = shift;
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!";
   my $contents = '';
   while (<F>) {
      $contents .= $_;
   }
   close F;
   return $contents;
}

This is inefficient, because the <F> operator makes Perl go to all the trouble of breaking the file into lines and returning them one at a time, and then all we do is put them back together again. It's cheaper to read the file all at once, without all the splitting and reassembling. (Some people call this slurping the file.) Perl has a special feature to support this: If the $/ variable is undefined, the <> operator will read the entire file all at once:

sub getfile {
   my $filename = shift;
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!";
   $/ = undef;      # Read entire file at once 
   my $contents = <F>; # Return file as one single 'line'
   close F; 
   return $contents; 
}

There's a terrible problem here, which is that $/ is a global variable that affects the semantics of every <> in the entire program. If getfile() doesn't put it back the way it was, some other part of the program is probably going to fail disastrously when it tries to read a line of input and gets the whole rest of the file instead. Normally we'd like to use my, to make the change local to the function. But we can't here, because my doesn't work on punctuation variables; we would get this error if we tried:


Can't use global $/ in "my" ...

Also, more to the point, Perl itself knows that it should look in the global variable $/ to find the input record separator; even if we could create a new private variable with the same name, Perl wouldn't know to look there. So instead, we need to set a temporary value for the global variable $/, and that is exactly what local does:

sub getfile {
   my $filename = shift;
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!";
   local $/ = undef;   # Read entire file at once
   my $contents = <F>; # Return file as one single 'line'
   close F; 
   return $contents; 
} 

The old value of $/ is restored when the function returns. In this example, that's enough for safety. In a more complicated function that might call some other functions in a library somewhere, we'd still have to worry that we might be sabotaging the library with our strange $/. It's probably best to confine our changes to the smallest possible part of the program:

sub getfile {
   my $filename = shift; 
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!"; 
   my $contents; 
   { local $/ = undef;   # Read entire file at once 
      $contents = <F>;   # Return file as one single 'line' 
   }                     # $/ regains its old value
   close F; 
   return $contents; 
}

This is a good practice, even for simple functions like this that don't call any other subroutines. By confining the changes to $/ to just the one line we want to affect, we've prevented the possibility that someone in the future will insert some calls to other functions that will break because of the change. This is called defensive programming.
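As a rough check that the change really is confined to the block, here is a minimal sketch. It reads from an in-memory file so that it is self-contained; the lexical filehandle, the in-memory open (which assumes Perl 5.8 or later), and the name slurp_rest are assumptions made for this illustration and are not part of getfile() above.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything left to read on an open handle.
sub slurp_rest {
    my $fh = shift;
    my $contents;
    {
        local $/ = undef;      # no record separator, inside this block only
        $contents = <$fh>;     # one read returns the rest of the file
    }                          # $/ regains its old value here
    return $contents;
}

my $data = "one\ntwo\nthree\n";
open my $fh, '<', \$data or die "Couldn't open in-memory file: $!";

my $first = <$fh>;             # ordinary line read: "one\n"
my $rest  = slurp_rest($fh);   # "two\nthree\n"
my $separator_restored = (defined $/ && $/ eq "\n") ? 1 : 0;
```

The line read before the call still stops at the first newline, and $/ is back to a newline once slurp_rest() has returned.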

Although you may not think about it much, localizing $_ this way can be very important. Here's a slightly different version of getfile(), one which throws away comments and blank lines from the file that it gets:

sub getfile {
   my $filename = shift; 
   local *F; 
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!";
   my $contents;
   while (<F>) { 
      s/#.*//;            # Remove comments 
      next unless /\S/;   # Skip blank lines
      $contents .= $_;    # Save current (nonblank) line
   } 
   return $contents; 
}

This function has a terrible problem. Here's the terrible problem: If you call it like this, it clobbers the elements of @array.

foreach (@array) {
   ...  
   $f = getfile($filename); 
   ... 
}

Why? Because inside a foreach loop, $_ is aliased to the elements of the array; if you change $_, it changes the array. And getfile() does change $_. To prevent itself from sabotaging the $_ of anyone who calls it, getfile() should have local $_ at the top.
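The clobbering is easy to reproduce. The sketch below uses two hypothetical functions, count_lines_careless() and count_lines_careful(), that differ only in the local $_; like the other sketches here it assumes a Perl with in-memory files (5.8 or later), which getfile() itself doesn't need.

```perl
use strict;
use warnings;

my $text = "# comment\nalpha\n\nbeta\n";

# Hypothetical function that reads into $_ without protecting it.
sub count_lines_careless {
    open my $fh, '<', \$text or die "Couldn't open: $!";
    my $n = 0;
    while (<$fh>) { $n++ if /\S/ }   # assigns to whatever $_ currently is
    return $n;
}

# The same function with local $_ at the top.
sub count_lines_careful {
    local $_;                        # caller's $_ is saved and later restored
    open my $fh, '<', \$text or die "Couldn't open: $!";
    my $n = 0;
    while (<$fh>) { $n++ if /\S/ }
    return $n;
}

my @careless = ('a', 'b');
foreach (@careless) { count_lines_careless() }   # elements end up undefined

my @careful = ('a', 'b');
foreach (@careful) { count_lines_careful() }     # elements are left alone
```

After the first loop both elements of @careless are undefined, because the final failed read stored undef into $_, which was aliased to the array element.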

Other special variables present similar problems. For example, it's sometimes convenient to change $", $,, or $\ to alter the way print works, but if you don't arrange to put them back the way they were before you call any other functions, you might get a big disaster:

# Good style:
{ local $" = ')(';
   print "Array a: (@a)\n";
}
# Program continues safely...
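
To confirm that the old separator really does come back, a minimal sketch along the same lines can save the interpolated string inside and outside the block. The variable names $inside and $outside are made up for this illustration.

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my ($inside, $outside);
{
    local $" = ')(';      # separator used when an array is interpolated
    $inside = "(@a)";     # "(1)(2)(3)"
}
$outside = "(@a)";        # "(1 2 3)": the default separator, a space, is back
```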

Another common situation in which you localize a special variable is when you want to temporarily suppress warning messages. Warnings are enabled by the -w command-line option, which in turn sets the variable $^W to a true value. If you reset $^W to a false value, that turns the warnings off. Here's an example: My Memoize module creates a front-end to the user's function and then installs it into the symbol table, replacing the original function. That's what it's for, and it would be awfully annoying to the user to get this warning every time they tried to use my module:

Subroutine factorial redefined at Memoize.pm line 113

So I have this, which turns off the warning for just the one line.

  {
     local $^W = 0;                    # Shut UP!
     *{$name} = $tabent->{UNMEMOIZED};  
  }

The old value of $^W is automatically restored after the chance of triggering the warning is over.

2. Localized Filehandles

Let's look back at that getfile() function. To read the file, it opened the filehandle F. That's fine, unless some other part of the program happened to have already opened a filehandle named F. In that case, the old file is closed, and when control returns from the function, that other part of the program is going to become very confused and upset. This is the 'filehandle clobbering problem'.

This is exactly the sort of problem that local variables were supposed to solve. Unfortunately, there's no way to localize a filehandle directly in Perl.

Well, that's actually a fib. There are three ways to do it:

  1. You can cast a magic spell in which you create an anonymous glob, extract the filehandle from it, and discard the rest of the glob.
  2. You can use the FileHandle or IO::Handle modules, which cast the spell I just described, and present you with the results, so that you don't have to perform any sorcery yourself.
  3. See below.

The simplest and cheapest way to solve the filehandle clobbering problem is a little bit obscure. You can't localize the filehandle itself, but you can localize the entry in Perl's symbol table that associates the filehandle's name with the filehandle. This entry is called a 'glob'. In Perl, variables don't have names directly; instead the glob has a name, and the glob gathers together the scalar, array, hash, subroutine, and filehandle with that name. In Perl, the glob named F is denoted with *F.

To localize the filehandle, we actually localize the entire glob, which is a little hamfisted:

sub getfile {
   my $filename = shift; 
   local *F; 
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!";
   local $/ = undef;   # Read entire file at once 
   my $contents = <F>; # Return file as one 'line'
   close F; 
   return $contents; 
}

local on a glob does the same as any other local: It saves the current value somewhere, creates a new value, and arranges that the old value will be restored at the end of the current block. In this case, that means that any filehandle that was formerly attached to the old *F glob is saved, and the open will apply to the filehandle in the new, local glob. At the end of the block, filehandle F will regain its old meaning again.

This works pretty well most of the time, except that you still have the usual local worries about called subroutines changing the localized values on you. You can't use my here because globs are all about the Perl symbol table; the lexical variable mechanism is totally different, and there is no such thing as a lexical glob.

With this technique, you have the new problem that getfile() can't get at $F, @F, or %F either, because you localized them all, along with the filehandle. But you probably weren't using any global variables anyway. Were you? And getfile() won't be able to call &F, for the same reason. There are a few ways around this, but the easiest one is that if getfile() needs to call &F, it should name the local filehandle something other than F.

use FileHandle does have fewer strange problems. Unfortunately, it also sucks a few thousand lines of code into your program. Now someone will probably write in to complain that I'm exaggerating, because it isn't really 3,000 lines, some of those are whitespace, blah blah blah. Okay, let's say it's only 300 lines to use FileHandle, probably a gross underestimate. It's still only one line to localize the glob. For many programs, localizing the glob is a good, cheap, simple way to solve the problem.
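
Here is a minimal sketch of the glob-localizing technique protecting a caller that is already reading from F. The function name peek_inner and the two in-memory files are assumptions made so that the example is self-contained (in-memory files assume Perl 5.8 or later).

```perl
use strict;
use warnings;

my $outer_text = "outer 1\nouter 2\n";
my $inner_text = "inner\n";

# Hypothetical callee that also wants a filehandle named F.
sub peek_inner {
    local *F;                     # fresh glob; the caller's F is set aside
    open F, '<', \$inner_text or die "Couldn't open: $!";
    return scalar <F>;            # the local F goes away when we return
}

open F, '<', \$outer_text or die "Couldn't open: $!";
my $line1 = <F>;                  # "outer 1\n"
my $inner = peek_inner();         # "inner\n"
my $line2 = <F>;                  # "outer 2\n": the caller's F was not disturbed
close F;
```

Without the local *F, the open inside peek_inner() would have closed the caller's file, and the second read would not have found "outer 2".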

Localized Filehandles, II

When a localized glob goes out of scope, its open filehandle is automatically closed. So the close F in getfile() is unnecessary:

sub getfile {
   my $filename = shift;
   local *F;
   open F, "< $filename" or die 
              "Couldn't open '$filename': $!";
   local $/ = undef;    # Read entire file at once
   return <F>;          # Return file as one single 'line'
}  # F is automatically closed here

That's such a convenient feature that it's worth using even when you're not worried that you might be clobbering someone else's filehandle.

The filehandles that you get from FileHandle and IO::Handle do this also.

Marginal Uses of Localized Filehandles

As I was researching this article, I kept finding common uses for local that turned out not to be useful, because there were simpler and more straightforward ways to do the same thing without using local.

Here is one that you see far too often. People sometimes want to pass a filehandle to a subroutine, and they know that you can pass a filehandle by passing the entire glob, like this:

$rec = read_record(*INPUT_FILE);

sub read_record {
    local *FH = shift;
    my $record;
    read FH, $record, 1024;
    return $record;
}

Here we pass in the entire glob INPUT_FILE, which includes the filehandle of that name. Inside read_record, we temporarily alias FH to INPUT_FILE, so that the filehandle FH inside the function is the same as whatever filehandle was passed in from outside. Then when we read from FH, we're actually reading from the filehandle that the caller wanted. But actually there's a more straightforward way to do the same thing:

$rec = read_record(*INPUT_FILE);

sub read_record {
    my $fh = shift;
    my $record;
    read $fh, $record, 1024;
    return $record;
}

You can store a glob into a scalar variable, and you can use such a variable in any of Perl's I/O functions wherever you might have used a filehandle name. So the local here was unnecessary.

Dirhandles

Filehandles and dirhandles are stored in the same place in Perl, so everything this article says about filehandles applies to dirhandles in the same way.

3. The First-Class Filehandle Trick

Often you want to put filehandles into an array, or treat them like regular scalars, or pass them to a function, and you can't, because filehandles aren't really first-class objects in Perl. As noted above, you can use the FileHandle or IO::Handle packages to construct a scalar that acts something like a filehandle, but there are some definite disadvantages to that approach.

Another approach is to use a glob as a filehandle; it turns out that a glob will fit into a scalar variable, so you can put it into an array or pass it to a function. The only problem with globs is that they are apt to have strange and magical effects on the Perl symbol table. What you really want is a glob that has been disconnected from the symbol table, so that you can just use it like a filehandle and forget that it might once have had an effect on the symbol table. It turns out that there is a simple way to do that:

my $filehandle = do { local *FH };

do just introduces a block which will be evaluated, and will return the value of the last expression that it contains, which in this case is local *FH. The value of local *FH is a glob. But what glob? local takes the existing FH glob and temporarily replaces it with a new glob. But then it immediately goes out of scope and puts the old glob back, leaving the new glob without a name. But then it returns the new, nameless glob, which is then stored into $filehandle. This is just what we wanted: A glob that has been disconnected from the symbol table. You can make a whole bunch of these, if you want:

for $i (0 .. 99) {
   $fharray[$i] = do { local *FH };
}

You can pass them to subroutines, return them from subroutines, put them in data structures, and give them to Perl's I/O functions like open, close, read, print, and <> and they'll work just fine.
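
As a sketch of the trick in use, the following opens three nameless globs at once and reads one line from each. The in-memory files (which assume Perl 5.8 or later) and the variable names are assumptions for the sake of a self-contained illustration.

```perl
use strict;
use warnings;
no warnings 'once';   # *FH is deliberately mentioned only once

my @texts = ("first\n", "second\n", "third\n");
my @handles;
for my $i (0 .. $#texts) {
    my $fh = do { local *FH };    # a new, nameless glob each time around
    open $fh, '<', \$texts[$i] or die "Couldn't open: $!";
    push @handles, $fh;           # globs fit in an ordinary array
}

my @lines;
for my $fh (@handles) {
    push @lines, scalar readline($fh);   # each handle has its own file
}
```

All three files are open simultaneously, even though every glob was born with the name FH.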

4. Aliases

Globs turn out to be very useful. You can assign an entire glob, as we saw above, and alias an entire symbol in the symbol table. But you don't have to do it all at once. If you say

*GLOB = $reference;

then Perl only changes the meaning of part of the glob. If the reference is a scalar reference, it changes the meaning of $GLOB, which now means the same as whatever scalar the reference referred to; @GLOB, %GLOB and the other parts don't change at all. If the reference is a hash reference, Perl makes %GLOB mean the same as whatever hash the reference referred to, but the other parts stay the same. Similarly for other kinds of references.

You can use this for all sorts of wonderful tricks. For example, suppose you have a function that is going to do a lot of operations on $_[0]{Time}[2] for some reason. You can say

*arg = \$_[0]{Time}[2];

and from then on, $arg is synonymous with $_[0]{Time}[2], which might make your code simpler, and probably more efficient, because Perl won't have to go digging through three levels of indirection every time. But you'd better use local, or else you'll permanently clobber any $arg variable that already exists. (Gurusamy Sarathy's Alias module does this, but without the local.)
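
A minimal sketch of the aliasing trick with the local in place follows. The function add_seconds and the %event structure are invented for this illustration; the use vars line is there only so that strict 'vars' accepts the unqualified $arg.

```perl
use strict;
use warnings;
use vars qw($arg);            # package variable, so strict accepts $arg

$arg = 'untouched';

# Hypothetical function: advance the seconds field at $_[0]{Time}[2].
sub add_seconds {
    local *arg = \$_[0]{Time}[2];   # during this call, $arg IS that element
    $arg += $_[1];
    $arg -= 60 while $arg >= 60;
}

my %event = (Time => [12, 34, 45]);
add_seconds(\%event, 30);     # 45 + 30 = 75, wrapped around to 15
```

Afterwards $event{Time}[2] holds 15, and the global $arg has its old value again because the glob was localized.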

You can create locally-scoped subroutines that are invisible outside a block by saying

*mysub = sub { ... } ;

and then call them with mysub(...). But you must use local, or else you'll permanently clobber any mysub subroutine that already exists.
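
A short sketch of such a temporary subroutine, with hypothetical names area_report and area:

```perl
use strict;
use warnings;

sub area_report {
    my ($w, $h) = @_;
    local *area = sub { $_[0] * $_[1] };   # exists only until area_report returns
    return "area=" . area($w, $h);
}

my $report = area_report(3, 4);                # "area=12"
my $still_defined = defined &area ? 1 : 0;     # 0: the temporary sub is gone
```

Bear in mind that this is still local, so any function called from area_report() while the block is active would see area() too.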

5. Dynamic Scope

local introduces what is called dynamic scope, which means that the 'local' variable that it declares is inherited by other functions called from the one with the declaration. Usually this isn't what you want, and it's rather a strange feature, unavailable in many programming languages. To see the difference, consider this example:

first();

sub first {
    local $x = 1;
    my    $y = 1;
    second();
}

sub second {
    print "x=", $x, "\n";
    print "y=", $y, "\n";
}

The variable $y is a true local variable. It's available only from its declaration through the end of the enclosing block. In particular, it's unavailable inside of second(), which prints "y=", not "y=1". This is called lexical scope. local, in contrast, does not actually make a local variable. It creates a new 'local' value for a global variable, which persists until the end of the enclosing block. When control exits the block, the old value is restored. But the variable, and its new 'local' value, are still global, and hence accessible to other subroutines that are called before the old value is restored. second() above prints "x=1", because $x is a global variable that temporarily happens to have the value 1. Once first() returns, the old value will be restored. This is called dynamic scope, which is a misnomer, because it's not really scope at all.
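
The restoration of the old value can be seen directly in a small sketch. The function show and the array @seen are made up for the illustration, and use vars is there only to keep strict 'vars' quiet about the global $x.

```perl
use strict;
use warnings;
use vars qw($x);

$x = 'outer';

sub show  { return $x }                           # reads the global $x
sub first { local $x = 'inner'; return show() }   # show() inherits 'inner'

my @seen = (show(), first(), show());             # outer, inner, outer
```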

For 'local' variables, you almost always want lexical scope, because it ensures that variables that you declare in one subroutine can't be tampered with by other subroutines. But every once in a strange while, you actually do want dynamic scope, and that's the time to get local out of your bag of tricks.

Here's the most useful example I could find, and one that really does bear careful study. We'll make our own iteration syntax, in the same family as Perl's grep and map. Let's call it listjoin; it'll combine two lists into one:

@list1 = (1,2,3,4,5);
@list2 = (2,3,5,7,11);
@result = listjoin { $a + $b } @list1, @list2;

Now the @result is (3,5,8,11,16). Each element of the result is the sum of the corresponding terms from @list1 and @list2. If we wanted differences instead of sums, we could have put { $a - $b }. In general, we can supply any code fragment that does something with $a and $b, and listjoin will use our code fragment to construct the elements in the result list.

Here's a first cut at listjoin:

sub listjoin (&\@\@) {

Oops! The first line already has a lot of magic. Let's stop here and sightsee a while before we go on. The (&\@\@) is a prototype. In Perl, a prototype changes the way the function is parsed and the way its arguments are passed. In (&\@\@), the & warns the Perl compiler to expect to see a brace-delimited block of code as the first argument to this function, and tells Perl that it should pass listjoin a reference to that block. The block behaves just like an anonymous function. The \@\@ says that listjoin should get two other arguments, which must be arrays; Perl will pass listjoin references to these two arrays. If any of the arguments are missing, or have the wrong type (a hash instead of an array, for example) Perl will signal a compile-time error.

The result of this little wad of punctuation is that we will be able to write

listjoin { $a + $b } @list1, @list2;

and Perl will behave as if we had written

listjoin(sub { $a + $b }, \@list1, \@list2);

With the prototype, Perl knows enough to let us leave out the parentheses, the sub, the first comma, and the backslashes. Perl has too much punctuation already, so we should take advantage of every opportunity to use less.

Now that that's out of the way, the rest of listjoin is straightforward:

sub listjoin (&\@\@) {
    my $code = shift;           # Get the code block
    my $arr1 = shift;           # Get reference to first array
    my $arr2 = shift;           # Get reference to second array
    my @result;
    while (@$arr1 && @$arr2) {
        my $a = shift @$arr1;   # Element from array 1 into $a
        my $b = shift @$arr2;   # Element from array 2 into $b
        push @result, &$code(); # Execute code and get result
    }
    return @result;
}

listjoin simply runs a loop over the elements in the two arrays, putting elements from each into $a and $b, respectively, and then executing the code and pushing the result into @result. All very simple and nice, except that it doesn't work: By declaring $a and $b with my, we've made them lexical, and they're unavailable to the $code. Removing the my's from $a and $b makes it work:

$a = shift @$arr1;
$b = shift @$arr2;

But this solution is boobytrapped. Without the my declaration, $a and $b are global variables, and whatever values they had before we ran listjoin are lost now.

The correct solution is to use local. This preserves the old values of the $a and $b variables, if there were any, and restores them when listjoin() is finished. But because of dynamic scoping, the values set by listjoin() are inherited by the code fragment. Here's the correct solution:

sub listjoin (&\@\@) {
    my $code = shift;
    my $arr1 = shift;
    my $arr2 = shift;
    my @result;
    while (@$arr1 && @$arr2) {
        local $a = shift @$arr1;
        local $b = shift @$arr2;
        push @result, &$code();
    }
    return @result;
}
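
One thing to watch for: because the prototype passes references, the shift calls in the version above eat up the caller's own @list1 and @list2. The variant below is a sketch of one way around that, walking the arrays by index instead. It is an adaptation for illustration, not the listjoin shown above.

```perl
use strict;
use warnings;

sub listjoin (&\@\@) {
    my ($code, $arr1, $arr2) = @_;
    # Stop at the end of the shorter array, as the shift version does.
    my $n = @$arr1 < @$arr2 ? @$arr1 : @$arr2;
    my @result;
    for my $i (0 .. $n - 1) {
        local $a = $arr1->[$i];     # dynamic scope: visible inside $code
        local $b = $arr2->[$i];
        push @result, $code->();
    }
    return @result;
}

my @list1 = (1, 2, 3, 4, 5);
my @list2 = (2, 3, 5, 7, 11);
my @sums  = listjoin { $a + $b } @list1, @list2;   # (3, 5, 8, 11, 16)
my @diffs = listjoin { $a - $b } @list1, @list2;   # @list1 is still intact
```

The second call works only because the first one left the input arrays alone.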

You might worry about another problem: Suppose you had strict 'vars' in force. Shouldn't listjoin { $a + $b } be illegal? It should be, because $a and $b are global variables, and the purpose of strict 'vars' is to forbid the use of unqualified global variables.

But actually, there's no problem here, because strict 'vars' makes a special exception for $a and $b. These two names, and no others, are exempt from strict 'vars', because if they weren't, sort wouldn't work either, for exactly the same reason. We're taking advantage of that here by giving listjoin the same kind of syntax. It's a peculiar and arbitrary exception, but one that we're happy to take advantage of. Here's another example in the same vein:

sub printhash (&\%) {
    my $code = shift;
    my $hash = shift;
    local ($k, $v);
    while (($k, $v) = each %$hash) {
        print &$code();
    }
}

Now you can say

printhash { "$k => $v\n" } %capitals;

and you'll get something like

Athens => Greece
Moscow => Russia
Helsinki => Finland

or you can say

printhash { "$k," } %capitals;

and you'll get

Athens,Moscow,Helsinki,

Note that because I used $k and $v here, you might get into trouble with strict 'vars'. You'll either have to change the definition of printhash to use $a and $b instead, or you'll have to use vars qw($k $v).

6. Dynamic Scope Revisited

Here's another possible use for dynamic scope: You have some subroutine whose behavior depends on the setting of a global variable. This is usually a result of bad design, and should be avoided unless the variable is large and widely used. We'll suppose that this is the case, and that the variable is called %CONFIG. You want to call the subroutine, but you want to change its behavior. Perhaps you want to trick it about what the configuration really is, or perhaps you want to see what it would do if the configuration were different, or you want to try out a fake configuration to see if it works. But you don't want to change the real global configuration, because you don't know what bizarre effects that will have on the rest of the program. So you do this:

local %CONFIG = (new configuration here);
the_subroutine();

The changed %CONFIG is inherited by the subroutine, and the original configuration is restored automatically when the declaration goes out of scope.

Actually in this kind of circumstance you can sometimes do better. Here's how: Suppose that the %CONFIG hash has lots and lots of members, but we only want to change $CONFIG{VERBOSITY}. The obvious thing to do is something like this:

my %new_config = %CONFIG;       # Copy configuration
$new_config{VERBOSITY} = 1000;  # Change one member
local %CONFIG = %new_config;    # Copy changed back, temporarily
the_subroutine();               # Subroutine inherits change

But there's a better way:

local $CONFIG{VERBOSITY} = 1000; # Temporary change!
the_subroutine();

You can actually localize a single element of an array or a hash. It works just like localizing any other scalar: The old value is saved, and restored at the end of the enclosing scope.
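
A minimal sketch of the single-element version, with an invented the_subroutine() that just reports what it sees (the COLOR entry is likewise made up):

```perl
use strict;
use warnings;
use vars qw(%CONFIG);

%CONFIG = (VERBOSITY => 1, COLOR => 'off');

# Hypothetical subroutine whose behavior depends on the global %CONFIG.
sub the_subroutine { return "verbosity=$CONFIG{VERBOSITY}" }

my $during;
{
    local $CONFIG{VERBOSITY} = 1000;   # temporary change to one element
    $during = the_subroutine();        # "verbosity=1000"
}
my $after = the_subroutine();          # "verbosity=1": old value is back
```

The rest of %CONFIG is never copied or touched.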

Marginal Uses of Dynamic Scoping

As with local filehandles, I kept finding examples of dynamic scoping that seemed to require local, but on further reflection didn't. Lest you be tempted to make one of these mistakes, here they are.

One application people sometimes have for dynamic scoping is like this: Suppose you have a complicated subroutine that does a search of some sort, returning a bunch of items. If the search function is complicated enough, you might like to have it simply deposit each item into a global array variable when it's found, rather than returning the complete list from the subroutine, especially if the search subroutine is recursive in a complicated way:

sub search {
    # do something very complicated here
    if ($found) {
        push @solutions, $solution;
    }
    # do more complicated things
}

This is dangerous, because @solutions is a global variable, and you don't know who else might be using it.

In some languages, the best answer is to add a front-end to search that localizes the global @solutions variable:

sub search {
    local @solutions;
    realsearch(@_);
    return @solutions;
}

sub realsearch {
    # ... as before ...
}

Now the real work is done in realsearch(), which still gets to store its solutions into the global variable. But since the user of realsearch() is calling the front-end search() function, any old value that @solutions might have had is saved beforehand and restored again afterwards.

There are two other ways to accomplish the same thing, and both of them are better than this way. Here's one:

{ my @solutions;  # Private, but available to both functions
   sub search {
      realsearch(@_);
      return @solutions;
   }
   sub realsearch {
      # ... just as before ...
      # but now it modifies a private variable not a global.
   }
}

Here's the other:

sub search {
    my @solutions;
    realsearch(\@solutions, @_);
    return @solutions;
}

sub realsearch {
    my $solutions_ref = shift;
    # do something very complicated here
    if ($found) {
        push @$solutions_ref, $solution;
    }
    # do more complicated things
}

One or the other of these strategies will solve most problems where you might think you would want to use a dynamic variable. They're both safer than the solution with local because you don't have to worry that the global variable will 'leak' out into the subroutines called by realsearch().
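
To make the second strategy concrete, here is a sketch with a recursive realsearch() that collects matching items from a nested list. The nested-array data and the $wanted callback are assumptions invented for this example; only the passing of \@solutions comes from the code above.

```perl
use strict;
use warnings;

sub search {
    my @solutions;                     # private to this call
    realsearch(\@solutions, @_);
    return @solutions;
}

# Hypothetical recursive search over nested array references.
sub realsearch {
    my ($solutions_ref, $tree, $wanted) = @_;
    if (ref($tree) eq 'ARRAY') {
        realsearch($solutions_ref, $_, $wanted) for @$tree;
    }
    elsif ($wanted->($tree)) {
        push @$solutions_ref, $tree;   # deposit into the caller's array
    }
}

my @even = search([1, [2, 3, [4]], 6], sub { $_[0] % 2 == 0 });   # (2, 4, 6)
```

No global variable is involved, so nothing can leak into other subroutines.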

One final example of a marginal use of local: I can imagine an error-handling routine that examines the value of some global error message variable such as $! or $DBI::errstr to decide what to do. If this routine seems to have a more general utility, you might want to call it even when there wasn't an error, because you want to invoke its cleanup behavior, or you like the way it issues the error message, or whatever. It should accept the message as an argument instead of examining some fixed global variable, but it was badly designed and now you can't change it. If you're in this kind of situation, the best solution might turn out to be something like this:

local $DBI::errstr = "Your shoelace is untied!";
handle_error();

Probably a better solution is to find the person responsible for the routine and to sternly remind them that functions are more flexible and easier to reuse if they don't depend on hardwired global variables. But sometimes time is short and you have to do what you can.

7. Perl 4 and Other Relics

A lot of the useful uses for local became obsolete with Perl 5; local was much more useful in Perl 4. The most important of these was that my wasn't available, so you needed local for private variables.

If you find yourself programming in Perl 4, expect to use a lot of local; without my, we had to do the best we could with what we had.

Summary

Useful uses for local fall into two classes: First, places where you would like to use my, but you can't because of some restriction, and second, rare, peculiar, or contrived situations.

For the vast majority of cases, you should use my, and avoid local whenever possible. In particular, when you want private variables, use my, because local variables aren't private.

Even the useful uses for local are mostly not very useful.

Revised rule of when to use my and when to use local:

  1. Always use my; never use local unless you get an error when you try to use my.
  2. Experts don't need me to tell them what the real rules are.
__END__

Mark-Jason Dominus lives in Philadelphia and works as a programming and systems administration consultant. This August, he will teach three classes at the O'Reilly Perl Conference (on Regular Expressions, Web Security, and Tricks of the Wizards) and has also been invited to give a sequel to last year's popular talk on "The Perl Hardware Store". He likes to get mail, so send him some at mjd-tpj@plover.com.

