The more you know about Perl, though, and the more of its features you use, the more you must understand how they work. When you put these features together, they can often interact in surprising ways. This article explores the interaction between context, prototyping, and subroutine calls.
Tom Christiansen has long been in favor of reducing these interactions by dropping context. In fact, he goes so far as to call it "cryptocontext," and he came up with the challenge which prompted this article.
Given these:
sub f($$); @a = (5,9);
Explain what each of these does, and why:
&f; &f(); f(); f; f(@a); f(@a[0,1]); f(@a, @a); &f(@a); &f(@a, @a); f(`ls /bin`, `ls /tmp`); &f(`ls /bin`, `ls /tmp`);
Frightened that these may not do the obvious? Read on, and I'll explain what happens.
Context is one of the gnarlier pieces of Perl. While Tom is anti-context, Larry Wall is a proponent. This is as you'd expect - he incorporated it into the language. Larry is a linguist, and he tried to adapt some of the features of successful natural languages in the hopes of making Perl a successful unnatural language. Context was one of those features.
When compiling and running your program, the Perl interpreter associates a context with every expression. There are three types of context: void, scalar, and list. Void and scalar are almost identical, and we'll be looking at the difference between them and list context.
You may have encountered the importance of context in situations like these:
@list_context = `ls`; $scalar_context = `ls`;
@list_context now holds the output of ls, one line per list element. $scalar_context has the output of ls as one big string with embedded newlines. The backticks know what to return (one big string of many lines, or many one-line strings) because of the context they were called in.
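The same effect can be seen without running an external command. As a minimal sketch, the built-in localtime also answers differently depending on the context it is called in; the variable names below are purely illustrative.

```perl
use strict;
use warnings;

# List context: localtime returns nine separate fields
# (seconds, minutes, hours, day of month, month, year, weekday, yearday, isdst).
my @fields = localtime;

# Scalar context: localtime returns a single human-readable date string.
my $stamp = localtime;

print scalar(@fields), " fields in list context\n";
print "one string in scalar context: $stamp\n";
```

Nothing about the call itself changes between the two assignments; only the left-hand side, and therefore the context, differs.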
You also see context in other situations:
if (@array) {
    # the array is not empty
}
The if statement needs to test the truth of @array, and so evaluates it in a scalar context. It just so happens that an array in scalar context evaluates to the number of elements in the array. Thus:
@array = ('a', 'b', 'c'); $count = @array + 1;
$count is now set to four. This is where the distinction between arrays and lists becomes important. Lists don't behave the same:
$count = (5, 7, 9) + 1;
This sets $count to ten. A list in scalar context evaluates to the value of the last element, in this case nine.
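The two cases can be put side by side. This is a small sketch using the values from the text; the variable names are assumptions, and the void-context warning is silenced only because the discarded 5 and 7 would otherwise trigger it.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

# An array in scalar context yields its element count: 3 + 1.
my $from_array = @array + 1;

my $from_list;
{
    no warnings 'void';   # the discarded 5 and 7 would otherwise warn
    # A comma-separated list in scalar context yields its last value: 9 + 1.
    $from_list = (5, 7, 9) + 1;
}

print "$from_array $from_list\n";   # prints "4 10"
```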
You can see why context is confusing. Let's stop talking context and consider prototypes.
Version 5.002 of Perl added support for prototypes. They let you tell Perl what type of arguments your subroutines expect; Perl can then do some elementary type-checking and optimizations with that information.
A prototyped subroutine looks like this:
sub add_two ($$) { return $_[0] + $_[1]; }
The ($$) is the prototype. It's a shorthand to identify the types of the arguments that will be passed to your subroutine. In this case, Perl is told to expect two scalars. You can prototype your subroutines without defining them:
sub add_two ($$); # add_two will be defined later
If you then try to call your subroutine without two arguments that can be evaluated as scalars, Perl will complain. (However, almost everything can be coerced to a scalar. Scalar prototypes accept @arrays and %hashes without warning!) All prototype checking is performed during compilation - before your program is run. This means that the prototype-checking stage of the compiler is necessarily limited in what it can deduce. In particular, you can't build up an array of arguments and pass the array to a subroutine prototyped to take many scalars:
sub complain_about ($$$$); @args = (0, 1, 2, 3); complain_about(@args); # WRONG
The compiler can't know beforehand how big @args will be when complain_about() is called; that's a runtime thing. Sure, @args was just assigned to, but the compiler only built the internal instructions to make that assignment; it hasn't actually created the array in memory. That won't happen until runtime. So the compiler counts the array as a single scalar argument where it was expecting four, and complains that there aren't enough arguments.
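Because the failure happens at compile time, it normally stops the program before anything runs. One way to observe it, sketched here, is to compile the offending call with a string eval and inspect $@ afterwards. The body given to complain_about() is hypothetical; only its prototype comes from the text.

```perl
use strict;
use warnings;

# Hypothetical body: report how many arguments actually arrived.
sub complain_about ($$$$) { return scalar @_; }

my @args = (0, 1, 2, 3);

# A string eval compiles its text at runtime, so the compile-time
# prototype error can be captured instead of stopping the program.
my $compiled = eval q{ complain_about(@args); 1 };
my $error    = $compiled ? '' : $@;

# Four explicit scalars satisfy the prototype.
my $count = complain_about($args[0], $args[1], $args[2], $args[3]);

print "array call rejected: $error" if $error;
print "explicit call passed $count arguments\n";
```

The exact wording of the message may vary between Perl versions, but it should say that there are not enough arguments for complain_about.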
This behavior can be annoying sometimes, so Perl provides a way to bypass prototype checking.
You can turn off the prototype checking for a particular call by using the & notation:
&complain_about(@args); # ALLOWED
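The bypass is complete: with &, nothing checks the argument count at all. A short sketch, again assuming a hypothetical body for complain_about() and an extra illustrative array @short:

```perl
use strict;
use warnings;

# Hypothetical body: report how many arguments actually arrived.
sub complain_about ($$$$) { return scalar @_; }

my @args  = (0, 1, 2, 3);
my @short = (0, 1);

my $full    = &complain_about(@args);    # the array is flattened into four arguments
my $partial = &complain_about(@short);   # only two arguments, and nothing objects

print "$full $partial\n";   # prints "4 2"
```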
& is also used for another shortcut. If you call a subroutine with & and give it no argument list, Perl will use the current @_ as the subroutine's argument list. That is:
@_ = ( 4, 6, 8 );
sub count_args { return scalar(@_); }
$two = count_args(3,5);
$two = &count_args(3,5);
$zero = &count_args();
$three = &count_args;
The first two calls to count_args() are identical. If we had prototyped count_args() they might be different, but since we didn't, the & is redundant. The third example also bypasses the nonexistent prototype and calls count_args() with no arguments. The fourth example shows how the @_ of the caller - that is, (4, 6, 8) - becomes the @_ (the argument list) of the subroutine. It's the same @_, not a copy!
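To make the "same @_, not a copy" point concrete, here is a minimal sketch in which the called subroutine shortens @_. The subroutine names are made up for illustration.

```perl
use strict;
use warnings;

# Removes the first element of whatever @_ it was handed.
sub drop_first { shift @_; return scalar @_; }

# &drop_first; shares this subroutine's own @_ with drop_first.
sub shares_args { my $left = &drop_first;     return ($left, scalar @_); }

# &drop_first(@_) builds a fresh argument list for the call.
sub passes_args { my $left = &drop_first(@_); return ($left, scalar @_); }

my @shared = shares_args(4, 6, 8);   # the caller's @_ lost an element too
my @passed = passes_args(4, 6, 8);   # the caller's @_ is untouched

print "@shared / @passed\n";   # prints "2 2 / 2 3"
```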
Here's the crux of the interaction: when a subroutine is prototyped (and isn't bypassed with &), the prototype specifies the context the arguments will be evaluated in. Let's revisit the example at the beginning of the article:
sub f($$); @a = (5, 9);
Now let's consider Tom's subroutine calls one by one and see what happens to each of them.
&f;
The & bypasses subroutine prototypes, so the compiler won't complain about the subroutine call not matching its prototype. The call also has no argument list, so f is called with its caller's @_.
&f();
f is called with an empty argument list. The compiler won't complain because, again, the & causes the subroutine prototypes to be bypassed.
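The difference between these first two calls can be sketched as follows. The body of f and the two wrapper subroutines are assumptions added for illustration; only the prototype comes from Tom's challenge.

```perl
use strict;
use warnings;

# Hypothetical body for f: report how many arguments arrived.
sub f ($$) { return scalar @_; }

sub bare_amp  { return &f;   }   # f sees bare_amp's own @_
sub empty_amp { return &f(); }   # f sees an empty argument list

my $inherited = bare_amp(1, 2, 3);
my $empty     = empty_amp(1, 2, 3);

print "$inherited $empty\n";   # prints "3 0"
```

Neither call satisfies the ($$) prototype, and neither is rejected, because & skips the check.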
f();
No & means that the compiler will check the subroutine call against its prototype, and when it sees that f is being called with an empty argument list but was prototyped to take two scalars, it will complain. In short, it won't compile.
f;
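This fails for the same reason. Because f has already been declared, Perl parses the bare f as a call with no arguments and checks it against the prototype, just as it does for f(). (Had f not been declared, a bare f would be a bareword, which use strict refuses to compile.)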
f(@a);
No & means that subroutine prototypes are checked. The single array argument doesn't match the two scalar arguments in the prototype, so the compiler will complain and stop the program.
f(@a[0,1]);
This also won't compile. Although we're taking a two-element slice from the array, the compiler sees the array slice and not two scalars. If this seems a little odd, consider that we could have had @a[$b..$c], the size of which can't be known until runtime.
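The four rejected calls so far can be tried in one place. Since each is a compile-time error, this sketch compiles them one at a time with a string eval and collects the messages. The body of f is hypothetical, and the exact message text may differ between Perl versions.

```perl
use strict;
use warnings;

# Hypothetical body for f; only the prototype matters here.
sub f ($$) { return scalar @_; }

my @a = (5, 9);
my %error;

# Compile each call separately with a string eval so that one
# failure does not stop the rest from being tried.
for my $call ('f()', 'f', 'f(@a)', 'f(@a[0,1])') {
    eval "$call; 1" or $error{$call} = $@;
}

print "$_ => $error{$_}" for sort keys %error;
```

Each of the four entries should report that f was given too few arguments.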
f(@a, @a);
Alarmingly, this does compile. The compiler knows, from the prototype, that f takes two scalars. An array can be evaluated in scalar context, so this call is equivalent to f(2, 2) because @a in scalar context evaluates to 2. Surprised?
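A quick way to convince yourself is to have f report what it received. The body below is an assumption made for the sake of the demonstration.

```perl
use strict;
use warnings;

# Hypothetical body for f: show the arguments exactly as received.
sub f ($$) { return join(', ', @_); }

my @a = (5, 9);

my $received = f(@a, @a);   # each @a is evaluated in scalar context

print "f received: $received\n";   # prints "f received: 2, 2"
```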
&f(@a);
This also compiles. The & disables prototype checking, so @a becomes the @_ of the subroutine. This is equivalent to f(5,9).
&f(@a, @a);
This is equivalent to f(5,9,5,9). The & turns off prototype checking, and so the list (@a, @a) is flattened to (5,9,5,9).
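The two & calls can be checked the same way, again with a hypothetical body for f that simply reports its arguments.

```perl
use strict;
use warnings;

# Hypothetical body for f: show the arguments exactly as received.
sub f ($$) { return join(', ', @_); }

my @a = (5, 9);

my $one_array  = &f(@a);       # list context: the array's elements are passed
my $two_arrays = &f(@a, @a);   # both arrays are flattened into one list

print "$one_array\n";    # prints "5, 9"
print "$two_arrays\n";   # prints "5, 9, 5, 9"
```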
f(`ls /bin`, `ls /tmp`);
This compiles, and calls f with two long strings. Each string is the complete output of an ls command, embedded newlines and all. The prototype tells the compiler to evaluate the arguments in scalar context, so the backtick output results in long strings.
&f(`ls /bin`, `ls /tmp`);
By turning off prototype checking with &, we prevent f's arguments being evaluated in scalar context. The function call provides list context for the arguments, which means that the backticks yield many strings. On my system, this is equivalent to calling f with about seventy arguments.
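Since the output of ls differs from system to system, the sketch below swaps the backticks for a hypothetical stand-in, fake_ls(), which uses wantarray to behave the way backticks do: a list of lines in list context, one long string in scalar context. The body of f is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for backticks: a list of lines in list
# context, one newline-laden string in scalar context.
sub fake_ls {
    my @lines = ("bin\n", "etc\n", "tmp\n");
    return wantarray ? @lines : join('', @lines);
}

# Hypothetical body for f: report how many arguments arrived.
sub f ($$) { return scalar @_; }

my $with_prototype = f(fake_ls(), fake_ls());    # scalar context: two long strings
my $with_ampersand = &f(fake_ls(), fake_ls());   # list context: three lines from each call

print "$with_prototype $with_ampersand\n";   # prints "2 6"
```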
We have two lessons to learn from this (besides "anything can be made hard if you think about it long enough").
Context is subtle. If you're having trouble with subroutine calls, and you're using prototypes, look at how the compiler is behaving as a result of the prototypes. Perhaps context is causing your problems.
Prototypes are a mixed blessing. In some relatively straightforward situations, prototypes are a useful way to catch incorrect subroutine calls. In other situations, though, they may be useless. You should probably only use them to mimic the behavior of built-in functions like push() and splice().