=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

Perl provides a mechanism for alternative namespaces to protect
packages from stomping on each other's variables. In fact, there's
really no such thing as a global variable in Perl. The package
statement declares the compilation unit as being in the given
namespace. The scope of the package declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my() and
local() operators). Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that if unqualified,
default to the main package instead of the current one as described
below. A package statement affects only dynamic variables--including
those you've used local() on--but I<not> lexical variables created
with my(). Typically it would be the first declaration in a file
included by the C<do>, C<require>, or C<use> operators. You can
switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that
block. You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the
C<main> package is assumed. That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros. It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on.
Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.
X<::> X<'>

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>. This implies nothing about the order of
name lookups, however. There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>. C<INNER> refers to a totally
separate global package.

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table. All other symbols are kept in package
C<main>, including all punctuation variables, like $_. In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in ones. If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.
X<variable, punctuation>

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
However, variables and functions named with a single C<_>, such as
$_ and C<sub _>, are still forced into the package C<main>. See also
L<perlvar/"Technical Note on the Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled.
(Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package. Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perl5db.pl> in the Perl library. It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variable names.

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation. In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

(Be sure to note the B<vast> difference between the second line above
and C<local $main::foo = $main::bar>. The former is accessing the hash
C<%main::>, which is the symbol table of package C<main>. The latter is
simply assigning scalar C<$bar> in package C<main> to scalar C<$foo> of
the same package.)

You can use this to print out all the variables in a package, for
instance.
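
As a rough sketch of such a listing, the following walks a package's
symbol table hash and reports which names hold a defined scalar, a
non-empty array, or a subroutine. The C<Demo> package, its contents, and
the C<list_symbols> helper are hypothetical names invented for this
illustration, and the checks are deliberately minimal.

```perl
use strict;
use warnings;

{
    package Demo;               # hypothetical package, for illustration only
    our $count = 3;
    our @items = (1, 2);
    sub helper { return 1 }
}

# Hypothetical helper: list the names in a package's symbol table that
# have a defined scalar, a non-empty array, or a defined subroutine.
sub list_symbols {
    my ($pkg) = @_;
    no strict 'refs';           # symbolic references are the point here
    my @found;
    for my $name (sort keys %{"${pkg}::"}) {
        next if $name =~ /::\z/;            # skip nested packages
        my $full = "${pkg}::${name}";
        push @found, "\$$name" if defined ${$full};
        push @found, "\@$name" if @{$full};
        push @found, "&$name"  if defined &{$full};
    }
    return @found;
}

print join(' ', list_symbols('Demo')), "\n";
```

For the C<Demo> package above, the report should include C<$count>,
C<&helper>, and C<@items>.
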
The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>. If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

Which makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

C<*foo = *bar> makes the typeglobs themselves synonymous while
C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs
refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

Would print '1', because C<$foo> holds a reference to the I<original>
C<$bar> -- the one that was stuffed away by C<local()> and which will be
restored when the block ends. Because variables are accessed through the
typeglob, you can use C<*foo = *bar> to create an alias which can be
localized. (But be aware that this means you can't have a separate
C<@foo> and C<@bar>, etc.)

What makes all of this important is that the Exporter module uses glob
aliasing as the import/export mechanism.
Whether or not you can properly
localize a variable that has been exported from a module depends on how
it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name
(C<$Package::FOO>) where you need a local value, or by overriding it
by saying C<*FOO = *Package::FOO> in your script.

The C<*x = \$y> mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.
X<constant> X<scalar, constant>

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time. A constant subroutine is one prototyped
to take no arguments and to return a constant expression. See
L<perlsub> for details on these. The C<use constant> pragma is a
convenient shorthand for these.

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from.
This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo. See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy. You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 BEGIN, CHECK, INIT and END
X<BEGIN> X<CHECK> X<INIT> X<END>

Four specially named code blocks are executed at the beginning and at the end
of a running Perl program. These are the C<BEGIN>, C<CHECK>, C<INIT>, and
C<END> blocks.

These code blocks can be prefixed with C<sub> to give the appearance of a
subroutine (although this is not considered good style). One should note
that these code blocks don't really exist as named subroutines (despite
their appearance). The thing that gives this away is the fact that you can
have B<more than one> of these code blocks in a program, and they will get
B<all> executed at the appropriate moment. So you can't execute any of
these code blocks by name.

A C<BEGIN> code block is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file (or
string) is parsed. You may have multiple C<BEGIN> blocks within a file (or
eval'ed string) -- they will execute in order of definition. Because a C<BEGIN>
code block executes immediately, it can pull in definitions of subroutines
and such from other files in time to be visible to the rest of the compile
and run time. Once a C<BEGIN> has run, it is immediately undefined and any
code it used is returned to Perl's memory pool.

It should be noted that C<BEGIN> code blocks B<are> executed inside string
C<eval()>'s. The C<CHECK> and C<INIT> code blocks are B<not> executed inside
a string eval, which e.g. can be a problem in a mod_perl environment.

An C<END> code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
is being exited, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).) You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO). C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.
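
The ordering rules for C<BEGIN> and C<END> described above can be seen
in a few lines. This is only a minimal sketch; the C<@order> array and
the messages are invented for the example.

```perl
use strict;
use warnings;

our @order;    # package variable, visible to the BEGIN blocks below

push @order, 'runtime';            # ordinary statement: runs at run time

BEGIN { push @order, 'begin 1' }   # runs as soon as it has been compiled
BEGIN { push @order, 'begin 2' }   # BEGIN blocks run in order of definition

# Both BEGIN blocks ran during compilation, before the push above.
print join(', ', @order), "\n";    # prints "begin 1, begin 2, runtime"

END { print "end 1\n" }            # defined first, so it runs last
END { print "end 2\n" }            # defined last, so it runs first (LIFO)
```

Although the ordinary C<push> appears first in the file, it is the last
of the three to execute, and at exit "end 2" is printed before "end 1".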

Note that C<END> code blocks are B<not> executed at the end of a string
C<eval()>: if any C<END> code blocks are created in a string C<eval()>,
they will be executed just as any other C<END> code block of that package
in LIFO order just before the interpreter is being exited.

Inside an C<END> code block, C<$?> contains the value that the program is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the program. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).
X<$?>

C<CHECK> and C<INIT> code blocks are useful to catch the transition between
the compilation phase and the execution phase of the main program.

C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends
and before the run time begins, in LIFO order. C<CHECK> code blocks are used
in the Perl compiler suite to save the compiled state of the program.

C<INIT> blocks are run just before the Perl runtime begins execution, in
"first in, first out" (FIFO) order. For example, the code generators
documented in L<perlcc> make use of C<INIT> blocks to initialize and
resolve pointers to XSUBs.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

The B<begincheck> program makes it all clear, eventually:

    #!/usr/bin/perl

    # begincheck

    print         " 8. Ordinary code runs at runtime.\n";

    END { print   "14. So this is the end of the tale.\n" }
    INIT { print  " 5. INIT blocks run FIFO just before runtime.\n" }
    CHECK { print " 4. So this is the fourth line.\n" }

    print         " 9. It runs in order, of course.\n";

    BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
    END { print   "13. Read perlmod for the rest of the story.\n" }
    CHECK { print " 3. CHECK blocks run LIFO at compilation's end.\n" }
    INIT { print  " 6. Run this again, using Perl's -c switch.\n" }

    print         "10. This is anti-obfuscated code.\n";

    END { print   "12. END blocks run LIFO at quitting time.\n" }
    BEGIN { print " 2. So this line comes out second.\n" }
    INIT { print  " 7. You'll see the difference right away.\n" }

    print         "11. It merely _looks_ like it should be confusing.\n";

    __END__

=head2 Perl Classes
X<class> X<@ISA>

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.

=head2 Perl Modules
X<module>

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file. It is specifically
designed to be reusable by other modules or programs. It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it, or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything. Or it can do a little of both.
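
To make the class-style alternative concrete, here is a minimal sketch
of two packages acting as classes, one inheriting from the other through
C<@ISA>, with nothing exported. The package names C<Animal> and C<Dog>
and their methods are hypothetical, and both packages share one file
only to keep the example short; a real module would normally live in its
own F<.pm> file.

```perl
use strict;
use warnings;

{
    package Animal;                  # hypothetical base class
    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }
    sub speak {
        my $self = shift;
        return ref($self) . ' makes a sound';
    }
}

{
    package Dog;                     # hypothetical derived class
    our @ISA = ('Animal');           # a package global, not a my() lexical
    sub speak {
        my $self = shift;
        # SUPER:: is resolved relative to the package this sub was compiled in
        return $self->SUPER::speak() . ': Woof';
    }
}

my $dog = Dog->new(name => 'Rex');
print $dog->speak, "\n";             # prints "Dog makes a sound: Woof"
```

C<Dog> has no C<new> of its own, so the method call finds
C<Animal::new> through C<@ISA>.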

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = sprintf "%d.%03d", q$Revision: 1.10 $ =~ /(\d+)/g;

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2 &func4);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func; it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # func4 is listed in @EXPORT above, so it is exported by default
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

    ## YOUR CODE GOES HERE

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications. See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

All Perl module files have the extension F<.pm>. The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files. Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)
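
The first of these differences can be observed through C<%INC>, which
records each loaded file under its path-style name. The following is
only a sketch; File::Basename and File::Spec are used merely because
they ship with Perl.

```perl
use strict;
use warnings;

# Bareword form: perl turns the "::" into a directory separator
# and appends ".pm" before searching @INC.
require File::Basename;

# String form: the relative path must be spelled out literally.
require "File/Spec.pm";

# Both loads are recorded in %INC under their path-style names.
for my $file ('File/Basename.pm', 'File/Spec.pm') {
    print "$file => ", (exists $INC{$file} ? 'loaded' : 'missing'), "\n";
}
```

Both files should be reported as loaded, whichever form of C<require>
brought them in.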

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With C<require> you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require> instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.
For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head2 Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

Since 5.6.0, Perl has had support for a new type of threads called
interpreter threads (ithreads). These threads can be used explicitly
and implicitly.

Ithreads work by cloning the data tree so that no data is shared
between different threads. You get these threads by using the C<threads>
module or by calling fork() on Win32 (fake fork() support). When a
thread is cloned, all Perl data is cloned; non-Perl data, however, cannot
be cloned automatically. Perl after 5.7.2 has support for the C<CLONE>
special subroutine. In C<CLONE> you can do whatever you need to do,
such as handling the cloning of non-Perl data, if necessary.
C<CLONE> will be called once as a class method for every package that has it
defined (or inherits it). It will be called in the context of the new thread,
so all modifications are made in the new area. Currently C<CLONE> is called with
no parameters other than the invocant package name, but code should not assume
that this will remain unchanged, as it is likely that in future extra parameters
will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().

Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine.
Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is
called just before cloning starts, and in the context of the parent
thread.
If it returns a true value, then no objects of that class will
be cloned; or rather, they will be copied as unblessed, undef values.
This provides a simple mechanism for making a module threadsafe; just add
C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will
now only be called once per object. Of course, if the child thread needs
to make use of the objects, then a more sophisticated approach is
needed.

Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other
than the invocant package name, although that may change. Similarly, to
allow for future expansion, the return value should be a single C<0> or
C<1> value.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perltoot> and L<perltooc> for an in-depth tutorial on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.