1############################################################################# 2# Pod/Parser.pm -- package which defines a base class for parsing POD docs. 3# 4# Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved. 5# This file is part of "PodParser". PodParser is free software; 6# you can redistribute it and/or modify it under the same terms 7# as Perl itself. 8############################################################################# 9 10package Pod::Parser; 11 12use vars qw($VERSION); 13$VERSION = 1.32; ## Current version of this package 14require 5.005; ## requires this Perl version or later 15 16############################################################################# 17 18=head1 NAME 19 20Pod::Parser - base class for creating POD filters and translators 21 22=head1 SYNOPSIS 23 24 use Pod::Parser; 25 26 package MyParser; 27 @ISA = qw(Pod::Parser); 28 29 sub command { 30 my ($parser, $command, $paragraph, $line_num) = @_; 31 ## Interpret the command and its text; sample actions might be: 32 if ($command eq 'head1') { ... } 33 elsif ($command eq 'head2') { ... } 34 ## ... other commands and their actions 35 my $out_fh = $parser->output_handle(); 36 my $expansion = $parser->interpolate($paragraph, $line_num); 37 print $out_fh $expansion; 38 } 39 40 sub verbatim { 41 my ($parser, $paragraph, $line_num) = @_; 42 ## Format verbatim paragraph; sample actions might be: 43 my $out_fh = $parser->output_handle(); 44 print $out_fh $paragraph; 45 } 46 47 sub textblock { 48 my ($parser, $paragraph, $line_num) = @_; 49 ## Translate/Format this block of text; sample actions might be: 50 my $out_fh = $parser->output_handle(); 51 my $expansion = $parser->interpolate($paragraph, $line_num); 52 print $out_fh $expansion; 53 } 54 55 sub interior_sequence { 56 my ($parser, $seq_command, $seq_argument) = @_; 57 ## Expand an interior sequence; sample actions might be: 58 return "*$seq_argument*" if ($seq_command eq 'B'); 59 return "`$seq_argument'" if ($seq_command eq 'C'); 60 return "_${seq_argument}_'" if ($seq_command eq 'I'); 61 ## ... other sequence commands and their resulting text 62 } 63 64 package main; 65 66 ## Create a parser object and have it parse file whose name was 67 ## given on the command-line (use STDIN if no files were given). 68 $parser = new MyParser(); 69 $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0); 70 for (@ARGV) { $parser->parse_from_file($_); } 71 72=head1 REQUIRES 73 74perl5.005, Pod::InputObjects, Exporter, Symbol, Carp 75 76=head1 EXPORTS 77 78Nothing. 79 80=head1 DESCRIPTION 81 82B<Pod::Parser> is a base class for creating POD filters and translators. 83It handles most of the effort involved with parsing the POD sections 84from an input stream, leaving subclasses free to be concerned only with 85performing the actual translation of text. 86 87B<Pod::Parser> parses PODs, and makes method calls to handle the various 88components of the POD. Subclasses of B<Pod::Parser> override these methods 89to translate the POD into whatever output format they desire. 90 91=head1 QUICK OVERVIEW 92 93To create a POD filter for translating POD documentation into some other 94format, you create a subclass of B<Pod::Parser> which typically overrides 95just the base class implementation for the following methods: 96 97=over 2 98 99=item * 100 101B<command()> 102 103=item * 104 105B<verbatim()> 106 107=item * 108 109B<textblock()> 110 111=item * 112 113B<interior_sequence()> 114 115=back 116 117You may also want to override the B<begin_input()> and B<end_input()> 118methods for your subclass (to perform any needed per-file and/or 119per-document initialization or cleanup). 120 121If you need to perform any preprocesssing of input before it is parsed 122you may want to override one or more of B<preprocess_line()> and/or 123B<preprocess_paragraph()>. 124 125Sometimes it may be necessary to make more than one pass over the input 126files. If this is the case you have several options. You can make the 127first pass using B<Pod::Parser> and override your methods to store the 128intermediate results in memory somewhere for the B<end_pod()> method to 129process. You could use B<Pod::Parser> for several passes with an 130appropriate state variable to control the operation for each pass. If 131your input source can't be reset to start at the beginning, you can 132store it in some other structure as a string or an array and have that 133structure implement a B<getline()> method (which is all that 134B<parse_from_filehandle()> uses to read input). 135 136Feel free to add any member data fields you need to keep track of things 137like current font, indentation, horizontal or vertical position, or 138whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA"> 139to avoid name collisions. 140 141For the most part, the B<Pod::Parser> base class should be able to 142do most of the input parsing for you and leave you free to worry about 143how to intepret the commands and translate the result. 144 145Note that all we have described here in this quick overview is the 146simplest most straightforward use of B<Pod::Parser> to do stream-based 147parsing. It is also possible to use the B<Pod::Parser::parse_text> function 148to do more sophisticated tree-based parsing. See L<"TREE-BASED PARSING">. 149 150=head1 PARSING OPTIONS 151 152A I<parse-option> is simply a named option of B<Pod::Parser> with a 153value that corresponds to a certain specified behavior. These various 154behaviors of B<Pod::Parser> may be enabled/disabled by setting 155or unsetting one or more I<parse-options> using the B<parseopts()> method. 156The set of currently accepted parse-options is as follows: 157 158=over 3 159 160=item B<-want_nonPODs> (default: unset) 161 162Normally (by default) B<Pod::Parser> will only provide access to 163the POD sections of the input. Input paragraphs that are not part 164of the POD-format documentation are not made available to the caller 165(not even using B<preprocess_paragraph()>). Setting this option to a 166non-empty, non-zero value will allow B<preprocess_paragraph()> to see 167non-POD sections of the input as well as POD sections. The B<cutting()> 168method can be used to determine if the corresponding paragraph is a POD 169paragraph, or some other input paragraph. 170 171=item B<-process_cut_cmd> (default: unset) 172 173Normally (by default) B<Pod::Parser> handles the C<=cut> POD directive 174by itself and does not pass it on to the caller for processing. Setting 175this option to a non-empty, non-zero value will cause B<Pod::Parser> to 176pass the C<=cut> directive to the caller just like any other POD command 177(and hence it may be processed by the B<command()> method). 178 179B<Pod::Parser> will still interpret the C<=cut> directive to mean that 180"cutting mode" has been (re)entered, but the caller will get a chance 181to capture the actual C<=cut> paragraph itself for whatever purpose 182it desires. 183 184=item B<-warnings> (default: unset) 185 186Normally (by default) B<Pod::Parser> recognizes a bare minimum of 187pod syntax errors and warnings and issues diagnostic messages 188for errors, but not for warnings. (Use B<Pod::Checker> to do more 189thorough checking of POD syntax.) Setting this option to a non-empty, 190non-zero value will cause B<Pod::Parser> to issue diagnostics for 191the few warnings it recognizes as well as the errors. 192 193=back 194 195Please see L<"parseopts()"> for a complete description of the interface 196for the setting and unsetting of parse-options. 197 198=cut 199 200############################################################################# 201 202use vars qw(@ISA); 203use strict; 204#use diagnostics; 205use Pod::InputObjects; 206use Carp; 207use Exporter; 208BEGIN { 209 if ($] < 5.6) { 210 require Symbol; 211 import Symbol; 212 } 213} 214@ISA = qw(Exporter); 215 216## These "variables" are used as local "glob aliases" for performance 217use vars qw(%myData %myOpts @input_stack); 218 219############################################################################# 220 221=head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES 222 223B<Pod::Parser> provides several methods which most subclasses will probably 224want to override. These methods are as follows: 225 226=cut 227 228##--------------------------------------------------------------------------- 229 230=head1 B<command()> 231 232 $parser->command($cmd,$text,$line_num,$pod_para); 233 234This method should be overridden by subclasses to take the appropriate 235action when a POD command paragraph (denoted by a line beginning with 236"=") is encountered. When such a POD directive is seen in the input, 237this method is called and is passed: 238 239=over 3 240 241=item C<$cmd> 242 243the name of the command for this POD paragraph 244 245=item C<$text> 246 247the paragraph text for the given POD paragraph command. 248 249=item C<$line_num> 250 251the line-number of the beginning of the paragraph 252 253=item C<$pod_para> 254 255a reference to a C<Pod::Paragraph> object which contains further 256information about the paragraph command (see L<Pod::InputObjects> 257for details). 258 259=back 260 261B<Note> that this method I<is> called for C<=pod> paragraphs. 262 263The base class implementation of this method simply treats the raw POD 264command as normal block of paragraph text (invoking the B<textblock()> 265method with the command paragraph). 266 267=cut 268 269sub command { 270 my ($self, $cmd, $text, $line_num, $pod_para) = @_; 271 ## Just treat this like a textblock 272 $self->textblock($pod_para->raw_text(), $line_num, $pod_para); 273} 274 275##--------------------------------------------------------------------------- 276 277=head1 B<verbatim()> 278 279 $parser->verbatim($text,$line_num,$pod_para); 280 281This method may be overridden by subclasses to take the appropriate 282action when a block of verbatim text is encountered. It is passed the 283following parameters: 284 285=over 3 286 287=item C<$text> 288 289the block of text for the verbatim paragraph 290 291=item C<$line_num> 292 293the line-number of the beginning of the paragraph 294 295=item C<$pod_para> 296 297a reference to a C<Pod::Paragraph> object which contains further 298information about the paragraph (see L<Pod::InputObjects> 299for details). 300 301=back 302 303The base class implementation of this method simply prints the textblock 304(unmodified) to the output filehandle. 305 306=cut 307 308sub verbatim { 309 my ($self, $text, $line_num, $pod_para) = @_; 310 my $out_fh = $self->{_OUTPUT}; 311 print $out_fh $text; 312} 313 314##--------------------------------------------------------------------------- 315 316=head1 B<textblock()> 317 318 $parser->textblock($text,$line_num,$pod_para); 319 320This method may be overridden by subclasses to take the appropriate 321action when a normal block of POD text is encountered (although the base 322class method will usually do what you want). It is passed the following 323parameters: 324 325=over 3 326 327=item C<$text> 328 329the block of text for the a POD paragraph 330 331=item C<$line_num> 332 333the line-number of the beginning of the paragraph 334 335=item C<$pod_para> 336 337a reference to a C<Pod::Paragraph> object which contains further 338information about the paragraph (see L<Pod::InputObjects> 339for details). 340 341=back 342 343In order to process interior sequences, subclasses implementations of 344this method will probably want to invoke either B<interpolate()> or 345B<parse_text()>, passing it the text block C<$text>, and the corresponding 346line number in C<$line_num>, and then perform any desired processing upon 347the returned result. 348 349The base class implementation of this method simply prints the text block 350as it occurred in the input stream). 351 352=cut 353 354sub textblock { 355 my ($self, $text, $line_num, $pod_para) = @_; 356 my $out_fh = $self->{_OUTPUT}; 357 print $out_fh $self->interpolate($text, $line_num); 358} 359 360##--------------------------------------------------------------------------- 361 362=head1 B<interior_sequence()> 363 364 $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq); 365 366This method should be overridden by subclasses to take the appropriate 367action when an interior sequence is encountered. An interior sequence is 368an embedded command within a block of text which appears as a command 369name (usually a single uppercase character) followed immediately by a 370string of text which is enclosed in angle brackets. This method is 371passed the sequence command C<$seq_cmd> and the corresponding text 372C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior 373sequence that occurs in the string that it is passed. It should return 374the desired text string to be used in place of the interior sequence. 375The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence> 376object which contains further information about the interior sequence. 377Please see L<Pod::InputObjects> for details if you need to access this 378additional information. 379 380Subclass implementations of this method may wish to invoke the 381B<nested()> method of C<$pod_seq> to see if it is nested inside 382some other interior-sequence (and if so, which kind). 383 384The base class implementation of the B<interior_sequence()> method 385simply returns the raw text of the interior sequence (as it occurred 386in the input) to the caller. 387 388=cut 389 390sub interior_sequence { 391 my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_; 392 ## Just return the raw text of the interior sequence 393 return $pod_seq->raw_text(); 394} 395 396############################################################################# 397 398=head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES 399 400B<Pod::Parser> provides several methods which subclasses may want to override 401to perform any special pre/post-processing. These methods do I<not> have to 402be overridden, but it may be useful for subclasses to take advantage of them. 403 404=cut 405 406##--------------------------------------------------------------------------- 407 408=head1 B<new()> 409 410 my $parser = Pod::Parser->new(); 411 412This is the constructor for B<Pod::Parser> and its subclasses. You 413I<do not> need to override this method! It is capable of constructing 414subclass objects as well as base class objects, provided you use 415any of the following constructor invocation styles: 416 417 my $parser1 = MyParser->new(); 418 my $parser2 = new MyParser(); 419 my $parser3 = $parser2->new(); 420 421where C<MyParser> is some subclass of B<Pod::Parser>. 422 423Using the syntax C<MyParser::new()> to invoke the constructor is I<not> 424recommended, but if you insist on being able to do this, then the 425subclass I<will> need to override the B<new()> constructor method. If 426you do override the constructor, you I<must> be sure to invoke the 427B<initialize()> method of the newly blessed object. 428 429Using any of the above invocations, the first argument to the 430constructor is always the corresponding package name (or object 431reference). No other arguments are required, but if desired, an 432associative array (or hash-table) my be passed to the B<new()> 433constructor, as in: 434 435 my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 ); 436 my $parser2 = new MyParser( -myflag => 1 ); 437 438All arguments passed to the B<new()> constructor will be treated as 439key/value pairs in a hash-table. The newly constructed object will be 440initialized by copying the contents of the given hash-table (which may 441have been empty). The B<new()> constructor for this class and all of its 442subclasses returns a blessed reference to the initialized object (hash-table). 443 444=cut 445 446sub new { 447 ## Determine if we were called via an object-ref or a classname 448 my $this = shift; 449 my $class = ref($this) || $this; 450 ## Any remaining arguments are treated as initial values for the 451 ## hash that is used to represent this object. 452 my %params = @_; 453 my $self = { %params }; 454 ## Bless ourselves into the desired class and perform any initialization 455 bless $self, $class; 456 $self->initialize(); 457 return $self; 458} 459 460##--------------------------------------------------------------------------- 461 462=head1 B<initialize()> 463 464 $parser->initialize(); 465 466This method performs any necessary object initialization. It takes no 467arguments (other than the object instance of course, which is typically 468copied to a local variable named C<$self>). If subclasses override this 469method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>. 470 471=cut 472 473sub initialize { 474 #my $self = shift; 475 #return; 476} 477 478##--------------------------------------------------------------------------- 479 480=head1 B<begin_pod()> 481 482 $parser->begin_pod(); 483 484This method is invoked at the beginning of processing for each POD 485document that is encountered in the input. Subclasses should override 486this method to perform any per-document initialization. 487 488=cut 489 490sub begin_pod { 491 #my $self = shift; 492 #return; 493} 494 495##--------------------------------------------------------------------------- 496 497=head1 B<begin_input()> 498 499 $parser->begin_input(); 500 501This method is invoked by B<parse_from_filehandle()> immediately I<before> 502processing input from a filehandle. The base class implementation does 503nothing, however, subclasses may override it to perform any per-file 504initializations. 505 506Note that if multiple files are parsed for a single POD document 507(perhaps the result of some future C<=include> directive) this method 508is invoked for every file that is parsed. If you wish to perform certain 509initializations once per document, then you should use B<begin_pod()>. 510 511=cut 512 513sub begin_input { 514 #my $self = shift; 515 #return; 516} 517 518##--------------------------------------------------------------------------- 519 520=head1 B<end_input()> 521 522 $parser->end_input(); 523 524This method is invoked by B<parse_from_filehandle()> immediately I<after> 525processing input from a filehandle. The base class implementation does 526nothing, however, subclasses may override it to perform any per-file 527cleanup actions. 528 529Please note that if multiple files are parsed for a single POD document 530(perhaps the result of some kind of C<=include> directive) this method 531is invoked for every file that is parsed. If you wish to perform certain 532cleanup actions once per document, then you should use B<end_pod()>. 533 534=cut 535 536sub end_input { 537 #my $self = shift; 538 #return; 539} 540 541##--------------------------------------------------------------------------- 542 543=head1 B<end_pod()> 544 545 $parser->end_pod(); 546 547This method is invoked at the end of processing for each POD document 548that is encountered in the input. Subclasses should override this method 549to perform any per-document finalization. 550 551=cut 552 553sub end_pod { 554 #my $self = shift; 555 #return; 556} 557 558##--------------------------------------------------------------------------- 559 560=head1 B<preprocess_line()> 561 562 $textline = $parser->preprocess_line($text, $line_num); 563 564This method should be overridden by subclasses that wish to perform 565any kind of preprocessing for each I<line> of input (I<before> it has 566been determined whether or not it is part of a POD paragraph). The 567parameter C<$text> is the input line; and the parameter C<$line_num> is 568the line number of the corresponding text line. 569 570The value returned should correspond to the new text to use in its 571place. If the empty string or an undefined value is returned then no 572further processing will be performed for this line. 573 574Please note that the B<preprocess_line()> method is invoked I<before> 575the B<preprocess_paragraph()> method. After all (possibly preprocessed) 576lines in a paragraph have been assembled together and it has been 577determined that the paragraph is part of the POD documentation from one 578of the selected sections, then B<preprocess_paragraph()> is invoked. 579 580The base class implementation of this method returns the given text. 581 582=cut 583 584sub preprocess_line { 585 my ($self, $text, $line_num) = @_; 586 return $text; 587} 588 589##--------------------------------------------------------------------------- 590 591=head1 B<preprocess_paragraph()> 592 593 $textblock = $parser->preprocess_paragraph($text, $line_num); 594 595This method should be overridden by subclasses that wish to perform any 596kind of preprocessing for each block (paragraph) of POD documentation 597that appears in the input stream. The parameter C<$text> is the POD 598paragraph from the input file; and the parameter C<$line_num> is the 599line number for the beginning of the corresponding paragraph. 600 601The value returned should correspond to the new text to use in its 602place If the empty string is returned or an undefined value is 603returned, then the given C<$text> is ignored (not processed). 604 605This method is invoked after gathering up all the lines in a paragraph 606and after determining the cutting state of the paragraph, 607but before trying to further parse or interpret them. After 608B<preprocess_paragraph()> returns, the current cutting state (which 609is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates 610to true then input text (including the given C<$text>) is cut (not 611processed) until the next POD directive is encountered. 612 613Please note that the B<preprocess_line()> method is invoked I<before> 614the B<preprocess_paragraph()> method. After all (possibly preprocessed) 615lines in a paragraph have been assembled together and either it has been 616determined that the paragraph is part of the POD documentation from one 617of the selected sections or the C<-want_nonPODs> option is true, 618then B<preprocess_paragraph()> is invoked. 619 620The base class implementation of this method returns the given text. 621 622=cut 623 624sub preprocess_paragraph { 625 my ($self, $text, $line_num) = @_; 626 return $text; 627} 628 629############################################################################# 630 631=head1 METHODS FOR PARSING AND PROCESSING 632 633B<Pod::Parser> provides several methods to process input text. These 634methods typically won't need to be overridden (and in some cases they 635can't be overridden), but subclasses may want to invoke them to exploit 636their functionality. 637 638=cut 639 640##--------------------------------------------------------------------------- 641 642=head1 B<parse_text()> 643 644 $ptree1 = $parser->parse_text($text, $line_num); 645 $ptree2 = $parser->parse_text({%opts}, $text, $line_num); 646 $ptree3 = $parser->parse_text(\%opts, $text, $line_num); 647 648This method is useful if you need to perform your own interpolation 649of interior sequences and can't rely upon B<interpolate> to expand 650them in simple bottom-up order. 651 652The parameter C<$text> is a string or block of text to be parsed 653for interior sequences; and the parameter C<$line_num> is the 654line number curresponding to the beginning of C<$text>. 655 656B<parse_text()> will parse the given text into a parse-tree of "nodes." 657and interior-sequences. Each "node" in the parse tree is either a 658text-string, or a B<Pod::InteriorSequence>. The result returned is a 659parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects> 660for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>. 661 662If desired, an optional hash-ref may be specified as the first argument 663to customize certain aspects of the parse-tree that is created and 664returned. The set of recognized option keywords are: 665 666=over 3 667 668=item B<-expand_seq> =E<gt> I<code-ref>|I<method-name> 669 670Normally, the parse-tree returned by B<parse_text()> will contain an 671unexpanded C<Pod::InteriorSequence> object for each interior-sequence 672encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand" 673every interior-sequence it sees by invoking the referenced function 674(or named method of the parser object) and using the return value as the 675expanded result. 676 677If a subroutine reference was given, it is invoked as: 678 679 &$code_ref( $parser, $sequence ) 680 681and if a method-name was given, it is invoked as: 682 683 $parser->method_name( $sequence ) 684 685where C<$parser> is a reference to the parser object, and C<$sequence> 686is a reference to the interior-sequence object. 687[I<NOTE>: If the B<interior_sequence()> method is specified, then it is 688invoked according to the interface specified in L<"interior_sequence()">]. 689 690=item B<-expand_text> =E<gt> I<code-ref>|I<method-name> 691 692Normally, the parse-tree returned by B<parse_text()> will contain a 693text-string for each contiguous sequence of characters outside of an 694interior-sequence. Specifying B<-expand_text> tells B<parse_text()> to 695"preprocess" every such text-string it sees by invoking the referenced 696function (or named method of the parser object) and using the return value 697as the preprocessed (or "expanded") result. [Note that if the result is 698an interior-sequence, then it will I<not> be expanded as specified by the 699B<-expand_seq> option; Any such recursive expansion needs to be handled by 700the specified callback routine.] 701 702If a subroutine reference was given, it is invoked as: 703 704 &$code_ref( $parser, $text, $ptree_node ) 705 706and if a method-name was given, it is invoked as: 707 708 $parser->method_name( $text, $ptree_node ) 709 710where C<$parser> is a reference to the parser object, C<$text> is the 711text-string encountered, and C<$ptree_node> is a reference to the current 712node in the parse-tree (usually an interior-sequence object or else the 713top-level node of the parse-tree). 714 715=item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name> 716 717Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an 718argument to the referenced subroutine (or named method of the parser 719object) and return the result instead of the parse-tree object. 720 721If a subroutine reference was given, it is invoked as: 722 723 &$code_ref( $parser, $ptree ) 724 725and if a method-name was given, it is invoked as: 726 727 $parser->method_name( $ptree ) 728 729where C<$parser> is a reference to the parser object, and C<$ptree> 730is a reference to the parse-tree object. 731 732=back 733 734=cut 735 736sub parse_text { 737 my $self = shift; 738 local $_ = ''; 739 740 ## Get options and set any defaults 741 my %opts = (ref $_[0]) ? %{ shift() } : (); 742 my $expand_seq = $opts{'-expand_seq'} || undef; 743 my $expand_text = $opts{'-expand_text'} || undef; 744 my $expand_ptree = $opts{'-expand_ptree'} || undef; 745 746 my $text = shift; 747 my $line = shift; 748 my $file = $self->input_file(); 749 my $cmd = ""; 750 751 ## Convert method calls into closures, for our convenience 752 my $xseq_sub = $expand_seq; 753 my $xtext_sub = $expand_text; 754 my $xptree_sub = $expand_ptree; 755 if (defined $expand_seq and $expand_seq eq 'interior_sequence') { 756 ## If 'interior_sequence' is the method to use, we have to pass 757 ## more than just the sequence object, we also need to pass the 758 ## sequence name and text. 759 $xseq_sub = sub { 760 my ($self, $iseq) = @_; 761 my $args = join("", $iseq->parse_tree->children); 762 return $self->interior_sequence($iseq->name, $args, $iseq); 763 }; 764 } 765 ref $xseq_sub or $xseq_sub = sub { shift()->$expand_seq(@_) }; 766 ref $xtext_sub or $xtext_sub = sub { shift()->$expand_text(@_) }; 767 ref $xptree_sub or $xptree_sub = sub { shift()->$expand_ptree(@_) }; 768 769 ## Keep track of the "current" interior sequence, and maintain a stack 770 ## of "in progress" sequences. 771 ## 772 ## NOTE that we push our own "accumulator" at the very beginning of the 773 ## stack. It's really a parse-tree, not a sequence; but it implements 774 ## the methods we need so we can use it to gather-up all the sequences 775 ## and strings we parse. Thus, by the end of our parsing, it should be 776 ## the only thing left on our stack and all we have to do is return it! 777 ## 778 my $seq = Pod::ParseTree->new(); 779 my @seq_stack = ($seq); 780 my ($ldelim, $rdelim) = ('', ''); 781 782 ## Iterate over all sequence starts text (NOTE: split with 783 ## capturing parens keeps the delimiters) 784 $_ = $text; 785 my @tokens = split /([A-Z]<(?:<+\s)?)/; 786 while ( @tokens ) { 787 $_ = shift @tokens; 788 ## Look for the beginning of a sequence 789 if ( /^([A-Z])(<(?:<+\s)?)$/ ) { 790 ## Push a new sequence onto the stack of those "in-progress" 791 my $ldelim_orig; 792 ($cmd, $ldelim_orig) = ($1, $2); 793 ($ldelim = $ldelim_orig) =~ s/\s+$//; 794 ($rdelim = $ldelim) =~ tr/</>/; 795 $seq = Pod::InteriorSequence->new( 796 -name => $cmd, 797 -ldelim => $ldelim_orig, -rdelim => $rdelim, 798 -file => $file, -line => $line 799 ); 800 (@seq_stack > 1) and $seq->nested($seq_stack[-1]); 801 push @seq_stack, $seq; 802 } 803 ## Look for sequence ending 804 elsif ( @seq_stack > 1 ) { 805 ## Make sure we match the right kind of closing delimiter 806 my ($seq_end, $post_seq) = ("", ""); 807 if ( ($ldelim eq '<' and /\A(.*?)(>)/s) 808 or /\A(.*?)(\s+$rdelim)/s ) 809 { 810 ## Found end-of-sequence, capture the interior and the 811 ## closing the delimiter, and put the rest back on the 812 ## token-list 813 $post_seq = substr($_, length($1) + length($2)); 814 ($_, $seq_end) = ($1, $2); 815 (length $post_seq) and unshift @tokens, $post_seq; 816 } 817 if (length) { 818 ## In the middle of a sequence, append this text to it, and 819 ## dont forget to "expand" it if that's what the caller wanted 820 $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_); 821 $_ .= $seq_end; 822 } 823 if (length $seq_end) { 824 ## End of current sequence, record terminating delimiter 825 $seq->rdelim($seq_end); 826 ## Pop it off the stack of "in progress" sequences 827 pop @seq_stack; 828 ## Append result to its parent in current parse tree 829 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) 830 : $seq); 831 ## Remember the current cmd-name and left-delimiter 832 if(@seq_stack > 1) { 833 $cmd = $seq_stack[-1]->name; 834 $ldelim = $seq_stack[-1]->ldelim; 835 $rdelim = $seq_stack[-1]->rdelim; 836 } else { 837 $cmd = $ldelim = $rdelim = ''; 838 } 839 } 840 } 841 elsif (length) { 842 ## In the middle of a sequence, append this text to it, and 843 ## dont forget to "expand" it if that's what the caller wanted 844 $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_); 845 } 846 ## Keep track of line count 847 $line += tr/\n//; 848 ## Remember the "current" sequence 849 $seq = $seq_stack[-1]; 850 } 851 852 ## Handle unterminated sequences 853 my $errorsub = (@seq_stack > 1) ? $self->errorsub() : undef; 854 while (@seq_stack > 1) { 855 ($cmd, $file, $line) = ($seq->name, $seq->file_line); 856 $ldelim = $seq->ldelim; 857 ($rdelim = $ldelim) =~ tr/</>/; 858 $rdelim =~ s/^(\S+)(\s*)$/$2$1/; 859 pop @seq_stack; 860 my $errmsg = "*** ERROR: unterminated ${cmd}${ldelim}...${rdelim}". 861 " at line $line in file $file\n"; 862 (ref $errorsub) and &{$errorsub}($errmsg) 863 or (defined $errorsub) and $self->$errorsub($errmsg) 864 or warn($errmsg); 865 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq); 866 $seq = $seq_stack[-1]; 867 } 868 869 ## Return the resulting parse-tree 870 my $ptree = (pop @seq_stack)->parse_tree; 871 return $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree; 872} 873 874##--------------------------------------------------------------------------- 875 876=head1 B<interpolate()> 877 878 $textblock = $parser->interpolate($text, $line_num); 879 880This method translates all text (including any embedded interior sequences) 881in the given text string C<$text> and returns the interpolated result. The 882parameter C<$line_num> is the line number corresponding to the beginning 883of C<$text>. 884 885B<interpolate()> merely invokes a private method to recursively expand 886nested interior sequences in bottom-up order (innermost sequences are 887expanded first). If there is a need to expand nested sequences in 888some alternate order, use B<parse_text> instead. 889 890=cut 891 892sub interpolate { 893 my($self, $text, $line_num) = @_; 894 my %parse_opts = ( -expand_seq => 'interior_sequence' ); 895 my $ptree = $self->parse_text( \%parse_opts, $text, $line_num ); 896 return join "", $ptree->children(); 897} 898 899##--------------------------------------------------------------------------- 900 901=begin __PRIVATE__ 902 903=head1 B<parse_paragraph()> 904 905 $parser->parse_paragraph($text, $line_num); 906 907This method takes the text of a POD paragraph to be processed, along 908with its corresponding line number, and invokes the appropriate method 909(one of B<command()>, B<verbatim()>, or B<textblock()>). 910 911For performance reasons, this method is invoked directly without any 912dynamic lookup; Hence subclasses may I<not> override it! 913 914=end __PRIVATE__ 915 916=cut 917 918sub parse_paragraph { 919 my ($self, $text, $line_num) = @_; 920 local *myData = $self; ## alias to avoid deref-ing overhead 921 local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options 922 local $_; 923 924 ## See if we want to preprocess nonPOD paragraphs as well as POD ones. 925 my $wantNonPods = $myOpts{'-want_nonPODs'}; 926 927 ## Update cutting status 928 $myData{_CUTTING} = 0 if $text =~ /^={1,2}\S/; 929 930 ## Perform any desired preprocessing if we wanted it this early 931 $wantNonPods and $text = $self->preprocess_paragraph($text, $line_num); 932 933 ## Ignore up until next POD directive if we are cutting 934 return if $myData{_CUTTING}; 935 936 ## Now we know this is block of text in a POD section! 937 938 ##----------------------------------------------------------------- 939 ## This is a hook (hack ;-) for Pod::Select to do its thing without 940 ## having to override methods, but also without Pod::Parser assuming 941 ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS 942 ## field exists then we assume there is an is_selected() method for 943 ## us to invoke (calling $self->can('is_selected') could verify this 944 ## but that is more overhead than I want to incur) 945 ##----------------------------------------------------------------- 946 947 ## Ignore this block if it isnt in one of the selected sections 948 if (exists $myData{_SELECTED_SECTIONS}) { 949 $self->is_selected($text) or return ($myData{_CUTTING} = 1); 950 } 951 952 ## If we havent already, perform any desired preprocessing and 953 ## then re-check the "cutting" state 954 unless ($wantNonPods) { 955 $text = $self->preprocess_paragraph($text, $line_num); 956 return 1 unless ((defined $text) and (length $text)); 957 return 1 if ($myData{_CUTTING}); 958 } 959 960 ## Look for one of the three types of paragraphs 961 my ($pfx, $cmd, $arg, $sep) = ('', '', '', ''); 962 my $pod_para = undef; 963 if ($text =~ /^(={1,2})(?=\S)/) { 964 ## Looks like a command paragraph. Capture the command prefix used 965 ## ("=" or "=="), as well as the command-name, its paragraph text, 966 ## and whatever sequence of characters was used to separate them 967 $pfx = $1; 968 $_ = substr($text, length $pfx); 969 ($cmd, $sep, $text) = split /(\s+)/, $_, 2; 970 ## If this is a "cut" directive then we dont need to do anything 971 ## except return to "cutting" mode. 972 if ($cmd eq 'cut') { 973 $myData{_CUTTING} = 1; 974 return unless $myOpts{'-process_cut_cmd'}; 975 } 976 } 977 ## Save the attributes indicating how the command was specified. 978 $pod_para = new Pod::Paragraph( 979 -name => $cmd, 980 -text => $text, 981 -prefix => $pfx, 982 -separator => $sep, 983 -file => $myData{_INFILE}, 984 -line => $line_num 985 ); 986 # ## Invoke appropriate callbacks 987 # if (exists $myData{_CALLBACKS}) { 988 # ## Look through the callback list, invoke callbacks, 989 # ## then see if we need to do the default actions 990 # ## (invoke_callbacks will return true if we do). 991 # return 1 unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para); 992 # } 993 if (length $cmd) { 994 ## A command paragraph 995 $self->command($cmd, $text, $line_num, $pod_para); 996 } 997 elsif ($text =~ /^\s+/) { 998 ## Indented text - must be a verbatim paragraph 999 $self->verbatim($text, $line_num, $pod_para); 1000 } 1001 else { 1002 ## Looks like an ordinary block of text 1003 $self->textblock($text, $line_num, $pod_para); 1004 } 1005 return 1; 1006} 1007 1008##--------------------------------------------------------------------------- 1009 1010=head1 B<parse_from_filehandle()> 1011 1012 $parser->parse_from_filehandle($in_fh,$out_fh); 1013 1014This method takes an input filehandle (which is assumed to already be 1015opened for reading) and reads the entire input stream looking for blocks 1016(paragraphs) of POD documentation to be processed. If no first argument 1017is given the default input filehandle C<STDIN> is used. 1018 1019The C<$in_fh> parameter may be any object that provides a B<getline()> 1020method to retrieve a single line of input text (hence, an appropriate 1021wrapper object could be used to parse PODs from a single string or an 1022array of strings). 1023 1024Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled 1025into paragraphs or "blocks" (which are separated by lines containing 1026nothing but whitespace). For each block of POD documentation 1027encountered it will invoke a method to parse the given paragraph. 1028 1029If a second argument is given then it should correspond to a filehandle where 1030output should be sent (otherwise the default output filehandle is 1031C<STDOUT> if no output filehandle is currently in use). 1032 1033B<NOTE:> For performance reasons, this method caches the input stream at 1034the top of the stack in a local variable. Any attempts by clients to 1035change the stack contents during processing when in the midst executing 1036of this method I<will not affect> the input stream used by the current 1037invocation of this method. 1038 1039This method does I<not> usually need to be overridden by subclasses. 1040 1041=cut 1042 1043sub parse_from_filehandle { 1044 my $self = shift; 1045 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : (); 1046 my ($in_fh, $out_fh) = @_; 1047 $in_fh = \*STDIN unless ($in_fh); 1048 local *myData = $self; ## alias to avoid deref-ing overhead 1049 local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options 1050 local $_; 1051 1052 ## Put this stream at the top of the stack and do beginning-of-input 1053 ## processing. NOTE that $in_fh might be reset during this process. 1054 my $topstream = $self->_push_input_stream($in_fh, $out_fh); 1055 (exists $opts{-cutting}) and $self->cutting( $opts{-cutting} ); 1056 1057 ## Initialize line/paragraph 1058 my ($textline, $paragraph) = ('', ''); 1059 my ($nlines, $plines) = (0, 0); 1060 1061 ## Use <$fh> instead of $fh->getline where possible (for speed) 1062 $_ = ref $in_fh; 1063 my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/ or tied $in_fh); 1064 1065 ## Read paragraphs line-by-line 1066 while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) { 1067 $textline = $self->preprocess_line($textline, ++$nlines); 1068 next unless ((defined $textline) && (length $textline)); 1069 1070 if ((! length $paragraph) && ($textline =~ /^==/)) { 1071 ## '==' denotes a one-line command paragraph 1072 $paragraph = $textline; 1073 $plines = 1; 1074 $textline = ''; 1075 } else { 1076 ## Append this line to the current paragraph 1077 $paragraph .= $textline; 1078 ++$plines; 1079 } 1080 1081 ## See if this line is blank and ends the current paragraph. 1082 ## If it isnt, then keep iterating until it is. 1083 next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/) 1084 && (length $paragraph)); 1085 1086 ## Issue a warning about any non-empty blank lines 1087 if (length($1) > 0 and $myOpts{'-warnings'} and ! $myData{_CUTTING}) { 1088 my $errorsub = $self->errorsub(); 1089 my $file = $self->input_file(); 1090 my $errmsg = "*** WARNING: line containing nothing but whitespace". 1091 " in paragraph at line $nlines in file $file\n"; 1092 (ref $errorsub) and &{$errorsub}($errmsg) 1093 or (defined $errorsub) and $self->$errorsub($errmsg) 1094 or warn($errmsg); 1095 } 1096 1097 ## Now process the paragraph 1098 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1); 1099 $paragraph = ''; 1100 $plines = 0; 1101 } 1102 ## Dont forget about the last paragraph in the file 1103 if (length $paragraph) { 1104 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1) 1105 } 1106 1107 ## Now pop the input stream off the top of the input stack. 1108 $self->_pop_input_stream(); 1109} 1110 1111##--------------------------------------------------------------------------- 1112 1113=head1 B<parse_from_file()> 1114 1115 $parser->parse_from_file($filename,$outfile); 1116 1117This method takes a filename and does the following: 1118 1119=over 2 1120 1121=item * 1122 1123opens the input and output files for reading 1124(creating the appropriate filehandles) 1125 1126=item * 1127 1128invokes the B<parse_from_filehandle()> method passing it the 1129corresponding input and output filehandles. 1130 1131=item * 1132 1133closes the input and output files. 1134 1135=back 1136 1137If the special input filename "-" or "<&STDIN" is given then the STDIN 1138filehandle is used for input (and no open or close is performed). If no 1139input filename is specified then "-" is implied. 1140 1141If a second argument is given then it should be the name of the desired 1142output file. If the special output filename "-" or ">&STDOUT" is given 1143then the STDOUT filehandle is used for output (and no open or close is 1144performed). If the special output filename ">&STDERR" is given then the 1145STDERR filehandle is used for output (and no open or close is 1146performed). If no output filehandle is currently in use and no output 1147filename is specified, then "-" is implied. 1148Alternatively, an L<IO::String> object is also accepted as an output 1149file handle. 1150 1151This method does I<not> usually need to be overridden by subclasses. 1152 1153=cut 1154 1155sub parse_from_file { 1156 my $self = shift; 1157 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : (); 1158 my ($infile, $outfile) = @_; 1159 my ($in_fh, $out_fh) = (gensym(), gensym()) if ($] < 5.006); 1160 my ($close_input, $close_output) = (0, 0); 1161 local *myData = $self; 1162 local *_; 1163 1164 ## Is $infile a filename or a (possibly implied) filehandle 1165 if (defined $infile && ref $infile) { 1166 if (ref($infile) =~ /^(SCALAR|ARRAY|HASH|CODE|REF)$/) { 1167 croak "Input from $1 reference not supported!\n"; 1168 } 1169 ## Must be a filehandle-ref (or else assume its a ref to an object 1170 ## that supports the common IO read operations). 1171 $myData{_INFILE} = ${$infile}; 1172 $in_fh = $infile; 1173 } 1174 elsif (!defined($infile) || !length($infile) || ($infile eq '-') 1175 || ($infile =~ /^<&(?:STDIN|0)$/i)) 1176 { 1177 ## Not a filename, just a string implying STDIN 1178 $infile ||= '-'; 1179 $myData{_INFILE} = "<standard input>"; 1180 $in_fh = \*STDIN; 1181 } 1182 else { 1183 ## We have a filename, open it for reading 1184 $myData{_INFILE} = $infile; 1185 open($in_fh, "< $infile") or 1186 croak "Can't open $infile for reading: $!\n"; 1187 $close_input = 1; 1188 } 1189 1190 ## NOTE: we need to be *very* careful when "defaulting" the output 1191 ## file. We only want to use a default if this is the beginning of 1192 ## the entire document (but *not* if this is an included file). We 1193 ## determine this by seeing if the input stream stack has been set-up 1194 ## already 1195 1196 ## Is $outfile a filename, a (possibly implied) filehandle, maybe a ref? 1197 if (ref $outfile) { 1198 ## we need to check for ref() first, as other checks involve reading 1199 if (ref($outfile) =~ /^(ARRAY|HASH|CODE)$/) { 1200 croak "Output to $1 reference not supported!\n"; 1201 } 1202 elsif (ref($outfile) eq 'SCALAR') { 1203# # NOTE: IO::String isn't a part of the perl distribution, 1204# # so probably we shouldn't support this case... 1205# require IO::String; 1206# $myData{_OUTFILE} = "$outfile"; 1207# $out_fh = IO::String->new($outfile); 1208 croak "Output to SCALAR reference not supported!\n"; 1209 } 1210 else { 1211 ## Must be a filehandle-ref (or else assume its a ref to an 1212 ## object that supports the common IO write operations). 1213 $myData{_OUTFILE} = ${$outfile}; 1214 $out_fh = $outfile; 1215 } 1216 } 1217 elsif (!defined($outfile) || !length($outfile) || ($outfile eq '-') 1218 || ($outfile =~ /^>&?(?:STDOUT|1)$/i)) 1219 { 1220 if (defined $myData{_TOP_STREAM}) { 1221 $out_fh = $myData{_OUTPUT}; 1222 } 1223 else { 1224 ## Not a filename, just a string implying STDOUT 1225 $outfile ||= '-'; 1226 $myData{_OUTFILE} = "<standard output>"; 1227 $out_fh = \*STDOUT; 1228 } 1229 } 1230 elsif ($outfile =~ /^>&(STDERR|2)$/i) { 1231 ## Not a filename, just a string implying STDERR 1232 $myData{_OUTFILE} = "<standard error>"; 1233 $out_fh = \*STDERR; 1234 } 1235 else { 1236 ## We have a filename, open it for writing 1237 $myData{_OUTFILE} = $outfile; 1238 (-d $outfile) and croak "$outfile is a directory, not POD input!\n"; 1239 open($out_fh, "> $outfile") or 1240 croak "Can't open $outfile for writing: $!\n"; 1241 $close_output = 1; 1242 } 1243 1244 ## Whew! That was a lot of work to set up reasonably/robust behavior 1245 ## in the case of a non-filename for reading and writing. Now we just 1246 ## have to parse the input and close the handles when we're finished. 1247 $self->parse_from_filehandle(\%opts, $in_fh, $out_fh); 1248 1249 $close_input and 1250 close($in_fh) || croak "Can't close $infile after reading: $!\n"; 1251 $close_output and 1252 close($out_fh) || croak "Can't close $outfile after writing: $!\n"; 1253} 1254 1255############################################################################# 1256 1257=head1 ACCESSOR METHODS 1258 1259Clients of B<Pod::Parser> should use the following methods to access 1260instance data fields: 1261 1262=cut 1263 1264##--------------------------------------------------------------------------- 1265 1266=head1 B<errorsub()> 1267 1268 $parser->errorsub("method_name"); 1269 $parser->errorsub(\&warn_user); 1270 $parser->errorsub(sub { print STDERR, @_ }); 1271 1272Specifies the method or subroutine to use when printing error messages 1273about POD syntax. The supplied method/subroutine I<must> return TRUE upon 1274successful printing of the message. If C<undef> is given, then the B<warn> 1275builtin is used to issue error messages (this is the default behavior). 1276 1277 my $errorsub = $parser->errorsub() 1278 my $errmsg = "This is an error message!\n" 1279 (ref $errorsub) and &{$errorsub}($errmsg) 1280 or (defined $errorsub) and $parser->$errorsub($errmsg) 1281 or warn($errmsg); 1282 1283Returns a method name, or else a reference to the user-supplied subroutine 1284used to print error messages. Returns C<undef> if the B<warn> builtin 1285is used to issue error messages (this is the default behavior). 1286 1287=cut 1288 1289sub errorsub { 1290 return (@_ > 1) ? ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB}; 1291} 1292 1293##--------------------------------------------------------------------------- 1294 1295=head1 B<cutting()> 1296 1297 $boolean = $parser->cutting(); 1298 1299Returns the current C<cutting> state: a boolean-valued scalar which 1300evaluates to true if text from the input file is currently being "cut" 1301(meaning it is I<not> considered part of the POD document). 1302 1303 $parser->cutting($boolean); 1304 1305Sets the current C<cutting> state to the given value and returns the 1306result. 1307 1308=cut 1309 1310sub cutting { 1311 return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING}; 1312} 1313 1314##--------------------------------------------------------------------------- 1315 1316##--------------------------------------------------------------------------- 1317 1318=head1 B<parseopts()> 1319 1320When invoked with no additional arguments, B<parseopts> returns a hashtable 1321of all the current parsing options. 1322 1323 ## See if we are parsing non-POD sections as well as POD ones 1324 my %opts = $parser->parseopts(); 1325 $opts{'-want_nonPODs}' and print "-want_nonPODs\n"; 1326 1327When invoked using a single string, B<parseopts> treats the string as the 1328name of a parse-option and returns its corresponding value if it exists 1329(returns C<undef> if it doesn't). 1330 1331 ## Did we ask to see '=cut' paragraphs? 1332 my $want_cut = $parser->parseopts('-process_cut_cmd'); 1333 $want_cut and print "-process_cut_cmd\n"; 1334 1335When invoked with multiple arguments, B<parseopts> treats them as 1336key/value pairs and the specified parse-option names are set to the 1337given values. Any unspecified parse-options are unaffected. 1338 1339 ## Set them back to the default 1340 $parser->parseopts(-warnings => 0); 1341 1342When passed a single hash-ref, B<parseopts> uses that hash to completely 1343reset the existing parse-options, all previous parse-option values 1344are lost. 1345 1346 ## Reset all options to default 1347 $parser->parseopts( { } ); 1348 1349See L<"PARSING OPTIONS"> for more information on the name and meaning of each 1350parse-option currently recognized. 1351 1352=cut 1353 1354sub parseopts { 1355 local *myData = shift; 1356 local *myOpts = ($myData{_PARSEOPTS} ||= {}); 1357 return %myOpts if (@_ == 0); 1358 if (@_ == 1) { 1359 local $_ = shift; 1360 return ref($_) ? $myData{_PARSEOPTS} = $_ : $myOpts{$_}; 1361 } 1362 my @newOpts = (%myOpts, @_); 1363 $myData{_PARSEOPTS} = { @newOpts }; 1364} 1365 1366##--------------------------------------------------------------------------- 1367 1368=head1 B<output_file()> 1369 1370 $fname = $parser->output_file(); 1371 1372Returns the name of the output file being written. 1373 1374=cut 1375 1376sub output_file { 1377 return $_[0]->{_OUTFILE}; 1378} 1379 1380##--------------------------------------------------------------------------- 1381 1382=head1 B<output_handle()> 1383 1384 $fhandle = $parser->output_handle(); 1385 1386Returns the output filehandle object. 1387 1388=cut 1389 1390sub output_handle { 1391 return $_[0]->{_OUTPUT}; 1392} 1393 1394##--------------------------------------------------------------------------- 1395 1396=head1 B<input_file()> 1397 1398 $fname = $parser->input_file(); 1399 1400Returns the name of the input file being read. 1401 1402=cut 1403 1404sub input_file { 1405 return $_[0]->{_INFILE}; 1406} 1407 1408##--------------------------------------------------------------------------- 1409 1410=head1 B<input_handle()> 1411 1412 $fhandle = $parser->input_handle(); 1413 1414Returns the current input filehandle object. 1415 1416=cut 1417 1418sub input_handle { 1419 return $_[0]->{_INPUT}; 1420} 1421 1422##--------------------------------------------------------------------------- 1423 1424=begin __PRIVATE__ 1425 1426=head1 B<input_streams()> 1427 1428 $listref = $parser->input_streams(); 1429 1430Returns a reference to an array which corresponds to the stack of all 1431the input streams that are currently in the middle of being parsed. 1432 1433While parsing an input stream, it is possible to invoke 1434B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input 1435stream and then return to parsing the previous input stream. Each input 1436stream to be parsed is pushed onto the end of this input stack 1437before any of its input is read. The input stream that is currently 1438being parsed is always at the end (or top) of the input stack. When an 1439input stream has been exhausted, it is popped off the end of the 1440input stack. 1441 1442Each element on this input stack is a reference to C<Pod::InputSource> 1443object. Please see L<Pod::InputObjects> for more details. 1444 1445This method might be invoked when printing diagnostic messages, for example, 1446to obtain the name and line number of the all input files that are currently 1447being processed. 1448 1449=end __PRIVATE__ 1450 1451=cut 1452 1453sub input_streams { 1454 return $_[0]->{_INPUT_STREAMS}; 1455} 1456 1457##--------------------------------------------------------------------------- 1458 1459=begin __PRIVATE__ 1460 1461=head1 B<top_stream()> 1462 1463 $hashref = $parser->top_stream(); 1464 1465Returns a reference to the hash-table that represents the element 1466that is currently at the top (end) of the input stream stack 1467(see L<"input_streams()">). The return value will be the C<undef> 1468if the input stack is empty. 1469 1470This method might be used when printing diagnostic messages, for example, 1471to obtain the name and line number of the current input file. 1472 1473=end __PRIVATE__ 1474 1475=cut 1476 1477sub top_stream { 1478 return $_[0]->{_TOP_STREAM} || undef; 1479} 1480 1481############################################################################# 1482 1483=head1 PRIVATE METHODS AND DATA 1484 1485B<Pod::Parser> makes use of several internal methods and data fields 1486which clients should not need to see or use. For the sake of avoiding 1487name collisions for client data and methods, these methods and fields 1488are briefly discussed here. Determined hackers may obtain further 1489information about them by reading the B<Pod::Parser> source code. 1490 1491Private data fields are stored in the hash-object whose reference is 1492returned by the B<new()> constructor for this class. The names of all 1493private methods and data-fields used by B<Pod::Parser> begin with a 1494prefix of "_" and match the regular expression C</^_\w+$/>. 1495 1496=cut 1497 1498##--------------------------------------------------------------------------- 1499 1500=begin _PRIVATE_ 1501 1502=head1 B<_push_input_stream()> 1503 1504 $hashref = $parser->_push_input_stream($in_fh,$out_fh); 1505 1506This method will push the given input stream on the input stack and 1507perform any necessary beginning-of-document or beginning-of-file 1508processing. The argument C<$in_fh> is the input stream filehandle to 1509push, and C<$out_fh> is the corresponding output filehandle to use (if 1510it is not given or is undefined, then the current output stream is used, 1511which defaults to standard output if it doesnt exist yet). 1512 1513The value returned will be reference to the hash-table that represents 1514the new top of the input stream stack. I<Please Note> that it is 1515possible for this method to use default values for the input and output 1516file handles. If this happens, you will need to look at the C<INPUT> 1517and C<OUTPUT> instance data members to determine their new values. 1518 1519=end _PRIVATE_ 1520 1521=cut 1522 1523sub _push_input_stream { 1524 my ($self, $in_fh, $out_fh) = @_; 1525 local *myData = $self; 1526 1527 ## Initialize stuff for the entire document if this is *not* 1528 ## an included file. 1529 ## 1530 ## NOTE: we need to be *very* careful when "defaulting" the output 1531 ## filehandle. We only want to use a default value if this is the 1532 ## beginning of the entire document (but *not* if this is an included 1533 ## file). 1534 unless (defined $myData{_TOP_STREAM}) { 1535 $out_fh = \*STDOUT unless (defined $out_fh); 1536 $myData{_CUTTING} = 1; ## current "cutting" state 1537 $myData{_INPUT_STREAMS} = []; ## stack of all input streams 1538 } 1539 1540 ## Initialize input indicators 1541 $myData{_OUTFILE} = '(unknown)' unless (defined $myData{_OUTFILE}); 1542 $myData{_OUTPUT} = $out_fh if (defined $out_fh); 1543 $in_fh = \*STDIN unless (defined $in_fh); 1544 $myData{_INFILE} = '(unknown)' unless (defined $myData{_INFILE}); 1545 $myData{_INPUT} = $in_fh; 1546 my $input_top = $myData{_TOP_STREAM} 1547 = new Pod::InputSource( 1548 -name => $myData{_INFILE}, 1549 -handle => $in_fh, 1550 -was_cutting => $myData{_CUTTING} 1551 ); 1552 local *input_stack = $myData{_INPUT_STREAMS}; 1553 push(@input_stack, $input_top); 1554 1555 ## Perform beginning-of-document and/or beginning-of-input processing 1556 $self->begin_pod() if (@input_stack == 1); 1557 $self->begin_input(); 1558 1559 return $input_top; 1560} 1561 1562##--------------------------------------------------------------------------- 1563 1564=begin _PRIVATE_ 1565 1566=head1 B<_pop_input_stream()> 1567 1568 $hashref = $parser->_pop_input_stream(); 1569 1570This takes no arguments. It will perform any necessary end-of-file or 1571end-of-document processing and then pop the current input stream from 1572the top of the input stack. 1573 1574The value returned will be reference to the hash-table that represents 1575the new top of the input stream stack. 1576 1577=end _PRIVATE_ 1578 1579=cut 1580 1581sub _pop_input_stream { 1582 my ($self) = @_; 1583 local *myData = $self; 1584 local *input_stack = $myData{_INPUT_STREAMS}; 1585 1586 ## Perform end-of-input and/or end-of-document processing 1587 $self->end_input() if (@input_stack > 0); 1588 $self->end_pod() if (@input_stack == 1); 1589 1590 ## Restore cutting state to whatever it was before we started 1591 ## parsing this file. 1592 my $old_top = pop(@input_stack); 1593 $myData{_CUTTING} = $old_top->was_cutting(); 1594 1595 ## Dont forget to reset the input indicators 1596 my $input_top = undef; 1597 if (@input_stack > 0) { 1598 $input_top = $myData{_TOP_STREAM} = $input_stack[-1]; 1599 $myData{_INFILE} = $input_top->name(); 1600 $myData{_INPUT} = $input_top->handle(); 1601 } else { 1602 delete $myData{_TOP_STREAM}; 1603 delete $myData{_INPUT_STREAMS}; 1604 } 1605 1606 return $input_top; 1607} 1608 1609############################################################################# 1610 1611=head1 TREE-BASED PARSING 1612 1613If straightforward stream-based parsing wont meet your needs (as is 1614likely the case for tasks such as translating PODs into structured 1615markup languages like HTML and XML) then you may need to take the 1616tree-based approach. Rather than doing everything in one pass and 1617calling the B<interpolate()> method to expand sequences into text, it 1618may be desirable to instead create a parse-tree using the B<parse_text()> 1619method to return a tree-like structure which may contain an ordered 1620list of children (each of which may be a text-string, or a similar 1621tree-like structure). 1622 1623Pay special attention to L<"METHODS FOR PARSING AND PROCESSING"> and 1624to the objects described in L<Pod::InputObjects>. The former describes 1625the gory details and parameters for how to customize and extend the 1626parsing behavior of B<Pod::Parser>. B<Pod::InputObjects> provides 1627several objects that may all be used interchangeably as parse-trees. The 1628most obvious one is the B<Pod::ParseTree> object. It defines the basic 1629interface and functionality that all things trying to be a POD parse-tree 1630should do. A B<Pod::ParseTree> is defined such that each "node" may be a 1631text-string, or a reference to another parse-tree. Each B<Pod::Paragraph> 1632object and each B<Pod::InteriorSequence> object also supports the basic 1633parse-tree interface. 1634 1635The B<parse_text()> method takes a given paragraph of text, and 1636returns a parse-tree that contains one or more children, each of which 1637may be a text-string, or an InteriorSequence object. There are also 1638callback-options that may be passed to B<parse_text()> to customize 1639the way it expands or transforms interior-sequences, as well as the 1640returned result. These callbacks can be used to create a parse-tree 1641with custom-made objects (which may or may not support the parse-tree 1642interface, depending on how you choose to do it). 1643 1644If you wish to turn an entire POD document into a parse-tree, that process 1645is fairly straightforward. The B<parse_text()> method is the key to doing 1646this successfully. Every paragraph-callback (i.e. the polymorphic methods 1647for B<command()>, B<verbatim()>, and B<textblock()> paragraphs) takes 1648a B<Pod::Paragraph> object as an argument. Each paragraph object has a 1649B<parse_tree()> method that can be used to get or set a corresponding 1650parse-tree. So for each of those paragraph-callback methods, simply call 1651B<parse_text()> with the options you desire, and then use the returned 1652parse-tree to assign to the given paragraph object. 1653 1654That gives you a parse-tree for each paragraph - so now all you need is 1655an ordered list of paragraphs. You can maintain that yourself as a data 1656element in the object/hash. The most straightforward way would be simply 1657to use an array-ref, with the desired set of custom "options" for each 1658invocation of B<parse_text>. Let's assume the desired option-set is 1659given by the hash C<%options>. Then we might do something like the 1660following: 1661 1662 package MyPodParserTree; 1663 1664 @ISA = qw( Pod::Parser ); 1665 1666 ... 1667 1668 sub begin_pod { 1669 my $self = shift; 1670 $self->{'-paragraphs'} = []; ## initialize paragraph list 1671 } 1672 1673 sub command { 1674 my ($parser, $command, $paragraph, $line_num, $pod_para) = @_; 1675 my $ptree = $parser->parse_text({%options}, $paragraph, ...); 1676 $pod_para->parse_tree( $ptree ); 1677 push @{ $self->{'-paragraphs'} }, $pod_para; 1678 } 1679 1680 sub verbatim { 1681 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1682 push @{ $self->{'-paragraphs'} }, $pod_para; 1683 } 1684 1685 sub textblock { 1686 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1687 my $ptree = $parser->parse_text({%options}, $paragraph, ...); 1688 $pod_para->parse_tree( $ptree ); 1689 push @{ $self->{'-paragraphs'} }, $pod_para; 1690 } 1691 1692 ... 1693 1694 package main; 1695 ... 1696 my $parser = new MyPodParserTree(...); 1697 $parser->parse_from_file(...); 1698 my $paragraphs_ref = $parser->{'-paragraphs'}; 1699 1700Of course, in this module-author's humble opinion, I'd be more inclined to 1701use the existing B<Pod::ParseTree> object than a simple array. That way 1702everything in it, paragraphs and sequences, all respond to the same core 1703interface for all parse-tree nodes. The result would look something like: 1704 1705 package MyPodParserTree2; 1706 1707 ... 1708 1709 sub begin_pod { 1710 my $self = shift; 1711 $self->{'-ptree'} = new Pod::ParseTree; ## initialize parse-tree 1712 } 1713 1714 sub parse_tree { 1715 ## convenience method to get/set the parse-tree for the entire POD 1716 (@_ > 1) and $_[0]->{'-ptree'} = $_[1]; 1717 return $_[0]->{'-ptree'}; 1718 } 1719 1720 sub command { 1721 my ($parser, $command, $paragraph, $line_num, $pod_para) = @_; 1722 my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...); 1723 $pod_para->parse_tree( $ptree ); 1724 $parser->parse_tree()->append( $pod_para ); 1725 } 1726 1727 sub verbatim { 1728 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1729 $parser->parse_tree()->append( $pod_para ); 1730 } 1731 1732 sub textblock { 1733 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1734 my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...); 1735 $pod_para->parse_tree( $ptree ); 1736 $parser->parse_tree()->append( $pod_para ); 1737 } 1738 1739 ... 1740 1741 package main; 1742 ... 1743 my $parser = new MyPodParserTree2(...); 1744 $parser->parse_from_file(...); 1745 my $ptree = $parser->parse_tree; 1746 ... 1747 1748Now you have the entire POD document as one great big parse-tree. You 1749can even use the B<-expand_seq> option to B<parse_text> to insert 1750whole different kinds of objects. Just don't expect B<Pod::Parser> 1751to know what to do with them after that. That will need to be in your 1752code. Or, alternatively, you can insert any object you like so long as 1753it conforms to the B<Pod::ParseTree> interface. 1754 1755One could use this to create subclasses of B<Pod::Paragraphs> and 1756B<Pod::InteriorSequences> for specific commands (or to create your own 1757custom node-types in the parse-tree) and add some kind of B<emit()> 1758method to each custom node/subclass object in the tree. Then all you'd 1759need to do is recursively walk the tree in the desired order, processing 1760the children (most likely from left to right) by formatting them if 1761they are text-strings, or by calling their B<emit()> method if they 1762are objects/references. 1763 1764=head1 SEE ALSO 1765 1766L<Pod::InputObjects>, L<Pod::Select> 1767 1768B<Pod::InputObjects> defines POD input objects corresponding to 1769command paragraphs, parse-trees, and interior-sequences. 1770 1771B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability 1772to selectively include and/or exclude sections of a POD document from being 1773translated based upon the current heading, subheading, subsubheading, etc. 1774 1775=for __PRIVATE__ 1776B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users 1777the ability the employ I<callback functions> instead of, or in addition 1778to, overriding methods of the base class. 1779 1780=for __PRIVATE__ 1781B<Pod::Select> and B<Pod::Callbacks> do not override any 1782methods nor do they define any new methods with the same name. Because 1783of this, they may I<both> be used (in combination) as a base class of 1784the same subclass in order to combine their functionality without 1785causing any namespace clashes due to multiple inheritance. 1786 1787=head1 AUTHOR 1788 1789Please report bugs using L<http://rt.cpan.org>. 1790 1791Brad Appleton E<lt>bradapp@enteract.comE<gt> 1792 1793Based on code for B<Pod::Text> written by 1794Tom Christiansen E<lt>tchrist@mox.perl.comE<gt> 1795 1796=cut 1797 17981; 1799# vim: ts=4 sw=4 et 1800