| 1 | =head1 NAME |
---|
| 2 | |
---|
| 3 | perlfilter - Source Filters |
---|
| 4 | |
---|
| 5 | =head1 DESCRIPTION |
---|
| 6 | |
---|
| 7 | This article is about a little-known feature of Perl called |
---|
| 8 | I<source filters>. Source filters alter the program text of a module |
---|
| 9 | before Perl sees it, much as a C preprocessor alters the source text of |
---|
| 10 | a C program before the compiler sees it. This article tells you more |
---|
| 11 | about what source filters are, how they work, and how to write your |
---|
| 12 | own. |
---|
| 13 | |
---|
| 14 | The original purpose of source filters was to let you encrypt your |
---|
| 15 | program source to prevent casual piracy. This isn't all they can do, as |
---|
| 16 | you'll soon learn. But first, the basics. |
---|
| 17 | |
---|
| 18 | =head1 CONCEPTS |
---|
| 19 | |
---|
| 20 | Before the Perl interpreter can execute a Perl script, it must first |
---|
| 21 | read it from a file into memory for parsing and compilation. If that |
---|
| 22 | script itself includes other scripts with a C<use> or C<require> |
---|
| 23 | statement, then each of those scripts will have to be read from their |
---|
| 24 | respective files as well. |
---|
| 25 | |
---|
| 26 | Now think of each logical connection between the Perl parser and an |
---|
| 27 | individual file as a I<source stream>. A source stream is created when |
---|
| 28 | the Perl parser opens a file, it continues to exist as the source code |
---|
| 29 | is read into memory, and it is destroyed when Perl is finished parsing |
---|
| 30 | the file. If the parser encounters a C<require> or C<use> statement in |
---|
| 31 | a source stream, a new and distinct stream is created just for that |
---|
| 32 | file. |
---|
| 33 | |
---|
| 34 | The diagram below represents a single source stream, with the flow of |
---|
| 35 | source from a Perl script file on the left into the Perl parser on the |
---|
| 36 | right. This is how Perl normally operates. |
---|
| 37 | |
---|
| 38 | file -------> parser |
---|
| 39 | |
---|
| 40 | There are two important points to remember: |
---|
| 41 | |
---|
| 42 | =over 5 |
---|
| 43 | |
---|
| 44 | =item 1. |
---|
| 45 | |
---|
| 46 | Although there can be any number of source streams in existence at any |
---|
| 47 | given time, only one will be active. |
---|
| 48 | |
---|
| 49 | =item 2. |
---|
| 50 | |
---|
| 51 | Every source stream is associated with only one file. |
---|
| 52 | |
---|
| 53 | =back |
---|
| 54 | |
---|
| 55 | A source filter is a special kind of Perl module that intercepts and |
---|
| 56 | modifies a source stream before it reaches the parser. A source filter |
---|
| 57 | changes our diagram like this: |
---|
| 58 | |
---|
| 59 | file ----> filter ----> parser |
---|
| 60 | |
---|
| 61 | If that doesn't make much sense, consider the analogy of a command |
---|
| 62 | pipeline. Say you have a shell script stored in the compressed file |
---|
| 63 | I<trial.gz>. The simple pipeline command below runs the script without |
---|
| 64 | needing to create a temporary file to hold the uncompressed file. |
---|
| 65 | |
---|
| 66 | gunzip -c trial.gz | sh |
---|
| 67 | |
---|
| 68 | In this case, the data flow from the pipeline can be represented as follows: |
---|
| 69 | |
---|
| 70 | trial.gz ----> gunzip ----> sh |
---|
| 71 | |
---|
| 72 | With source filters, you can store the text of your script compressed and use a source filter to uncompress it for Perl's parser: |
---|
| 73 | |
---|
| 74 | compressed gunzip |
---|
| 75 | Perl program ---> source filter ---> parser |
---|
| 76 | |
---|
| 77 | =head1 USING FILTERS |
---|
| 78 | |
---|
| 79 | So how do you use a source filter in a Perl script? Above, I said that |
---|
| 80 | a source filter is just a special kind of module. Like all Perl |
---|
| 81 | modules, a source filter is invoked with a use statement. |
---|
| 82 | |
---|
| 83 | Say you want to pass your Perl source through the C preprocessor before |
---|
| 84 | execution. Older versions of Perl offered the C<-P> command line option |
---|
| 85 | for this (it was removed in Perl 5.12), but the source filters distribution |
---|
| 86 | comes with a C preprocessor filter module called Filter::cpp. Let's use that. |
---|
| 87 | |
---|
| 88 | Below is an example program, C<cpp_test>, which makes use of this filter. |
---|
| 89 | Line numbers have been added to allow specific lines to be referenced |
---|
| 90 | easily. |
---|
| 91 | |
---|
| 92 | 1: use Filter::cpp ; |
---|
| 93 | 2: #define TRUE 1 |
---|
| 94 | 3: $a = TRUE ; |
---|
| 95 | 4: print "a = $a\n" ; |
---|
| 96 | |
---|
| 97 | When you execute this script, Perl creates a source stream for the |
---|
| 98 | file. Before the parser processes any of the lines from the file, the |
---|
| 99 | source stream looks like this: |
---|
| 100 | |
---|
| 101 | cpp_test ---------> parser |
---|
| 102 | |
---|
| 103 | Line 1, C<use Filter::cpp>, includes and installs the C<cpp> filter |
---|
| 104 | module. All source filters work this way. The use statement is compiled |
---|
| 105 | and executed at compile time, before any more of the file is read, and |
---|
| 106 | it attaches the cpp filter to the source stream behind the scenes. Now |
---|
| 107 | the data flow looks like this: |
---|
| 108 | |
---|
| 109 | cpp_test ----> cpp filter ----> parser |
---|
| 110 | |
---|
| 111 | As the parser reads the second and subsequent lines from the source |
---|
| 112 | stream, it feeds those lines through the C<cpp> source filter before |
---|
| 113 | processing them. The C<cpp> filter simply passes each line through the |
---|
| 114 | real C preprocessor. The output from the C preprocessor is then |
---|
| 115 | inserted back into the source stream by the filter. |
---|
| 116 | |
---|
| 117 | .-> cpp --. |
---|
| 118 | | | |
---|
| 119 | | | |
---|
| 120 | | <-' |
---|
| 121 | cpp_test ----> cpp filter ----> parser |
---|
| 122 | |
---|
| 123 | The parser then sees the following code: |
---|
| 124 | |
---|
| 125 | use Filter::cpp ; |
---|
| 126 | $a = 1 ; |
---|
| 127 | print "a = $a\n" ; |
---|
| 128 | |
---|
| 129 | Let's consider what happens when the filtered code includes another |
---|
| 130 | module with use: |
---|
| 131 | |
---|
| 132 | 1: use Filter::cpp ; |
---|
| 133 | 2: #define TRUE 1 |
---|
| 134 | 3: use Fred ; |
---|
| 135 | 4: $a = TRUE ; |
---|
| 136 | 5: print "a = $a\n" ; |
---|
| 137 | |
---|
| 138 | The C<cpp> filter does not apply to the text of the Fred module, only |
---|
| 139 | to the text of the file that used it (C<cpp_test>). Although the use |
---|
| 140 | statement on line 3 will pass through the cpp filter, the module that |
---|
| 141 | gets included (C<Fred>) will not. The source streams look like this |
---|
| 142 | after line 3 has been parsed and before line 4 is parsed: |
---|
| 143 | |
---|
| 144 | cpp_test ---> cpp filter ---> parser (INACTIVE) |
---|
| 145 | |
---|
| 146 | Fred.pm ----> parser |
---|
| 147 | |
---|
| 148 | As you can see, a new stream has been created for reading the source |
---|
| 149 | from C<Fred.pm>. This stream will remain active until all of C<Fred.pm> |
---|
| 150 | has been parsed. The source stream for C<cpp_test> will still exist, |
---|
| 151 | but is inactive. Once the parser has finished reading Fred.pm, the |
---|
| 152 | source stream associated with it will be destroyed. The source stream |
---|
| 153 | for C<cpp_test> then becomes active again and the parser reads line 4 |
---|
| 154 | and subsequent lines from C<cpp_test>. |
---|
| 155 | |
---|
| 156 | You can use more than one source filter on a single file. Similarly, |
---|
| 157 | you can reuse the same filter in as many files as you like. |
---|
| 158 | |
---|
| 159 | For example, if you have a uuencoded and compressed source file, it is |
---|
| 160 | possible to stack a uudecode filter and an uncompression filter like |
---|
| 161 | this: |
---|
| 162 | |
---|
| 163 | use Filter::uudecode ; use Filter::uncompress ; |
---|
| 164 | M'XL(".H<US4''V9I;F%L')Q;>7/;1I;_>_I3=&E=%:F*I"T?22Q/ |
---|
| 165 | M6]9*<IQCO*XFT"0[PL%%'Y+IG?WN^ZYN-$'J.[.JE$,20/?K=_[> |
---|
| 166 | ... |
---|
| 167 | |
---|
| 168 | Once the first line has been processed, the flow will look like this: |
---|
| 169 | |
---|
| 170 | file ---> uudecode ---> uncompress ---> parser |
---|
| 171 | filter filter |
---|
| 172 | |
---|
| 173 | Data flows through filters in the same order they appear in the source |
---|
| 174 | file. The uudecode filter appeared before the uncompress filter, so the |
---|
| 175 | source file will be uudecoded before it's uncompressed. |
---|
| 176 | |
---|
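The ordering rule can be checked with a small experiment. The sketch below is not part of the source filters distribution: the module names C<First> and C<Second>, the file name F<stacked.pl>, and the C<write_filter> helper are all made up for illustration, and it assumes C<Filter::Util::Call> (introduced later in this article) is installed. Each filter rewrites one marker word, so the final output reveals which filter saw the line first.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical helper: write a tiny filter module that rewrites $from to $to.
sub write_filter {
    my ($name, $from, $to) = @_;
    open my $fh, '>', "$dir/$name.pm" or die "Cannot write $name.pm: $!\n";
    print {$fh} <<"END_MODULE";
package $name;
use Filter::Util::Call;
sub import {
    filter_add(sub {
        my \$status = filter_read();
        s/$from/$to/g if \$status > 0;
        return \$status;
    });
}
1;
END_MODULE
    close $fh or die "Cannot close $name.pm: $!\n";
}

# First sees the raw file; Second sees whatever First produced.
write_filter('First',  'RAW',     'DECODED');
write_filter('Second', 'DECODED', 'EXPANDED');

# The script under test uses both filters, in that order.
open my $script, '>', "$dir/stacked.pl" or die "Cannot write script: $!\n";
print {$script} qq{use First;\nuse Second;\nprint "RAW\\n";\n};
close $script or die "Cannot close script: $!\n";

# Run the script in a child perl so the filters apply only to it.
my $output = qx{"$^X" "-I$dir" "$dir/stacked.pl"};
print $output;
```

If the filters ran in the opposite order, C<Second> would find nothing to rewrite and the script would print C<DECODED> rather than C<EXPANDED>.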
| 177 | =head1 WRITING A SOURCE FILTER |
---|
| 178 | |
---|
| 179 | There are three ways to write your own source filter. You can write it |
---|
| 180 | in C, use an external program as a filter, or write the filter in Perl. |
---|
| 181 | I won't cover the first two in any great detail, so I'll get them out |
---|
| 182 | of the way first. Writing the filter in Perl is most convenient, so |
---|
| 183 | I'll devote the most space to it. |
---|
| 184 | |
---|
| 185 | =head1 WRITING A SOURCE FILTER IN C |
---|
| 186 | |
---|
| 187 | The first of the three available techniques is to write the filter |
---|
| 188 | completely in C. The external module you create interfaces directly |
---|
| 189 | with the source filter hooks provided by Perl. |
---|
| 190 | |
---|
| 191 | The advantage of this technique is that you have complete control over |
---|
| 192 | the implementation of your filter. The big disadvantage is the |
---|
| 193 | increased complexity required to write the filter - not only do you |
---|
| 194 | need to understand the source filter hooks, but you also need a |
---|
| 195 | reasonable knowledge of Perl guts. One of the few times it is worth |
---|
| 196 | going to this trouble is when writing a source scrambler. The |
---|
| 197 | C<decrypt> filter (which unscrambles the source before Perl parses it) |
---|
| 198 | included with the source filter distribution is an example of a C |
---|
| 199 | source filter (see Decryption Filters, below). |
---|
| 200 | |
---|
| 201 | |
---|
| 202 | =over 5 |
---|
| 203 | |
---|
| 204 | =item B<Decryption Filters> |
---|
| 205 | |
---|
| 206 | All decryption filters work on the principle of "security through |
---|
| 207 | obscurity." Regardless of how well you write a decryption filter and |
---|
| 208 | how strong your encryption algorithm, anyone determined enough can |
---|
| 209 | retrieve the original source code. The reason is quite simple - once |
---|
| 210 | the decryption filter has decrypted the source back to its original |
---|
| 211 | form, fragments of it will be stored in the computer's memory as Perl |
---|
| 212 | parses it. The source might only be in memory for a short period of |
---|
| 213 | time, but anyone possessing a debugger, skill, and lots of patience can |
---|
| 214 | eventually reconstruct your program. |
---|
| 215 | |
---|
| 216 | That said, there are a number of steps that can be taken to make life |
---|
| 217 | difficult for the potential cracker. The most important: Write your |
---|
| 218 | decryption filter in C and statically link the decryption module into |
---|
| 219 | the Perl binary. For further tips to make life difficult for the |
---|
| 220 | potential cracker, see the file I<decrypt.pm> in the source filters |
---|
| 221 | distribution. |
---|
| 222 | |
---|
| 223 | =back |
---|
| 224 | |
---|
| 225 | =head1 CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE |
---|
| 226 | |
---|
| 227 | An alternative to writing the filter in C is to create a separate |
---|
| 228 | executable in the language of your choice. The separate executable |
---|
| 229 | reads from standard input, does whatever processing is necessary, and |
---|
| 230 | writes the filtered data to standard output. C<Filter::cpp> is an |
---|
| 231 | example of a source filter implemented as a separate executable - the |
---|
| 232 | executable is the C preprocessor bundled with your C compiler. |
---|
| 233 | |
---|
| 234 | The source filter distribution includes two modules that simplify this |
---|
| 235 | task: C<Filter::exec> and C<Filter::sh>. Both allow you to run any |
---|
| 236 | external executable. Both use a coprocess to control the flow of data |
---|
| 237 | into and out of the external executable. (For details on coprocesses, |
---|
| 238 | see Stevens, W.R. "Advanced Programming in the UNIX Environment." |
---|
| 239 | Addison-Wesley, ISBN 0-201-56317-7, pages 441-445.) The difference |
---|
| 240 | between them is that C<Filter::exec> spawns the external command |
---|
| 241 | directly, while C<Filter::sh> spawns a shell to execute the external |
---|
| 242 | command. (Unix uses the Bourne shell; NT uses the cmd shell.) Spawning |
---|
| 243 | a shell allows you to make use of the shell metacharacters and |
---|
| 244 | redirection facilities. |
---|
| 245 | |
---|
| 246 | Here is an example script that uses C<Filter::sh>: |
---|
| 247 | |
---|
| 248 | use Filter::sh 'tr XYZ PQR' ; |
---|
| 249 | $a = 1 ; |
---|
| 250 | print "XYZ a = $a\n" ; |
---|
| 251 | |
---|
| 252 | The output you'll get when the script is executed: |
---|
| 253 | |
---|
| 254 | PQR a = 1 |
---|
| 255 | |
---|
| 256 | Writing a source filter as a separate executable works fine, but a |
---|
| 257 | small performance penalty is incurred. For example, if you execute the |
---|
| 258 | small example above, a separate subprocess will be created to run the |
---|
| 259 | Unix C<tr> command. Each use of the filter requires its own subprocess. |
---|
| 260 | If creating subprocesses is expensive on your system, you might want to |
---|
| 261 | consider one of the other options for creating source filters. |
---|
| 262 | |
---|
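If you would rather not depend on an external tool such as C<tr>, the filter program itself can be a few lines of Perl. What follows is only a sketch: the subroutine name C<translate_stream> is invented for this example, and in-memory filehandles stand in for standard input and standard output so that the behaviour is easy to inspect.

```perl
use strict;
use warnings;

# Copy every line from $in to $out, mapping X to P, Y to Q and Z to R,
# which is what `tr XYZ PQR` does.
sub translate_stream {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ tr/XYZ/PQR/;
        print {$out} $line;
    }
    return;
}

# A real filter program would simply call:
#     translate_stream(\*STDIN, \*STDOUT);
# Here in-memory filehandles stand in for the two ends of the pipe.
my $source   = qq{print "XYZ a = 1\\n" ;\n};
my $filtered = '';
open my $in,  '<', \$source   or die "Cannot read from string: $!\n";
open my $out, '>', \$filtered or die "Cannot write to string: $!\n";
translate_stream($in, $out);
close $out or die "Cannot close string handle: $!\n";

print $filtered;    # the text Perl's parser would receive
```

Saved as a stand-alone program that calls C<translate_stream(\*STDIN, \*STDOUT)>, this could be named in the argument to C<Filter::sh> or C<Filter::exec> in place of C<tr XYZ PQR>. It still costs one subprocess per use.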
| 263 | =head1 WRITING A SOURCE FILTER IN PERL |
---|
| 264 | |
---|
| 265 | The easiest and most portable option available for creating your own |
---|
| 266 | source filter is to write it completely in Perl. To distinguish this |
---|
| 267 | from the previous two techniques, I'll call it a Perl source filter. |
---|
| 268 | |
---|
| 269 | To help understand how to write a Perl source filter we need an example |
---|
| 270 | to study. Here is a complete source filter that performs rot13 |
---|
| 271 | decoding. (Rot13 is a very simple encryption scheme used in Usenet |
---|
| 272 | postings to hide the contents of offensive posts. It moves every letter |
---|
| 273 | forward thirteen places, so that A becomes N, B becomes O, and Z |
---|
| 274 | becomes M.) |
---|
| 275 | |
---|
| 276 | |
---|
| 277 | package Rot13 ; |
---|
| 278 | |
---|
| 279 | use Filter::Util::Call ; |
---|
| 280 | |
---|
| 281 | sub import { |
---|
| 282 | my ($type) = @_ ; |
---|
| 283 | my ($ref) = [] ; |
---|
| 284 | filter_add(bless $ref) ; |
---|
| 285 | } |
---|
| 286 | |
---|
| 287 | sub filter { |
---|
| 288 | my ($self) = @_ ; |
---|
| 289 | my ($status) ; |
---|
| 290 | |
---|
| 291 | tr/n-za-mN-ZA-M/a-zA-Z/ |
---|
| 292 | if ($status = filter_read()) > 0 ; |
---|
| 293 | $status ; |
---|
| 294 | } |
---|
| 295 | |
---|
| 296 | 1; |
---|
| 297 | |
---|
| 298 | All Perl source filters are implemented as Perl classes and have the |
---|
| 299 | same basic structure as the example above. |
---|
| 300 | |
---|
| 301 | First, we include the C<Filter::Util::Call> module, which exports a |
---|
| 302 | number of functions into your filter's namespace. The filter shown |
---|
| 303 | above uses two of these functions, C<filter_add()> and |
---|
| 304 | C<filter_read()>. |
---|
| 305 | |
---|
| 306 | Next, we create the filter object and associate it with the source |
---|
| 307 | stream by defining the C<import> function. If you know Perl well |
---|
| 308 | enough, you know that C<import> is called automatically every time a |
---|
| 309 | module is included with a use statement. This makes C<import> the ideal |
---|
| 310 | place to both create and install a filter object. |
---|
| 311 | |
---|
| 312 | In the example filter, the object (C<$ref>) is blessed just like any |
---|
| 313 | other Perl object. Our example uses an anonymous array, but this isn't |
---|
| 314 | a requirement. Because this example doesn't need to store any context |
---|
| 315 | information, we could have used a scalar or hash reference just as |
---|
| 316 | well. The next section demonstrates context data. |
---|
| 317 | |
---|
| 318 | The association between the filter object and the source stream is made |
---|
| 319 | with the C<filter_add()> function. This takes a filter object as a |
---|
| 320 | parameter (C<$ref> in this case) and installs it in the source stream. |
---|
| 321 | |
---|
| 322 | Finally, there is the code that actually does the filtering. For this |
---|
| 323 | type of Perl source filter, all the filtering is done in a method |
---|
| 324 | called C<filter()>. (It is also possible to write a Perl source filter |
---|
| 325 | using a closure. See the C<Filter::Util::Call> manual page for more |
---|
| 326 | details.) It's called every time the Perl parser needs another line of |
---|
| 327 | source to process. The C<filter()> method, in turn, reads lines from |
---|
| 328 | the source stream using the C<filter_read()> function. |
---|
| 329 | |
---|
| 330 | If a line was available from the source stream, C<filter_read()> |
---|
| 331 | returns a status value greater than zero and appends the line to C<$_>. |
---|
| 332 | A status value of zero indicates end-of-file; a value less than zero |
---|
| 333 | means an error. The filter function itself is expected to return its status in |
---|
| 334 | the same way, and put the filtered line it wants written to the source |
---|
| 335 | stream in C<$_>. The use of C<$_> accounts for the brevity of most Perl |
---|
| 336 | source filters. |
---|
| 337 | |
---|
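For comparison, here is how the closure form mentioned above might look. Treat it as a sketch rather than the canonical version: the module name C<Rot13Closure>, the file name F<hello.pl>, and the C<write_file> helper are invented for this example. It writes the filter module and a rot13-encoded script into a temporary directory, then runs the script in a child perl so that the filter applies only to that script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical module: the rot13 filter written as a closure.
write_file("$dir/Rot13Closure.pm", <<'END_MODULE');
package Rot13Closure;
use strict;
use warnings;
use Filter::Util::Call;

sub import {
    # filter_add() also accepts a code reference instead of an object.
    filter_add(sub {
        my $status = filter_read();             # appends a line to $_
        tr/n-za-mN-ZA-M/a-zA-Z/ if $status > 0; # decode it in place
        return $status;                         # >0 ok, 0 EOF, <0 error
    });
}

1;
END_MODULE

# The first line is plain Perl; everything after it is rot13-encoded.
write_file("$dir/hello.pl", <<'END_SCRIPT');
use Rot13Closure;
cevag "uryyb serq\a";
END_SCRIPT

my $output = qx{"$^X" "-I$dir" "$dir/hello.pl"};
print $output;

# Small helper to create a file with the given contents.
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!\n";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!\n";
    return;
}
```

The closure needs no C<filter> method and no blessed reference. Any state it wants to keep between calls can live in lexical variables captured by the anonymous sub.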
| 338 | In order to make use of the rot13 filter we need some way of encoding |
---|
| 339 | the source file in rot13 format. The script below, C<mkrot13>, does |
---|
| 340 | just that. |
---|
| 341 | |
---|
| 342 | die "usage mkrot13 filename\n" unless @ARGV ; |
---|
| 343 | my $in = $ARGV[0] ; |
---|
| 344 | my $out = "$in.tmp" ; |
---|
| 345 | open(IN, '<', $in) or die "Cannot open file $in: $!\n"; |
---|
| 346 | open(OUT, '>', $out) or die "Cannot open file $out: $!\n"; |
---|
| 347 | |
---|
| 348 | print OUT "use Rot13;\n" ; |
---|
| 349 | while (<IN>) { |
---|
| 350 | tr/a-zA-Z/n-za-mN-ZA-M/ ; |
---|
| 351 | print OUT ; |
---|
| 352 | } |
---|
| 353 | |
---|
| 354 | close IN; |
---|
| 355 | close OUT; |
---|
| 356 | unlink $in; |
---|
| 357 | rename $out, $in; |
---|
| 358 | |
---|
| 359 | If we encrypt this with C<mkrot13>: |
---|
| 360 | |
---|
| 361 | print " hello fred \n" ; |
---|
| 362 | |
---|
| 363 | the result will be this: |
---|
| 364 | |
---|
| 365 | use Rot13; |
---|
| 366 | cevag " uryyb serq \a" ; |
---|
| 367 | |
---|
| 368 | Running it produces this output: |
---|
| 369 | |
---|
| 370 | hello fred |
---|
| 371 | |
---|
| 372 | =head1 USING CONTEXT: THE DEBUG FILTER |
---|
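The two C<tr> operators, one in C<mkrot13> and one in the filter, are exact inverses of each other. A quick way to convince yourself is to run both on a string. The subroutine names C<rot13_encode> and C<rot13_decode> below are made up for this sketch.

```perl
use strict;
use warnings;

# Encode the way mkrot13 does.
sub rot13_encode {
    my ($text) = @_;
    $text =~ tr/a-zA-Z/n-za-mN-ZA-M/;
    return $text;
}

# Decode the way the Rot13 filter does.
sub rot13_decode {
    my ($text) = @_;
    $text =~ tr/n-za-mN-ZA-M/a-zA-Z/;
    return $text;
}

my $plain   = qq{print "hello fred\\n" ;\n};
my $encoded = rot13_encode($plain);
my $decoded = rot13_decode($encoded);

print $encoded;    # what ends up in the encoded source file
print $decoded;    # what the parser sees after filtering
```

Because the alphabet has 26 letters, applying either subroutine twice also returns the original text, which is why rot13 needs no separate key.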
| 373 | |
---|
| 374 | The rot13 example was a trivial example. Here's another demonstration |
---|
| 375 | that shows off a few more features. |
---|
| 376 | |
---|
| 377 | Say you wanted to include a lot of debugging code in your Perl script |
---|
| 378 | during development, but you didn't want it available in the released |
---|
| 379 | product. Source filters offer a solution. In order to keep the example |
---|
| 380 | simple, let's say you wanted the debugging output to be controlled by |
---|
| 381 | an environment variable, C<DEBUG>. Debugging code is enabled if the |
---|
| 382 | variable exists; otherwise, it is disabled. |
---|
| 383 | |
---|
| 384 | Two special marker lines will bracket debugging code, like this: |
---|
| 385 | |
---|
| 386 | ## DEBUG_BEGIN |
---|
| 387 | if ($year > 1999) { |
---|
| 388 | warn "Debug: millennium bug in year $year\n" ; |
---|
| 389 | } |
---|
| 390 | ## DEBUG_END |
---|
| 391 | |
---|
| 392 | The filter ensures that Perl parses the code between the C<DEBUG_BEGIN> |
---|
| 393 | and C<DEBUG_END> markers only when the C<DEBUG> environment variable |
---|
| 394 | exists. That means that when C<DEBUG> does exist, the code above |
---|
| 395 | should be passed through the filter unchanged. The marker lines can |
---|
| 396 | also be passed through as-is, because the Perl parser will see them as |
---|
| 397 | comment lines. When C<DEBUG> isn't set, we need a way to disable the |
---|
| 398 | debug code. A simple way to achieve that is to convert the lines |
---|
| 399 | between the two markers into comments: |
---|
| 400 | |
---|
| 401 | ## DEBUG_BEGIN |
---|
| 402 | #if ($year > 1999) { |
---|
| 403 | # warn "Debug: millennium bug in year $year\n" ; |
---|
| 404 | #} |
---|
| 405 | ## DEBUG_END |
---|
| 406 | |
---|
| 407 | Here is the complete Debug filter: |
---|
| 408 | |
---|
| 409 | package Debug; |
---|
| 410 | |
---|
| 411 | use strict; |
---|
| 412 | use warnings; |
---|
| 413 | use Filter::Util::Call ; |
---|
| 414 | |
---|
| 415 | use constant TRUE => 1 ; |
---|
| 416 | use constant FALSE => 0 ; |
---|
| 417 | |
---|
| 418 | sub import { |
---|
| 419 | my ($type) = @_ ; |
---|
| 420 | my (%context) = ( |
---|
| 421 | Enabled => defined $ENV{DEBUG}, |
---|
| 422 | InTraceBlock => FALSE, |
---|
| 423 | Filename => (caller)[1], |
---|
| 424 | LineNo => 0, |
---|
| 425 | LastBegin => 0, |
---|
| 426 | ) ; |
---|
| 427 | filter_add(bless \%context) ; |
---|
| 428 | } |
---|
| 429 | |
---|
| 430 | sub Die { |
---|
| 431 | my ($self) = shift ; |
---|
| 432 | my ($message) = shift ; |
---|
| 433 | my ($line_no) = shift || $self->{LastBegin} ; |
---|
| 434 | die "$message at $self->{Filename} line $line_no.\n" |
---|
| 435 | } |
---|
| 436 | |
---|
| 437 | sub filter { |
---|
| 438 | my ($self) = @_ ; |
---|
| 439 | my ($status) ; |
---|
| 440 | $status = filter_read() ; |
---|
| 441 | ++ $self->{LineNo} ; |
---|
| 442 | |
---|
| 443 | # deal with EOF/error first |
---|
| 444 | if ($status <= 0) { |
---|
| 445 | $self->Die("DEBUG_BEGIN has no DEBUG_END") |
---|
| 446 | if $self->{InTraceBlock} ; |
---|
| 447 | return $status ; |
---|
| 448 | } |
---|
| 449 | |
---|
| 450 | if ($self->{InTraceBlock}) { |
---|
| 451 | if (/^\s*##\s*DEBUG_BEGIN/ ) { |
---|
| 452 | $self->Die("Nested DEBUG_BEGIN", $self->{LineNo}) |
---|
| 453 | } elsif (/^\s*##\s*DEBUG_END/) { |
---|
| 454 | $self->{InTraceBlock} = FALSE ; |
---|
| 455 | } |
---|
| 456 | |
---|
| 457 | # comment out the debug lines when the filter is disabled |
---|
| 458 | s/^/#/ if ! $self->{Enabled} ; |
---|
| 459 | } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) { |
---|
| 460 | $self->{InTraceBlock} = TRUE ; |
---|
| 461 | $self->{LastBegin} = $self->{LineNo} ; |
---|
| 462 | } elsif ( /^\s*##\s*DEBUG_END/ ) { |
---|
| 463 | $self->Die("DEBUG_END has no DEBUG_BEGIN", $self->{LineNo}); |
---|
| 464 | } |
---|
| 465 | return $status ; |
---|
| 466 | } |
---|
| 467 | |
---|
| 468 | 1 ; |
---|
| 469 | |
---|
| 470 | The big difference between this filter and the previous example is the |
---|
| 471 | use of context data in the filter object. The filter object is based on |
---|
| 472 | a hash reference, and is used to keep various pieces of context |
---|
| 473 | information between calls to the filter function. All but two of the |
---|
| 474 | hash fields are used for error reporting. The first of those two, |
---|
| 475 | Enabled, is used by the filter to determine whether the debugging code |
---|
| 476 | should be given to the Perl parser. The second, InTraceBlock, is true |
---|
| 477 | when the filter has encountered a C<DEBUG_BEGIN> line, but has not yet |
---|
| 478 | encountered the following C<DEBUG_END> line. |
---|
| 479 | |
---|
| 480 | If you ignore all the error checking that most of the code does, the |
---|
| 481 | essence of the filter is as follows: |
---|
| 482 | |
---|
| 483 | sub filter { |
---|
| 484 | my ($self) = @_ ; |
---|
| 485 | my ($status) ; |
---|
| 486 | $status = filter_read() ; |
---|
| 487 | |
---|
| 488 | # deal with EOF/error first |
---|
| 489 | return $status if $status <= 0 ; |
---|
| 490 | if ($self->{InTraceBlock}) { |
---|
| 491 | if (/^\s*##\s*DEBUG_END/) { |
---|
| 492 | $self->{InTraceBlock} = FALSE |
---|
| 493 | } |
---|
| 494 | |
---|
| 495 | # comment out debug lines when the filter is disabled |
---|
| 496 | s/^/#/ if ! $self->{Enabled} ; |
---|
| 497 | } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) { |
---|
| 498 | $self->{InTraceBlock} = TRUE ; |
---|
| 499 | } |
---|
| 500 | return $status ; |
---|
| 501 | } |
---|
| 502 | |
---|
| 503 | Be warned: just as the C preprocessor doesn't know C, the Debug filter |
---|
| 504 | doesn't know Perl. It can be fooled quite easily: |
---|
| 505 | |
---|
| 506 | print <<EOM; |
---|
| 507 | ##DEBUG_BEGIN |
---|
| 508 | EOM |
---|
| 509 | |
---|
| 510 | Such things aside, you can see that a lot can be achieved with a modest |
---|
| 511 | amount of code. |
---|
| 512 | |
---|
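If you want to experiment with the marker logic without installing a filter, the same state machine can be run over an ordinary list of lines. This is a sketch only: C<debug_transform> is a name invented here, its C<$enabled> argument plays the role of the C<Enabled> field, and the error checking is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the Debug filter's core rule to a list of
# lines and return the transformed copies.
sub debug_transform {
    my ($enabled, @lines) = @_;
    my $in_block = 0;                   # mirrors InTraceBlock
    for (@lines) {                      # each line is aliased to $_
        if ($in_block) {
            $in_block = 0 if /^\s*##\s*DEBUG_END/;
            s/^/#/ unless $enabled;     # comment out when disabled
        }
        elsif (/^\s*##\s*DEBUG_BEGIN/) {
            $in_block = 1;
        }
    }
    return @lines;
}

my @source = (
    "## DEBUG_BEGIN\n",
    "warn \"Debug: checkpoint\\n\" ;\n",
    "## DEBUG_END\n",
    "print \"done\\n\" ;\n",
);

# With debugging disabled, the block body and the end marker gain a '#'.
print debug_transform(0, @source);
```

With C<$enabled> true the lines come back unchanged. With it false, everything after the C<DEBUG_BEGIN> marker up to and including C<DEBUG_END> is commented out, just as in the real filter.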
| 513 | =head1 CONCLUSION |
---|
| 514 | |
---|
| 515 | You now have a better understanding of what a source filter is, and you |
---|
| 516 | might even have a possible use for them. If you feel like playing with |
---|
| 517 | source filters but need a bit of inspiration, here are some extra |
---|
| 518 | features you could add to the Debug filter. |
---|
| 519 | |
---|
| 520 | First, an easy one. Rather than having debugging code that is |
---|
| 521 | all-or-nothing, it would be much more useful to be able to control |
---|
| 522 | which specific blocks of debugging code get included. Try extending the |
---|
| 523 | syntax for debug blocks to allow each to be identified. The contents of |
---|
| 524 | the C<DEBUG> environment variable can then be used to control which |
---|
| 525 | blocks get included. |
---|
| 526 | |
---|
| 527 | Once you can identify individual blocks, try allowing them to be |
---|
| 528 | nested. That isn't difficult either. |
---|
| 529 | |
---|
| 530 | Here is an interesting idea that doesn't involve the Debug filter. |
---|
| 531 | When this article was written, Perl subroutines had fairly limited support |
---|
| 532 | for formal parameter lists. Prototypes let you specify the number of |
---|
| 533 | parameters and their type, but you still had to take them out of the C<@_> |
---|
| 534 | array yourself. Write a source filter that allows you to have a named |
---|
| 535 | parameter list. Such a filter would turn this: |
---|
| 536 | |
---|
| 537 | sub MySub ($first, $second, @rest) { ... } |
---|
| 538 | |
---|
| 539 | into this: |
---|
| 540 | |
---|
| 541 | sub MySub($$@) { |
---|
| 542 | my ($first) = shift ; |
---|
| 543 | my ($second) = shift ; |
---|
| 544 | my (@rest) = @_ ; |
---|
| 545 | ... |
---|
| 546 | } |
---|
| 547 | |
---|
| 548 | Finally, if you feel like a real challenge, have a go at writing a |
---|
| 549 | full-blown Perl macro preprocessor as a source filter. Borrow the |
---|
| 550 | useful features from the C preprocessor and any other macro processors |
---|
| 551 | you know. The tricky bit will be choosing how much knowledge of Perl's |
---|
| 552 | syntax you want your filter to have. |
---|
| 553 | |
---|
| 554 | =head1 REQUIREMENTS |
---|
| 555 | |
---|
| 556 | The Source Filters distribution is available on CPAN, in |
---|
| 557 | |
---|
| 558 | CPAN/modules/by-module/Filter |
---|
| 559 | |
---|
| 560 | =head1 AUTHOR |
---|
| 561 | |
---|
| 562 | Paul Marquess E<lt>Paul.Marquess@btinternet.comE<gt> |
---|
| 563 | |
---|
| 564 | =head1 Copyrights |
---|
| 565 | |
---|
| 566 | This article originally appeared in The Perl Journal #11, and is |
---|
| 567 | copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and |
---|
| 568 | The Perl Journal. This document may be distributed under the same terms |
---|
| 569 | as Perl itself. |
---|