Mod Perl Icon Mod Perl Icon Code Snippets


[ Prev | Main Page | Next ]

Table of Contents:


The Writing Apache Modules with Perl and C book can be purchased online from O'Reilly and Amazon.com.
Your corrections of either technical or grammatical errors are very welcome. You are encouraged to help me to improve this guide. If you have something to contribute please send it directly to me.

[TOC]


Redirecting Errors to the Client instead of error_log

To trap (almost) all Perl run-time errors and send the output to the client instead of to Apache's error_log add this line to your script:

  use CGI::Carp qw(fatalsToBrowser);

Refer to the CGI::Carp man page for more detailed information.

You can also write your own custom __DIE__ and __WARN__ signal handlers. Suppose that I don't want users to see an error message, but I want it to be emailed to me if it's severe enough. The handler is to trap various errors and perform according to some defined logic.

I wrote this handler for the modperl environment, but it works correctly when called from the shell. A stripped-down version of the code is shown here:

  # assign the DIE sighandler to call mydie(error_message) whenever a
  # die() sub is being called. Can be added anywhere in the code.
  local $SIG{'__DIE__'} = \&mydie;
  
  # Do not forget the C<local()>, unless you want this signal handler to
  # be invoked every time any scripts dies (including events where this
  # treatment may be undesirable).
   
  # and the handler itself
  sub mydie{
    my $why = shift;
  
    my $UNDER_MOD_PERL = ( (exists $ENV{'GATEWAY_INTERFACE'} 
                           and $ENV{'GATEWAY_INTERFACE'} =~ /CGI-Perl/)
                         or exists $ENV{'MOD_PERL'} ) ? 1 : 0;
  
    chomp $why;
    my $orig_why = $why;                # an ASCII copy for email report
  
    # handle the shell execution case (so we will not get all the HTML)
    print("Error: $why\n"), exit unless $UNDER_MOD_PERL;
  
    my $should_email = 0;
    my $message = '';
  
    $why =~ s/[<&>]/"&#".ord($&).";"/ge;    # entity escape
  
    # Now we need to trap various kinds of errors, that come from CGI.pm
    # And we don't want these errors to be emailed to us, since
    # these aren't programmatical errors
    if ($orig_why =~ /Client attempted to POST (\d+) bytes/o) {
  
      $message = qq{
                  You cannot POST messages bigger than 
                  @{[1024*$c{max_image_size}]} bytes.<BR>
                  You have tried to post $1 bytes<BR>
                  If you are trying to upload an image, make sure its size is not 
                  bigger than @{[1024*$c{max_image_size}]} bytes.<P>
                  Thank you!
                 };
  
    } elsif ($orig_why =~ /Malformed multipart POST/o) {
  
      $message = qq{
                  Have you tried to upload an image in the wrong way?<P>
                  To sucessfully upload an image you must use a browser that supports
                  image upload and use the 'Browse' button to select that image.
                  DO NOT type the path to the image into the upload field.<P>
                  Thank you!
                 };
  
    } elsif ($orig_why =~ /closed socket during multipart read/o) {
  
      $message = qq{
                  Have you pressed a 'STOP' button?<BR>
                  Please try again!<P>
                  Thank you!
                 };
  
    } else {
  
      $message = qq{
                    <B>There is no action to be performed on your side, since
                  the error report has been already sent to webmaster. <BR><P>
                  <B>Thank you for your patience!</B>
                 };
  
      $should_email = 1;
    }
  
  
    print qq{Content-type: text/html
  
  <HTML><BODY BGCOLOR="white">
  <B>Oops, Something went wrong.</B><P>
  $message
  </BODY></HTML>};      
  
      # send email report if appropriate
    if ($should_email){
  
        # import sendmail subs
      use Mail ();
        # prepare the email error report:
      my $subject ="Error Report";
      my $body = qq|
    An error has happened:
  
    $orig_why
  
      |;
  
        # send error reports to admin and author
      send_mail($c{email}{'admin'},$c{email}{'admin'},$subject,$body);
      send_mail($c{email}{'admin'},$c{email}{'author'},$subject,$body);
      print STDERR "[".scalar localtime()."] [SIGDIE] Sending Error Email\n";
    }
  
       # print to error_log so we will know we've sent
    print STDERR "[".scalar localtime()."] [SIGDIE] $orig_why \n";
  
    exit 1;
  }                             # end of sub mydie
  

You may have noticed that I trap the CGI.pm's die() calls here, I don't see any reason why my users should see ugly error messages, but that's the way CGI.pm written. The workaround is to trap them yourself.

Please note that as of version 2.49, CGI.pm provides a cgi_error() method to print the errors and won't die() unless you want it to.

[TOC]


Caching the POSTed Data

What happens if you need to access the POSTed data more than once? May be if you want to reuse it on subsequent requests. At the low-level data can only be read from a socket once. So you have to store it once and make it available for reuse. There is an experimental option for Makefile.PL called PERL_STASH_POST_DATA. If you turn it on, you can get at it again with $r->subprocess_env("POST_DATA"). This is not on by default because of the overhead it adds. And, because not all POST data is read in one clump, what do we do with large multipart file uploads? It's not a problem that's easy to solve in a general way. You might try the following approach:

  <Limit POST>
     PerlFixupHandler    My::fixup_handler
  </Limit>

  use Apache::Constants;
  sub My::fixup_handler {
    my $r = shift;
    return DECLINED unless $r->method eq "POST";
    $r->args(scalar $r->content);
    $r->method("GET");
    $r->method_number(M_GET);
    $r->headers_in->unset('Content-length');
    OK;
  }

Now when CGI.pm, Apache::Request or whoever parses the client data, it can do so more than once since $r->args doesn't go away (unless you make it go away).

[TOC]


Cache control for regular and error modes

To disable caching you should use the headers:

  Pragma: no-cache
  Cache-control: no-cache

For normally generated responds use:

  $r->header_out("Pragma","no-cache");
  $r->header_out("Cache-control","no-cache");
  $r->no_cache(1);

If for some reason you need to use them in Error control code use:

  $r->err_header_out("Pragma","no-cache");
  $r->err_header_out("Cache-control","no-cache");

[TOC]


Redirect a POST request, forwarding the content

With mod_perl you can easily redirect a POST request to some other location. All it takes is reading in the contents, setting the method to be of a GET type and args with the content to be forwarded and finally doing the redirect:

  my $r = shift;
  my $content = $r->content;
  $r->method("GET");
  $r->method_number(M_GET);
  $r->headers_in->unset("Content-length");
  $r->args($content);
  $r->internal_redirect_handler("/new/url");

Of course that last line can be any kind of redirect, not necessarily an internal redirect.

[TOC]


Reading POST Data, then Redirecting or doing something else

If you read POST data, then redirect, you need to do this before the redirect or apache will hang:

  $r->method_number(M_GET);
  $r->method('GET');
  $r->headers_in->unset('Content-length');
  $r->header_out('Location' => $ENV{SCRIPT_NAME});
  $r->status(REDIRECT);
  $r->send_http_header;

After the first time you read POST data, you need the code above to prevent somebody else from trying to read post data that's already been read.

[TOC]


Redirecting While Maintaining Environment Variables

Let's say you have a module that sets some environment variables.

If you redirect, that's most likely telling the web browser to fetch the new page. This makes it a totally new request and none of environment variables stays preserved.

However, if you're using internal_redirect(), then subprocess_env() should do the trick, but the %ENV keys will be prefixed with REDIRECT_.

[TOC]


Terminating a child process on Request Completion

If you want to terminate the child process serving the current request, upon completion of processing, call anywhere in the code:

  $r->child_terminate;

Apache won't actually terminate the child until everything is done and the connection is closed.

[TOC]


More on relative paths

Many people use relative paths for require, use, etc., or open files in the current directory or relative to the current directory. But this will fail if you don't chdir() into the correct directory first (e.g when you call the script by its full path). This code would work:

  /home/httpd/perl/test.pl:
  -------------------------
  #!/usr/bin/perl
  open IN, "./foo.txt";
  -------------------------

if we call the script by:

  % chdir /home/httpd/perl
  % ./test.pl

since foo.txt is located at the same directory the script is being called from. if we call the script by:

  % /home/httpd/perl/test.pl

when we aren't chdir to the /home/httpd/perl, the script will fail to find foo.txt. If you don't want to use hardcoded directories in your scripts, FindBin.pm package will come to rescue.

  use FindBin qw($Bin);
  use lib $Bin;
  open IN, "./foo.txt";

or

  use FindBin qw($Bin);
  open IN, "$Bin/foo.txt";

Now $Bin includes the path of the directory the script resides in, so you can move the script from one directory to the other and call it from anywhere else. The paths will be always correct.

It's different from using "./foo", for you first have to chdir to the directory in which the script is located. (Think about crontabs!!!)

Important: FindBin will not work in mod_perl environment as it's loaded and executed only for the first script executed inside the process, all the other will use the cached value, which would be probably incorrect.

[TOC]


Watching the error_log file without telneting to the server

I wrote this script a long time ago, when I had to debug my CGI scripts but didn't have the access to the error_log file. I asked the admin to install this script and have used it happily since then.

If your scripts are running on these 'Get-free-site' servers, and you cannot debug your script because you can't telnet to the server or can't see the error_log, you can ask your sysadmin to install this script.

Note, that it was written for a plain Apache, and isn't prepared to handle complex multiline error and warning messages generated by mod_perl. It also uses a system() call to do the main work with tail() utility, probably a more efficient perl implementation is due (take a look at File::Tail module). You are welcome to fix it and contribute it back to mod_perl community. Thank you!

Ok, here is the code:

  # !/usr/bin/perl -Tw
  
  use strict;
  
  my $default   = 10;
  my $error_log = "/usr/local/apache/logs/error_log";
  use CGI;
  
  # untaint $ENV{PATH}
  $ENV{'PATH'} = '/bin:/usr/bin';
  delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
    
  my $q = new CGI;
  
  my $counts = (defined $q->param('count') and $q->param('count'))
    ? $q->param('count') : $default;
  
  print $q->header,
        $q->start_html(-bgcolor => "white",
                     -title   => "Error logs"),
        $q->start_form,
        $q->center(
                 $q->b('How many lines to fetch? '),
                 $q->textfield('count',10,3,3),
                 $q->submit('', 'Fetch'),
                 $q->reset,
                ),
        $q->end_form,
        $q->hr;
  
  # untaint $counts
  $counts = ($counts =~ /(\d+)/) ? $1 : 0;
  
  print($q->b("$error_log doesn't exist!!!")),exit unless -e $error_log;
  
  open LOG, "tail -$counts $error_log|" or die "Can't open tail on $error_log :$!\n";
  my @logs = <LOG>;
  close LOG;
    # format and colorize each line nicely
  foreach (@logs) {
      s{
       \[(.*?)\]\s* # date
       \[(.*?)\]\s* # type of error 
       \[(.*?)\]\s* # client part
       (.*)         # the message
      }
      {
        "[$1] <BR> [".
        colorize($2,$2).
        "] <BR> [$3] <PRE>".
        colorize($2,$4).
        "</PRE>"
      }ex;
    print "<BR>$_<BR>"; 
  }
  
  
  
  #############
  sub colorize{
    my ($type,$context) = @_;
  
    my %colors = 
      (
       error  => 'red',
       crit   => 'black',
       notice => 'green',
       warn   => 'brown',
      );
  
    return exists $colors{$type}
        ? qq{<B><FONT COLOR="$colors{$type}">$context</FONT></B>}
        : $context;
  }

[TOC]


Accessing variables from the caller's package

Sometimes you want to access variables from the caller's package. One way is to do:

  my $caller = caller;
  print qq[$caller --- ${"${caller}::var"}];

[TOC]


Handling cookies

Unless you use some well known module like CGI.pm you can handle the cookies yourself.

Cookies come in the $ENV{HTTP_COOKIE} variable. You can print the raw cookie string as $ENV{HTTP_COOKIE}.

Here is a fairly well-known bit of code to take cookie values and put them into a hash:

  sub getCookies {
      # cookies are seperated by a semicolon and a space, this will
      # split them and return a hash of cookies
    local(@rawCookies) = split (/; /,$ENV{'HTTP_COOKIE'});
    local(%cookies);
  
    foreach(@rawCookies){
      ($key, $val) = split (/=/,$_);
      $cookies{$key} = $val;
    }
  
    return %cookies;
  }

[TOC]


Sending multiple cookies with Perl API

Taken that you have prepared your cookies in @cookies, the following would do:

  for(@cookies){
   $r->headers_out->add( 'Set-Cookie' => $_ );
 }

[TOC]


Passing and preserving custom data structures between handlers

Let's say that you wrote a few handlers to process a request, and they all need to share some custom Perl data structure. The pnotes() method comes to your rescue. Given that one of the handlers stored some data in a hash %my_data, before it finishes its activity:

   # First handler:
   my %my_data = qw(foo => 1, bar => 2);
   $r->pnotes('my_data' => \%my_data);

All the subsequent handlers will be able to retrieve the stored data with:

   # Later handler:
   my $info = $r->pnotes('my_data');
   print $info->{foo};

The stored information will be destroyed at the end of the request.

[TOC]


Passing environment variables between handlers

A simple example of passing environment variables between handlers:

Having a configuration:

  PerlAccessHandler My::Access
  PerlLogHandler My::Log

and startup.pl:

  sub My::Access::handler {
    my $r = shift;
    $r->subprocess_env(TICKET => $$);
    $r->notes(TICKET => $$);
  }
  
  sub My::Log::handler {
    my $r = shift;
    my $env = $r->subprocess_env('TICKET');
    my $note = $r->notes('TICKET');
    warn "env=$env, note=$note\n";
  }

Adding %{TICKET}e and %{TICKET}n to the LogFormat for access_log works fine too.

[TOC]


CGI::params in the mod_perl-ish way

Extracting request params in the mod_perl-ish way:

  my $r = shift;  # or $r = Apache->request
  my %params = $r->method eq 'POST' ? $r->content : $r->args;

Also take a look at Apache::Request which has the same parameters extraction and setting API.

[TOC]


Subclassing Apache::Request example

  package My::TestAPR;
    
  use strict;
  use vars qw/@ISA/;
  @ISA = qw/Apache::Request/;
  
  sub new {
        my ($proto, $apr) = @_;
        my $class = ref($proto) || $proto;
        bless { _r => $apr }, $class;
  }
  
  sub param {
        my ($self, $key) = @_;
        my $apr = $self->{_r};
        $apr->param($key) . '42';
  }
  
  sub sum {
        my ($self, $key) = @_;
        my $apr = $self->{_r};
        my @values = $apr->param($key);
        my $sum = 0;
        for (@values) {
                $sum += $_;
        }
        $sum;
  }
  1;
  __END__

[TOC]


Sending email from mod_perl

Well, there is nothing special about sending email from mod_perl, it's just that we do that a lot. And there are a few important issues about it. The most widely used approach is firing a sendmail process and piping the headers and the body to it. The problem is that sendmail is a very heavy process and it makes mod_perl processes less efficient.

One of the improvements is to say to sendmail not to deliver the email at the ``real time'' but to do that in the background or just queue the job until the next queue run, if you don't want your process to wait until delivery is complete, which sometimes significantly diminishes the delay for mod_perl process waiting for the sendmail proces to complete. This can be specified for all deliveries in sendmail.cf or on each invocation on the sendmail command line: -odb (background) -odq (queue-only) or -odd (queue and also defer the DNS/NIS lookups).

Some people prefer using a lighter mail delivery programs like qmail.

The most efficient approach is to talk directly to the SMTP server. Luckily Net::SMTP modules makes this task a very easy one. The only problem is when <Net::SMTP> fails to deliver the mail, because the destination peer server is temporarely down. But from the other side Net::SMTP allows you to send email much much faster, since you don't have to invoke a dedicated process for that. Here is an example of the subroutine that sends email.

  use Net::SMTP ();
  use Carp qw(carp verbose);
  
  #
  # Sends email by using the SMTP Server
  #
  # The SMTP server as defined in Net::Config 
  # or you can hardcode it here, look for $smtp_server below 
  #
  sub send_mail{
    my ($from, $to, $subject, $body) = @_;
  
    my $mail_message = <<__END_OF_MAIL__;
  To: $to
  From: $from
  Subject: $subject
  
  $body
  
  __END_OF_MAIL__
  
      # Set this parameter if you don't have a valid Net/Config.pm
      # entry for SMTP host and uncomment it in the Net::SMTP->new
      # call
    # my $smtp_server = 'localhost';
  
      # init the server
    my $smtp = Net::SMTP->new(
                            # $smtp_server,
                            Timeout => 60, 
                            Debug   => 0,
                           );
  
    $smtp->mail($from) or carp ("Failed to specify a sender [$from]\n");
    $smtp->to($to) or carp ("Failed to specify a recipient [$to]\n");
    $smtp->data([$mail_message]) or carp ("Failed to send a message\n");
  
    $smtp->quit or carp ("Failed to quit\n");
  
  } #  end of sub send_mail

[TOC]


Code Unloading

We urge to preload as much code as possible all the time as it reduces the memory footprint. But sometimes we want to unload the code that was loaded before. For example, you could load many modules to do some configuration or initialization work at the server startup, but none of the children will need these modules later. You can unload the code.

For example if you use XML::Parser in a <Perl section only, you could remove it with:

  delete $INC{'XML/Parser.pm'};
  Apache::PerlRun->flush_namespace('XML::Parser');

[TOC]


A Simple Handler To Print The Environment Variables

The code:

  package MyEnv;
  use Apache;
  use Apache::Constants;
  sub handler{ 
    my $r = shift; 
    print $r->send_http_header("text/plain"); 
    print map {"$_ => $ENV{$_}\n"} keys %ENV;
    return OK;
  }
  1;

The configuration:

  PerlModule MyEnv
  <Location /env>
    SetHandler perl-script
    PerlHandler MyEnv
  </Location>

The invocation:

  http://localhost/env

[TOC]


mysql backup and restore scripts

Well, this is something off-topic but since many of us use mysql or other RDBMS in their work with mod_perl driven sites, it's good to know how to backup and restore the databases in case of database corruption.

First we should tell the mysql to log all the clauses that modify the databases (we don't care about SELECT queries for database backups). Modify the safe_mysql script by adding the --log-update options to the mysql server starting parameters and restart the server. From now on all the non-select queries will be logged into /var/lib/mysql/www.bar.com file. Your hostname will show up instead of www.bar.com.

Now create a dump directory under /var/lib/mysql/. That's where the backups will be stored (you can name the directory as you wish of course).

Prepare the backup script and store it in file, e.g: /usr/local/sbin/mysql/mysql.backup.pl

  #!/usr/bin/perl -w
  
  # this script should be run from the crontab every night or in shorter
  # intervals. This scripts does a few things.
  # 1. dump all the tables into a separate dump files (these dump files 
  # are ready for DB restore)
  # 2. backups the last update log file and create a new log file
  
  
  use strict;
  my $data_dir = "/var/lib/mysql";
  my $update_log = "$data_dir/www.bar.com";
  my $dump_dir  = "$data_dir/dump";
  my $gzip_exec = "/bin/gzip";
  my @db_names = qw(bugs mysql bonsai);
  my $mysql_admin_exec = "/usr/bin/mysqladmin ";
  
      # convert unix time to date + time
  my ($sec,$min,$hour,$mday,$mon,$year) = localtime(time);
  my $time  = sprintf("%0.2d:%0.2d:%0.2d",$hour,$min,$sec);
  my $date  = sprintf("%0.2d.%0.2d.%0.4d",++$mon,$mday,$year+1900);
  my $timestamp = "$date.$time";
  
  # dump all the DBs we want to backup
  foreach my $db_name (@db_names) {
    my $dump_file = "$dump_dir/$timestamp.$db_name.dump";
    my $dump_command = "/usr/bin/mysqldump -c -e -l -q --flush-logs $db_name > $dump_file";
    system $dump_command;
  }
  
  # move update log to backup for later restore if needed
  rename $update_log, "$dump_dir/$timestamp.log" if -e $update_log;
  
  # restart the update log to log to a new file!
  `/usr/bin/mysqladmin refresh`;
  
  # compress all the created files
  system "$gzip_exec $dump_dir/$timestamp.*";

You might need to change the executable paths according to your system. And list the names of the databases you want to backup, using the db_names array.

Now make the script executable and arrange the crontab entry to run the backup script nightly. Notice that in time there backups will use lots of disk space and you should remove the old ones. A sample crontab entry, to run the script at 4am every day:

  0 4 * * * /usr/local/sbin/mysql/mysql.backup.pl > /dev/null 2>&1

So what we have achieved is this. At any moment we have the dump of the databases from the last execution of the backup script and the log file of all the clauses that has updated the databases since then. So if the database gets corrupted we have all the information to restore it, without loosing a single bit of information. We restore it with the following script, which I put in: /usr/local/sbin/mysql/mysql.restore.pl

  #!/usr/bin/perl -w
  
  # this scripts restores the DBs
  
  # Usage: mysql.restore.pl update.log.gz dump.db1.gz [... dump.dbn.gz]
  # all files dump* are compressed as we expect them to be created by 
  # mysql.backup utility
  
  # example: 
  # % mysql.restore.pl myhostname.log.gz 12.10.1998.16:37:12.*.dump.gz
  
  # .dump.gz extension.
  
  use strict;
  
  use FindBin qw($Bin);
  
  my $data_dir   = "/var/lib/mysql";
  my $dump_dir   = "$data_dir/dump";
  my $gzip_exec  = "/bin/gzip";
  my $mysql_exec = "/usr/bin/mysql -f ";
  my $mysql_backup_exec = "$Bin/mysql.backup.pl";
  my $mysql_admin_exec  = "/usr/bin/mysqladmin ";
  
  my $update_log_file = '';
  my @dump_files = ();
  
  # split input files into an update log and the dump files
  foreach (@ARGV) {
    push(@dump_files, $_),next unless /\.log\.gz/;
    $update_log_file = $_;
  }
  
  die "Usage: mysql.restore.pl update.log.gz dump.db1.gz [... dump.dbn.gz]\n" 
    unless defined @dump_files and @dump_files > 0;
  
  # load the dump files
  foreach (@dump_files) {
  
      # check the file exists
    warn("Can't locate $_"),next unless -e $_;
  
      # extract the db name from the dump file
    my $db_name = $1 if /\d\d\.\d\d.\d\d.\d\d:\d\d:\d\d\.(\w+)\.dump\.gz/;
  
    warn("Can't extract DB name from the file name,
          probably an error in the file format"),
            next unless defined $db_name and $db_name;
  
      # we want to drop the table since restore will rebuild it!
      # force to drop the db without confirmation
    my $drop_command = "$mysql_admin_exec -f drop $db_name";
    system $drop_command;
  
    $drop_command = "$mysql_admin_exec create $db_name";
    system $drop_command;
  
      # build the command and execute it
    my $restore_command = "$gzip_exec -cd $_ | $mysql_exec $db_name";
    system $restore_command;
  }
  
  # now load the update_log file (update the db with the changes since
  # the last dump
  warn("Can't locate $update_log_file"),next unless  -e $update_log_file;
  
  my $restore_command = 
    "$gzip_exec -cd $update_log_file |$mysql_exec";
  system $restore_command;
  
  # rerun the mysql.backup.pl since we have reloaded the dump files
  # and update log , and we must rebuild backups!
  system $mysql_backup_exec;

These are kinda dirty scripts, but they work... if you come up with a more clean scripts, please contribute... thanks

[TOC]


The Writing Apache Modules with Perl and C book can be purchased online from O'Reilly and Amazon.com.
Your corrections of either technical or grammatical errors are very welcome. You are encouraged to help me to improve this guide. If you have something to contribute please send it directly to me.
[ Prev | Main Page | Next ]

Written by Stas Bekman.
Last Modified at 12/18/1999
Mod Perl Icon Use of the Camel for Perl is
a trademark of O'Reilly & Associates,
and is used by permission.