REVISION 2

=pod

=head1 A Beginner's Introduction to POE

by Dennis Taylor
January 13th, 2001

=head1 What Is POE, And Why Should I Use It?

Most of the programs we write every day have the same basic blueprint:
they start up, they perform a series of actions, and then they exit.
This works fine for programs that don't need much interaction with their
users or their data, but for more complicated tasks, you need a more
expressive program structure.

That's where POE (Perl Object Environment) comes in. POE is a framework
for building Perl programs that lends itself naturally to tasks which
involve reacting to external data, such as network communications or
user interfaces. Programs written in POE are completely non-linear; you
set up a bunch of small subroutines and define how they all call each
other, and POE will automatically switch between them while it's
handling your program's input and output. It can be confusing at first,
if you're used to procedural programming, but with a little practice it
becomes second nature.

=head1 POE Design

It's not much of an exaggeration to say that POE is a small operating
system written in Perl, with its own kernel, processes, interprocess
communication (IPC), drivers, and so on. In practice, however, it just
boils down to a simple system for assembling state machines. Here's a
brief description of each of the pieces that make up the POE
environment:

=over

=item States

The basic building block of the POE program is the B<state>, which is a
piece of code that gets executed when some event occurs -- when incoming
data arrives, for instance, or when a session runs out of things to do,
or when one session sends a message to another. Everything in POE is
based around receiving and handling these events.

=item The Kernel

POE's B<kernel> is much like an operating system's kernel: it keeps
track of all your processes and data behind the scenes, and schedules
when each piece of your code gets to run. You can use the kernel to set
alarms for your POE processes, queue up states that you want to run, and
perform various other low-level services, but most of the time you don't
interact with it directly.

=item Sessions

B<Sessions> are the POE equivalent of processes in a real operating
system. A session is just a POE program which switches from state to
state as it runs. It can create "child" sessions, send POE events to
other sessions, and so on. Each session can store session-specific data
in a hash called the B<heap>, which is accessible from every state in
that session.

POE has a very simple cooperative multitasking model; every
session executes in the same OS process without threads or forking. For
this reason, you should beware of using blocking system calls in POE
programs.

=back
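
The following is a minimal sketch of how these three pieces fit
together; it is not part of the server we build below. It assumes a POE
release that provides the C<< POE::Session->create >> constructor with
C<inline_states> (the listings in this article use the older
C<new POE::Session> form). The state name C<step> and the heap key
C<count> are made up for illustration. Two sessions each count to three
in their own heap, handing control back to the kernel after every step:

```perl
use strict;
use warnings;
use POE;

my @log;    # records the order in which states actually ran

for my $name (qw(A B)) {
    POE::Session->create(
        inline_states => {
            # _start runs once, when the session is created.
            _start => sub {
                $_[HEAP]->{count} = 0;        # per-session storage
                $_[KERNEL]->yield('step');    # queue our first event
            },
            # 'step' is an ordinary state; it runs when its event arrives.
            step => sub {
                my ($kernel, $heap) = @_[KERNEL, HEAP];
                push @log, $name . ++$heap->{count};
                # Returning lets the kernel run other sessions; we only
                # queue another event while there is work left.
                $kernel->yield('step') if $heap->{count} < 3;
            },
        },
    );
}

POE::Kernel->run();    # returns once both sessions have finished
print "@log\n";
```

Because each state returns quickly, the two sessions' steps should end
up interleaved in C<@log> rather than one session finishing before the
other starts. A state that blocked (on C<sleep>, say) would stall both.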

Those are the basic pieces of the Perl Object Environment, although
there are a few slightly more advanced parts that we ought to explain
before we go on to the actual code:

=over

=item Drivers

B<Drivers> are the lowest level of POE's I/O layer. Currently, there's only
one driver included with the POE distribution -- C<POE::Driver::SysRW>,
which reads and writes data from a filehandle -- so there's not much to
say about them. You'll never actually use a driver directly, anyhow.

=item Filters

B<Filters>, on the other hand, are inordinately useful. A filter is a
simple interface for converting chunks of formatted data into another
format. For example, C<POE::Filter::HTTPD> converts HTTP 1.0 requests
into C<HTTP::Request> objects and back, and C<POE::Filter::Line>
converts a raw stream of data into a series of lines (much like Perl's
E<lt>E<gt> operator).

=item Wheels

B<Wheels> contain reusable pieces of high-level logic for accomplishing
everyday tasks. They're the POE way to encapsulate useful code. Common
things you'll do with wheels in POE include handling event-driven input
and output and easily creating network connections. Wheels often use
Filters and Drivers to massage and send off data. I know this is a vague
description, but the code below will provide some concrete examples.

=item Components

A B<Component> is a session that's designed to be controlled by other
sessions. Your sessions can issue commands to and receive events from
them, much like processes communicating via IPC in a real operating
system. Some examples of Components include C<POE::Component::IRC>, an
interface for creating POE-based IRC clients, or
C<POE::Component::Client::HTTP>, an event-driven HTTP user agent in
Perl. We won't be using any Components in this article, but they're a
very useful part of POE nevertheless.

=back
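
Filters can be exercised on their own, without a kernel or any sessions,
which makes them a good first thing to poke at. Here is a small sketch
using C<POE::Filter::Line>, assuming its C<Literal> option and its
C<get> and C<put> methods behave as described in that module's
documentation. The sample strings are invented:

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Use a plain "\n" as the line terminator in both directions.
my $filter = POE::Filter::Line->new( Literal => "\n" );

# Feed the filter raw chunks, split at arbitrary points as a socket might.
my $lines = $filter->get( [ "6 + 3\n50 / ", "(7 - 2)\nparti" ] );
print "parsed: $_\n" for @$lines;    # complete lines only, terminators removed

# The trailing "parti" stays buffered until the rest of the line arrives.
my $more = $filter->get( [ "al\n" ] );
print "parsed: $_\n" for @$more;

# Going the other way, put() turns lines back into terminated chunks.
my $chunks = $filter->put( [ "Hello, client!" ] );
print "raw: ", join( '', @$chunks );
```

The point of the design is that the code consuming C<$lines> never has
to care where the network happened to split the data.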

=head1 A Simple Example

For this simple example, we're going to make a server daemon which
accepts TCP connections and prints the answers to simple arithmetic
problems posed by its clients. When someone connects to it on port
31008, it will print "Hello, client!". The client can then send it an
arithmetic expression, terminated by a newline (such as "C<6 + 3\n>" or
"C<50 / (7 - 2)\n>"), and the server will send back the answer. Easy
enough, right?

Writing such a program in POE isn't terribly different from the
traditional method of writing daemons in Unix. We'll have a server
session which listens for incoming TCP connections on port 31008. Each
time a connection arrives, it'll create a new child session to handle
the connection. Each child session will interact with the user, and then
quietly die when the connection is closed. And best of all, it'll only
take 74 lines of modular, simple Perl.

The program begins innocently enough:

   1  #!/usr/bin/perl -w
   2  use strict;
   3  use Socket qw(inet_ntoa);
   4  use POE qw( Wheel::SocketFactory  Wheel::ReadWrite
   5              Filter::Line          Driver::SysRW );
   6  use constant PORT => 31008;

Here, we import the modules and functions which the script will use, and
define a constant value for the listening port. The odd-looking C<qw()>
statement after the "C<use POE>" is just POE's shorthand way for pulling
in a lot of C<POE::> modules at once. It's equivalent to the more
verbose:

        use POE;
        use POE::Wheel::SocketFactory;
        use POE::Wheel::ReadWrite;
        use POE::Filter::Line;
        use POE::Driver::SysRW;

Now for a truly cool part:

   7  new POE::Session (
   8      _start => \&server_start,
   9      _stop  => \&server_stop,
  10  );
  11  $poe_kernel->run();
  12  exit;

That's the entire program! We set up the main server session, tell the
POE kernel to start processing events, and then exit when it's done.
(The kernel is considered "done" when it has no more sessions left to
manage, but since we're going to put the server session in an infinite
loop, it'll never actually exit that way in this script.) POE
automatically exports the C<$poe_kernel> variable into your namespace
when you write "C<use POE;>".

The C<new POE::Session> call needs a word of explanation. When you
create a session, you give the kernel a list of the events it will
accept. In the code above, we're saying that the new session will handle
the C<_start> and C<_stop> events by calling the C<&server_start> and
C<&server_stop> functions. Any other events which this session receives
will be ignored. C<_start> and C<_stop> are special events to a POE
session: the C<_start> state is the first thing the session executes
when it's created, and the session is put into the C<_stop> state by the
kernel when it's about to be destroyed. Basically, they're a constructor
and a destructor.
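
To see the constructor/destructor behaviour in isolation, here is a
small sketch that is separate from the server. It assumes the
C<< POE::Session->create >> interface with C<inline_states>, which newer
POE releases provide alongside the C<new POE::Session> form used in this
article:

```perl
use strict;
use warnings;
use POE;

my @log;

POE::Session->create(
    inline_states => {
        _start => sub { push @log, 'start' },   # runs as the session is created
        _stop  => sub { push @log, 'stop'  },   # runs just before it is destroyed
    },
);

# This session sets up nothing that could ever deliver it an event, so the
# kernel has no reason to keep it; run() returns once it has been stopped.
POE::Kernel->run();
push @log, 'run returned';

print join( ', ', @log ), "\n";
```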

Now that we've written the entire program, we have to write the code for
the states which our sessions will execute while the program runs. Let's
start with (appropriately enough) C<&server_start>, which is called when
the main server session is created at the beginning of the program:

  13  sub server_start {
  14      $_[HEAP]->{listener} = new POE::Wheel::SocketFactory
  15        ( BindPort     => PORT,
  16          Reuse        => 'yes',
  17          SuccessState => \&accept_new_client,
  18          FailureState => \&accept_failed
  19        );
  20      print "SERVER: Started listening on port ", PORT, ".\n";
  21  }

This is a good example of a POE state. First things first: notice the
variable called C<$_[HEAP]>. POE has a special way of passing arguments
around. The C<@_> array is packed with lots of extra arguments -- a
reference to the current kernel and session, the state name, a reference
to the heap, and other goodies. To access them, you index the C<@_>
array with special constants which POE exports, such as C<HEAP>,
C<SESSION>, C<KERNEL>, C<STATE>, and C<ARG0> through C<ARG9> for the
state's user-supplied arguments. Like most design decisions in POE, this
scheme is meant to maximize backwards compatibility without sacrificing
speed. The example above stores a SocketFactory wheel in the heap under
the key 'C<listener>'.
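
The indexing scheme itself is plain Perl. The sketch below imitates it
with made-up offsets so that it runs without POE installed; the real
positions are internal to POE, which is exactly why you use the exported
names instead of numbers. The C<dispatch> helper is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical offsets, for illustration only. POE exports constants with
# these names, but their actual values are its own business.
use constant { KERNEL => 0, HEAP => 1, STATE => 2, ARG0 => 3, ARG1 => 4 };

# Stand-in for the kernel: call a state with the bookkeeping values in
# front and the caller's own arguments after them.
sub dispatch {
    my ($state, $heap, $name, @args) = @_;
    return $state->( 'fake-kernel', $heap, $name, @args );
}

sub greet {
    # Pick out only the fields this state cares about.
    my ($heap, $who) = @_[ HEAP, ARG0 ];
    $heap->{greeted}++;
    return $_[STATE] . ": hello, $who";
}

my %heap;
print dispatch( \&greet, \%heap, 'greet', 'client' ), "\n";
print "greeted $heap{greeted} time(s)\n";
```

A state that ignores most of C<@_> pays nothing for the fields it never
touches, and new fields can be added later without breaking it.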

The C<POE::Wheel::SocketFactory> wheel is one of the coolest things
about POE. You can use it to create any sort of stream socket (sorry, no
UDP sockets yet) without worrying about the details. The statement above
will create a SocketFactory that listens on the specified TCP port (with
the C<SO_REUSEADDR> option set) for new connections. When a connection is
established, it will call the C<&accept_new_client> state to pass on the
new client socket; if something goes wrong, it'll call the
C<&accept_failed> state instead to let us handle the error. That's all
there is to networking in POE!
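
One detail of the listing deserves a sketch of its own: the wheel is
assigned into the heap instead of a lexical variable. Perl destroys an
object as soon as the last reference to it disappears, and a wheel stops
working when it is destroyed. The toy class below (C<ToyWheel>, a
hypothetical stand-in rather than anything from POE) shows the
difference:

```perl
use strict;
use warnings;

my @events;    # what happened, in order

{
    package ToyWheel;
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { push @events, "$_[0]{name} destroyed" }
}

my %heap;      # stands in for a session's heap

sub state_with_lexical {
    my $wheel = ToyWheel->new('lexical');     # only reference is $wheel
    return;                                   # ...so it dies right here
}

sub state_with_heap {
    $heap{listener} = ToyWheel->new('heap');  # reference outlives the sub
    return;
}

state_with_lexical();
state_with_heap();
push @events, 'both states returned';

delete $heap{listener};                       # dropping the reference destroys it
print "$_\n" for @events;
```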

We store the wheel in the heap to keep Perl from accidentally
garbage-collecting it at the end of the state -- this way, it's
persistent across all states in this session. Now, onto the
C<&server_stop> state:

  22  sub server_stop {
  23      print "SERVER: Stopped.\n";
  24  }

Not much to it. I just put this state here to illustrate the flow of the
program when you run it. We could just as easily have had no C<_stop>
state for the session at all, but it's more instructive (and easier to
debug) this way.

Here's where we create new sessions to handle each incoming connection:

  25  sub accept_new_client {
  26      my ($socket, $peeraddr, $peerport) = @_[ARG0 .. ARG2];
  27      $peeraddr = inet_ntoa($peeraddr);
  28      new POE::Session (
  29          _start => \&child_start,
  30          _stop  => \&child_stop,
  31          main   => [ 'child_input', 'child_done', 'child_error' ],
  32          [ $socket, $peeraddr, $peerport ]
  33      );
  34      print "SERVER: Got connection from $peeraddr:$peerport.\n";
  35  }

Our C<POE::Wheel::SocketFactory> will call this subroutine whenever it
successfully accepts a connection from a client. We convert the
socket's address into a human-readable IP address (line 27) and then set
up a new session which will talk to the client. It's somewhat similar to
the previous C<POE::Session> constructor we've seen, but a couple of
things bear explaining:

C<@_[ARG0 .. ARG2]> is shorthand for C<($_[ARG0], $_[ARG1], $_[ARG2])>.
You'll see array slices used like this a lot in POE programs.

What does line 31 mean? It's not like any other C<< event_name => state >>
pair that we've seen yet. Actually, it's another clever abbreviation. If
we were to write it out the long way, it would be:

  new POE::Session (
      ...
      child_input => \&main::child_input,
      child_done  => \&main::child_done,
      child_error => \&main::child_error,
      ...
  );

It's a handy way to write out a lot of state names when the state name
is the same as the event name -- you just pass a package name or object
as the key, and an array reference full of subroutine or method names,
and POE will just do the right thing. See the C<POE::Session> docs for
more useful tricks like that.
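
To make the abbreviation less magical, here is a rough sketch of the
kind of expansion it implies, written in plain Perl. This is not POE's
implementation; C<expand_states> is a hypothetical helper, and the three
stub subroutines stand in for the real states:

```perl
use strict;
use warnings;

# Stub states, standing in for the real ones defined later in the program.
sub child_input { return 'input' }
sub child_done  { return 'done'  }
sub child_error { return 'error' }

# Turn ( package => [ names ] ) into ( name => \&package::name, ... ).
sub expand_states {
    my ($package, $names) = @_;
    my %states;
    for my $name (@$names) {
        # can() returns a code reference if the package defines that sub.
        my $code = $package->can($name)
            or die "No subroutine $name in package $package\n";
        $states{$name} = $code;
    }
    return %states;
}

my %states = expand_states( main => [ 'child_input', 'child_done', 'child_error' ] );
print "$_ => ", $states{$_}->(), "\n" for sort keys %states;
```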

Finally, the array reference at the end of the C<POE::Session>
constructor's argument list (on line 32) is the list of arguments which
we're going to manually supply to the session's C<_start> state.

If the C<POE::Wheel::SocketFactory> had problems creating the listening
socket or accepting a connection, this happens:

  36  sub accept_failed {
  37      my ($function, $error) = @_[ARG0, ARG2];
  38      delete $_[HEAP]->{listener};
  39      print "SERVER: call to $function() failed: $error.\n";
  40  }

Printing the error message is normal enough, but why do we delete the
C<SocketFactory> wheel from the heap? The answer lies in the way POE
manages session resources. Each session is considered "alive" so long as
it has some way of generating or receiving events. If it has no wheels
and no aliases (a nifty POE feature which we won't cover in this
article), the POE kernel realizes that the session is dead and
garbage-collects it. The only way the server session can get events is
from its C<SocketFactory> wheel -- if that's
destroyed, the POE kernel will wait until all its child sessions have
finished, and then garbage-collect the session. At this point, since
there are no remaining sessions to execute, the POE kernel will run out
of things to do and exit.
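
The same rule can be watched in miniature with a timer instead of a
wheel. In this sketch, which assumes the C<< POE::Session->create >>
interface and the kernel's C<delay> method as described in current POE
documentation, a pending timer is the only thing keeping the session
alive. The state name C<tick> and the 0.01-second interval are
arbitrary:

```perl
use strict;
use warnings;
use POE;

my @log;

POE::Session->create(
    inline_states => {
        _start => sub {
            $_[HEAP]->{left} = 3;
            $_[KERNEL]->delay( tick => 0.01 );    # a pending timer counts as a resource
        },
        tick => sub {
            my ($kernel, $heap) = @_[KERNEL, HEAP];
            push @log, 'tick';
            # Re-arm the timer only while there is work left. After the last
            # tick the session has no way to receive another event.
            $kernel->delay( tick => 0.01 ) if --$heap->{left} > 0;
        },
        _stop => sub { push @log, 'stop' },
    },
);

POE::Kernel->run();    # returns when the last session is gone
push @log, 'kernel exited';
print "@log\n";
```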

So, basically, this is just the normal way of getting rid of unwanted
POE sessions: dispose of all the session's resources and let the kernel
clean up. Now, onto the details of the child sessions:

  41  sub child_start {
  42      my ($heap, $socket) = @_[HEAP, ARG0];
  43      $heap->{readwrite} = new POE::Wheel::ReadWrite
  44        ( Handle => $socket,
  45          Driver => new POE::Driver::SysRW (),
  46          Filter => new POE::Filter::Line (),
  47          InputState => 'child_input',
  48          ErrorState => 'child_error',
  49        );
  50      $heap->{readwrite}->put( "Hello, client!" );
  51      $heap->{peername} = join ':', @_[ARG1, ARG2];
  52      print "CHILD: Connected to $heap->{peername}.\n";
  53  }

This gets called every time a new child session is created to handle a
newly connected client. We'll introduce a new sort of POE wheel here:
the C<ReadWrite> wheel, which is an event-driven way to handle I/O
tasks. We pass it a filehandle, a driver which it'll use for I/O calls,
and a filter that it'll munge incoming and outgoing data with (in this
case, turning a raw stream of socket data into separate lines and vice
versa). In return, the wheel will send this session a C<child_input> event
whenever new data arrives on the filehandle, and a C<child_error> event
if any errors occur.

We immediately use the new wheel to output the string "Hello, client!"
to the socket. (When you try out the code, note that the
C<POE::Filter::Line> filter takes care of adding a line terminator to
the string for us.) Finally, we store the address and port of the
client in the heap, and print a success message.

We will omit discussion of the C<child_stop> state, since it's only one
line long. Now for the real meat of the program: the C<child_input>
state!

  57  sub child_input {
  58      my $data = $_[ARG0];
  59      $data =~ tr{0-9+*/()-}{}cd;
  60      return unless length $data;
  61      my $result = eval $data;
  62      chomp $@;
  63      $_[HEAP]->{readwrite}->put( $@ || $result );
  64      print "CHILD: Got input from peer: \"$data\" = $result.\n";
  65  }

When the client sends us a line of data, we strip it down to a simple
arithmetic expression and eval it, sending either the result or an error
message back to the client. Normally, passing untrusted user data
straight to C<eval()> is a horribly dangerous thing to do, so we have to
make sure we remove every non-arithmetic character from the string
before it's evaled (line 59). The child session will happily keep
accepting new data until the client closes the connection. Run the code
yourself and give it a try!
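
The filtering step is easy to try outside the server. The sketch below
wraps lines 59 through 61 in a standalone subroutine; the name
C<evaluate> and the sample inputs are made up for illustration:

```perl
use strict;
use warnings;

# Keep only digits, arithmetic operators, and parentheses, then evaluate.
# Returns the result, the error message, or undef if nothing was left.
sub evaluate {
    my ($data) = @_;
    $data =~ tr{0-9+*/()-}{}cd;    # c = complement the set, d = delete matches
    return unless length $data;
    my $result = eval $data;
    chomp( my $error = $@ );
    return $error || $result;
}

for my $input ( '6 + 3', '50 / (7 - 2)', 'print 2 * 21', '1 / 0', 'hello' ) {
    my $answer = evaluate($input);
    printf "%-14s => %s\n", $input, defined $answer ? $answer : '(ignored)';
}
```

Letters never reach C<eval>, so C<print 2 * 21> is evaluated as
C<2*21>. Note that the character list still allows C<**>, so it guards
against arbitrary code, not against expensive arithmetic.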

The C<child_done> and C<child_error> states should be fairly
self-explanatory by now -- they each delete the child session's
C<ReadWrite> wheel, thus causing the session to be garbage-collected,
and print an expository message explaining what happened. Easy enough.

=head1 That's All For Today

And that's all there is to it! The longest subroutine in the entire
program is only 13 lines, and all the complicated parts of the
server-writing process have been offloaded to POE. Now, you could make
the argument that it could be done more easily as a procedural-style
program, like the examples in C<man perlipc>. For a simple example
program like this, that would probably be true. But the beauty of POE is
that, as your program scales, it stays easy to modify. It's easier to
organize your program into discrete elements, and POE will provide all
the features you would otherwise have had to hackishly reinvent yourself
when the need arose.

So give POE a try on your next project. Anything that would ordinarily
use an event loop would be a good place to start using POE. Have fun!

L<Source Listing....>

=head1 Related Links

=over

=item http://poe.perl.org/

The POE home page. All good things stem from here.

=back

=cut