pwebstats creates a collection of HTML files and images in a group of directories under the output directory specified below. At present, pwebstats produces statistics over daily, weekly or monthly periods. Note: the input logfile(s) must be split into separate daily, weekly or monthly files. A utility has been included to assist you in doing so.

  1. Edit the configuration file (./conf/pwebstats.conf) to reflect the location of the pwebstats distribution directory on your system, the location of the output directory, and your site-specific details.
  2. Some config details can also be given on the command line. Type ./pwebstats for a full list of options.
  3. Type the following command in the distribution directory to run pwebstats:
    ./pwebstats -c conf/pwebstats.conf
    The output will go in the directory specified in the config file or on the command line.
  4. If you want page-specific stats generated, have a look at ./conf/pwebstats.pages, the configuration file for the page-based part of pwebstats. Each entry is a series of colon-separated fields in the format:
    Perl pattern for HTML collection (or just path to HTML file):
    Description of file or collection:Level of Indentation (for subsections):
    URL to the page itself (to create an active link)

    Some examples are available.

    I’ve made a copy of our local page-config file, and a copy of the pwebstats output for that page, available so you can get an idea of what can be done with page-based stats.

    See the Perl documentation for details on Perl regular expressions.
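As a rough illustration of the entry format described in step 4, the sketch below splits one entry into its four fields. It is not part of pwebstats: the `PageEntry` type, the `parse_page_line` helper and the sample entry are hypothetical, and it assumes that an entry sits on a single line and that only the final URL field contains colons of its own.

```python
from typing import NamedTuple


class PageEntry(NamedTuple):
    pattern: str      # Perl pattern, or plain path to an HTML file
    description: str  # description of the file or collection
    indent: int       # level of indentation (for subsections)
    url: str          # URL to the page itself


def parse_page_line(line: str) -> PageEntry:
    # Split on the first three colons only, so the colon in "http://"
    # stays inside the URL field (assumption: earlier fields have none).
    pattern, description, indent, url = line.rstrip("\n").split(":", 3)
    return PageEntry(pattern, description, int(indent), url)


# Hypothetical entry, made up for illustration only.
entry = parse_page_line(
    r"^/docs/.*\.html$:Documentation pages:1:http://www.example.org/docs/"
)
print(entry)
```

If a pattern needs a literal colon, this simple split would break; check the examples shipped with the distribution for how such cases are written.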

The Configuration File

The configuration file (./conf/pwebstats.conf) controls the setup information needed for pwebstats to run, and the user-settable limits and variables.

Lines starting with a # are comments and are ignored, as are blank lines.

All other lines are of the form variable:setting (the colon is necessary).

Use full pathnames where pathnames are to be specified (no trailing ‘/’).
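The rules above are simple enough to sketch in a few lines. The following is only an illustration of the file format, not the parser pwebstats itself uses; the `parse_config` helper is hypothetical, and treating lines with leading whitespace before the # as comments is an assumption.

```python
def parse_config(text: str) -> dict[str, str]:
    """Parse variable:setting lines, skipping comments and blank lines."""
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and lines starting with '#' are ignored
        # (assumption: leading whitespace before '#' is tolerated).
        if not line or line.startswith("#"):
            continue
        if ":" not in line:
            raise ValueError(f"line {lineno}: missing ':' in {raw!r}")
        # Split on the first colon only; the setting may contain more.
        variable, setting = line.split(":", 1)
        settings[variable.strip()] = setting.strip()
    return settings


sample = r"""# site-specific settings
local_patt:\.unimelb\.edu\.au$|^128\.250|\.mu\.oz\.au$

complete_exclude_user:tom|dick|harry
"""

cfg = parse_config(sample)
print(cfg)
```

Splitting on the first colon only keeps any later colons inside the setting, which matters for values such as URLs.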

Config Variables

Unique nickname for server – use only a-z, A-Z and ‘_’.
Header for index page.
Location of log file (full pathname).
Type of logfile.
Acceptable values are: common (Common Log Format), squid, squid-emulated, ncsa-extended, and netscape-proxy. Defaults to common.
Directory location for the output of pwebstats (full pathname).
Directory containing GIF templates (full pathname).
Stats collection interval – can be daily, weekly, monthly, quarterly.
Verbose output – progress bar and other details when pwebstats is running (any value = on).
Location of ‘fly’ program (full pathname).
Location of page-based stats config file (full pathname).
Threshold for inclusion in all hosts list (default = 25).
Threshold for inclusion in all requests list (default = 25).
Threshold for inclusion in all domains list (default = 5).
Threshold for inclusion in all protocols list (default = 25).
Regular expression for local domain.
e.g. local_patt:\.unimelb\.edu\.au$|^128\.250|\.mu\.oz\.au$
Regular expression of items to exclude from display in request stats (these are still counted in totals).
Completely ignore access from this set of hostnames (| is the delimiter).
Completely ignore access to this pattern of URLs.
e.g. complete_exclude_url_patt:^/foo/bar/*$|^/robots.txt$
Completely ignore access from this set of users (| is the delimiter).
e.g. complete_exclude_user:tom|dick|harry
Convert IP numbers in the hostname field to fully-qualified domain names (any value = on).
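To see what the example patterns above select, they can be tried against a few sample strings. The sketch below uses Python's `re` module as a stand-in for Perl, whose syntax is the same for these simple patterns. The hostnames and URLs are made up, and the assumption that pwebstats applies each pattern as an unanchored search is mine, not something the documentation states.

```python
import re

# Patterns copied from the examples above.
local_patt = re.compile(r"\.unimelb\.edu\.au$|^128\.250|\.mu\.oz\.au$")
exclude_url_patt = re.compile(r"^/foo/bar/*$|^/robots.txt$")


def is_local(host: str) -> bool:
    # Assumption: the pattern is searched for anywhere in the hostname.
    return local_patt.search(host) is not None


def is_excluded_url(url: str) -> bool:
    return exclude_url_patt.search(url) is not None


hosts = ["www.unimelb.edu.au", "128.250.1.2", "cs.mu.oz.au",
         "www.example.com", "10.128.250.1"]
local_results = {h: is_local(h) for h in hosts}

urls = ["/foo/bar", "/foo/bar/", "/foo/bar/index.html", "/robots.txt"]
url_results = {u: is_excluded_url(u) for u in urls}

print(local_results)
print(url_results)
```

Read as a regular expression, `^/foo/bar/*$` matches /foo/bar followed by any number of slashes, so it does not cover pages beneath that directory. A pattern ending in `/.*$` would be needed for that.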

An example config file is available.

Additionally, in a configuration file for a proxy server, the following directives are applicable:

Threshold for inclusion in all remote hosts list (default = 25)
Exclude requests/accesses array – saves time and a lot of memory! (any value = on)

Auxiliary programs

The following extra programs and scripts are included in the pwebstats distribution, in the utilities directory.
This will split an existing log file into weekly or monthly files for input to pwebstats. Type ./ followed by the utility name for usage information.
Handy utility for rolling over logfiles, restarting the server and general cleaning-up.
This will split a Netscape Proxy extended log file into CERN-style proxy and cache logs.
Simple shell script to feed all your old weekly/monthly logs into pwebstats. If you just have one big logfile, run it through the log-splitting utility first.
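For readers who want to see what splitting a log by period involves, here is a minimal sketch that groups Common Log Format lines by month. It is not the bundled utility and makes no claim to match its behaviour. The `split_by_month` helper and the sample lines are hypothetical, and it assumes each line carries a timestamp of the form [dd/Mon/yyyy:hh:mm:ss zone].

```python
import re
from collections import defaultdict

# Matches the date part of a Common Log Format timestamp,
# e.g. [10/Oct/2000:13:55:36 -0700]
STAMP = re.compile(r"\[(\d{2})/([A-Za-z]{3})/(\d{4}):")


def split_by_month(lines):
    """Group log lines into buckets keyed by 'yyyy-Mon'."""
    buckets = defaultdict(list)
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue  # skip lines without a recognisable timestamp
        _day, month, year = match.groups()
        buckets[f"{year}-{month}"].append(line)
    return dict(buckets)


sample = [
    '127.0.0.1 - - [30/Sep/2000:23:59:59 -0700] "GET /a.html HTTP/1.0" 200 100\n',
    '127.0.0.1 - - [01/Oct/2000:00:00:01 -0700] "GET /b.html HTTP/1.0" 200 200\n',
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /c.html HTTP/1.0" 200 300\n',
]

monthly = split_by_month(sample)
for period, entries in monthly.items():
    print(period, len(entries))
```

Each bucket could then be written to its own file and passed to pwebstats as a separate monthly logfile.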
