Operation

How it works

Essentially, there are two steps in the process:

  1. the specified log files are parsed and the parsed output is added to a cache;
  2. the cache is processed to generate a report.

The second step is invoked by the first, unless --noreport is specified. The second step can be invoked on its own by simply not specifying any log files.

Typically, Tool Use is run every time you rotate your log files, and the cache is changed at the beginning (and maybe the end) of every semester. The report might be written to the myFiles area of a course that is accessible to the appropriate people.

Command syntax

toouse.pl
  [--version] [--help] [--quiet] [--verbose]
  [--from date] [--to date]
  [--include regexp] [--exclude regexp]
  [--includecourse regexp] [--excludecourse regexp]
  [--includecategory regexp] [--excludecategory regexp]
  [--includeterm regexp] [--excludeterm regexp]
  [--categoryfile file] [--termfile file]
  [--reportdir directory] [--reportcourse course] [--noreport] [--zip [path to zip executable]]
  [--cache file] [--removecache]
  [--merge] [--logversion version] [--timetolerance seconds]
  [--wimba] [--horizonlive] [--plugins]
  [--userpopupmax number]
  logfile1 logfile2 ...

  --version         prints version information
  --help            what you are reading now
  --quiet           run quietly
  --verbose         run verbosely
  --from            start date (yyyy-mm-dd)
  --to              end date (yyyy-mm-dd)
  --include         only include users whose WebCT IDs match regexp
  --exclude         exclude users whose WebCT IDs match regexp
  --includecourse   only include courses whose IDs match regexp
  --excludecourse   exclude courses whose IDs match regexp
  --includecategory only include courses whose categories match regexp
  --excludecategory exclude courses whose categories match regexp
  --includeterm     only include courses whose terms match regexp
  --excludeterm     exclude courses whose terms match regexp
  --categoryfile    file of extra course category classifications
  --termfile        file of extra course term classifications
  --reportdir       directory for report
  --reportcourse    course for report, must also specify --reportdir
  --noreport        do not create a report
  --zip             create a zip archive containing the report
  --cache           load/save from/to cache file
  --removecache     remove cache
  --merge           merge log files
  --logversion      process a version 3 log on a version 4 server
  --timetolerance   exit if log time goes backwards by more than this many seconds
  --wimba           report on Wimba use
  --horizonlive     report on HorizonLive use
  --plugins         report on both Wimba and HorizonLive
  --userpopupmax    threshold for user pop-ups on users by category-term page
  logfile1 ...      list of log files to parse, "active" for current log, "-" for STDIN

Notes:

Period analysed
The report will cover the period of the cache unless --from and/or --to are specified. The period analysed is inclusive of any specified start and end dates.
Include/Exclude
Only specify one of these (if --include is present, --exclude is ignored). The regular expression is bound by ^ and $ (so it must match the entire ID) and is interpreted by Perl. Examples:
  regexp            Matches WebCT IDs
  \d+               that are all digits
  s\d{7}            that start with "s" and are followed by seven digits
  [a-z]+            that only contain lowercase letters
  (\d+|[a-z]+)      that are all digits or only contain lowercase letters

Include/Exclude course/category/term work similarly.
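
The anchoring and the include-over-exclude rule can be illustrated with a small sketch. It is written in Python rather than Perl and is not the tool's own code; the function names are hypothetical, and it assumes the pattern is simply placed between ^ and $ as described above. For the simple patterns in the examples, Python and Perl regular expressions behave the same way.

```python
import re


def matches_whole_id(pattern, webct_id):
    """Return True if pattern, placed between ^ and $, matches the ID.

    Assumption: plain concatenation, so an alternation needs its own
    parentheses, as in the (\\d+|[a-z]+) example.
    """
    return re.search("^" + pattern + "$", webct_id) is not None


def filter_ids(ids, include=None, exclude=None):
    """Apply an include or exclude pattern; include wins if both are given."""
    if include is not None:
        return [i for i in ids if matches_whole_id(include, i)]
    if exclude is not None:
        return [i for i in ids if not matches_whole_id(exclude, i)]
    return list(ids)


if __name__ == "__main__":
    ids = ["12345", "s1234567", "jsmith", "jsmith1"]
    print(filter_ids(ids, include=r"s\d{7}"))
    print(filter_ids(ids, exclude=r"(\d+|[a-z]+)"))
```

Because the whole ID must match, s\d{7} rejects an ID with eight digits after the "s", and [a-z]+ rejects any ID containing an uppercase letter or a digit.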

Categories and Terms
A course's category and term are obtained by reading from WebCT's category and term files. If the course is not present in these files, its category/term is set to Main/Default. Thus courses that are no longer on the server will appear in Main/Default. This can be altered by specifying additional category and term files with --categoryfile and --termfile. The category file should contain lines of the form xxx:::category:::courseid1,courseid2, ..., courseidN where xxx is any number of alphanumeric characters. Any number of course IDs can be specified and the last one can be followed by ::: and then anything else. The term file has the same format, but with term instead of category. Categories, terms and courses can be repeated in a file; the last classification of a course will be the one used. WebCT's category and term files are processed last and so any courses in those files will appear in their current category/term.
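
As a rough illustration of the file format described above, the following Python sketch reads classification lines and applies the "last classification wins" rule. It is not the tool's own parser: the function name, the sample course IDs and categories, the skipping of malformed lines, and the trimming of spaces around course IDs are all assumptions made for the example.

```python
def parse_classification_lines(lines):
    """Map each course ID to its category (or term).

    Each line is expected to look like
        xxx:::category:::courseid1,courseid2,...,courseidN
    optionally followed by ::: and anything else. Later lines override
    earlier ones for the same course.
    """
    classification = {}
    for line in lines:
        fields = line.rstrip("\n").split(":::")
        if len(fields) < 3:
            continue  # assumption: ignore lines without the three fields
        label, course_list = fields[1], fields[2]
        for course_id in course_list.split(","):
            course_id = course_id.strip()  # assumption: tolerate spaces
            if course_id:
                classification[course_id] = label
    return classification


if __name__ == "__main__":
    sample = [
        "a1:::Science:::BIO101,CHEM101",
        "a2:::Arts:::HIST200:::anything else",
        "a3:::Medicine:::BIO101",
    ]
    categories = parse_classification_lines(sample)
    print(categories["BIO101"])             # last classification is used
    print(categories.get("OLD999", "Main")) # unlisted course falls back
```

The same function would serve for a term file, with "Default" as the fallback instead of "Main".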
Report location
By default, the report is written to directory <NKID>/tmp/toouse. Alternatively, any directory may be specified with --reportdir. If --reportcourse is also specified, the directory given with --reportdir is interpreted as relative to the specified WebCT course's myFiles area. If no --reportdir is specified, the default directory is emptied before the report is written. Specified report directories are not emptied.
zip
Creates a zip archive containing the report. The archive is called toouse.zip and is placed in the report directory. By default, the archive is created using the zip utility that comes with WebCT. Alternatively, you can specify the path to a zip utility.
Cache
By default, the cache file is <NKID>/var/toouse_cache.txt. An alternative file can be specified with --cache. This allows the cache to be kept where there might be more disk space, or where it can be omitted from backups. The cache file will probably be of the order of 5-10% of the size of the log files that have been processed to make it up. If --cache is specified, a temporary file that is normally written to <NKID>/tmp will be written to the same directory as the cache file. The cache can be removed using --removecache. This command is interactive; you will be asked to confirm the removal of the cache.
Time tolerance
To help prevent accidental re-processing of a processed log, Tool Use exits if log time goes backwards by more than 2.5 hours. This tolerance can be adjusted with --timetolerance. A tolerance is necessary because Apache writes request completion time to its logs; thus the times in a log file are not always chronological. Specifying a time tolerance of 0 will disable the chronological checking.
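
The check described above can be sketched as follows. This is a Python illustration under stated assumptions, not the tool's implementation: the function name is hypothetical, timestamps are taken to be seconds since some epoch, and each entry is compared with the latest time seen so far (the text does not say whether the tool compares against the latest or the previous entry).

```python
DEFAULT_TOLERANCE_SECONDS = 2.5 * 60 * 60  # 2.5 hours, the default stated above


def first_backwards_jump(timestamps, tolerance=DEFAULT_TOLERANCE_SECONDS):
    """Return the index of the first entry that is more than `tolerance`
    seconds earlier than the latest time seen so far, or None if there is
    no such entry. A tolerance of 0 disables the check.
    """
    if tolerance == 0:
        return None
    latest = None
    for index, ts in enumerate(timestamps):
        if latest is not None and latest - ts > tolerance:
            return index
        if latest is None or ts > latest:
            latest = ts
    return None


if __name__ == "__main__":
    # Small out-of-order steps (requests finishing late) are tolerated.
    print(first_backwards_jump([0, 100, 50, 200]))
    # A jump of more than 2.5 hours backwards is flagged at that entry.
    print(first_backwards_jump([0, 20000, 10000]))
```

A tool built this way would stop at the flagged entry, which is what protects against feeding an already processed log in a second time.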
Log files
These must be uncompressed. They should also be specified in chronological order. Specifying "active" will process the current log. If "-" is specified, the log will be read from STDIN (so you can, for example, process compressed logs with something like gunzip -c access_log.27.gz | ./toouse.pl -). If the log files are omitted, Tool Use will simply process the current cache. Thus, for example, you can generate reports for different time periods and categories from the one cache by altering --from, --to, --includecategory and --reportdir.
Overlapping logs
Normally, a specified log file will be newer than the ones already cached. To prevent accidentally processing an already processed log, Tool Use will stop if this is not the case, unless the difference is less than --timetolerance. However, if you have omitted an earlier log file, you may want to merge it into the current cache; specifying --merge does this. Only one log can be processed at a time when merging, and the operation is slow.

Copyright (c) 2003-2005 NetKnowledgy. All rights reserved.