The Wumpus Information Retrieval System – File system search
Author: Stefan Buettcher (stefan@buettcher.org)
Last change: 2005-05-14
Wumpus can be used as a file system search engine for Linux.
Before you can use Wumpus as a file system search engine, you first need to do
two things:
- Build a Linux kernel with file system change notification enabled. At the
moment, only fschange is supported by Wumpus. Support for inotify is under
development.
- Start a web server (e.g. Apache) that is configured to support PHP.
After you have installed the new Linux kernel with file change notification support
and restarted your system, make sure that the notification service is running. For
fschange, you can do this by executing "cat /proc/fschange" as root. For inotify,
use the inotify-utils package provided by the inotify developer.
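
The fschange check can be sketched as a small shell helper. This is only an
illustration: the function name fschange_available is made up for this example,
and it merely tests whether the /proc entry exists rather than reading it (which,
as noted above, may require root).

```shell
# Return success if the fschange notification interface appears to be present.
# The path defaults to /proc/fschange (the file named in the text above); the
# optional argument exists only so the helper can be pointed at another path.
fschange_available() {
    proc_file="${1:-/proc/fschange}"
    [ -e "$proc_file" ]
}

if fschange_available; then
    echo "fschange interface found"
else
    echo "fschange interface not found; is the patched kernel running?"
fi
```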
Before you can actually use the system, you need to do two more things:
- Edit the wumpus.cfg configuration file in the Wumpus main directory. Change
the TCP_PORT configuration variable to the port that you want Wumpus' TCP server
to listen on. Change the INDEXABLE_FILESYSTEMS variable so that it reflects your
local file system. Wumpus will only index files that are below one of the mount
points given here.
- Edit the config.php file in wumpus/php so that it is consistent with the
TCP port specified in wumpus.cfg (this means changing the value of "$port"). Then copy all
files found in wumpus/php into a directory that can be accessed through the web server,
e.g. /var/www/html/wumpus or ~/public_html/wumpus.
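
Since the two files must agree on the port, a quick consistency check may help.
The sketch below rests on assumptions that are not confirmed by this document:
that wumpus.cfg contains a line of the form "TCP_PORT = <number>" and that
config.php contains "$port = <number>;". The helper names cfg_port, php_port and
ports_match are made up for this example. Adjust the patterns if your files look
different.

```shell
# Print the port found in a wumpus.cfg-style file.
# ASSUMPTION: the setting is written as "TCP_PORT = <number>".
cfg_port() {
    sed -n 's/^[[:space:]]*TCP_PORT[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Print the port found in a config.php-style file.
# ASSUMPTION: the setting is written as "$port = <number>;" (optionally quoted).
php_port() {
    sed -n 's/^[[:space:]]*\$port[[:space:]]*=[[:space:]]*"\{0,1\}\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only if both files yield the same, non-empty port.
ports_match() {
    a=$(cfg_port "$1")
    b=$(php_port "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage, run from the Wumpus main directory:
ports_match wumpus.cfg php/config.php || echo "port mismatch"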
After you have started Wumpus in file system search mode by executing
bin/fssearch, you should be able to access the index through the PHP scripts.
Wumpus will automatically start an exhaustive file system scan. The time between two
such scans can be specified using the TIME_BETWEEN_FS_SCANS configuration variable.
Wumpus automatically reacts to file system mounts and unmounts by creating or
releasing indices under the respective mount points.
Per-file-system indices will be created under each mount point defined in the
configuration file. The index for the "/" file system, for example, will be found
in "/.indexdir/".
Please note that you might have to run umount twice in order to unmount
a given file system. The first attempt fails because Wumpus still has files
open on that file system, so Linux cannot unmount it. However, the unmount
request is noticed, so when you run umount a second time, it should
be successful.
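
The two-step unmount can be wrapped in a small helper. This is a sketch, not part
of Wumpus: the function name umount_twice and the variables UMOUNT_CMD and
UMOUNT_RETRY_DELAY are made up for this example, and the short pause between the
two attempts is an assumption (the text above does not say how long the open
files take to be released).

```shell
# Try to unmount a file system; if the first attempt fails, wait briefly and
# try exactly once more. Returns the status of the last attempt.
# UMOUNT_CMD (hypothetical) lets you substitute another command, e.g. for a
# dry run; UMOUNT_RETRY_DELAY (hypothetical) is the pause in seconds.
umount_twice() {
    mount_point="$1"
    cmd="${UMOUNT_CMD:-umount}"
    if "$cmd" "$mount_point"; then
        return 0
    fi
    sleep "${UMOUNT_RETRY_DELAY:-2}"
    "$cmd" "$mount_point"
}
```

Usage (as root; /mnt/usb is a placeholder mount point):
umount_twice /mnt/usb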