http://www.spinellis.gr/pubs/jrnl/2002-JCS-Panoptis/html/paper.html
This is an HTML rendering of a working paper draft that
led to a publication.
The publication should always be cited in preference to this
draft using the following reference:
- Diomidis Spinellis and
Dimitris Gritzalis.
Panoptis:
Intrusion detection using a domain-specific language.
Journal of Computer Security, 10:159–176, 2002.
Citation(s): 3 (selected).This document is also available in
PDF format.The document's metadata is available in BibTeX format.
Find
the publication on Google Scholar
This material is presented to ensure timely dissemination of
scholarly and technical work. Copyright and all rights therein are
retained by authors or by other copyright holders. All persons
copying this information are expected to adhere to the terms and
constraints invoked by each author's copyright. In most cases, these
works may not be reposted without the explicit permission of the
copyright holder.
Diomidis Spinellis Publications
|
Panoptis: Intrusion Detection using a Domain-specific
Language
Diomidis Spinellis
Athens University of Economics and Business
Department of Management Science and Technology
Athens, Greece
dds@aueb.gr
Dimitris Gritzalis
Athens University of Economics and Business,
Department of Informatics
Athens, Greece
dgrit@aueb.gr
Abstract
We describe the use of a domain-specific language (DSL)
for expressing critical
design values and constraints in an intrusion detection application.
Through the use of this specialised language, information
that is critical to the correct operation of the software can be
expressed in a form that can be easily drafted, verified, and
maintained by domain experts (security officers), thus minimising the
risk inherent from the mediation of software engineers.
Our application,
Panoptis, is a DSL-based low-cost, easy-to-use intrusion detection system
using the process accounting records kept by most Unix systems.
A set of database tables contain resource usage profiles for processes,
terminals, users, and time intervals.
Panoptis monitors new process data against the recorded profiles and
reports on entities diverging from the established resource usage envelopes
implying possible data security threats.
We demonstrate the operation of Panoptis by a case study of
a real attack and subsequent system compromise that occured on a system
under our administrative control.
Keywords
Domain-specific languages,
security monitoring, intrusion detection, Unix process accounting.
1 Introduction
Panoptis1
is an anomaly detection system based on the process-accounting records
produced by all widely-used versions of Unix.
These records, originally intended for producing billing information,
can be used to detect anomalous situations and alert the security
administrators.
The voluminous nature of the process accounting records prohibits manual
inspection; Panoptis keeps detailed database tables keyed by users,
terminals, processes, and time intervals containing typical usage profiles.
A novel aspect of Panoptis is the use of a domain-specific language (DSL)
for the specification of the items that will be checked.
Many intrusion detection systems rely on a specification language
for the detection, corelation, or reporting of incidents
(see section 7).
The unique aspect of Panoptis is the narrow domain it covers;
in the intrusion detection paradigm that we advocate, a number of
small focused systems like Panoptis,
each with its own domain-specific language,
are combined to form an intrusion detection confederation.
Panoptis detects and reports all entities that execute outside the
defined profile envelopes and automatically updates the database tables
to reduce the administrative burden and reporting volume.
On a system that has an established pattern of use entities outside the
normal usage, envelopes are
likely to be associated with information security breaches.
Data threats that can be detected in this way include wiretapping, browsing,
leakage, tampering, and masquerading [7].
An example of Panoptis's output can be seen in
Figure 5.
The heuristic and quantitative nature of our approach extends the
range of data security threats that can be detected beyond the closed
computer system environment into the organisational environment that hosts
Panoptis.
As an example, Panoptis could detect an employee transferring inordinately
large amounts of data to a computer outside the organisation even if
that employee had proper system authorisations to perform such
transfers.
Although Panoptis was implemented under the Unix operating system, the
approach and techniques we used are applicable to other operating
systems keeping process accounting records.
As an example, the Windows NT audit event log can be used in
a similar way.
1.1 Domain-specific Languages
A domain-specific language [25]
is a programming language
tailored specifically for an application domain: rather than
being general purpose, it captures precisely the domain's semantics.
Examples of DSLs include
lex and yacc [13],
used for program lexical analysis and parsing,
HTML [6], used for document mark-up, and
VHDL, used for electronic hardware descriptions.
Domain-specific languages allow the concise description of an application's
logic reducing the semantic distance between the problem and the program
[5,29].
DSLs are, by definition, special purpose languages.
Any system architecture encompassing one or more DSLs is
typically structured as a confederation of modules; some are implemented
in one of the DSLs and the rest are implemented using a general purpose
programming language.
As a design choice for implementing security software,
DSLs present two distinct advantages over
a ``hard-coded'' program logic.
Concrete Expression of Security Policies
Security policies are not coded into the system or stored in
an arcane file format; they are captured in a concrete human-readable form.
Policies expressed in the DSL can be
scrutinised, split, combined, shared, published,
put under release control, printed, commented,
and even automatically generated by other applications.
Direct Involvement of the Security Officer
The DSL expression style
can often be designed so as to match the format typically used by
the security officer.
This results in keeping the experts in a very tight software lifecycle
loop where they can directly specify, implement, verify, and validate,
without the need of coding intermediaries.
Even if the DSL is not high-level enough to be used as a specification
language by the security officer, it may still be possible to involve the
security officer in code walkthroughts far more productive than those involving
code expressed in a general purpose language.
1.2 Unix Process Accounting Records
Most modern versions of Unix provide the capability of process
accounting [18].
The operating system kernel creates a file containing an accounting
record for every process that terminates.
Each record contains for a given process the following vector:
- Ui, Gi: its user and group identification,
- CT: its controlling terminal,
- Tb: the time the process began,
- Tr, Ts, Tu: the real, system and user times used by the process,
- Mt: its total memory usage,
- Ct, Dt: its total character and disk input/output,
- N: the name of the command that started the process, and
- E, F: its exit status and associated flags.
Based on the above data the following quantities can be derived for every
terminated process:
- Tl: the local time of the day the process started found by converting the
time the process began to local time,
- Tt: the total CPU time consumed by the process as the sum of the
system and user times (Ts + Tu),
- Ma: average memory usage as the memory accounted divided by
the CPU time (Mt/Tt),
- Ca, Da: average character
and disk input/output as the respective quantity
divided by the CPU time (Ct/Tt, Dt/Tt),
- H: CPU ``hog'' factor as the process's CPU time
divided by the actual time it executed (Tt/Tr), and
- the number of times the process ran in a specific time interval.
A number of programs are typically provided for processing the
accounting records, but these are geared towards providing billing
and system performance tuning information.
In the following sections, we will describe how a domain-specific
language can be used to specify the way parts of the process-accounting
data space can be grouped and checked for intrusion detection
purposes.
2 Anomaly Detection Data Space
Panoptis monitors the system processes in three independent dimensions:
- The accounting data
This data corresponds to a specific process, terminal, and user
and consists of the values described in the previous section.
It can be monitored for being above or below specific limits which are based
on the system's historical data collected by Panoptis.
- The monitored entity
A monitored entity can be one of the following:
- U: a user,
- T: a terminal,
- P: a process,
- (U, P): a process executed by a specific user, and
- (U, T): a user working on a specific terminal.
An abnormal behaviour which could signify a security breach can be
associated with any of the above entities.
For example,
- a user may run programs at an unusual time (K0 £ Tl(U) £ K1),
- a process may consume an inordinate amount of CPU time (Tt(P) ³ K),
- a terminal may be exhibit abnormal input/output behaviour
(e.g. Ct(T) ³ K),
- a user may execute an uncommon command, or
- a user may work from an unusual terminal.
- The monitoring time interval
Time intervals are defined by the system administrator.
Typical intervals that provide useful data are:
- A fixed period
-
We found (see Section 6 for details)
that storing data for twenty minute intervals, a day, and a week
captures enough information about the system behaviour to cover a
large number of possible security breach attempts.
The twenty minute interval is useful for quickly detecting
a large number of invocations of an important program such as the password
changing command, while the day and week database tables can be run with a
larger set of checks to detect finer changes in the system's behaviour
indicating attempted security breaches.
- A specific period
-
Panoptis can store separate data for every day and hour (e.g. Mon, Tue, ... and
1200h, 1300h, ...) to capture behaviour that is occurring in non-standard
days or times.
An example of a security breach that can be detected using this method is
the execution of an application used by personnel working nine to five,
late at night, or over the weekend.
We found it more convenient to group the specific period
time interval tables into groups of larger granularity such as
workdays/weekend.
- Continuous monitoring
-
Finally, Panoptis can be run in a mode whereby the accounting log is continuously
monitored and all records that are appended to it are checked against the
specified tables.
This execution mode provides immediate notification of possible security
problems.
A system administrator can run Panoptis in this mode with its output
redirected to a hardcopy terminal to create a log that can not be
erased even when the security of the system is compromised.
The three dimensions described above can be tailored via a configuration
file to a setup that is suitable for the system being monitored.
In addition, terminal and user names can be grouped in logical
sets to avoid the generation of redundant messages.
As an example, all users of the same application or toolset can be
defined as one group, because we expect them to have similar usage profiles.
One profile will be defined and used for all of them, but any leap
outside the profile will be directly attributable to a specific user.
Similarly, a pool of terminals that are interchangeably used in a room
should be grouped together, because they too will have statistically
similar usage profiles.
3 The Panoptis Domain-specific Language
Panoptis consists of a single program that reads accounting records
and updates profile tables, optionally reporting cases that fall outside
the existing profiles.
Its arguments are a DSL-based configuration file that directs the program
operation, the database to update, the interval to operate upon,
and an optional list of process accounting files (the system accounting file
/var/adm/pacct is the default record source).
#
# Configuration file for host pooh
#
# $ Id: poo.dsl 1.6 2000/05/30 12:26:58 dds Exp $
#
HZ = 100 # "Floating point" value divisor
bigend = FALSE # Set to TRUE for big endian (e.g. Sun), FALSE for
# little endian (e.g. VAX, Intel x86)
map = TRUE # Set to TRUE to map uid/tty numbers to names
EPSILON = 150 # New maxima difference threshold (%)
report = TRUE # Set to TRUE to report new/updated entries
unlink = FALSE # Set to TRUE to start fresh
# Reporting procedure
output = '| /usr/bin/tee /dev/console | /bin/mail root'
# Databases and parameters to check
dbcheck(tty, minbmin, maxbmin, maxio, maxcount) # Terminals
dbcheck(comm, ALL) # Commands
dbcheck(uid, ALL) # Users
dbcheck(uidtty, maxcount) # Users on a terminal
dbcheck(uidcomm, minbmin, maxbmin, maxutime, # Users of a command
maxstime, maxmem, maxrw, maxcount, maxasu)
# Map users and terminals into groups
usermap(caduser, john, marry, jill)
usermap(admin, root, bin, uucp, mail, news)
termmap(room202, tty31, tty32, tty33, tty34, tty35)
termmap(ptys, ttyp01, ttyp02, ttyp03, ttyp04, ttyp05, ttyp06)
Figure 1:
Sample configuration file.
Panoptis is configured by a domain-specific language.
The language supports bindings over the following distinct database tables:
- tty
-
Terminals.
- uid
-
Users.
- uidtty
-
Users logged in on a specific terminal.
- comm
-
Commands.
- uidcomm
-
Users executing a specific command.
The basename used for storing each one of the above tables is specified
as a parameter in the Panoptis invocation.
As a result, different tables can be used to store process accounting
history for different hosts, time intervals, or monitoring configurations.
For every process accounting record the following attributes can
be checked:
- maxaxsig
-
Signal exit status.
- maxhog
-
Maximum CPU hog factor (CPU time over elapsed time).
- maxmem
-
Maximum memory usage.
- maxavrw
-
Maximum average disk block input/output.
- maxstime
-
Maximum system time.
- minbmin
-
Minimum daily start time (start time whithin the 24 hour interval).
- maxutime
-
Maximum user time.
- maxbmin
-
Maximum daily start time.
- maxasu
-
Superuser status.
- maxcount
-
Maximum number of times a given record has appeared in the database.
- maxrw
-
Maximum disk block input/output.
- maxacore
-
Core dump flag.
- maxavio
-
Maximum average character input/output.
- maxafork
-
Fork status.
- maxetime
-
Maximum clock time.
- maxavmem
-
Maximum average memory usage.
- maxio
-
Maximum character input/output.
Panoptis will report process accounting records whose
attributes fall above (or below) the values already recorded in a given
database.
The Panoptis monitoring options are also set in the DSL configuration file.
The file contains the following elements:
- Assignments
-
Specific variables can be assigned values to control the
Panoptis behaviour.
- Monitoring specifications
-
These are given using the relation
dbcheck(database, attribute ...)
and specify that the given attributes should be monitored in
a given database.
The special attribute ALL can be used to specify that
all attributes shall be monitored.
- User maps
-
These are given using the relation
usermap(abstract user, username ...)
and specify that all concrete users specified will be mapped to
the given abstract user.
This relation can be used to group users into specific monitoring
groups (e.g. power users, administrators, typists).
- Terminal maps
-
These are given using the relation
termmap(abstract terminal, terminal name ...)
and specify that all concrete terminals specified will be mapped to
the given abstract terminal.
This relation can be used to group terminals into specific monitoring
groups (e.g. network terminals, printers, data entry, etc.).
In addition,
the following variables can be specified in a configuration file:
- report
-
Boolean variable.
Set to TRUE to report new/updated entries.
- countreport
-
Boolean variable.
Set to TRUE to report time the command was started.
- unlink
-
Boolean variable.
Set to TRUE to clear existing database entries.
- map
-
Boolean variable.
Set to TRUE to map uid/tty numbers to names based on the mapping
of the system where Panoptis is run.
- HZ
-
Numeric variable.
The divisor used by the system to store ``floating point'' values.
- EPSILON
-
Numeric variable.
Maximum difference threshold expressed as a percentage difference
of a new value against the previous one.
When this threshold is exceeded, Panoptis will report the specific
command.
- acct
-
String variable.
Set to specify the system source of the accounting records.
The following values are currently supported:
- ['SVR3'] SunOS 4.X and XENIX,
- ['Linux'] e.g. Linux 2.2,
- ['SVR4'] POSIX, XOPEN, e.g. SunOS 5.6,
- ['fBSD'] Free BSD e.g. Free BSD 3.4.
- bigend
-
Boolean variable.
Set to TRUE for big endian (e.g. Sun), FALSE for little endian (e.g. VAX, x86)
accounting records.
- output
-
String variable.
Set to specify how Panoptis results will be output.
The Perl syntax used for opening files can be used.
A sample configuration file is reproduced in Figure 1.
Two variables (HZ and bigend) define the machine's hardware
characteristics.
These - in conjunction with the option map which specifies
whether the local system user and terminal names should be used for reporting
- made it possible for us to run Panoptis on our system, cross-checking the accounting
files of other systems.
A possible setup based on this capability could be a centralised security
server monitoring a large number of remote systems.
The report and unlink settings are used for creating initial
profiles.
Setting unlink will create a fresh set of profile data.
In that case report could be disabled while historical data
is collected and stored in the database.
The output parameter specifies the filename or process to receive
Panoptis's output.
In this example, all reports are printed on the system console and a
copy is mailed to the system administrator account.
The next section of the configuration file specifies for each
of the tables outlined in section 2 the parameters
- as described in section 1.2 - to be checked.
These specifications are used to customise the profile tables for
storing only relevant profile data.
In the example we provide we monitor terminals (tty) used
outside the normal hours to detect physical or network security breaches,
and the number of characters transfered to detect attempts to
transfer data outside the system.
Commands (comm) and users (uid) have all their parameters
monitored as these should
quickly settle to an established pattern minimising false alarms.
A subsequent divergence of any of the parameters is likely to be interesting.
The database containing the users of a specific terminal is only
monitored for the number of commands run from that terminal in order
to catch intruders.
Finally, the database containing data for every command a user
executes (uidcomm) is monitored for the time that process is run, its
use of CPU time, memory, and disk I/O, the number
of times it was executed, and whether it was executed with
superuser privileges.
Divergence of these parameters can pinpoint
Trojan horses, viruses, encryption crackers,
operation of distributed denial of service attack tools,
and data browsers [8,9].
The last section of the configuration file contains the
grouping tuples used to specify logical sets of terminals and users.
In our example, the users of the CAD application form one group
(caduser) and the administrative accounts form another (admin).
All other system users are stored and checked as individuals.
After a process accounting entry is decoded, terminal and user names
that belong to a given group are replaced by the name of that group.
As a result, records in tables that have a user name as their key
(uid, uidtty, uidcomm) will reflect the behaviour of
the whole group instead of a specific user.
Similarly, this method allows terminals that are shared in one room to
be checked as a single group.
Pseudo-terminals (ptys) - often
used for network connections - are also grouped together as they are
assigned to incoming connections in a random way.
#
# Panoptis crontab file for host pooh
#
# The format of this file is:
# Hour Minute Day-of-month Month Day-of-week Command
* 5,25,45 * * * panoptis pooh-quick.cfg pooh.20min 20m
8-18 05 * * * panoptis pooh-hour.cfg pooh.workhour 1h
19-7 05 * * * panoptis pooh-hour.cfg pooh.late 1h
4 50 * * 1-5 panoptis pooh-day.cfg pooh.workday 24h
4 50 * * 6,0 panoptis pooh-day.cfg pooh.weekend 24h
2 20 * * 0 panoptis pooh-full.cfg pooh.weekly 7d \
/usr/adm/pacct? /usr/adm/pacct
Figure 2:
Sample scheduling file.
4 Operation
Panoptis is typically installed as a program to be executed by the system's
command scheduler crontab.
Additionally, Panoptis can be run at system startup as a background task to
continuously monitor the accounting files.
A sample scheduling file for Panoptis that we used on our system is
reproduced in Figure 2.
In this example, a few quick checks are run every twenty minutes (on the
fifth, 25th, and 45th minute of the hour) against
the profiles stored in the pooh.20min database.
Every hour a more complete check is run.
Its profiles are split into two tables; one stores the working
hour (8am to 6pm) profiles (pooh.workhour) and one the
night-hour (7pm to 7am) profiles (pooh.late).
Daily checks are run every night at 4:50 a.m.
Again, the profile tables are split between workdays and weekends.
Finally, the complete set of accounting files is checked using a full
configuration every Sunday at 2:20 a.m.
Every time Panoptis is run, the sequence outlined in Figure 3
is executed.
As one can see, Panoptis gradully ``learns''
the profiles of various commands, users, and terminals and can therefore
spot irregularities that may indicate an intrusion.
read and parse the the domain-specific language configuration
while there are records in the specified accounting file
read and decode an accounting record
synthesise the derived quantities
substitute the name of entities belonging to a group with the group name
for every database table specified
look for an database entry matching the key of the record retrieved
if a matching entry is found
compare it with the entry read
if the accounting record value exceeds the amount stored in the database
produce a new maximum value alert and update the database
else (if no matching entry is found)
produce a new value alert and update the database
Figure 3:
Panoptis operation
An important aspect of the configuration file concerns the parameters
that are set to be checked, and the corresponding mappings.
Both specifications are heuristic in nature; a set of right parameters
will rapidly identify irregular patterns signifying an intrusion
without letting legitimate commands mask pottential security breaches.
User mappings can group together users with similar behaviour or tasks.
Examples of potential user groups include ``sales,'' ``developers,''
``administrators,'' and ``system programmers.''
Similar groups can also be defined for terminals, based on the premise
that different physical locations are used for different purposes:
the factory floor terminals run a different set of programs than those
in the floor occupied by administrative personnel.
The parameters checked for different entities are
also important.
In the following paragraphs we outline some pertinent factors for choosing
the parameters to specify on a given monitored entity.
- maxhog, maxmem, maxavrw, maxstime, maxutime
-
The maximum CPU hog factor,
the maximum memory usage,
the maximum average disk block input/output, and the
maximum user and system times are relevant to all monitored entities.
Since, however, the same command can be used differently by different
users, it is more appropriate to monitor these factors on a user by command
(uidcomm) basis.
- maxaxsig
-
The signal exit status should be monitored for all commands.
Front-end commands (eg client-server applications) will not typically respond
to signals.
Signal termination of such commands may indicate exploitations of race
conditions or buffer overflows.
- minbmin,maxbmin
-
The minimum and maxinum daily start times
are mostly important for users and terminals which typically have
quite distinct patterns of use within the day.
Some commands are also scheduled to run at specific times; monitoring
these factors against commands can catch abnormal uses.
- maxasu
-
Given the security model of Unix systems,
the superuser status of a command should be closely monitored for all
entity combinations.
Most cases where a new command, user, or terminal acquire superuser status
should be carefuly investigated.
- maxcount
-
The maximum number of times accounting records appear for a given entity
can pinpoint some denial of service-type attacks,
typically without generating extraneous noise.
It should therefore be monitored for all entities.
- maxacore
-
The core dump flag should be monitored for all commands and terminals.
Failed attempts to exploit a buffer overflow often result in core dumps.
- maxavio, maxio
-
The maximum (average) character input/output should be monitored for terminals
since it can be used to detect attempts to transfer data in or out of the
system.
- maxafork
-
Monitoring the fork status on commands can detect trojan horses since they
often behave differently in this aspect than the original command.
- maxetime
-
Finally, the maximum clock time should be monitored for commands executed by
specific users.
These tend to have distinct profiles of usage; variations should trigger an
alarm.
Figure 4:
Number of anomalies over the weekly period leading
to the intrusion.
5 Incident Detection
On Saturday, January 13th, 2001, a system under our administrative
control was compromised.
The system was directly connected to the Internet and running
the FreeBSD 3.0-RELEASE version of Unix.
It is used as an Internet gateway providing email, web, FTP, and DNS
services.
It hosts only three user accounts (mostly used for system administration),
and is accessed using ssh.
One month before the incident, we performed a port scan on it using the
Nessus tool.
The single vulnerability that Nessus detected
(a domain-name service daemon vulnerability) was fixed by installing an
updated version of named.
Figure 4 depicts the number of anomalies detected by
Panoptis over the weekly period leading to the intrusion.
In order to demonstrate how a Panoptis system is trained, we deleted
all table contents prior to January 9th.
Each new Panoptis invocation starts at 02:00 in the morning after
the daily cron jobs are run.
The number of anomalies shown for the 9th mostly
represent the working-hour workload of the host.
The rise on the 10th represents the additional maintenance jobs executed
overnight by cron
(web and email accounting, updates, search engine indexing,
statistics, backup, security checks).
After the database tables have been primed the number of detected anomalies
fall to 27 (on the 11th) and 18 (on the 12th).
In normal operation, the number of anomalies detected by Panoptis
fall further under stable system operation (as can be seen on January 14th).
In our case, on January 13th a cracker attacked our system from an ISP
dial-in connection.
The rise of anomalies detected in a 24 hour period
(358 against of the previous ``training'' maximum of 169) should alert any
competent system administrator.
The 349 anomalies found by Panoptis on the 15th represent
system administration actions performed to investigate the cracker's
activity and mitigate the compromise.
Figure 5
contains some representative entries slightly edited
to remove extraneous information and substitute meaningful names
for numeric identifiers (Panoptis can perform this substitution
automatically, but only when running directly on the target system;
for techinical reasons this post-mortem analysis could not be performed
on the compromised system.)
The commands are typical of those run by an intruder.
The intruder(s) probably used a buffer overflow in the mail system POP
server to gain system access.
After gaining access, they checked for other users on the system (w),
the files in various directories (ls) and
the binary contents on some files (objdump),
and proceeded to compile and install a modified version of a setuid
program (gcc, chown, chmod).
Database Users*Commands, key [root/popper]: New entry.
Command: popper Terminal: ttyp1 User: root
Executed from: 2001-01-13 01:27:53 to: 2001-01-13 01:28:09 (16.2 seconds)
spending 0.12 seconds in kernel space and 0.04 seconds in user space
(0.16 total) and using the CPU for 1% of the time.
Character I/O: 4 characters (average I/O: 24.00 characters / CPU second)
Disk I/O: 0 K (average I/O: 0.00 K / CPU second)
Memory accounted: 2.58 K (average size: 8.06 K)
Database Users*Commands, key [dds/popper]: New entry.
Command: popper Terminal: ttyp1 User: dds
Executed from: 2001-01-13 01:28:20 to: 2001-01-13 01:28:34 (14.47 seconds)
Database Commands, key [w]: New entry.
Command: w Terminal: ttyp0 User: dds
Executed from: 2001-01-13 01:29:29 to: 2001-01-13 01:29:29 (0.32 seconds)
Database Users*Commands, key [dds/w]: New entry.
Command: w Terminal: ttyp0 User: dds
Executed from: 2001-01-13 01:29:29 to: 2001-01-13 01:29:29 (0.32 seconds)
Database Users*Commands, key [dds/ls]: New entry.
Command: ls Terminal: ttyp0 User: dds
Executed from: 2001-01-13 01:29:32 to: 2001-01-13 01:29:32 (0.36 seconds)
Database Users*Commands, key [root/awk]:
New maximum average character input/output (204.8 / 64; 220%)
Command: awk Terminal: ttyp1 User: root
Executed from: 2001-01-13 01:32:13 to: 2001-01-13 01:32:13 (0.41 seconds)
Database Users*Commands, key [dds/uname]: New entry.
Command: uname Terminal: ttyp0 User: dds
Executed from: 2001-01-13 01:33:14 to: 2001-01-13 01:33:14 (0.3 seconds)
Database Commands, key [objdump]: New entry.
Command: objdump Terminal: ttyp0 User: dds
Executed from: 2001-01-13 01:39:27 to: 2001-01-13 01:39:28 (1.26 seconds)
spending 0.09 seconds in kernel space and 0.02 seconds in user space
Database Commands, key [fitso]: New entry.
Command: fitso Terminal: ttyp0 User: dds
Executed from: 2001-01-13 01:46:14 to: 2001-01-13 01:46:14 (0.06 seconds)
Database Users*Commands, key [root/chown]: New entry.
Command: chown Terminal: ttyp1 User: root
Executed from: 2001-01-13 02:00:04 to: 2001-01-13 02:00:04 (0.06 seconds)
Database Commands, key [gcc]: New maximum daily start time (419 / 145; 189%)
Command: gcc Terminal: ttyp1 User: root
Figure 5:
The Panoptis report from the compromised system.
6 Evaluation
A monitoring system can fail in two different ways:
- Type I error
- Failing to report an important event (false
negative, silence).
- Type II error
- Reporting a large number of unimportant events letting
important ones passing unnoticed (false positive, noise).
In addition, a security monitoring system can fail either because
an intruder uses an attack mode not anticipated or covered by its
design (a system limitation),
or because the intruder intentionally tries to get around it
(a system weakness).
Panoptis's heuristic nature will result in both silence and noise.
Noise is gradually eliminated as more and more cases are added to
the profile data.
Silence can result either from security breaches that are outside
the system's domain, or from an intruder's deliberate exploitation
of the system's weaknesses.
As the system is based on process accounting records, a number
of other important information that could lead to the detection
of security problems is not examined.
Examples of other entities that could be monitored and included in
the profile data include system calls made by a process, network
connections, and patterns of file access.
Monitoring these entities would require operating system kernel
modifications [4];
we decided against them in order to keep the system portable and
easy to install.
An intruder knowing Panoptis's architecture and configuration
could also foil the system by:
- generating legitimate ``noise'' in order to hide a culprit process,
- an attack based on a non-terminating process (such as system daemons)
which are not normally logged,
- using an interpreter such as Perl [31] to access system
resources without invoking external processes,
- changing the name of the offending command to a benign name,
- gradually and legitimately changing the usage profile of an entity
avoiding the suspicion caused by a sudden change,
- filling the disk where administrative data is kept in order to
disable process accounting, or
- exploiting Panoptis's relatively large time window between the
occurrence of a suspicious event and its detection.
On the plus side, Panoptis's open ended nature can result in the detection
of security problems unanticipated during its design and deployment.
Some of the attacks described can be defended by careful installation
and configuration.
Countermeasures include keeping the accounting records in a filesystem
that has no publically writable directories (by default process accounting
records and the temporary file directory residing on the same filesystem), and
the protection of the configuration file and the reports from unauthorised
reading to make the planning of an undetected attack difficult.
We have run Panoptis on the accounting records of our site,
an academic site X-terminal server, a dialup/WWW server and
a C/database development machine.
After some time of tuning and profile collection, Panoptis's reports
are reduced to a steady trickle reflecting the users' change
of interests or mode of work and the introduction of new programs
on the system.
Although Panoptis has up to now caught only a single security violation,
the results we have so far obtained are encouraging.
In some cases, Panoptis has helped us identify sources of system
performance degradation or potential security problems.
Furthermore, in a Gedankenexperiment2
we performed based on five security breaches described in
[3], we found that four of them could have been caught by
Panoptis.
An important aspect of an intrusion detection system is its impact
on the systems it monitors.
Panoptis monitors the system at a rather coarse level utilising
existing process accounting data that is by default measured by Unix
kernels.
On a 366MHz Pentium II machine, Panoptis, running the specification
that was used to detect the incident described in section 5,
will process 320 process accounting records per second (3ms / record).
The time taken to analyse a record is of the same order of magnitude as
that needed by the kernel to fork and execute a process.
Thus continous monitoring is feasible on moderately loaded systems, or on
systems that execute mostly long-running processes.
A pathological case involves the execution of
pure shell-scripts relying on thousands of short process executions (eval,
expr, test, etc).
In such an environment, Panoptis may account for up to 50% of the
system's load.
In typical cases, however, Panoptis adds negligible overhead to the system's
operation.
7 Related Work
In an early study on real-time intrusion detection [2],
it was suggested that an intruder could be detected by monitoring certain
system-wide parameters (i.e. CPU use, memory use, disk activity,
keystroke dynamics, etc.), and compare them with what had been historically
established as normal or expected for that facility.
It was, also, suggested to profile the normal behavior of programs,
files, and other objects.
This is often called a statistical anomaly detection approach.
Until this study, the relevant work focused on developing procedures
and algorithms for automating the offline security analysis of audit trails.
On the basis of the above, SRI scientists developed IDES
(Intrusion Detection Expert System) [22] and
Next-generation IDES [1].
IDES is a system that continuously monitors user behavior and
detects suspicious behavior as it occurs.
IDES takes the approach that intrusions can be detected by
flagging departures from historically established norms of behavior for
individual users.
To support the idea, various intrusion detection measures are profiled
for each user and statistical processing of them is carried out by the
monitoring facility.
Intruders often use known paths to attack a system
(e.g. programmed password attacks,
access to privileged files,
exploitation of known vulnerabilities, etc.).
With a model-based reasoning, specific models of defending prescribed
attacks can be developed [11].
Other approaches are either defining acceptable, as opposed to intrusive,
behavior [15], or - on earlier stages of technology - are based
on the introduction of trap doors for intruders (i.e. ``bogus'' user accounts
with ``magic'' passwords, etc.) [19].
None of them is sufficient alone, since it addresses a specific type of threats.
Several studies have demonstrated that the use of specialized (security-focused)
audit trails is needed for security purposes.
In addition to the raw audit data itself, additional data could prove to be
useful or necessary for intrusion detection:
external facts (e.g. changes in user job description),
supporting facts (e.g. file attributes), and
profiles of expected behavior (e.g. time schedules).
It seems to be a fact, that effective intrusion detection will not come into
widespread use until
good security auditing mechanisms are in place [21].
The appropriate level of auditing is really important.
It has been suggested [17,24] that the audit should be performed
at the lowest possible level (e.g. monitoring system service calls),
because in this case, to circumvent auditing is harder.
The more recent studies on intrusion detection focus more on the topology
of the modern information systems environment.
As a result, network intrusion detection systems have been developed
[27,30].
The cornerstone of these systems is also a domain-specific language
that enables
concise specification of network packet contents under normal/expected
and/or attack conditions.
These approaches claim to have easy-to-develop intrusion specifications,
to carry out high-speed and large-volume monitoring,
to be robust and extensible,
and to use a comprehensive evaluation framework.
Over the last years, the increasing number of security attacks
and the corresponding detection and reporting frameworks and tools
have resulted in the design and adoption of a number of languages for
specifying systems and communicating information.
Eckmann et al. [10] distinguish six classes of relevant languages:
event, response, reporting, correlation, exploit, and detection languages.
The DSL used by Panoptis can be classified as a detection
language.
Other detection languages include
ASAX [12],
the language used in the Bro system [23],
P-BEST [20],
N-code [26],
STATL [10], and
specification languages [16,28].
Our approach is not general-purpose as some of the above languages, but
targets a specific area (intrusion detection using process accounting records)
with a narrow but focused, we hope, language that can efficiently describe
how accounting records shall be processed to recognise anomalies.
8 Conclusions and Further Work
The use of a domain-specific language can make
process accounting data ammenable to intrusion detection.
Panoptis first expands the accounting data space by deriving new
quantities from
the existing records and scattering the results into the three dimensions
of value, monitored entity, and time interval.
It then analyses the data by comparing it against the profiles of the
past it has stored on a database and reports any significant changes.
The numerous parameters that affect Panoptis's performance can be easily
tuned to match the characteristics of the system being supervised
forming heuristic rules.
This approach is flexible and provides useful results while
limiting extraneous noise.
After using Panoptis for some time we found out that the data evaluated
can be expanded in a number of ways by increasing the number of
derived properties.
Some examples include the addition of running averages,
and the mapping of pseudo-terminals to IP addresses.
In addition, report triggering can be made more selective by introducing
thresholds, counters, and combined conditions.
This new complexity will require enhancing the system's DSL.
Some additions we are investigating are
the treatment of the database tables as first class variables,
the integration of the time dimension,
the provision of predicate logic rules, and
support for installable components within
the common intrusion detection framework [14].
Ultimately, we would like to investigate how a number of
specialised intrusion detection modules like Panoptis,
each based on a narrow domain-specific language,
can be combined into a confederated intrusion detection system.
Acknowledgements
The authors would like to thank the anonymous reviewers for their insightful
comments on earlier versions of this paper.
References
- [1]
-
D. Anderson et al.
Next-generation intrusion detection expert system (NIDES): A
summary.
Technical Report SRI-CSL-95-07, SRI Int'l., 1995.
- [2]
-
J. Anderson.
Computer security threat monitoring and surveillance.
Technical report, J. Anderson Co., Pennsylvania, Apr. 1980.
- [3]
-
F. Baran, H. Kaye, and M. Suarez.
Security breaches: Five recent incidents at Columbia university.
In UNIX Security Workshop II, pages 151-171, Portland, OR,
USA, Aug. 1990. Usenix Association.
- [4]
-
D. S. Bauser and M. E. Koblentz.
NIDX - a real-time intrusion detection expert system.
In USENIX Conference Proceedings, pages 261-273, San
Francisco, CA, USA, Summer 1988. Usenix Association.
- [5]
-
J. Bell, F. Bellegarde, J. Hook, R. B. Kieburtz, A. Kotov, J. Lewis,
L. McKinney, D. P. Oliva, T. Sheard, L. Tong, L. Walton, and T. Zhou.
Software design for reliability and reuse: a proof-of-concept
demonstration.
In Conference on TRI-Ada '94, pages 396-404. ACM, ACM Press,
1994.
- [6]
-
T. Berners-Lee and D. Connolly.
RFC 1866: Hypertext Markup Language - 2.0, Nov. 1995.
Status: PROPOSED STANDARD.
- [7]
-
D. E. R. Denning.
Cryptography and Data Security.
Addison-Wesley, 1983.
- [8]
-
P. J. Denning.
Computers Under Attack: Intruders, Worms, and Viruses.
Addison-Wesley, 1990.
- [9]
-
T. Duff.
Experience with viruses on UNIX systems.
Computing Systems, 2(2):155-171, Spring 1989.
- [10]
-
S. Eckmann, G. Vigna, and R. Kemmerer.
STATL: An attack language for state-based intrusion detection.
In Proceedings of the ACM Workshop on Intrusion Detection,
Athens, Greece, Nov. 2000. ACM.
- [11]
-
T. Garvey and T. Lunt.
Model-based intrusion detection.
In 14th National Computer Security Conference, 1991.
- [12]
-
N. Habra, B. L. Charlier, A. Mounji, and I. Mathieu.
ASAX: Software architecture and rule-based language for universal
audit trail analysis.
In ESORICS 92, Toulouse, France, Nov. 1992.
- [13]
-
S. C. Johnson and M. E. Lesk.
Language development tools.
Bell System Technical Journal, 56(6):2155-2176, July-August
1987.
- [14]
-
C. Kahn, P. Porras, S. Staniford-Chen, and B. Tung.
A common intrusion detection framework.
Available online http://www.gidos.org, July 1998.
- [15]
-
P. Karger.
Limiting the damage potential of discretionary Trojan horses.
In IEEE Symposium on Security and Privacy, pages 32-37. IEEE
Press, 1987.
- [16]
-
C. Ko, M. Ruschitzka, and K. Levitt.
Execution monitoring of security-critical programs in distributed
systems: A specification-based approach.
In 1997 IEEE Symposium on Security and Privacy, pages 175-187.
IEEE, 1997.
- [17]
-
J. Kuhn.
Research towards intrusion detection through the automated
abstraction of audit data.
In 9th National Computer Security Conference, 1986.
- [18]
-
S. J. Leffler, M. K. McKusick, M. J. Karels, and J. S. Quarterman.
The Design and Implementation of the 4.3BSD Unix Operating
System.
Addison-Wesley, 1988.
- [19]
-
R. Linde.
Operating system penetration.
In National Computer Conference, 1975.
- [20]
-
U. Lindqvist and P. A. Porras.
Detecting computer and network misuse with the production-based
expert system toolset (P-BEST).
In IEEE Symposium on Security and Privacy, Oakland, CA, May
1999. IEEE.
- [21]
-
T. Lunt.
A survey of intrusion detection techniques.
Computers and Security, 12(4):405-418, June 1993.
- [22]
-
T. Lunt et al.
A real-time intrusion-detection expert system.
Technical Report SRI-CSL-92-05, SRI Int'l., 1992.
- [23]
-
V. Paxson.
A system for detecting network intruders in real-time.
In 7th USENIX Security Symposium, San Antonio, TX, Jan. 1998.
Usenix Association.
- [24]
-
J. Picciotto.
The design of an effective auditing subsystem.
In IEEE Symposium on Research in Security and Privacy, pages
13-22. IEEE Press, 1987.
- [25]
-
J. C. Ramming, editor.
USENIX Conference on Domain-Specific Languages, Santa Monica,
CA, USA, Oct. 1997. Usenix Association.
- [26]
-
M. J. Ranum, K. Landfield, M. Stolarchuck, M. Sienkiewicz, A. Lambeth, and
E. Wall.
Implementing a generalized tool for network monitoring.
In 11th Systems Administration Conference (LISA '97). Usenix
Association, Oct. 1997.
- [27]
-
R. Sekar et al.
A high-performance network intrusion detection system.
In 6th ACM Conference on Computer and Communication Security,
pages 8-17. ACM Press, 1999.
- [28]
-
R. Sekar and P. Uppuluri.
Synthesizing fast intrusion detection/prevention systems from
high-level specifications.
In 8th USENIX Security Symposium. Usenix Association, 1999.
- [29]
-
D. Spinellis and V. Guruprasad.
Lightweight languages as software engineering tools.
In Ramming [25], pages 67-76.
- [30]
-
G. Vigna and R. Kemmerer.
NetSTAT: A network-based intrusion detection approach.
In Computer Security Applications Conference, 1998.
- [31]
-
L. Wall, T. Christiansen, R. L. Schwartz, and S. Potter.
Programming Perl.
O'Reilly and Associates, Sebastopol, CA, USA, second edition, 1996.
Footnotes:
1
Argos-Panoptis - the one who can see everything - is a Greek mythology
canine creature whose body is covered with eyes.
Even when Panoptis is sleeping half of its eyes remain open.
For this it was given the task of guarding Io, one of Zeus' lovers.
2
Gedankenexperiment is a technical term used by physicists
for an experiment which is only described but not made in reality, as it is
not possible or was not possible when someone thought of it for the first time.
By imagining the given experiment, one will hopefully reach interesting
conclusions (e.g. the Einstein-Podolsky-Rosen Gedankenexperiment).
The term was popularized by Einstein, who relied heavily on Gedankenexperiments
both in his derivation of relativity and in his arguments with Bohr about
quantum mechanics.