Harvester Processor Module Overview
Harvester Processor Architecture
This document briefly describes the core components of the Harvester Processor Architecture and its corresponding Data Sources. The Harvester Processor Architecture is broken into four distinct functional areas: the Plug-in Message Dispatcher, Receiver Group Plug-ins, Parser Group Plug-ins and Processor Group Plug-ins. The Plug-in Message Dispatcher handles communication between all of the Plug-in Groups. Receiver Group Plug-ins act as a single point of contact for all incoming data from the Data Sources. Parser Group Plug-ins translate data received from the Receiver Group Plug-ins into Unified Log Format (ULF) messages. Finally, Processor Group Plug-ins take specific actions on ULF messages.
Plug-in Message Dispatcher
The Plug-in Message Dispatcher is the information backbone of the Harvester Processor Architecture (HPA). It handles the communication and routing of messages between the various plug-ins that make up the HPA. In general, the Plug-in Message Dispatcher (PMD) handles two types of messages: raw log information that has been sent from a Data Source to the Receiver Group Plug-ins and ULF messages that have been generated by Parser Group Plug-ins or Processor Group Plug-ins. The PMD also provides a single point of entry for routing messages between the various plug-in groups that make up the Harvester Processor Architecture.
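The routing role described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each message is a dictionary carrying a "route" tag and a "payload"; the class name, method names and route names are illustrative only and are not taken from the Harvester code.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical message shape: a dict with a "route" tag naming the
# destination plug-in and a "payload" (raw log text or a ULF record).
Message = Dict[str, object]
Handler = Callable[[Message], None]


class Dispatcher:
    """Toy stand-in for the PMD: routes each message by its "route" tag."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def register(self, route: str, handler: Handler) -> None:
        """Subscribe a plug-in callback to a route name."""
        self._handlers[route].append(handler)

    def dispatch(self, message: Message) -> int:
        """Deliver the message to every handler on its route; return the count."""
        handlers = self._handlers.get(str(message["route"]), [])
        for handler in handlers:
            handler(message)
        return len(handlers)


# A parser plug-in is represented here by a list that collects its input.
received: List[Message] = []
pmd = Dispatcher()
pmd.register("rfc_syslog_parser", received.append)
delivered = pmd.dispatch(
    {"route": "rfc_syslog_parser", "payload": "<34>Oct 11 22:14:15 host su: test"}
)
```

Keeping all routing in one object means a plug-in only needs to know route names, never the other plug-ins themselves.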
Receiver Group Plug-ins
The Receiver Group Plug-ins listen for network messages that are generated by Agents running on Data Sources. In a general sense, Receiver plug-ins are simple socket listeners that wait for incoming data. Once a message from a Data Source Agent is received, it is passed in raw form to the Plug-in Message Dispatcher for delivery to the appropriate Parser Group Plug-in, which converts it to ULF.
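As a rough illustration of the "simple socket listener" idea, the sketch below waits for one UDP datagram and hands it to a callback. The function name `receive_one` and the callback signature are assumptions made for this example; the demonstration binds to the loopback interface on an OS-assigned port rather than the standard Syslog port.

```python
import socket
from typing import Callable, List, Tuple


def receive_one(
    sock: socket.socket,
    deliver: Callable[[str, bytes], None],
    bufsize: int = 65535,
) -> None:
    """Block until one datagram arrives, then hand it to `deliver`."""
    data, (host, _port) = sock.recvfrom(bufsize)
    deliver(host, data)


# Demonstration: listener and sender both live on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.settimeout(2.0)          # avoid blocking forever if nothing arrives

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"<13>Jan  1 00:00:00 host test: hello", listener.getsockname())

inbox: List[Tuple[str, bytes]] = []
receive_one(listener, lambda host, data: inbox.append((host, data)))

sender.close()
listener.close()
```

In a real receiver the callback would be the hand-off to the Plug-in Message Dispatcher, and the call would sit inside a loop.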
Syslog Receiver
The Syslog Receiver is a multi-purpose plug-in. It identifies either application-specific messages or generic RFC3164-formatted messages and passes them to the appropriate Parser Group Plug-in via the Plug-in Message Dispatcher. Application-specific messages are an extension that lets multiple data formats share the standard Syslog network port, which greatly simplifies host configuration. Additional Data Source Agents are supported by defining a new recognition header for the Syslog Receiver, a standardized message format for communication with the parser, and a new Parser Group Plug-in.
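One way to picture recognition headers is as a lookup table from a message prefix to a parser name. The header strings and parser names below are hypothetical, since this document does not give the actual values; the sketch only shows the selection logic.

```python
from typing import Dict, Tuple

# Hypothetical recognition headers; the real header strings are not given
# in this document. Each maps to the name of a Parser Group Plug-in.
RECOGNITION_HEADERS: Dict[str, str] = {
    "NTSYSLOG|": "nt_syslog_parser",
    "SNORT|": "snort_ids_parser",
}
DEFAULT_PARSER = "rfc_syslog_parser"


def classify(raw: str) -> Tuple[str, str]:
    """Return (parser name, raw message) so the dispatcher can route it."""
    for header, parser in RECOGNITION_HEADERS.items():
        if raw.startswith(header):
            return parser, raw
    # No application-specific header: treat as generic RFC3164 syslog.
    return DEFAULT_PARSER, raw
```

Under this assumption, supporting a new Data Source Agent amounts to one new table entry plus the matching parser.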
SNMP Trap Receiver
The SNMP Trap Receiver plug-in listens for messages that are formatted in compliance with RFC1157. Raw RFC1157 compliant messages are then passed to the Plug-in Message Dispatcher for delivery to the appropriate Parser Group Plug-in.
OPSEC Receiver
The OPSEC Receiver is an active polling agent that queries remote OPSEC-enabled devices for OPSEC-formatted messages. Once a raw OPSEC-formatted message is retrieved, it is passed to the Plug-in Message Dispatcher for delivery to the appropriate Parser Group Plug-in.
Parser Group Plug-ins
Parser Group Plug-ins retrieve data from the Plug-in Message Dispatcher. As raw messages are passed to the Plug-in Message Dispatcher from Receiver Group Plug-ins, they are tagged with routing information specifying to which Parser Group Plug-in they are destined. Parser Group Plug-ins receive these raw messages, parse them into ULF, and pass them back to the Plug-in Message Dispatcher.
For forensic purposes, it is necessary to store incoming messages in a consistent format that preserves as much of the original raw, arbitrarily formatted message as possible. The Archival Storage Plug-in copies messages from the Plug-in Message Dispatcher and writes them out to the Archival Storage system. In most cases the Archival Storage Plug-in does not remove the message from the Message Dispatcher; it only copies and archives it. The information held in Archival Storage is as close as possible to the original message as it was generated.
RFC Syslog Parser
The RFC Syslog Parser processes messages that have been formatted in an RFC3164 compliant manner. Since the Syslog message format was not standardized until recently, parser logic has been developed to handle the many variations of Syslog message formats. These messages are formatted into ULF and passed back to the Plug-in Message Dispatcher.
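The sketch below parses only the canonical RFC3164 layout (`<PRI>`, timestamp, hostname, message) and returns `None` for anything else, whereas the plug-in described above also copes with non-conforming variants. The output field names are hypothetical stand-ins for ULF fields, which this document does not define.

```python
import re
from typing import Dict, Optional

# <PRI>Mmm dd hh:mm:ss HOSTNAME MESSAGE, with a space-padded day of month.
RFC3164 = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<body>.*)$",
    re.DOTALL,
)


def parse_rfc3164(raw: str) -> Optional[Dict[str, object]]:
    """Split a canonical RFC3164 line into fields, or return None."""
    match = RFC3164.match(raw)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        "facility": pri // 8,   # PRI = facility * 8 + severity
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "message": match.group("body"),
    }


record = parse_rfc3164(
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
)
```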
NT Syslog Parser
The NT Syslog Parser Plug-in processes messages that have been formatted in the specific NT Syslog message format. This message format is defined in the NT Syslog Data Source Agent that is distributed by farm9.com, Inc. These messages are formatted into ULF and passed back to the Plug-in Message Dispatcher.
Snort IDS Parser
The Snort IDS Parser Plug-in processes messages that have been formatted in the specific Snort IDS message format. This message format is defined in the output plug-in that is distributed by farm9.com, Inc. These messages are formatted into ULF and passed back to the Plug-in Message Dispatcher.
SNMP Trap Parser
The SNMP Trap Parser Plug-in processes messages that have been formatted in an RFC1157 compliant manner. These messages are formatted into ULF and passed back to the Plug-in Message Dispatcher.
OPSEC Parser
The OPSEC Parser Plug-in processes messages that have been formatted in an OPSEC compliant manner. These messages are formatted into ULF and passed back to the Plug-in Message Dispatcher.
Processor Group Plug-ins
Processor Plug-ins take action on ULF messages or groups of ULF messages. Possible actions are: selection, insertion, deletion, or modification of ULF messages that have been passed to the Plug-in Message Dispatcher, the generation of metadata, and the storage of ULF messages to a database entity.
The Rule-based Ranking Plug-in ranks the importance of a ULF message by applying signature-based pattern matching to a message. Users can define rules that match specific ULF messages and define a score for that message. When a match occurs, the ULF message is tagged with the user defined score and passed back to the Plug-in Message Dispatcher.
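A minimal sketch of signature-based scoring follows. It assumes a rule is a regular expression applied to one named field, and that the first matching rule wins; both are assumptions made for illustration, as are the `Rule` class and the field names.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    field: str      # hypothetical ULF field the pattern is applied to
    pattern: str    # regular expression acting as the signature
    score: int      # user-defined importance


def rank(message: Dict[str, object], rules: List[Rule]) -> Dict[str, object]:
    """Return a copy of the message tagged with the first matching rule's score."""
    tagged = dict(message)
    for rule in rules:
        if re.search(rule.pattern, str(message.get(rule.field, ""))):
            tagged["score"] = rule.score
            break
    return tagged


rules = [
    Rule("message", r"Failed password", 8),
    Rule("message", r"session opened", 2),
]
ranked = rank({"host": "web1", "message": "Failed password for root"}, rules)
unranked = rank({"host": "web1", "message": "cron job finished"}, rules)
```

Messages that match no rule are returned without a score, leaving the choice of a default to whatever consumes them.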
To facilitate message retrieval, message review, analysis and correlation, it is necessary to interact with several data stores. The Log Storage Plug-in handles the communication pathway between the Plug-in Message Dispatcher and any number of databases. Currently this plug-in supports only MySQL as a backend database. In the future, support for other types of databases will be added as necessary.
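The storage step can be sketched as follows. Note the substitution: the plug-in itself targets MySQL, but this example uses Python's built-in `sqlite3` module with an in-memory database so that it runs without a server. The table layout and column names are hypothetical.

```python
import sqlite3
from typing import Any, Dict, Iterable

# Hypothetical table layout; the real schema is not described here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ulf_log (
    id INTEGER PRIMARY KEY,
    host TEXT,
    severity INTEGER,
    message TEXT
)
"""


def store(conn: sqlite3.Connection, messages: Iterable[Dict[str, Any]]) -> int:
    """Insert each message as one row and return the number of rows written."""
    conn.execute(SCHEMA)
    rows = [(m.get("host"), m.get("severity"), m.get("message")) for m in messages]
    conn.executemany(
        "INSERT INTO ulf_log (host, severity, message) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # SQLite stand-in for the MySQL backend
stored = store(
    conn,
    [
        {"host": "web1", "severity": 2, "message": "Failed password for root"},
        {"host": "fw1", "severity": 5, "message": "connection accepted"},
    ],
)
row_count = conn.execute("SELECT COUNT(*) FROM ulf_log").fetchone()[0]
```

Parameterized inserts, as used here, keep log text from being interpreted as SQL regardless of which backend is behind the plug-in.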
Unified Log Format messages contain an extensive amount of information. Specific pieces of information from a ULF message are useful for analysis, correlation and extrapolation. Event correlation requires use of data-warehouse techniques and relies heavily upon data summaries and other processed metadata. The Metadata Extraction Plug-ins are responsible for generating specific event related metadata and other data summaries. It is also possible to have the Metadata Extraction Plug-in generate new messages that can be re-processed by the system.
Network Flow Metadata
The Network Flow Metadata Plug-in parses ULF messages for source and destination IP addresses and port numbers. This metadata is then stored, permitting external analysis of network flows for anomaly detection. Baselines of network flow can be created to uncover Distributed Denial of Service Attacks, Port Scans, Worm Infections, or other network flow related events. The external modules can generate log entries to be fed back into the overall correlation system for storage and display.
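The extraction described above might look like the sketch below, which counts flows and then derives one simple port-scan indicator: the number of distinct destination ports a source has touched on a given host. The field names (`src_ip`, `src_port`, `dst_ip`, `dst_port`) are assumed for illustration and are not confirmed ULF field names.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Iterable, Set, Tuple

Flow = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


def flow_counts(messages: Iterable[Dict[str, Any]]) -> Counter:
    """Count how often each flow 4-tuple appears; skip non-network messages."""
    counts: Counter = Counter()
    for msg in messages:
        try:
            flow: Flow = (
                str(msg["src_ip"]),
                int(msg["src_port"]),
                str(msg["dst_ip"]),
                int(msg["dst_port"]),
            )
        except KeyError:
            continue  # message carries no flow information
        counts[flow] += 1
    return counts


def distinct_dst_ports(counts: Counter) -> Dict[Tuple[str, str], int]:
    """For each (source, destination) pair, count distinct destination ports."""
    ports: Dict[Tuple[str, str], Set[int]] = defaultdict(set)
    for src_ip, _src_port, dst_ip, dst_port in counts:
        ports[(src_ip, dst_ip)].add(dst_port)
    return {pair: len(seen) for pair, seen in ports.items()}
```

A source that reaches an unusually high count against one host, relative to a stored baseline, is a candidate port scan for the external analysis modules to examine.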
Host Event Metadata
The Host Event Metadata Plug-in monitors ULF messages for host-related events such as login failures, audit failures and other host-related accounting information. Metadata is generated representing the type and importance of ULF messages that have been received. This plug-in can be rule-based for dynamic addition of new event types. External analysis of this metadata can generate new log messages to be fed back into the correlation system.
The Statistics Collection Plug-in extracts information from ULF messages and generates numerical facts that are collected for study. Examples of statistics include event counts by type, message rates, and other useful information about overall system operation. The consolidated numerical facts serve as a useful tool for precise and accurate generation of numerical baselines. These baselines allow users to quickly understand the trends and tendencies of the network.
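The two example statistics named above, event counts by type and message rate, can be computed as in this sketch. The `event_type` and `time` fields (seconds as a number) are hypothetical, and the rate is taken simply as the number of timestamped messages divided by the time span they cover.

```python
from collections import Counter
from typing import Any, Dict, List


def collect_stats(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return per-type event counts and an overall messages-per-second rate."""
    counts = Counter(str(m.get("event_type", "unknown")) for m in messages)
    times = [float(m["time"]) for m in messages if "time" in m]
    span = max(times) - min(times) if len(times) > 1 else 0.0
    # With fewer than two distinct timestamps there is no span to divide by.
    rate = len(times) / span if span > 0 else 0.0
    return {"counts": dict(counts), "messages_per_second": rate}


stats = collect_stats(
    [
        {"event_type": "login_failure", "time": 0},
        {"event_type": "login_failure", "time": 1},
        {"event_type": "port_scan", "time": 1},
        {"event_type": "login_failure", "time": 2},
    ]
)
```

Collected over regular intervals, values like these form the numerical baselines against which later activity is compared.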
Data sources are defined as any entity that contains information that may or may not be useful to collect. To collect this information, collection Agents have been developed to facilitate communication between Data Source entities and the Harvester System. Two general classes of Agents exist in the Harvester Architecture: Integrated Agents and Custom Agents. Integrated Agents are agents that are commonly integrated into host operating systems, are widely deployed, and have a standardized method of communication. Custom Agents are simply custom applications that have been developed for the retrieval and transport of information pertaining to network entities.
Syslog implementations that conform to RFC3164, as well as the various ad-hoc Syslog implementations, are considered Standard Syslog Integrated Agents. Standard Syslog Agents package and transmit messages to the Syslog Receiver Plug-in. The Syslog Receiver Plug-in then passes each message to the Plug-in Message Dispatcher.
RFC1157 defines the Simple Network Management Protocol (SNMP). Applications that generate RFC1157 compliant SNMP Traps are considered compliant SNMP Agents. SNMP Agents transmit messages to the SNMP Trap Receiver Plug-in, which then passes each message to the Plug-in Message Dispatcher.
farm9.com, Inc. has developed a custom Win32 Agent for gathering and transmitting information. This Agent is loosely based on SaberNet's NT Syslog service that is distributed at "www.sabernet.net/software/ntsyslog.html". The NT Syslog Agent gathers information from the Windows Event Log sub-system and supports a generic file transportation mechanism. This allows information from the standard Event Log and from text files such as IIS web logs to be transmitted across the network.
A Snort output plug-in was created to package and transmit Snort Intrusion Detection messages across the network. This output plug-in was developed by farm9.com, Inc. and defines a packet format that is understood by the Syslog Receiver Plug-in.
Check Point has developed the Open Platform for Secure Enterprise Connectivity (OPSEC) in an effort to standardize retrieval of Firewall-1 messages. OPSEC allows third parties to implement polling Agents that gather messages from Firewall-1 devices. This Agent technically acts as both a Receiver Plug-in and an Agent, since it must poll Firewall-1 devices for information. Due to licensing restrictions on the OPSEC library, farm9.com, Inc. is unable to package this agent in its entirety. Entities wishing to use the OPSEC Agent will have to download the OPSEC library from Check Point before compiling this plug-in.
Copyright © 2005 farm9.com, Inc. - All Rights Reserved.