Azinix

evlMgr.c File Reference

Routines for creating data structure containing everything needed to evaluate rules. More...

#include "evl.h"

Go to the source code of this file.

Functions

Evl_Manager_tEvl_BuildManager (char *ruleFileString, Tcl_Interp *interp, ClientData cd)
 Build basic evl data structures from a rule-set.
Evl_Action_tEvl_AllocAction (Rlp_Action_t *parsedAction)
 Create the action structured from the parsed entries.
bdd_t * Evl_BuildBddFromL4Formula (Rlp_L4Check_t *L4Check, st_table *defineTable, bdd_manager *bddMgr)
 Build a bdd for the L4 check.
bdd_t * Evl_BuildBddForIps (Rlp_Formula_t *srcIp, bdd_manager *bddMgr, st_table *defineTable)
 Build a bdd for an ip formula.
bdd_t * Evl_BuildBddForPorts (Rlp_Formula_t *portFormula, bdd_manager *bddMgr, st_table *defineTable)
 Create a BDD from a port formula.
bdd_t * Evl_BuildBddForPortRange (int lowPort, int highPort, bdd_manager *bddMgr)
 Build Bdd for a port interval [low,high], inclusive of end-points.
int Evl_BddCmp (char *aBdd, char *bBdd)
 Compare function for bdd_t.
int Evl_BddHash (char *aBdd, int modulus)
 Hash function for bdd_t.
Evl_L4LookupTable_tEvl_RuleRelationToLookupTable (bdd_t *TR, Evl_L4Manager_t *L4Mgr, array_t *ruleVarArray)
 Build a lookup table from the rule relation BDD.
Evl_L4LookupTable_tEvl_L4BuildLookup (Evl_BddLevelPair_t *root, int numClasses, Evl_L4Manager_t *L4Mgr, st_table *allDecisionNodes)
 Build the L4 lookup array.
Evl_BddLevelPair_tEvl_CreateBddLevelPair (bdd_t *aBdd, int level)
 Allocate and populate a bdd-and-level struct.
int Evl_BddLevelPairHash (char *aPtr, int modulus)
 Hash function for a bdd-level pair.
int Evl_BddLevelPairCmp (char *aPtr, char *bPtr)
 Compare function for a bdd-level pair.
Evl_L4Manager_tEvl_L4ManagerAlloc (Evl_Manager_t *mgr, char *mode)
 Allocate and initialize a TCP manager.
Evl_Manager_tEvl_ManagerAlloc ()
 Allocate and initialize a manager.
int Evl_ManagerFree (Evl_Manager_t *result)
 Free an L4 Manager.
Evl_ContentMgr_tEvl_BuildContentMgr (array_t *L7CheckArray)
 Given an array of Rlp_Formula_t's, assumed to be L7 rules, return an Evl_ContentMgr_t.
Evl_ContentMgr_tEvl_AllocContentMgr ()
 Allocate a content manager structure.
void Evl_PrintContentMgr (Evl_ContentMgr_t *aCM)
 Print a content manager.
int Evl_L4LookupTableAddRuleArrays (Evl_L4LookupTable_t *table)
 Add rule arrays in addition to the sets.
Evl_ContentMgr_tEvl_BuildContentMgrForLowerCaseStrings (Evl_ContentMgr_t *aCM)
 Build the content manager with all strings projected down to lower case.


Detailed Description

Routines for creating data structure containing everything needed to evaluate rules.

Content searches come in several flavors:



    o these are specified using depth and offset

      + most are of the form

        content:"HTTP/1.1 403"; depth:12 OR

        content:"|2a 02|"; offset:0; depth:2; content:"|00 04 00 06|"; offset:6; depth:4;
        which specify exactly where to look

        Note this contradicts what the manual says is the semantics
        of depth, it appears to be relative to the offset, not to the 
        start of payload.

     + some are like

       content:"|01 87 8B 00 00|"; offset:40; depth:8;

       which means look for this content string starting at 40 bytes
       into the payload, and only at locations 40 41 42 43

     + with one exception (750) no depth exceeds 128, and offsets
       are in the set {300, 200, 83, 45, 43, ... }  (the 750 and 300 are the
       same)

   o strings that have within/distance constraints

     complete set of values: distance:0 :1 :16 :32 :4 :5 
     within:1 :100 :1024 :12 :128 :150 :2 :255 :256 :3 :4 :50 :500 :512 :600

     there are 8 rules that mix depth/offset and within/distance

     There is some confusion as to whether within/distance is anchored to 
     the START of the last match, or the end of the last match (or the
     beginning of the new match for distance)

     content:"SITE "; nocase; content:"EXEC "; distance:0;
     content:"RETR"; nocase; content:"file_id.diz"; nocase; distance:1;
     content:"|00 01 86 A0|"; content:"|00 00 00 04|"; distance:4; within:4;
     content:"From\:"; content:"<><><><><><><><><><><><><><><><><><><><><><>"; 
             distance:0; content:"("; distance:1; content:")"; distance:1;

Some strings are very short (e.g. 1 char, and using a 
generic automaton for all strings will lead to huge 
numbers of hits in the automaton.

there are many singletons:

0x0 0x1 0x4 0x9 0x10 0x19 % & () * .  0 1 : ; > ?  @ A [ ] { |

doubles:

0x0 0x0 0x0 0x1 0x0 0x2 0x-76 0x-76 0x0 0x20 00 02 03 04 0x0 # 07 ?& 
09 0x2 0x0 -- ST 20 21 22 23 24 25 70 26 71 *0x1 *0x2 0x4 0x0 / A B 
C 40 41 91 92 S ` p 10 12 13 14 15 60 16 17 63 64 @/ @ ..  .0 30 
31 32 33 34 35 36 =+ 37 38 39 88

triples:

0x0 0x0 0x-4 00 911 121 FC 125 0a 5c 0x0 0x2 0x0 20 c:\ 0x0 
_0x2 MKD CWD B0x0 0x2 0x10 D/ !# ~0x10 %%% 199 l44 1MB 110 PWD 
117 118 000 0x0 0x6 0x0 /%% db= /// ;id 0x0 0x1 / 130 ../ 0x0 
0x1 C 0x5 0x0 > 370 et= 1u .. 0x4 0x1 0x0 100 0x0 0x-53 0x0 p

There are several ways in which to avoid this:



    this seems unreasonable, since we won't know the state of the 
    global automaton, unless we run from scratch.


    more generally, we could have a seperate automaton per equivalence
    class - there are a relatively small number of classes after we
    partition.  we'll need 300 automata, each over an avg of 2 layer 4 rules,
    each of which is ~20 rules. this might be too high a memory overhead.

    in addition we'd have 3 automata simulations to do per packet (plus
    the no case automata).

One solution is to have for each rule, the longest string appearing in it. Then build a table/fsm over these strings. Then run the payload over the table/fsm and see which strings are present in the payload. The specific set of strings that match is used to find corresponding rules; each of these rules is then invoked on a 1-by-1 basis.

We can do pretty much the same thing - first of all determine which rules are potentially applicable by running the automaton, and for each matching string, seeing the rules associated with it. We have the layer 4 information, so we can automatically do some pruning here.

It should be pretty simple to check a single rule on a packet; infact all we are checking is the optional checks (ttl, icmp, etc.) The payload checks will be either a intersection of strings, or a sequence of strings (need to check if there's also a possibility of mixed checks, e.g., abc U pqr OR xyz). For each string we're checking for, we can pre-compute the BM jump table, should be pretty quick.

One concern that remains is short strings - 1 or 2 chars. What do we do if they are the longest string appearing in a rule?

Since these strings will appear only in rules that are very qualified (src port = 6000, dest port = 2121), we can keep a BDD encoding the L4 conditions for these rules (call the set S) , and just check the rules directly.

It's likely that there will be few rules per class for the ECs engendered by S. (Most of the packets won't even satisfy S.) So a brute force evaluation of all the rules in the class should be simple enough.

Definition in file evlMgr.c.


Function Documentation

Evl_Action_t* Evl_AllocAction ( Rlp_Action_t parsedAction  ) 

Create the action structured from the parsed entries.

Definition at line 272 of file evlMgr.c.

Evl_ContentMgr_t* Evl_AllocContentMgr (  ) 

Allocate a content manager structure.

Definition at line 2003 of file evlMgr.c.

int Evl_BddCmp ( char *  aBdd,
char *  bBdd 
)

Compare function for bdd_t.

Definition at line 704 of file evlMgr.c.

int Evl_BddHash ( char *  aBdd,
int  modulus 
)

Hash function for bdd_t.

Makes use of the bdd_node structure, here's the info from the bdd.doc:

bdd_node * bdd_get_node(f, is_complemented) bdd_t *f; boolean *is_complemented;

Return the regularized (i.e. uncomplemented) pointer to the bdd_node to which `f' refers. Also, returns whether or not the bdd_node pointer is complemented. Note well: the address of a bdd_node changes upon garbage collection. If your code is sensitive to the addresses of bdd_nodes, and you invoke BDD routines which cause new bdd_nodes to be created, then a garbage collection may automatically be triggered. To prevent a garbage collection from automatically occurring, use bdd_set_gc_mode to turn off garbage collection before a sensitive piece of code is executed.

Definition at line 745 of file evlMgr.c.

int Evl_BddLevelPairCmp ( char *  aPtr,
char *  bPtr 
)

Compare function for a bdd-level pair.

Definition at line 1437 of file evlMgr.c.

int Evl_BddLevelPairHash ( char *  aPtr,
int  modulus 
)

Hash function for a bdd-level pair.

Definition at line 1419 of file evlMgr.c.

bdd_t* Evl_BuildBddForIps ( Rlp_Formula_t srcIp,
bdd_manager *  bddMgr,
st_table defineTable 
)

Build a bdd for an ip formula.

Variables are starting from 0, which is the LSB of the IP address.

Definition at line 427 of file evlMgr.c.

bdd_t* Evl_BuildBddForPortRange ( int  lowPort,
int  highPort,
bdd_manager *  bddMgr 
)

Build Bdd for a port interval [low,high], inclusive of end-points.

Definition at line 617 of file evlMgr.c.

bdd_t* Evl_BuildBddForPorts ( Rlp_Formula_t portFormula,
bdd_manager *  bddMgr,
st_table defineTable 
)

Create a BDD from a port formula.

Definition at line 558 of file evlMgr.c.

bdd_t* Evl_BuildBddFromL4Formula ( Rlp_L4Check_t L4Check,
st_table defineTable,
bdd_manager *  bddMgr 
)

Build a bdd for the L4 check.

Definition at line 344 of file evlMgr.c.

Evl_ContentMgr_t* Evl_BuildContentMgr ( array_t L7CheckArray  ) 

Given an array of Rlp_Formula_t's, assumed to be L7 rules, return an Evl_ContentMgr_t.

Definition at line 1891 of file evlMgr.c.

Evl_ContentMgr_t* Evl_BuildContentMgrForLowerCaseStrings ( Evl_ContentMgr_t aCM  ) 

Build the content manager with all strings projected down to lower case.

We use this in the following way: we have a number of checks that are case-insensitive, and a number that are case-sensitive. We could build two DFA, and then run the checks twice. On the other hand we could make ALL of the DFA checks case-insensitive, and then in the final detailed check, take case into account. We choose the latter since it means only a single pass over the packet. There is a possibility of false positives from this scheme, e.g., a case specific search search for "abc" will be matched by "ABC"; I don't have a detailed analysis/study of this pheonomenon, but it seems to be unlikely.

See also:
evalContentCheck

Definition at line 2445 of file evlMgr.c.

Evl_Manager_t* Evl_BuildManager ( char *  ruleFileString,
Tcl_Interp *  interp,
ClientData  cd 
)

Build basic evl data structures from a rule-set.

Definition at line 192 of file evlMgr.c.

Evl_BddLevelPair_t* Evl_CreateBddLevelPair ( bdd_t *  aBdd,
int  level 
)

Allocate and populate a bdd-and-level struct.

Definition at line 1403 of file evlMgr.c.

Evl_L4LookupTable_t* Evl_L4BuildLookup ( Evl_BddLevelPair_t root,
int  numClasses,
Evl_L4Manager_t L4Mgr,
st_table allDecisionNodes 
)

Build the L4 lookup array.

Definition at line 1202 of file evlMgr.c.

int Evl_L4LookupTableAddRuleArrays ( Evl_L4LookupTable_t table  ) 

Add rule arrays in addition to the sets.

Definition at line 2397 of file evlMgr.c.

Evl_L4Manager_t* Evl_L4ManagerAlloc ( Evl_Manager_t mgr,
char *  mode 
)

Allocate and initialize a TCP manager.

Definition at line 1497 of file evlMgr.c.

Evl_Manager_t* Evl_ManagerAlloc (  ) 

Allocate and initialize a manager.

Definition at line 1563 of file evlMgr.c.

int Evl_ManagerFree ( Evl_Manager_t result  ) 

Free an L4 Manager.

Definition at line 1877 of file evlMgr.c.

void Evl_PrintContentMgr ( Evl_ContentMgr_t aCM  ) 

Print a content manager.

Definition at line 2025 of file evlMgr.c.

Evl_L4LookupTable_t* Evl_RuleRelationToLookupTable ( bdd_t *  TR,
Evl_L4Manager_t L4Mgr,
array_t ruleVarArray 
)

Build a lookup table from the rule relation BDD.

Precondition:
First 96 vars are dest port, src port, dest ip, src ip.
We're going to return an array that will be used to represent a 256-ary tree mapping IP headers to bit-vectors whose length is equal to the number of distinct L4 rules.

Definition at line 983 of file evlMgr.c.