Multi buffer matching is implemented as a way for a rule to match
on multiple buffers within the same transaction.
Before this patch a rule like:
dns.query; content:"example"; dns.query; content:".com";
would be equivalent to:
dns.query; content:"example"; content:".com";
If a DNS query would request more than one name, e.g.:
DNS: [example.net][something.com]
Eeach would be inspected to have both patterns present. Otherwise,
it would not be a match. So the rule above would not match, as neither
example.net and somthing.com satisfy both conditions at the same time.
This patch changes this behavior. Instead of the above, each time the
sticky buffer is specified, it creates a separate detection unit. Each
buffer is a "multi buffer" sticky buffer will now be evaluated against
each "instance" of the sticky buffer.
To continue with the above example:
DNS: [example.net] <- matches 'dns.query; content:"example";'
DNS: [something.com] <- matches 'dns.query; content:".com"'
So this would now be a match.
To make sure both patterns match in a single query string, the expression
'dns.query; content:"example"; content:".com";' still works for this.
This patch doesn't yet enable the behavior for the keywords. That is
done in a follow up patch.
To be able to implement this the internal storage of parsed rules
is changed. Until this patch and array of lists was used, where the
index was the buffer id (e.g. http_uri, dns_query). Therefore there
was only one list of matches per buffer id. As a side effect this
array was always very sparsely populated as many buffers could not
be mixed.
This patch changes the internal representation. The new array is densely
packed:
dns.query; content:"1"; dns.query; bsize:1; content:"2";
[type: dns_query][list: content:"1";]
[type: dns_query][list: bsize:1; content:"2";]
The new scheme allows for multiple instances of the same buffer.
These lists are then translated into multiple inspection engines
during the final setup of the rule.
Ticket: #5784.
Update APIs to store files in transactions instead of the per flow state.
Goal is to avoid the overhead of matching up files and transactions in
cases where there are many of both.
Update all protocol implementations to support this.
Update file logging logic to account for having files in transactions. Instead
of it acting separately on file containers, it is now tied into the
transaction logging.
Update the filestore keyword to consider a match if filestore output not
enabled.
src/detect-engine-state.c:127:91: style: Suspicious calculation. Please use parentheses to clarify the code. The code ''a&b?c:d'' should be written as either ''(a&b)?c:d'' or ''a&(b?c:d)''. [clarifyCalculation]
DetectEngineStateDirection *dir_state = &state->dir_state[direction & STREAM_TOSERVER ? 0 : 1];
^
src/detect-engine-state.c:194:53: style: Suspicious calculation. Please use parentheses to clarify the code. The code ''a&b?c:d'' should be written as either ''(a&b)?c:d'' or ''a&(b?c:d)''. [clarifyCalculation]
de_state->dir_state[direction & STREAM_TOSERVER ? 0 : 1].filestore_cnt += file_no_match;
^
src/detect-engine-state.c:201:57: style: Suspicious calculation. Please use parentheses to clarify the code. The code ''a&b?c:d'' should be written as either ''(a&b)?c:d'' or ''a&(b?c:d)''. [clarifyCalculation]
if (de_state->dir_state[direction & STREAM_TOSERVER ? 0 : 1].filestore_cnt == sgh->filestore_cnt)
^
Fix issue where after a reset the now empty list elements are not
reused and the values may not be valid for the current detect
engine anymore.
Introduce a 'current' (cur) pointer that points to the store element
currently being filled. This way existing stores will be reused.
If 'cur' is NULL and 'head' is not NULL it means we need to use
'tail' to append a new store.
As the file prune is now moved to the flow worker, the file
prune is run later, meaning the first file has not yet
been pruned from the file container list.
Adjust test to look for a second file, and check the
flags on that file.
For commit addressing bug 2490.
Also remove the now useless 'state' argument from the SetTxDetectState
calls. For those app-layer parsers that use a state == tx approach,
the state pointer is passed as tx.
Update app-layer parsers to remove the unused call and update the
modified call.
Fix the inspection of multiple files in a single TX, where new files
may be added to the TX after inspection started.
Assign the hard coded id DE_STATE_FLAG_FILE_INSPECT to the file
inspect engine.
Make sure that sigs that do file inspection and don't match on the
current file always store a detailed state. This state will include
the DE_STATE_FLAG_FILE_INSPECT flag.
When the app-layer indicates a new file is available, for each sig
that has the DE_STATE_FLAG_FILE_INSPECT flag set, reset part of the
state so that the sig is evaluated again.
Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines
This supercedes the MpmIDs API for directionless tracking
of the prefilter engines.
When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.
Special handling for the stream engine:
If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.
This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
live of a single inspection run.
Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add a alproto check to catch the remaining cases.
When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.
Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.
Move rule group lookup and handling into it's own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.
Many cleanups and much refactoring.
Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:
1. raw stream inspection is now also an inspect engine
Since this engine doesn't take the transactions into account, it
could potentially run multiple times on the same data. To avoid
this, basic result caching is in place.
2. the engines are sorted by progress, but the 'MPM' engine is first
even if the progress is higher
If MPM flags a rule to be inspected, the inspect engine for that
buffer runs first. If this step fails, the rule is no longer
evaluated. No state is stored.
Flowvars were already using a temporary store in the detect thread
ctx.
Use the same facility for pktvars. The reasons are:
1. packet is not always available, e.g. when running pcre on http
buffers.
2. setting of vars should be done post match. Until now it was also
possible that it is done on a partial match.