The HTTP bodies (http_client_body and http_server_body/file_data) use
settings to control how much data we have before doing first inspection:
request-body-minimal-inspect-size
response-body-minimal-inspect-size
These settings default to 32k as quite some existing rules need this.
At the same time, the 'raw stream' inspection uses its own limits. By
default it inspects the data in blocks of about 2.5k. This could lead
to a situation where rules would not match.
For example, with 2 rules like this:
content:"abc"; content:"data="; http_client_body; depth:5; sid:1;
content:"xyz"; sid:2;
Sid 1 would only be inspected when the POST body reached the 32k limit
or when it was complete. Observed case shows the POST body to be 18k.
Sid 2 is inspected as soon as the 2.5k limit is reached, and then again
for each 2.5k increment. This moves the raw stream tracker forward.
So by the time sid 1 is inspected, some 18/19k into the stream, the
raw stream tracker is actually already moved forward for approximately
17.5k, this leads to the stream match of sid 1 possibly not matching.
Since the body match is at the start of the buffer, it makes sense
that the body and stream are inspected together.
The body inspection uses a tracker 'body_inspected', that keeps track
of how far into the body both MPM and per signature inspection has
moved.
This patch updates the logic in 2 ways:
1. it triggers earlier HTTP body inspection, which is matched to the
stream inspection. When the detection engine finds it has stream
data available for inspection, it passes the new 'STREAM_FLUSH'
flag to the HTTP body inspection code. Which will then do an
early inspection, even if still before the min inspect size.
2. to still somewhat adhere to the min inspect size, the body
tracker is not updated until the min inspect size is reached.
This will lead to some re-evaluation of the same body data.
If raw stream reassembly is disabled, this 'STREAM_FLUSH' flag is
never set, and the old behavior is used.
Bug #2522.
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:
1. the messages had a fixed size, so blocks of data bigger than ~4k
would be cut into multiple messages
2. it lead to lots of data copying and unnecessary memory use
3. the StreamMsgs used a central pool
The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.
The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of continuous data.
To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
This way it runs the MPM and individual rule inspection code. At
the end of each detection run the stream engine is notified that it
can move forward it's 'progress'.
StreamMsg would have a fixed size buffer. This patch replaces the buffer
by a dynamically allocated buffer.
Preparation of allowing bigger and customizable buffer sizes.
The stream chunk pool contains preallocating stream chunks (StreamMsg).
These are used for raw reassembly, used in raw content inspection by
the detection engine. The default setting so far has been 250, which
was hardcoded. This meant that in setups that needed more, allocs and
frees would be happen constantly.
This patch introduces a yaml option to set the 'prealloc' value in the
pool. The default is still 250.
stream.reassembly.chunk-prealloc
Related to feature #1093.
StreamMsg' flow reference was used mostly to make sure a flow would
not get removed from the hash before inspection. For this it needed
to reference the flow use_cnt reference counter. Nowadays we have
more advanced flow timeout handling. This will make sure that if
there still are pending smsgs' in a flow, these will still be
processed.
This patch introduces a function called StreamMsgForEach which
can be used to run a callback on all segments of a stream. This
is currently only supported for TCP as this is the only streaming
aware protocol.
Split stream reassembly in 2 parts: a part that sends ack'd data to the app
layer parsers as soon as it's available, and another part that queues up
data into larger chunks for raw inspection.
- Implement "closing" state in flow.
- Add protocol specific timeouts.
- Lots of stream tracking updates, fixing a lot of out of window issues.
- Stream reassembly fixes.
- Implement a new IDS runmode with 4 stream and detect threads.
- Added a BUG_ON macro that aborts the engine if the expression is true.
- Better balance the flow queue handler for traffic that doesn't have flow (like icmp currently).
- Simplify application level protocol in the Tcp Session.
- Add some debugging memory counters.