When an app-layer parser is enabled, it could set its
own stream_depth value calling the API AppLayerParserSetStreamDepth.
Then, the function AppLayerParserPostStreamSetup will replace
the stream_depth value already set with stream_config.reassembly_depth.
To avoid overwriting, in AppLayerParserSetStreamDepth API a flag
will be set internally to specify that a value is already set.
Also remove the now useless 'state' argument from the SetTxDetectState
calls. For those app-layer parsers that use a state == tx approach,
the state pointer is passed as tx.
Update app-layer parsers to remove the unused call and update the
modified call.
Until now, the transaction space is assumed to be terse. Transactions
are handled sequentially so the difference between the lowest and highest
active tx id's is small. For this reason the logic of walking every id
between the 'minimum' and max id made sense. The space might look like:
[..........TTTT]
Here the looping starts at the first T and loops 4 times.
This assumption isn't a great fit though. A protocol like NFS has 2 types
of transactions. Long running file transfer transactions and short lived
request/reply pairs are causing the id space to be sparse. This leads to
a lot of unnecessary looping in various parts of the engine, but most
prominently: detection, tx house keeping and tx logging.
[.T..T...TTTT.T]
Here the looping starts at the first T and loops for every spot, even
those where no tx exists anymore.
Cases have been observed where the lowest tx id was 2 and the highest
was 50k. This lead to a lot of unnecessary looping.
This patch add an alternative approach. It allows a protocol to register
an iterator function, that simply returns the next transaction until
all transactions are returned. To do this it uses a bit of state the
caller must keep.
The registration is optional. If no iterator is registered the old
behaviour will be used.
Free txs that are done out of order if we can. Some protocol
implementations have transactions running in parallel, where it is
possible that a tx that started later finishes earlier than other
transactions. Support freeing those.
Also improve handling on asynchronious transactions. If transactions
are unreplied, e.g. in the dns flood case, the parser may at some
point free transactions on it's own. Handle this case in
the app-layer engine so that the various tracking id's (inspect, log,
and 'min') are updated accordingly.
Next, free txs much more aggressively. Instead of freeing old txs
at the app-layer parsing stage, free all complete txs at the end
of the flow-worker. This frees txs much sooner in many cases.
Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines
This supercedes the MpmIDs API for directionless tracking
of the prefilter engines.
When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.
Special handling for the stream engine:
If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.
This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
live of a single inspection run.
Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add a alproto check to catch the remaining cases.
When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.
Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.
Move rule group lookup and handling into it's own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.
Many cleanups and much refactoring.
Add API meant to replace the MpmIDs API. It uses a u64 for each direction
in a tx to keep track of 2 things:
1. is inspection done?
2. which prefilter engines (like mpm) are already completed
Avoid looping in transaction output.
Update app-layer API to store the bits in one step
and retrieve the bits in a single step as well.
Update users of the API.
Create a bitmap of the loggers per protocol. This is done at runtime
based on the loggers that are enabled. Take the logger_id for each
logger and store it as a bitmap in the app-layer protcol storage.
Goal is to be able to use it as an expectation later.
TCP reassembly is now deactivated more frequently and triggering a
bypass on it is resulting in missing some alerts due forgetting
about packet based signature.
So this patch is introducing a dedicated flag that can be set in
the app layer and transmitted in the streaming to trigger bypass.
It is currently used by the SSL app layer to trigger bypass when
the stream becomes encrypted.
A parser can now set a flag that will tell the application
layer that it is capable of handling gaps. If enabled, and a
gap occurs, the app-layer needs to be prepared to accept
input that is NULL with a length, where the length is the
number of bytes lost. It is up to the app-layer to
determine if it can sync up with the input data again.
In various scenarios buffers would be checked my MPM more than
once. This was because the buffers would be inspected for a
certain progress value or higher.
For example, for each packet in a file upload, the engine would
not just rerun the 'http client body' MPM on the new data, it
would also rerun the method, uri, headers, cookie, etc MPMs.
This was obviously inefficent, so this patch changes the logic.
The patch only runs the MPM engines when the progress is exactly
the intended progress. If the progress is beyond the desired
value, it is run once. A tracker is added to the app layer API,
where the completed MPMs are tracked.
Implemented for HTTP, TLS and SSH.
Prevents the case where the logged id is incremented if a newer
transaction is complete and an older one is still outstanding.
For example, dns request0, unsolicited dns response, dns response0
would result in the valid response0 never being logged.
Similarily this could happen for:
request0, request1, response1, response0
which would end up having request0, request1 and response1 logged,
but response0 would not be logged.
This permits to set a stream depth value for each
app-layer.
By default, the stream depth specified for tcp is set,
then it's possible to specify a own value into the app-layer
module with a proper API.
To be able to add a transaction counter we will need a ThreadVars
in the AppLayerParserParse function.
This function is massively used in unittests
and this result in an long commit.
This function globally checks if the protocol is registered and
enabled by testing for the per alproto callback:
StateGetProgressCompletionStatus
This check is to be used before enabling Tx-aware code, like loggers.
Add function AppLayerParserRegisterLoggerFuncs for registering
a callback function for checking if a specific logger has logged
a transaction, and a callback function for specifying that it has.
Also add functions AppLayerParserGetTxLogged and
AppLayerParserSetTxLogged to invoke these callback functions.
Change AppLayerParserRegisterGetStateProgressCompletionStatus to
only store one ProgressCompletionStatus callback function for each
alproto, instead of storing one for each ipproto.
This enables us to use AppLayerParserGetStateProgressCompletionStatus
in functions where we do not know the ipproto used.
This patch introduces a new set of commandline options meant for
assisting in fuzz testing the app layer implementations.
Per protocol, 2 commandline options are added:
--afl-http-request=<filename>
--afl-http=<filename>
In the former case, the contents of the file are passed directly to
the HTTP parser as request data.
In the latter case, the data is devided between request and responses.
First 64 bytes are request, then next 64 are response, next 64 are
request, etc, etc.
The new API call:
int AppLayerParserProtocolHasLogger(uint8_t ipproto,
AppProto alproto)
Returns TRUE if a logger is registered on the ip/alproto pair, and
FALSE otherwise.
When running w/o detect, TX cleanup handling needs to ignore the
inspect_id as it's only updated by detect.
This patch introduces a new ActiveTx handler for logging only:
AppLayerTransactionGetActiveLogOnly
If --disable-detection is passed on the commandline, this function
is registered.
Move app layer event handling into app-layer-event.[ch].
Convert 'Set' macro's to functions.
Get rid of duplication in Set and SetRaw. Set now calls SetRaw.
Fix potentential int overflow condition in the event storage.
Update callers.
In preparation of a patchset that will allow for disabling the detect
module, this patch introduces a way to register a function for getting
the lowest active tx id. This is used by the app layer for cleaning up
transactions that already fully inspected, and by the flow timeout code
to determine if a flow is fully inspected and logged at timeout.
The registration function RegisterAppLayerGetActiveTxIdFunc allows for
registration of a custom function of type:
uint64_t (*GetActiveTxIdFunc)(Flow *f, uint8_t flags);
If no function is called, AppLayerTransactionGetActiveDetectLog is used,
which implements the existing behaviour of considering both the
inspect_id's and the log_id.
For AppLayerThreadCtx, AppLayerParserState, AppLayerParserThreadCtx
and AppLayerProtoDetectThreadCtx, use opaque pointers instead of
void pointers.
AppLayerParserState is declared in flow.h as it's part of the Flow
structure.
AppLayerThreadCtx is declared in decode.h, as it's part of the
DecodeThreadVars structure.