Until now variable names, such as flowbit names, were local to a detect
engine. This made sense as they were only ever used in that context.
For the purpose of logging these names, this needs a different approach.
The loggers live outside of the detect engine. Also, in the case of
reloads and multi-tenancy, there are even multiple detect engines, so
it would be even more tricky to access them from the outside.
This patch brings a new approach. A any time, there is a single active
hash table mapping the variable names and their id's. For multiple
tenants the table is shared between tenants.
The table is set up in a 'staging' area, where locking makes sure that
multiple loading threads don't mess things up. Then when the preparing
of a detection engine is ready, but before the detect threads are made
aware of the new detect engine, the active varname hash is swapped with
the staging instance.
For this to work, all the mappings from the 'current' or active mapping
are added to the staging table.
After the threads have reloaded and the new detection engine is active,
the old table can be freed.
For multi tenancy things are similar. The staging area is used for
setting up until the new detection engines / tenants are applied to
the system.
This patch also changes the variable 'id'/'idx' field to uint32_t. Due
to data structure padding and alignment, this should have no practical
drawback while allowing for a lot more vars.
Matches on the start of a HTTP request or response.
Uses a buffer constructed from the request line and normalized request
headers, including the Cookie header.
Or for the response side, it uses the response line plus the
normalized response headers, including the Set-Cookie header.
Both buffers are terminated by an extra \r\n.
A sticky buffer that allows content inspection on a contructed buffer
of HTTP header names. The buffer starts with \r\n, the names are
separated by \r\n and the end of the buffer contains an extra \r\n.
E.g. \r\nHost\r\nUser-Agent\r\n\r\n
The leading \r\n is to make sure one can match on a full name in all
cases.
Some keywords need a scratch space where they can do store the results
of expensive operations that remain valid for the time of a packets
journey through the detection engine.
An example is the reconstructed 'http_header' field, that is needed
in MPM, and then for each rule that manually inspects it. Storing this
data in the flow is a waste, and reconstructing multiple times on
demand as well.
This API allows for registering a keyword with an init and free function.
It it mean to be used an initialization time, when the keyword is
registered.