To be able to identify mails with identical subjects without
using the subject itself as a key, it is possible to use the md5
hash of the subjet string. This allows to limit the privacy impact.
Some mail clients are using tabulation and/or space for comma
separated list. This patch removes them so the event will contain
only significative characters.
This patch adds a way to specify which MIME fields to log via
the custom keyword in the EVE configuration. it also adds an
extended logging where some fields are added. The logging support
mono value fields as well as multivalue fields via the use of
JSON array.
This patch introduces a new function that can be used to handle
multivalued MIME fields. A callback function can be called for
each corresponding field value.
If the status is not PARSE_DONE then in that case we may have
imcomplete information. Increasing the stream reassemly depth
in that case would be a good idea.
The body_md5 has been added and contain the value of the md5sum
of the body.
This patch is using the state PARSE_DONE on the MIME state to
detect when a message has been completely parsed.
This patch is computing the md5 sum of the body of the MIME message.
This will allow to detect messages with same content and sent to
different people.
This patch changes the way smtp message are written. It is using
the "email" key to store the email related fields. This will
allow to do the same search through SMTP and IMAP if we implement
this last one.
The mpm_uricontent_maxlen logic was meant to track the shortest
possible pattern in the MPM of a SGH. So a minlen more than a maxlen.
This patch replaces the complicated tracking logic by a simpler
scheme. When the SGH's are finalize, the minlen is calculated.
It also fixes a small corner case where the calculated "maxlen" could
be wrong. This would require a smaller pattern in a rule to be forced
as fast pattern.
More exceptional cases for protocol detection. In very unbalanced flows,
where just a few bytes are sent toserver and many toclient, proto detect
might not complete in time on the toserver direction. This can lead to
queuing up many segments in the toclient direction.
Another case is that in come cases the stream is flagged as proto detect
done, but the flows proto detect flags are not set. This is now handled
by the ProtoDetectDone() check.
Due to a broken sequence number check, detect could fail to process
smsgs in case of a sequence wrap. This could lead to excessive use
of smsg's but also of segments, since these aren't cleared until the
smsg containing them is.
Use the reassembly fast paths only after protocol detection has completed.
In some corner cases the sending of smaller segments lead to protocol
detection failing.
If the protocol required TOSERVER data first, but the SSN started with
a GAP, then the TOCLIENT side would get stuck in an expensive path:
1. it would run detection on TOCLIENT
2. it would try to force reassembly for TOSERVER
3. it would reset the detected protocol as TOSERVER failed
4. it would not evict any segment
This had 2 consequences:
1. on long running sessions this could lead to using lots of memory
on segments, denying other sessions resources
2. wasted cycles on protocol detection and segment list management
This patch introduces a fix. It checks in the (2) stage above, whether
the opposing stream (that we depend on) it is a NOREASSEMBLY state. If
so, it gives up on this side of the session as well.
In case of protocol detection not yet being complete, the segment
list was walked unconditionally to unset the app layer processed
flag. Optimize this to bail on the first segment that doesn't have
the flag set.
Support TLS in Lua detection scripts.
function init (args)
local needs = {}
needs["tls"] = tostring(true)
return needs
end
function match(args)
version, subject, issuer, fingerprint = TlsGetCertInfo();
if version == nil then
return 0
end
str = string.format("Version %s\nIssuer %s\nSubject %s\nFingerprint %s",
version, issuer, subject, fingerprint)
SCLogInfo(str);
return 1
end
Use simple bool values to track the transaction state in both directions.
A tx is only created in two cases:
1. full request parsed
2. response parsed (request missing)
This is true even for multi-packet TCP requests.
This leads to the following tx completion logic for the request side:
the presence of a tx implies the request is complete
On the response side, we consider the tx complete when we have seen
the response. If the DNS parser thinks the response was lost, we also
flag the response side as complete.
Remove most of the CFLAGS updates from configure. Flags are now (mostly)
set in AM_CLFLAGS.
Update all -DBLAH additions to CFLAGS to use AC_DEFINE([BLAH], ...)
Improve Lua vs LuaJIT checking.
Improve the configure output a bit.
Lots of smaller cleanups.
It's not uncommon to see an header like:
X-Forwarded-For: 1.2.3.4:56789
This patch recognizes this case and ignores the port. It also supports
this for IPv6 if the address has the following notation:
X-Forwarded-For: [12::34]:1234
This patch also adds unittests.
Move the tenant load and reload commands to be executed by the detect
loader thread(s).
Limitation: no yaml parsing in parallel. The Conf API is currently not
thread safe, so don't load the tenant config (yaml) in parallel.
To speed up startup with many tenants, tenant loading will be parallelized.
As no tempary threads should be used for these memory allocation heavy
tasks, this patch adds new type of 'command' thread that can be used to
load and reload tenants.
This patch hardcodes the number of loaders to 4. Future work will make it
dynamic.
The loader thread essentially sleeps constantly. When a tasks is sent to
it, it will wake up and execute it.
Store the tenant id in the flow and use the stored id when setting
up pesudo packets.
For tunnel and defrag packets, get tenant from parent. This will only
pass tenant_id's set at capture time.
For defrag packets, the tenant selector based on vlan id will still
work as the vlan id(s) are stored in the defrag tracker before being
passed on.
Make threshold loading prefix aware, so it can be part of tenant
configuration.
If the setting is missing from the tenant, the global setting is tried
and if that too is missing, the global default is used.
Note: currently per host thresholds are tracked globally and NOT per
tenant.
Make reference loading prefix aware, so it can be part of tenant
configuration.
If the setting is missing from the tenant, the global setting is tried
and if that too is missing, the global default is used.
Make classification loading prefix aware, so it can be part of tenant
configuration.
If the setting is missing from the tenant, the global setting is tried
and if that too is missing, the global default is used.
Allow for a tenant to be reloaded. The command is the same as the
register-tenant command, so with a yaml and tenant-id as argument.
However this replaces an existing tenant.
Register tenant handlers/selectors based on what the unix command
"register-tenant-handler" tells.
Check traffic id before adding it. No duplicated registrations for
a traffic id are allowed.
Make available to live mode and unix socket mode.
register-tenant:
Loads a new YAML, does basic validation.
Loads a new detection engine
Loads rules
Add new de_ctx to master store and stores tenant id in the de_ctx so
we can look it up by tenant id later.
unregister-tenant:
Gets the de_ctx, moves it to the freelist
Removes config
Introduce DetectEngineGetByTenantId, which gets a reference to the
detect engine by tenant id.
If a flow was 'pass'd, it means that no packet of it will flow be handled
by the detection engine. A side effect of this was that the per flow
inspect_id would never be moved forward. This in turn lead to a situation
where transactions wouldn't be freed.
This patch addresses this case by incrementing the inspect_id anyway for
the pass case.
Stream GAPs and stream reassembly depth are tracked per direction. In
many cases they will happen in one direction, but not in the other.
Example:
HTTP requests a generally smaller than responses. So on the response
side we may hit the depth limit, but not on the request side.
The asynchronious 'disruption' has a side effect in the transaction
engine. The 'progress' tracking would never mark such transactions
as complete, and thus some inspection and logging wouldn't happen
until the very last moment: when EOF's are passed around.
Especially in proxy environments with _very_ many transactions in a
single TCP connection, this could lead to serious resource issues. The
EOF handling would suddenly have to handle thousands or more
transactions. These transactions would have been stored for a long time.
This patch introduces the concept of disruption flags. Flags passed to
the tx progress logic that are and indication of disruptions in the
traffic or the traffic handling. The idea is that the progress is
marked as complete on disruption, even if a tx is not complete. This
allows the detection and logging engines to process the tx after which
it can be cleaned up.
The app layer state 'version' field is incremented with each update
to the state. It is used by the detection engine to see if the current
version of the state has already been inspected. Since app layer and
detect always run closely together there is no need for a big number
here. The detect code really only checks for equal/not-equal, so wrap
arounds are not an issue.
Set noinspection flags for payloads and packets on flow and stream
pseudo packets. Without these, the pseudo packets could trigger
inspection even though this was disabled for a flow.
This patch implements the rollover option in af_packet capture.
This should heavily minimize the packet drops as well as the
maximum bandwidth treated for a single flow.
The option has been deactivated by default but it is activated in
the af_packet default section. This ensure there is no change for
old users using an existing YAML. And new users will benefit from
the change.
This option is available since Linux 3.10. An analysis of af_packet
kernel code shows that setting the flag in all cases should not
cause any trouble for older kernel.
This patch implements the fanout load balancing modes available
in kernel 4.0. The more interesting is cluster_qm that does the
load balancing based on the RSS queues. So if the network card
is doing a flow based load balancing then a given socket will
receive all packets of a flow indepently of the CPU affinity.
Sync the replacement define with the latest Linux code.
This patch also updates the detection part in configure.ac
to do a declaration of all fields if the newest features are
not present.
When handling error cases on creation of a new idmef field, we are in an unlikely case. This patch adds the unlikely() expression to indicate this to gcc.
Add 'FatalError' and 'FatalErrorConditonal' that will take the same
args as SCLogError.
FatalError logs the error using SCLogError and then exits with return
code EXIT_FAILURE.
FatalErrorOnInit does the same only during init and with
--init-errors-fatal enabled, otherwise it just calls SCLogWarning. So
then the macro returns to the caller.
Implement this for output setup.
In case we can't write in the certs directory, this is possible
we flood the log for each TLS session or even worse each TLS
packet. So this patch puts a limit in the number of logged
messages related to file creation.
This patch implements backward compatibility in suricata.yaml
file. In case the new 'tls-store' output is not present in the
YAML we have to use the value defined in 'tls-log'.
An design error was made when doing the TLS storage module which
has been made dependant of the TLS logging. At the time there was
only one TLS logging module but there is now two different ones.
By putting the TLS store module in a separate module, we can now
use EVE output and TLS store at the same time.
`DetectStreamSizeParse` was first checking if mode[0] is '<', which is true for both '<' and '<=', thus '<=' (and resp. '>=') is never matched. This patch does the `strcmp` to '<=' (resp. '>=') within the if block of '<' (resp. '>') to fix#1488.
Currently the following rule can't be loaded:
alert tcp any any -> any 25 (msg:"SMTP file_data test"; flow:to_server,established; file_data; content:"abc";sid:1;)
and produces the error output:
"Can't use file_data with flow:to_server or from_client with http or smtp."
This checks if the alproto is not http in a signature,
so permits to use flow keyword also.
Issue reported by rmkml.
CID 1298891: Null pointer dereferences (REVERSE_INULL)
Null-checking "curr_file" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
On receiving TCP end of stream packets (e.g. RST, but also sometimes FIN
packets), in some cases the AppLayer parser would not be notified. This
could happen in IDS mode, but would especially be an issue in IPS mode.
This patch changes the logic of the AppLayer API to handle this. When no
new data is available, and the stream ends, the AppLayer API now gets
called with a NULL/0 input, but with the EOF flag set.
This allows the AppLayer parser to call it's final routines still in the
context of a real packet.
StreamMsg would have a fixed size buffer. This patch replaces the buffer
by a dynamically allocated buffer.
Preparation of allowing bigger and customizable buffer sizes.
The flow timeout mechanism called both from the flow manager at run time
and at shutdown creates pseudo packets. For this it has it's own packet
pool, which can be depleted if the timeout logic is faster than the packet
processing threads. In this case the flow timeout would enter a wait loop.
The problem however, is that this wait loop would happen while keeping a
flow locked. This could lead to a race condition when the packet thread(s)
are waiting for the lock that the flow manager has.
This patch introduces a new packet pool call 'PacketPoolWaitForN', meant
to make sure that the thread's packet pool has at least N available
packets. The flow timeout paths use this to make sure enough packets are
available *before* grabbing the flow lock. If there aren't enough packets
available yet, the wait happens before the lock as well.
This still means the wait can happen while the flow hash row is locked, so
we do make sure some more packets are available when entering that. But
perhaps in the future we need a more precise logic there as well.