Commit Graph

87 Commits (02218a8a42b684eeaabe1ce97325ffdc6eeb208b)

Author SHA1 Message Date
Victor Julien c6c93d1d12 detect/stream: convert to inspect API v2 4 years ago
Victor Julien 51f38f6453 detect/payload: minor formatting fixup 4 years ago
Victor Julien 58e48bcb87 debug: make it easier to trace flush logic 5 years ago
Victor Julien 32fb7d773a detect/content-inspect: turn void arg into Packet
Replace the 'void *data' argument by a 'Packet *p' as this was
the only user left of the data pointer.
6 years ago
Victor Julien 7186ce7b99 stream: introduce min inspect depth logic
Some rules need to inspect both raw stream data and higher level
buffers together. When this higher level buffer is a streaming
buffer itself, the risk of mismatch exists.

This patch allows an app-layer parser to set a 'min inspect depth'.
The value is used by the stream engine to keep at least this
depth worth of data, so that the detection engine can request
all of it for inspection.

For rules that have the SIG_FLAG_FLUSH flag set, data is inspected
not from offset raw_progress, but from raw_progress minus
min_inspect_depth.

At this time this is only used for sigs that have their fast_pattern
in a HTTP body and have raw stream match as well.
6 years ago
Victor Julien ee1d207454 detect: add debug statements to stream inspect 6 years ago
Victor Julien d14e51a4aa detect/content: pass START/END flags to inspection 7 years ago
Victor Julien 91296d1eec detect/prefilter: add de_ctx to registration 7 years ago
Victor Julien af51e0f5a1 detect: rewrite of the detect engine
Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines

This supercedes the MpmIDs API for directionless tracking
of the prefilter engines.

When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.

Special handling for the stream engine:

If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.

This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
live of a single inspection run.

Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add a alproto check to catch the remaining cases.

When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.

Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.

Move rule group lookup and handling into it's own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.

Many cleanups and much refactoring.
7 years ago
Victor Julien 746638b220 cuda: remove
Remove CUDA support as it has been broken for a long time.

Ticket #2382.
7 years ago
Victor Julien 664f826f8d detect/dns: fix misdetection on dns_query on udp
If 'raw' content patterns were used in a dns_query rule, the raw
patterns would only be evaluated for TCP, but not for UDP.

This patch adds the inspection for UDP as well.

Bug #2263.
7 years ago
Victor Julien 885b8cefec detect: fix crash when stream inspect runs on UDP
Certain rules can apply to both TCP and UDP. For example 'alert dns'
rules are inspected against both TCP and UDP. This lead to the
stream inspect engine being called on a UDP packet.

This patch fixes the issue by exiting early from the stream inspect
engine if a) proto is not TCP or b) ssn is not available

Bug #2158.
7 years ago
Victor Julien ab1200fbd7 compiler: more strict compiler warnings
Set flags by default:

    -Wmissing-prototypes
    -Wmissing-declarations
    -Wstrict-prototypes
    -Wwrite-strings
    -Wcast-align
    -Wbad-function-cast
    -Wformat-security
    -Wno-format-nonliteral
    -Wmissing-format-attribute
    -funsigned-char

Fix minor compiler warnings for these new flags on gcc and clang.
8 years ago
Victor Julien 1bbf555318 detect: improve stateful detection
Now that MPM runs when the TX progress is right, stateful detection
operates differently.

Changes:

1. raw stream inspection is now also an inspect engine

   Since this engine doesn't take the transactions into account, it
   could potentially run multiple times on the same data. To avoid
   this, basic result caching is in place.

2. the engines are sorted by progress, but the 'MPM' engine is first
   even if the progress is higher

   If MPM flags a rule to be inspected, the inspect engine for that
   buffer runs first. If this step fails, the rule is no longer
   evaluated. No state is stored.
8 years ago
Victor Julien aba9cd7d02 stream inspection: add debug counters 8 years ago
Victor Julien d6d7f65050 stream: mpm inspect micro optimizations 8 years ago
Victor Julien 0ef46a8fd2 stream: raw content inspection inline mode
Implement the inline mode for raw content inspection. Packets
are leading, and when a packet's payload has been added to the
stream, the packet is inspected in the context of the stream.

Reassembly will return a buffer with the packet data with older
data in front of it and after it, if available.
8 years ago
Victor Julien 971ab18b95 detect / stream: new 'raw' stream inspection
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:

1. the messages had a fixed size, so blocks of data bigger than ~4k
   would be cut into multiple messages

2. it lead to lots of data copying and unnecessary memory use

3. the StreamMsgs used a central pool

The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.

The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of continuous data.

To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
This way it runs the MPM and individual rule inspection code. At
the end of each detection run the stream engine is notified that it
can move forward it's 'progress'.
8 years ago
Victor Julien 8edc954e82 detect: get rid of Signature::sm_lists
Instead use the lists in init_data during setup and the SigMatchData
arrays during runtime.
8 years ago
Victor Julien bd456076a8 detect: pass SigMatchData to inspect functions 8 years ago
Victor Julien 5e0b0eea4b detect: remove unused flags 8 years ago
Victor Julien bfd4bc8233 detect: constify Signature/SigMatch use at runtime 8 years ago
Sascha Steinbiss e6044aaf1c mpm/spm: check for SSSE3 and enable/disable HS
The new Hyperscan 4.4 API provides a function to check for SSSE3
presence at runtime. This allows us to fall back to non-Hyperscan
matchers on systems without SSSE3 even when the suricata executable
is built with Hyperscan support. Addresses Redmine issue #2010.

Signed-off-by: Sascha Steinbiss <sascha@steinbiss.name>
Tested-by: Arturo Borrero Gonzalez <arturo@debian.org>
8 years ago
Victor Julien 9bb12ccb27 prefilter: move payload engines into separate list 8 years ago
Victor Julien 8798bf48b2 profiling: support prefilter engines 8 years ago
Victor Julien 9ff5703c49 packet/stream: mpm prefilter engine 8 years ago
Jason Ish 796dd5223b tests: no longer necessary to provide successful return code
1 pass, 0 is fail.
9 years ago
Victor Julien a2223bb066 mpm: consify packet/stream search 9 years ago
Victor Julien 87f3adbe4c detect/mpm: unify packet/stream mpm_ctx pointers
SGH's for tcp and udp are now always only per proto and per direction.
This means we can simply reuse the packet and stream mpm pointers.

The SGH's for the other protocols already used a directionless catch
all mpm pointer.
9 years ago
Victor Julien e43c4f3ea2 mpm: optimize calls
For all mpm wrapper functions, check minlen vs the input buffer to see
if we can bypass the mpm search.

Next to this, make all the function inline. Also constify the input and
do other minor cleanups.
9 years ago
Victor Julien 6bb2b001a3 mpm: cleanup: move mpm funcs into buffer specific files 9 years ago
Victor Julien 4f8e1f59a6 mpm: remove obsolete mpm algos
Remove: ac-gfbs, wumanber, b2g, b3g.
9 years ago
Ken Steele 8f1d75039a Enforce function coding standard
Functions should be defined as:

int foo(void)
{
}

Rather than:
int food(void) {
}

All functions where changed by a script to match this standard.
10 years ago
Victor Julien 5e1bc99e5b detect: cleanup
Remove unused alstate and app layer flags arguments from
DetectEngineInspectPacketPayload()
11 years ago
Victor Julien 79c924af8c Fix 2 compiler warnings
FreeBSD 10 32-bit with clang 3.3:

log-tlslog.c:172:14: error: format specifies type 'long' but the argument has type 'time_t' (aka 'int') [-Werror,-Wformat]
             p->ts.tv_sec,
             ^~~~~~~~~~~~
1 error generated.

detect-engine-payload.c:508:27: warning: format specifies type 'long' but the argument has type 'time_t' (aka 'int') [-Wformat]
    printf("%ld.%06ld\n", tv_diff.tv_sec, (long int)tv_diff.tv_usec);
            ~~~           ^~~~~~~~~~~~~~
            %d
1 warning generated.
11 years ago
Anoop Saldanha 3749fc98fd Modify handling of negated content.
The old behaviour of returning a failure if we found a pattern while
matching on negated content is now changed to continuing searching
for other combinations where we don't find the pattern for the
negated content.

Thanks to Will Metcalf for reporting this.
11 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization  and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.

Future optimization:

The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
11 years ago
Anoop Saldanha bd6896bee1 Unit-tests exposing a bug in byte_test, byte_jump and byte_extract.
Bug emanates from all the keywords being unable to handle negative offsets
when the inspection pointer is at the end of the buffer.
12 years ago
Anoop Saldanha ab4b15c2e7 fix for #788.
Now depth is kept in mind when we inspect chunks in client/server body.
This takes care of FPs originating from inspecting subsequent chunks that
match with depth, but shouldn't.
12 years ago
Anoop Saldanha aa363a8144 unittest to display #784. 12 years ago
Victor Julien 0c84a7a2a9 Use _mm_free for memory allocated by _mm_alloc. Bug 703. Minor compiler warning fixes. 12 years ago
Victor Julien 472e061c6d build: more checking for includes 12 years ago
Anoop Saldanha 19e8f82f25 Unittest to display #bug 529. pcre anchor not respected 12 years ago
Anoop Saldanha a34f91358d tests to highlight that
- suricata treates sigs with offset/depth without any packet keywords as stream sigs
- as a consequence suricata will FN on such sigs

The tests introduced here will fail, displaying the issues.  The
next patch in the series would fix the said issues.
13 years ago
Anoop Saldanha 37f66e5f46 update handling negative offsets in byte_extract. Also improve validation in byte_extract to not extract values out of the buffer range 13 years ago
Anoop Saldanha 603d4a719a remove det_ctx->payload_offset and use det_ctx->buffer_offset. Update hscd and hsmd to use the new generic content inspection engine 13 years ago
Anoop Saldanha d1d5507679 remove all old content inspection engines and references to them. We have cleaned the entire content inspection phase and improved alert accuracy 13 years ago
Anoop Saldanha 35f1f7e8d9 unify payload detection engines + fix other bugs in pcre init 13 years ago
Anoop Saldanha 7433d92dd2 undo this commit -
commit eff08f93d8
Author: Anoop Saldanha <poonaatsoc@gmail.com>
Date:   Thu Nov 3 14:31:24 2011 +0530

    update failing unittest to reflect the mpm design update

Fixed a bug in the mpm code that would make all the changes in the commit just undone wrong.
13 years ago
Anoop Saldanha eff08f93d8 update failing unittest to reflect the mpm design update 13 years ago