Move SIMD the implementations of SigMatchSignaturesBuildMatchArray()
for SSE3 and Tile out of detect.c to reduce the size of the file.
Also moved SIMD unit tests to detect-simd.c
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
utilization and performance.
The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.
A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.
Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.
The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.
The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift. It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.
Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16. This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.
The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.
Future optimization:
The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.
Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.
All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.
The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
This patch is using the qa/log directory to store the output
of the check. In case of success, the directory is deleted.
In case of failure, the directory remains in place.
This should fixes#910.
The list-keyword and app-layer listing code was spread over all the
init code. This patch introduces a separate file to store non standard
running mode like these ones.
The TILE-Gx processor includes a packet processing engine, called
mPIPE, that can deliver packets directly into user space memory. It
handles buffer allocation and load balancing (either static 5-tuple
hashing, or dynamic flow affinity hashing are used here). The new
packet source code is in source-mpipe.c and source-mpipe.h
A new Tile runmode is added that configures the Suricata pipelines in
worker mode, where each thread does the entire packet processing
pipeline. It scales across all the Gx chips sizes of 9, 16, 36 or 72
cores. The new runmode is in runmode-tile.c and runmode-tile.h
The configure script detects the TILE-Gx architecture and defines
HAVE_MPIPE, which is then used to conditionally enable the code to
support mPIPE packet processing. Suricata runs on TILE-Gx even without
mPIPE support enabled.
The Suricata Packet structures are allocated by the mPIPE hardware by
allocating the Suricata Packet structure immediatley before the mPIPE
packet buffer and then pushing the mPIPE packet buffer pointer onto
the mPIPE buffer stack. This way, mPIPE writes the packet data into
the buffer, returns the mPIPE packet buffer pointer, which is then
converted into a Suricata Packet pointer for processing inside
Suricata. When the Packet is freed, the buffer is returned to mPIPE's
buffer stack, by setting ReleasePacket to an mPIPE release specific
function.
The code checks for the largest Huge page available in Linux when
Suricata is started. TILE-Gx supports Huge pages sizes of 16MB, 64MB,
256MB, 1GB and 4GB. Suricata then divides one of those page into
packet buffers for mPIPE.
The code is not yet optimized for high performance. Performance
improvements will follow shortly.
The code was originally written by Tom Decanio and then further
modified by Tilera.
This code has been tested with Tilera's Multicore Developement
Environment (MDE) version 4.1.5. The TILEncore-Gx36 (PCIe card) and
TILEmpower-Gx (1U Rack mount).
In preparation of the libhtp upgrade, move all libhtp related conditionals
to configure. This allows for one set of build scripts that works regardless
of the presence of a local libhtp dir.
Adds support for match-on conditions (src, dst, any, both)
Uses GEOIP_MEMORY_CACHE for performance reasons
Adds support for negation and multiple countries in the same rule
Bug fixes
Changed to take flow direction from rule, if present
Comments addressed. Unit tests added.
This patch introduces a unix command socket. JSON formatted messages
can be exchanged between suricata and a program connecting to a
dedicated socket.
The protocol is the following:
* Client connects to the socket
* It sends a version message: { "version": "$VERSION_ID" }
* Server answers with { "return": "OK|NOK" }
If server returns OK, the client is now allowed to send command.
The format of command is the following:
{
"command": "pcap-file",
"arguments": { "filename": "smtp-clean.pcap", "output-dir": "/tmp/out" }
}
The server will try to execute the "command" specified with the
(optional) provided "arguments".
The answer by server is the following:
{
"return": "OK|NOK",
"message": JSON_OBJECT or information string
}
A simple script is provided and is available under scripts/suricatasc. It
is not intended to be enterprise-grade tool but it is more a proof of
concept/example code. The first command line argument of suricatasc is
used to specify the socket to connect to.
Configuration of the feature is made in the YAML under the 'unix-command'
section:
unix-command:
enabled: yes
filename: custom.socket
The path specified in 'filename' is not absolute and is relative to the
state directory.
A new running mode called 'unix-socket' is also added.
When starting in this mode, only a unix socket manager
is started. When it receives a 'pcap-file' command, the manager
start a 'pcap-file' running mode which does not really leave at
the end of file but simply exit. The manager is then able to start
a new running mode with a new file.
To start this mode, Suricata must be started with the --unix-socket
option which has an optional argument which fix the file name of the
socket. The path is not absolute and is relative to the state directory.
THe 'pcap-file' command adds a file to the list of files to treat.
For each pcap file, a pcap file running mode is started and the output
directory is changed to what specified in the command. The running
mode specified in the 'runmode' YAML setting is used to select which
running mode must be use for the pcap file treatment.
This requires modification in suricata.c file where initialisation code
is now conditional to the fact 'unix-socket' mode is not used.
Two other commands exists to get info on the remaining tasks:
* pcap-file-number: return the number of files in the waiting queue
* pcap-file-list: return the list of waiting files
'pcap-file-list' returns a structured object as message. The
structure is the following:
{
'count': 2,
'files': ['file1.pcap', 'file2.pcap']
}
This patch creates a pid file per default and use it to avoid to be
able to run two Suricata. Separate pid file have to be provided to
be able to do it.
Removed the Napatech 2GD support
runmode-napatech-3gd.c had an include from runmode-napatech.h which was erroneous and has been removed as well.
Signed-off-by: Matt Keeler <mk@npulsetech.com>
For use with Network Cards from Napatech utilizing the 3GD driver/api.
- Implemented new run modes in runmode-napatech-3gd.*
- Implemented capture/decode threads in source-napatech-3gd.*
- Integrated the new run modes and source into the build infrastructure.
New configure switches
--enabled-napatech-3gd : Turns on the NT 3GD support
--with-napatech-3gd-includes : The directory containing the NT 3GD header files
--with-napatech-3gd-libraries : The directory containing the NT 3GD libraries to link against.
New CLI switch
--napatech-3gd : Uses the Napatech 3GD run mode
Runmodes Supported:
- auto
- autofp
- workers
Notes:
- tested with 1 Gbps sustained traffic (no drops)
Signed-off-by: Matt Keeler <mk@npulsetech.com>
This patch adds a l3_proto keyword to the signature language. It
can be used to specify if the signature has to match on IPv4, IPv6
or both. For example, one can write:
alert http any any -> any 22 (msg: "HTTP v6"; l3_proto:ip6; sid:14;)
This should close#494.
Creation of the log-tlslog file in order to log tls message.
Need to add some information into suricata.yaml to work.
- tls-log:
enabled: yes # Log TLS connections.
filename: tls.log # File to store TLS logs.
This patch should fix#480 by adding the support of Teredo tunnel.
The IPv6 content of the tunnel will be parsed in a similar way as
what is done the GRE tunnel. Signatures will then be matched on the
IPv6 content.
Found here:
http://burtleburtle.net/bob/hash/doobs.htmlhttp://burtleburtle.net/bob/c/lookup3.c
From the file header:
lookup3.c, by Bob Jenkins, May 2006, Public Domain.
These are functions for producing 32-bit hashes for hash table lookup.
hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
are externally useful functions. Routines to test the hash are included
if SELF_TEST is defined. You can use this free for any purpose. It's in
the public domain. It has no warranty.
Add a decoder for the SERVER_CERTIFICATE during a TLS handshake, extracts the
certificates and keep the subject name.
Add the tls.subject keyword for substring match in rules (TLS layer).
Signed-off-by: Pierre Chifflier <pierre.chifflier@ssi.gouv.fr>
Add profiling per lock location in the code. Accounts how often a
lock is requested, how often it was contended, the max number of
ticks spent waiting for it, avg number of ticks waiting for it and
the total ticks for that location.
Added a new configure flag --enable-profiling-locks to enable this
feature.
Add a host table similar to the flow table. A hash using fine grained
locking. Flow manager for now takes care of book keeping / garbage
collecting.
Tag subsystem now uses this for host based tagging instead of the
global tag hash table. Because the latter used a global lock and the
new code uses very fine grained locking this patch should improve
scalability.
The http_server_body content modifier modifies the previous content to inspect
the normalized (dechunked, unzipped) http_server_body. The workings are similar
to http_client_body. Additionally, a new pcre flag was introduced "/S".
To facilitate this change the signature flags field was changed to be 64 bit.
This patch adds a new alert format called pcap-info. It aims at
providing an easy to parse one-line per-alert format containing
the packet id in the parsed pcap for each alert. This permit to
add information inside the pcap parser.
This format is made to be used with suriwire which is a plugin for
wireshark. Its target is to enable the display of suricata results
inside wireshark.
This format doesn't use append mode per default because a clean file
is needed to operate with wireshark.
The format is a list of values separated by ':':
Packet number:GID of matching signature:SID of signature:REV of signature:Flow:To Server:To Client:0:0:Message of signature
The two zero are not yet used values. Candidate for usage is the
part of the packet that matched the signature.
This patch adds support for the replace keyword. It is used with
content to change selected part of the payload. The major point
with this patch is that having a replace keyword made necessary
to avoid all stream level check because we need to access to the
could-be-modified packet payload.
One of the main difficulty is to handle complex signature. If there is
other content check, we must do the substitution when we're sure all
match are valid. The patch adds an attribute to the thread context
variable to be able to deal with recursivity of the match function.
Replace is only activated in IPS mode and apply only to raw match.
Per packet per app layer parser profiling. Example summary output:
Per App layer parser stats:
App Layer IP ver Proto cnt min max avg
-------------------- ------ ----- ------ ------ ---------- -------
ALPROTO_HTTP IPv4 6 163394 126 38560320 42814
ALPROTO_FTP IPv4 6 644 117 26100 2566
ALPROTO_TLS IPv4 6 670 117 7137 799
ALPROTO_SMB IPv4 6 114794 126 225270 957
ALPROTO_DCERPC IPv4 6 5207 126 25596 1266
Also added to the csv out.
In the csv out there is a new column "stream (no app)" that removes the
app layer parsers from the stream tracking. So raw stream engine performance
becomes visible.
Per packet profiling uses tick based accounting. It has 2 outputs, a summary
and a csv file that contains per packet stats.
Stats per packet include:
1) total ticks spent
2) ticks spent per individual thread module
3) "threading overhead" which is simply calculated by subtracting (2) of (1).
A number of changes were made to integrate the new code in a clean way:
a number of generic enums are now placed in tm-threads-common.h so we can
include them from any part of the engine.
Code depends on --enable-profiling just like the rule profiling code.
New yaml parameters:
profiling:
# packet profiling
packets:
# Profiling can be disabled here, but it will still have a
# performance impact if compiled in.
enabled: yes
filename: packet_stats.log
append: yes
# per packet csv output
csv:
# Output can be disabled here, but it will still have a
# performance impact if compiled in.
enabled: no
filename: packet_stats.csv
Example output of summary stats:
IP ver Proto cnt min max avg
------ ----- ------ ------ ---------- -------
IPv4 6 19436 11448 5404365 32993
IPv4 256 4 11511 49968 30575
Per Thread module stats:
Thread Module IP ver Proto cnt min max avg
------------------------ ------ ----- ------ ------ ---------- -------
TMM_DECODEPCAPFILE IPv4 6 19434 1242 47889 1770
TMM_DETECT IPv4 6 19436 1107 137241 1504
TMM_ALERTFASTLOG IPv4 6 19436 90 1323 155
TMM_ALERTUNIFIED2ALERT IPv4 6 19436 108 1359 138
TMM_ALERTDEBUGLOG IPv4 6 19436 90 1134 154
TMM_LOGHTTPLOG IPv4 6 19436 414 5392089 7944
TMM_STREAMTCP IPv4 6 19434 828 1299159 19438
The proto 256 is a counter for handling of pseudo/tunnel packets.
Example output of csv:
pcap_cnt,ipver,ipproto,total,TMM_DECODENFQ,TMM_VERDICTNFQ,TMM_RECEIVENFQ,TMM_RECEIVEPCAP,TMM_RECEIVEPCAPFILE,TMM_DECODEPCAP,TMM_DECODEPCAPFILE,TMM_RECEIVEPFRING,TMM_DECODEPFRING,TMM_DETECT,TMM_ALERTFASTLOG,TMM_ALERTFASTLOG4,TMM_ALERTFASTLOG6,TMM_ALERTUNIFIEDLOG,TMM_ALERTUNIFIEDALERT,TMM_ALERTUNIFIED2ALERT,TMM_ALERTPRELUDE,TMM_ALERTDEBUGLOG,TMM_ALERTSYSLOG,TMM_LOGDROPLOG,TMM_ALERTSYSLOG4,TMM_ALERTSYSLOG6,TMM_RESPONDREJECT,TMM_LOGHTTPLOG,TMM_LOGHTTPLOG4,TMM_LOGHTTPLOG6,TMM_PCAPLOG,TMM_STREAMTCP,TMM_DECODEIPFW,TMM_VERDICTIPFW,TMM_RECEIVEIPFW,TMM_RECEIVEERFFILE,TMM_DECODEERFFILE,TMM_RECEIVEERFDAG,TMM_DECODEERFDAG,threading
1,4,6,172008,0,0,0,0,0,0,47889,0,0,48582,1323,0,0,0,0,1359,0,1134,0,0,0,0,0,8028,0,0,0,49356,0,0,0,0,0,0,0,14337
First line of the file contains labels.
2 example gnuplot scripts added to plot the data.
This patch introduces 'nfq_set_mark' which is new rules option. If a packet
matches a rule using nfq_set_mark in NFQ mode, it is marked with the mark/mask
specified in the option during the verdict.
It is thus possible to trigger different behaviour on the packet inside
Linux/Netfilter.
Use the --dag <dagname> cmd line option to specify from which DAG card to read pkts
from.
Issue at the moment with pkts being ejected during shutdown -- at the moment we
ignore any packets that are not of link type Ethernet.
Stateful detection for app layer detection keywords, except uricontent. Stores it's partial results in the flow structure. Other modifications:
- Generalize transaction tracking, logging and inspection.
- Adapt http and dcerpc to use the new transaction handling.
- Stream engine now always notifies app layer of a stream eof.
This commit fixes bug #124.
Add support for reporting alerts to the Prelude SIEM system, using
libprelude to send IDMEF (RFC4765) messages.
Each message contains the alert description and reference (using
the SID/GID), and a normalized description (assessment, impact,
sources etc.)
libprelude handles the connection with the manager (collecting component),
spooling and sending the event asynchronously. It also offers transport
security (using TLS and trusted certificates) and reliability (events
are retransmitted if not sent successfully).
This modules requires a Prelude profile to work (see man prelude-admin
and the Prelude Handbook for help).
Signed-off-by: Pierre Chifflier <chifflier@edenwall.com>
Hello,
Below is a patch that gets "make distcheck" working. Its against the
current code in git. The project version was set to 0.1 in configure,
I changed that to 0.8.1 just so its actually relevant. You might want
to set that to something else.
After checking this patch, I find that there are several source code
files in src/ that are not getting compiled:
-app-layer-detect.c
-app-layer-detect.h
-app-layer-http.c
-reputation.h
Are these new or abandoned? Anyways...here's the patch.
-Steve