Add the modbus.function and subfunction) keywords for public function match in rules (Modbus layer).
Matching based on code function, and if necessary, sub-function code
or based on category (assigned, unassigned, public, user or reserved)
and negation is permitted.
Add the modbus.access keyword for read/write Modbus function match in rules (Modbus layer).
Matching based on access type (read or write),
and/or function type (discretes, coils, input or holding)
and, if necessary, read or write address access,
and, if necessary, value to write.
For address and value matching, "<", ">" and "<>" is permitted.
Based on TLS source code and file size source code (address and value matching).
Signed-off-by: David DIALLO <diallo@et.esia.fr>
Decode Modbus request and response messages, and extracts
MODBUS Application Protocol header and the code function.
In case of read/write function, extracts message contents
(read/write address, quantity, count, data to write).
Links request and response messages in a transaction according to
Transaction Identifier (transaction management based on DNS source code).
MODBUS Messaging on TCP/IP Implementation Guide V1.0b
(http://www.modbus.org/docs/Modbus_Messaging_Implementation_Guide_V1_0b.pdf)
MODBUS Application Protocol Specification V1.1b3
(http://www.modbus.org/docs/Modbus_Application_Protocol_V1_1b3.pdf)
Based on DNS source code.
Signed-off-by: David DIALLO <diallo@et.esia.fr>
This patch adds a new Log API for streaming data such as TCP reassembled
data and HTTP body data. It could also replace Filedata API.
Each time a new chunk of data is available, the callback will be called.
EVE logging is a really good substitute for pcapinfo. Suriwire is
now supporting EVE output so it is not anymore necessary to have
pcapinfo in Suricata.
Remove local copies from each MPM file and use include file instead.
Might be better to also add util-memcpy.c rather than inlining it each time,
to get smaller code, since only seems to be used at initialization.
When running on a TILEncore-Gx PCIe card, setting the filetype of fast.log
to pcie, will open a connection over PCIe to a host application caleld
tile-pcie-logd, that receives the alert strings and writes them to a file
on the host. The file name to open is also passed over the PCIe link.
This allows running Suricata on the TILEncore-Gx PCIe card, but have the
alerts logged to the host system's file system efficiently. The PCIe API that
is used is the Tilera Packet Queue (PQ) API which can access PCIe from User
Space, thus avoiding system calls.
Created util-logopenfile-tile.c and util-logopen-tile.h for the TILE
specific PCIe logging functionality.
Using Write() and Close() function pointers in LogFileCtx, which
default to standard write and close for files and sockets, but are
changed to PCIe write and close functions when a PCIe channel is
openned for logging.
Moved Logging contex out of tm-modules.h into util-logopenfile.h,
where it makes more sense. This required including util-logopenfile.h
into a couple of alert-*.c files, which previously were getting the
definitions from tm-modules.h.
The source and Makefile for tile-pcie-logd are added in contrib/tile-pcie-logd.
By default, the file name for fast.log specified in suricata.yaml is used as
the filename on the host. An optional argument to tile-pcie-logd, --prefix=,
can be added to prepend the supplied file path. For example, is the file
in suricata.yaml is specified as "/var/log/fast.log" and --prefix="/tmp",
then the file will be written to "/tmp/var/log/fast.log".
Check for TILERA_ROOT environment variable before building tile_pcie_logd
Building tile_pcie_logd on x86 requires the Tilera MDE for its PCIe libraries
and API header files. Configure now checs for TILERA_ROOT before enabling
builing tile_pcie_logd in contrib/tile_pcie_logd
- restore json output-file.[ch] as output-json-file.[ch] after rebase conflict
- fix Makefile.am after merge conflict
- some dev-log-api-v4.0 rebase json fallout cleanup
A new logger API for registering file storage handlers. Where the
FileLog handler is called once per file, this handler will be called
for each data chunk so that storing the entire file is possible.
The logger call in the API is as follows:
typedef int (*FiledataLogger)(ThreadVars *, void *thread_data,
const Packet *, const File *, const FileData *, uint8_t flags);
All data is const, thus should be read only. The final flags field
is used to indicate to the caller that the file is new, or if it's
being closed.
Files use an internal unique id 'file_id' which can be used by the
loggers to create unique file names. This id can use the 'waldo'
feature of the log-filestore module. This patch moves that waldo
loading and storing logic to this API's implementation. A new
configuration directive 'file-store-waldo: <filename>' is added,
but the existing waldo settings will also continue to work.
This patch introduces a new logging API for logging extracted file info.
It allows for registration of a callback that is called once per file:
when it's considered 'closed'.
Users of this API register their Log Function through:
OutputRegisterFileModule()
The API uses a magic settings globally. This might be changed later.
This patch introduces a new API for logging transactions from
tx-aware app layer protocols. It runs all the registered loggers
from a single thread module. This thread module takes care of the
transaction handling and flow locking. The logger just gets a
transaction to log out.
All loggers for a protocol will be run at the same time, so there
will not be any timing differences.
Loggers will no longer act as Thread Modules in the strictest sense.
The Func is NULL, and SetupOuputs no longer attaches them to the
thread module chain individually. Instead, after registering through
OutputRegisterTxModule, the setup data is used in the single logging
module.
The logger (LogFunc) is called for each transaction once, at the end
of the transaction.
This patch introduces a new API for outputs that log based on the
packet, such as alert outputs. In converts fast-log to the new API.
The API gets rid of the concept of each logger being a thread module,
but instead there is one thread module that runs all packet loggers.
Through the registration function OutputRegisterPacketModule a log
module can register itself to be considered for each packet.
Each logger registers itself to this new API with 2 functions and the
OutputCtx object that was already used in the old implementation.
The function pointers are:
LogFunc: the log function
ConditionFunc: this function is called before the LogFunc and only
if this returns TRUE the LogFunc is called.
For a simple alert logger like fast-log, the condition function will
simply return TRUE if p->alerts.cnt > 0.
Move app layer event handling into app-layer-event.[ch].
Convert 'Set' macro's to functions.
Get rid of duplication in Set and SetRaw. Set now calls SetRaw.
Fix potentential int overflow condition in the event storage.
Update callers.
This patch introduces wrapper functions around allocation functions
to be able to have a global HTP memcap. A simple subsitution of
function was not enough because allocated size needed to be known
during freeing and reallocation.
The value of the memcap can be set in the YAML and is left by default
to unlimited (0) to avoid any surprise to users.
Split threads.h into several files, where each of these files defines
all lock types and macro's.
threads.h defines the normal case
threads-debug.h defines the debug variants
threads-profile.h defines the lock profiling variants
Finally, threads-arch-tile.h moves the Tilera specifics out
Move SIMD the implementations of SigMatchSignaturesBuildMatchArray()
for SSE3 and Tile out of detect.c to reduce the size of the file.
Also moved SIMD unit tests to detect-simd.c
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
utilization and performance.
The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.
A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.
Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.
The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.
The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift. It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.
Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16. This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.
The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.
Future optimization:
The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.
Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.
All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.
The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
This patch is using the qa/log directory to store the output
of the check. In case of success, the directory is deleted.
In case of failure, the directory remains in place.
This should fixes#910.
The list-keyword and app-layer listing code was spread over all the
init code. This patch introduces a separate file to store non standard
running mode like these ones.
The TILE-Gx processor includes a packet processing engine, called
mPIPE, that can deliver packets directly into user space memory. It
handles buffer allocation and load balancing (either static 5-tuple
hashing, or dynamic flow affinity hashing are used here). The new
packet source code is in source-mpipe.c and source-mpipe.h
A new Tile runmode is added that configures the Suricata pipelines in
worker mode, where each thread does the entire packet processing
pipeline. It scales across all the Gx chips sizes of 9, 16, 36 or 72
cores. The new runmode is in runmode-tile.c and runmode-tile.h
The configure script detects the TILE-Gx architecture and defines
HAVE_MPIPE, which is then used to conditionally enable the code to
support mPIPE packet processing. Suricata runs on TILE-Gx even without
mPIPE support enabled.
The Suricata Packet structures are allocated by the mPIPE hardware by
allocating the Suricata Packet structure immediatley before the mPIPE
packet buffer and then pushing the mPIPE packet buffer pointer onto
the mPIPE buffer stack. This way, mPIPE writes the packet data into
the buffer, returns the mPIPE packet buffer pointer, which is then
converted into a Suricata Packet pointer for processing inside
Suricata. When the Packet is freed, the buffer is returned to mPIPE's
buffer stack, by setting ReleasePacket to an mPIPE release specific
function.
The code checks for the largest Huge page available in Linux when
Suricata is started. TILE-Gx supports Huge pages sizes of 16MB, 64MB,
256MB, 1GB and 4GB. Suricata then divides one of those page into
packet buffers for mPIPE.
The code is not yet optimized for high performance. Performance
improvements will follow shortly.
The code was originally written by Tom Decanio and then further
modified by Tilera.
This code has been tested with Tilera's Multicore Developement
Environment (MDE) version 4.1.5. The TILEncore-Gx36 (PCIe card) and
TILEmpower-Gx (1U Rack mount).
In preparation of the libhtp upgrade, move all libhtp related conditionals
to configure. This allows for one set of build scripts that works regardless
of the presence of a local libhtp dir.
Adds support for match-on conditions (src, dst, any, both)
Uses GEOIP_MEMORY_CACHE for performance reasons
Adds support for negation and multiple countries in the same rule
Bug fixes
Changed to take flow direction from rule, if present
Comments addressed. Unit tests added.
This patch introduces a unix command socket. JSON formatted messages
can be exchanged between suricata and a program connecting to a
dedicated socket.
The protocol is the following:
* Client connects to the socket
* It sends a version message: { "version": "$VERSION_ID" }
* Server answers with { "return": "OK|NOK" }
If server returns OK, the client is now allowed to send command.
The format of command is the following:
{
"command": "pcap-file",
"arguments": { "filename": "smtp-clean.pcap", "output-dir": "/tmp/out" }
}
The server will try to execute the "command" specified with the
(optional) provided "arguments".
The answer by server is the following:
{
"return": "OK|NOK",
"message": JSON_OBJECT or information string
}
A simple script is provided and is available under scripts/suricatasc. It
is not intended to be enterprise-grade tool but it is more a proof of
concept/example code. The first command line argument of suricatasc is
used to specify the socket to connect to.
Configuration of the feature is made in the YAML under the 'unix-command'
section:
unix-command:
enabled: yes
filename: custom.socket
The path specified in 'filename' is not absolute and is relative to the
state directory.
A new running mode called 'unix-socket' is also added.
When starting in this mode, only a unix socket manager
is started. When it receives a 'pcap-file' command, the manager
start a 'pcap-file' running mode which does not really leave at
the end of file but simply exit. The manager is then able to start
a new running mode with a new file.
To start this mode, Suricata must be started with the --unix-socket
option which has an optional argument which fix the file name of the
socket. The path is not absolute and is relative to the state directory.
THe 'pcap-file' command adds a file to the list of files to treat.
For each pcap file, a pcap file running mode is started and the output
directory is changed to what specified in the command. The running
mode specified in the 'runmode' YAML setting is used to select which
running mode must be use for the pcap file treatment.
This requires modification in suricata.c file where initialisation code
is now conditional to the fact 'unix-socket' mode is not used.
Two other commands exists to get info on the remaining tasks:
* pcap-file-number: return the number of files in the waiting queue
* pcap-file-list: return the list of waiting files
'pcap-file-list' returns a structured object as message. The
structure is the following:
{
'count': 2,
'files': ['file1.pcap', 'file2.pcap']
}
This patch creates a pid file per default and use it to avoid to be
able to run two Suricata. Separate pid file have to be provided to
be able to do it.
Removed the Napatech 2GD support
runmode-napatech-3gd.c had an include from runmode-napatech.h which was erroneous and has been removed as well.
Signed-off-by: Matt Keeler <mk@npulsetech.com>
For use with Network Cards from Napatech utilizing the 3GD driver/api.
- Implemented new run modes in runmode-napatech-3gd.*
- Implemented capture/decode threads in source-napatech-3gd.*
- Integrated the new run modes and source into the build infrastructure.
New configure switches
--enabled-napatech-3gd : Turns on the NT 3GD support
--with-napatech-3gd-includes : The directory containing the NT 3GD header files
--with-napatech-3gd-libraries : The directory containing the NT 3GD libraries to link against.
New CLI switch
--napatech-3gd : Uses the Napatech 3GD run mode
Runmodes Supported:
- auto
- autofp
- workers
Notes:
- tested with 1 Gbps sustained traffic (no drops)
Signed-off-by: Matt Keeler <mk@npulsetech.com>
This patch adds a l3_proto keyword to the signature language. It
can be used to specify if the signature has to match on IPv4, IPv6
or both. For example, one can write:
alert http any any -> any 22 (msg: "HTTP v6"; l3_proto:ip6; sid:14;)
This should close#494.
Creation of the log-tlslog file in order to log tls message.
Need to add some information into suricata.yaml to work.
- tls-log:
enabled: yes # Log TLS connections.
filename: tls.log # File to store TLS logs.
This patch should fix#480 by adding the support of Teredo tunnel.
The IPv6 content of the tunnel will be parsed in a similar way as
what is done the GRE tunnel. Signatures will then be matched on the
IPv6 content.
Found here:
http://burtleburtle.net/bob/hash/doobs.htmlhttp://burtleburtle.net/bob/c/lookup3.c
From the file header:
lookup3.c, by Bob Jenkins, May 2006, Public Domain.
These are functions for producing 32-bit hashes for hash table lookup.
hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
are externally useful functions. Routines to test the hash are included
if SELF_TEST is defined. You can use this free for any purpose. It's in
the public domain. It has no warranty.
Add a decoder for the SERVER_CERTIFICATE during a TLS handshake, extracts the
certificates and keep the subject name.
Add the tls.subject keyword for substring match in rules (TLS layer).
Signed-off-by: Pierre Chifflier <pierre.chifflier@ssi.gouv.fr>
Add profiling per lock location in the code. Accounts how often a
lock is requested, how often it was contended, the max number of
ticks spent waiting for it, avg number of ticks waiting for it and
the total ticks for that location.
Added a new configure flag --enable-profiling-locks to enable this
feature.
Add a host table similar to the flow table. A hash using fine grained
locking. Flow manager for now takes care of book keeping / garbage
collecting.
Tag subsystem now uses this for host based tagging instead of the
global tag hash table. Because the latter used a global lock and the
new code uses very fine grained locking this patch should improve
scalability.
The http_server_body content modifier modifies the previous content to inspect
the normalized (dechunked, unzipped) http_server_body. The workings are similar
to http_client_body. Additionally, a new pcre flag was introduced "/S".
To facilitate this change the signature flags field was changed to be 64 bit.
This patch adds a new alert format called pcap-info. It aims at
providing an easy to parse one-line per-alert format containing
the packet id in the parsed pcap for each alert. This permit to
add information inside the pcap parser.
This format is made to be used with suriwire which is a plugin for
wireshark. Its target is to enable the display of suricata results
inside wireshark.
This format doesn't use append mode per default because a clean file
is needed to operate with wireshark.
The format is a list of values separated by ':':
Packet number:GID of matching signature:SID of signature:REV of signature:Flow:To Server:To Client:0:0:Message of signature
The two zero are not yet used values. Candidate for usage is the
part of the packet that matched the signature.
This patch adds support for the replace keyword. It is used with
content to change selected part of the payload. The major point
with this patch is that having a replace keyword made necessary
to avoid all stream level check because we need to access to the
could-be-modified packet payload.
One of the main difficulty is to handle complex signature. If there is
other content check, we must do the substitution when we're sure all
match are valid. The patch adds an attribute to the thread context
variable to be able to deal with recursivity of the match function.
Replace is only activated in IPS mode and apply only to raw match.
Per packet per app layer parser profiling. Example summary output:
Per App layer parser stats:
App Layer IP ver Proto cnt min max avg
-------------------- ------ ----- ------ ------ ---------- -------
ALPROTO_HTTP IPv4 6 163394 126 38560320 42814
ALPROTO_FTP IPv4 6 644 117 26100 2566
ALPROTO_TLS IPv4 6 670 117 7137 799
ALPROTO_SMB IPv4 6 114794 126 225270 957
ALPROTO_DCERPC IPv4 6 5207 126 25596 1266
Also added to the csv out.
In the csv out there is a new column "stream (no app)" that removes the
app layer parsers from the stream tracking. So raw stream engine performance
becomes visible.
Per packet profiling uses tick based accounting. It has 2 outputs, a summary
and a csv file that contains per packet stats.
Stats per packet include:
1) total ticks spent
2) ticks spent per individual thread module
3) "threading overhead" which is simply calculated by subtracting (2) of (1).
A number of changes were made to integrate the new code in a clean way:
a number of generic enums are now placed in tm-threads-common.h so we can
include them from any part of the engine.
Code depends on --enable-profiling just like the rule profiling code.
New yaml parameters:
profiling:
# packet profiling
packets:
# Profiling can be disabled here, but it will still have a
# performance impact if compiled in.
enabled: yes
filename: packet_stats.log
append: yes
# per packet csv output
csv:
# Output can be disabled here, but it will still have a
# performance impact if compiled in.
enabled: no
filename: packet_stats.csv
Example output of summary stats:
IP ver Proto cnt min max avg
------ ----- ------ ------ ---------- -------
IPv4 6 19436 11448 5404365 32993
IPv4 256 4 11511 49968 30575
Per Thread module stats:
Thread Module IP ver Proto cnt min max avg
------------------------ ------ ----- ------ ------ ---------- -------
TMM_DECODEPCAPFILE IPv4 6 19434 1242 47889 1770
TMM_DETECT IPv4 6 19436 1107 137241 1504
TMM_ALERTFASTLOG IPv4 6 19436 90 1323 155
TMM_ALERTUNIFIED2ALERT IPv4 6 19436 108 1359 138
TMM_ALERTDEBUGLOG IPv4 6 19436 90 1134 154
TMM_LOGHTTPLOG IPv4 6 19436 414 5392089 7944
TMM_STREAMTCP IPv4 6 19434 828 1299159 19438
The proto 256 is a counter for handling of pseudo/tunnel packets.
Example output of csv:
pcap_cnt,ipver,ipproto,total,TMM_DECODENFQ,TMM_VERDICTNFQ,TMM_RECEIVENFQ,TMM_RECEIVEPCAP,TMM_RECEIVEPCAPFILE,TMM_DECODEPCAP,TMM_DECODEPCAPFILE,TMM_RECEIVEPFRING,TMM_DECODEPFRING,TMM_DETECT,TMM_ALERTFASTLOG,TMM_ALERTFASTLOG4,TMM_ALERTFASTLOG6,TMM_ALERTUNIFIEDLOG,TMM_ALERTUNIFIEDALERT,TMM_ALERTUNIFIED2ALERT,TMM_ALERTPRELUDE,TMM_ALERTDEBUGLOG,TMM_ALERTSYSLOG,TMM_LOGDROPLOG,TMM_ALERTSYSLOG4,TMM_ALERTSYSLOG6,TMM_RESPONDREJECT,TMM_LOGHTTPLOG,TMM_LOGHTTPLOG4,TMM_LOGHTTPLOG6,TMM_PCAPLOG,TMM_STREAMTCP,TMM_DECODEIPFW,TMM_VERDICTIPFW,TMM_RECEIVEIPFW,TMM_RECEIVEERFFILE,TMM_DECODEERFFILE,TMM_RECEIVEERFDAG,TMM_DECODEERFDAG,threading
1,4,6,172008,0,0,0,0,0,0,47889,0,0,48582,1323,0,0,0,0,1359,0,1134,0,0,0,0,0,8028,0,0,0,49356,0,0,0,0,0,0,0,14337
First line of the file contains labels.
2 example gnuplot scripts added to plot the data.