Fixes nfqueue and delayed-detect.
On systems with small amount of traffic (or with no traffic at all)
nfqueue with 'delayed-detect' enabled hanged in 'workers' mode.
Bug #2362.
Change the decode handler signature to increase the size of its decode
handler, from uint16 to uint32. This is necessary to let suricata use
interfaces with mtu > 65535 (ex: lo interface has default size 65536).
It's necessary to change several primitive for Packet manipulation, to
unify the parameter "packet length" whenever we are before IP decoding.
Add tests before calling DecodeIPVX function to avoid a possible
integer overflow over the len parameter.
Device storage requires the devices to be created after storage
is finalized so we need to first get the list of devices then
create them when the storage is finalized.
This patch introduces the LiveDeviceName structure that is a list
of device name used during registration.
Code uses LiveRegisterDeviceName for pre registration and keep
using the LiveRegisterDevice function for part of the code that
create the interface during the runmode creation.
On MinGW the result of ntohl needs to be casted to uint32_t and
the result of ntohs to uint16_t. To avoid doing this everywhere
add SCNtohl and SCNtohs macros.
Observed:
STARTTLS creates 2 pseudo packets which are tied to a real packet.
TPR (tunnel packet ref) counter increased to 2.
Pseudo 1: goes through 'verdict', increments 'ready to verdict' to 1.
Packet pool return code frees this packet and decrements TPR in root
to 1. RTV counter not changed. So both are now 1.
Pseudo 2: verdict code sees RTV == TPR, so verdict is set based on
pseudo packet. This is too soon. Packet pool return code frees this
packet and decrements TPR in root to 0.
Real packet: TRP is 0 so set verdict on this packet. As verdict was
already set, NFQ reports an issue.
The decrementing of TPR doesn't seem to make sense as RTV is not
updated.
Solution:
This patch refactors the ref count and verdict count logic. The beef
is now handled in the generic function TmqhOutputPacketpool(). NFQ
and IPFW call a utility function VerdictTunnelPacket to see if they
need to verdict a packet.
Remove some unused macro's for managing these counters.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
In case of a tunnel packet, adding a mark to the root packet will have
for consequence to bypass all the flows that are hosted in this tunnel.
This is not the attended behavior and as initial fix let's simply warn
suricata that bypass for NFQ is not possible for this kind of packets.
This patch also fixes a segfault. The root packet was accessed even if it is
NULL causing a NULL dereference:
ASAN:SIGSEGV
=================================================================
==24408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000060 (pc 0x00000076f948 bp 0x7f435c000240 sp 0x7f435c000220 T5)
ASAN:SIGSEGV
==24408==AddressSanitizer: while reporting a bug found another one. Ignoring.
#0 0x76f947 in NFQBypassCallback /home/victor/dev/suricata/src/source-nfq.c:510
#1 0x4d0f02 in PacketBypassCallback /home/victor/dev/suricata/src/decode.c:395
#2 0x7b8a95 in StreamTcpPacket /home/victor/dev/suricata/src/stream-tcp.c:4661
#3 0x7b9ddd in StreamTcp /home/victor/dev/suricata/src/stream-tcp.c:4913
#4 0x68fa50 in FlowWorker /home/victor/dev/suricata/src/flow-worker.c:194
#5 0x7f0abd in TmThreadsSlotVarRun /home/victor/dev/suricata/src/tm-threads.c:128
#6 0x7f2958 in TmThreadsSlotVar /home/victor/dev/suricata/src/tm-threads.c:585
#7 0x7f436368e6f9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76f9)
#8 0x7f4362802b5c in clone (/lib/x86_64-linux-gnu/libc.so.6+0x106b5c)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/victor/dev/suricata/src/source-nfq.c:510 NFQBypassCallback
Thread T5 (W#04) created by T0 (Suricata-Main) here:
#0 0x7f4364ff2253 in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x36253)
#1 0x7f9c48 in TmThreadSpawn /home/victor/dev/suricata/src/tm-threads.c:1843
#2 0x8da7c0 in RunModeSetIPSAutoFp /home/victor/dev/suricata/src/util-runmodes.c:519
#3 0x73e3ff in RunModeIpsNFQAutoFp /home/victor/dev/suricata/src/runmode-nfq.c:74
#4 0x7503fa in RunModeDispatch /home/victor/dev/suricata/src/runmodes.c:382
#5 0x7e5cb3 in main /home/victor/dev/suricata/src/suricata.c:2547
#6 0x7f436271c82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
This patch adds a new callback PktAcqBreakLoop() in TmModule to let
packet acquisition modules define "break-loop" functions to terminate
the capture loop. This is useful in case of blocking functions that
need special actions to take place in order to stop the execution.
Implement this for PF_RING
The check for the return value was wrong, we have 0 for success and 1
(and 2) for the error cases like TM_ECODE_FAILED, so we should quit
unless TM_ECODE_OK (0) is returned for NFQInitThread. This fixes#1870
Using a stack for free Packet storage causes recently freed Packets to be
reused quickly, while there is more likelihood of the data still being in
cache.
The new structure has a per-thread private stack for allocating Packets
which does not need any locking. Since Packets can be freed by any thread,
there is a second stack (return stack) for freeing packets by other threads.
The return stack is protected by a mutex. Packets are moved from the return
stack to the private stack when the private stack is empty.
Returning packets back to their "home" stack keeps the stacks from getting out
of balance.
The PacketPoolInit() function is now called by each thread that will be
allocating packets. Each thread allocates max_pending_packets, which is a
change from before, where that was the total number of packets across all
threads.
Flow-timeout code injects pseudo packets into the decoders, leading
to various issues. For a full explanation, see:
https://redmine.openinfosecfoundation.org/issues/1107
This patch works around the issues with a hack. It adds a check to
each of the decoder entry points to bail out as soon as a pseudo
packet from the flow timeout is encountered.
Ticket #1107.
To be able to register counters from AppLayerGetCtxThread, the
ThreadVars pointer needs to be available in it and thus in it's
callers:
- AppLayerGetCtxThread
- DecodeThreadVarsAlloc
- StreamTcpReassembleInitThreadCtx
This patch adds and increments a invalid packet counter. It
does this by introducing PacketDecodeFinalize function
This function is incrementing the invalid counter and is also
signalling the packet to CUDA.
To be sure to always verdict packets (bug #769), this patch adds
a ReleaseData function to NFQ packets. The release function simply
drop the packet if it has not been verdicted before.
Use test macro instead of direct access to action field.
This patch has been obtained by using the following
spatch file:
@@
Packet *p;
expression E;
@@
- p->action & E
+ TEST_PACKET_ACTION(p, E)
In case of error, errno is set by sendmsg which is called by
nfnetlink and which is called by libnetfilter_queue. This patch
displays the string expression of errno if verdict has failed.
Normally, there is one verdict per packet, i.e., we receive a packet,
process it, and then tell the kernel what to do with that packet (eg.
DROP or ACCEPT).
recv(), packet id x
send verdict v, packet id x
recv(), packet id x+1
send verdict v, packet id x+1
[..]
recv(), packet id x+n
send verdict v, packet id x+n
An alternative is to process several packets from the queue, and then send
a batch-verdict.
recv(), packet id x
recv(), packet id x+1
[..]
recv(), packet id x+n
send batch verdict v, packet id x+n
A batch verdict affects all previous packets (packet_id <= x+n),
we thus only need to remember the last packet_id seen.
Caveats:
- can't modify payload
- verdict is applied to all packets
- nfmark (if set) will be set for all packets
- increases latency (packets remain queued by the kernel
until batch verdict is sent).
To solve this, we only defer verdict for up to 20 packets and
send pending batch-verdict immediately if:
- no packets are currently queue
- current packet should be dropped
- current packet has different nfmark
- payload of packet was modified
This patch adds a configurable batch verdict support for workers runmode.
The batch verdicts are turned off by default.
Problem is that batch verdicts only work with kernels >= 3.1, i.e.
using newer libnetfilter_queue with an old kernel means non-working
suricata. So the functionnality has to be disabled by default.
currently, the packet payload recv()d from the nfqueue netlink
socket is copied into a new packet buffer.
This is required because the recv-buffer space used is tied
to the current thread, but a packet may be handed off to other
threads, and the recv-buffer can be re-used while the packet
is handled by another thread.
However, in worker runmode, the packet will always be handled
by the current thread, and the recv-buffer will only be reused
after the entire packet processing stack is done with the packet.
Thus, in worker runmode, we can avoid the copy and assign
the packet data area directly.