Threads are only set to paused upon initialization and never again, so
we should only have to wait once. Move the wait to before any loop that
was previously waiting.
Additionally, if the thread was killed while waiting to be unpaused,
don't enter the loop.
The pattern of checking the pause flag, setting the state to paused and
then waiting to be unpaused was repeated often enough to factor it out
into its own function. This function is also needed by library users who
bring their own packet acquisition threads.
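A minimal sketch of what such a helper could look like, using plain
pthread primitives rather than the actual Suricata types (the struct and
function names here are illustrative only):

#include <pthread.h>
#include <stdbool.h>

typedef struct ThreadCtl_ {
    pthread_mutex_t m;
    pthread_cond_t cond;
    bool pause;          /* set once at init, cleared on unpause */
    bool kill;           /* set when the thread is asked to exit */
} ThreadCtl;

/* Returns false if the thread was killed while waiting, so the caller
 * can skip entering its packet loop. */
static bool ThreadWaitForUnpause(ThreadCtl *tc)
{
    pthread_mutex_lock(&tc->m);
    while (tc->pause && !tc->kill)
        pthread_cond_wait(&tc->cond, &tc->m);
    bool killed = tc->kill;
    pthread_mutex_unlock(&tc->m);
    return !killed;
}

A capture thread would call this once before entering its main loop and
bail out if it returns false.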
Remove the call to SCDropCaps for packet processing threads. The logic
in this function is required to set up packet processing even when the
thread is provided by a library user, in which case Suricata should not
be touching its capabilities.
As SCDropCaps is currently a no-op, it's clear this feature needs to be
(re)designed properly, taking library users into consideration as well.
Related ticket: https://redmine.openinfosecfoundation.org/issues/2375
In certain conditions, it can take a long time for threads to start up.
For example in af-packet, setting up the socket, rings, etc has been
observed to take close to half a second per thread, and since the
threads go one by one in a preset order, this means the start up can
take a lot of time if there are many threads. The old logic simply
allowed a hard-coded 60s, which was not always enough when the number of
threads was high.
This patch makes the wait time take the number of threads into account.
It adds a second of time budget to the base 60s for each thread.
So as an example, if a system has 112 af-packet threads, it would wait
172 seconds (60 + 112) for the threads to get ready.
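As a sketch of that budget calculation (the function name is made up for
the example):

#include <stdint.h>

/* base 60s plus one extra second of budget per thread */
static uint32_t ThreadStartupBudgetSecs(uint32_t nthreads)
{
    return 60 + nthreads; /* 112 threads -> 172 seconds */
}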
Ticket: #7048.
When starting a large number of threads, the loop was inefficient. It
would loop over the threads and, if one wasn't yet ready, would sleep a
bit and then reevaluate all the threads. This reevaluation of threads
already checked was inefficient, and could lead to the time budget
running out.
This patch splits the check, and keeps track of the threads that have
already passed. This avoids the rescanning of already checked threads.
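A simplified sketch of the single-pass check (names and the polling
interval are illustrative; the real code also enforces the time budget
described above):

#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

static void WaitForThreadsReady(volatile bool *ready, size_t nthreads)
{
    size_t checked = 0;
    while (checked < nthreads) {
        if (ready[checked]) {
            checked++;      /* done with this thread, never rescan it */
        } else {
            usleep(1000);   /* not ready yet: wait and retry this one */
        }
    }
}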
In offline mode, a timestamp is kept per thread, and the lowest
timestamp of the active threads is used. This also considered the
non-packet threads, which could lead to the used timestamp being further
behind than needed. This would happen at the start of the program, as
the non-packet threads were set up the same way as the packet threads.
This patch no longer sets up the timestamp for non-packet threads, and
no longer considers non-packet threads during timestamp retrieval.
Fixes: 6f560144c1 ("time: improve offline time handling")
Bug: #7034.
Ensure that the mutex protecting the condition variable is held before
signaling it. This ensures that the thread(s) awaiting the signal are
notified.
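In pthread terms the pattern is (independent of the Suricata-specific
types):

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool done = false;

static void SignalDone(void)
{
    pthread_mutex_lock(&m);     /* hold the mutex around the state change */
    done = true;
    pthread_cond_signal(&cond); /* the waiter cannot miss the wakeup now */
    pthread_mutex_unlock(&m);
}

static void WaitForDone(void)
{
    pthread_mutex_lock(&m);
    while (!done)
        pthread_cond_wait(&cond, &m);
    pthread_mutex_unlock(&m);
}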
Issue: 6569
When the decoders encounter multiple layers of tunneling, multiple tunnel
packets are created. These are then stored in ThreadVars::decode_pq, where
they are processed after the current thread "slot" is done. However, due
to a logic error, the tunnel packets after the first were not run from
the correct position in the packet pipeline. This would lead to these
packets not going through the FlowWorker module, skipping everything
from flow tracking and detection to logging.
This would only happen in the single and workers runmodes, due to how the pipelines
are constructed.
The "slot" holding the decoder would contain 2 packets in
ThreadVars::decode_pq. It would then call the pipeline on the first
packet with the next slot of the pipeline through an indirect call to
TmThreadsSlotVarRun(), so it would be called for the FlowWorker.
However, when that first (the innermost) packet was done, the call
to TmThreadsSlotVarRun() would again service the ThreadVars::decode_pq
and process it, again moving the slot pointer forward, so past the
FlowWorker.
This patch addresses the issue by making sure only a "decode" thread
slot will service the ThreadVars::decode_pq, thus never moving the
slot past the FlowWorker.
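A much simplified sketch of the idea (the types and helper below are
illustrative, not the actual tm-threads code):

#include <stdbool.h>

typedef struct Slot_ {
    bool is_decode;             /* true only for the decode slot */
    struct Slot_ *next;
} Slot;

typedef struct PacketQueue_ {
    int len;
} PacketQueue;

/* run each queued tunnel packet through the pipeline starting at
 * next_slot; body omitted in this sketch */
static void ServiceDecodePQ(PacketQueue *pq, Slot *next_slot)
{
    (void)pq;
    (void)next_slot;
}

/* Only the decode slot drains decode_pq. Later slots, such as the
 * FlowWorker, leave it alone, so tunnel packets always re-enter the
 * pipeline at the slot right after the decoder. */
static void SlotPostProcess(const Slot *s, PacketQueue *decode_pq)
{
    if (s->is_decode && decode_pq->len > 0)
        ServiceDecodePQ(decode_pq, s->next);
}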
Bug: #6402.
Flow house keeping can accumulate work that wasn't taken into account
during shutdown. This could lead to flows still being in the flowworker
thread context when it was freed, leading to missed work and
memory leaks.
This patch adds a new way of checking if a thread module is still
busy.
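Conceptually the check looks something like this (names are illustrative;
the real hook lives in the thread module interface):

#include <stdbool.h>

typedef struct FlowQueue_ {
    int len;                    /* flows queued for house keeping */
} FlowQueue;

typedef struct FlowWorkerCtx_ {
    FlowQueue work;
} FlowWorkerCtx;

/* shutdown keeps draining the thread as long as this returns true */
static bool FlowWorkerIsBusy(const FlowWorkerCtx *ctx)
{
    return ctx->work.len > 0;
}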
Bug: #6062.
Until this patch, the logic the flow worker used for flow house keeping was:
- at most 2 flows are handled per packet
- pseudo packets could flush the entire queue
This patch changes that. Pseudo packets are fairly common, and can lead
to packet stalls / latency spikes if the number of flows in the queue
is large.
It does that by adding a new packet type only used at shutdown, which
flushes out the queues completely. All other packets will now stick
to the 2 flow rate limit.
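A sketch of the resulting rate limiting (names and the budget constant
follow the description above, not the exact code):

#include <stdbool.h>

typedef struct FlowQueue_ {
    int len;
} FlowQueue;

static void HouseKeepOneFlow(FlowQueue *q)
{
    if (q->len > 0)
        q->len--;
}

/* normal packets handle at most 2 flows; only the special shutdown
 * "flush" packet drains the whole queue */
static void FlowHouseKeeping(FlowQueue *q, bool is_flush_packet)
{
    int budget = is_flush_packet ? q->len : 2;
    while (budget > 0 && q->len > 0) {
        HouseKeepOneFlow(q);
        budget--;
    }
}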
Enforce the assumption that packets in ThreadVars::decode_pq have no
flow attached to them, as a flow is only attached to a packet while it
is in the FlowWorker.
Issue: 5718
This commit switches the majority of time handling to a new type --
SCTime_t -- which is a 64 bit container for time:
- 44 bits -- seconds
- 20 bits -- useconds
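As a sketch of that layout (not the verbatim definition):

#include <stdint.h>

typedef struct {
    uint64_t secs : 44;     /* seconds: enough headroom far past 2038 */
    uint64_t usecs : 20;    /* microseconds: 0..999999 fits in 20 bits */
} SCTime_t;

#define SCTIME_FROM_SECS(s) ((SCTime_t){ .secs = (s), .usecs = 0 })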
Tested on Fedora 37 with clang 15.
app-layer.c:1055:27: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
void AppLayerSetupCounters()
^
void
app-layer.c:1176:29: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
void AppLayerDeSetupCounters()
^
void
2 errors generated.
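The fix is to turn these into proper prototypes by adding an explicit
void parameter list:

/* before: declares a function with an unspecified parameter list */
void AppLayerSetupCounters();

/* after: a real prototype that takes no arguments */
void AppLayerSetupCounters(void);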
Each module (thread) updates its status to indicate it is running.
The main thread waits for all threads to be in a running state
before continuing the initialisation process.
Implements feature 5384
(https://redmine.openinfosecfoundation.org/issues/5384)
Issue: 4550
This commit adjusts the per-thread stack size if a size has been
configured. If the setting has not been configured, the default
per-thread stack size provided by the runtime mechanisms is used.
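A minimal sketch of how that can be done with pthreads (the
configuration lookup is left out; names are illustrative):

#include <pthread.h>
#include <stddef.h>

/* apply a configured stack size only when one was set; otherwise the
 * attribute keeps the runtime's default */
static int ThreadAttrInit(pthread_attr_t *attr, size_t configured_stack_size)
{
    int r = pthread_attr_init(attr);
    if (r != 0)
        return r;
    if (configured_stack_size != 0)
        r = pthread_attr_setstacksize(attr, configured_stack_size);
    return r;
}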
In a scenario where there was suddenly no more traffic flowing, flows
in a thread's `flow_queue` would not be processed. The easiest way to
see this is in a traffic replay scenario: after the replay is done, no
more packets come in and these evicted flows get stuck.
In workers mode the capture part handles timeouts; this was updated to
take `ThreadVars::flow_queue` into account.
In autofp mode the logic that puts a flow into a thread's `flow_queue`
would already wake a thread up, but the `flow_queue` was then ignored.
This has been updated to take the `flow_queue` into account.
In both cases a "capture timeout" packet is pushed through the pipeline
to "flush" the queues.
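In both runmodes the idea reduces to something like this sketch (names
are illustrative):

#include <stdbool.h>

typedef struct FlowQueue_ {
    int len;
} FlowQueue;

/* placeholder for pushing a "capture timeout" pseudo packet through
 * the pipeline */
static void InjectCaptureTimeoutPacket(void)
{
}

/* on a capture timeout, a non-empty flow_queue is now reason enough
 * to flush, so evicted flows don't get stuck when traffic stops */
static void OnCaptureTimeout(const FlowQueue *flow_queue)
{
    if (flow_queue->len > 0)
        InjectCaptureTimeoutPacket();
}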
Bug: #4722.
Goals:
- reduce locking
- take advantage of 'hot' caches
- better locality
Locking reduction
New flow spare pool. The global pool is implemented as a list of blocks,
where each block has 100 spare flows. Worker threads fetch a block at
a time, storing the block in the local thread storage.
Flow Recycler now returns flows to the pool in blocks as well.
Flow Recycler fetches all flows to be processed in one step instead of
one at a time.
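A rough sketch of the block-based pool (sizes and names follow the
description above, not the exact implementation):

#include <stddef.h>
#include <pthread.h>

#define SPARE_BLOCK_SIZE 100

typedef struct Flow_ {
    struct Flow_ *next;
} Flow;

/* a block of up to 100 spare flows; workers take and return whole
 * blocks, so the global lock is touched once per block instead of
 * once per flow */
typedef struct FlowSpareBlock_ {
    struct FlowSpareBlock_ *next;
    size_t len;
    Flow *flows;
} FlowSpareBlock;

typedef struct FlowSparePool_ {
    pthread_mutex_t m;
    FlowSpareBlock *blocks;     /* global list of blocks */
} FlowSparePool;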
Cache 'hot'ness
Worker threads now check the timeout of flows they evaluate during lookup.
The worker will have to read the flow into cache anyway, so the added
overhead of checking the timeout value is minimal. When a flow is considered
timed out, one of 2 things happens:
- if the flow is 'owned' by the thread it is handled locally. Handling means
checking if the flow needs 'timeout' work.
- otherwise, the flow is added to a special 'evicted' list in the flow
bucket where it will be picked up by the flow manager.
Flow Manager timing
By default the flow manager now tries to do passes of the flow hash in
smaller steps, where the goal is to do a full pass in 8 x the lowest timeout
value it has to enforce. So if the lowest timeout value is 30s, a full pass
will take 4 minutes. The goal here is to reduce locking overhead and not
get in the way of the workers.
In emergency mode each pass is full, and lower timeouts are used.
Timing of the flow manager also no longer relies on pthread condition
variables, as these generally cause wakeups much sooner than the desired
timeout. Instead a simple (u)sleep loop is used.
Both changes reduce the number of hash passes a lot.
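Sketch of the pacing described above (names are illustrative):

#include <stdint.h>
#include <unistd.h>

/* a full hash pass is budgeted at 8x the lowest timeout, e.g. a 30s
 * lowest timeout gives a 240s (4 minute) pass */
static uint32_t FullPassBudgetSecs(uint32_t lowest_timeout_secs)
{
    return 8 * lowest_timeout_secs;
}

/* pace with a plain usleep loop instead of a condition variable */
static void FlowManagerSleep(uint64_t usecs, volatile int *run)
{
    while (usecs > 0 && *run) {
        uint64_t step = usecs > 100000 ? 100000 : usecs;  /* 100ms slices */
        usleep((unsigned int)step);
        usecs -= step;
    }
}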
Emergency behavior
In emergency mode there are a number of changes to the workers. In this scenario
the flow memcap is fully used up and it is unavoidable that some flows won't
be tracked.
1. flow spare pool fetches are reduced to once a second. This avoids locking
   overhead in a situation where the chance of success was very low anyway.
2. getting an active flow directly from the hash skips flows that had very
recent activity to avoid the scenario where all flows get only into the
NEW state before getting reused. Rather allow some to have a chance of
completing.
3. TCP packets that are not SYN packets will not get a used flow, unless
stream.midstream is enabled. The goal here is again to avoid evicting
active flows unnecessarily.
Better Locality
Flow Manager injects flows into the worker threads now, instead of one or
two packets. The advantage of this is that the worker threads can get packets
from their local packet pools, avoiding the constant overhead of packets returning
to 'foreign' pools.
Counters
A lot of flow counters have been added and some have been renamed.
Overall the worker threads increment 'flow.wrk.*' counters, while the flow
manager increments 'flow.mgr.*'.
Additionally, none of the counters are snapshots anymore; they all increment
over time. The flow.memuse and flow.spare counters are exceptions.
Misc
FlowQueue has been split into a FlowQueuePrivate (unlocked) and FlowQueue.
Flow no longer has 'prev' pointers and uses a unified 'next' pointer for
both hash and queue use.
Replaces all patterns of SCLogError() followed by exit() with
FatalError(). Cocci script to do this:
@@
constant C;
constant char[] msg;
@@
- SCLogError(C,
+ FatalError(SC_ERR_FATAL,
msg);
- exit(EXIT_FAILURE);
Closes redmine ticket 3188.
Support either the __thread GNUism or the C11 _Thread_local.
Use 'thread_local' to point to the one that is used. Convert existing
__thread users to 'thread_local'.
Remove non-thread-local code from the packet pool code.
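The selection boils down to something like this (the exact conditionals
in the real header may differ):

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define thread_local _Thread_local   /* C11 keyword */
#else
#define thread_local __thread        /* GNU extension */
#endif

/* one instance of this variable exists per thread */
static thread_local int worker_id;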
Until now both atomic ints and pointers were initialized by SC_ATOMIC_INIT
by setting them to 0. However, C11's atomic pointer type cannot be
initialized this way without causing compiler warnings.
As a preparation to supporting C11's atomics, this patch introduces a
new macro to initialize atomic pointers and updates the relevant callers
to use it.
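Conceptually (macro names here are illustrative, not the actual
SC_ATOMIC_* API):

#include <stdatomic.h>
#include <stddef.h>

struct Flow;    /* opaque for the example */

#define EX_ATOMIC_INIT(a)    atomic_store(&(a), 0)     /* ints: still 0 */
#define EX_ATOMIC_INITPTR(a) atomic_store(&(a), NULL)  /* pointers: NULL */

static atomic_uint flow_count;
static _Atomic(struct Flow *) flow_head;

static void CountersInit(void)
{
    EX_ATOMIC_INIT(flow_count);
    EX_ATOMIC_INITPTR(flow_head);
}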
This patch addresses two problems.
First, various parts of the engine, but most notably the flow manager (FM),
use the minimum of the packet threads' notion of time. This did not,
however, take into account the scenario where one or more of these
threads could be inactive for prolonged times. This could lead to the
time used by the FM getting stale.
This is addressed by keeping track of the last time the per thread packet
timestamp was updated, and only considering it for the 'minimum' when it
is reasonably current.
Second, there was a minor race condition at start up, where the FM would
already inspect the hash table(s) while the packet threads weren't active
yet. Since FM gets the time from the packet threads, it would use a bogus
time of 0.
This is addressed by adding a wait loop to the start of the FM that waits
for 'time' to get ready.
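The wait loop amounts to something like this (illustrative; the real
check looks at the per thread timestamps mentioned above):

#include <stdbool.h>
#include <unistd.h>

/* true once at least one packet thread has published a usable,
 * reasonably current timestamp; stubbed out for this sketch */
static bool PacketTimeIsReady(void)
{
    return true;
}

/* the flow manager waits here before its first hash pass, so it never
 * works with a bogus time of 0 */
static void FlowManagerWaitForTime(volatile int *run)
{
    while (*run && !PacketTimeIsReady()) {
        usleep(100);
    }
}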