Hyperscan MPM can cache the compiled contexts to files.
The cache directory, however, grows as rulesets change and bloats
the system. This addition prunes stale cache files based
on their file modification timestamp.
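As a rough illustration of age-based pruning, the sketch below walks a directory and unlinks regular files whose modification time is older than a limit. The names `is_stale` and `prune_cache_dir` are hypothetical and are not the functions added by this commit. The sketch also assumes every regular file in the directory is a cache file, which the real code may well restrict further.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* A file is stale when its last modification is more than max_age seconds ago. */
static int is_stale(time_t mtime, time_t now, time_t max_age)
{
    return now > mtime && (now - mtime) > max_age;
}

/* Remove stale regular files in dir (hypothetical helper).
 * Returns the number of files removed, or -1 if dir cannot be opened. */
static int prune_cache_dir(const char *dir, time_t now, time_t max_age)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int removed = 0;
    struct dirent *ent;
    while ((ent = readdir(d)) != NULL) {
        char path[4096];
        int n = snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue; /* path too long, skip it */

        struct stat st;
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue; /* skips ".", "..", and anything unreadable */

        if (is_stale(st.st_mtime, now, max_age) && unlink(path) == 0)
            removed++;
    }
    closedir(d);
    return removed;
}
```

A caller would pass `time(NULL)` as `now` and the configured maximum age in seconds.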
Part of this work introduces a new model for MPM cache stats, which
splits them out of the cache save function and aggregates
cache-related stats, including the newly added pruning, in one place.
Ticket: 7893
(cherry picked from commit 15c83be61a)
hs: suppress TOCTOU stat use
To explain the TOCTOU issue in more detail, consider
a case where Suricata starts to prune while somebody else
erases cache files externally.
Right after Suricata checks the file age with the stat function,
that other party may delete or update the file in question, so
Suricata's aging decision no longer reflects the actual state of the file.
This commit additionally adds a check for the ENOENT failure of the unlink operation
(treated as success). The code can still delete a file that was updated
after it was judged stale.
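A minimal sketch of the ENOENT handling, assuming a hypothetical helper named `remove_cache_file` (not the function name used in the code): if the file is already gone by the time unlink runs, the goal of the removal is met, so the call is reported as a success.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 0 if the file no longer exists after the call, -1 on any other failure. */
static int remove_cache_file(const char *path)
{
    if (unlink(path) == 0)
        return 0;
    if (errno == ENOENT)
        return 0; /* somebody else removed it first; nothing left to do */
    return -1;
}
```

This narrows the window but does not close it: a file updated between the age check and the unlink is still removed.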
In deployments that follow the documentation this should not happen anyway, as
one cache folder should only be used by a single Suricata instance (and within
a Suricata instance only one thread handles cache eviction).
Additionally, the `stat` call is immediately followed by the `unlink` call, making
this scenario extra unlikely.
An additional comment in the code explains the problems of using fstat and the
potential issues on Windows.
Ticket: 8244
(cherry picked from commit 0fe0390a2f)
hs/cache: cleaner and more detailed output
Reduce logging level of a minor informational message.
Split tracking of pruning by age and by version and log those
separately, where the logging only appears if something has been
removed.
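One possible shape for the split counters, as a sketch only: the struct, the function name, and the message wording below are all hypothetical and are not taken from the code. A line is produced per counter, and only when that counter is non-zero.

```c
#include <stdio.h>

typedef struct CachePruneStats_ {
    unsigned removed_by_age;     /* files older than the age limit */
    unsigned removed_by_version; /* files from an outdated cache version */
} CachePruneStats;

/* Formats one line per non-zero counter into buf; returns the number of lines. */
static int format_prune_report(const CachePruneStats *s, char *buf, size_t len)
{
    int lines = 0;
    size_t used = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    if (s->removed_by_age > 0) {
        int n = snprintf(buf + used, len - used,
                "pruned %u cache file(s) by age\n", s->removed_by_age);
        if (n < 0 || (size_t)n >= len - used)
            return lines; /* buffer too small, stop early */
        used += (size_t)n;
        lines++;
    }
    if (s->removed_by_version > 0) {
        int n = snprintf(buf + used, len - used,
                "pruned %u cache file(s) by version\n", s->removed_by_version);
        if (n < 0 || (size_t)n >= len - used)
            return lines;
        lines++;
    }
    return lines;
}
```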
Ticket: 8323
(cherry picked from commit 569ba3d26f)
hs: remove redundant file handle in HSLoadCache
HSLoadCache opened the cache file but never used the resulting handle
for reading. The actual read was done by HSReadStream which opened
the same file independently.
Removed the unused fopen/fclose pair and flattened the control flow.
Ticket: 8326
(cherry picked from commit d754b28717)
hs: use binary mode for cache file I/O
HSSaveCache wrote serialized Hyperscan databases using text mode ("w")
while HSReadStream already read them with binary mode ("rb").
Switched the write to binary mode so both sides match the binary format,
and simplified the write-size check.
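To illustrate why the mode matters, here is a minimal sketch with hypothetical helpers `save_blob` and `load_blob` (not the Suricata functions). On platforms that translate line endings in text mode, a serialized blob containing bytes such as 0x0A or 0x1A can be altered on write; binary mode avoids that. Writing the buffer as a single item also reduces the size check to comparing the return value with 1.

```c
#include <stdio.h>

/* Write len bytes to path in binary mode; returns 0 on success, -1 on failure. */
static int save_blob(const char *path, const void *data, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    /* One item of len bytes: fwrite returns 1 only if everything was written. */
    int ok = (len == 0) || (fwrite(data, len, 1, f) == 1);
    if (fclose(f) != 0)
        ok = 0; /* a failed flush on close also means the file is incomplete */
    return ok ? 0 : -1;
}

/* Read exactly len bytes from path in binary mode; returns 0 on success. */
static int load_blob(const char *path, void *out, size_t len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    int ok = (len == 0) || (fread(out, len, 1, f) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```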
Ticket: 8326
(cherry picked from commit 0cdc77b707)
hs: warn about the same cache directory
Sharing one cache directory is especially risky when multiple instances
run simultaneously, as it can lead to read/write races.
(cherry picked from commit 56c1552c3e)
hs: validate cached database against current HS installation
After deserializing a cached Hyperscan database, verify that its
version, CPU features, and mode match the current Hyperscan
installation by comparing hs_database_info output against a
reference database. Reject loading incompatible caches.
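The comparison itself can be sketched without linking against Hyperscan. The helper `cache_info_compatible` below is hypothetical; it assumes both strings were obtained from hs_database_info, one for the deserialized cache and one for a reference database built by the current installation, and that an exact string match is the acceptance rule.

```c
#include <string.h>

/* Returns 1 when the cached database info matches the reference info exactly,
 * 0 otherwise (including when either string is missing). */
static int cache_info_compatible(const char *cached_info, const char *reference_info)
{
    if (cached_info == NULL || reference_info == NULL)
        return 0;
    return strcmp(cached_info, reference_info) == 0;
}
```

A caller would discard the deserialized database and recompile whenever this returns 0.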
Ticket: 8326
(cherry picked from commit 2e7b12dda4)
hs: include HS platform info in cache file hash
Hash Hyperscan installation info (version, CPU features, mode)
into the cache filename. A Hyperscan upgrade or platform change
would now produce a different filename, so stale caches from an
older installation are never opened.
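A sketch of the idea, not the actual hashing used in the code: the example below mixes a ruleset identifier and an installation info string into a 64-bit FNV-1a hash and uses it as the filename. FNV-1a, the `cache_file_name` helper, the separator, and the `.cache` extension are all assumptions made for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Continue a 64-bit FNV-1a hash over the bytes of s. */
static uint64_t fnv1a64(uint64_t h, const char *s)
{
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= UINT64_C(0x100000001b3); /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Build a filename that depends on both the ruleset and the HS installation.
 * Returns 0 on success, -1 if out is too small. */
static int cache_file_name(char *out, size_t len,
        const char *ruleset_id, const char *hs_info)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325); /* FNV-1a 64-bit offset basis */
    h = fnv1a64(h, ruleset_id);
    h = fnv1a64(h, "|"); /* separator so the two fields cannot run together */
    h = fnv1a64(h, hs_info);
    int n = snprintf(out, len, "%016" PRIx64 ".cache", h);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

With this scheme a cache written under one installation is simply never looked up under another, and the orphaned file is left for age-based pruning to remove.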
Ticket: 8326
(cherry picked from commit d640719413)
hs: address coverity warning in a reference string
Move the locking mechanism outside of the getter function and hold the
lock until the reference string is no longer in use.
** CID 1682023: Concurrent data access violations (MISSING_LOCK)
/src/util-mpm-hs-cache.c: 139 in HSGetReferenceDbInfo()
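The locking pattern can be sketched with plain pthreads. All names below are hypothetical, the placeholder string stands in for the real reference info, and the actual code may use Suricata's own mutex wrappers instead. The getter assumes the caller already holds the lock, and the caller keeps holding it for as long as it uses the returned pointer.

```c
#include <pthread.h>
#include <string.h>

static pthread_mutex_t g_ref_lock = PTHREAD_MUTEX_INITIALIZER;
static char g_ref_info[64];
static int g_ref_ready = 0;

/* Caller must hold g_ref_lock. Lazily fills in the shared reference string. */
static const char *ref_info_get_locked(void)
{
    if (!g_ref_ready) {
        /* placeholder value; the real string would come from the HS library */
        strcpy(g_ref_info, "reference-info-placeholder");
        g_ref_ready = 1;
    }
    return g_ref_info;
}

/* Takes the lock, uses the shared string, and only then releases the lock. */
static int matches_reference(const char *info)
{
    pthread_mutex_lock(&g_ref_lock);
    const char *ref = ref_info_get_locked();
    int same = (strcmp(ref, info) == 0);
    pthread_mutex_unlock(&g_ref_lock); /* ref must not be used past this point */
    return same;
}
```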
(cherry picked from commit 6ec9e5c957)
This directory contains the Suricata Guide. The Suricata Developer's guide
is included as a chapter of the Guide.
The Sphinx Document Generator is used to build the
documentation. For a primer on reStructuredText see the
reStructuredText Primer.
Verifying Changes
Several output formats are available when building the documentation locally (e.g. html, pdf, man).
The documentation can be built with make -f Makefile.sphinx html. Replace 'html' with the desired output format.
The required application dependencies differ by output format.