This is required for crates that use a C compiler to use the same one as
used by Suricata. Important for cross compiling.
Also pass AR and RANLIB which are often used for cross compiling.
On a normal project where the Cargo.lock is checked in, it would be
normal to see an updated Cargo.lock in git status and the like. As we
use autoconf to generate this file, we should just copy it back to the
input file so we get the same convenience of seeing when it is
updated, which usually means it needs to be checked in.
However, to satisfy "make distcheck", only copy it if the input
template exists, if the input template does not exist we are in an out
of tree build.
Ticket: #2696
There are a lot of changes here, which are described below.
In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.
Functions introduced to handle opaque htp_tx_t:
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
and “Upgrade: h2c" header.
Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.
Other changes:
Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.
Make function arguments and return values const where possible
htp_tx_destroy(tx) will now free an incomplete transaction
htp_time_t replaced with standard struct timeval
Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.
SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID
htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.
libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values.
New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR
The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.
Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
This uses the cbindgen found during ./configure, and not the one
found on the path during "make", which while often the same, aren't
always the same.
Ticket: #6384
Also disable bindgen's generated layout tests. They are valid for the
platform generating the tests, but may not be valid for other
platforms. For example, if the tests are generated on a 64 bit
platform the tests will not be valid when run on a 32 bit platform as
pointers are a different size.
However, the generating bindings are valid for both platform.
Ticket: #7341
We don't keep bindgen's autogenerated do not edit line as it contains
the bindgen version which could break the CI check for out of date
bindings. So add our own do not edit line.
Ticket: #7341
Bindgen works by processing a header file which includes all other
header files it should generate bindings for. For this I've created
bindgen.h which just includes app-layer-protos.h for now as an
example.
These bindings are then generated and saved in the "suricata-sys"
crate and become availale as "suricata_sys::sys".
Ticket: #7341
Follow Rust convention of using a "sys" crate for bindings to C
functions. The bindings don't exist yet, but will be generated by
bindgen and put into this crate.
Ticket: #7341
To ensure that all calls to cargo use the same environment variables,
put the environment variables in CARGO_ENV so every call to cargo can
easily use the same vars.
The Cargo build system is smarter than make, it can detect a change in
an environment variable that affects the build, and the setting of
SURICATA_LUA_SYS_HEADER_DST changing could cause a rebuild.
Also update suricata-lua-sys, which is smarter about copying headers. It
will only copy if the destination does not exist, or the source header
is newer than the target, which can also prevent unnecessary rebuilds.
This is mainly to fix an issue where subsequent builds may fail,
especially when running an editor with a LSP enabled:
Update lua crate to 0.1.0-alpha.5. This update will force a rewrite of
the headers if the env var SURICATA_LUA_SYS_HEADER_DST changes. This
fixes the issue where the headers may not be written.
The cause is that Rust dependencies are cached, and if your editor is
using rust-analyzer, it might cache the build without this var being
set, so these headers are not available to Suricata. This crate update
forces the re-run of the Lua build.rs if this env var changes, fixing
this issue.
This crate lets us instruct it where to copy the header files instead
of our Makefile trying to find the correct ones and copying them into
place.
Can prevent the simultaneous copy errors sometimes seen on a make
without a clean.
Remove maintainer-clean-local, this is not needed.
In distclean-local, remove "rust/dist" and "rust/vendor" as they are
created during "make dist".
In "clean-local", remove "rust/target" and "rust/gen" as they are
created during a normal "make".
Addresses this warning from the Rust compiler:
warning: `../rust/.cargo/config` is deprecated in favor of `config.toml`
note: if you need to support cargo 1.38 or earlier, you can symlink `config` to `config.toml`
Remove the path.lib parameter that is substituted into the output
Cargo.toml by autoconf. Instead, as part of the build, "cd" into the
source directory. We already set the Rust target directory to the
external build directory.
This makes the Cargo.toml more generic, and in a format suitable for
publishing to crates.io. It also makes it easier to pull in external
crates without needing to patch up their Cargo.toml, for example, it
might make pulling libhtp-rs easier.
Cargo.lock has to be provided as template, Cargo.lock.in so it can
live beside Cargo.lock in out of tree automake builds, like distcheck.
This will pin Rust dependencies even for git builds, updating
Cargo.lock will now be a manual process that we'll have to take care
of periodically.
cargo vendor has been part of the core cargo command since Rust 1.37,
and are minimum Rust version is not 1.41, so remove the check. Its
always available now.
Rename the Rust lib to simply "suricata" instead of "suricata_rust".
This allows Rust plugin/library code to use it under the name "suricata"
which is what should be expected.
The name was only "suricata_rust" to prevent on-disk conflict with the C
code, so just rename the file on disk, which doesn't affect how the code
is interacted with from an API layer.
Currently has one derive, AppLayerEvent to be used like:
#[derive(AppLayerEvent)]
pub enum DNSEvent {
MalformedData,
NotRequest,
NotResponse,
ZFlagSet,
}
Code will be generated to:
- Convert enum to a c type string
- Convert string to enum variant
- Convert id to enum variant
As we don't install the libraries by default, provide a make target,
"install-library" to install the libsuricata library files.
If shared library support exists, both the static and shared
libraries will be installed, otherwise only the static libraries
will be installed.
Prior to Rust 1.44, Cargo would name static libs with the .lib
extension. 1.44 changes this extension to .a when running under
a GNU environment on Windows like msys to make it more similar
to other unix environments.
Now assume static library name to be the same on Windows and
unix, but rename the .lib if found to still support older
versions of Rust on Windows.
Only run cbindgen when necessary. This is a bit tricky. When
building a dist we want to unconditionally build the headers.
When going through a "make; sudo make install" type process,
cbindgen should not be run as the headers already exist, are
valid, and the environment under sudo is more often than
not suitable to pick up the Rust toolchains when installed
with rustup.
For the normal "make" case we have the gen/rust-bindings.h file
depend on library file, this will cause it to only be rebuilt
if the code was modified.
For "make dist" we unconditionally create "dist/rust-bindings.h".
This means the generated file could be in 2 locations, so update
configure.ac, and the library search find to find it.
The "gen/rust-bindings.h" should be picked up first if it exists,
for those who develop from a dist archive where "dist/rust-bindings.h"
also exists.
Not completely happy having the same file in 2 locations, but not
sure how else to get the dependency tracking correct.
Uses "cargo doc --no-deps" to build the documentation just for
our Suricata package. Without --no-deps, documentation will be
build for all our dependencies as well.
The generated documentation will end up in target/doc as HTML.
For make clean, only remove gen/ if cbindgen is available.
This prevents make clean from remove gen when the headers
were bundled, but cbindgen is not available to remove them.
Unconditionally remove gen and vendor in maintainerclean.
The modifications as part of the cbindgen commit caused issues
with distcheck, revert the Makefile to how it was with the Python
generator, but still using cbindgen.
Also always assume we'll include the generated headers in the
distribution archive to fix make distcheck from distribution
archives with headers included, but no cbindgen.
If sources are vendored, we get the same effect of using frozen
with a lock file, and the Cargo.lock is generated based
on the vendored sources.
This also removes the need to ship a Cargo.lock.
Fixed out of source builds with vendored sources.