Issue
-----
So far, with the SMTP parser, we would buffer data up until an LF char
was found indicating the end of one line. This would happen in case of
fragmented data where a line might come broken into multiple chunks.
This was problematic if there was a really long line without any LF
character. It would mean that we'd keep buffering data up until we
encounter one such LF char which may be many many bytes of data later.
Fix
---
Fix this issue by setting an upper limit of 4KB on the buffering of
lines. If the limit is reached then we save the data into current line
and process it as if it were a regular request/response up until 4KB
only. Any data after 4KB is discarded up until there is a new LF char in
the received input.
Cases
-----
1. Fragmentation
The limit is enforced for any cases where a line of >= 4KB comes as diff
fragments that are each/some < 4KB.
2. Single too long line
The limit is also enforced for any cases where a single line exceeds the
limit of buffer.
Reported by Victor Julien.
Ticket 5023
Every transaction has an existing mandatory field, tx_data. As
DetectEngineState is also mandatory, include it in tx_data.
This allows us to remove the boilerplate every app-layer has
for managing detect engine state.
This parameter is NULL or the pointer to the previous state
for the previous protocol in the case of a protocol change,
for instance from HTTP1 to HTTP2
This way, the new protocol can use the old protocol context.
For instance, HTTP2 mimicks the HTTP1 request, to have a HTTP2
transaction with both request and response
Implement min_inspect_depth for SMTP so that file_data and
regular stream matches don't go out of sync on the stream start.
Added toserver bytes tracking.
Bug #3190.
Add a raw-extraction option for smtp. When enabled, this feature will
store the raw e-mail inside a file, including headers, e-mail content,
attachments (base64 encoded). This content is stored in a normal File *,
allowing for normal file detection.
It'd also allow for all-emails extraction if a rule has
detect-filename:"rawmsg" matcher (and filestore).
Note that this feature is in contrast with decode-mime.
This feature is disabled by default, and will be disabled automatically
if decode-mime is enabled.
The SMTP app layer used a thread local data structure for the mpm in
reply parsing, but it only used a pmq. The MpmThreadCtx was actually
global. Until now this wasn't really noticed because non of the mpm's
used the thread ctx.
Hyperscan does use it however.
This patch creates a new structure SMTPThreadCtx, which contains both
the pmq and the mpm thread ctx. It's passed directly to the reply
parsing function instead of storing a pointer to it in the SMTPState.
Additionally fix a small memory leak warning wrt the smtp global mpm
state.
Make the file storage use the streaming buffer API.
As the individual file chunks were not needed by themselves, this
approach uses a chunkless implementation.
This patch fixes the following leak:
Direct leak of 9982880 byte(s) in 2902 object(s) allocated from:
#0 0x4c253b in malloc ??:?
#1 0x10c39ac in MimeDecInitParser /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/util-decode-mime.c:2379
#2 0x6a0f91 in SMTPProcessRequest /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/app-layer-smtp.c:1085
#3 0x697658 in SMTPParse /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/app-layer-smtp.c:1185
#4 0x68fa7a in SMTPParseClientRecord /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/app-layer-smtp.c:1208
#5 0x6561c5 in AppLayerParserParse /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/app-layer-parser.c:908
#6 0x53dc2e in AppLayerHandleTCPData /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/app-layer.c:444
#7 0xf8e0af in DoReassemble /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp-reassemble.c:2635
#8 0xf8c3f8 in StreamTcpReassembleAppLayer /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp-reassemble.c:3028
#9 0xf94267 in StreamTcpReassembleHandleSegmentUpdateACK /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp-reassemble.c:3404
#10 0xf9643d in StreamTcpReassembleHandleSegment /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp-reassemble.c:3432
#11 0xf578b4 in HandleEstablishedPacketToClient /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp.c:2245
#12 0xeea3c7 in StreamTcpPacketStateEstablished /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp.c:2489
#13 0xec1d38 in StreamTcpPacket /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp.c:4568
#14 0xeb0e16 in StreamTcp /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/stream-tcp.c:5064
#15 0xff52a4 in TmThreadsSlotVarRun /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/tm-threads.c:130
#16 0xffdad1 in TmThreadsSlotVar /home/victor/qa/buildbot/donkey/z600fuzz/Private/src/tm-threads.c:474
#17 0x7f7cd678d181 in start_thread /build/buildd/eglibc-2.19/nptl/pthread_create.c:312 (discriminator 2)
We come to this case when a SMTP session contains at least 2 mails
and then the ending of the first is not correctly detected. In that
case, switching to a new tx seems a good solution. This way we still
have partial logging.
If SMTP session is weird then we may reach a state where a field
like MAIL FROM is seen as duplicated.
Valgrind output is:
30 bytes in 1 blocks are definitely lost in loss record 96 of 399
at 0x4C29C0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4A5803: SMTPParseCommandWithParam (app-layer-smtp.c:996)
by 0x4A4DCE: SMTPParseCommandMAILFROM (app-layer-smtp.c:1016)
by 0x4A3F55: SMTPProcessRequest (app-layer-smtp.c:1127)
by 0x4A1F8C: SMTPParse (app-layer-smtp.c:1191)
by 0x493AD7: SMTPParseClientRecord (app-layer-smtp.c:1214)
by 0x4878A6: AppLayerParserParse (app-layer-parser.c:908)
by 0x42384E: AppLayerHandleTCPData (app-layer.c:444)
by 0x8D7EAD: DoReassemble (stream-tcp-reassemble.c:2635)
by 0x8D795F: StreamTcpReassembleAppLayer (stream-tcp-reassemble.c:3028)
by 0x8D8BE0: StreamTcpReassembleHandleSegmentUpdateACK (stream-tcp-reassemble.c:3404)
by 0x8D8F6E: StreamTcpReassembleHandleSegment (stream-tcp-reassemble.c:3432)
If a boundary was longer than 254 bytes a stack overflow would result
in mime decoding.
Ticket #1449
Reported-by: Kostya Kortchinsky of the Google Security Team