AF_XDP
======

AF_XDP (eXpress Data Path) is a high-speed capture framework for Linux that was
introduced in Linux v4.18. AF_XDP aims to improve capture performance by
redirecting ingress frames to user-space memory rings, thus bypassing the network
stack.

Note that during ``af_xdp`` operation the selected interface cannot be used for
regular network usage.

Further reading:

    - https://www.kernel.org/doc/html/latest/networking/af_xdp.html

Compiling Suricata
------------------

Linux
~~~~~

libxdp and libbpf are required for this feature. When building from source the
development files will also be required.

Example::

    dnf -y install libxdp-devel libbpf-devel

This feature is enabled automatically when the libraries above are installed;
the user does not need to add any additional command line options.

The command line option ``--disable-af-xdp`` can be used to disable this
feature.

Example::

    ./configure --disable-af-xdp

Starting Suricata
-----------------

IDS
~~~

Suricata can be started as follows to use af-xdp::

    suricata --af-xdp=<interface>
    suricata --af-xdp=igb0

In the above example Suricata will start reading from the ``igb0`` network
interface.

AF_XDP Configuration
--------------------

Each of these settings can be configured under ``af-xdp`` within the "Configure
common capture settings" section of the suricata.yaml configuration file.

The number of threads created can be configured in the suricata.yaml
configuration file. It is recommended to use a thread count equal to the number
of NIC queues/CPU cores.

Another option is to select ``auto``, which will allow Suricata to configure
the number of threads based on the number of RSS queues available on the NIC.

With ``auto`` selected, Suricata spawns receive threads equal to the number of
configured RSS queues on the interface.

::

  af-xdp:
    threads: <number>
    threads: auto
    threads: 8

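To pick a starting value, it can help to compare the machine's core count with
the NIC's configured queue count (``ethtool -l <iface>`` reports the latter).
A minimal sketch, assuming the core count is the limiting factor::

  # Suggest a thread count equal to the number of CPU cores; compare this
  # against the NIC queue count reported by 'ethtool -l <iface>'.
  cores=$(nproc)
  echo "suggested af-xdp threads: $cores"
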
Advanced setup
--------------

The af-xdp capture source will operate using the default configuration
settings. However, these settings can be adjusted in the suricata.yaml
configuration file.

Available configuration options are:

force-xdp-mode
~~~~~~~~~~~~~~

There are two operating modes employed when loading the XDP program, these are:

- XDP_DRV: Mode chosen when the driver supports AF_XDP
- XDP_SKB: Mode chosen when the driver does not support AF_XDP

XDP_DRV mode is the preferred mode, used to ensure best performance.

::

  af-xdp:
    force-xdp-mode: <value> where: value = <skb|drv|none>
    force-xdp-mode: drv

force-bind-mode
~~~~~~~~~~~~~~~

During binding the kernel will first attempt to use zero-copy (preferred). If
zero-copy support is unavailable it will fall back to copy mode, copying all
packets out to user space.

::

  af-xdp:
    force-bind-mode: <value> where: value = <copy|zero|none>
    force-bind-mode: zero

For both options, the kernel will attempt the 'preferred' option first and
fall back upon failure. Therefore the default (none) means the kernel has
control of which option to apply. By configuring these options the user is
forcing said option. Note that if enabled, the bind will only attempt this
option; upon failure the bind will fail, i.e. there is no fallback.

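As an illustrative sketch, forcing driver mode with zero-copy binds would look
like this in suricata.yaml (use ``none`` for either option to leave the choice
to the kernel)::

  af-xdp:
    force-xdp-mode: drv
    force-bind-mode: zero
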
mem-unaligned
~~~~~~~~~~~~~

AF_XDP can operate in two memory alignment modes, these are:

- Aligned chunk mode
- Unaligned chunk mode

Aligned chunk mode is the default option which ensures alignment of the data
within the UMEM.

Unaligned chunk mode uses hugepages for the UMEM. Hugepages start at a size of
2MB but they can be as large as 1GB. A lower count of pages (memory chunks)
allows faster lookup of page entries.

The hugepages need to be allocated on the NUMA node where the NIC and CPU
reside. Otherwise, if the hugepages are allocated only on NUMA node 0 and the
NIC is connected to NUMA node 1, the application will fail to start. Therefore,
it is recommended to first find out which NUMA node the NIC is connected to,
and only then allocate hugepages and set CPU core affinity to that NUMA node.

Memory assigned per socket/thread is 16MB, so each worker thread requires at
least 16MB of free space. As stated above hugepages can be of various sizes;
consult the OS to confirm with ``cat /proc/meminfo``.

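To check which NUMA node a NIC belongs to, the device's sysfs entry can be
consulted (``eth3`` is a placeholder interface name; ``-1`` means the device
has no NUMA affinity, e.g. on single-node systems)::

  # Print the NUMA node of the NIC's PCI device.
  iface=eth3
  cat /sys/class/net/"$iface"/device/numa_node 2>/dev/null \
      || echo "no sysfs entry for $iface"
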
Example::

    8 worker threads * 16 MB = 128 MB
    hugepage size = 2048 kB (2 MB)
    pages required = 131072 kB / 2048 kB = 64 pages

See https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt for a detailed
description.

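The calculation above can be scripted. A minimal sketch, assuming 8 worker
threads and a 2048 kB hugepage size (read the real size from the
``Hugepagesize`` line of ``/proc/meminfo``)::

  # Round up the number of hugepages needed for N af-xdp worker threads.
  workers=8                      # illustrative; match your 'threads' setting
  per_thread_kb=$((16 * 1024))   # 16 MB UMEM per socket/thread
  hugepage_kb=2048               # from 'Hugepagesize' in /proc/meminfo
  total_kb=$((workers * per_thread_kb))
  pages=$(( (total_kb + hugepage_kb - 1) / hugepage_kb ))   # ceiling division
  echo "$pages"                  # 64
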
To enable unaligned chunk mode:

::

  af-xdp:
    mem-unaligned: <yes/no>
    mem-unaligned: yes

Introduced in Linux v5.11, the ``SO_PREFER_BUSY_POLL`` socket option allows
true polling of the AF_XDP socket queues. This feature was introduced to
reduce context switching and improve CPU reaction time during traffic
reception.

Enabled by default, this feature will apply the options below unless disabled.
The following options are used to configure this feature.

enable-busy-poll
~~~~~~~~~~~~~~~~

Enables or disables busy polling.

::

  af-xdp:
    enable-busy-poll: <yes/no>
    enable-busy-poll: yes

busy-poll-time
~~~~~~~~~~~~~~

Sets the approximate time in microseconds to busy poll on a blocking receive
when there is no data.

::

  af-xdp:
    busy-poll-time: <time>
    busy-poll-time: 20

busy-poll-budget
~~~~~~~~~~~~~~~~

Budget allowed for batching of ingress frames. Larger values mean more frames
can be stored/read. It is recommended to test this for performance.

::

  af-xdp:
    busy-poll-budget: <budget>
    busy-poll-budget: 64

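Taken together, a busy-polling configuration in suricata.yaml might look like
the following (the values shown are illustrative defaults, not tuned
recommendations)::

  af-xdp:
    enable-busy-poll: yes
    busy-poll-time: 20
    busy-poll-budget: 64
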
Linux tunables
~~~~~~~~~~~~~~

The ``SO_PREFER_BUSY_POLL`` option works in concert with the following two
Linux knobs to ensure best capture performance. These are not socket options:

- gro-flush-timeout
- napi-defer-hard-irq

The purpose of these two knobs is to defer interrupts and to allow the NAPI
context to be scheduled from a watchdog timer instead.

The ``gro-flush-timeout`` indicates the timeout period for the watchdog
timer. When no traffic is received for ``gro-flush-timeout`` the timer will
exit and softirq handling will resume.

The ``napi-defer-hard-irq`` indicates the number of queue scan attempts
before exiting to interrupt context. When enabled, the softirq NAPI context
will exit early, allowing busy polling.

::

  af-xdp:
    gro-flush-timeout: 2000000
    napi-defer-hard-irq: 2

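The kernel exposes the current values of these knobs per network device in
sysfs, which can be useful to verify what was applied (``eth3`` is a
placeholder interface name)::

  # Show the current deferral settings for an interface.
  iface=eth3
  for f in gro_flush_timeout napi_defer_hard_irqs; do
    printf '%s: ' "$f"
    cat /sys/class/net/"$iface"/"$f" 2>/dev/null || echo "unavailable"
  done
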
Hardware setup
--------------

Intel NIC setup
~~~~~~~~~~~~~~~

Intel network cards don't support symmetric hashing but it is possible to
emulate it by using a specific hashing function.

Follow these instructions closely for the desired result::

 ifconfig eth3 down

Enable symmetric hashing::

 ifconfig eth3 down
 ethtool -L eth3 combined 16 # if you have at least 16 cores
 ethtool -K eth3 rxhash on
 ethtool -K eth3 ntuple on
 ifconfig eth3 up
 ./set_irq_affinity 0-15 eth3
 ethtool -X eth3 hkey 6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A equal 16
 ethtool -x eth3
 ethtool -n eth3

In the above setup you are free to use any recent ``set_irq_affinity`` script.
It is available in any Intel x520/710 NIC sources driver download.

**NOTE:**
We use a special low entropy key for the symmetric hashing. `More info about
the research for symmetric hashing set up
<http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf>`_

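The low-entropy key passed to ``ethtool -X`` above is simply the two bytes
``0x6D, 0x5A`` repeated to the NIC's RSS key length (20 repetitions, 40 bytes,
in the command above; key length varies by NIC model). A sketch that builds and
sanity-checks it::

  # Build the 40-byte low-entropy symmetric RSS key (0x6D,0x5A repeated).
  key=$(printf '6D:5A:%.0s' $(seq 1 20))
  key=${key%:}                       # strip the trailing colon
  echo "$key"
  echo "$key" | tr ':' '\n' | wc -l  # 40 bytes
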
Disable any NIC offloading
~~~~~~~~~~~~~~~~~~~~~~~~~~

Suricata will disable NIC offloading based on the configuration parameter
``disable-offloading``, which is enabled by default. See the ``capture``
section of the yaml file.

::

  capture:
    # disable NIC offloading. It's restored when Suricata exits.
    # Enabled by default.
    #disable-offloading: false

Balance as much as you can
~~~~~~~~~~~~~~~~~~~~~~~~~~

Try to use the network card's flow balancing as much as possible::

 for proto in tcp4 udp4 ah4 esp4 sctp4 tcp6 udp6 ah6 esp6 sctp6; do
    /sbin/ethtool -N eth3 rx-flow-hash $proto sd
 done

This command triggers load balancing using only source and destination IPs.
This may not be optimal in terms of load-balancing fairness, but it ensures
that all packets of a flow will reach the same thread, even in the case of IP
fragmentation (where source and destination ports will not be available for
some fragmented packets).