title | revealOptions | backgroundTransition | verticalSeparator | ||
---|---|---|---|---|---|
Linux's New Superpowers - Introducing eBPF |
|
fade |
-v- |
rfc2119
- Packet: a formatted unit of data carried by a packet-switched network
- Packet strictly refers to a protocol data unit (PDU) at layer 3, the network layer.
- Filters are implemented in both software and hardware (e.g., a dedicated firewall box)
- Software packet filters allow general-purpose machines to be used for network monitoring
note: so how do proprietary software packet filters / firewalls operate? Their OS is presumably finely tuned to their hardware, but ¯\(ツ)/¯
note: When a packet arrives at a network interface the link level device driver normally sends it up the system protocol stack. But when BPF is listening on this interface, the driver first calls BPF. BPF feeds the packet to each participating process’ filter. This user-defined filter decides whether a packet is to be accepted and how many bytes of each packet should be saved. For each filter that accepts the packet, BPF copies the requested amount of data to the buffer associated with that filter. The device driver then regains control. If the packet was not addressed to the local host, the driver returns from the interrupt. Otherwise, normal protocol processing proceeds
- Berkeley Packet Filter
- 1993, made for network packet filtering
- provides a raw interface to data link layers
- goals:
- minimize transitions to user space
- filter packets as efficiently as possible
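
The tap flow described in the note (the driver calls BPF, every attached filter votes, and accepted packets are truncated into per-filter buffers) can be sketched in Python. This is only an illustration of the idea, not the BSD implementation: `Tap`, `attach`, and `deliver` are hypothetical names, and each filter is modelled as a plain function that returns how many bytes to save (0 means reject).

```python
from typing import Callable, Dict, List

# A filter returns the number of bytes to save; 0 rejects the packet.
Filter = Callable[[bytes], int]


class Tap:
    """Toy model of the BPF tap on one network interface (hypothetical)."""

    def __init__(self) -> None:
        self.filters: Dict[str, Filter] = {}
        self.buffers: Dict[str, List[bytes]] = {}

    def attach(self, name: str, flt: Filter) -> None:
        # Each listening process gets its own filter and its own buffer.
        self.filters[name] = flt
        self.buffers[name] = []

    def deliver(self, packet: bytes) -> None:
        # Called by the "driver" before normal protocol processing.
        for name, flt in self.filters.items():
            snaplen = flt(packet)
            if snaplen > 0:
                # Copy only the requested amount of data.
                self.buffers[name].append(packet[:snaplen])


tap = Tap()
tap.attach("headers-only", lambda pkt: 14)  # keep just the Ethernet header
tap.attach("arp-only", lambda pkt: len(pkt) if pkt[12:14] == b"\x08\x06" else 0)
tap.deliver(bytes(12) + b"\x08\x00" + b"ip payload")
print({name: len(buf) for name, buf in tap.buffers.items()})
```

Rejecting a packet costs one function call and no copy, which is the point of filtering before the data crosses into user space.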
An in-kernel sandboxed VM:
# tcpdump host 127.0.0.1 -d
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 6
(002) ld [26]
(003) jeq #0x7f000001 jt 12 jf 4
(004) ld [30]
(005) jeq #0x7f000001 jt 12 jf 13
(006) jeq #0x806 jt 8 jf 7
(007) jeq #0x8035 jt 8 jf 13
(008) ld [28]
(009) jeq #0x7f000001 jt 12 jf 10
(010) ld [38]
(011) jeq #0x7f000001 jt 12 jf 13
(012) ret #262144
(013) ret #0
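
To make the listing above concrete, here is a minimal Python sketch of an interpreter for just the four instructions it uses (`ldh`, `ld`, `jeq`, `ret`). It is an illustration under simplifying assumptions, not the kernel's VM: jump targets are kept as the absolute indices that tcpdump prints (the real bytecode stores relative offsets), and `run_filter` and `ipv4_frame` are hypothetical helper names.

```python
import socket
import struct

# Transcribed from the `tcpdump host 127.0.0.1 -d` listing.
PROGRAM = [
    ("ldh", 12),
    ("jeq", 0x800, 2, 6),
    ("ld", 26),
    ("jeq", 0x7F000001, 12, 4),
    ("ld", 30),
    ("jeq", 0x7F000001, 12, 13),
    ("jeq", 0x806, 8, 7),
    ("jeq", 0x8035, 8, 13),
    ("ld", 28),
    ("jeq", 0x7F000001, 12, 10),
    ("ld", 38),
    ("jeq", 0x7F000001, 12, 13),
    ("ret", 262144),
    ("ret", 0),
]


def run_filter(program, packet: bytes) -> int:
    """Return how many bytes to capture; 0 means the packet is rejected."""
    acc = 0  # the accumulator register
    pc = 0   # index of the next instruction
    while True:
        op, *args = program[pc]
        if op in ("ldh", "ld"):
            size = 2 if op == "ldh" else 4
            offset = args[0]
            if offset + size > len(packet):
                return 0  # an out-of-bounds load rejects the packet
            acc = int.from_bytes(packet[offset:offset + size], "big")
            pc += 1
        elif op == "jeq":
            value, jump_true, jump_false = args
            pc = jump_true if acc == value else jump_false
        elif op == "ret":
            return args[0]
        else:
            raise ValueError(f"unsupported instruction: {op}")


def ipv4_frame(src: str, dst: str) -> bytes:
    """Build a bare Ethernet + IPv4 header pair (no payload, no checksum)."""
    eth = bytes(12) + struct.pack("!H", 0x0800)
    ip = (bytes([0x45, 0]) + struct.pack("!H", 20) + bytes(4)
          + bytes([64, 6]) + bytes(2)
          + socket.inet_aton(src) + socket.inet_aton(dst))
    return eth + ip


print(run_filter(PROGRAM, ipv4_frame("127.0.0.1", "10.0.0.1")))
print(run_filter(PROGRAM, ipv4_frame("10.0.0.1", "10.0.0.2")))
```

The program has only forward jumps and two exits, so it always terminates. That property is what makes it safe to run inside the kernel.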
- extended BPF (eBPF). Can be used for non-networking purposes!
- Available in the Linux kernel >= 3.15 (see patch)
- Running user-supplied code inside the kernel is a powerful tool for kernel developers and production engineers
-v-
-v-
-v-
note: data-background-position="bottom"
-v-
- kernel code:
  - direct access to hardware (peripherals, memory, hard disks, network cards, etc.)
  - schedule jobs
  - crash = BSOD
  - is included in your operating system
- user code:
  - limited access (can it read files from your hard disk?)
  - crash = throw an exception
-v-
-v-
-v-
- system calls
- CPU exceptions (running out of memory, division by zero, ...)
- hardware interrupts
note: see img/ARM-shell-0.png.pagespeed.ce.6wC1FtlsVk.png
- an eXpress Data Path (XDP) in kernel-space
- determines data paths for a received packet
- works by adding eBPF hooks in the NIC driver
- eBPF hooks can change on the fly!
- can drop a whopping 26 million packets per second per core on commodity hardware!
- Netronome NICs have native support for XDP
note: from wiki: The idea behind XDP is to add an early hook in the RX path of the kernel, and let a user-supplied eBPF program decide the fate of the packet. The hook is placed in the NIC driver just after the interrupt processing, and before any memory allocation needed by the network stack itself, because memory allocation can be an expensive operation. Also, there's a company out there that made a commercial SDN solution using eBPF.
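
The early-hook idea can be modelled in a few lines of Python. This is a toy simulation, not real XDP code (real programs are written in restricted C and loaded into the driver): `ToyNic` and `pass_ipv4_only` are hypothetical names, and only the action codes mirror the kernel's `XDP_*` values.

```python
from enum import IntEnum
from typing import Callable, List


class XdpAction(IntEnum):
    # Same numeric values as the kernel's XDP_* action codes.
    ABORTED = 0
    DROP = 1
    PASS = 2
    TX = 3
    REDIRECT = 4


XdpProgram = Callable[[bytes], XdpAction]


def pass_ipv4_only(frame: bytes) -> XdpAction:
    """Example verdict function: let IPv4 through, drop everything else."""
    return XdpAction.PASS if frame[12:14] == b"\x08\x00" else XdpAction.DROP


class ToyNic:
    """Toy driver: runs the hook before any buffer is allocated."""

    def __init__(self, prog: XdpProgram) -> None:
        self.prog = prog  # can be replaced while the "driver" is running
        self.allocations = 0
        self.stack_queue: List[bytes] = []

    def receive(self, frame: bytes) -> XdpAction:
        verdict = self.prog(frame)
        if verdict != XdpAction.PASS:
            return verdict  # decided without allocating a stack buffer
        self.allocations += 1  # stands in for the network stack's allocation
        self.stack_queue.append(bytes(frame))
        return verdict


nic = ToyNic(pass_ipv4_only)
arp = bytes(12) + b"\x08\x06" + bytes(28)
ipv4 = bytes(12) + b"\x08\x00" + bytes(20)
print(nic.receive(arp).name, nic.receive(ipv4).name, nic.allocations)

# Swap the program on the fly, e.g. to drop everything during an attack.
nic.prog = lambda frame: XdpAction.DROP
print(nic.receive(ipv4).name, nic.allocations)
```

Dropped frames never reach the allocation step, which is where the per-core drop rate comes from.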
- could do anything with high performance and minimal overhead (filter/classify traffic, reactive defensive networking, ... )
- take eBPF seriously, it's game changing!
- katran (L4 load balancing), goBPF, redBPF, bpfd, bpf-seccomp, cilium.
note: bpf-cilium-turning-linux-into-a-microservicesaware-operating-system-26-638.jpg
-v-
-v-
- eBPF is especially suited to writing network programs and it's possible to write programs that attach to a network socket to filter traffic, to classify traffic, and to run network classifier actions
- Another type of filtering performed by the kernel is restricting which system calls a process can use. This is done with seccomp BPF. (see the pledge syscall and sysctl security.bsd in BSD)
- eBPF is also useful for debugging and carrying out performance analysis of the kernel (and of user-space code). It's even possible to use eBPF to debug user-space programs by using Userland Statically Defined Tracepoints
- The BSD Packet Filter: A New Architecture for User-level Packet Capture (original BPF paper)
- BPF Performance Tools: Linux System and Application Observability - Brendan Gregg's Blog (see example chapter on observability)
- Understanding User and Kernel Mode - Jeff Atwood
- strace Wow Much Syscall - Brendan Gregg's Blog
- XDP - IO Visor Project
- eBPF working diagram
- BPF, eBPF, XDP and Bpfilter… What are These Things and What do They Mean for the Enterprise? - Netronome Blog
- eBPF Vulnerability (CVE-2017-16995)
slides @ rfc2119.github.io/ebpf-talk