eBPF and bpftrace: the superpower of the modern Linux kernel

eBPF (extended Berkeley Packet Filter) matured across the Linux 4.x and 5.x kernels. bpftrace (Brendan Gregg and Alastair Robertson, 2018) added an awk-like DSL for kernel tracing, with hooks on syscalls, functions, and events. Together they form the new generation of Linux observability and security tooling.


A runtime in the kernel

BPF (Berkeley Packet Filter) was born in 1992 as a packet filter for tcpdump. Its core idea, running small verified programs inside the kernel, was extended from 2014 by Alexei Starovoitov (then at PLUMgrid, later at Facebook/Meta): eBPF introduced a broader instruction set, a static verifier, and a JIT compiler. It was integrated into Linux kernel 3.18 (December 2014) and matured through the 4.x and 5.x releases.

By 2018 eBPF was mature enough for production use in observability, networking, and security. The adoption tipping point was the release of bpftrace by Brendan Gregg (Netflix).

What eBPF can do

eBPF programs “attach” to kernel hook points:

  • kprobes: dynamic probes on kernel functions
  • uprobes: dynamic probes on userspace functions
  • tracepoints: stable, statically defined kernel instrumentation points
  • network (XDP, tc): packet processing
  • cgroup: per-container (control group) events
  • perf events, LSM hooks (security), fentry/fexit

Each hook executes eBPF code that:

  • Observes kernel state
  • Accumulates statistics in maps (hash, array, histogram)
  • Can modify behaviour (in XDP, in LSM) within verifier limits
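The attach, observe, and accumulate cycle described above can be sketched as a plain user-space model. This is not eBPF code and involves no kernel: the `attach` and `fire` helpers, the event dictionaries, and the process names are all hypothetical, chosen only to show how a program bound to a hook updates a map that user space later reads.

```python
from collections import Counter, defaultdict

# Hypothetical model: hook name -> list of programs attached to it.
hooks = defaultdict(list)

# Stand-in for an eBPF hash map shared between the program and user space.
syscalls_by_comm = Counter()


def attach(hook_name, program):
    """Register a program so that it runs every time the named hook fires."""
    hooks[hook_name].append(program)


def fire(hook_name, event):
    """Model the kernel reaching a hook point: run each attached program."""
    for program in hooks[hook_name]:
        program(event)


def count_by_comm(event):
    """The 'program': observe the event and update a map, nothing else."""
    syscalls_by_comm[event["comm"]] += 1


attach("tracepoint:raw_syscalls:sys_enter", count_by_comm)

# Simulated events; on a real system the kernel generates these.
for comm in ["bash", "sshd", "bash", "nginx", "bash"]:
    fire("tracepoint:raw_syscalls:sys_enter", {"comm": comm})

# User space reads the map after the fact.
for comm, n in syscalls_by_comm.most_common():
    print(f"{comm}: {n}")
```

The model leaves out what makes eBPF safe to run in the kernel: the verifier, bounded execution, and restricted memory access.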

bpftrace — DSL for tracing

bpftrace, released in November 2018 by Brendan Gregg and Alastair Robertson, brings eBPF within reach of sysadmins. Its awk-like DSL fits common tracing tasks into a few lines:

# Count syscalls by process
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'

# Latency of vfs_read() in nanoseconds, as a power-of-2 histogram
bpftrace -e '
    kprobe:vfs_read  { @start[tid] = nsecs; }
    kretprobe:vfs_read /@start[tid]/ {
        @latency = hist(nsecs - @start[tid]);
        delete(@start[tid]);
    }'

Running a bpftrace script exposes system state at a level of detail previously available only with SystemTap, which is more complex and less stable.
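To make the latency script's logic explicit, here is a small user-space model of the same pattern in Python. It is a sketch under stated assumptions, not bpftrace internals: the `on_entry`, `on_return`, and `log2_bucket` names are hypothetical, timestamps are supplied by hand, and the filter `/@start[tid]/` is modeled as a key-presence check.

```python
from collections import Counter


def log2_bucket(value):
    """Return the [low, high) power-of-two bucket that contains value."""
    if value <= 0:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)
    return (low, low * 2)


start = {}           # models @start[tid]
latency = Counter()  # models @latency = hist(...)


def on_entry(tid, now_ns):
    """Models the kprobe: remember when this thread entered the function."""
    start[tid] = now_ns


def on_return(tid, now_ns):
    """Models the kretprobe with its /@start[tid]/ filter."""
    if tid not in start:  # return without a recorded entry: skip it
        return
    latency[log2_bucket(now_ns - start[tid])] += 1
    del start[tid]        # models delete(@start[tid])


# Simulated calls: (thread id, entry time, return time) in nanoseconds.
on_entry(101, 1000)
on_entry(102, 1200)
on_return(101, 2500)   # 1500 ns
on_return(102, 1900)   # 700 ns
on_return(103, 3000)   # no matching entry, ignored
on_entry(101, 4000)
on_return(101, 5100)   # 1100 ns

for (low, high), n in sorted(latency.items()):
    print(f"[{low}, {high}) ns: {n}")
```

Keying the start time by thread id keeps concurrent calls from different threads apart, and deleting the entry on return keeps the map from growing without bound.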

BCC and BPF CO-RE

  • BCC (BPF Compiler Collection, 2015): a collection of tools written in Python with embedded eBPF programs, by Brendan Gregg and the IO Visor team. It includes execsnoop, opensnoop, tcpconnect, biolatency, tcplife, and dozens more
  • BPF CO-RE (Compile Once, Run Everywhere, 2019-2020): lets an eBPF program be compiled once and run across different kernels, without recompiling on each target to match kernel-specific struct field offsets
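The offset problem that CO-RE addresses can be illustrated with a user-space model built on Python's `ctypes`. The two struct layouts below are invented for the example and do not correspond to any real kernel structure, and `relocate` is a hypothetical stand-in: in real CO-RE the layout information comes from BTF type data and the loader (libbpf) patches field offsets at load time.

```python
import ctypes


class TaskKernelA(ctypes.Structure):
    """Hypothetical layout of a kernel struct on 'kernel A'."""
    _fields_ = [("state", ctypes.c_uint64),
                ("pid", ctypes.c_uint32)]


class TaskKernelB(ctypes.Structure):
    """Same struct on 'kernel B': a new field pushes pid to a later offset."""
    _fields_ = [("state", ctypes.c_uint64),
                ("flags", ctypes.c_uint64),
                ("pid", ctypes.c_uint32)]


# Offset baked in when the program was built against kernel A.
HARDCODED_PID_OFFSET = TaskKernelA.pid.offset


def read_u32(raw, offset):
    """Read a native-endian 32-bit unsigned integer from raw bytes."""
    return ctypes.c_uint32.from_buffer_copy(raw, offset).value


def relocate(layout, field_name):
    """Model a relocation: resolve a field by name on the running kernel."""
    return getattr(layout, field_name).offset


results = {}
for layout in (TaskKernelA, TaskKernelB):
    raw = bytes(layout(pid=4242))  # raw memory as the program would see it
    naive = read_u32(raw, HARDCODED_PID_OFFSET)
    relocated = read_u32(raw, relocate(layout, "pid"))
    results[layout.__name__] = (naive, relocated)
    print(f"{layout.__name__}: hardcoded={naive} relocated={relocated}")
```

On the second layout the hardcoded offset reads the wrong bytes, while the name-based lookup still finds `pid`. That name-based lookup is the idea behind compiling once and running on many kernels.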

eBPF-based projects

  • Cilium: CNI/service mesh on eBPF (dedicated article)
  • Falco: runtime security using an eBPF probe (dedicated article)
  • Pixie (New Relic): auto-telemetry observability via eBPF
  • Tetragon (Isovalent): runtime security with enforcement
  • Parca (Polar Signals): continuous profiling
  • Pyroscope: continuous profiling
  • Katran (Facebook): L4 load balancer

In the Italian context

In Italy, eBPF is mostly adopted indirectly, through Cilium, Falco, and commercial tools. Direct use is concentrated in:

  • Advanced performance troubleshooting (SRE teams)
  • CERT and incident response
  • Advanced Linux teaching

References: eBPF in Linux kernel 3.18 (December 2014), Alexei Starovoitov. bpftrace 0.9 (November 2018), Brendan Gregg and Alastair Robertson. BCC (IO Visor). BPF CO-RE. Brendan Gregg, “BPF Performance Tools” (Addison-Wesley, 2019).
