Adding BPF to my DHCP Server
Ages ago, I wrote a small DHCP server and client in Go.
DHCP has a few (surprising to me, back then) complications which force one to use raw sockets:
A client needs to be able to perform an ARP ping and receive replies from the server before its IP stack is configured.
Likewise, the server needs to be able to see requests which are not addressed to its IP or MAC (since the client doesn’t know this information yet!) and therefore needs to ‘sniff’ all IP traffic, which can be quite expensive if the machine is busy doing other things. Even if my server were written in assembly, things would be suboptimal, since each received packet causes an (often unneeded) context switch. So how can we do better?
BPF to the rescue
The Berkeley Packet Filter (BPF) allows us to attach a filter to a (raw) socket. BPF filters are small ‘BPF assembly’ programs which inspect each packet and produce a verdict on whether or not to forward it to the application listening on the socket. This is perfect for our use case, since it allows us to discard everything that doesn’t look like a DHCP message at the kernel level.
Attaching a BPF filter using Go
Time for some raw syscalls! First, let’s check out the kernel documentation on how to use BPF from C code:
struct sock_fprog bpf = {
	.len = ARRAY_SIZE(code),
	.filter = code,
};

sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
if (sock < 0)
	/* ... bail out ... */

ret = setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
if (ret < 0)
	/* ... bail out ... */

/* ... */
Ok, that looks pretty straightforward, so let’s translate this to Go:
We already have code to open the raw socket. Note that we pass in ETH_P_IP instead of ETH_P_ALL, which already restricts the traffic we see to IP traffic (this will be important later when we write our filter):
sock, err := syscall.Socket(syscall.AF_PACKET, syscall.SOCK_DGRAM, syscall.ETH_P_IP)
So all we have to do now is mimic the construction of sock_fprog and the call to attach the filter.
(Finished code)
// build sock_fprog struct
program := unix.SockFprog{
	Len:    uint16(len(assembled)),
	Filter: (*unix.SockFilter)(unsafe.Pointer(&assembled[0])),
}
// convert this to raw bytes
b := (*[unix.SizeofSockFprog]byte)(unsafe.Pointer(&program))[:unix.SizeofSockFprog]
// and finally perform the syscall to attach the filter:
if _, _, errno := syscall.Syscall6(syscall.SYS_SETSOCKOPT,
	uintptr(fd), uintptr(syscall.SOL_SOCKET), uintptr(syscall.SO_ATTACH_FILTER),
	uintptr(unsafe.Pointer(&b[0])), uintptr(len(b)), 0); errno != 0 {
	return errno
}
Note that I borrowed the conversion to raw bytes from sys_bpf.go (which unfortunately does not export any of its code in a way that would be useful for us).
With all of this set up, we can finally write a filter!
BPF filter for DHCP
In order to reduce context switches (and the load put on our raw packet parser), we try to filter out everything that doesn’t look like a DHCP message. Note that the filter does not have to be perfect: we just want to throw away everything which definitely is not a DHCP message, meaning that we are only interested in packets which:
- Are IPv4 and UDP
- Have the destination port set to 67 or 68
- Contain the DHCP opcode (request=1, reply=2) we are looking for.
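Written as plain Go, these criteria amount to roughly the following check. This is a userspace sketch for illustration only: looksLikeDHCP is a hypothetical name, and it assumes the packet starts at an IPv4 header without options (20 bytes), followed by the 8-byte UDP header.

```go
package main

import "fmt"

// looksLikeDHCP is a hypothetical userspace version of the checks listed above.
// pkt is expected to start at the IPv4 header (no Ethernet header).
func looksLikeDHCP(pkt []byte, dstPort uint16, op byte) bool {
	const (
		ipHeaderLen  = 20 // assumes an IPv4 header without options
		udpHeaderLen = 8
	)
	if len(pkt) <= ipHeaderLen+udpHeaderLen {
		return false // too short to contain a DHCP opcode
	}
	if pkt[0] != 0x45 { // version 4, header length 5*4 = 20 bytes
		return false
	}
	if pkt[9] != 0x11 { // IP protocol 17 = UDP
		return false
	}
	// The UDP destination port is a big-endian 16-bit field at bytes 22-23.
	port := uint16(pkt[ipHeaderLen+2])<<8 | uint16(pkt[ipHeaderLen+3])
	if port != dstPort {
		return false
	}
	return pkt[ipHeaderLen+udpHeaderLen] == op // first byte of the DHCP message
}

func main() {
	pkt := make([]byte, 29)
	pkt[0] = 0x45 // IPv4, IHL 5
	pkt[9] = 0x11 // UDP
	pkt[23] = 67  // destination port 67 (high byte at offset 22 stays 0)
	pkt[28] = 1   // DHCP request
	fmt.Println(looksLikeDHCP(pkt, 67, 1)) // true
	fmt.Println(looksLikeDHCP(pkt, 68, 1)) // false
}
```

The point of BPF is to run the equivalent of this function inside the kernel, so that rejected packets never reach the process at all.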
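Note that we opened our packet socket with SOCK_DGRAM, which means that we won’t see the Ethernet frame header, and with ETH_P_IP, which means that every packet starts with an IPv4 header. This affects the offsets: byte 0 is the first byte of the IP header.
All this can be done using this simple filter: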
func FilterDhcpOperation(dstPort uint32, op uint8) []bpf.Instruction {
	return []bpf.Instruction{
		// IPv4 with a 20-byte header (version 4, IHL 5)
		bpf.LoadAbsolute{Off: 0, Size: 1},
		// Skip 7 instructions (to the reject) if loaded value != 0x45.
		bpf.JumpIf{Val: 0x45, SkipFalse: 7},
		// UDP (IP protocol field)
		bpf.LoadAbsolute{Off: 9, Size: 1},
		// Skip 5 instructions if loaded value is != 0x11.
		bpf.JumpIf{Val: 0x11, SkipFalse: 5},
		// DST Port (67 for server, 68 for client).
		// Only the low byte of the 16-bit port is compared, which is
		// enough for a filter that does not have to be perfect.
		bpf.LoadAbsolute{Off: 23, Size: 1},
		// Skip 3 instructions if loaded value is not `dstPort`.
		bpf.JumpIf{Val: dstPort, SkipFalse: 3},
		// DHCP Opcode (first byte after the 20-byte IP and 8-byte UDP headers)
		bpf.LoadAbsolute{Off: 28, Size: 1},
		// Skip one instruction if loaded value is not `op`.
		bpf.JumpIf{Val: uint32(op), SkipFalse: 1},
		// Return full packet payload
		bpf.RetConstant{Val: math.MaxUint32},
		// Reject
		bpf.RetConstant{Val: 0},
	}
}
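Jump offsets are easy to get wrong by one, so it can help to run the program in userspace before attaching it. The sketch below hand-assembles the same ten instructions and interprets them with a tiny loop that knows only the three opcodes used here. Everything in it (insn, dhcpFilter, run) is an illustrative stand-in written against the classic BPF opcode values, not the bpf package's own types; that package also ships a userspace VM that could be used instead.

```go
package main

import "fmt"

// insn mirrors one classic BPF instruction (illustrative type).
type insn struct {
	op     uint16
	jt, jf uint8
	k      uint32
}

const (
	ldAbsB = 0x30 // BPF_LD|BPF_B|BPF_ABS: load one byte at offset k
	jeqK   = 0x15 // BPF_JMP|BPF_JEQ|BPF_K: compare accumulator with k
	retK   = 0x06 // BPF_RET|BPF_K: return k
)

// dhcpFilter is a hand-assembled equivalent of FilterDhcpOperation.
func dhcpFilter(dstPort uint32, op uint8) []insn {
	return []insn{
		{ldAbsB, 0, 0, 0}, {jeqK, 0, 7, 0x45},
		{ldAbsB, 0, 0, 9}, {jeqK, 0, 5, 0x11},
		{ldAbsB, 0, 0, 23}, {jeqK, 0, 3, dstPort},
		{ldAbsB, 0, 0, 28}, {jeqK, 0, 1, uint32(op)},
		{retK, 0, 0, 0xffffffff}, // accept: keep the whole packet
		{retK, 0, 0, 0},          // reject
	}
}

// run interprets the three opcodes above and returns the verdict
// (number of bytes to keep; 0 means drop).
func run(prog []insn, pkt []byte) uint32 {
	var a uint32 // accumulator
	for pc := 0; pc < len(prog); pc++ {
		in := prog[pc]
		switch in.op {
		case ldAbsB:
			if int(in.k) >= len(pkt) {
				return 0 // an out-of-bounds load drops the packet
			}
			a = uint32(pkt[in.k])
		case jeqK:
			// Offsets are relative to the next instruction; the loop's pc++ adds the 1.
			if a == in.k {
				pc += int(in.jt)
			} else {
				pc += int(in.jf)
			}
		case retK:
			return in.k
		}
	}
	return 0
}

func main() {
	pkt := make([]byte, 29)
	pkt[0], pkt[9], pkt[23], pkt[28] = 0x45, 0x11, 67, 1
	prog := dhcpFilter(67, 1)
	fmt.Println(run(prog, pkt) != 0) // true: accepted
	pkt[9] = 0x06                    // TCP instead of UDP
	fmt.Println(run(prog, pkt) != 0) // false: rejected
}
```

Every failed comparison lands on the last instruction: the jump at index 1 skips 7 and resumes at index 9, the one at index 3 skips 5, and so on.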
Verdict
BPF is great: setting it up is rather simple and the VM/assembly language is quick to learn. Adding this simple filter caused the CPU load of my DHCP server (running on a Raspberry Pi 2) to drop from 2% to 0% :)