netfilter is a framework for packet mangling, outside the normal Berkeley socket interface. It has four parts. Firstly, each protocol defines "hooks" (IPv4 defines 5) which are well-defined points in a packet's traversal of that protocol stack. At each of these points, the protocol will call the netfilter framework with the packet and the hook number.
Secondly, parts of the kernel can register to listen to the different hooks for each protocol. So when a packet is passed to the netfilter framework, it checks to see if anyone has registered for that protocol and hook; if so, they each get a chance to examine (and possibly alter) the packet in order, then discard the packet (
NF_DROP), allow it to pass (
NF_ACCEPT), tell netfilter to forget about the packet (
NF_STOLEN), or ask netfilter to queue the packet for userspace (
The third part is that packets that have been queued are collected (by the ip_queue driver) for sending to userspace; these packets are handled asynchronously.
The final part consists of cool comments in the code and documentation. This is instrumental for any experimental project. The netfilter motto is (stolen shamelessly from Cort Dougan):
``So... how is this better than KDE?''
(This motto narrowly edged out `Whip me, beat me, make me use ipchains').
In addition to this raw framework, various modules have been written which provide functionality similar to previous (pre-netfilter) kernels, in particular, an extensible NAT system, and an extensible packet filtering system (iptables).
Netfilter consists of three tables: Filter, Nat and Mangle. Each table has a number of build-in chains: PREROUTING, INPUT, FORWARD, OUTPUT and POSTROUTING.
Rules in the various tables are used as follows:
Packet filtering (rejecting, dropping or accepting packets)
Network Address Translation including DNAT, SNAT and Masquerading
General packet header modification such as setting the TOS value or marking packets for policy routing and traffic shaping.
Used primarily for creating exemptions from connection tracking with the NOTRACK target. Also used for stateless DNAT.
Used for stateless SNAT.
Netfilter allows us to filter packets, or mangle their headers. One special feature is that we can mark a packet with a number. This is done with the --set-mark facility.
As an example, this command marks all packets destined for port 25, outgoing mail:
# iptables -A PREROUTING -i eth0 -t mangle -p tcp --dport 25 \ -j MARK --set-mark 1
Let's say that we have multiple connections, one that is fast (and expensive, per megabyte) and one that is slower, but flat fee. We would most certainly like outgoing mail to go via the cheap route.
We've already marked the packets with a '1', we now instruct the routing policy database to act on this:
# echo 201 mail.out >> /etc/iproute2/rt_tables # ip rule add fwmark 1 table mail.out # ip rule ls 0: from all lookup local 32764: from all fwmark 1 lookup mail.out 32766: from all lookup main 32767: from all lookup default
Now we generate a route to the slow but cheap link in the mail.out table:
# /sbin/ip route add default via 184.108.40.206 dev ppp0 table mail.out
And we are done. Should we want to make exceptions, there are lots of ways to achieve this. We can modify the netfilter statement to exclude certain hosts, or we can insert a rule with a lower priority that points to the main table for our excepted hosts.
We can also use this feature to honour TOS bits by marking packets with a different type of service with different numbers, and creating rules to act on that. This way you can even dedicate, say, an ISDN line to interactive sessions.
Needless to say, this also works fine on a host that's doing NAT ('masquerading').