Netfilter workshop 2019 Malaga summary
This week we had the annual Netfilter Workshop. This time the venue was in Malaga (Spain). We had the hotel right in the Malaga downtown and the meeting room was in University ETSII Malaga. We had plenty of talks, sessions, discussions and debates, and I will try to summarice in this post what it was about.
Florian Westphal, Linux kernel hacker, Netfilter coreteam member and engineer from Red Hat, started with a talk related to some work being done in the core of the Netfilter code in the kernel to convert packet processing to lists. He shared an overview of current problems and challenges. Processing in a list rather than per packet seems to have several benefits: code can be smarter and faster, so this seems like a good improvement. On the other hand, Florian thinks some of the pain to refactor all the code may not worth it. Other approaches may be considered to introduce even more fast forwarding paths (apart from the flow table mechanism which is already available).
Florian also followed up with the next topic: testing. We are starting to have a lot of duplicated code to do testing. Suggestion by Pablo is to introduce some dedicated tools to ease in maintenance and testing itself. Special mentions to nfqueue and tproxy, 2 mechanisms that require quite a bit of code to be well tested (and could be hard to setup anyway).
Ahmed Abdelsalam, engineer from Cisco, gave a talk on SRv6 Network programming. This protocol allows to simplify some interesting use cases from the network engineering point of view. For example, SRv6 aims to eliminate some tunneling and overlay protocols (VXLAN and friends), and increase native multi-tenancy support in IPv6 networks. Network Services Chaining is one of the main uses cases, which is really interesting in cloud environments. He mentioned that some Kubernetes CNI mechanisms are going to implement SRv6 soon. This protocol does not looks interesting only for the cloud use cases, but also from the general network engineering point of view. By the way, Ahmed shared some really interesting numbers and graphs regarding global IPv6 adoption. Ahmed shared the work that has been done in Linux in general and in nftables in particular to support such setups. I had the opportunity to talk more personally with Ahmed during the workshop to learn more about this mechanism and the different use cases and applications it has.
Fernando, GSoC student, gave us an overview of the OSF functionality of nftables to identify different operating systems from inside the ruleset. He shared some of the improvements he has been working on, and some of them are great, like version matching and wildcards.
Brett, engineer from Untangle, shared some plans to use a new nftables expression (nft_dict) to arbitrarily match on metadata. The proposed design is interesting because it covers some use cases from new perspectives. This triggered a debate on different design approaches and solutions to the issues presented.
Next day, Pablo Neira, head of the Netfilter project, started by opening a debate about extra features for nftables, like the ones provided via xtables-addons for iptables. The first we evaluated was GeoIP. I suggested having some generic infrastructure to be able to write/query external metadata from nftables, given we have more and more use cases looking for this (OSF, the dict expression, GeoIP). Other exhotics extension were discussed, like TARPIT, DELUDE, DHCPMAC, DNETMAP, ECHO, fuzzy, gradm, iface, ipp2p, etc.
A talk on connection tracking support for the linux bridge followed, led by Pablo. A status update on latest work was shared, and some debate happened regarding different use cases for ebtables & nftables.
Next topic was a complex one with no easy solutions: hosting of the Netfilter project infrastructure: git repositories, mailing lists, web pages, wiki deployments, bugzilla, etc. Right now the project has a couple of physical servers housed in a datacenter in Seville. But nobody has time to properly maintain them, upgrade them, and such. Also, part of our infra is getting old, for example the webpage. Some other stuff is mostly unmaintained, like project twitter accounts. Nobody actually has time to keep things updated, and this is probably the base problem. Many options were considered, including moving to github, gitlab, or other hosting providers.
After lunch, Pablo followed up with a status update on hardware flow offload capabilities for nftables. He started with an overview of the current status of ethtool_rx and tc offloads, capabilities and limitations. It should be possible for most commodity hardware to support some variable amount of offload capabilities, but apparently the code was not in very good shape. The new flow block API should improve this situation, while also giving support for nftables offload. There is a related article in LWN.
Next talk was by Phil, engineer at Red Hat. He commented on user-defined strings in nftables, which presents some challenges. Some debate happened, mostly to get to an agreement on how to proceed.
Next day, Phil was the one to continue with the workshop talks. This time the talk was about sharing his TODO list for iptables-nft, presentation and discussion of planned work. This triggered a discussion on how to handle certain bugs in Debian Buster, which have a lot of patch dependencies (so we cannot simply cherry-pick a single patch for stable). It turns out I maintain most of the Debian Netfilter packages, and Sebastian Delafond was attending the workshop too, who is also a Debian Developer. We provided some Debian-side input on how to better proceed with fixes for specific bugs in Debian. Phil continued pointing out several improvements that we require in nftables in order to support some rather exhotic uses cases in both iptables-nft and ebtables-nft.
Yi-Hung Wei, engineer working in OpenVSwitch shared some intresting features related to using the conntrack engine in certain scenarios. OVS is really useful in cloud environments. Specifically, the open discussion was around the zone based timeout policy support for other Netfilter use cases. It was pointed out by Pablo that nftables already support this. By the way, the Wikimedia Cloud Services team plans to use OVS in the near future by means of Neutron (a VXLAN+OVS setup)
Phil gave another talk related to nftables undefined behaviour situations. He has been working lately in polishing the last gaps between -legacy and -nft flavors of iptables and friends. Mostly what we have yet to solve are some corner cases. Also some weird ICMP situation. Thanks to Phil for taking care of this. Actually, Phils has been contributing a lot to the Netfilter project in the past few years.
Stephen, engineer from secunet, followed up after lunch to bring up a couple of topics about improvements to the kernel datapath using XDP. Also, he commented on partitioning the system into control and dataplace CPUs. The nftables flow table infra is doing exactly this, as pointed out by Florian.
Florian continued with some open-for.discussion topics for pending features in nftables. It looks like every day we have more requests for more different setups and use cases with nftables. We need to define uses cases as well as possible, and also try to avoid reinventing the wheel for some stuff.
Laura, engineer from Zevenet, followed up with a really interesting talk on load balancing and clustering using nftables. The amount of features and improvements added to nftlb since the last year is amazing: stateless DNAT topologies, L7 helpers support, more topologies for virtual services and backends, improvements for affinities, security policies, diffrerent clustering architectures, etc. We had an interesting conversation about how we integrate with etcd in the Wikimedia Foundation for sharing information between load balancer and for pooling/depooling backends. They are also spearheading a proposal to include support for nftables into Kubernetes kube-proxy.
Abdessamad El Abbassi, also engineer from Zevenet, shared the project that this company is developing to create a nft-based L7 proxy capable of offloading. They showed some metrics in which this new L7 proxy outperforms HAproxy for some setups. Quite interesting. Also some debate happened around SSL termination and how to better handle that situation.
That very afternoon the core team of the Netfilter project had a meeting in which some internal topics were discussed. Among other things, we decided to invite Phil Sutter to join the Netfilter coreteam.
I really enjoyed this round of Netfilter workshop. Pretty much enjoyed the time with all the folks, old friends and new ones.