Wikimedia Cloud VPS: IPv6 support
Header image: Dietmar Rabich, “Cape Town (ZA), Sea Point, Nachtansicht — 2024 — 1867-70 – 2”, CC BY-SA 4.0
This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.
Wikimedia Cloud VPS is a service offered by the Wikimedia Foundation, built using OpenStack and managed by the Wikimedia Cloud Services team. It provides cloud computing resources for projects related to the Wikimedia movement, including virtual machines, databases, storage, Kubernetes, and DNS.
A few weeks ago, in April 2025, we were finally able to introduce IPv6 to the cloud virtual network, enhancing the platform’s scalability, security, and future-readiness. This is a major milestone, many years in the making, and an excellent moment to reflect on the road that got us here. A number of challenges had to be addressed before we could get to IPv6. This post covers the journey to this implementation.
The Wikimedia Foundation was an early adopter of the OpenStack technology, and the original OpenStack deployment in the organization dates back to 2011. At that time, IPv6 support was still nascent and had limited implementation across various OpenStack components. In 2012, the Wikimedia cloud users formally requested IPv6 support.
When Cloud VPS was originally deployed, we set up the network following some of the upstream-recommended patterns:
- nova-networks as the engine in charge of the software-defined virtual network
- using a flat network topology – all virtual machines would share the same network
- using a physical VLAN in the datacenter
- using Linux bridges to make this physical datacenter VLAN available to virtual machines
- using a single virtual router as the edge network gateway, which also performed a global egress NAT – barring some exceptions handled by what was called the “dmz_cidr” mechanism (illustrated in the sketch after this list)
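The idea behind “dmz_cidr” was that traffic towards certain internal destination ranges was excluded from the egress NAT, so those services would see the real virtual machine addresses rather than the shared egress address. As a rough illustration of that decision logic – the ranges and function name below are made up for the example, not the actual configuration – it boils down to something like this:

```python
from ipaddress import ip_address, ip_network

# Hypothetical example ranges; the real dmz_cidr configuration differed.
DMZ_CIDRS = [ip_network("198.51.100.0/24"), ip_network("203.0.113.0/24")]

def needs_egress_nat(destination: str) -> bool:
    """Return True if traffic to this destination should be source-NATed.

    Traffic towards any dmz_cidr range bypasses NAT, so those internal
    services see the VM's real address instead of the shared egress one.
    """
    dst = ip_address(destination)
    return not any(dst in cidr for cidr in DMZ_CIDRS)

print(needs_egress_nat("198.51.100.10"))  # False: dmz_cidr exception, no NAT
print(needs_egress_nat("8.8.8.8"))        # True: general internet, NAT applied
```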
In order for us to implement IPv6 in a way that aligned with our architectural goals and operational requirements, pretty much every element in this list would need to change. First of all, we needed to migrate from nova-networks to Neutron, the more modern component for implementing software-defined networks in OpenStack; that migration effort started in 2017. To facilitate the transition, we made the strategic decision to backport certain functionalities from nova-networks into Neutron, specifically the “dmz_cidr” mechanism and some egress NAT capabilities.
Once in Neutron, we started to think about IPv6. In 2018 there was an initial attempt to decide on the network CIDR allocations that Wikimedia Cloud Services would have. This initiative encountered unforeseen challenges and was subsequently put on hold. We focused on removing the previously backported nova-networks patches from Neutron.
Between 2020 and 2021, we initiated another significant network refresh. We were able to introduce the cloudgw project as part of a larger effort to rework the Cloud VPS edge network. The new edge routers allowed us to drop all the custom backported patches we had carried in Neutron from the nova-networks era, unblocking further progress. It is worth mentioning that the cloudgw routers use nftables as their firewalling and NAT engine.
A pivotal decision in 2022 was to expose the OpenStack APIs to the internet, which crucially enabled infrastructure management via OpenTofu. This was key to the IPv6 rollout, as will be explained later. Before this, management was limited to Horizon – the OpenStack graphical interface – or the command-line interface accessible only from internal control servers.
Later, in 2023, following the OpenStack project’s announcement of the deprecation of the neutron-linuxbridge-agent, we began to seriously consider migrating to the neutron-openvswitch-agent. This transition would, in turn, simplify the enablement of “tenant networks” – a feature allowing each OpenStack project to define its own isolated network, rather than all virtual machines sharing a single flat network.
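For illustration, with tenant networks enabled, a project can define its own isolated network and subnet through the Neutron API. A minimal sketch using the openstacksdk Python library – the cloud name, network name, and CIDR are placeholders, not our actual configuration:

```python
import openstack

# Assumes a clouds.yaml entry named "cloudvps"; placeholder values throughout.
conn = openstack.connect(cloud="cloudvps")

# Create a project-scoped (tenant) network, isolated from other projects.
net = conn.network.create_network(name="myproject-net")

# Attach an IPv4 subnet; Neutron handles addressing within the project network.
subnet = conn.network.create_subnet(
    network_id=net.id,
    name="myproject-subnet",
    ip_version=4,
    cidr="172.16.0.0/24",
)
print(f"created network {net.id} with subnet {subnet.cidr}")
```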
Once we replaced neutron-linuxbridge-agent with neutron-openvswitch-agent, we were ready to migrate virtual machines to VXLAN. Demonstrating perseverance, we decided to execute the VXLAN migration in conjunction with the IPv6 rollout.
We prepared and tested several things, including reworking the edge routing to be based on BGP/OSPF instead of static routing. In 2024 we made the initial attempt to deploy IPv6, which failed for reasons that were unknown at the time. There was a full network outage and we immediately reverted the changes. This quick rollback was feasible thanks to our adoption of OpenTofu: deploying IPv6 had been reduced to a single code change within our repository.
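While the real change was expressed as an OpenTofu diff, conceptually enabling IPv6 on a Neutron network amounts to adding an IPv6 subnet alongside the existing IPv4 one. A rough sketch of the equivalent API call using openstacksdk – the cloud name, network and subnet names, prefix, and address modes are all illustrative, not the values we deployed:

```python
import openstack

conn = openstack.connect(cloud="cloudvps")  # placeholder cloud name

net = conn.network.find_network("cloud-instances-net")  # placeholder network name

# Adding an IPv6 subnet makes the network dual-stack; the RA and address
# modes control how virtual machines obtain their IPv6 addresses.
v6_subnet = conn.network.create_subnet(
    network_id=net.id,
    name="cloud-instances-v6",     # illustrative
    ip_version=6,
    cidr="2001:db8:100::/64",      # documentation prefix, not the real one
    ipv6_ra_mode="dhcpv6-stateless",
    ipv6_address_mode="dhcpv6-stateless",
)
print(v6_subnet.cidr)
```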
We started an investigation, corrected a few issues, and increased our network functional testing coverage before trying again. One of the problems we discovered was that Neutron would enable the “enable_snat” configuration flag for our main router when adding the new external IPv6 address.
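In Neutron, SNAT is a property of the router’s external gateway, so updating the gateway – for example, to add a new external address – can flip the flag. A sketch of how the flag can be inspected and turned back off via openstacksdk; this is not necessarily how we addressed it, and the cloud and router names are placeholders:

```python
import openstack

conn = openstack.connect(cloud="cloudvps")  # placeholder cloud name

router = conn.network.find_router("cloud-main-router")  # placeholder router name
gw = router.external_gateway_info
print("enable_snat is", gw.get("enable_snat"))

# Egress NAT is performed by the cloudgw hosts, so Neutron should not SNAT.
if gw.get("enable_snat"):
    gw["enable_snat"] = False
    conn.network.update_router(router, external_gateway_info=gw)
```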
Finally, in April 2025, after many years in the making, IPv6 was successfully deployed.
Compared to the network from 2011, we now have:
- Neutron as the engine in charge of the software-defined virtual network
- Ready-to-use tenant networks
- Using a VXLAN-based overlay network
- Using neutron-openvswitch-agent to provide networking to virtual machines
- A modern and robust edge network setup
Over time, the WMCS team has skillfully navigated numerous challenges to ensure our service offerings consistently meet high standards of quality and operational efficiency. By often engaging in multi-year planning, we have been able to set and achieve significant milestones.
The successful IPv6 deployment stands as further testament to the team’s dedication and hard work over the years. I believe we can confidently say that the 2025 Cloud VPS represents its most advanced and capable iteration to date.