Posted on 2014-08-15
This email is merely a generic notification. There is no outage or maintenance involved; if you are not interested, you can safely ignore it.
In the past two years we have been individually updating our clients about the (possible and most likely) upcoming issues with the internet.
Most people and companies were afraid of the issues the year 2000 (Y2K) might bring due to possible bugs in software. Others got scared because of the major Cisco and Juniper router bugs that occurred in the past years, which brought down a large portion of the internet. And then there are some people who are always scared, I guess...
Anyway, the time has now come that the internet will actually become crippled and unstable for a couple of weeks. This is what we have been warning many clients and other ISPs about in the past years. Many have acted upon this, but we can say for sure that most have not acted yet.
In the past years everyone heard (or should have heard) about the depletion of the IPv4 address space. Registries such as RIPE, which distribute those address blocks, thought it was a smart move to start limiting the size of each subnet that they hand out to members who request additional address space. This has resulted in many subnets ranging in size from a /22 (1024 IP addresses) down to a /24 (256 IP addresses) being distributed. Providers need to use these subnets across their infrastructure, and as many providers have their infrastructure widely spread, they have to divide those subnets into smaller blocks. This is normal behavior, but as the subnets they received were already very small, there was not much to divide. Providers had to start dividing them into the smallest blocks that are allowed to be routed on the internet, which are /24s (256 addresses). Every one of those blocks shows up as a separate route on the internet.
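The subnet division described above can be sketched with Python's standard ipaddress module. This is only an illustration: the 10.20.0.0/22 allocation is a made-up private range and the helper name split_into_routable_blocks is our own, not something a registry or router vendor provides.

```python
import ipaddress

# Hypothetical /22 allocation (private address space, for illustration only).
allocation = ipaddress.ip_network("10.20.0.0/22")


def split_into_routable_blocks(network, new_prefix=24):
    """Split a network into blocks of the given prefix length.

    /24 is used as the default because it is the smallest block
    that is generally accepted in the global routing table.
    """
    return list(network.subnets(new_prefix=new_prefix))


blocks = split_into_routable_blocks(allocation)

print(f"{allocation} holds {allocation.num_addresses} addresses")
for block in blocks:
    # Each /24 is announced on its own, so each one costs a route.
    print(f"  {block} -> {block.num_addresses} addresses")
print(f"Routes announced if every block is advertised separately: {len(blocks)}")
```

A single /22 announced as one block costs one route; announced as separate /24s it costs four.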
Up till a few years ago, there were about 400K routes active on the internet (see reference 1). Since last Tuesday we have reached over 512K routes. This poses a serious problem all over the internet, as many routers have a hardware limit of 512K routes that can be installed in the forwarding table of their line cards. Every brand and type of router will act differently when this limit is hit. We have seen Cisco routers crash and reload (continuously), and other routers simply not installing new routes, causing parts of the internet to become unreachable. Last Tuesday we were given a preview of these issues. Major providers like Level 3, AT&T, Cogent, Sprint and Verizon had serious issues, and surely there are many more that should be added to this list.
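To make the "not installing new routes" behavior concrete, here is a toy model of a forwarding table with a fixed capacity. It is a sketch under our own assumptions, not how any particular router is implemented: the ForwardingTable class is hypothetical, the capacity of 4 is chosen only to keep the demo short, and reading "512K" as 512 x 1024 entries is an assumption as well.

```python
# Assumed reading of "512K": 512 * 1024 = 524288 entries.
FIB_LIMIT = 512 * 1024


class ForwardingTable:
    """Toy forwarding table that silently refuses routes once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.routes = set()
        self.rejected = 0

    def install(self, prefix):
        """Try to install a prefix; return True if it is in the table afterwards."""
        if prefix in self.routes:
            return True
        if len(self.routes) >= self.capacity:
            # Table full: the route is dropped and the destination is unreachable.
            self.rejected += 1
            return False
        self.routes.add(prefix)
        return True


# Tiny demo: a table with room for 4 routes receives 6 announcements.
fib = ForwardingTable(capacity=4)
announced = [f"10.0.{i}.0/24" for i in range(6)]
results = [fib.install(prefix) for prefix in announced]

print(results)       # [True, True, True, True, False, False]
print(fib.rejected)  # 2
```

The last two prefixes never make it into the table, which is exactly the kind of partial unreachability described above. Routers that crash or reload instead behave worse, since they take all of their routes down with them.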
The crashes and other issues will be seen all over the internet in the coming weeks. We expect that from time to time there will be what we call BGP flaps all over the internet. These will cause routes to change back and forth for a few minutes or even hours, which in turn could cause a domino effect or at least a non-optimal set of routes on the internet.
It could be that many people will not notice this. But if you cannot reach your local bank's website or your company email, or you just cannot reach your girlfriend by phone... think about what we have said. And do not blame your local ISP just yet, as it could very well be any of the other 70,000 Internet Service Providers out there. Any provider that has been given an AS (Autonomous System) number and sits between you and your destination (and/or on the way back) can be causing these issues. Just wait a minute and retry.
We can now officially call the 12th of August 2014 the "512k day" (reference 5).
1. CIDR Active BGP entries (FIB): http://www.cidr-report.org/cgi-bin/plota?file=/var/data/bgp/as2.0/bgp-active.txt&descr=Active BGP entries (FIB)&ylabel=Active BGP entries (FIB)&with=step