Last week's DDoS attack caused big problems for some, went unnoticed by others.
We reported last week on a
massive distributed denial of service attack that was intended to take
anti-spam organization Spamhaus offline.
We described the scale of the attack as "Internet-threatening,"
elaborating further that the attack, peaking at more than 300 gigabits
per second, "is the kind of scale that threatens the core routers that
join the Internet's disparate networks."
Subsequently, posts on
Gizmodo and
The Guardian called into question these assessments, with Gizmodo casting doubt on the description by asking some "simple questions" and
The Guardian specifically claiming that it was "shoddy journalism."
We stand by our original description and reporting. Here's why.
A network of networks
Before looking at the anti-Spamhaus attacks specifically, it's
important to know a little about how the Internet is constructed. The
Internet is often described as a "network of networks." Organizations
around the world have their own independently owned and operated
networks—university campuses, the retail Internet Service Providers
(ISPs) that provide DSL, cable, and more exotic connections to homes and
businesses, corporations, government departments, and so on and so
forth.
All of these are useful networks in their own right, but they become
enormously more useful when they're joined up. Joining up networks
creates an
internetwork. The first internetwork infrastructure
came from the US government, and the first internetwork, ARPANET, joined
a number of US universities in the 1970s.
Through the development of a series of other internetworks—both
academic and commercial—and the establishment of international
internetworks, we came to the situation we have today.
A small number of companies (about a dozen, though it's hard to know
with absolute certainty) own and operate high-speed, transnational
networks. These companies, called Tier 1 providers, pass traffic between
one another freely, providing transfers between smaller networks. This
free traffic transfer is called peering.
They provide the thing that's closest to the Internet's "backbone"
(though the term isn't really accurate: there's no single fragile spine,
but rather a complex mesh of redundant, interconnected networks): from a
Tier 1 provider, it's possible to send traffic to any public IP
address.
One level down are the Tier 2 providers. Tier 2 providers
buy Internet connectivity from Tier 1 providers, an arrangement called
transit. However, they also connect directly to other Tier 2 providers through
peering relationships. Tier 2 providers can be regional, but they can also be large transnational networks.

How customers connect to ISPs and ISPs connect between tiers.
Large Tier 2 providers can peer with many, many other Tier 2
providers, with the result that Internet traffic from that provider only
infrequently has to use the Tier 1 connectivity. The distinction
between Tier 1 and Tier 2 is not size or scale as such; it's simply that
Tier 1 networks only use peering. Tier 2 networks have to buy at least
some transit.
Tier 1 providers generally sell only to Tier 2 providers. Tier 2
providers may sell directly to end users, or they may sell to Tier 3
providers: ISPs that only buy transit and don't have any peering.
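The tier definitions above reduce to two yes/no questions: does the provider peer, and does it buy transit? The following is a minimal sketch of that rule. The function name `classify_tier` and its parameters are illustrative assumptions, and real-world classification is murkier than two booleans suggest.

```python
def classify_tier(has_peering: bool, buys_transit: bool) -> str:
    """Classify a provider using the article's definitions (hypothetical helper)."""
    if has_peering and not buys_transit:
        return "Tier 1"  # reaches every network through peering alone
    if has_peering and buys_transit:
        return "Tier 2"  # peers where it can, buys transit for the rest
    if buys_transit:
        return "Tier 3"  # only buys transit, has no peering
    return "unconnected"  # neither: not joined to the internetwork


if __name__ == "__main__":
    print(classify_tier(has_peering=True, buys_transit=False))
    print(classify_tier(has_peering=True, buys_transit=True))
    print(classify_tier(has_peering=False, buys_transit=True))
```

Note that size never appears as an input: a very large network that buys any transit at all still lands in Tier 2.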
Tier 2 and 3 providers fall into two further categories. They can be
multi-homed, with multiple transit connections to different networks, or they can be
single homed, with just one transit link.
When two providers want to connect to one another, whether for
peering or for transit, they obviously need a physical link of some
kind. For providers with only a few connections, one-off point-to-point
connections known as private network interconnects (PNIs) are used. But
if you want to connect with lots of peers, you don't want to build lots
of individual, expensive optical fiber links. You want to consolidate:
bring all the peers together in one place, and then stick a router or a
network switch between them all to join them up.
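A rough way to see why consolidation wins is to count links. This sketch assumes every provider wants to reach every other provider, and that an exchange needs exactly one link per provider. Both are simplifications, and the function names are made up for illustration.

```python
def private_links(n_providers: int) -> int:
    """Point-to-point links needed if every pair of providers connects directly."""
    return n_providers * (n_providers - 1) // 2


def exchange_links(n_providers: int) -> int:
    """Links needed if every provider connects once to a shared exchange."""
    return n_providers


if __name__ == "__main__":
    for n in (5, 20, 100):
        print(n, "providers:", private_links(n), "private links vs",
              exchange_links(n), "exchange links")
```

For 100 providers, that is 4,950 individual links against 100 links into a shared switch.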
As a result, a few hundred Internet Exchanges (IXs) are dotted around
the globe. At each IX, there may be hundreds of providers from all three
tiers coming together. The IXs generally use Ethernet infrastructure for
their internal connectivity. Gigabit and 10 gigabit Ethernet are
predominant, but 100 gigabit Ethernet is starting to gain more use,
though its cost today prevents it from being used as the standard
technology. Longer links may be gigabit, 10 gigabit, 40 gigabit, or 100
gigabit. In principle, faster speeds still are possible through
aggregating these 100 gigabit connections, but in practice, today's IXs
are mainly 10 gigabit (or aggregated multiples thereof) networks.
IXs are important. Major service providers such as Google, Microsoft, and Facebook
connect to IXs.
If two Tier 2 operators can send traffic directly to each other, via
peering at an IX, that's cheaper and more efficient than going via
transit to a Tier 1.
Enter Spamhaus, STOPhaus, and CloudFlare
STOPhaus doesn't care much for Spamhaus.
Spamhaus provides useful services to e-mail administrators wishing to
keep junk e-mail out of the servers they own and operate. STOPhaus is
an informal group that doesn't like Spamhaus. STOPhaus members wanted to
knock Spamhaus off the Internet using a distributed denial of service
(DDoS) attack that flooded Spamhaus's systems and drowned out legitimate
traffic. They did so by aiming a
flood of DNS traffic at Spamhaus's servers.
In response, Spamhaus
started using the services of CloudFlare,
a company that specializes in providing robust serving that's difficult
to take offline with DDoS attacks. CloudFlare does this by replicating
content around the globe and using a routing technique called anycast.
Anycast allows servers with the same IP address to coexist
simultaneously around the globe. Internet providers will generally route
traffic to the geographically
nearest instance of those anycasted IP addresses.
This does two things. First, by picking a site that's geographically close, it cuts the
latency of reaching the site, making it respond faster. Second, it dilutes the
effect of DDoS attacks. Instead of a distributed attack using systems
around the world being able to focus its flood on a single IP address in
a single location, each attacking system can only focus on a
nearby target.
Two attackers on opposite sides of the world may still be aiming at
the same victim IP address, but their traffic will go to different
computers that are relatively nearby.
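The dilution effect can be sketched with a toy model. It assumes that each attacker's traffic lands on the geographically nearest site, measured as straight-line distance on a flat grid. Real routing follows provider policy rather than pure geography. The site names, coordinates, and traffic rates below are invented for illustration.

```python
from collections import defaultdict
from math import hypot


def nearest_site(source, sites):
    """Return the name of the site closest to `source` (toy distance model)."""
    return min(
        sites,
        key=lambda name: hypot(sites[name][0] - source[0],
                               sites[name][1] - source[1]),
    )


def load_per_site(attackers, sites):
    """Sum attack traffic (Gbps) by the site each attacker is routed to."""
    load = defaultdict(float)
    for position, gbps in attackers:
        load[nearest_site(position, sites)] += gbps
    return dict(load)


if __name__ == "__main__":
    # Hypothetical anycast sites sharing one IP address.
    sites = {"west": (0, 0), "east": (100, 0)}
    # Hypothetical attackers: (position, traffic in Gbps).
    attackers = [((10, 5), 40.0), ((90, -5), 60.0), ((20, 0), 25.0)]
    print(load_per_site(attackers, sites))
    print("without anycast:", sum(gbps for _, gbps in attackers), "Gbps at one site")
```

With a single location, all 125 Gbps in this example would converge on one site. With two anycast sites, no site sees more than 65 Gbps.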
For CloudFlare's technology to work well, it needs a high level of distribution. The company
currently reports that it has 23 data centers around the world and peers with
nearly 70 different Tier 1 and Tier 2 providers around the world; it does this with a mix of PNIs and IXs.
CloudFlare did its job, and Spamhaus remained accessible. Trying to flood the anycasted addresses wasn't working.
So the attackers changed their approach. Rather than attacking
CloudFlare's distributed servers, they took aim at the network
infrastructure used by CloudFlare's providers: the IXs. Attacks were
made on IXs in Frankfurt, Amsterdam, London, and Hong Kong. It's the
London IX, LINX, that suffered.

Optical patch panel at the AMS-IX Internet exchange point in Amsterdam, which was targeted by the attackers.
Each provider peering at LINX has its own IP address, through which
traffic to that provider is passed. The attackers noticed that LINX's IP
addresses were accessible from anywhere in the world. This, in turn,
meant that they could be the target of a DDoS attack.
On March 23rd, the attackers used this information to attack specific
addresses within LINX. As is typical in IXs, these are addresses that
are generally interconnected with 10 gigabit Ethernet. Throwing hundreds
of gigabits per second at them swamped them. The result was that
CloudFlare-protected services were, for some people (especially within
the UK), slow or inaccessible. LINX also suffered an issue with its
traffic monitoring, which showed traffic across its network dropping by
approximately half; this may have been related.
LINX subsequently changed its network configuration so that the IP
addresses in question weren't reachable from outside LINX's own trusted
network. This cut off the attacks, and normal operation was restored
soon after.
The fault here was arguably in part LINX's, as it should have been
configured in a safer way from the outset (AMS-IX, the Amsterdam IX,
for example, explicitly prohibits
advertising routes to its internal IP addresses), but it wasn't, and it
caused trouble as a result. That said, the IX community does not
universally agree with this approach.
Breaking IXs breaks the Internet
IX infrastructure is core to the Internet. It is not the only
Internet infrastructure, and there would still be an Internet if an IX
blew up or burned down, but it wouldn't be the same Internet. LINX's
infrastructure
in aggregate has several terabits per second of capacity, and the Internet as a whole has an aggregate of
hundreds of terabits per second of capacity, but any one provider
within
LINX has only a fraction of that capacity; big ISPs have 80-100 Gbps,
but few (if any) have more than that. Having lots of bandwidth somewhere
else in the world doesn't actually help very much.
Moreover, 300Gbps is well above the level at which it's easy to
quickly add extra bandwidth to respond. 100 gigabit Ethernet is
expensive: IXs and ISPs don't have an abundance of 100 gigabit network
ports lying around waiting for a rainy day, and they
certainly don't give every
customer peering at the IX an extra few hundred gigabits of
capacity "just in case." At LINX, for example, 100 gigabit ports are
installed on demand. They're too expensive to treat any other way.
Richard Steenbergen, currently CTO for GTT, a large network provider and upstream operator to, among other customers, CloudFlare,
wrote in response to Gizmodo's article:
My company, most other large Internet carriers, and
even the largest Internet exchange points, all deliver traffic at
multi-terabits-per-second rates, so in the grand scheme of things 300
Gbps is certainly not going to destroy the Internet, wipe anybody off
the map, or even show up as more than a blip on the charts of global
traffic levels. That said, there is absolutely NO network on this planet
who maintains 300 Gbps of active/lit but unused capacity to every point
in their network. This would be incredibly expensive and wasteful, and
most of us are trying to run for-profit commercial networks, so when 300
Gbps of NEW traffic suddenly shows up and all wants to go to ONE
location, someone is going to have a bad day.
To make this more concrete: GTT has multiple terabits per second of
connection around the world. But its IPv4 connectivity at LINX is
reported to be 30Gbps.
Send more than 30Gbps of traffic to its LINX IP address and anyone
counting on using GTT for peering/transit through LINX is going to have a
rough time. CloudFlare
appears to have just 10Gbps of connectivity to LINX. The Internet is full of choke points such as this.
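The arithmetic behind a choke point is simple. The sketch below uses the figures quoted above (a 300Gbps peak, a 30Gbps port, and a 10Gbps port) and assumes, as a worst case, that the entire flood is aimed at a single port and that the port drops excess traffic uniformly without distinguishing attack packets from legitimate ones. The function name is illustrative.

```python
def fraction_dropped(offered_gbps: float, port_gbps: float) -> float:
    """Share of offered traffic that cannot fit through a port of the given capacity."""
    if offered_gbps <= port_gbps:
        return 0.0
    return 1 - port_gbps / offered_gbps


if __name__ == "__main__":
    attack_gbps = 300
    for label, port_gbps in (("30Gbps port", 30), ("10Gbps port", 10)):
        lost = fraction_dropped(attack_gbps, port_gbps)
        print(f"{label}: roughly {lost:.0%} of traffic dropped")
```

Under those assumptions, about nine in ten packets arriving at the 30Gbps port are lost, legitimate ones included, no matter how many terabits the same provider has available elsewhere.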
Paul Vixie, Internet engineer and co-founder of the Internet Systems
Consortium, concurred, telling Ars via e-mail, "300 Gbps is fatal for
some parts of the 'Net, but not all parts. It's when they started going
after Internet exchange connections that third parties started losing."
Large providers—both on the demand side, such as ISPs, and the supply
side, such as Facebook or Google or the BBC—peer at multiple IXs and
have PNIs, so they're not so dependent on the health of any one IX.
Small ones, however, do not. Flood the IX's infrastructure and they'll
effectively drop off the Internet.
This is breaking the Internet. The "network of networks" reverts to
being "disjoint networks," at least for some. For the rest, multihoming
should mask any fatal errors. Things may be a little slower, and for
ISPs having to switch to transit instead of peering they may be a little
more expensive, but disruption shouldn't be
too visible.
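The difference between single-homed and multihomed networks can be shown on a toy graph. The topology below is invented for illustration: two small ISPs reach the world only through one IX, while a larger ISP and a content site also buy transit from a Tier 1. Removing the IX and counting the connected pieces shows who drops off.

```python
def components(links, removed=frozenset()):
    """Return the connected groups of networks once `removed` nodes are taken out."""
    nodes = {node for link in links for node in link} - set(removed)
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first walk to collect everything reachable from `start`.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Hypothetical topology; every name here is made up.
    links = [
        ("small_isp_a", "ix"), ("small_isp_b", "ix"),
        ("big_isp", "ix"), ("big_isp", "tier1"),
        ("content_site", "ix"), ("content_site", "tier1"),
    ]
    print("IX healthy:", components(links))
    print("IX flooded:", components(links, removed={"ix"}))
```

With the IX gone, the multihomed networks still reach each other over transit, while each small ISP is left as a network of one.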
Similar behavior occurs in other Internet incidents. When undersea
cables are cut, it's rare for a national network to be completely
isolated, but cut enough cables and the Internet can become disjointed,
as it reportedly did in East Africa after
four cables were cut simultaneously
in 2012. When faced with cable cuts, the global Internet is fine, and
the national networks are also fine. They're just not joined up.
Similarly, when Pakistan
published routes to the global Internet that made YouTube unreachable, every network making up the Internet remained reachable except one: YouTube's network.
STOPhaus even tried a
similar attack of their own on Spamhaus, trying to hijack Spamhaus's IP address range and redirect it to
CyberBunker.
The Internet is generally quite resilient to this kind of thing. But problems do happen.
Not that shoddy
If the Gizmodo and
Guardian writers were perhaps expecting a
broken Internet to mean that the entire thing simultaneously fell apart
into a million different networks, then certainly, these attacks (and
others, such as hijacking IP addresses or cutting cables) won't "break
the Internet."
If that's what you're after, however, nothing really will. Not because the Internet was designed to survive a nuclear attack—
it wasn't—but because it has grown to be widely distributed, with lots of redundant links, and few people really care about the
entire Internet.
Gizmodo's questions about the attacks were:
- Why wasn't my internet slow?
- Why didn't anyone notice this over the course of the past week, when it began?
- Why isn't anyone without a financial stake in the attack saying the attack was this much of a disaster?
- Why haven't there been any reports of Netflix outages, as the New York Times and BBC reported?
- Why do firms that do nothing but monitor the health of the web,
like Internet Traffic Report, show zero evidence of this Dutch conflict
spilling over into our online backyards?
Four of those, at least, are easy enough to answer.
- Because you're an American, in America, primarily accessing
American sites. The Internet, however, is a global network. Disruption
in one area need not lead to disruption in other areas, particularly if
the services you are interested in are geographically close. Network
security company Arbor Networks noted that the DDoS attack was substantially larger than previous attacks, and its Asia Pacific analyst Roland Dobbins wrote that problems were indeed seen by providers in Europe, the Middle East, Africa, and Asia-Pacific.
- They did. Quoting Andree Toonk, a network engineer for OpenDNS, "Those who claim there was no impact probably don't run global networks. I've seen Tier1's struggle and had to route around it, EU and Asia! significant packet loss." This corroborates CloudFlare's claim that Tier 1 providers were congested.
- People who do not work for CloudFlare are saying that the
attack was substantial, that it was disruptive, and that it caused
service problems for some people. Indeed, they're annoyed by it, as it
rendered other CloudFlare-hosted sites unusable from the UK.
For example, Andy Gambles of UK-based SSL provider and CloudFlare
customer ServerTastic complained to CloudFlare, "Our sites were dead slow/practically offline for the whole time."
- Who knows?
- Two reasons. First, because the Internet Traffic Report doesn't
monitor Africa at all, has poor coverage of Asia, has European data
that's sporadic at best (lots of the systems it tests simply aren't
returning any traffic at all), and provides only aggregate graphs for
periods longer than 24 hours, making it impossible to see local effects
that occurred on the 23rd of March. It's a useful resource, but hardly
the final arbiter of whether the Internet is working well or not.
Second, because the Internet doesn't work that way. If a network that
you don't care about has been cut off from the network of networks,
you'll never notice or care.
CloudFlare's blog post, "The DDoS that almost broke the Internet,"
certainly had a rather hyperbolic title. It's probably not the first
blog post to have a hyperbolic title. It almost certainly won't be the
last. Shattering the Internet into a billion disconnected hosts will
never happen, so in that sense, the Internet is safe. But breaking it
into two, or three, or a handful of separate networks? With the right
amount of traffic in the right place, that can happen.