The Risks and Dangers of Amplified Routing Loops.

This article takes a closer look at network loops and how they can be abused as part of DDoS attacks. Network loops combined with existing reflection-based attacks can create a traffic amplification factor of over a thousand. In this article, we’ll see how an attacker needs only about 50 Mb/s to fill up a 100 Gb/s link. I'll demonstrate this in a lab environment.

This blog is also a call to action for all network engineers to clean up those lingering network loops as they aren’t just bad hygiene but a significant operational DDoS risk.

Network Loops

All network engineers are familiar with network loops. A network loop causes an individual packet to bounce around in a network while consuming valuable network resources (bandwidth and packets per second). There are various reasons IP networks can have loops; they are typically caused by configuration mistakes. The only real “protection” against network loops is the Time To Live (TTL) check. However, the TTL check is less a protection against loops and more a protection against packets looping forever.

The Time to Live (TTL) refers to the amount of time or “hops” that a packet is set to exist inside a network before being discarded by a router. The TTL is an 8-bit field in the IP header, and so it has a maximum value of 255. The typical TTL value on most operating systems is 64, which should work fine in most cases.

Most network engineers know loops are bad hygiene. I think it’s much less understood (or thought about) what the operational risk of a loop is. In my experience, most loops are for IP addresses that aren’t actually in use (otherwise, it would be an outage), so typically, solving this ends up on the backburner.

A Simple Loop example

Let’s look at a simple example. The diagram below shows two routers; imagine those being your core or edge routers in your datacenter.

For whatever reason, they both believe 192.0.2.0/24 is reachable via the other router. As a result, packets for that destination will bounce between the two routers. Depending on the TTL value, the bandwidth used for that one packet will be:

packet size in bytes x 8 x TTL.

This means that for a 512-byte packet and a TTL of 60, the amount of traffic generated is about 245 kilobits (512 x 8 x 60 = 245,760 bits). In other words, the 512-byte packet turned into 30,720 bytes (60x) before it was discarded. You could think of this number 60 as the amplification factor.
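The arithmetic above can be checked with a few lines of Python. This is only a sketch of the article's simplified model, in which every unit of TTL counts as one more copy of the packet on the wire; the function names are made up for illustration.

```python
def loop_traffic_bits(packet_bytes: int, ttl: int) -> int:
    """Bits put on the wire by one looping packet: packet size in bytes x 8 x TTL."""
    return packet_bytes * 8 * ttl


def loop_traffic_bytes(packet_bytes: int, ttl: int) -> int:
    """Same quantity expressed in bytes."""
    return packet_bytes * ttl


print(loop_traffic_bits(512, 60))   # 245760
print(loop_traffic_bytes(512, 60))  # 30720
```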

But.. but.. Network loops are rare

That depends on your definition of rare. The good folks at Qrator monitor for loops on the Internet. According to the Qrator measurements, there are roughly twenty million unique loops (measured as unique router pairs).

source: Qrator Radar. Number of routing loops globally

Combining loops and common Amplification attacks.

Now that we’ve covered the risk of loops and their potential > 100x (the TTL) amplification factor, this may remind you of the traditional DDoS attacks.

The record-high DDoS attacks you read about in the news are now hitting hundreds of gigabits per second and peaking into the terabits. These large volumetric attacks are mostly the same type of attack and are commonly known as amplification or reflection attacks.

They rely on an attacker sending a small packet with a spoofed source IP, which is then reflected and amplified to the attack target. The reflectors/amplifiers are typically some kind of open UDP service that takes a small request and yields a large answer. Typical examples are DNS, NTP, SSDP, and LDAP. The large attacks you read about in the news typically combine a few of these services.

By now, it may be clear what the danger of combining these two scenarios can be. Let’s look at an example. A typical DNS amplification query could be something like this, an RRSIG query for irs.gov:

dig -t RRSIG irs.gov @8.8.8.8

This query is 64 bytes on the wire. The resulting answer is large and needs to be sent in two packets; the first packet is 1500 bytes, the second packet is 944 bytes. So in total, we have an amplification factor of (1500+944)/64 ≈ 38.

IP (tos 0x0, ttl 64, id 32810, offset 0, flags [none], 
proto UDP (17), length 64)
    192.168.0.30.57327 > 8.8.8.8.53: 42548+ [1au] RRSIG? irs.gov. (36)

IP (tos 0x0, ttl 123, id 15817, offset 0, flags [+], 
proto UDP (17), length 1500)
    8.8.8.8.53 > 192.168.0.30.57327: 42548 8/0/1 irs.gov. RRSIG, irs.gov. 
    RRSIG, irs.gov. RRSIG, irs.gov. RRSIG, irs.gov. RRSIG[|domain]
    
IP (tos 0x0, ttl 123, id 15817, offset 1448, flags [none], 
proto UDP (17), length 944)
    8.8.8.8 > 192.168.0.30: ip-proto-17
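The amplification factor can be recomputed from the IP lengths in the capture above. The sketch below assumes the factor is simply response bytes divided by request bytes at the IP layer; the function name is hypothetical.

```python
def amplification_factor(request_bytes: int, response_packet_bytes: list) -> float:
    """Ratio of total response size to request size (IP lengths, as shown by tcpdump)."""
    return sum(response_packet_bytes) / request_bytes


# 64-byte query; the answer arrives as two IP fragments of 1500 and 944 bytes.
factor = amplification_factor(64, [1500, 944])
print(f"{factor:.1f}")  # 38.2
```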

Note: There are many different types of amplification attacks. This is just a modest and straightforward DNS example. Also, note that common open reflectors, such as public DNS resolvers, typically have smart mechanisms to limit suspicious traffic, reducing the negative impact these services could have.

The tcpdump output above shows that when the answer arrives back from Google’s DNS service, the TTL value is 123; this is higher than most other public DNS resolvers (most appear to default to 64).

If we combine this attack with the ‘loop’ factor we looked at previously (determined by the TTL value), we have the total amplification factor.

Adding up the numbers

Ok, so let’s continue to work on the DNS amplification example. An amplification factor of 38 and a TTL of 123 would result in a total amplification number of:

38 * (123 / 2) = 2,337

Note that I’m dividing the TTL number by two so that we get a per-direction number: receive (RX) and transmit (TX).
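As a quick check of that multiplication, here is a minimal sketch. It assumes the two-router loop used throughout this article, where half of the hops are counted per direction; the function name is made up for illustration.

```python
def total_amplification(reflection_factor: float, ttl: int) -> float:
    """Reflection factor multiplied by the per-direction loop factor (TTL / 2)."""
    return reflection_factor * (ttl / 2)


print(total_amplification(38, 123))  # 2337.0
```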

For now, let’s use 2,337 as a reasonable total amplification number. How much traffic would an attacker need to generate 10 Gb/s or 100 Gb/s of traffic? Just about 5 Mb/s is enough to saturate a 10 Gb/s link, and roughly 50 Mb/s to saturate a 100 Gb/s link! These numbers are low enough to generate from a simple home connection. Now imagine what an attacker with bad intentions and access to a larger botnet could do…
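To see where those seed rates come from, the sketch below divides the link capacity by the total amplification number of 2,337. It assumes decimal units (1 Gb/s = 1000 Mb/s); the function name is hypothetical. The exact results come out slightly below the rounded figures quoted above.

```python
def seed_rate_mbps(link_gbps: float, total_factor: float) -> float:
    """Attacker traffic (Mb/s) needed to fill a link, given a total amplification factor."""
    return link_gbps * 1000 / total_factor


for link_gbps in (10, 100):
    rate = seed_rate_mbps(link_gbps, 2337)
    print(f"{link_gbps} Gb/s link: ~{rate:.1f} Mb/s of seed traffic")
# 10 Gb/s link: ~4.3 Mb/s of seed traffic
# 100 Gb/s link: ~42.8 Mb/s of seed traffic
```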

Let’s double-check this with a Demo

To make sure this is indeed all possible and the math adds up, I decided to build a lab to reproduce the scenario we looked at above.

The lab contains four devices:

An attacker: initiating a DNS-based reflection attack. The (spoofed) source IP is set to 192.0.2.53
A DNS resolver: receiving the DNS queries (with a spoofed source IP) and replying with an answer that is 38 times larger than the original question. The IP TTL value in the DNS answer is 123.
A router pair (rtr1 — rtr2): Both routers have a route for 192.0.2.0/24 pointing to each other. As a result, the DNS answer with a destination IP of 192.0.2.53 will bounce back and forth between the two routers until the TTL expires.

In the screenshot above, we see the attacker on the top left sending queries at a rate of 5.9 Mb/s. On the bottom left, we see the DNS resolver receiving traffic at 5.9 Mb/s from the client and answering the queries at a rate of ~173 Mb/s. The IP packets with the DNS responses have a TTL of 123.

We see the router pair on the right-hand side: rtr1 on the top right and rtr2 on the bottom right. As you can see, both devices are sending and receiving at 10 Gb/s. So, in this case, we observe how the client (attacker) turned 6 Mb/s into 10 Gb/s.
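The rates reported for the lab can be turned into observed ratios with a short calculation. This sketch uses only the approximate figures quoted above, so the results are rough. It also assumes 10 Gb/s may be the line rate of the router links, in which case the loop ratio is a lower bound rather than an exact measurement.

```python
# Rates reported in the lab description (Mb/s); all approximate.
attacker_mbps = 5.9
resolver_out_mbps = 173.0
loop_link_mbps = 10_000.0

reflection_ratio = resolver_out_mbps / attacker_mbps  # resolver output vs. attacker input
loop_ratio = loop_link_mbps / resolver_out_mbps       # per-direction loop traffic vs. resolver output
end_to_end_ratio = loop_link_mbps / attacker_mbps     # per-direction loop traffic vs. attacker input

print(f"reflection: ~{reflection_ratio:.1f}x")  # reflection: ~29.3x
print(f"loop:       ~{loop_ratio:.1f}x")        # loop:       ~57.8x
print(f"end to end: ~{end_to_end_ratio:.0f}x")  # end to end: ~1695x
```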

Wrapping up

In this blog, we looked at the danger of network loops. I hope it’s clear that loops aren’t just a hygiene or cosmetic issue but instead expose a significant vulnerability that should be cleaned up ASAP.

We saw that loops are by no means rare and that there are millions of router pairs with network loops. In fact, according to Qrator data, over 30% of all Autonomous Systems (ASes), including many of the big cloud providers, have networks with loops in them.

We observed that an attacker can easily saturate a 10 Gb/s link with 85 Mb/s (at a TTL of 240) without any UDP amplification. Or, if combined with a typical UDP amplification attack, 6 Mb/s of seed traffic will result in 10 Gb/s on a looped path, and 60 Mb/s could potentially fill up a 100 Gb/s path!

Not all loops are the same

Most loops happen between two adjacent routers; quite a few of those appear to occur between an ISP’s router and the customer router. I have also seen loops involving up to eight hops (routers) spanning various metro areas while looping between Europe and the US. Transatlantic links are expensive and hard to scale up quickly. As a result, loops on links like these will have a more significant impact.

Call to Action

I hope this article convinced you to check your network for loops and make sure you won’t be affected by attacks like these. Consider signing up for the free Qrator service, and you’ll get alerted when new loops (or other issues) are detected in your network.
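For a quick manual check, one rough heuristic is to look for a router address that shows up more than once in a traceroute-style hop list. The sketch below is only an illustration under that assumption: the function name and the sample addresses are made up, and a repeated hop is a hint to investigate rather than proof of a loop.

```python
def find_loop(hops):
    """Return the repeating segment of hops if an address is revisited, else None.

    `hops` is a list of hop addresses in path order; None marks an unanswered probe.
    """
    last_seen = {}
    for index, hop in enumerate(hops):
        if hop is None:  # skip hops that did not reply
            continue
        if hop in last_seen:
            # The path returned to a router it already visited.
            return hops[last_seen[hop]:index]
        last_seen[hop] = index
    return None


# Hypothetical path towards 192.0.2.53 that bounces between two routers.
path = ["198.51.100.1", "203.0.113.1", "203.0.113.2", "203.0.113.1", "203.0.113.2"]
print(find_loop(path))  # ['203.0.113.1', '203.0.113.2']
```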