Observing ECMP With Dublin Traceroute

Since this is a new blog, let’s start with an introductory post :)

A small introduction to ECMP

Dublin Traceroute was created with multi-path networks in mind. Large networks very commonly use ECMP (equal-cost multi-path routing) to spread packets across multiple paths in order to improve reliability, increase bandwidth and more.
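To make the idea a bit more concrete, here is a toy sketch of how an ECMP router might pick a next hop. It assumes the router hashes the usual 5-tuple (source and destination address, protocol, source and destination port) and uses the result to index a list of equal-cost next hops. Real routers use vendor-specific hash functions and field selections; the CRC32 hash, the `ecmp_next_hop` function and the addresses below are made up for illustration.

```python
import zlib

def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
    """Pick one of several equal-cost next hops from a hash of the 5-tuple.

    CRC32 is only a stand-in here: actual routers use their own hash.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# Hypothetical equal-cost next hops behind one router.
next_hops = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

# The same flow always hashes to the same next hop.
a = ecmp_next_hop("192.0.2.10", "8.8.8.8", "udp", 12345, 33434, next_hops)
b = ecmp_next_hop("192.0.2.10", "8.8.8.8", "udp", 12345, 33434, next_hops)
assert a == b

# Changing only the destination port changes the flow, so the
# packet may be sent over a different next hop.
chosen = {
    port: ecmp_next_hop("192.0.2.10", "8.8.8.8", "udp", 12345, port, next_hops)
    for port in range(33434, 33454)
}
for port, hop in chosen.items():
    print(port, "->", hop)
```

The important property is that the choice is per flow, not per packet: packets of one flow stay on one path, while packets of different flows can be spread over all of them.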

While very useful, ECMP introduces some challenges when running traceroute on this kind of network. It is not possible to anticipate which path each packet will take, and since there are multiple possible paths, a regular traceroute may show connections that do not exist. For example, the following diagrams show in black the real connections between devices, and in red what a regular traceroute might see:

or worse, on variable-length paths:

These problems were analyzed and addressed a decade ago by a team of researchers, which resulted in the publication of several papers and a proof-of-concept tool, Paris-Traceroute.
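The key idea behind that work is to keep the fields that load balancers hash on constant for all the probes of one trace, so that every probe follows the same path. The toy simulation below sketches the difference on a small diamond-shaped network. Everything in it is hypothetical: the hop names, the two paths, and a load balancer that simply picks a branch from the parity of the destination port. It also sends a single probe per TTL, which is a simplification.

```python
# Two real paths through a diamond: A-B-D and A-C-E.
PATHS = {0: ["A", "B", "D"], 1: ["A", "C", "E"]}
LINKS = {("A", "B"), ("B", "D"), ("A", "C"), ("C", "E")}
BASE_PORT = 33434

def path_for_flow(dst_port):
    """Toy load balancer: the branch depends only on the destination port."""
    return PATHS[dst_port % 2]

def classic_traceroute(max_ttl=3):
    """Regular UDP traceroute: the destination port grows with every probe."""
    return [path_for_flow(BASE_PORT + ttl - 1)[ttl - 1]
            for ttl in range(1, max_ttl + 1)]

def fixed_flow_traceroute(dst_port, max_ttl=3):
    """Paris-style trace: the destination port is fixed for the whole trace."""
    return [path_for_flow(dst_port)[ttl - 1] for ttl in range(1, max_ttl + 1)]

def links(hops):
    """Consecutive hop pairs, i.e. the links a trace appears to show."""
    return list(zip(hops, hops[1:]))

classic = classic_traceroute()
print(classic)                                      # ['A', 'C', 'D']
print([l for l in links(classic) if l not in LINKS])  # [('C', 'D')]

for port in (BASE_PORT, BASE_PORT + 1):
    print(port, fixed_flow_traceroute(port))
# 33434 ['A', 'B', 'D']
# 33435 ['A', 'C', 'E']
```

The regular trace reports a C-D link that does not exist, because its probes were spread over both branches. With a fixed flow each trace follows one real path, and varying the flow between traces enumerates the others.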

Dublin Traceroute is an implementation of the techniques found in Paris-Traceroute, i.e. it is able to map multipath networks that use ECMP. On top of this there are a few new features:

  • a technique to detect NATs, and where precisely they are implemented (so that it works on the internet)
  • a way to detect broken NATs, by using loose packet matching when requested. See my answer on Scaleway’s forum for more context
  • zero-TTL forwarding bug detection - a bug that some routers (still!) have, where they forward packets with TTL=1 instead of responding with an ICMP TTL-expired message

I keep adding features over time, so this list should grow in the future.

So, how do I traceroute ECMP networks?

Glad you asked! Nothing beats an example. Google uses ECMP for their global DNS resolver, so what better than seeing it with our own eyes? Install dublin-traceroute (see https://dublin-traceroute.net for instructions on how to do it) and run it as shown below. I am running it on a 3G connection:

$ dublin-traceroute 8.8.8.8
Starting dublin-traceroute
Traceroute from 0.0.0.0:12345 to 8.8.8.8:33434~33453 (probing 20 paths, min TTL is 1, max TTL is 30, delay is 10 ms)
== Flow ID 33434 ==
1    192.168.43.1 (gateway), IP ID: 17503 RTT 7.657 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25516
2    *
3    172.16.0.213 (172.16.0.213), IP ID: 0 RTT 59.862 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25516
4    172.23.5.194 (172.23.5.194), IP ID: 0 RTT 65.349 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25516
5    *
6    213.191.237.45 (213.191.237.45), IP ID: 61214 RTT 50.283 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25516
7    213.191.237.46 (213.191.237.46), IP ID: 60862 RTT 45.321 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25516
8    172.16.161.14 (172.16.161.14), IP ID: 38099 RTT 61.53 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25516
9    *
10    *
11    172.16.101.1 (172.16.101.1), IP ID: 0 RTT 40.522 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753 (NAT detected), flow hash: 25516
12    193.120.76.205 (tengig4-3.ea101.bmt.esat.net), IP ID: 60944 RTT 40.795 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25516
13    193.95.130.1 (bundle-ether127.10.rt101.bmt.btireland.net), IP ID: 54623 RTT 41.168 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25516
14    193.95.129.96 (193.95.129.96), IP ID: 11148 RTT 57.186 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25516
15    193.95.129.135 (bundle-ether24.br002.bmt.btireland.net), IP ID: 42365 RTT 52.70 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25516
16    *
17    216.239.43.3 (216.239.43.3), IP ID: 61147 RTT 47.72 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25516
18    8.8.8.8 (google-public-dns-a.google.com), IP ID: 39240 RTT 68.68 ms  ICMP (type=3, code=3) 'Destination port unreachable', NAT ID: 42753, flow hash: 25516
== Flow ID 33435 ==
1    192.168.43.1 (gateway), IP ID: 17532 RTT 5.152 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25517
2    *
3    172.16.0.213 (172.16.0.213), IP ID: 0 RTT 46.750 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25517
4    172.23.5.194 (172.23.5.194), IP ID: 0 RTT 41.670 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25517
5    *
6    213.191.237.45 (213.191.237.45), IP ID: 61229 RTT 41.929 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25517
7    213.191.237.46 (213.191.237.46), IP ID: 60864 RTT 41.824 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25517
8    172.16.161.14 (172.16.161.14), IP ID: 38105 RTT 41.803 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 0, flow hash: 25517
9    *
10    *
11    172.16.101.1 (172.16.101.1), IP ID: 0 RTT 41.473 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753 (NAT detected), flow hash: 25517
12    193.120.76.205 (tengig4-3.ea101.bmt.esat.net), IP ID: 60955 RTT 46.598 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25517
13    193.95.130.1 (bundle-ether127.10.rt101.bmt.btireland.net), IP ID: 54626 RTT 83.564 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25517
14    193.95.129.96 (193.95.129.96), IP ID: 6587 RTT 78.430 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25517
15    193.95.129.135 (bundle-ether24.br002.bmt.btireland.net), IP ID: 52380 RTT 134.670 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25517
16    *
17    66.249.95.91 (66.249.95.91), IP ID: 58171 RTT 119.80 ms  ICMP (type=11, code=0) 'TTL expired in transit', NAT ID: 42753, flow hash: 25517
18    8.8.8.8 (google-public-dns-a.google.com), IP ID: 57371 RTT 113.685 ms  ICMP (type=3, code=3) 'Destination port unreachable', NAT ID: 42753, flow hash: 25517
...
Saved JSON file to trace.json .
You can convert it to DOT by running python -m dublintraceroute plot trace.json
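Spotting where the flows diverge by eye gets tedious with 20 flows. As a rough sketch, the textual output above can be scanned for TTLs at which different flows got answers from different addresses. The `hops_by_ttl` and `divergent_ttls` helpers are hypothetical, not part of dublin-traceroute, and they only assume the line layout visible in the output above; the sample lines are shortened copies of hops 16 to 18 from the two flows shown.

```python
import re
from collections import defaultdict

FLOW_RE = re.compile(r"^== Flow ID (\d+) ==$")
HOP_RE = re.compile(r"^(\d+)\s+(\S+)")  # TTL, then responder address or '*'

def hops_by_ttl(output):
    """Map each TTL to the set of addresses that answered across all flows."""
    seen = defaultdict(set)
    in_flow = False
    for line in output.splitlines():
        line = line.strip()
        if FLOW_RE.match(line):
            in_flow = True
            continue
        m = HOP_RE.match(line)
        # Skip non-hop lines and hops that did not answer ('*').
        if in_flow and m and m.group(2) != "*":
            seen[int(m.group(1))].add(m.group(2))
    return seen

def divergent_ttls(output):
    """Keep only the TTLs where more than one address answered."""
    return {ttl: sorted(ips)
            for ttl, ips in hops_by_ttl(output).items() if len(ips) > 1}

sample = """\
== Flow ID 33434 ==
16    *
17    216.239.43.3 (216.239.43.3), IP ID: 61147 RTT 47.72 ms
18    8.8.8.8 (google-public-dns-a.google.com), IP ID: 39240 RTT 68.68 ms
== Flow ID 33435 ==
16    *
17    66.249.95.91 (66.249.95.91), IP ID: 58171 RTT 119.80 ms
18    8.8.8.8 (google-public-dns-a.google.com), IP ID: 57371 RTT 113.685 ms
"""

print(divergent_ttls(sample))
# {17: ['216.239.43.3', '66.249.95.91']}
```

For the two flows shown, the only hop that differs is number 17, right before the target.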

An image can tell more than a thousand words

We can plot the resulting JSON using the dublintraceroute Python module (you installed it earlier, right?) as follows:

$ python3 -m dublintraceroute plot trace.json
Saved to trace.json.png

I used Python 3 but you can also use Python 2. The resulting output is:

trace.json.png

And here is the corresponding trace.json.

The graph shows all the traversed hops, starting with the local machine at the bottom, and the target at the top. Notice the last layer of hops just before the target? That’s Google’s network, where you can clearly see the multiple paths discovered by dublin-traceroute.

Et voilà! We will analyze these results in more detail in the next post.

 