Practical large-scale latency estimation

Sun, 03/23/2008 - 13:05 by Damien Saucez • Categories:

The paper "Practical large-scale latency estimation" written by Szymaniak et al., presents an architecture able the estimate latency of millions of nodes in an efficient manner.

The idea of the paper is to propose CDNs like Google to determine the best server for any client.

To do so, the authors uses GNP, a landmarks-based network coordinate system. First, they establish a base of 7 redundant landmarks which computes their coordinates based on active probing.

After, the landmarks estimates the coordinates of the different cluster of servers of the CDN (in the paper they use Google). These latencies are also estimated by active probing.

The landmarks must also determine the coordinates of the clients. However, the clients of CDNs like google are potentially al the Web client on the Internet. So active probing is not possible for that part of the measurements. To solve the problem, the authors use the fact that http traffic is TCP. So that, instead of making active probing, the Landmarks just have to estimate the time between SYN+ACK/ACK packets in the TCP stream to estimate the latency.

This solution avoid active probing but is not scalable due to the number of TCP connections to listen. This is why the solution proposed in the paper is based on clustering and centralized scheduling. The paper shows that latencies are globally equivalent with the same /24 prefix on the Internet ("in 91% of /24 networks median latencies to 80% of clients differ by at most 20%"). Section 3.3.2 presents an interesting procedure to estimate such correlation in performances. Based on these results, clusters of /24 prefixes are retained. This clustering technique offers the perspective to limit passive measurements to only one hosts per /24 prefix.
The centralized scheduler permit to determine whether or not passive measurement must be done when a node query Google (i.e., if the coordinates of the cluster is already known, it is no more required to measure the latency for a given moment).

The problem with this solution is that Landmarks must be able to have the TCP streams in order to estimate the latency. To solve this problem, object are added to Google web-pages. The objects are on the Landmarks thus client must open a TCP connection on the Landmarks. To avoid delay on web-page rendering, prefetching is used.

Based on the coordinates of the servers and the clusters of /24 prefixes, it is now possible to select the "best" server in term of latency.

The authors have deployed the system on Google and show that 86% of the time, this selection scheme leads to the best choice. And in other 10%, replicas are at most 2 time (in latency) than the best choice. Only 2% of the choices are further than 3 times the closest replica.

The paper offers an interesting solution to estimate latency for large-scale systems. It propose some good hints on how to make traffic/load-efficient measurements (scheduling algorithm + clustering + TCP) and proposes a nice and really simple procedure to estimate the "correlation" between the latencies within a cluster.

The two sentences "We therefore exploit a proprietary mechanism that precisely discovers which DNS server is used by each Google client. The details of this mechanism are out of the scope of this article" should not appear in the paper has they say: ok, we use a good system, just trust us! Too simple for me...

This paper can be considered as a really good reference for our IDIPS project:

Refrence of the paper: M. Szymaniak et al., "Practical large-scale latency estimation", Comput. Netw.(2008), doi:10.1016/j.comnet.2007.11.022.