The Theory of Network Tracing - First-Year Interest Group Seminar | N 1, Papers of Health sciences

Material Type: Paper; Class: FIRST-YEAR INTEREST GROUP SMNR; Subject: Nursing; University: University of Texas - Austin; Term: Unknown 1989;


Uploaded on 08/26/2009

The Theory of Network Tracing

H. B. Acharya and M. G. Gouda
Department of Computer Science, University of Texas at Austin
Student author: H. B. Acharya

Abstract

A widely used mechanism for computing the topology of any network in the Internet is Traceroute. Using Traceroute, one simply needs to choose any two nodes in a network and then obtain the sequence of nodes that occur between these two nodes, as specified by the routing tables in these nodes. Thus, each use of Traceroute in a network produces a trace of nodes that constitute a simple path in this network. In every trace that is produced by Traceroute, each node occurs either by its unique identifier or by the anonymous identifier “∗”. In this paper, we introduce the first theory aimed at answering the following important question. Is there an algorithm to compute the topology of a network N from a trace set T that is produced by using Traceroute in N, assuming that each edge in N occurs in at least one trace in T, and that each node in N occurs by its unique identifier in at least one trace in T? Our theory shows that the answer to this question is “No” in general. But if N is a tree, or is a ring with an odd number of nodes, then the answer is “Yes”. On the other hand, if N is a ring with an even number of nodes, then the answer is “No”, but if N is a “mostly regular” ring with an even number of nodes, then the answer is “Yes”.

1 Introduction

Traceroute is arguably the most popular mechanism for computing the topology of a network in the Internet [3] and [13]. Each use of Traceroute between any two nodes, say nodes x and y, in a network produces a sequence, also called a trace, of the nodes that occur in the network between x and y, as determined by the routing tables in the network nodes.
It follows that each trace that is produced by using Traceroute between nodes x and y corresponds to a simple path between x and y in the network. On the other hand, if the network has multiple simple paths between nodes x and y, then only one of these paths corresponds to the trace that is produced by using Traceroute between x and y.

Traceroute can be used to compute the topology of a network N in the Internet as follows [3]:

1. Identify the nodes that occur at the perimeter of network N. We refer to these nodes as the terminal nodes of N.
2. Use Traceroute between any two terminal nodes of N to produce the trace of nodes that occur between these two terminal nodes (as determined by the routing tables in the nodes of N).
3. Put all traces that are produced in Step 2 together in order to compute the topology of network N.

There are three problems that can hinder computing the correct topology of network N in Step 3. These three problems are as follows:

(a) Incomplete coverage: It is likely that the set of traces produced in Step 2 does not cover every edge or every node in the network. When this happens, the topology computed in Step 3 will not be correct.

(b) Aliasing of Node Identifiers: A node in a network may have two or more (unique) identifiers, and so such a node may occur by its different identifiers in different traces in the trace set produced in Step 2. This may cause the node to be regarded as multiple nodes and the topology computed in Step 3 to be incorrect.[7]

(c) Node Anonymity: If a node that occurs in a trace is busy when this trace is being produced in Step 2, then the node may decide not to announce its unique identifier in the produced trace. If this happens, then the node will occur in the trace by an anonymous identifier “∗i”, where i is a positive integer, rather than by the node’s unique identifier.
This in turn may cause the computed topology in Step 3 to be incorrect.[12]

Figure 2: Network N2

3 Impossibility of Network Tracing

In this section, we show that the network tracing problem is not solvable for general networks.

Theorem 1. There is no algorithm that takes as an input a trace set T that is generable from a network, and produces as output a network N such that:
• T is generable from N, and
• T is not generable from any other network.

Proof. (By contradiction) Assume that such an algorithm exists. The following trace set T is generable from network N2 in Figure 2.

{(a, b), (a, ∗1, d), (a, f), (b, c, d), (b, f), (d, e, f)}

If T is given as an input to this algorithm, the algorithm will compute network N2 in Figure 2 as the output. This implies that T is not generable from any other network, which contradicts the fact that T is also generable from network N3 in Figure 3.

Figure 3: Network N3

Theorem 1 shows that the network tracing problem is not solvable for general networks. However, as shown below, the problem is solvable for special classes of networks. One such class of networks is the class of regular networks defined next. A network N is called regular iff every node in N is regular. The next theorem states that the network tracing problem is solvable for regular networks.

Theorem 2. There is an algorithm that takes as an input a trace set T that is generable from a regular network, and produces as output a regular network N such that:
• T is generable from N, and
• T is not generable from any other regular network.

Proof. The proof of this theorem follows from the fact that every node occurs by its unique identifier in every trace in a trace set that is generable from a regular network.

In the next two sections, we consider two other special classes of networks, trees and rings, and discuss whether the network tracing problem is solvable for these two classes.

4 Tracing of Tree Networks

A network N is called a tree if N is acyclic.
In this section, we show that the network tracing problem is solvable for tree networks.

Theorem 3. There is an algorithm that takes as an input a trace set T that is generable from a tree network, and produces as output a tree network N such that:
• T is generable from N, and
• T is not generable from any other tree network.

Proof. (By construction) We prove Theorem 3 by describing the algorithm that is mentioned in the theorem. The algorithm consists of the following eight steps:

1. Initially, tree N is empty.
2. Apply procedure Leaf, discussed below, to compute, from T, the unique identifier of each leaf node in N.
3. Apply procedure Parent, discussed below, to compute, from T, the unique identifier of the parent of each leaf node in N.
4. For every node y that is the parent of a leaf node x, add to tree N an (undirected) edge between nodes x and y.
5. For every node y that is the parent of a leaf node x, replace in T each trace of the form (x, ∗i, ...) by a trace of the form (x, y, ...).
6. Shorten the traces in T by replacing in T each trace of the form (x, y, ...), where x is a leaf node, by the trace (y, ...) and by discarding from T each trace that has only one node or is empty.
7. Repeat the algorithm, starting from Step 2, on the trace set T that results from Step 6, provided that the resulting set T is non-empty.
8. The algorithm outputs N and terminates when the resulting T from Step 6 is empty.

Next, we specify the two procedures Leaf and Parent that are used in Steps 2 and 3, respectively, of the above algorithm.

The correctness of procedure Leaf follows from the observation that each leaf node in N occurs as a terminal node in some trace in T, but the converse is not necessarily true. Procedure Leaf is specified as follows:

procedure Leaf
  for each terminal node y in any trace in T
    if T has three traces t = (x, ..., y), t′ = (y, ..., z), t′′ = (x, ..., z), such that |t| + |t′| = |t′′|
    then y is a non-leaf node in N
    else y is a leaf node in N
end

The correctness of procedure Parent follows from the observation that the parent of each leaf node in N must occur by its unique identifier in some trace in T. Procedure Parent is specified as follows:

procedure Parent
  for each leaf node x in N,
    if T has a trace of the form (x, y, ...), or T has two traces of the form (x, ∗i, z) and (z, y, ...)
    then the unique identifier of the parent of node x is y
end

Figure 5: Network N5

Proof. It is sufficient to identify both neighbors of each node to specify the ring completely. Assume a trace (a, ..., w/∗, ∗, y, z, ..., b). The anonymous node is x. Both its neighbors are not irregular, as the ring is guaranteed to be mostly-regular. Now if x is non-terminal and y is regular, all traces with x also have y, so there must be a trace of the form (..., x, y, ...). If x is a terminal and y is regular, then x is not anonymous in the traces where it is a terminal. Some trace passing through x must cover the edge x−y; this trace is to b. By consistent routing, when a trace is obtained from x to b, it must have the form (x, y, ..., b). Hence the regular neighbor of x can always be identified.

Further, both neighbors of a regular node y can always be identified. The case where both are regular is obvious; even when a neighbor is irregular, its position next to a regular node can be identified, as above. Now as x is irregular, the other neighbor of y, i.e. z, must be regular. The other neighbor of z, say z1 (whether regular or irregular), is also known to be the neighbor of z (as above: the regular neighbor of an irregular node can be identified). By reasoning similar to the case of x (i.e., w is terminal or non-terminal), w is identified as being 2 hops from y. Nodes w and z1 are the only two nodes with this description, and the position of z1 is known. Hence the position of w is the only other one.
The irregular neighbor of x can always be identified. Hence in all cases, both neighbors of any node can be identified. The topology can be reconstructed.

Encouraged by Theorem 6, one may have hoped that the network tracing problem is solvable for the whole class of mostly-regular networks. Unfortunately, as shown by the next theorem, this turns out not to be the case.

Theorem 7. There is no algorithm that takes as an input a trace set T that is generable from any mostly-regular network, and produces as output a mostly-regular network N such that:
• T is generable from N, and
• T is not generable from any other mostly-regular network.

Proof. (By contradiction) Assume that such an algorithm exists. The following trace set T is generable from the mostly-regular network N2 in Figure 2.

{(a, b), (a, ∗1, d), (a, f), (b, c, d), (b, f), (d, e, f)}

If T is given as an input to this algorithm, the output produced is network N2. This implies that T is not generable from any other mostly-regular network. This contradicts the fact that T is also generable from the mostly-regular network N3 in Figure 3.

7 Discussion and Related Work

Anonymous router resolution is an inherent problem in traceroute-based topology mapping studies. Most of the early work in the area ignores or circumvents the problem. In [3], the authors avoid the problem by stopping a trace toward a destination on encountering an anonymous router on the path; this approach obviously discards useful information. In [2], the authors handle anonymous routers by replacing them either with arcs connecting the known routers at the two ends, or with unique identifiers to treat them as separate nodes, producing inaccurate maps. The “sandwich” approach used in [1] merges a chain of anonymous nodes, “sandwiched” between the same pair of known nodes, with each other - thereby losing resolution.

There have been three attacks on the anonymous router resolution problem. Yao et al. formulate it as an optimization problem [14]: building the smallest possible topology by combining anonymous nodes with each other under the constraints of trace preservation and distance preservation. They prove that the optimum topology inference under these conditions is NP-complete, then propose a heuristic to minimize the constructed topology by identifying anonymous nodes that, when merged, satisfy the two conditions. This is an O(n^5) algorithm; also, its constraint of distance preservation states that the anonymous router resolution process should not reduce the length of the shortest path between any two nodes in the resulting topology map. This assumes, besides stable and symmetric routing, the additional constraint that routing is always along the shortest path.

Jin et al. propose two heuristics to address the problem in [10]. The first one, an ISOMAP-based dimensionality reduction approach, uses link delays or node connectivity as attributes in the dimensionality reduction process. This is still an O(n^3) algorithm; further, they ignore the difficulty of estimating individual link delays from round trip delays in path traces [4]. The second, a simple neighbor matching heuristic, is O(n^2) but suffers from accuracy problems: it may introduce high rates of both false positives and false negatives.

Gunes et al. propose their own heuristics in [8] and show good performance, strictly better than O(n^3) for five heuristics they apply in succession.

This paper addresses the problem and provides a theoretical basis for stating which instances of trace set can be used to compute exactly one network, and which cannot. We give a metric for reduction - the irregularity number - and bounds on algorithms such as in the above papers. We also give polynomial-time exact algorithms for several network cases of interest.

8 Concluding Remarks

We have made two contributions in this paper. First, we formally stated the network tracing problem.
Second, we identified network classes for which this problem is solvable and network classes for which the problem is unsolvable. In particular, we showed that the problem is solvable for the following network classes:

1. regular networks (Theorem 2)
2. tree networks (Theorem 3)
3. odd ring networks (Theorem 4)
4. mostly regular even ring networks (Theorem 6)

We also showed that the problem is not solvable for the following network classes:

1. general networks (Theorem 1)
2. even ring networks (Theorem 5)
3. mostly regular networks (Theorem 7)

The research in this paper can be extended by weakening the network tracing problem and so making it solvable for many more classes of networks. As an example, a weak version of the network tracing problem can be stated as follows: “Is there an algorithm that takes as an input a trace set T that is generable from a network, and produces a “small” set {N1, ..., Nk} of networks such that
• T is generable from each network in set {N1, ..., Nk}, and
• T is not generable from any network not in set {N1, ..., Nk}?”

In fact, one can view the results of this paper as solving this weak network tracing problem when the value of k is 1. Solving this problem when the value of k is 2 or 3 or ... remains open and merits further research.

Finally, recall that our theory of network tracing has so far been based on the two assumptions of unique identifiers and complete coverage (as discussed in Section 1). It would be interesting to explore ways to relax these two assumptions while maintaining the effectiveness and elegance of the theory.
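
To make the weak version of the problem concrete, the following is a minimal brute-force sketch in Python, not an algorithm from this paper. It enumerates candidate networks for the trace set used in the proofs of Theorems 1 and 7 by trying every named node as the identity of each anonymous identifier. It rests on simplifying assumptions: a candidate is accepted whenever every resolved trace is a simple path, the network is taken to be exactly the union of the trace edges, and routing constraints are not modeled, so the enumeration may include networks that a stricter notion of "generable" would reject. The function name `candidate_networks` and the string encoding of identifiers (`"*1"` for ∗1) are hypothetical choices made for illustration.

```python
from itertools import product


def candidate_networks(traces):
    """Enumerate edge sets consistent with a trace set (illustrative sketch).

    Assumptions (not taken from the paper's formal definitions):
    - anonymous identifiers are strings starting with "*";
    - every anonymous identifier stands for some named node in the trace set;
    - a resolution is accepted if every resolved trace is a simple path;
    - routing consistency is NOT checked.
    """
    named = sorted({n for t in traces for n in t if not n.startswith("*")})
    anon = sorted({n for t in traces for n in t if n.startswith("*")})

    networks = set()
    # Try every assignment of named nodes to anonymous identifiers.
    for choice in product(named, repeat=len(anon)):
        mapping = dict(zip(anon, choice))
        resolved = [tuple(mapping.get(n, n) for n in t) for t in traces]
        # Reject assignments that make some trace revisit a node.
        if any(len(set(t)) != len(t) for t in resolved):
            continue
        # The candidate network is the union of undirected trace edges.
        edges = frozenset(
            frozenset(pair) for t in resolved for pair in zip(t, t[1:])
        )
        networks.add(edges)
    return networks


# Trace set from the proofs of Theorems 1 and 7.
T = [
    ("a", "b"),
    ("a", "*1", "d"),
    ("a", "f"),
    ("b", "c", "d"),
    ("b", "f"),
    ("d", "e", "f"),
]

if __name__ == "__main__":
    for net in candidate_networks(T):
        print(sorted(tuple(sorted(e)) for e in net))
```

Under these assumptions the enumeration yields more than one candidate network for T, which is consistent with the ambiguity that the two proofs exploit. The sketch is exponential in the number of anonymous identifiers and is meant only to illustrate what a set {N1, ..., Nk} could look like on a small input.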