Scalable P2P Architectures Outline | CLASSIC 0153I, Study notes of Classical Philology

Material Type: Notes; Class: GRECO-ROMAN ARCHTCT; Subject: Classics; University: University of California - Los Angeles; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 08/31/2009

koofers-user-z7n

Scalable P2P Architectures

Oscar Boykin
Electrical Engineering, UCLA
Joint work with: Jesse Bridgewater, Joseph Kong, Kamen Lozev, Behnam Rezaei, Vwani Roychowdhury, Nima Sarshar

Outline
● Introduction to P2P models: DHT and Unstructured Query Systems
● Routing Packets on a “Small World”
● Properties of real P2P systems (e.g. Gnutella)
● A model for Power-law graphs
● Percolating Messages on a Graph
● Design of a new P2P system: Brunet

DHT Systems
● If each node does not have a pointer to every other node, a routing scheme must be introduced.
● Each node knows about k other nodes.
● All queries are routed through these k nodes.
● The query should be resolved in the fewest possible hops.
● Most academic work has focused on DHT systems.

Hyperspace Routing (Pastry and Tapestry)
[Figure: eight nodes labeled with the 3-bit addresses 000 through 111]
Examples:
Routing to 101 starting at 000: 000 -> 100 -> 101
Routing to 101 starting at 010: 010 -> 110 -> 100 -> 101
Routing to 101 starting at 011: 011 -> 111 -> 101
● Messages are routed by matching the prefix of the destination to the current node, and sending to the node which matches the next element.
● Nodes need O(M (log n)/log M) neighbors for an alphabet of size M, which gives O(log n / log M) distance.

Distance Based Routing
● A distance metric is defined on the key space.
● Nodes are connected to their nearest neighbors in the space and usually to remote nodes.
● Messages are routed to the node which is closest to the destination.
● Examples:

System     Space                  Latency        Connections
CAN        M-dimension torus      M N^{1/M}      Neighbors: M
Chord      Ring                   log N          Neighbors exponentially increasing: log N
Symphony   Ring                   (log^2 N)/k    Neighbors and k remote
Viceroy    log N stacked rings    log N          Neighbors

A Routable Small World
[Figure: ring of nodes; the red nodes have long-range connections]
The red nodes have connections to distance L with P(L) ~ 1/L.
How can we show it is routable?
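One way to probe the routability question empirically is a small simulation. The sketch below is not taken from the slides: it assumes a ring of n nodes where every node keeps its two ring neighbors plus a fixed number of shortcuts whose lengths are drawn with P(L) ~ 1/L, and it forwards each message greedily to the neighbor closest to the target. The function names (`ring_distance`, `build_small_world`, `greedy_route`) and the choice to cap shortcut lengths at n/2 are illustrative assumptions.

```python
import itertools
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def build_small_world(n, shortcuts_per_node, rng):
    """Ring of n nodes; each node also gets shortcuts of length L with P(L) ~ 1/L."""
    lengths = list(range(2, n // 2 + 1))
    # Cumulative weights proportional to 1/L, so sampling is a binary search.
    cum_weights = list(itertools.accumulate(1.0 / length for length in lengths))
    neighbors = []
    for node in range(n):
        links = {(node - 1) % n, (node + 1) % n}  # structured ring neighbors
        picks = rng.choices(lengths, cum_weights=cum_weights, k=shortcuts_per_node)
        for length in picks:
            direction = rng.choice((-1, 1))
            links.add((node + direction * length) % n)
        neighbors.append(sorted(links))
    return neighbors


def greedy_route(neighbors, source, target):
    """Forward to the neighbor closest to the target; return the hop count.

    A ring neighbor always reduces the distance by one, so the loop terminates.
    """
    n = len(neighbors)
    current, hops = source, 0
    while current != target:
        current = min(neighbors[current], key=lambda v: ring_distance(v, target, n))
        hops += 1
    return hops


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (256, 1024, 4096):
        nbrs = build_small_world(n, shortcuts_per_node=1, rng=rng)
        trials = [greedy_route(nbrs, rng.randrange(n), rng.randrange(n))
                  for _ in range(200)]
        print(n, sum(trials) / len(trials))
```

Comparing the printed mean hop counts with n/4, the mean distance on a ring without shortcuts, gives a rough feel for how much the 1/L shortcuts help as n grows.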
Greedy Routing Works
● The probability of a connection going a distance d:
    – $P(d) = \frac{1}{d \log N}$
● What is the probability that a connection from a source at distance d takes us to within a distance εd of the destination?
[Figure: Source and Destination separated by Distance = d; region of Distance εd around the Destination]
$$P(\epsilon) = \int_{(1-\epsilon)d}^{d} \frac{1}{x \log N}\,dx = -\frac{\log(1-\epsilon)}{\log N}$$

Greedy Routing Works
● How many such connections (L) are needed to get close, that is, within log N of the destination:
$$\epsilon^{L} d = \log N \quad\Rightarrow\quad L = \frac{\log\left(\frac{\log N}{d}\right)}{\log \epsilon}$$
● How many nodes (M) do we need to visit to get lucky L times:
$$M\,P(\epsilon) = L \quad\Rightarrow\quad M = L \div \left(-\frac{\log(1-\epsilon)}{\log N}\right)$$
$$M = \frac{\log N\,(\log d - \log\log N)}{\log(1-\epsilon)\,\log \epsilon}$$
Since we must be prepared for d = N, then: M = O(log^2 N)

Broadcast Query Systems
● In a broadcast query system, each node has some records. To query the network, the node sends a query to ALL neighbors.
● Each query has an identifying number; responses are routed back the way the query came.
● To query the entire system, a query will need to cross all edges (E), thus query cost is O(E), and E ≥ N - 1 for any connected network.

How do we make scalable query systems?
● Gnutella is a popular protocol for file sharing which uses the unstructured query model.
● To attempt to solve the scalability problems, its developers introduced “UltraPeers”, which are nodes that keep copies of all the records of their “LeafPeers”.
● Now, each query costs O(U), if U is the number of UltraPeers. But, if U is a constant fraction of N, then query costs are still O(N); only the constant has changed.
Can we do better if we take advantage of network structure?

Scale Free Networks
● Many large networks with interacting nodes are what are called “scale free” networks, or power-law networks.
● Many mechanisms have been suggested which can account for such degree distributions.
● Power-law distributions are called scale free because of the following feature: rescaling k by a constant factor β leaves the form of the distribution unchanged.
$$P(k) = \frac{\alpha}{k^{\gamma}} \propto \frac{1}{k^{\gamma}}$$
$$P(\beta k) = \frac{\alpha}{(\beta k)^{\gamma}} = \frac{\alpha/\beta^{\gamma}}{k^{\gamma}} \propto \frac{1}{k^{\gamma}}$$

Preferential Attachment
● A simple model which gives rise to a power-law degree distribution was proposed by Barabási and Albert (1999).
● At each time step, a new node joins and selects a node to connect to. The target node is selected with a probability proportional to its degree. The probability we select a node of degree k:
$$q_k = \frac{k\,n_k}{2}$$
● Assuming a steady state solution, we want to write a difference equation for the fraction of nodes with degree k:
$$n_k = q_{k-1} - q_k + \delta_{k,1}$$
$$n_k = \frac{(k-1)\,n_{k-1} - k\,n_k}{2} + \delta_{k,1}$$
$$(k+2)\,n_k = (k-1)\,n_{k-1} + 2\,\delta_{k,1}$$
$$n_k = \frac{4}{k(k+1)(k+2)} \propto \frac{1}{k^3}$$

(Bond) Percolation
Problem:
● If we have a graph and we delete each edge with probability (1-p), what is the size of the largest connected component as a function of p?

Bond Percolation on Random Graphs (with generating functions)
● Suppose we have a random graph with a constrained degree distribution p(k). Each node has a degree selected according to this distribution, but its edges are randomly connected.
● We use a generating function to represent this distribution:
$$P(x) = \sum_k x^k p_k$$
● If the random variable Z is the sum of independent random variables, Z = K_1 + K_2 + ... + K_m, each distributed according to p_k, then its generating function is the product:
$$Q(x) = \sum_z x^z p_z = \prod_{i=1}^{m} P(x) = [P(x)]^m$$
● The mean is the first derivative at x=1:
$$P'(x) = \sum_k k\,x^{k-1} p_k$$
$$P'(1) = \sum_k k\,p_k = \langle k \rangle$$
We can put this together to compute expected cluster sizes!

Percolation Thresholds for Example Graphs
$$q_c = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle}$$
● $p_k = \frac{1}{\zeta(3)\,k^3}$: $\langle k \rangle = \zeta(2)/\zeta(3) \approx 1.37$, $\langle k^2 \rangle = \zeta(1)/\zeta(3) = \infty$, $q_c = 0$
● $p_k = \frac{1}{\zeta(4)\,k^4}$: $\langle k \rangle = \zeta(3)/\zeta(4) \approx 1.11$, $\langle k^2 \rangle = \zeta(2)/\zeta(4) = \frac{\pi^2}{6\,\zeta(4)} \approx 1.52$, $q_c \approx 2.71$
● $p_k = \frac{1}{\zeta(3.5)\,k^{3.5}}$: $\langle k \rangle = \zeta(2.5)/\zeta(3.5) \approx 1.19$, $\langle k^2 \rangle = \zeta(1.5)/\zeta(3.5) \approx 2.32$, $q_c \approx 1.06$
● $p_k = (\lambda - 1)\,\lambda^{-k}$: $\langle k \rangle = \frac{\lambda}{\lambda - 1}$, $\langle k^2 \rangle = \frac{\lambda(\lambda + 1)}{(\lambda - 1)^2}$, $q_c = \frac{\lambda - 1}{2}$
A value of q_c above 1 means that no giant component forms even when every edge is kept.

What does this mean?
We can predict how many edges need to pass a packet to reach a constant fraction of the nodes!

Percolation in P2P (due to Nima Sarshar)
With probability p we send the query to each neighbor.
Each node that gets the query responds with any matches, and sends the query to each of its neighbors with probability p.
How small can p be?
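The threshold formula q_c = <k> / (<k^2> - <k>) used for the example graphs can be evaluated numerically for any degree distribution. The sketch below is an illustration rather than anything from the slides: the helper names are hypothetical, the power law is truncated at a maximum degree `k_max`, and the geometric distribution is written with an assumed parameter `lam` as p_k = (lam - 1) * lam^(-k).

```python
def percolation_threshold(degree_probs):
    """Return q_c = <k> / (<k^2> - <k>) for a degree distribution {k: p_k}."""
    mean_k = sum(k * p for k, p in degree_probs.items())
    mean_k2 = sum(k * k * p for k, p in degree_probs.items())
    return mean_k / (mean_k2 - mean_k)


def power_law(gamma, k_max):
    """Truncated power law: p_k proportional to 1/k^gamma for k = 1..k_max."""
    weights = {k: k ** -gamma for k in range(1, k_max + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def geometric(lam, k_max=500):
    """p_k = (lam - 1) * lam**(-k) for k >= 1; the tail beyond k_max is negligible."""
    return {k: (lam - 1) * lam ** -k for k in range(1, k_max + 1)}


if __name__ == "__main__":
    # Geometric case: the numerical value should agree with (lam - 1) / 2.
    for lam in (1.5, 2.0, 3.0):
        print("geometric", lam, percolation_threshold(geometric(lam)), (lam - 1) / 2)
    # 1/k^3 case: <k^2> grows like log(k_max), so q_c drifts toward 0.
    for k_max in (10**2, 10**3, 10**4):
        print("power law 1/k^3", k_max, percolation_threshold(power_law(3, k_max)))
```

For the 1/k^3 distribution the second moment diverges only logarithmically, so the truncated threshold shrinks slowly as `k_max` grows; the limit q_c = 0 is approached, not reached, at any finite size.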
It must be bigger than q_c!

Getting Poly-log Scaling in Unstructured Query Systems
● Assume we have a random network of N nodes, and a degree distribution ~ 1/k^2. There is a maximum degree k_max (which is O(N)).
● We can get such a network using the protocol from Sarshar, Roychowdhury (PRE 2004).
● What is the cost of a percolation query at the threshold?
$$C = q_c E = \frac{q_c \langle k \rangle N}{2}$$
$$p_k = \frac{\alpha}{k^2}, \qquad \langle k \rangle = \alpha \log k_{max}, \qquad \langle k^2 \rangle = \alpha\,k_{max}$$
$$q_c = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle} = \frac{\log k_{max}}{k_{max} - \log k_{max}}$$
$$C = \frac{\log k_{max}}{k_{max} - \log k_{max}} \cdot \frac{\alpha N \log k_{max}}{2} = \frac{\alpha}{2} \cdot \frac{\log^2 k_{max}}{k_{max}/N - \log k_{max}/N}$$
$$k_{max} = O(N) \quad\Rightarrow\quad C = O(\log^2 N)$$
Hence we get only O(log^2 N) cost for each query!

Simulation Results
[Figure: Performance of Percolation Search on CRAWLS Network with TTL=15. Curves: Hit Rate, Fraction of Edges Used, Fraction of Nodes Used. Horizontal axis: Percolation Probability.]
● A percolation search protocol on a Gnutella network of size 39,730. The network structure was obtained by Limewire's Crawler.

Brunet: A Hybrid P2P System
● DHTs cannot resolve general queries.
● Unstructured systems (usually) require large routing tables to return query hits.
● Brunet is a new P2P protocol which combines the advantages of both DHTs and Unstructured Power-law networks.
● Brunet offers a general P2P foundation on which a wide variety of protocols and applications can build.

Brunet: A Hybrid P2P System
● Each node has a 160 bit address which can also be thought of as a 160 bit positive integer. A distance metric is defined using the integer representation.
● Each node is situated on a routable small world ring with “structured” connections to its neighbors on the ring, and shortcuts to remote locations.
● Each node also is on an “unstructured” network and has “unstructured” connections to nodes on that power-law network.
[Figure: Structured Subgraph (small world) and Unstructured Subgraph (1/k^2)]

Brunet Implementation
● The first implementation of the Brunet protocol is being completed at UCLA's Complex Networks Group.
● The code is developed using GNU/Linux and the Mono C# development environment.
● In addition to a programming library which implements the Brunet protocol, we have developed other tools:
    – Netmodeler: a general C++ network modeling package
    – Brunet Verifier: a protocol debugger for Brunet implementations

Open Problems
● Can the DHT or unstructured systems be used to build an improved model of distributed computing (e.g. how can these P2P models help in mapping task graphs onto resources)?
● What common primitives can be implemented using P2P systems? (e.g. what kinds of communication costs are incurred in building a P2P Database?)
● What results can be obtained about protocol security? Can bad nodes ruin the network?

Summary
● Using models inspired by social contexts (such as small world and power-law networks) we see how some computer networking systems and architectures can be improved.
● Statistical Mechanics tools (percolation) allow us to analyze some novel networking conditions.
● By engineering previously ignored structural details of P2P systems, poly-log scaling is achieved.
● The Brunet P2P system puts the DHT model together with the percolation search to get state of the art scaling properties.
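The percolation query described in the Percolation in P2P slide, where each node that receives the query forwards it to each neighbor with probability p, can be sketched in a few lines. This is a toy simulation under stated assumptions, not the Brunet implementation: the graph is grown by a simple preferential attachment rule with `m` links per new node (a hypothetical parameter, not from the slides), and the function names are illustrative.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which each new node links to m distinct existing nodes,
    chosen with probability proportional to their degree."""
    adj = {v: set() for v in range(m + 1)}
    endpoints = []  # node v appears deg(v) times, so uniform picks are degree-biased
    for u in range(m + 1):  # seed: a small complete graph on m + 1 nodes
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in sorted(targets):
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def percolation_query(adj, source, p, rng):
    """Each node that receives the query forwards it to each neighbor with
    probability p. Returns (set of nodes reached, number of messages sent)."""
    reached = {source}
    frontier = deque([source])
    messages = 0
    while frontier:
        node = frontier.popleft()
        for neighbor in sorted(adj[node]):
            if rng.random() < p:
                messages += 1
                if neighbor not in reached:
                    reached.add(neighbor)
                    frontier.append(neighbor)
    return reached, messages


if __name__ == "__main__":
    rng = random.Random(7)
    graph = preferential_attachment_graph(5000, 2, rng)
    for p in (0.05, 0.1, 0.2, 0.4):
        reached, messages = percolation_query(graph, 0, p, rng)
        print(p, len(reached) / len(graph), messages)
```

Sweeping p and watching the fraction of nodes reached against the number of messages sent gives a small-scale view of the trade-off the slides analyze: below the threshold the query dies out near the source, while above it a constant fraction of nodes is reached without crossing every edge.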