Scalable P2P Architectures Outline | CLASSIC 0153I, Study notes of Classical Philology

University of California - Los Angeles (UCLA)Classical Philology

Material Type: Notes; Class: GRECO-ROMAN ARCHTCT; Subject: Classics; University: University of California - Los Angeles; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 08/31/2009

koofers-user-z7n 🇺🇸


Scalable P2P architectures

Oscar Boykin
Electrical Engineering, UCLA
Joint work with: Jesse Bridgewater, Joseph Kong, Kamen Lozev, Behnam Rezaei, Vwani Roychowdhury, Nima Sarshar

Outline
● Introduction to P2P models: DHT and Unstructured Query Systems
● Routing Packets on a “Small World”
● Properties of real P2P systems (e.g. Gnutella)
● A model for Power-law graphs
● Percolating Messages on a Graph
● Design of a new P2P system: Brunet

DHT Systems
● If each node does not have a pointer to every other node, routing schemes are introduced.
● Each node knows about k other nodes.
● All queries are routed through these k nodes.
● The query should be resolved in the fewest number of hops.
● Most academic work has focused on DHT systems.

Hyperspace Routing (Pastry and Tapestry)
[Figure: hypercube with nodes 000, 001, 010, 011, 100, 101, 110, 111]
Examples:
Routing 101 starting at 000: 000 > 100 > 101
Routing 101 starting at 010: 010 > 110 > 100 > 101
Routing 101 starting at 011: 011 > 111 > 101
● Messages are routed by matching the prefix of the destination to the current node, and sending to the node which matches the next element.
● Nodes need O(M (log n)/log M) neighbors for an alphabet of size M, which gives O(log n/log M) distance.

Distance Based Routing
● A distance metric is defined on the key space.
● Nodes are connected to their nearest neighbors in the space and usually to remote nodes.
● Messages are routed to the node which is closest to the destination.
● Examples:

System     Space                 Latency        Connections
CAN        M-dimension torus     M N^{1/M}      Neighbors: M
Chord      Ring                  log N          Neighbors exponentially increasing: log N
Symphony   Ring                  (log^2 N)/k    Neighbors and k remote
Viceroy    log N stacked rings   log N          Neighbors

A Routable Small World
[Figure: ring of nodes; the red nodes have shortcut connections]
The red nodes have connections to distance L with P(L) ~ 1/L.
How can we show it is routable?
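One way to get an empirical feel for routability before the analysis is a small simulation. The sketch below is not taken from the slides: the function names, the choice of one shortcut per node, and the random direction of each shortcut are all assumptions made for illustration. It builds a ring in which every node has its two ring neighbors plus shortcuts whose length L is drawn with probability proportional to 1/L, then routes greedily by always forwarding to the known node closest to the destination.

```python
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def build_shortcuts(n, k, rng):
    """Give every node k shortcuts whose length L is drawn with P(L) ~ 1/L.

    The shortcut direction (clockwise or counter-clockwise) is chosen
    uniformly at random; this detail is an assumption of the sketch.
    """
    lengths = list(range(1, n // 2 + 1))
    weights = [1.0 / length for length in lengths]
    shortcuts = []
    for node in range(n):
        picks = rng.choices(lengths, weights=weights, k=k)
        shortcuts.append([(node + rng.choice((-1, 1)) * L) % n for L in picks])
    return shortcuts


def greedy_route(src, dst, shortcuts, n):
    """Return the number of hops greedy routing needs from src to dst.

    Each hop forwards to whichever known node (ring neighbor or shortcut)
    is closest to the destination. A ring neighbor always reduces the
    distance by one, so the loop terminates.
    """
    hops = 0
    cur = src
    while cur != dst:
        candidates = [(cur - 1) % n, (cur + 1) % n] + shortcuts[cur]
        cur = min(candidates, key=lambda v: ring_distance(v, dst, n))
        hops += 1
    return hops


if __name__ == "__main__":
    rng = random.Random(1)
    n = 1024
    shortcuts = build_shortcuts(n, 1, rng)
    trials = [greedy_route(rng.randrange(n), rng.randrange(n), shortcuts, n)
              for _ in range(500)]
    print("average hops with 1/L shortcuts:", sum(trials) / len(trials))
    print("average hops on a plain ring is about:", n / 4)
```

Comparing the printed average against n/4, the average path length on a ring with no shortcuts, gives a quick sanity check that the 1/L shortcuts help greedy routing.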
Greedy Routing Works
● The probability of connections going a distance d:
$$P(d) = \frac{1}{d \log N}$$
● What's the probability that a connection takes us to within a distance $\alpha d$ of the destination, with $0 < \alpha < 1$:
[Figure: source and destination separated by distance d; the target region lies within distance $\alpha d$ of the destination]
$$P(\alpha) = \int_{d(1-\alpha)}^{d} \frac{1}{x \log N}\, dx = \frac{-\log(1-\alpha)}{\log N}$$

Greedy Routing Works
● How many such connections (L) are needed to get close:
$$\alpha^{L} d = \log N \quad\Rightarrow\quad L = \frac{\log\left(\frac{\log N}{d}\right)}{\log \alpha}$$
● How many nodes (M) do we need to get lucky L times:
$$M\, P(\alpha) = L \quad\Rightarrow\quad M = L \div \frac{-\log(1-\alpha)}{\log N}$$
$$M = \frac{\log N\,(\log d - \log\log N)}{\log(1-\alpha)\,\log \alpha}$$
Since we must be prepared for d = N, then: M = O(log^2 N)

Broadcast Query Systems
● In a broadcast query system, each node has some records. To query the network, the node sends a query to ALL neighbors.
● Each query has an identifying number; responses are routed back the way the query came.
● To query the entire system, a query will need to cross all edges (E), thus query cost is O(E), and E ≥ N − 1 for all connected networks.

How do we make scalable query systems?
● Gnutella is a popular protocol for file sharing which uses the unstructured query model.
● To attempt to solve the scalability problems, they introduced “UltraPeers”, which are nodes that keep copies of all the records of their “LeafPeers”.
● Now, each query costs O(U), if U is the number of UltraPeers. But if U is a constant fraction of N, then query costs are still O(N); only the constant has changed.
Can we do better if we take advantage of network structure?

Scale Free Networks
● Many large networks with interacting nodes are what are called “scale free” networks, or power-law networks.
● Many mechanisms have been suggested which can account for such degree distributions.
● Power-law distributions are called scale free because of the following feature: rescaling k by a constant factor a leaves the form of the distribution unchanged.
$$P(k) = \frac{A}{k^{\gamma}} \propto \frac{1}{k^{\gamma}}$$
$$P(a k) = \frac{A}{(a k)^{\gamma}} = \frac{A / a^{\gamma}}{k^{\gamma}} \propto \frac{1}{k^{\gamma}}$$

Preferential Attachment
● A simple model which gives rise to a power-law degree distribution was proposed by Barabási and Albert (1999).
● At each time step, a new node joins and selects a node to connect to. The target node is selected with a probability proportional to its degree. With $n_k$ the fraction of nodes with degree k, the probability we select a node of degree k is:
$$q_k = \frac{k\, n_k}{2}$$
● Assuming a steady state solution, we want to write a difference equation for the fraction of nodes with degree k:
$$n_k = q_{k-1} - q_k + \delta_{k,1}$$
$$n_k = \frac{(k-1)\, n_{k-1} - k\, n_k}{2} + \delta_{k,1}$$
$$(k+2)\, n_k = (k-1)\, n_{k-1} + 2\,\delta_{k,1}$$
$$n_k = \frac{4}{k (k+1) (k+2)} \propto \frac{1}{k^{3}}$$

(Bond) Percolation
Problem:
● If we have a graph and we delete each edge with probability (1-p), as a function of p, what is the size of the largest connected component?

Bond Percolation on Random Graphs (with generating functions)
● Suppose we have a random graph with a constrained degree distribution: p(k). Each node has a degree selected according to this distribution, but its edges are randomly connected.
● We use a generating function to represent this distribution:
$$P(x) = \sum_k x^{k} p_k$$
● If the random variable Z is the sum of independent random variables, Z = K_1 + K_2 + ... + K_m, then the generating function is the product:
$$Q(x) = \sum_z x^{z} p_z = \prod_{i=1}^{m} P(x) = [P(x)]^{m}$$
● The mean is the first derivative at x=1:
$$P'(x) = \sum_k k\, x^{k-1} p_k, \qquad P'(1) = \sum_k k\, p_k = \langle k \rangle$$
We can put this together to compute expected cluster sizes!

Percolation Thresholds for Example Graphs
In every case the threshold is
$$q_c = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle}$$
● $p_k = \frac{1}{\zeta(3)\, k^{3}}$: $\langle k \rangle = \zeta(2)/\zeta(3) \approx 1.37$, $\langle k^2 \rangle = \zeta(1)/\zeta(3) = \infty$, so $q_c = 0$.
● $p_k = \frac{1}{\zeta(4)\, k^{4}}$: $\langle k \rangle = \zeta(3)/\zeta(4) \approx 1.11$, $\langle k^2 \rangle = \zeta(2)/\zeta(4) \approx 1.52$, so $q_c \approx 2.71$.
● $p_k = \frac{1}{\zeta(3.5)\, k^{3.5}}$: $\langle k \rangle = \zeta(2.5)/\zeta(3.5) \approx 1.19$, $\langle k^2 \rangle = \zeta(1.5)/\zeta(3.5) \approx 2.32$, so $q_c \approx 1.06$.
● $p_k = (\lambda - 1)\, \lambda^{-k}$: $\langle k \rangle = \frac{\lambda}{\lambda - 1}$, $\langle k^2 \rangle = \frac{\lambda (\lambda + 1)}{(\lambda - 1)^{2}}$, so $q_c = \frac{\lambda - 1}{2}$.
Because $q_c$ is compared against a probability, a value $q_c \geq 1$ means that no choice of p produces a giant connected component.
What does this mean? We can predict how many edges need to pass a packet to reach a constant fraction of the nodes!

Percolation in P2P (due to Nima Sarshar)
With probability p we send the query to each neighbor. Each node that gets the query responds with any matches, and sends the query to each of its neighbors with probability p.
How small can p be?
It must be bigger than q_c!

Getting Polylog Scaling in Unstructured Query Systems
● Assume we have a random network of N nodes, and a degree distribution ~ 1/k^2. There is a maximum degree k_max (which is O(N)).
● We can get such a network using the protocol from Sarshar, Roychowdhury (PRE 2004).
● What is the cost of a percolation query at the threshold?
$$C = q_c E = \frac{q_c \langle k \rangle N}{2}$$
$$p_k = \frac{A}{k^{2}}, \qquad \langle k \rangle = A \log k_{max}, \qquad \langle k^2 \rangle = A\, k_{max}$$
$$q_c = \frac{\log k_{max}}{k_{max} - \log k_{max}}$$
$$C = \frac{\log k_{max}}{k_{max} - \log k_{max}} \cdot \frac{A N \log k_{max}}{2} = \frac{A}{2} \cdot \frac{\log^{2} k_{max}}{k_{max}/N - \log k_{max}/N}$$
$$k_{max} = O(N) \quad\Rightarrow\quad C = O(\log^{2} N)$$
Hence we get only O(log^2 N) cost for each query!

Simulation Results
[Figure: Performance of Percolation Search on CRAWLS Network with TTL=15. Curves: hit rate, fraction of edges used, fraction of nodes used. Horizontal axis: percolation probability.]
● A percolation search protocol on a Gnutella network of size 39,730. The network structure was obtained by Limewire's Crawler.

Brunet: A Hybrid P2P System
● DHTs cannot resolve general queries.
● Unstructured systems (usually) require large routing tables to return query hits.
● Brunet is a new P2P protocol which combines the advantages of both DHTs and unstructured power-law networks.
● Brunet offers a general P2P foundation on which a wide variety of protocols and applications can build.

Brunet: A Hybrid P2P System
● Each node has a 160 bit address which can also be thought of as a 160 bit positive integer. A distance metric is defined using the integer representation.
● Each node is situated on a routable small world ring with “structured” connections to its neighbors on the ring, and shortcuts to remote locations.
● Each node is also on an “unstructured” network and has “unstructured” connections to nodes on that power-law network.
[Figure: Structured Subgraph (small world); Unstructured Subgraph (1/k^2)]

Brunet Implementation
● The first implementation of the Brunet protocol is being completed at UCLA's Complex Networks Group.
● The code is developed using GNU/Linux and the Mono C# development environment.
● In addition to a programming library which implements the Brunet protocol, we have developed other tools:
– Netmodeler: a general C++ network modeling package
– Brunet Verifier: a protocol debugger for Brunet implementations

Open Problems
● Can the DHT or unstructured systems be used to build an improved model of distributed computing (e.g. how can these P2P models help in mapping task graphs onto resources)?
● What common primitives can be implemented using P2P systems (e.g. what kinds of communications costs are incurred building a P2P Database)?
● What results can be obtained about protocol security? Can bad nodes ruin the network?

Summary
● Using models inspired from social contexts (such as small world and power-law networks) we see how some computer networking systems and architectures can be improved.
● Statistical Mechanics tools (percolation) allow us to analyze some novel networking conditions.
● By engineering previously ignored structural details of P2P systems, polylog scaling is achieved.
● The Brunet P2P system puts the DHT model together with the percolation search to get state of the art scaling properties.
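The polylog scaling claim for percolation search can be checked numerically. The sketch below is an illustration rather than code from the Brunet project: the function names are hypothetical, and it assumes a degree distribution proportional to 1/k^2 on k = 1..k_max with k_max = N. It computes the exact moments of that truncated distribution, the threshold q_c = <k> / (<k^2> - <k>), and the query cost C = q_c <k> N / 2.

```python
import math


def truncated_power_law_moments(k_max, exponent=2.0):
    """Return (<k>, <k^2>) for p_k proportional to k^-exponent on 1..k_max."""
    ks = range(1, k_max + 1)
    norm = sum(k ** -exponent for k in ks)
    mean = sum(k ** (1 - exponent) for k in ks) / norm
    second = sum(k ** (2 - exponent) for k in ks) / norm
    return mean, second


def percolation_threshold(mean, second):
    """Bond percolation threshold q_c = <k> / (<k^2> - <k>)."""
    return mean / (second - mean)


def query_cost(n, k_max=None):
    """Expected edges crossed by a percolation query run at the threshold.

    Uses C = q_c * E with E = <k> * N / 2. Setting k_max = N by default
    is an assumption matching the k_max = O(N) case.
    """
    if k_max is None:
        k_max = n
    mean, second = truncated_power_law_moments(k_max)
    q_c = percolation_threshold(mean, second)
    edges = mean * n / 2
    return q_c * edges


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        cost = query_cost(n)
        print(n, round(cost, 1), round(cost / math.log(n) ** 2, 3))
```

If the O(log^2 N) estimate holds, the last printed column (cost divided by log^2 N) should stay roughly constant while N grows by factors of ten.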
