CSE 123b Communications Software
Spring 2002
Lecture 14: Peer-to-Peer Networks
Stefan Savage
Some slides courtesy Ion Stoica and Srini Seshan

May 31, 2002 CSE 123b – Lecture 14 – Peer-to-Peer Networks

Peer-to-peer systems
Examples
◆ Napster, Gnutella, Freenet, KaZaA, CFS, etc.
Definition?
◆ No distinction between client and server
◆ All nodes are potential users of a service AND potential providers of a service

Classifications
What resource is shared?
◆ CPU: SETI@home
◆ Storage & BW: most of the rest
How are resources located?
◆ Centralized systems
  » Napster, SETI@home
◆ Distributed systems
  » Unstructured: e.g. Gnutella
  » Structured/routed: e.g. CFS/Chord, Freenet
Search vs. lookup

Challenges
Dynamic availability
Scale
Heterogeneity
Security
Fairness
Performance
Management

The Lookup Problem
[Figure: nodes N1 to N6 connected through the Internet. A publisher holds Key=“title”, Value=MP3 data…; a client issues Lookup(“title”) and has to find the node that stores it.]
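
To make the lookup problem concrete, here is a minimal Python sketch of two of the location strategies named on the Classifications slide: a centralized directory and an unstructured flood. The overlay links, the `INDEX` and `STORED` dictionaries, the function names, and the hop limit `ttl` are all assumptions made for illustration, as is placing the published item on N4. The flood also stops at the first holder it reaches, which a real flood would not do, so the message count is only indicative.

```python
from collections import deque

# Hypothetical overlay links between six nodes; the slide's figure does not
# specify a topology, so this one is made up for the sketch.
LINKS = {
    "N1": ["N2", "N3"],
    "N2": ["N1", "N4"],
    "N3": ["N1", "N5"],
    "N4": ["N2", "N6"],
    "N5": ["N3", "N6"],
    "N6": ["N4", "N5"],
}
STORED = {"N4": {"title": b"MP3 data"}}   # assumed: the item lives on N4
INDEX = {"title": "N4"}                   # assumed central directory contents


def central_lookup(key):
    """Centralized style: one directory maps each key to the node holding it."""
    return INDEX.get(key)


def flood_lookup(start, key, ttl):
    """Unstructured style: forward the query to neighbors until ttl runs out.

    Returns (node holding key or None, number of query messages sent).
    """
    seen = {start}
    queue = deque([(start, ttl)])
    messages = 0
    while queue:
        node, hops_left = queue.popleft()
        if key in STORED.get(node, {}):
            return node, messages         # simplification: stop at first hit
        if hops_left == 0:
            continue                      # hop limit reached, do not forward
        for peer in LINKS[node]:
            messages += 1                 # every forwarded copy is a message
            if peer not in seen:          # duplicates are dropped on arrival
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return None, messages
```

With these assumed links, `central_lookup("title")` needs one directory query, while `flood_lookup("N1", "title", ttl=3)` reaches N4 only after several forwarded messages, and a hop limit of 1 fails to find the item at all.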

Centralized Lookup (Napster)
[Figure: the publisher at N4 holds Key=“title”, Value=MP3 data… and registers SetLoc(“title”, N4) with a central DB; the client sends Lookup(“title”) to the DB. Other nodes shown: N1, N2, N3, N6, N7, N8, N9.]
Simple, but O(N) state and a single point of failure

Napster
Simple centralized scheme motivated by ability to sell/control
How to find a file:
◆ On startup, client contacts central server and reports list of files
◆ Query the index system, which returns a machine that stores the required file
  » Ideally this is the closest/least-loaded machine
◆ Fetch the file directly from peer
Advantages:
◆ Simplicity, easy to implement sophisticated search engines on top of the index system
Disadvantages:
◆ Robustness, scalability

Flooded Queries (Gnutella)
[Figure: the publisher at N4 holds Key=“title”, Value=MP3 data…; the client's Lookup(“title”) is flooded across nodes N1, N2, N3, N6, N7, N8, N9.]
Robust, but worst case O(N) messages per lookup

Gnutella
Distributed location information
Idea: multicast the request
How to find a file:
◆ Send request to all neighbors
◆ Neighbors recursively multicast the request
◆ Eventually a machine that has the file receives the request, and it sends back the answer
Advantages:
◆ Totally decentralized, highly robust
Disadvantages:
◆ Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)

Gnutella Details
Basic message header
◆ Unique ID, TTL, Hops
Message types
◆ Ping – probes network for other nodes
◆ Pong – response to ping, contains IP addr, # of files, # of Kbytes shared
◆ Query – search criteria + speed requirement of node
◆ QueryHit – successful response to Query, contains addr + port to transfer from, speed of node, number of hits, hit results, node ID
◆ Push – request to node ID to initiate connection, used to
  traverse firewalls
Ping and Query messages are flooded
QueryHit, Pong, and Push follow the reverse path of the previous message

Routed Queries (Freenet, Chord, etc)
[Figure: the client issues Lookup(“title”); the publisher at N4 holds Key=“title”, Value=MP3 data…. Other nodes shown: N1, N2, N3, N6, N7, N8, N9.]

Example: Freenet
Architecture:
◆ Each file is identified by a unique identifier
◆ Each machine stores a set of files, and maintains a “routing table” to route the individual requests
Goals in addition to file location:
◆ Provide publisher anonymity, security
◆ Resistant to attacks – a third party shouldn’t be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines

Chord Example
Assume an identifier space 0..7
Node n1:(1) joins: all entries in its finger table are initialized to itself
[Figure: identifier ring with positions 0 to 7]

Succ. Table (n1)
i  id+2^i  succ
0  2       1
1  3       1
2  5       1

Chord Example
Node n2:(2) joins
[Figure: identifier ring with positions 0 to 7]

Succ. Table (n1)
i  id+2^i  succ
0  2       2
1  3       1
2  5       1

Succ. Table (n2)
i  id+2^i  succ
0  3       1
1  4       1
2  6       1

Chord Example
Nodes n3:(0), n4:(6) join
[Figure: identifier ring with positions 0 to 7]

Succ. Table (n1)
i  id+2^i  succ
0  2       2
1  3       6
2  5       6

Succ. Table (n2)
i  id+2^i  succ
0  3       6
1  4       6
2  6       6

Succ. Table (n3)
i  id+2^i  succ
0  1       1
1  2       2
2  4       6

Succ. Table (n4)
i  id+2^i  succ
0  7       0
1  0       0
2  2       2

Chord Examples
Nodes: n1:(1), n2:(2), n3:(0), n4:(6)
Items: f1:(7), f2:(2)
[Figure: identifier ring 0 to 7 with the same four Succ. Tables as on the previous slide, plus the nodes' Items boxes (labeled 7 and 1 in the figure)]

Query
Upon receiving a query for item id, a node checks whether it stores the item locally
If not, it forwards the query to the largest node in its successor table that does not exceed id
[Figure: the same ring, Succ. Tables, and Items boxes, with query(7) in progress]

Chord Summary
O(log N) guaranteed lookup performance
No search
Performance: routing in the overlay network can be more expensive than in the underlying network
◆ Because usually there is no correlation between node ids and their locality; a query can repeatedly jump from Europe to North America, though both the initiator and the node that stores the item are in Europe!
◆ Partial solution: weight neighbor nodes by RTT
  » when routing, among the neighbors that are closer to the destination, choose the one with the lowest RTT from me
  » reduces path latency

Discussion
Freeloading problem
◆ Does everyone participate?
Trust?
Availability/reliability?

Summary
A key challenge of building wide area P2P systems is a scalable and robust location service
Solutions covered in this lecture
◆ Napster: centralized location service
◆ Gnutella: broadcast-based decentralized location service
◆ Freenet: intelligent-routing decentralized solution (but correctness not guaranteed; queries for existing items may fail)
◆ Chord (and others): intelligent-routing decentralized solution
  » Guarantee correctness
  » May not be efficient
Lots of open questions
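
The Chord example from the earlier slides can be reproduced with a short Python sketch. It assumes a 3-bit identifier space (ids 0..7) and the node ids 0, 1, 2, and 6 that the id+2^i columns of the Succ. Tables imply. The routing function uses the standard Chord formulation (hop to the farthest finger that still precedes the key, then hand off to the immediate successor) rather than the exact wording of the Query slide. All function and variable names are hypothetical.

```python
M = 3                       # assumed identifier bits, giving ids 0..7
RING = 2 ** M
NODES = [0, 1, 2, 6]        # assumed node ids, as implied by the Succ. Tables


def successor(ident):
    """Return the first node at or clockwise after ident."""
    ident %= RING
    for node in sorted(NODES):
        if node >= ident:
            return node
    return min(NODES)       # wrapped past the highest node id


def finger_table(node):
    """Rows of (i, (node + 2**i) mod RING, successor of that id)."""
    return [(i, (node + 2 ** i) % RING, successor(node + 2 ** i))
            for i in range(M)]


def clockwise(x, a, b, closed_right):
    """True if x is in the ring interval (a, b), or (a, b] if closed_right."""
    span = (b - a) % RING or RING    # a == b means the whole ring
    dist = (x - a) % RING
    return 0 < dist < span or (closed_right and x == b)


def lookup(start, key):
    """Standard Chord routing: return (node owning key, nodes visited)."""
    node, path = start, [start]
    while node != key:
        succ = finger_table(node)[0][2]          # immediate successor
        if clockwise(key, node, succ, closed_right=True):
            return succ, path + [succ]
        # Otherwise hop to the farthest finger that still precedes the key.
        node = next(f for _, _, f in reversed(finger_table(node))
                    if clockwise(f, node, key, closed_right=False))
        path.append(node)
    return node, path
```

Under these assumptions, `finger_table(1)` gives the rows (0, 2, 2), (1, 3, 6), (2, 5, 6), and `lookup(1, 7)` visits nodes 1, 6, and 0, ending at node 0 as the owner of item 7.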