Lecture 1
introduction: systems, fundamental problems, and models
distributed systems
CS425 / ECE 428 / CSE 424
acknowledgment
These slides are based on ideas and material from the following sources:
• slides prepared by Professors M. T. Harandi and J. Hou and subsequently modified by Professors Indranil Gupta, Nitin Vaidya, and Yih‐Chun Hu at University of Illinois
• slides from Professor S. Ghosh’s course at University of Iowa

1. the Internet
[Figure: Burch/Cheswick map of the Internet showing the major ISPs, color coded by ISP. Internet Mapping Project, data collected 28 June 1999. Copyright (C) 1999, Lucent Technologies.]
a small part of the Internet
[Figure: intranets and ISPs joined by a backbone and a satellite link. Legend: desktop, server, network link.]

a typical Intranet
[Figure: local area networks with desktop computers, an email server, a Web server, a file server, print and other servers, and a router/firewall connecting to the rest of the Internet.]

4. distributed mobile robots
Kiva robot
500 robots coordinate to manage inventory in a warehouse
http://www.raffaello.name/KivaSystems.html
5. computation grids
• Seti@Home
• uses Internet‐connected computers in the Search for Extraterrestrial Intelligence
• downloads and analyzes radio telescope data on your desktop and sends it back to Seti servers
• http://setiathome.berkeley.edu/

what is a distributed system?
it is a collection of entities (computers, robots, fireflies,…)
– each of which is autonomous, asynchronous and [possibly] failure‐prone
– communicating through [possibly] unreliable channels (ethernet, wireless, vision,…)
– to perform some common function (detection, communication, computation)
[Figure: a graph of nodes (or processes or agents) joined by edges (or links or channels)]

why distributed systems ?
* geographic distribution of processes
* resource sharing (as used in P2P networks)
* computation speed up (as in a grid)
* fault tolerance
CS 425
CS 425 is an introduction to the key
concepts for
— designing
— analyzing and
— implementing
distributed systems
important role of failures
another definition:
a distributed system is a system in which the
failure of a computer you’ve never heard of can
prevent you from getting your work done
fundamental problems in distributed systems
• time synchronization
• leader election
• mutual exclusion
• distributed snapshot
• routing
• consensus
• replica management
• transactions
we will study algorithms (and lower bounds) for these types of problems

approaching implementations
[Figure: a typical Intranet: local area networks with desktop computers, email, Web, file, print and other servers, and a router/firewall connecting to the rest of the Internet]

a simple communication model: message passing
system topology is a graph G = (V, E), where
V = set of nodes (sequential processes)
E = set of edges (links or channels, bi/unidirectional)
three types of actions by a process:
– internal action: sequential process computes
– send action: sends a message (puts message in channel) and performs computation
– receive action: receives a message (message taken out from channel) and performs computation

a reliable FIFO channel
Axiom 1. Message m sent ⇔ message m received
Axiom 2. Message propagation delay is arbitrary but finite.
Axiom 3. m1 sent before m2 ⇒ m1 received before m2.
[Figure: a channel from process P to process Q]

shared memory model
address spaces of processes overlap
[Figure: processes 1, 2, 3, 4 and shared memories M1, M2]
concurrent operations on a shared variable are serialized

weak vs. strong models
One object (or operation) of a strong model = more than one object (or operation) of a weaker model.
Often, weaker models are synonymous with fewer restrictions.
One can add layers (additional restrictions) to create a stronger model from a weaker one.
Examples
HLL model is stronger than assembly language model.
Asynchronous is weaker than synchronous.
Bounded delay is stronger than unbounded delay (channel).

model transformation
Stronger models
‐ simplify reasoning, but
‐ need extra work to implement
Weaker models
‐ are easier to implement.
‐ have a closer relationship with the real world
“Can model X be implemented using model Y?” is an interesting question in computer science.
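The message passing model and the reliable FIFO channel axioms can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the class names `Channel` and `System` are hypothetical, each directed edge of G is modeled as an in-memory queue, and the "arbitrary but finite" delay of Axiom 2 is represented simply by a message sitting in the queue until the receiver takes it out.

```python
from collections import deque


class Channel:
    """Hypothetical reliable FIFO channel: a queue of in-flight messages."""

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        # Send action: put the message in the channel.
        self._queue.append(message)

    def receive(self):
        # Receive action: take the oldest message out of the channel.
        # Returns None if nothing is in flight.
        return self._queue.popleft() if self._queue else None


class System:
    """Topology G = (V, E): nodes are processes, directed edges are channels."""

    def __init__(self, nodes, edges):
        self.nodes = set(nodes)
        self.channels = {(src, dst): Channel() for (src, dst) in edges}

    def send(self, src, dst, message):
        self.channels[(src, dst)].send(message)

    def receive(self, src, dst):
        return self.channels[(src, dst)].receive()


# Two processes P and Q joined by one unidirectional channel P -> Q.
system = System(nodes={"P", "Q"}, edges=[("P", "Q")])
system.send("P", "Q", "m1")
system.send("P", "Q", "m2")

# Axiom 3: m1 was sent before m2, so it is received before m2.
received = [system.receive("P", "Q"), system.receive("P", "Q")]
```

Because the queue never drops or duplicates entries, every message sent is received exactly once (Axiom 1), and the queue discipline gives the ordering of Axiom 3.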
sample problems
• non‐FIFO to FIFO channel
• message passing to shared memory

from non-FIFO to FIFO channel ?
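[Figure: two processes joined by a non‐FIFO channel carrying messages m1, m2, m3, m4 out of order; a buffer holds messages that arrive early]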
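One common way to answer the question "from non-FIFO to FIFO channel ?" is to tag each message with a sequence number and have the receiver buffer anything that arrives early. The sketch below is a minimal illustration, not necessarily the construction used in the lecture: the class name `FifoReceiver` and the choice to start sequence numbers at 0 are assumptions, and the underlying channel is assumed to reorder messages but never lose or duplicate them.

```python
class FifoReceiver:
    """Hypothetical receiver that restores send order over a reordering channel."""

    def __init__(self):
        self.next_seq = 0   # sequence number of the next message to deliver
        self.buffer = {}    # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one message from the channel; return the messages that
        can now be delivered to the application, in send order."""
        self.buffer[seq] = payload
        delivered = []
        # Deliver the longest run of consecutive messages starting at next_seq.
        while self.next_seq in self.buffer:
            delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return delivered


# The sender numbered m1..m4 as 0..3; the channel delivers m1 last.
receiver = FifoReceiver()
arrivals = [(1, "m2"), (2, "m3"), (3, "m4"), (0, "m1")]
deliveries = [receiver.receive(seq, payload) for seq, payload in arrivals]
```

Here the first three arrivals are held in the buffer and nothing is delivered; once m1 arrives, all four messages are released in order. The buffer can grow without bound if a low-numbered message is delayed for a long time, which is part of the extra work needed to implement the stronger model.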
typical design goals
• heterogeneity – can the system handle different types of PCs and devices?
• robustness – is the system resilient to crashes and failures?
• availability – are data, services always there?
• transparency – does the system hide internal workings from the users?
• concurrency – can the server handle multiple clients simultaneously?
• efficiency – is it fast enough?
• scalability – can it handle 100 million nodes?
• security – can the system withstand hacker attacks?
• openness – is the system extensible?

extra slides
Internet: Protocols and Transport
Application               Application layer protocol       Underlying transport protocol
e-mail                    smtp [RFC 821]                   TCP
remote terminal access    telnet [RFC 854]                 TCP
Web                       http [RFC 2068]                  TCP
file transfer             ftp [RFC 959]                    TCP
streaming multimedia      rtsp [RFC 2326], proprietary     TCP or UDP
remote file server        NFS, AFS, …                      TCP or UDP
Internet telephony        H.323, SIP [RFC 2543], …         typically UDP

Security
• Security has three components:
– Confidentiality (protection against disclosure to unauthorized individuals).
– Integrity (protection against alteration or corruption).
– Availability (protection against interference with the means to access the resources).
• Two security challenges:
– Denial of service attacks: bombarding the service with a large number of pointless requests.
– Mobile code security: mobile code may access local resources.

Scalability
• A system is said to be scalable if it will remain effective when there is a significant increase in the number of resources and users:
– Controlling the cost of resources
– Controlling the performance loss
– Preventing software resources from running out (e.g., IP addresses)
– Avoiding performance bottlenecks

Failure Handling
• Failure in DS is partial: some component fails while the rest is functional
– Detecting failures (remote site crash or delay in message transmission?)
– Masking failures (message retransmission, file replication)
– Tolerance for failure (clients give up after a pre‐determined number of attempts and take other actions)
– Failure recovery (checkpoint and rollback recovery)
– Redundancy (multipath routing, replicated database, replicated DNS)
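Two of the failure handling ideas, masking by retransmission and tolerance by giving up after a pre‐determined number of attempts, can be sketched together. Everything named here is hypothetical: `FlakyLink` is a stand-in for an unreliable channel that loses its first few transmissions, and a lost message is modeled as a `TimeoutError` raised when no acknowledgment arrives.

```python
class FlakyLink:
    """Hypothetical link that loses the first `drops` transmissions."""

    def __init__(self, drops):
        self.drops = drops
        self.attempts = 0

    def transmit(self, message):
        self.attempts += 1
        if self.attempts <= self.drops:
            # Model a lost message as a missing acknowledgment.
            raise TimeoutError("no acknowledgment")
        return f"ack:{message}"


def send_with_retries(link, message, max_attempts=3):
    """Retransmit until acknowledged (masking) or until max_attempts is
    reached, then give up and return None (tolerance)."""
    for _ in range(max_attempts):
        try:
            return link.transmit(message)
        except TimeoutError:
            continue  # retransmit
    return None  # caller can now take other actions


# Two losses are masked by the third attempt.
masked = send_with_retries(FlakyLink(drops=2), "m", max_attempts=3)

# Five losses exceed the retry budget, so the client gives up.
gave_up = send_with_retries(FlakyLink(drops=5), "m", max_attempts=3)
```

Note that a timeout alone cannot tell the sender whether the remote site crashed or the message was merely delayed, which is exactly the detection difficulty listed above.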