Application Level Tools - Lecture Slides | CMSC 818, Study notes of Computer Science

Material Type: Notes; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;

Uploaded on 02/13/2009

koofers-user-46e
Chapter 24
Application-Level Tools
Naiwen Lin

Questions
- How do I port my application to the Grid?
- How do I run my application on the Grid?
- Where is the Grid I can use now?
- Application scientists do not have the time, expertise, and motivation to learn the details of Grid information services

Grid Programming Models
Four different programming models that appear particularly relevant to the Grid environment
- RPC
- Task parallelism
- Message passing
- Java-based models

Grid Programming Models (cont.)
RPC
- Several projects have been initiated to specialize coarse-grained task parallelism in their programs and to access remote computational services available on Grid resources.
- Related projects: NetSolve, RCS, PUNCH, and NEOS.

Grid Programming Models (cont.)
Task Parallelism
- Achieved by partitioning the work to be performed
- Includes functionality for collecting and combining results
- Example: Satin

Grid Application Execution Environments
- Parameter Sweep Applications
- Workflow Applications
- Portals

Grid Application Execution Environments
Parameter Sweep Applications
- Set of multiple tasks, each executed with a distinct set of parameters
- Executing a parameter sweep involves assignment of tasks to resources
- Tasks are basically independent
- Namely
  - Many compute tasks
  - No or simple dependencies
  - Several output post-processing stages
  - Potentially large datasets
[Figure: raw output passing through post-processing to final output]

Grid Application Execution Environments
- Portals can provide simple interfaces
- Portals are web based
- Interface to middle-tier infrastructure of the Grid
- Users can be isolated from resource-specific details
- Uniform interface isolates system changes/differences

APST Case Study
Parameter Sweep Applications
- "Many" tasks
- Tasks vary in switches and/or input files
- Minimal inter-task communication
- Typically produce files as output
- Found in bioinformatics, neuroscience, computer graphics, discrete-event simulations, protein folding, database searches, etc.

APST Case Study
PSAs Map Well to Grid
- Can effectively use huge amounts of resources
- Flexibility in task assignment
- Latency tolerant: little communication
- Fault tolerant: restarting a task is sufficient

APST Case Study
Encyclopedia of Life (eol.sdsc.edu)
- An ambitious project seeking to catalog the complete proteome of every living species in a flexible, powerful reference system
- Includes calculating three-dimensional models and assigning biological function for all recognizable proteins in all currently known genomes

APST Case Study
Example: Encyclopedia of Life Project
[Figure: processing pipeline with the labels Genome Data, Preprocessing, Sequences, Analysis Apps (e.g., psiblast, 123d), Data Base, Analysis Output, and Postprocessing; the legend distinguishes Application Tasks from Application Data Files]

Conclusions
- The main goal of Grid application-level tools is to render Grid capabilities accessible to not just individual heroic users but entire end-user communities.
- Will focus on performance issues so that the base infrastructure can be used efficiently, for example via both static compilation and dynamic optimization techniques.

Road map
- Geographical environment: DAS
- Three programming environments
  - MagPie library for collective communication with MPI
  - RepMI (Replicated Method Invocation) mechanism for Java
  - Java-based Satin system for running divide-and-conquer programs
- Panda wide-area network emulator

DAS
- DAS stands for Distributed ASCI Supercomputer
- A wide-area distributed cluster designed by the Advanced School for Computing and Imaging (ASCI).
- DAS will be used for research on parallel and distributed computing by five Dutch universities.

DAS (cont.)
- 4 clusters and 200 nodes
  - 128 nodes at Vrije Universiteit Amsterdam
  - 72 nodes at University of Amsterdam, Delft University of Technology, University of Leiden
- Running RedHat Linux
- WAN: SurfNet
- LAN: MyriNet

Solutions
- Most applications can be rewritten in order to tolerate the high latency and the low bandwidth of WAN links.
- High WAN latency can be tolerated by overlapping computation with asynchronous communication
- Low WAN bandwidth can be tolerated by
  - Avoiding redundant communication
  - Combining several short messages into longer ones

Issues
- Our manual modifications to the application source code were effective but also increased code complexity.
- Multi-cluster aspects of communication should be separated from the application-specific parts of the source code.
- To this end, Grid programming environments are developed.

MagPie
- The collective communication operations: broadcast, barrier, reduce, etc.
- The MagPie library implements MPI's collective operations with optimizations for wide-area systems.
- Existing parallel MPI applications can be run on Grid platforms using MagPie by relinking the programs with the MagPie library

Satin
- An extension of the single-thread Java model.
- Satin programs do not have to use Java's threads or RMI
- Instead, they use the much simpler divide-and-conquer primitives
- Allows the combination of its divide-and-conquer primitives with Java threads and RMI

Satin (cont.)
Augments the Java language with
- The satin modifier, placed in front of a method declaration
- The spawn keyword, placed in front of a method invocation to indicate possibly parallel execution
- The sync operation, which waits until all spawned calls in the current method invocation are finished

Satin (cont.)
Load balancing
- Satin's implementation eliminates thread creation
- A spawned method invocation is put into a local work queue, and the method might be transferred to a different CPU where it may run concurrently (work stealing)

Satin: Work Stealing
Cluster-Hierarchical Stealing (CHS)
- Arrange processors in a tree topology
- When a node is idle, it first asks its child nodes for work, and then recursively descends the tree.
- Only when the entire subtree is idle does it send a steal message upwards in the tree
- Drawback: all nodes of a cluster have to become idle before wide-area stealing

Satin: Work Stealing
Cluster-aware Random Stealing (CRS)
- Each node can steal jobs from nodes in remote clusters, but at most one job at a time
- The wide-area steal request is sent asynchronously
- The node sets a flag and performs additional synchronous steal requests to randomly selected nodes within its own cluster
- If the remote request is successful, the node puts the new job into the work queue and resets the flag

Panda: the WAN emulator
- Panda is a virtual machine designed to support portable implementations of parallel programming systems
- It provides communication primitives and thread support to higher-level layers, such as message passing, RPC, and group communication
- One dedicated node in each cluster acts as a gateway

Satin Case Study
- 9 scenarios with different network configurations
- Scenario 9 is prerecorded NWS measurements of the real DAS system
- Using the following Satin applications
  - Adaptive integration
  - N Queens
  - Ray Tracer
  - Traveling Salesperson

Scenarios 1-4
[Figure: network topology diagrams for scenarios 1-4; each wide-area link is labeled with a bandwidth and a latency (for example 200 KB/s, 100 ms), but the individual link labels are not reliably recoverable from the extracted text]

Scenarios 5-8
[Figure: network topology diagrams for scenarios 5-8; each wide-area link is labeled with a bandwidth and a latency (for example 200 KB/s, 30 ms), but the individual link labels are not reliably recoverable from the extracted text]

Enabling Applications on the Grid: A GridLab overview
Naiwen Lin

Introduction
- The GridLab project develops an easy-to-use, flexible, generic and modular Grid Application Toolkit (GAT), enabling today's applications to make innovative use of global computing resources.
- The project is grounded in two principles
  - the co-development of infrastructure with real applications and user communities, leading to working scenarios, and
  - dynamic use of grids, with self-aware simulations adapting to their changing environment.

GridLab Aims
- Get computational scientists using the "Grid" and Grid services for real, everyday, production work (AEI Relativists, EU Network, Grav Wave Data Analysis, Cactus User Community).
- Make it easier for applications to make flexible, efficient, robust use of the resources available to their virtual organizations.
- Dream up, prototype, and test new application scenarios which make adaptive, dynamic, wild, and futuristic uses of resources.

What is the GAT?
- Set of application developer APIs for Grid tools and services
- Usable from any high-level "application" (any generic code, Cactus, Triana, Portals, Scripts, …)

Terminology
- Capability Provider: an entity providing a specific capability
- Service: a network-enabled entity that provides a specific capability
- Adaptor: the adaptor pattern provides programmers with interfaces

Requirements
- Abstraction of the environment
- Adaptivity to the environment
- Interchangeability of capability providers
- Complete control on all levels
- Smart adaptivity on all levels

GridLab Architecture
- The applications, located on the highest level of the user space, can access all capability providers they need via the GAT-API
- The GAT also resides in user space, providing interfaces to the capability providers in the capability space

[Figure: layered architecture diagram showing the Application Layer (Application), the GAT Layer (GAT API), and the Service Layer (GridLab Services, Monitoring, and other services and libraries)]
Fig. 1 The General GridLab Architecture. The GAT will be designed to interact with all types of capability providers, via various communication channels. Security and uniformity for both types of components are recommended, not mandated (see text).

GAT Design
- GAT must be able to choose from a pool of adaptors and invoke different adaptors as needed
- Each of these adaptors will present the same interface to GAT
- GAT must choose between them at runtime, and invoke its selection on the fly

[Figure: sequence diagram involving the Application, Adaptor 1 and Adaptor 2, with the steps Initialize, Find Function and Register]
Fig. 3 GAT Initialization. On initialization, all adaptors register the capabilities they provide with the GAT Capability Registry.

[Figure: sequence diagram involving the Application, an Adaptor and a Capability, with the steps Operation and Find Function]
Fig. 4 GAT-API Call. On invocation, the GAT queries the capability registry for matching adaptors, and returns the best fitting one with respect to some metric. The GAT engine invokes that adaptor in order to service the API call. The adaptor forwards this call to the actual capability provider implementing the associated functionality.

GridLab Applications
Triana
- An open source PSE written in Java
- Has a flexible and intuitive design that can be used in many different problem domains and at many different levels
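
The adaptor flow described in the GAT Design slides (adaptors register their capabilities at initialization, and each API call is routed to the best-fitting adaptor) can be illustrated with a small sketch. This is not the GAT API: the names `Adaptor`, `CapabilityRegistry`, `Engine`, `file_copy` and `job_submit` are hypothetical, and the numeric `cost` field is an assumed stand-in for the unspecified "some metric" used to pick the best adaptor.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Adaptor:
    """Hypothetical adaptor wrapping one capability provider."""
    name: str
    # Capability name -> function that forwards the call to the provider.
    capabilities: Dict[str, Callable[..., str]]
    # Assumed selection metric (lower is better); the slides do not define one.
    cost: float


@dataclass
class CapabilityRegistry:
    """Maps each capability name to all adaptors that registered it."""
    entries: Dict[str, List[Adaptor]] = field(default_factory=dict)

    def register(self, adaptor: Adaptor) -> None:
        # Initialization step: record every capability the adaptor offers.
        for capability in adaptor.capabilities:
            self.entries.setdefault(capability, []).append(adaptor)

    def find(self, capability: str) -> Adaptor:
        # Lookup step: return the best-fitting adaptor under the assumed metric.
        candidates = self.entries.get(capability, [])
        if not candidates:
            raise LookupError(f"no adaptor provides {capability!r}")
        return min(candidates, key=lambda a: a.cost)


class Engine:
    """Hypothetical engine: selects an adaptor at runtime and invokes it."""

    def __init__(self, adaptors: List[Adaptor]) -> None:
        self.registry = CapabilityRegistry()
        for adaptor in adaptors:
            self.registry.register(adaptor)

    def call(self, capability: str, *args: str) -> str:
        adaptor = self.registry.find(capability)
        return adaptor.capabilities[capability](*args)


# Two made-up adaptors that expose the same interface for "file_copy".
local = Adaptor("local", {"file_copy": lambda src, dst: f"cp {src} {dst}"}, cost=1.0)
remote = Adaptor(
    "remote",
    {
        "file_copy": lambda src, dst: f"remote copy {src} -> {dst}",
        "job_submit": lambda job: f"submitted {job}",
    },
    cost=5.0,
)

engine = Engine([local, remote])
print(engine.call("file_copy", "a.dat", "b.dat"))  # served by the cheaper "local" adaptor
print(engine.call("job_submit", "sweep-1"))        # only "remote" offers this capability
```

Because the application only names a capability, swapping or adding adaptors does not change the calling code, which is the interchangeability of capability providers listed under Requirements.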