Introduction to Graph Models BINF739 Solka/Weller BINF739

Introduction to Graph Models
BINF739, Spring 2007
Jeff Solka and Jennifer Weller

Acknowledgement
Unless otherwise noted, all figures in this lecture have been adapted from Gross and Yellen, Graph Theory and Its Applications, Chapman and Hall/CRC Press, 2006.

What is a Graph?
A graph consists of a collection of nodes and edges that connect the nodes.
The nodes are entities and the edges represent relationships between the entities. For example:
• Nodes = proteins in a cell
• Edges = relationships between these proteins
A graph is usually denoted G = (V, E), where V = vertices and E = edges.
Edges can be assigned weights, directions, and types.

Applications of Graph Theory
• Communication networks
• Social network analysis
• Regulatory and developmental networks
• Citation networks
• Statistical data mining
• Dimensionality reduction
• Classification
• Clustering

Gene Ontology: A Graph of Concept Terms
Gentleman et al., Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer 2005.

Bipartite Gene Article Graph
Gentleman et al., Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer 2005.

1.1 – Graphs and Digraphs
Def. – A graph G = (V, E) is a mathematical structure consisting of two finite sets V and E. The elements of V are called the vertices (or nodes), and the elements of E are called the edges. Each edge has a set of one or two vertices associated with it, which are called its endpoints.
Ex.
1.1.1 – The vertex and edge sets of graph A are V_A = {p, q, r, s} and E_A = {pq, pr, ps, rs, qs}.
Def. – The (open) neighborhood of a vertex v in a graph G, denoted N(v), is the set of all the neighbors of v. The closed neighborhood of v is given by N[v] = N(v) ∪ {v}.

1.1 – Simple Graphs and General Graphs
Def. – A proper edge joins two distinct vertices.
Def. – A self-loop is an edge that joins a single endpoint to itself.
Def. – A multi-edge is a collection of two or more edges having identical endpoints. The edge multiplicity is the number of edges within the multi-edge.
Def. – A simple graph has neither self-loops nor multi-edges.
Def. – A loopless graph (or multi-graph) may have multi-edges but no self-loops.
Def. – A (general) graph may have self-loops and/or multi-edges.

1.1 – Null and Trivial Graphs
Def. – A null graph is a graph whose vertex- and edge-sets are empty.
Def. – A trivial graph is a graph consisting of one vertex and no edges.

1.1 – Edge Directions
Def. – A directed edge (or arc) is an edge, one of whose endpoints is designated as the tail, and whose other endpoint is designated as the head.
Def. – A directed graph (or a digraph) is a graph each of whose edges is directed. A digraph is simple if it has neither self-loops nor multi-arcs.

1.1 – Mathematical Modeling with Graphs
A mixed graph roadmap model.
A digraph model of a corporate hierarchy.

1.1 – Degree of a Vertex
Def. – Adjacent vertices are two vertices that are joined by an edge.
Def. – Adjacent edges are two edges that have an endpoint in common.
Def.
– If a vertex v is an endpoint of edge e, then v is said to be incident on e, and e is incident on v.
Def. – The degree (or valence) of a vertex v in a graph G, denoted deg(v), is the number of proper edges incident on v plus twice the number of self-loops.
Def. – The degree sequence of a graph is the sequence formed by arranging the vertex degrees in non-increasing order.
The degree sequence does not uniquely determine the graph.

1.2 Common Families of Graphs
Def. – A regular graph is a graph whose vertices all have equal degree.
The Petersen graph (a “poster child” for conjecture testing and theorem proving).
We can use graph theoretic models to model chemical compounds.
Working Group on Computer-Generated Conjectures from Graph Theoretic and Chemical Databases I
http://dimacs.rutgers.edu/SpecialYears/2001_Data/Conjectures/abstracts.html

Topics in intersection graph theory [SIAM Monographs on Discrete Mathematics and Applications #2] Terry A. McKee and F.R. McMorris. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1999, vii+205 pp.
ISBN: 0-89871-430-3 QA 166.105.M34
Decomposition of overlapping protein complexes: A graph theoretical method for analyzing static and dynamic protein associations. Elena Zotenko, Katia S Guimarães, Raja Jothi and Teresa M Przytycka, Algorithms for Molecular Biology 2006, 1:7.

1.3 Graph Modeling Applications
A bipartite encoding of a document collection (vertex classes: words, documents).
A bipartite encoding of a gene expression experiment (vertex classes: genes, samples).
(Evolution of co-author networks) http://www.scimaps.org/dev/big_thumb.php?map_id=54
• Classroom friendship data
• Dark lines indicate reciprocated relationships.
• Random Effects Models for Network Data (2003), Peter Hoff, Proceedings of the National Academy of Sciences: Symposium on Social Network Analysis for National Security

1.4 – Walks and Distance
Def. – The distance d(s,t) from a vertex s to a vertex t in a graph G is the length of a shortest s-t walk if one exists; otherwise, d(s,t) = infinity. For digraphs, the directed distance d(s,t) is the length of a shortest directed walk from s to t.
Def. – The eccentricity of a vertex v in a graph G, denoted ecc(v), is the distance from v to a vertex farthest from v.
That is,
$\mathrm{ecc}(v) = \max_{x \in V_G} \{ d(v, x) \}$
The diameter of a graph G, denoted diam(G), is the maximum of the vertex eccentricities in G, or equivalently, the maximum distance between two vertices in G. That is,
$\mathrm{diam}(G) = \max_{x \in V_G} \{ \mathrm{ecc}(x) \} = \max_{x, y \in V_G} \{ d(x, y) \}$
Def. – The radius of a graph G, denoted rad(G), is the minimum of the vertex eccentricities. That is,
$\mathrm{rad}(G) = \min_{x \in V_G} \{ \mathrm{ecc}(x) \}$
Def. – A central vertex v of a graph G is a vertex with minimum eccentricity. Thus, ecc(v) = rad(G).

1.5 – Paths, Cycles, and Trees
Def. – A trail is a walk with no repeated edges.
Def. – A path is a trail with no repeated vertices (except possibly the initial and final vertices).
Def. – A nontrivial closed path is called a cycle.
Def. – An acyclic graph is a graph that has no cycles.
Def. – A cycle that includes every vertex of a graph is called a Hamiltonian cycle.
Def. – A Hamiltonian graph is a graph that has a Hamiltonian cycle.
Def. – An Eulerian trail in a graph is a trail that contains every edge of that graph.
Def. – An Eulerian tour is a closed Eulerian trail.
Def. – An Eulerian graph is a graph that has an Eulerian tour.
Def. – The girth of a graph G with at least one cycle is the length of a shortest cycle in G. The girth of an acyclic graph is undefined.
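
The distance-based definitions above (eccentricity, diameter, radius, central vertex) and the girth can be checked on a small example. The following is a minimal sketch, not part of the original slides. It assumes a simple undirected graph stored as an adjacency dictionary and uses breadth-first search for distances. All function names (build_adjacency, bfs_distances, and so on) are hypothetical helpers introduced here for illustration. The test graph is graph A from Ex. 1.1.1.

```python
from collections import deque
from math import inf


def build_adjacency(vertices, edges):
    """Adjacency sets for a simple undirected graph (assumed: no self-loops, no multi-edges)."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    return adj


def bfs_distances(adj, source):
    """d(source, x) for every vertex x; unreachable vertices keep distance infinity."""
    dist = {v: inf for v in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == inf:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def eccentricity(adj, v):
    """ecc(v): the distance from v to a vertex farthest from v."""
    return max(bfs_distances(adj, v).values())


def diameter(adj):
    """diam(G): the maximum vertex eccentricity."""
    return max(eccentricity(adj, v) for v in adj)


def radius(adj):
    """rad(G): the minimum vertex eccentricity."""
    return min(eccentricity(adj, v) for v in adj)


def central_vertices(adj):
    """All vertices v with ecc(v) = rad(G)."""
    r = radius(adj)
    return {v for v in adj if eccentricity(adj, v) == r}


def girth(adj):
    """Length of a shortest cycle, or None for an acyclic graph (girth undefined)."""
    best = inf
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # Non-tree edge uw closes a cycle through the BFS tree.
                    best = min(best, dist[u] + dist[w] + 1)
    return None if best == inf else best


# Graph A from Ex. 1.1.1: V_A = {p, q, r, s}, E_A = {pq, pr, ps, rs, qs}
graph_a = build_adjacency("pqrs", ["pq", "pr", "ps", "rs", "qs"])
print(diameter(graph_a))                   # 2
print(radius(graph_a))                     # 1
print(sorted(central_vertices(graph_a)))   # ['p', 's']
print(girth(graph_a))                      # 3
```

In graph A, vertices p and s are adjacent to every other vertex, so each has eccentricity 1 and both are central. The only non-adjacent pair is q and r, at distance 2, which gives the diameter. The triangle p, r, s is a shortest cycle, so the girth is 3.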