Docsity
Molecular Evolution: Reconstructing Evolutionary Trees from Distance Matrices, Slides of Discrete Mathematics

An overview of molecular evolution, focusing on the reconstruction of evolutionary trees from distance matrices. Topics covered include algorithms for tree reconstruction, such as neighbor joining and UPGMA, as well as the concept of additive distance matrices. The document also discusses the role of these methods in understanding human evolution, including the Out of Africa hypothesis.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

asmita

Molecular Evolution

Outline
• Evolutionary Tree Reconstruction
• “Out of Africa” hypothesis
• Did we evolve from Neanderthals?
• Distance Based Phylogeny
• Neighbor Joining Algorithm
• Additive Phylogeny
• Least Squares Distance Phylogeny
• UPGMA
• Character Based Phylogeny
• Small Parsimony Problem
• Fitch and Sankoff Algorithms
• Large Parsimony Problem
• Evolution of Wings
• HIV Evolution
• Evolution of Human Repeats

Evolutionary Tree of Bears and Raccoons
[Figure: evolutionary tree; vertical axis: millions of years ago]

Evolutionary Trees: DNA-based Approach
• 40 years ago: Emile Zuckerkandl and Linus Pauling brought reconstructing evolutionary relationships with DNA into the spotlight
• In the first few years after Zuckerkandl and Pauling proposed using DNA for evolutionary studies, the possibility of reconstructing evolutionary trees by DNA analysis was hotly debated
• Now it is a dominant approach to studying evolution.

Out of Africa Hypothesis
• Around the time the giant panda riddle was solved, a DNA-based reconstruction of the human evolutionary tree led to the Out of Africa Hypothesis, which claims that our most ancient ancestor lived in Africa roughly 200,000 years ago

mtDNA analysis supports “Out of Africa” Hypothesis
• African origin of humans inferred from:
  • African population was the most diverse (sub-populations had more time to diverge)
  • The evolutionary tree separated one group of Africans from a group containing all five populations.
  • Tree was rooted on branch between groups of greatest difference.

Evolutionary Tree of Humans (mtDNA)
The evolutionary tree separates one group of Africans from a group containing all five populations.
Vigilant, Stoneking, Harpending, Hawkes, and Wilson (1991)

Evolutionary Tree of Humans (microsatellites)
• Neighbor joining tree for 14 human populations genotyped with 30 microsatellite loci.

Rooted and Unrooted Trees
In an unrooted tree the position of the root (“oldest ancestor”) is unknown. Otherwise, unrooted trees are like rooted trees.

Distances in Trees
• Edges may have weights reflecting:
  • Number of mutations on evolutionary path from one species to another
  • Time estimate for evolution of one species into another
• In a tree T, we often compute d_ij(T), the length of the path between leaves i and j
• d_ij(T): tree distance between i and j

Distance in Trees: an Example
d_1,4 = 12 + 13 + 14 + 17 + 12 = 68

Fitting Distance Matrix
• Given n species, we can compute the n x n distance matrix D_ij
• Evolution of these genes is described by a tree that we don’t know.
• We need an algorithm to construct a tree that best fits the distance matrix D_ij

Fitting Distance Matrix (cont'd)
• Fitting means D_ij = d_ij(T)
  • D_ij: edit distance between species (known)
  • d_ij(T): length of the path in an (unknown) tree T

Reconstructing a 3 Leaved Tree
• Tree reconstruction for any 3x3 matrix is straightforward
• We have 3 leaves i, j, k and a center vertex c
Observe:
d_ic + d_jc = D_ij
d_ic + d_kc = D_ik
d_jc + d_kc = D_jk

Additive Distance Matrices
Matrix D is ADDITIVE if there exists a tree T with d_ij(T) = D_ij, and NON-ADDITIVE otherwise.

Distance Based Phylogeny Problem
• Goal: Reconstruct an evolutionary tree from a distance matrix
• Input: n x n distance matrix D_ij
• Output: weighted tree T with n leaves fitting D
• If D is additive, this problem has a solution and there is a simple algorithm to solve it

Using Neighboring Leaves to Construct the Tree
• Find neighboring leaves i and j with parent k
• Remove the rows and columns of i and j
• Add a new row and column corresponding to k, where the distance from k to any other leaf m can be computed as:
D_km = (D_im + D_jm - D_ij)/2
Compress i and j into k, then iterate the algorithm for the rest of the tree.

Finding Neighboring Leaves
• Closest leaves aren’t necessarily neighbors
• i and j are neighbors, but (d_ij = 13) > (d_jk = 12)
• Finding a pair of neighboring leaves is a nontrivial problem!

Neighbor Joining Algorithm
• In 1987 Naruya Saitou and Masatoshi Nei developed a neighbor joining algorithm for phylogenetic tree reconstruction
• Finds a pair of leaves that are close to each other but far from other leaves: implicitly finds a pair of neighboring leaves
• Advantages: works well for additive and other non-additive matrices, and it does not rely on the flawed molecular clock assumption

Degenerate Triples
• A degenerate triple is a set of three distinct elements 1 ≤ i, j, k ≤ n where D_ij + D_jk = D_ik
• Element j in a degenerate triple i, j, k lies on the evolutionary path from i to k (or is attached to this path by an edge of length 0).

Finding Degenerate Triples
• If there is no degenerate triple, all hanging edges are reduced by the same amount δ, so that all pairwise distances in the matrix are reduced by 2δ.
• Eventually this process collapses one of the leaves (when δ = length of shortest hanging edge), forming a degenerate triple i, j, k and reducing the size of the distance matrix D.
• The attachment point for j can be recovered in the reverse transformations by saving D_ij for each collapsed leaf.

Reconstructing Trees for Additive Distance Matrices
[Figure]

Character-Based Tree Reconstruction
• Better technique:
  • Character-based reconstruction algorithms use the n x m alignment matrix (n = # species, m = # characters) directly instead of using a distance matrix.
• GOAL: determine what character strings at internal nodes would best explain the character strings for the n observed species

Parsimony and Tree Reconstruction
[Figure: two labeled trees over the sequences ACCA, ACCG, ATCC, ATCG; the less parsimonious tree has score 6, the more parsimonious tree has score 5]

Character-Based Tree Reconstruction (cont'd)
[Figure 10.16: (a) Parsimony Score = 3, (b) Parsimony Score = 2]
If we label a tree's leaves with characters (in this case, eyebrows and mouth, each with two states), and choose labels for each internal vertex, we implicitly create a parsimony score for the tree. By changing the labels in (a) we are able to create a tree with a better parsimony score in (b).

Small Parsimony Problem
• Input: Tree T with each leaf labeled by an m-character string.
• Output: Labeling of internal vertices of the tree T minimizing the parsimony score.
• We can assume that every leaf is labeled by a single character, because the characters in the string are independent.

HIV Transmission
• Took multiple samples from the patient, the woman, and controls (non-related HIV+ people)
• In every reconstruction, the woman’s sequences were found to be evolved from the patient’s sequences, indicating a close relationship between the two
• Nesting of the victim’s sequences within the patient sequence indicated the direction of transmission was from patient to victim
• This was the first time phylogenetic analysis was used in a court case as evidence (Metzker et al., 2002)

Evolutionary Tree Leads to Conviction
[Figure: phylogenetic tree with sequence labels including Patient, V1.BCM.RT, V2.BCM.RT, V1.MIC.RT, V2.MIC.RT]

Minimum Spanning Trees
• The first algorithm for finding a MST was developed in 1926 by Otakar Borůvka. Its purpose was to minimize the cost of electrical coverage in Moravia.
• The Problem
  • Connect all of the cities but use the least amount of electrical wire possible. This reduces the cost.
• We will see how building a MST can be used to study evolution of Alu repeats

Prim’s Algorithm Example
[Figure: worked example on a weighted graph]

Why Does Prim’s Algorithm Construct a Minimum Spanning Tree?
• Proof:
• This proof applies to a graph with distinct edge costs
• Let e be any edge that Prim’s algorithm chose to connect two sets of nodes. Suppose that Prim’s algorithm is flawed and it is cheaper to connect the two sets of nodes via some other edge f
• Notice that since Prim’s algorithm selected edge e, we know that cost(e) < cost(f)
• By connecting the two sets via edge f instead, the cost of connecting the two sets goes up by exactly cost(f) - cost(e)
• This contradicts the assumption that e does not belong in the MST: any spanning tree that joins the two sets through f rather than e costs more, so the MST can’t be formed without using edge e

Minimum Spanning Tree As An Evolutionary Tree
[Figure: minimum spanning tree of Alu subfamilies; legend: 1: AluJo, 2: AluSx, 3: AluSq, 4: AluSp, 5: AluY, 6: AluYa5]
The evolutionary tree of the 31 Repbase Update subfamilies, defined as their Minimum Spanning Tree (Kruskal 1956). 14 leaves in this tree = at least 14 Alu source elements.
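
The three equations on the "Reconstructing a 3 Leaved Tree" slide can be solved directly for the three edge lengths. The following is a minimal sketch of that arithmetic; the function name `three_leaf_edges` and its argument order are illustrative assumptions, not something the slides define.

```python
def three_leaf_edges(D_ij, D_ik, D_jk):
    """Solve d_ic + d_jc = D_ij, d_ic + d_kc = D_ik, d_jc + d_kc = D_jk.

    Returns the lengths of the edges from leaves i, j, k to the center c.
    (Hypothetical helper; the name and signature are assumptions.)
    """
    # Adding the first two equations and subtracting the third isolates d_ic.
    d_ic = (D_ij + D_ik - D_jk) / 2
    d_jc = (D_ij + D_jk - D_ik) / 2
    d_kc = (D_ik + D_jk - D_ij) / 2
    return d_ic, d_jc, d_kc


if __name__ == "__main__":
    # Made-up distances for illustration: D_ij = 5, D_ik = 6, D_jk = 7.
    print(three_leaf_edges(5, 6, 7))
```

A negative result for any edge would indicate that the three input distances violate the triangle inequality, so no such tree exists.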
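
The "Using Neighboring Leaves to Construct the Tree" slide describes one compression step: drop the rows and columns of neighbors i and j, then add a row and column for their parent k using D_km = (D_im + D_jm - D_ij)/2. A possible sketch of that single step is shown below. The list-of-lists matrix layout, the `labels` list, and the name `merge_neighbors` are assumptions made for illustration, and the sketch assumes the caller already knows that i and j are neighbors.

```python
def merge_neighbors(D, labels, i, j, new_label):
    """Compress neighboring leaves i and j into their parent k.

    D is a symmetric distance matrix (list of lists), labels names its rows.
    Returns a new matrix and label list with k appended as the last row.
    (Hypothetical helper; layout and name are assumptions.)
    """
    keep = [m for m in range(len(D)) if m not in (i, j)]
    # Copy the sub-matrix for the leaves that remain.
    new_D = [[D[a][b] for b in keep] for a in keep]
    # Distance from the parent k to every remaining leaf m.
    row_k = [(D[i][m] + D[j][m] - D[i][j]) / 2 for m in keep]
    for row, d in zip(new_D, row_k):
        row.append(d)
    new_D.append(row_k + [0.0])
    new_labels = [labels[m] for m in keep] + [new_label]
    return new_D, new_labels


if __name__ == "__main__":
    # Made-up additive matrix: A and B hang off one internal vertex
    # (edges 2 and 3), C and D off another (edges 5 and 6), joined by an
    # internal edge of length 4.
    D = [
        [0, 5, 11, 12],
        [5, 0, 12, 13],
        [11, 12, 0, 11],
        [12, 13, 11, 0],
    ]
    print(merge_neighbors(D, ["A", "B", "C", "D"], 0, 1, "AB"))
```

In this example the parent of A and B ends up at distance 9 from C and 10 from D, which matches the internal edge (4) plus the hanging edges (5 and 6).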
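
The "Degenerate Triples" slide defines a degenerate triple by the condition D_ij + D_jk = D_ik. A brute-force search for such a triple might look like the sketch below. The name `find_degenerate_triple` is an assumption, and the exact equality test assumes integer (or otherwise exactly representable) distances; with floating-point data a tolerance would be needed.

```python
from itertools import permutations


def find_degenerate_triple(D):
    """Return the first (i, j, k) of distinct indices with
    D[i][j] + D[j][k] == D[i][k], or None if no such triple exists.

    Checks every ordered triple, so it runs in O(n^3) time.
    (Hypothetical helper; the name is an assumption.)
    """
    n = len(D)
    for i, j, k in permutations(range(n), 3):
        if D[i][j] + D[j][k] == D[i][k]:
            return i, j, k
    return None


if __name__ == "__main__":
    # Made-up example: leaf 1 lies on the path between leaves 0 and 2.
    on_path = [[0, 3, 7], [3, 0, 4], [7, 4, 0]]
    print(find_degenerate_triple(on_path))
    # Made-up example with no degenerate triple.
    star = [[0, 5, 6], [5, 0, 7], [6, 7, 0]]
    print(find_degenerate_triple(star))
```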
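
The outline lists Fitch's algorithm for the Small Parsimony Problem, but the preview does not show its steps. The sketch below is the standard bottom-up Fitch pass for a single character on a rooted binary tree, not a reproduction of the slides' own presentation. The dictionary-based tree encoding and the name `fitch_score` are assumptions. Because the characters in a string are independent, the score for m-character strings would be the sum of the per-position scores.

```python
def fitch_score(tree, leaf_state, root):
    """Parsimony score of one character on a rooted binary tree.

    tree maps each internal node to its (left, right) children;
    leaf_state maps each leaf to its observed character.
    (Hypothetical encoding; names are assumptions.)
    """
    score = 0

    def visit(node):
        nonlocal score
        if node in leaf_state:
            return {leaf_state[node]}
        left, right = tree[node]
        a, b = visit(left), visit(right)
        common = a & b
        if common:
            # Children agree on at least one state: no mutation needed here.
            return common
        # Children disagree: count one mutation and keep all candidates.
        score += 1
        return a | b

    visit(root)
    return score


if __name__ == "__main__":
    # Made-up tree: root -> (x, y), x -> (L1, L2), y -> (L3, L4).
    tree = {"root": ("x", "y"), "x": ("L1", "L2"), "y": ("L3", "L4")}
    states = {"L1": "A", "L2": "C", "L3": "A", "L4": "A"}
    print(fitch_score(tree, states, "root"))
```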
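
The figure for the "Prim’s Algorithm Example" slide did not survive extraction, so here is a small, self-contained sketch of Prim's algorithm using a binary heap. The adjacency-dictionary graph format, the name `prim_mst`, and the example edge costs are assumptions chosen for illustration; they are not taken from the slide.

```python
import heapq


def prim_mst(graph, start):
    """Grow a minimum spanning tree from start.

    graph maps each node to a dict {neighbor: cost} and must be undirected
    (each edge listed in both directions). Returns a list of (u, v, cost).
    (Hypothetical helper; format and name are assumptions.)
    """
    visited = {start}
    # Candidate edges leaving the current tree, cheapest first.
    frontier = [(cost, start, v) for v, cost in graph[start].items()]
    heapq.heapify(frontier)
    tree = []
    while frontier and len(visited) < len(graph):
        cost, u, v = heapq.heappop(frontier)
        if v in visited:
            continue  # Edge would close a cycle inside the tree.
        visited.add(v)
        tree.append((u, v, cost))
        for w, c in graph[v].items():
            if w not in visited:
                heapq.heappush(frontier, (c, v, w))
    return tree


if __name__ == "__main__":
    # Made-up graph with distinct edge costs.
    graph = {
        "a": {"b": 4, "c": 6},
        "b": {"a": 4, "c": 5, "d": 7},
        "c": {"a": 6, "b": 5, "d": 3},
        "d": {"b": 7, "c": 3},
    }
    mst = prim_mst(graph, "a")
    print(mst, sum(cost for _, _, cost in mst))
```

At every step the heap yields the cheapest edge connecting the visited set to the rest of the graph, which is the edge e that the proof on the following slide reasons about.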
Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved