Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Phylogenetics: Distance-Based Methods for Tree Building in Molecular Biology - Prof. Drena, Lab Reports of Bioinformatics

An overview of distance-based methods for constructing phylogenetic trees in molecular biology. The basics of phylogenetics, the procedure for building a tree, the choice of molecular markers, multiple sequence alignment, and automatic editing of alignments. Distance matrices, upgma, and neighbor joining methods are discussed in detail.

Typology: Lab Reports

Pre 2010

Uploaded on 09/02/2009

koofers-user-n9r-1
koofers-user-n9r-1 🇺🇸

10 documents

1 / 9

Toggle sidebar

Related documents


Partial preview of the text

Download Phylogenetics: Distance-Based Methods for Tree Building in Molecular Biology - Prof. Drena and more Lab Reports Bioinformatics in PDF only on Docsity! #30 - Phylogenetics Distance-Based Methods 11/02/07 BCB 444/544 Fall 07 Dobbs 1 1BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 BCB 444/544 Lecture 30 Phylogenetics – Distance-Based Methods #30_Nov02 2BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Wed Oct 30 - Lecture 29 Phylogenetics Basics • Chp 10 - pp 127 - 141 Thurs Oct 31 - Lab 9 Gene & Regulatory Element Prediction Fri Oct 30 - Lecture 30 Phylogenetic – Distance-Based Methods • Chp 11 - pp 142 – 169 Mon Nov 5 - Lecture 31 Phylogenetics – Parsimony and ML • Chp 11 - pp 142 - 169 Required Reading (before lecture) 3BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Assignments & Announcements Mon Oct 29 - HW#5 HW#5 = Hands-on exercises with phylogenetics and tree-building software Due: Mon Nov 5 (not Fri Nov 1 as previously posted) 4BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 BCB 544 "Team" Projects Last week of classes will be devoted to Projects • Written reports due: • Mon Dec 3 (no class that day) • Oral presentations (20-30') will be: • Wed-Fri Dec 5,6,7 • 1 or 2 teams will present during each class period See Guidelines for Projects posted online 5BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 BCB 544 Only: New Homework Assignment 544 Extra#2 Due: √PART 1 - ASAP PART 2 - meeting prior to 5 PM Fri Nov 2 Part 1 - Brief outline of Project, email to Drena & Michael after response/approval, then: Part 2 - More detailed outline of project Read a few papers and summarize status of problem Schedule meeting with Drena & Michael to discuss ideas 6BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Seminars this Week BCB List of URLs for Seminars related to Bioinformatics: http://www.bcb.iastate.edu/seminars/index.html • Nov 2 Fri - BCB Faculty Seminar 2:10 in 102 ScI • Bob Jernigan BBMB, ISU • Control of Protein Motions by Structure #30 - Phylogenetics Distance-Based Methods 11/02/07 BCB 444/544 Fall 07 Dobbs 2 7BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Chp 10 - Phylogenetics SECTION IV MOLECULAR PHYLOGENETICS Xiong: Chp 10 Phylogenetics Basics • Evolution and Phylogenetics • Terminology • Gene Phylogeny vs. Species Phylogeny • Forms of Tree Representation • Why Finding a True Tree is Dificult • Procedure of Building a Phylogenetic Tree 8BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Tree Building Procedure • Choose molecular markers • Perform MSA • Choose a model of evolution • Determine tree building method • Assess tree reliability 9BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Choice of Molecular Markers • Very closely related organisms - nucleic acid sequence will show more differences • For individuals within a species - faster mutation rate is in noncoding regions of mtDNA • More distantly related species - slowly evolving nucleic acid sequences like ribosomal RNA or protein sequences • Very distantly related species - use highly conserved protein sequences 10BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Multiple Sequence Alignment • Most critical step in tree building - cannot build correct tree without correct alignment • Should build alignments with multiple programs, then inspect and compare to identify the most reasonable one • Most alignments need manual editing • Make sure important functional residues align • Align secondary structure elements • Use full alignment or just parts 11BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Automatic Editing of Alignments • Rascal and NorMD – correct alignment errors, remove potentially unrelated or highly divergent sequences • Gblocks – detect and eliminate poorly aligned positions and divergent regions 12BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 How do we measure divergence between sequences? • Simple measure – just count the number of substitutions observed between the sequences in the MSA • Problem – number of substitutions may not represent the number of evolutionary events that actually occurred #30 - Phylogenetics Distance-Based Methods 11/02/07 BCB 444/544 Fall 07 Dobbs 5 25BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Clustering-Based Methods • E.g., UPGMA and Neighbor-Joining • A cluster is a set of taxa • Interspecies distances translate into intercluster distances • Clusters are repeatedly merged • “Closest” clusters merged first • Distances are recomputed after merging 26BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 UPGMA • UPGMA – Unweighted Pair Group Method Using Arithmetic Average • Uses molecular clock assumption – all taxa evolve at a constant rate and are equally distant from the root (ultrametric tree) • This assumption is usually wrong • So why use UPGMA? • Very fast 27BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 UPGMA Example 28BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 UPGMA Example 29BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 UPGMA Example 30BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 UPGMA Example #30 - Phylogenetics Distance-Based Methods 11/02/07 BCB 444/544 Fall 07 Dobbs 6 31BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining • Idea: Find a pair of taxa that are close to each other but far from other taxa • Implicitly finds a pair of neighboring taxa • No molecular clock assumption 32BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining • NJ corrects for unequal evolutionary rates between sequences by using a conversion step • The conversion step requires calculation of “r-values” and “transformed r-values” 33BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining ∑= iji dr The r-value for a sequence is: The sum of the distances between sequence i and all other sequences 34BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining 2 ' − = n rr ii The transformed r-value for a sequence is: Where n is the number of taxa Transformed r-values are used to determine the distance of a taxon to the nearest node 35BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining ( )jiijij rrdd +−= 2 1' The converted distance between two sequences is: These converted distances are used in building the tree 36BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining ( )[ ] 2 '' jiij iu rrd d −+ = The final equation we need is for computing the distance from a new cluster to each taxa. Assume taxa i and j were merged into a cluster u. The distance from taxa i to cluster u is: #30 - Phylogenetics Distance-Based Methods 11/02/07 BCB 444/544 Fall 07 Dobbs 7 37BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining Example 0.550.700.60D 0.450.35C 0.40B CBA 38BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining Example • Initialize tree into a star shape with all taxa connected to the center • Step 1: Compute r-values and transformed r-values for all taxa 675.0 2 35.1 24 ' 35.16.035.04.0 == − = =++=++= A A ADACABA rr dddr 39BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining Example • Step 2: Compute converted distances ( ) ( ) 05.1 55.135.1 2 14.0 2 1' −= +−= +−= BAABAB rrdd 40BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining Example -1.05-1-1D -1-1C -1.05B CBA • Step 3: Fill out converted distance matrix 41BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining Example • Step 4: Create a node by merging closest taxa • In this example, the distance between A and B is the same as the distance between C and D • We can pick either pair to start with • Let’s pick A and B and create a node called U U ? ? B A DC BA 42BCB 444/544 F07 ISU Terribilini #30- Phylogenetics - Distance-Based Methods 11/02/07 Neighbor Joining Example • Step 5: Compute branch lengths • Use the equation for computing the distance from a taxa to a node ( )[ ] ( )[ ] 15.0 2 775.0675.04.0 2 '' = −+ = −+ = BAABAU rrdd U 0.15 0.25 B A
Docsity logo



Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved