Applying Fitch Algorithm to Newick Tree & DNA Sequences, Assignments of Genetics

Instructions for a homework assignment in computational molecular biology, in which students write a program, reusing code from the previous assignment, that reads a Newick-format tree and a PHYLIP-format file of DNA sequences. They then apply the Fitch algorithm to count the number of steps a parsimony method would get for this tree. Students are encouraged to alter the tree and to check whether rerooting it changes the number of steps.

Typology: Assignments

Pre 2010

Uploaded on 03/11/2009

koofers-user-726
Genome 541
Spring, 2003
Computational Molecular Biology
J. Felsenstein

Homework no. 4
Due Friday, May 16

Write a program (most conveniently using code from the previous homework) to read in a Newick-format tree, and a data set of DNA sequences. Then use the Fitch algorithm to count the number of steps a parsimony method would get for this tree. Just for the fun of it, you might want to alter the tree (by hand) and see what that does to the number of steps. Does rerooting the tree change it?

The tree we will use is the one you used for the previous assignment. It has branch lengths, though those will not be made use of here.

The data set is the one that can be fetched from the link in the description of lecture 34 on the course web site. You should be able to save it by using the Save As function in the File menu of your browser. It is a set of sites from the D-loop region of mitochondrial DNA and adjacent noncoding (third codon position) sites.

The data set format is the PHYLIP format. The first line has two integers, the number of species and the number of sites. Each species starts on a new line, with 10 characters of species name (keep in mind that species that have a blank in their name have this represented by an underscore character in the tree file). Then the sequence continues, going to new lines as needed. The sequences are aligned, and all are of the same length. The sequences are A’s, T’s, C’s, and G’s, as well as some “-” (minus sign) characters that mean “gap”. In the Fitch algorithm, “-” can be represented by the set of all four bases, {A,C,G,T}.

Show me (by email) the output of your program. If there are problems I reserve the right to ask for the code, later.
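
The reading and counting steps described above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the course's reference solution. The function names `read_phylip`, `fitch_site` and `fitch_steps` and the tiny four-species data set are invented for the example. The tree is written as nested tuples of species names, standing in for whatever structure the previous homework's Newick parser produces. The sketch also assumes a rooted, strictly bifurcating tree and a file in which each species' sequence is finished before the next species begins.

```python
from io import StringIO

ALL_BASES = frozenset("ACGT")


def read_phylip(handle, name_width=10):
    """Read species names and sequences from a PHYLIP-style file.

    Assumption: each species starts on a new line with a fixed-width name,
    and its sequence continues on following lines until it reaches the
    number of sites declared on the first line.
    """
    n_species, n_sites = (int(tok) for tok in handle.readline().split()[:2])
    sequences = {}
    for _ in range(n_species):
        line = handle.readline().rstrip("\n")
        # Blanks inside a name become underscores, matching the tree file.
        name = line[:name_width].strip().replace(" ", "_")
        seq = "".join(line[name_width:].split())
        while len(seq) < n_sites:
            more = handle.readline()
            if not more:
                raise ValueError(f"sequence for {name} is too short")
            seq += "".join(more.split())
        if len(seq) != n_sites:
            raise ValueError(f"sequence for {name} is too long")
        sequences[name] = seq.upper()
    return sequences


def fitch_site(node, sequences, site):
    """Return (state set, steps) for one site in the subtree at `node`."""
    if isinstance(node, str):  # a tip: look up its base at this site
        base = sequences[node][site]
        # A gap is treated as the set of all four bases.
        return (ALL_BASES if base == "-" else frozenset(base)), 0
    left, right = node  # hypothetical representation: a pair of subtrees
    left_set, left_steps = fitch_site(left, sequences, site)
    right_set, right_steps = fitch_site(right, sequences, site)
    common = left_set & right_set
    if common:
        # Non-empty intersection: keep it, no extra step.
        return common, left_steps + right_steps
    # Empty intersection: take the union and count one step.
    return left_set | right_set, left_steps + right_steps + 1


def fitch_steps(tree, sequences):
    """Sum the Fitch step counts over all sites."""
    n_sites = len(next(iter(sequences.values())))
    return sum(fitch_site(tree, sequences, s)[1] for s in range(n_sites))


# Made-up data: four species, five sites, one name containing a blank,
# and one sequence that continues on a second line.
example = StringIO(
    "4 5\n"
    "Alpha     ACGTA\n"
    "Beta      ACGTC\n"
    "Gamma s   AC\n"
    "-TA\n"
    "Delta     TCGGA\n"
)
seqs = read_phylip(example)
tree = (("Alpha", "Beta"), ("Gamma_s", "Delta"))
print(fitch_steps(tree, seqs))  # prints 3
```

Altering the tree by hand then amounts to editing the nested tuple and calling `fitch_steps` again, which makes it easy to compare step counts across arrangements.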