Phylogenetic Trees in Bioinformatics: Building and Analyzing Evolutionary Relationships

University of Maryland Computer Science

Prof. Mihai Pop

These notes cover phylogenetic trees in the context of bioinformatics, focusing on methods for determining the evolutionary relationships between organisms based on their features. Topics include inferring ancestral features on a given tree, minimizing state changes with Sankoff's algorithm, and clustering sequences using UPGMA and neighbor joining (NJ). The notes also touch on maximum likelihood methods and on tree analysis and display.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

koofers-user-1cz 🇺🇸


CMSC423: Bioinformatic Algorithms, Databases and Tools
Lecture 14: Phylogenetic trees
CMSC423 Fall 2008

Phylogeny questions
• Given several organisms and a set of features (usually sequence, but also morphological: wing shape, color, ...)
• A. Given a phylogenetic tree, figure out what the ancestors looked like (what are the features of the internal nodes)
• B. Find the phylogenetic tree that best describes the common evolutionary heritage of the organisms

[Figure: tree diagrams; labels include wings, feathers, teeth, claws, no wings, fur, an unknown ancestor "?", and leaves A, B, C]

Example

[Figure: tree with binary (0/1) character states at its nodes]

Sankoff's algorithm
• At each node v in the tree, store s(v,t): the best parsimony score for the subtree rooted at v if the character stored at v is t
• Traverse the tree in post-order and update s(v,t) as follows
  – assume node v has children u and w
  – $s(v,t) = \min_i \{ s(u,i) + \mathrm{score}(i,t) \} + \min_j \{ s(w,j) + \mathrm{score}(j,t) \}$
• The character at the root will be the one that minimizes s(root,t), since a lower parsimony score means fewer (or cheaper) state changes
• Note: this solves the weighted version. For the unweighted version, set score(i,i) = 0 and score(i,j) = 1 for any i ≠ j

Trees as clustering
• Start with a distance matrix: the distance (e.g. alignment distance) between any two sequences (leaves)
• Intuitively, we want to cluster together the most similar sequences
• UPGMA: Unweighted Pair Group Method using Arithmetic averages
  – Build the pairwise distance matrix (e.g. from a multiple alignment)
  – Pick the pair of sequences that are closest to each other and cluster them: create an internal node that has the sequences as children
  – Repeat, including newly created internal nodes in the distance matrix
  – Key element: must be able to quickly compute the distance between clusters (internal nodes), a weighted distance that averages over all pairs of cluster members:

$$D(cl_1, cl_2) = \frac{1}{|cl_1|\,|cl_2|} \sum_{p \in cl_1,\; q \in cl_2} D(p, q)$$

Trees as clustering (continued)
• Note that both UPGMA and NJ assume the distance matrix is additive, i.e. distances add up along the tree: D(i,j) + D(j,k) = D(i,k) when j lies on the path between i and k. This is usually not exactly true, but close
• Also, NJ can be proven to build the optimal tree when the distances are additive
• But simple alignment distance is not a good metric

Maximum likelihood
• For every branch S->T of length t, compute P(T|S,t): the likelihood that sequence S could have evolved in time t into sequence T
• Find the tree that maximizes the likelihood
• Note that the likelihood of a tree can be computed with an algorithm similar to Sankoff's
• However, there is no simple way to find a tree given the sequences; most approaches use heuristic search techniques
• Often, start with the NJ tree, then "tweak" it to improve the likelihood

Tree analysis & display

Drawing trees
• Trees are easy to draw: just need to figure out how much space the leaves will take
• Step 1: calculate how much space each node will take (how many leaves there are below the current node)
• Step 2: spread out the nodes according to their number of leaves
• Many ways of optimizing: e.g. width, area
• For large trees
  – 3D displays (there's more room in 3D)
  – interactive displays (expand and contract nodes as needed)

Analysis example
• Build multiple alignment (e.g. Muscle, ClustalW)
• Clean up alignment
  – manual editing
  – filters (pre-defined structure information)
• Build tree
  – PAUP: parsimony & others
  – Phylip: maximum likelihood
  – Tree-Puzzle: maximum likelihood
  – etc. (many packages)
• Integrated system
  – ARB: www.arb-home.de

Antibiotic resistance in Staphylococcus aureus
[Figure: phylogenetic tree of S. aureus strains. Green boxes: individual strains in the tree. Red diamonds, yellow triangle: acquisition of resistance. Hexagon: loss of resistance]
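
Returning to Sankoff's algorithm: the recurrence from the lecture can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code. The tree encoding (nested 2-tuples with string leaf names), the function names `sankoff` and `unit_score`, and the leaf initialization (score 0 for the observed state, infinity otherwise) are assumptions made for the example.

```python
from math import inf

def sankoff(tree, leaf_states, states, score):
    """Return (best parsimony score, best root state) for one character.

    tree        -- a leaf name (str) or a pair (left_subtree, right_subtree)
    leaf_states -- dict mapping leaf name to its observed state
    states      -- list of all possible character states
    score       -- score(i, t): cost of a child in state i under a parent in state t
    """
    def s(v):
        # Leaf (assumed initialization): only the observed state is allowed.
        if isinstance(v, str):
            return {t: (0 if t == leaf_states[v] else inf) for t in states}
        # Internal node: post-order, children first, then the recurrence.
        u, w = v
        su, sw = s(u), s(w)
        return {
            t: min(su[i] + score(i, t) for i in states)
               + min(sw[j] + score(j, t) for j in states)
            for t in states
        }

    root = s(tree)
    best = min(states, key=lambda t: root[t])  # the root state MINIMIZES s(root, t)
    return root[best], best

def unit_score(i, j):
    """Unweighted parsimony: 0 if the state is unchanged, 1 otherwise."""
    return 0 if i == j else 1

# Hypothetical example: four leaves, one binary character.
tree = (("A", "B"), ("C", "D"))
leaf_states = {"A": 0, "B": 0, "C": 0, "D": 1}
best_score, root_state = sankoff(tree, leaf_states, [0, 1], unit_score)
print(best_score, root_state)
```

Passing a different `score` function gives the weighted version mentioned in the notes; only the cost table changes, not the traversal.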
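
The UPGMA loop described under "Trees as clustering" can be sketched as follows. This is a deliberately naive version that recomputes the cluster distance from the original matrix using the averaging formula in the notes, rather than updating the matrix incrementally. The names `cluster_distance` and `upgma`, the tuple encoding of the result, and the convention of placing each internal node at half the merge distance are assumptions for illustration; the distance matrix is made up.

```python
from itertools import combinations

def cluster_distance(D, c1, c2):
    """Average of D[p][q] over all p in c1 and q in c2."""
    return sum(D[p][q] for p in c1 for q in c2) / (len(c1) * len(c2))

def upgma(D, names):
    """Build a tree as nested tuples (left, right, height) from distance matrix D."""
    # Each cluster (a frozenset of leaf indices) maps to the subtree built so far.
    clusters = {frozenset([i]): names[i] for i in range(len(names))}
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(D, *pair))
        # Assumed convention: the new internal node sits at half the merge distance.
        height = cluster_distance(D, c1, c2) / 2
        subtree = (clusters.pop(c1), clusters.pop(c2), height)
        clusters[c1 | c2] = subtree  # the merged cluster joins the next round
    return next(iter(clusters.values()))

# Hypothetical distances between four sequences A, B, C, D.
names = ["A", "B", "C", "D"]
D = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
upgma_tree = upgma(D, names)
print(upgma_tree)
```

Practical implementations keep an updated cluster-level distance matrix so that each merge is cheap, which is the "quickly compute distance between clusters" point in the notes; the sketch trades that speed for brevity.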
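
The two drawing steps from the "Drawing trees" slide (count the leaves under each node, then spread the nodes according to those counts) can be sketched like this. It is one possible layout under simple assumptions: each leaf gets one row, each node is centered in the band of rows covered by its leaves, and the horizontal position is just the depth in the tree. The names `count_leaves` and `layout` and the nested-tuple tree encoding are hypothetical.

```python
def count_leaves(node):
    """Step 1: how much vertical space a node needs (number of leaves below it)."""
    if isinstance(node, str):
        return 1
    return sum(count_leaves(child) for child in node)

def layout(node, top=0, depth=0, out=None):
    """Step 2: assign (label, depth, row) to every node.

    A node with k leaves owns rows top .. top + k - 1 and is drawn in the
    middle of that band; its children split the band in order.
    """
    if out is None:
        out = []
    width = count_leaves(node)
    row = top + (width - 1) / 2
    label = node if isinstance(node, str) else "*"  # "*" marks an internal node
    out.append((label, depth, row))
    if not isinstance(node, str):
        offset = top
        for child in node:
            layout(child, offset, depth + 1, out)
            offset += count_leaves(child)
    return out

# Hypothetical three-leaf tree.
drawing_tree = (("A", "B"), "C")
positions = layout(drawing_tree)
for label, depth, row in positions:
    print(label, depth, row)
```

The sketch recounts leaves at every level, which is fine for small trees; caching the counts from step 1 would avoid the repeated work on large ones.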
