Bioinformatic Algorithms: Dynamic Programming for Inexact String Alignment, Study notes of Computer Science

An introduction to the problem of inexact string alignment using dynamic programming and gapped alignment. It covers the intuition behind the problem, the recurrences for calculating the best score, and the dynamic programming table. It also explains how to output the result and the difference between local and global alignment, and touches on various flavors of alignment and gap penalties.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-0i6


CMSC423: Bioinformatic Algorithms, Databases and Tools (CMSC423 Fall 2008)
Lecture 10: inexact alignment, dynamic programming, gapped alignment

Intuition
• What is the best way to align strings S1 and S2?
• Just look at the last character for now: what is it aligned to?
[Figure: three panels, each labeled S1[n] and S2[m], showing what the last character can be aligned to]
AG-C-GTAG
-GTCAG-A-

How do you output the result?
• Goal: produce the "nice" string with gaps that is shown in the examples
• Idea: create the string backwards, starting from the right
• As you follow backtrack pointers:
  – if you follow a diagonal pointer: add a character to both output strings (the aligned versions of the original strings)
  – if you move up: add the next character of the string on the y axis to its output, and a gap character to the output of the string on the x axis
  – if you move left: add the next character of the string on the x axis to its output, and a gap character to the output of the string on the y axis
• When you reach (0,0), output the two aligned strings

Local vs. global alignment
• Can we change the algorithm to allow S1 to be a substring of S2?
ACAGTTGACCCGTGCAT
----TG-CC-G------
• Key idea: gaps placed against the two ends of S2 are free (the parts of S2 that flank S1 cost nothing)
• Simply change the first row in the DP table to 0s
• The answer is no longer Score[n, m] but the largest value in the last row

Sub-string alignment
[Figure: DP table for sub-string alignment, with a first row of 0s; the remaining cell values were not recoverable. Sequences shown: GATGC, AGCGTAG, CGT]

Various flavors of alignment
• The alignment problem is also called "edit distance": how many changes do you have to make to a string to convert it into another one?
• Edit distance is also called Levenshtein distance
• Local alignment: Smith-Waterman
• Global alignment: Needleman-Wunsch

Gap penalties

How much do we pay for gaps?
• In the edit-distance/alignment framework:
  Cost(n gaps in a row) = n * Cost(gap)
• This doesn't work for, e.g., RNA-DNA alignments, where one long run of gaps is charged as heavily as the same number of scattered gaps:
  ACAGTTCGACTAGAGGACCTAGACCACTCTGT
      TTCGA----------TAGACCAC
• Affine gap penalties:
  Cost(n gaps in a row) = Cost(gap open) + n * Cost(gap)
• The gap opening penalty is high and the gap extension penalty is low (once we start a gap we might as well pile more gaps on top)
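To make the two gap-cost formulas concrete, here is a small sketch that charges an already-aligned string under each scheme. The function names and the numeric penalties (gap = 2, gap open = 10, gap extension = 1) are illustrative assumptions, not values from the lecture; treating a run of zero gaps as free is also an assumption.

```python
import re

def gap_runs(aligned):
    """Return the lengths of the maximal runs of '-' in an aligned string."""
    return [len(run) for run in re.findall(r"-+", aligned)]

def linear_gap_cost(n, gap=2):
    """Cost(n gaps in a row) = n * Cost(gap). The value of gap is assumed."""
    return n * gap

def affine_gap_cost(n, gap_open=10, gap_extend=1):
    """Cost(n gaps in a row) = Cost(gap open) + n * Cost(gap).

    Penalty values are assumed; a run of length 0 is treated as free.
    """
    if n == 0:
        return 0
    return gap_open + n * gap_extend

def total_gap_cost(aligned, cost_fn):
    """Sum the cost of every gap run in an aligned string."""
    return sum(cost_fn(n) for n in gap_runs(aligned))

# One long run of ten gaps, as in the RNA-DNA example.
one_run = "TTCGA" + "-" * 10 + "TAGACCAC"
# Ten isolated gaps scattered through a string (made-up sequence).
scattered = "A-C-G-T-A-C-G-T-A-C-G"

for name, aligned in [("one run", one_run), ("scattered", scattered)]:
    print(name,
          "linear:", total_gap_cost(aligned, linear_gap_cost),
          "affine:", total_gap_cost(aligned, affine_gap_cost))
```

Under the linear scheme both strings cost the same, because only the number of gap characters matters. Under the affine scheme the single long run is much cheaper than the scattered gaps, which is the behavior the high-open, low-extension penalty is meant to produce.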
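The table fill, the backtrack-pointer output procedure, and the sub-string variant can be sketched together as follows. This is a minimal sketch, not the lecture's own code: the recurrence slides are missing from this preview, so the code assumes the standard three-way maximum (diagonal, up, left) with a linear gap penalty, and the scores (match = 1, mismatch = -1, gap = -2) and all names are assumptions. S1 is placed on the y axis (rows) and S2 on the x axis (columns).

```python
def align(s1, s2, match=1, mismatch=-1, gap=-2, substring=False):
    """Align s1 (rows, y axis) with s2 (columns, x axis).

    substring=False: global alignment, answer in score[n][m].
    substring=True: first row is all 0s and the answer is the largest
    value in the last row, so the flanks of s2 are not penalized.
    Returns (best score, aligned s1, aligned s2).
    """
    n, m = len(s1), len(s2)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]

    # First column: s1 characters aligned to gaps.
    for i in range(1, n + 1):
        score[i][0] = i * gap
        back[i][0] = "up"
    # First row: free (0s, no pointer) in sub-string mode, gaps otherwise.
    for j in range(1, m + 1):
        score[0][j] = 0 if substring else j * gap
        back[0][j] = None if substring else "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + sub, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            # Ties are broken in favor of the first candidate listed.
            score[i][j], back[i][j] = max(candidates, key=lambda c: c[0])

    # Choose the cell to start backtracking from.
    i = n
    j = max(range(m + 1), key=lambda c: score[n][c]) if substring else m
    best = score[i][j]

    # Build both aligned strings backwards, then reverse them.
    out1, out2 = [], []
    while back[i][j] is not None:
        move = back[i][j]
        if move == "diag":      # character in both strings
            out1.append(s1[i - 1]); out2.append(s2[j - 1])
            i -= 1; j -= 1
        elif move == "up":      # consume s1 (y axis), gap in s2 (x axis)
            out1.append(s1[i - 1]); out2.append("-")
            i -= 1
        else:                   # "left": consume s2 (x axis), gap in s1
            out1.append("-"); out2.append(s2[j - 1])
            j -= 1
    return best, "".join(reversed(out1)), "".join(reversed(out2))


print(align("AGT", "ACGT"))                      # global alignment
print(align("CGT", "AGCGTAG", substring=True))   # s1 as a substring of s2
```

In sub-string mode the backtrack stops as soon as it reaches the first row rather than at (0,0), since cells in that row carry no pointer; only the aligned region of S2 is returned. Smith-Waterman local alignment additionally clamps every cell at 0 and takes the maximum over the whole table, which this sketch does not do.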