Multiple Sequence Alignment: Techniques, Tools, and Applications - Prof. Dietlind Gerloff, Study notes of Chemistry

An overview of multiple sequence alignment (MSA), its importance in bioinformatics, methods such as dynamic programming, progressive alignment, and PSI-BLAST, and popular tools and models such as ClustalW, Tcoffee, and HMMs. It also covers conservation patterns and their significance.

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009

koofers-user-wva

Multiple Sequence Alignment
• Multiple sequence alignment (MSA) is probably the single most important bioinformatics tool.
• Many applications require accurate MSAs:
  • PSI-BLAST
  • Family and domain classification
  • Pattern identification
  • Structure prediction
    • Secondary structure
    • Fold recognition
  • Phylogeny
  • Full-genome alignments in browsers

Conservation Patterns
• Cys pairs - disulfide bonds
• His, Ser - catalytic sites
• Cys, His - metal binding sites
• Gly, Pro - ends of 2° structure elements, turns
• Lys, Arg, Asp, Glu - ligand binding
• Lys/Arg-Asp/Glu pairs - salt bridges
• Leu - coiled coils, leucine zippers
• Motifs, secondary structure, indels

PSI-BLAST Alignments
• BLAST is designed for speed: it detects high-scoring local alignments quickly, but it does not necessarily find the optimal global or local alignment.
• Profiles throw away information for regions that are insertions relative to the query.

Methods
• Dynamic programming
  • Gives the optimal solution, but is prohibitively slow for more than a handful of sequences
• Progressive
  • ClustalW
    • http://www.ebi.ac.uk/clustalw/index.html (most commonly used)
  • Tcoffee
    • http://igs-server.cnrs-mrs.fr/Tcoffee/ (a little better, but slower)
• Iterative
  • Better than progressive methods, but slower
• Dialign
• HMMs

ClustalW Guide Tree
• The guide tree shows the distances between sequences obtained from the initial pairwise alignments.
• It gives the order in which sequences are added to the MSA.
• The guide tree is not a phylogenetic tree (it is just a rough estimate of similarity); however, a true phylogenetic tree can be generated after making an alignment.

Progressive Alignment
• Greedy algorithm
  • Breaks the problem up into smaller problems
  • Finds the best solution to each small problem
  • Combines the solutions to get an answer to the whole problem
• Not necessarily the globally optimal answer
  • Doesn't use all information in solving sub-problems
  • Suboptimal answers for small problems may combine to give a better overall answer
• Gaps: once created, they stay part of the alignment for all remaining alignment iterations

ClustalW Alignment

CLUSTAL W (1.82) multiple sequence alignment

sp_P13795       ---MAEDAD------------------------MRNELEEMQRRADQLADESLESTRRML 33
gi_31242623     MPAAAPPAENG-------------------AAVPKTELQELQMKQQQVVDESLDSTRRML 41
gi_3822409      MPTTAEPAQE--------------------NGAPRSELQELQLKAGQVTDETLESTRRML 40
gi_39593308     MSARRGAPGGQRHPRPYAVEPTVDINGLVLPADMSDELKGLNVGIDEKTIESLESTRRML 60
gi_32567202     MSGDDDIPEG---------------------------LEAINLKMNATTDDSLESTRRML 33
                       .                             *: ::      . ::*:******

sp_P13795       QLVEESKDAGIRTLVMLDEQGEQLERIEEGMDQINKDMKEAEKNLTDLGKFCGLCVCPCN 93
gi_31242623     ALCEESTEVGMRTIVMLDEQGEQLDRIEEGMDQINADMREAEKNLSGMEKCCGICVLPCN 101
gi_3822409      ALCEESKEAGIRTLVALDDQGEQLERIEENMDQINADMKEAEKNLTGMEKFCGLCVLPWN 100
gi_39593308     ALCEESKEAGIKTLVMLDDQGEQLERCEGALDTINQDMKEAEDHLKGMEKCCGLCVLPWN 120
gi_32567202     ALCEESKEAGIKTLVMLDDQGEQLERCEGALDTINQDMKEAEDHLKGMEKCCGLCVLPWN 93
                 * ***.:.*::*:* **:*****:* *  :* ** **:***.:*..: * **:** * *

sp_P13795       KLKSSDA---YKKAWGNNQDG-VVASQPARVVDEREQMAISGGFIRRVTNDARENEMDEN 149
gi_31242623     KSASFKE---DDGTWKGNDDGKVVNNQPQRVMDDRNGLGPQAGYIGRITNDAREDEMEEN 158
gi_3822409      KSAPFKE---NEDAWKGNDDGKVVNNQPQRVMDDGSGLGPQGGYIGRITNDAREDEMEEN 157
gi_39593308     KTDDFEKNSEYAKAWKKDDDGGVISDQPRITVGDPT-MGPQGGYITKITNDAREDEMDEN 179
gi_32567202     KTDDFEK-TEFAKAWKKDDDGGVISDQPRITVGDSS-MGPQGGYITKITNDAREDEMDEN 151
                *    .       :*  ::** *: .**  .:.:   :. ..*:* ::******:**:**

sp_P13795       LEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSNKTRIDEANQRATKMLGSG 206
gi_31242623     MGQVNTMIGNLRNMALDMGSELENQNRQIDRINRKGDSNATRIAAANERAHDLLK-- 213
gi_3822409      VGQVNTMIGNLRNMAIDMGSELENQNRQIDRIKNKAEM------------------- 195
gi_39593308     IQQVSTMVGNLRNMAIDMSTEVSNQNRQLDRIHDKAQSNEVRVESANKRAKNLITK- 235
gi_32567202     VQQVSTMVGNLRNMAIDMSTEVSNQNRQLDRIHDKAQSNEVRVESANKRAKNLITK- 207
                : **. ::****:**:**..*:..****:***  *.:

Interleaved Formats
• The most common output formats for MSAs are interleaved:
  • MSF, ASN, BLAST query-anchored formats
• All sequences are stacked up and chopped into blocks of ~60 residues
• Easy for humans to read, but difficult to edit
• Tools for converting formats are available on the web

Aligned FASTA (A2M) Format

>SN29_RAT/142-196
PSSRLKEAINTSKDQESKYQASHPNLRRLHDAE---LDSVPASTV----NTEVY-----P
KNSSL---R-----A
>SN29_HUMAN/142-197
PNNRLKEAISTSKEQEAKYQASHPNLR-------KLDDTDPVPRGA---GSAMSTDA-YP
KNPHL---R-----A
>SN25_TORMA/95-148
PCNK----LKNFEAGGAYKKVWGNNQD------G-VVASQP-ARVMD-DREQMA-----M
SGGYI--RRI-TDDA
>O93578/11-59
PCNK----MKS-----GASKAWGNNQD------G-VVASQP-ARVVD-EREQMA-----I
SGGFI--RRV-TDDA
>SN25_DROME/98-149
PCNK----SQSFK---EDDGTWKGNDD------GKVVNNQP-QRVMD-DRNGM-----MA
QAGYI--GRI-TNDA

• Uppercase and '-' characters are alignment columns. There must be the same number of aligned characters in all sequences.
• Insertions that are not part of the alignment are indicated with lowercase and '.' characters. These are not read as alignment columns (i.e. they're for humans only).
• Benefits
  • Easily machine readable
  • Readable by most programs that read FASTA format
(Note: characters in lowercase, if there were any, would indicate that the alignment is uncertain at these positions)

Graphical - Jalview
• Output formats: PostScript, PDF, HTML (e.g. from Jalview, ClustalX, or others)
• Looks pretty and is very visually informative
• Completely useless for further computational analysis. DO NOT SAVE GRAPHICS AS YOUR ONLY OUTPUT
• Jalview: Java alignment editor (http://www.jalview.org)
  • Available as an online applet or as an application
  • Makes nice pictures and allows interactive editing
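
The aligned FASTA rule described above (uppercase and '-' are alignment columns, lowercase and '.' are skipped, and every sequence must have the same number of columns) can be checked with a short script. The following is only a minimal sketch under those stated rules, not a full A2M parser; the function names and the two toy records are made up for illustration and do not come from the notes.

```python
def read_aligned_fasta(text):
    """Parse aligned FASTA text into {name: string of alignment columns}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            # Keep only alignment columns: uppercase letters and '-'.
            # Lowercase letters and '.' mark insertions and are skipped.
            cols = "".join(c for c in line if c.isupper() or c == "-")
            records[name].append(cols)
    return {n: "".join(parts) for n, parts in records.items()}


def check_columns(alignment):
    """Return the common column count, or raise if sequences disagree."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError(f"expected one column count, found {sorted(lengths)}")
    return lengths.pop()


def conserved_columns(alignment):
    """Return 0-based indices of gap-free columns with one residue type."""
    seqs = list(alignment.values())
    return [
        i
        for i, col in enumerate(zip(*seqs))
        if len(set(col)) == 1 and col[0] != "-"
    ]


# Hypothetical toy input: two records, each wrapped over two lines.
example = """>seq_a
PCNK--LKnfe.
AG
>seq_b
PCNKSQLK....
AG
"""

if __name__ == "__main__":
    aln = read_aligned_fasta(example)
    print(check_columns(aln))
    print(conserved_columns(aln))
```

Fully conserved columns found this way correspond to the '*' positions in a ClustalW consensus line; the ':' and '.' classes would need residue similarity groups, which this sketch leaves out.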