Familiarizing with BLAST: An NCBI Tool for Sequence Comparison

Instructions for using BLAST (Basic Local Alignment Search Tool) from the National Center for Biotechnology Information (NCBI) to explore sequence similarities between nucleotide and protein sequences. The guide covers the different types of BLAST searches, the use of NCBI's databases, and interpreting the results.


Uploaded on 08/31/2009

koofers-user-3te

MCB411 Fall 2006

GOAL: to make you familiar with BLAST.

ANSWER ALL QUESTIONS ON A SEPARATE PIECE OF PAPER OR IN AN EMAIL AND TURN IN BY MONDAY, OCTOBER 23.

This exercise is designed to familiarize you with how one finds out what a DNA sequence means: its similarity to other DNA sequences, and the similarity of its translated amino acid sequence to other proteins. One can also look for defined motifs or domains in the protein sequence.

The main website you will use: BLAST (NCBI) http://www.ncbi.nlm.nih.gov

What does NCBI do?

Established in 1988 as a national resource for molecular biology information as part of the human genome project, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information, all for the better understanding of molecular processes affecting human health and disease.

NCBI has an excellent literature search tool called PubMed. Try searching for Crick F and you will find out what Francis has been up to the last few years, as well as the republication of his paper with Watson. This is the most popular literature search tool for biomedical scientists.

What is BLAST?

BLAST® (Basic Local Alignment Search Tool) is a set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or DNA.

To get to the BLAST interface, select BLAST from the links across the top of the page. Then click Standard nucleotide-nucleotide BLAST [blastn] in the section titled Nucleotide BLAST. Since we are first trying to identify a nucleotide sequence, we will use Nucleotide BLAST (blastn). These searches take a nucleotide query sequence and compare it against other nucleotide sequences.
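Because the choice of search depends on whether the query is protein or DNA, it can help to check a pasted sequence before submitting it. The following is a minimal sketch, not part of NCBI's tools: the function names and the 0.9 cutoff are illustrative assumptions, and the check is a rough heuristic that simply measures what fraction of the letters are nucleotide codes.

```python
import re

# DNA/RNA letters plus N ("any base"); an illustrative alphabet, not NCBI's full IUPAC set.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def clean_sequence(raw: str) -> str:
    """Remove whitespace and position numbers from a pasted query and uppercase it."""
    return re.sub(r"[\s\d]", "", raw).upper()


def guess_query_type(raw: str, threshold: float = 0.9) -> str:
    """Return "nucleotide" or "protein" based on letter composition.

    The 0.9 threshold is an arbitrary choice for illustration only.
    """
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty sequence")
    nucleotide_count = sum(1 for letter in seq if letter in NUCLEOTIDE_LETTERS)
    fraction = nucleotide_count / len(seq)
    return "nucleotide" if fraction >= threshold else "protein"


if __name__ == "__main__":
    print(guess_query_type("1 ggaaaggagg aagaaggaga ttgtgatgga"))
    print(guess_query_type("MTSSYGHVLERQPALGGRLDSPGNLDTLQAKKNFSVSHLLDLEE"))
```

A very short or low-complexity protein made mostly of A, C, G, and T residues would fool this check, so treat the result as a hint rather than a decision.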
There are other types of BLAST searches:

Protein BLAST (blastp) takes a protein query sequence and compares it against other protein sequences.
The blastx program compares a nucleotide query sequence translated in all reading frames against a protein sequence database.
The tblastn program compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
The tblastx program compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.

Please note that the tblastx program cannot be used with the nr database on the BLAST web page.

Now go to the BLAST icon bar at the top of the page and click on it. On the next page, under the Nucleotide BLAST header, click on standard nucleotide-nucleotide BLAST (blastn, the third choice under nucleotide). In the Search box, copy and paste the following sequence:

1 ggaaaggagg aagaaggaga ttgtgatgga gaaagggggt ctgtgaaacg tcaggcacag
61 cacaaaggct ttgccacgta attacaggct cctattaagt cgagatctgc cctcccaggg
121 gtctccaatt ttcttgtatt ccctacaaag cctcctctgc atgccagttt gtgccttttg
181 aagtgccaga gagcttcttg atccaactga gaaggaaaaa ggagcccagc aagaagaggg
241 ggagagagag aaggggaaag gggggaaccc accagcaccc tccgtcggac tcttgaagcc
301 tttttttttt aattcttaat tttttttttt actctttaca aaaagtaaag tgagaatcct
361 gctctctaat acatctgcaa gacatcaccc tctcctcctg aaactttagt cactcctgag
421 aatccacagg agtgcagaga ggggggaaca cgttttcttg aagatgtttt aaagctggaa
481 caagccttct tctgttggtg cttgaactct tgcctgggaa taactttttt aacctttaaa
541 aaaaccattc actttgattc ttctctccca ccccttcttc tctcttcttc tgtttgccta
601 actcccccgc cctgctggcc tccgctttcc tctctccccc ttgttattat ttttagtctg
661 tgcgtgtgga cacttttgga gagttggaag ggattttttt ctcctgactt gaacataggg
721 tgacttttta atattgtatt ttactgtgga ttatctcttt ggaccgcgcc ggacttggcc
781 tcaggaaatc aaccaatgct gcggaaggcg gctggtgcac aacgctctgc tctacagaag
841 ggggtccccc accctctttt ccaatttttt ttttttggcc ttcctctcct tccctccctc
901 ttcctccctc tctctctctc tctctccact acccccctct ttcttcccca ctcggctcct

[...] alignment is indicated by one of five different colors, which divides the range of scores into five groups. Scroll down to the color-coded graph. Red means good matches. How many good matches are there?

Below this box is a list of the matches in order of significance. What organisms are the best 5 matches from? How closely related are these organisms? To find out how closely related they are, click on the distance tree of results. This program creates a phylogenetic tree of the sequences your search pulls up.

The accession link lets you collect any information known about a specific match. Click on the gi/ID number for the first match and you will find information on that gene. (The first entry represents a splicing variant of the protein I started with.) If you were dealing with the sequence of an unknown gene, the best score would tell you which gene in the genome matched and lead you to understanding more about that gene.

Click on the score for the first match. This will take you to the alignment. What percent of the bases are identical?

Now let's work with the protein sequence. Copy this and go back to the BLAST page.

MTSSYGHVLERQPALGGRLDSPGNLDTLQAKKNFSVSHLLDLEE
AGDMVAAQADENVGEAGRSLLESPGLTSGSDTPQQDNDQLNSEEKKKRKQRRNRTTFN
SSQLQALERVFERTHYPDAFVREDLARRVNLTEARVQVWFQNRRAKFRRNERAMLANK
NASLLKSYSGDVTAVEQPIVPRPAPRPTDYLSWGTASPYRSSSLPRCCLHEGLHNGF

Go to protein BLAST and click the blastp button. Paste the protein sequence into the Search box and push the BLAST button. In the next window you get a picture of any conserved domains. What did the program pick up? Click on the Format button and you get a screen similar to the one for the DNA sequence. How many protein sequences are in the database? Why are there fewer amino acids than bases of DNA? Why are there some entries that are identical or almost identical to the human sequence you are working with? Would you expect the DNA sequence to be as similar? Answer as best as you can.
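The "percent identical" figure asked about for the nucleotide alignment can be thought of as a simple count over the aligned columns. The sketch below is an illustration, not BLAST's own code: the function name is hypothetical, the two strings are assumed to be already aligned (equal length, with '-' for gaps), and it assumes gap columns count toward the alignment length but never as matches.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences have the same residue.

    Assumes the inputs are already aligned: equal length, '-' marking gaps.
    Gap columns are included in the denominator and are never counted as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignments made up for illustration; they are not BLAST output.
    print(percent_identity("GGAAAGGAGG", "GGAAAGCAGG"))  # one mismatch in ten columns
    print(percent_identity("ACGT-A", "ACGTTA"))          # one gap in six columns
```

The same function works on aligned protein sequences, which makes it easy to compare the identity of a protein alignment with that of the corresponding DNA alignment.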