Genomics, Phylogenetics, Machine Learning, and Artificial Neural Networks - Prof. Drena Le, Study notes of Bioinformatics

Lecture notes for BCB 444/544 at Iowa State University, covering genomics, phylogenetics, machine learning, and artificial neural networks. The notes include details on Bayes' theorem, naïve Bayes for binary classification, artificial neural networks, perceptrons, and training perceptrons.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

koofers-user-7h5

BCB 444/544, Fall 07, ISU (Dobbs)
Lecture #33 - Genomics, 11/09/07

Slide 1: BCB 444/544 Lecture 33: Genomics (#33_Nov09)

Slide 2: Required Reading (before lecture)
√ Mon Nov 5 - Lecture 31: Phylogenetics – Parsimony and ML
  • Chp 11 - pp 142 – 169
√ Wed Nov 7 - Lecture 32: Machine Learning
Fri Nov 9 - Lecture 33: Functional and Comparative Genomics
  • Chp 17 and Chp 18

Slide 3: Assignments & Announcements
Fri Nov 9 - HW#6 (will be posted this weekend)
HW#6 - More fun with Machine Learning!!
Due: Fri Nov 16 (or sometime before Mon Nov 26)

Slide 4: Seminars this Week
BCB List of URLs for Seminars related to Bioinformatics: http://www.bcb.iastate.edu/seminars/index.html
• Nov 7 Wed - BBMB Seminar, 4:10 in 1414 MBB
  • Sharon Roth Dent, MD Anderson Cancer Center
  • Role of chromatin and chromatin modifying proteins in regulating gene expression
• Nov 8 Thurs - BBMB Seminar, 4:10 in 1414 MBB
  • Jianzhi George Zhang, U. Michigan
  • Evolution of new functions for proteins
• Nov 9 Fri - BCB Faculty Seminar, 2:10 in 102 SciI
  • Amy Andreotti, ISU
  • T cell signaling: insights from protein NMR spectroscopy

Slide 5: Chp 11 – Phylogenetic Tree Construction Methods and Programs
SECTION IV MOLECULAR PHYLOGENETICS
Xiong: Chp 11 Phylogenetic Tree Construction Methods and Programs
• Distance-Based Methods
• Character-Based Methods
• Phylogenetic Tree Evaluation
• Phylogenetic Programs

Slide 6: Machine Learning
• What is learning?
• What is machine learning?
• Learning algorithms
• Machine learning applied to bioinformatics and computational biology
• Some slides adapted from Dr. Vasant Honavar and Dr.
Byron Olson

Slide 7: Examples of Machine Learning Algorithms
• Naïve Bayes (NB)
• Bayes Theorem
• Neural network (NN) or Artificial Neural Net (ANN)
• Perceptrons
• Support Vector Machine (SVM)
• Kernel functions
Lab - WEKA: Decision Trees (DT), NB, SVM

Slide 8: An Application: Predicting RNA Binding Sites in Proteins
• Problem: Given an amino acid sequence, classify each residue as RNA binding or non-RNA binding
• Input to the classifier is a string of amino acid identities
• Output from the classifier is a class label, either binding or not

Slide 9: Bayes Theorem Applied to RNA Binding Site Prediction
$$P(c = 1 \mid X = x) = \frac{P(X = x \mid c = 1)\, P(c = 1)}{P(X = x)}$$
$$P(c = 0 \mid X = x) = \frac{P(X = x \mid c = 0)\, P(c = 0)}{P(X = x)}$$
$$P(\text{binding} \mid \text{aa seq}) = \frac{P(\text{aa seq} \mid \text{binding})\, P(\text{binding})}{P(\text{aa seq})}$$

Slide 10: Naïve Bayes for Binary Classification
Assign c = 1 if
$$\frac{P(c = 1 \mid X = x)}{P(c = 0 \mid X = x)} \geq \theta$$
Otherwise, assign c = 0

Slide 11: Example: Is ARG 6 RNA-binding or not?
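One way to make this question concrete is a minimal Python sketch of the likelihood-ratio rule from the Naïve Bayes slides, applied to the 11-residue window around ARG 6 (T S K K K R Q R G S R). It is an illustration under stated assumptions, not the classifier used in the lecture: the function names, the add-one pseudocount, and the toy training windows are hypothetical (one training window simply reuses the slide's window), and the sketch compares per-position likelihoods only, leaving out the class priors.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def train_position_probs(windows, pseudocount=1.0):
    """Estimate p(X_i = a | class) for every window position i.

    `windows` holds equal-length sequence windows from ONE class.
    The pseudocount (an assumption, not from the slides) avoids zero
    probabilities for residues never seen at a position.
    """
    length = len(windows[0])
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    probs = []
    for i in range(length):
        counts = Counter(w[i] for w in windows)
        probs.append({a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS})
    return probs


def log_likelihood_ratio(window, probs_binding, probs_nonbinding):
    """Sum over positions of log p(X_i | c=1) - log p(X_i | c=0).

    Working in log space avoids underflow from multiplying many small terms.
    """
    return sum(
        math.log(pb[a]) - math.log(pn[a])
        for a, pb, pn in zip(window, probs_binding, probs_nonbinding)
    )


def classify(window, probs_binding, probs_nonbinding, theta=1.0):
    """Return 1 (binding) if the likelihood ratio is >= theta, else 0."""
    llr = log_likelihood_ratio(window, probs_binding, probs_nonbinding)
    return 1 if llr >= math.log(theta) else 0


# Toy training data, invented purely for illustration.
binding_windows = ["TSKKKRQRGSR", "GSKRKRQKGTR"]
nonbinding_windows = ["ALDEVLGAFDE", "LLDEAVGEFDL"]

p_binding = train_position_probs(binding_windows)
p_nonbinding = train_position_probs(nonbinding_windows)

# Window centered on ARG 6 from the slide.
print(classify("TSKKKRQRGSR", p_binding, p_nonbinding))  # prints 1 with this toy data
```

Raising `theta` above 1 makes the sketch more conservative about calling a residue RNA-binding; real training would use many labeled windows rather than two per class.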
ARG 6: T S K K K R Q R G S R
$$\frac{p(X_1 = T \mid c = 1)\; p(X_2 = S \mid c = 1) \cdots}{p(X_1 = T \mid c = 0)\; p(X_2 = S \mid c = 0) \cdots} \geq \theta$$

Slide 12: Predicted vs Actual RNA Binding for Ribosomal protein L15 (PDB ID 1JJ2:K)
[Figure panels: Actual, Predicted]

Slide 25: Genomics - for excellent overview lectures, see these posted by NHGRI & Pevsner:
1- Genomic sequencing: Mapping and Sequencing, Eric Green, NHGRI (CTGA2005Lecture1.pdf)
2- Human genome project: The Human Genome, Jonathan Pevsner, Kennedy Krieger Institute (2005-10-19_ch17.pdf)
3- SNPs: Studying Genetic Variation II: Computational Techniques, Jim Mullikin, NHGRI (TGA2005Lecture13.pdf)
4- Comparative Genomics: Comparative Sequence Analysis, Elliott Margulies, NHGRI (CTGA2005Lecture8.pdf)

Slide 26: 1- Genomic sequencing
Many thanks to: Eric Green, NHGRI for the following slides extracted from his lecture on: Mapping and Sequencing (CTGA2005Lecture1.pdf)

Slide 27 (E Green 2005): Genomic Sequencing - Brief Review

Slide 28 (E Green 2005): Comparison of Sequenced Genome Sizes

Slide 29 (E Green 2005): Comparison of Genetic & Physical Maps

Slide 30 (E Green 2005): STSs: Provide common markers for "linking" genetic & physical maps

Slide 31 (E Green 2005): With complete genomes (now), why bother to generate physical maps?
Slide 32 (E Green 2005): Genomic sequencing requires assembly of sequences obtained from cloned DNA

Slide 33: Human Genome Sequencing
Two approaches:
• Public (government) - International Consortium (6 countries, NIH-funded in US)
  • "Hierarchical" cloning & BAC-by-BAC sequencing
  • Map-based assembly
• Private (industry) - Celera (Craig Venter)
  • Whole genome random "shotgun" sequencing
  • Computational assembly (took advantage of public maps & sequences, too)
Guess which human genome Celera sequenced?

Slide 34 (E Green 2005): NIH: "Hierarchical" BAC-by-BAC Sequencing

Slide 35 (E Green 2005): "Hierarchical" Subcloning Strategy

Slide 36 (E Green 2005): Celera: Whole-Genome "Shotgun" Sequencing

Slide 37 (E Green 2005): "Shotgun" Sequencing Strategy

Slide 38 (E Green 2005): Either Strategy: Sequence "Finishing" = Hardest part !!

Slide 39 (E Green 2005): Advances in DNA Sequencing Technology

Slide 40 (E Green 2005): Sequencing Method #1: Maxam-Gilbert "Chemical Degradation"

Slide 41 (E Green 2005): Sequencing Method #2: Sanger "Di-deoxy Chain Termination"

Slide 42 (E Green 2005): Automated Sequencing for Genome Projects: Sanger method - with improvements
Another “recent” improvement: rapid & high resolution separation of fragments in capillaries instead of gels (E Yeung, Ames Lab, ISU)
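The shotgun slides above note that genomic sequencing depends on computational assembly of overlapping reads. The toy Python sketch below shows the basic idea with a greedy overlap-merge strategy. This is an assumption chosen for brevity, not the method used by Celera or the public consortium: the function names, the `min_len` parameter, and the short example reads are hypothetical, and the sketch ignores sequencing errors, reverse complements, and repeats, which are what make real assembly and "finishing" hard.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap.

    Stops when one contig remains or no pair overlaps by `min_len` bases.
    Returns the list of remaining contigs.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave contigs separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads sampled from the made-up sequence ATGCGTACGTTAGC.
example_reads = ["ATGCGTAC", "GTACGTTA", "CGTTAGC"]
print(greedy_assemble(example_reads))  # prints ['ATGCGTACGTTAGC']
```

Greedy merging can misassemble repeated regions, which is one reason map information (as in the hierarchical BAC-by-BAC approach) helps constrain where reads belong.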