Biological Databases in BCB 444/544: Understanding Genomic Information - Prof. Drena Leigh, Assignments of Bioinformatics

These lecture slides cover biological databases and resources related to genomic sequencing, gene structure, proteomics, metabolomics, and systems biology. Students in the BCB 444/544 course at ISU can access the updated syllabus, lecture and lab schedules, and homework assignments on the course website. Topics include genomic databases, comparative genomics, molecular recognition, and bioinformatics. Students will learn about protein interactions, network analysis, and systems biology to understand gene function on a genomic scale.

Typology: Assignments

Pre 2010

Uploaded on 09/02/2009

koofers-user-kbl
#2 - Biological Databases
8/22/07 BCB 444/544 F07 ISU Dobbs

Slide 1
Finish: Lecture 1 - What is Bioinformatics?
Lecture 2 - Biological Databases & ISU Resources
#2_Aug22 BCB 444/544

Slide 2
BCB 444/544 - Website
http://bindr.gdcb.iastate.edu/bcb544
• Updated Syllabus
• Lecture & Lab Schedules (with Homework Assignments)
• Lecture PPTs & PDFs
• Lab Exercises
• Practice Exams
• Grading Policy
• Project Guidelines, etc.
• Links
• Check regularly for updates!

Slide 3
BCB 444/544 - Computer Lab
Meets in 1304 MBB every week EXCEPT this week: 1st Lab meets in Library Rm 32
Current schedule: Thurs 1-3 PM
Conflicts? See Drena

Slide 4
Assignment #1: Tell us about you
Due: Today - Wed, Aug 22
1- Complete HW1_Aug20 for Drena

Slide 5
Required Reading (must read before lecture)
Wed Aug 22 - for Lecture #2
• Xiong Textbook:
  • Chp 1 - Introduction
  • Chp 2 - Biological Databases
Thurs Aug 23 - for Lab #1:
• Literature Resources for Bioinformatics, Andrea Dinkelman (see Lab Schedule for URL)
Fri Aug 24
• Genomics & Its Impact on Science & Society: Genomics & Human Genome Project Primer (see Lecture Schedule for URL)

Slide 6
Assignment #2 (& for Fun): DNA Interactive "Genomes"
A tutorial on genomic sequencing, gene structure, gene prediction
Howard Hughes Medical Institute (HHMI)
Cold Spring Harbor Laboratory (CSHL)
1. Take the Tour
2. Read about the Project
3. Do some Genome Mining with:
Nothing to turn in - just do it!
http://www.dnai.org/c/index.html

Slide 7
#1 - What is Bioinformatics? (cont.)
Xiong: Chp 1
1 Introduction
  What Is Bioinformatics?
  Goal
  Scope
  Applications
  Limitations
  New Themes
  Further Reading

Slide 8
1st Draft Human Genome: "Finished" in 2001
Modified from Eric Green

Slide 9
Human Genome Sequencing
Two approaches:
• Public (government) - International Consortium (mainly 6 countries, NIH-funded in US)
  • Hierarchical cloning & BAC-to-BAC sequencing
  • Map-based assembly
• Private (industry) - Celera, Craig Venter, CEO
  • Whole genome random "shotgun" sequencing
  • Computational assembly (took advantage of public maps & sequences, too)
Guess which human genome they sequenced? Craig's
How many genes? ~ 20,000 (Science, May 2007)

Slide 10
Public Sequencing: International Consortium
Modified from Eric Green

Slide 11
Comparison of Sequenced Genome Sizes
Plants? Many have much larger genomes than human!
Modified from Eric Green

Slide 12
"Complete" Human Genome Sequence: What next?
from Eric Green

Slide 25
Challenges in Organizing Information: Redundancy and Multiplicity
• Different protein sequences can assume the same 3-D structure
• Organisms have many similar genes with redundant functions
• A single gene may have several different functions
• Genes & proteins function in complex genetic & regulatory pathways
• How do we organize all this information so that we can make sense of it?
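
One way to picture the redundancy and multiplicity described in the bullets above is as a many-to-many relation between genes and functions. The following is a minimal sketch, not part of the original lecture: the gene names and function labels are hypothetical placeholders, and the two dictionary indexes are just one possible way to organize such annotations.

```python
from collections import defaultdict

# Hypothetical (gene, function) annotation pairs; the names are placeholders,
# not real annotations from any database.
ANNOTATIONS = [
    ("geneA", "DNA repair"),
    ("geneB", "DNA repair"),      # redundancy: two genes, same function
    ("geneA", "recombination"),   # multiplicity: one gene, two functions
    ("geneC", "copper transport"),
]


def build_indexes(pairs):
    """Index the same annotations in both directions."""
    gene_to_functions = defaultdict(set)
    function_to_genes = defaultdict(set)
    for gene, function in pairs:
        gene_to_functions[gene].add(function)
        function_to_genes[function].add(gene)
    return gene_to_functions, function_to_genes


gene_to_functions, function_to_genes = build_indexes(ANNOTATIONS)

# Genes annotated with more than one function.
multifunctional = sorted(g for g, fs in gene_to_functions.items() if len(fs) > 1)
# Functions carried out by more than one gene.
redundant = sorted(f for f, gs in function_to_genes.items() if len(gs) > 1)

print("Genes with several functions:", multifunctional)
print("Functions shared by several genes:", redundant)
```

Keeping both indexes makes either question ("what does this gene do?" and "which genes do this?") a single lookup, at the cost of storing each annotation twice.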
Functional Genomics & Systems Biology:
sequences <> motifs <> genes <> RNAs <> proteins <> structures <> functions <> expression levels <> pathways <> regulatory networks <> functional systems
Modified from Mark Gerstein

Slide 26
One Strategy: Molecular Parts = Conserved Domains
Modified from Mark Gerstein

Slide 27
"Parts List" approach to bike maintenance:
Which are the common parts (bolt, nut, washer, spring, bearing)?
Which are unique parts (cogs, levers)?
How flexible and adaptable are parts mechanically?
Where are the parts located?
Modified from Mark Gerstein

Slide 28
World of macromolecular structures is also finite, providing a valuable simplification
Figure labels: ~ 2,000 folds; ~ 20,000 genes (H. sapiens); ~ 2,000 genes (T. pallidum)
Global surveys of a finite set of parts from different perspectives
Same logic for pathways, functions, sequence families, blocks, motifs....
Modified from Mark Gerstein

Slide 29
BUT, what actually happens inside cells or within whole organisms is very complex - providing a challenging complication!
Exploring the Virtual Cell at ISU
Virtual Cell projects elsewhere...
NCBI's Bookshelf - a great resource!

Slide 30
So, having a list of parts is not enough!
BIG QUESTION?
SYSTEMS BIOLOGY
How do parts work together to form a functional system?
What is a system?
Macromolecular complex, pathway, network, cell, tissue, organism, ecosystem…

Slide 31
So, this is Bioinformatics
What is it good for?
Just a few examples…

Slide 32
Designing drugs
• Understanding how proteins bind other molecules
• Structural modeling & ligand docking
• Designing inhibitors or modulators of key proteins
(Figures adapted from Olsen Group Docking Page at Scripps, Dyson NMR Group Web page at Scripps, and from Computational Chemistry Page at Cornell Theory Center.)
Modified from Mark Gerstein

Slide 33
Finding homologs of "new" human genes
Modified from Mark Gerstein

Slide 34
Finding WHAT?
Homologs - "same genes" in different organisms
• Human vs Mouse vs Yeast
• Much easier to do experiments on yeast to determine function
• Often, function of an ortholog in at least one organism is known

Best Sequence Similarity Matches to Date Between Positionally Cloned Human Genes and S. cerevisiae Proteins

Human Disease | MIM # | Human Gene | GenBank Acc# for Human cDNA | BLASTX P-value | Yeast Gene | GenBank Acc# for Yeast cDNA | Yeast Gene Description
Hereditary Non-polyposis Colon Cancer | 120436 | MSH2 | U03911 | 9.2e-261 | MSH2 | M84170 | DNA repair protein
Hereditary Non-polyposis Colon Cancer | 120436 | MLH1 | U07418 | 6.3e-196 | MLH1 | U07187 | DNA repair protein
Cystic Fibrosis | 219700 | CFTR | M28668 | 1.3e-167 | YCF1 | L35237 | Metal resistance protein
Wilson Disease | 277900 | WND | U11700 | 5.9e-161 | CCC2 | L36317 | Probable copper transporter
Glycerol Kinase Deficiency | 307030 | GK | L13943 | 1.8e-129 | GUT1 | X69049 | Glycerol kinase
Bloom Syndrome | 210900 | BLM | U39817 | 2.6e-119 | SGS1 | U22341 | Helicase
Adrenoleukodystrophy, X-linked | 300100 | ALD | Z21876 | 3.4e-107 | PXA1 | U17065 | Peroxisomal ABC transporter
Ataxia Telangiectasia | 208900 | ATM | U26455 | 2.8e-90 | TEL1 | U31331 | PI3 kinase
Amyotrophic Lateral Sclerosis | 105400 | SOD1 | K00065 | 2.0e-58 | SOD1 | J03279 | Superoxide dismutase
Myotonic Dystrophy | 160900 | DM | L19268 | 5.4e-53 | YPK1 | M21307 | Serine/threonine protein kinase
Lowe Syndrome | 309000 | OCRL | M88162 | 1.2e-47 | YIL002C | Z47047 | Putative IPP-5-phosphatase
Neurofibromatosis, Type 1 | 162200 | NF1 | M89914 | 2.0e-46 | IRA2 | M33779 | Inhibitory regulator protein
Choroideremia | 303100 | CHM | X78121 | 2.1e-42 | GDI1 | S69371 | GDP dissociation inhibitor
Diastrophic Dysplasia | 222600 | DTD | U14528 | 7.2e-38 | SUL1 | X82013 | Sulfate permease
Lissencephaly | 247200 | LIS1 | L13385 | 1.7e-34 | MET30 | L26505 | Methionine metabolism
Thomsen Disease | 160800 | CLC1 | Z25884 | 7.9e-31 | GEF1 | Z23117 | Voltage-gated chloride channel
Wilms Tumor | 194070 | WT1 | X51630 | 1.1e-20 | FZF1 | X67787 | Sulphite resistance protein
Achondroplasia | 100800 | FGFR3 | M58051 | 2.0e-18 | IPL1 | U07163 | Serine/threonine protein kinase
Menkes Syndrome | 309400 | MNK | X69208 | 2.1e-17 | CCC2 | L36317 | Probable copper transporter

Modified from Mark Gerstein

Slide 35
Comparative Genomics: Genome/Transcriptome/Proteome/Metabolome
Databases, statistics
• Occurrence of specific genes or features in a genome
  • How many kinases in yeast?
• Compare Tissues
  • Which proteins are expressed in cancer vs normal tissues?
• Diagnostic tools
• Drug target discovery
Modified from Mark Gerstein

Slide 36
Molecular Recognition: Analyzing & Predicting Macromolecular Interfaces (in DNA, RNA & protein complexes)
Drena Dobbs, GDCB; Jae-Hyung Lee; Michael Terribilini; Jeff Sander; Pete Zaback; Vasant Honavar, Com S; Feihong Wu; Cornelia Caragea; Fadi Towfic; Jivo Sinapov; Robert Jernigan, BBMB; Taner Sen; Andrzej Kloczkowski; Kai-Ming Ho, Physics

Slide 37
Designing Zinc Finger DNA-binding Proteins to Recognize Specific Sites in Genomic DNA
Drena Dobbs, GDCB; Jeff Sander; Pete Zaback; Dan Voytas, GDCB; Fengli Fu; Les Miller, ComS; Vasant Honavar, ComS; Keith Joung, Harvard

Slide 38
Structure & Function of Human Telomerase: Predicting structure & functional sites in a clinically important but "recalcitrant" RNP
Figure panels: Cell Biologist; Biochemist; Imagined structure
Image sources: www.intl-pag.org/; www.chemicon.com; Lingner et al (1997) Science 276: 561-567.
How would a systems biologist study telomerase?

Slide 39
SUMMARY: #1 - What is Bioinformatics?

Slide 40
#2 - Biological Databases
Xiong: Chp 2
2 Introduction to Biological Databases
  What Is a Database?
  Types of Databases
  Biological Databases
  Pitfalls of Biological Databases
  Information Retrieval from Biological Databases
  Summary
  Further Reading

Slide 41
What is a Database?
Duh!!
OK: we'll skip that!

Slide 42
Types of Databases
3 Major types of electronic databases:
1- Flat files - simple text files
  • no organization to facilitate retrieval
2- Relational - data organized as tables ("relations")
  • shared features among tables allow rapid search
3- Object-oriented - data organized as "objects"
  • objects associated hierarchically
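
The three database types listed above can be contrasted with a small sketch. This is an illustration under stated assumptions, not the lecture's own code: it reuses four rows from the human/yeast homolog table earlier in these slides, and the tab-separated layout, the table names (`human_gene`, `yeast_gene`), and the class names are all invented for the example. It uses only Python's standard `sqlite3` and `dataclasses` modules.

```python
import sqlite3
from dataclasses import dataclass, field

# (human gene, disease, yeast gene, yeast gene description)
# Rows copied from the homolog table earlier in the lecture.
RECORDS = [
    ("MSH2", "Hereditary Non-polyposis Colon Cancer", "MSH2", "DNA repair protein"),
    ("CFTR", "Cystic Fibrosis", "YCF1", "Metal resistance protein"),
    ("WND", "Wilson Disease", "CCC2", "Probable copper transporter"),
    ("MNK", "Menkes Syndrome", "CCC2", "Probable copper transporter"),
]

# 1- Flat file: plain text, one record per line (tab-separated by assumption).
FLAT_FILE = "\n".join("\t".join(record) for record in RECORDS)


def flat_file_lookup(text, yeast_gene):
    """Scan every line; nothing in the file itself helps retrieval."""
    hits = []
    for line in text.splitlines():
        human, _disease, yeast, _description = line.split("\t")
        if yeast == yeast_gene:
            hits.append(human)
    return hits


# 2- Relational: two tables linked by a shared column (the yeast gene symbol).
def build_relational_db(records):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yeast_gene (symbol TEXT PRIMARY KEY, description TEXT)")
    conn.execute(
        "CREATE TABLE human_gene (symbol TEXT PRIMARY KEY, disease TEXT, "
        "yeast_symbol TEXT REFERENCES yeast_gene(symbol))"
    )
    for human, disease, yeast, description in records:
        conn.execute("INSERT OR IGNORE INTO yeast_gene VALUES (?, ?)", (yeast, description))
        conn.execute("INSERT INTO human_gene VALUES (?, ?, ?)", (human, disease, yeast))
    return conn


def relational_lookup(conn, yeast_gene):
    """Join the two tables on the shared yeast gene symbol."""
    query = (
        "SELECT h.symbol, y.description FROM human_gene AS h "
        "JOIN yeast_gene AS y ON h.yeast_symbol = y.symbol "
        "WHERE y.symbol = ? ORDER BY h.symbol"
    )
    return conn.execute(query, (yeast_gene,)).fetchall()


# 3- Object-oriented: objects that hold other objects hierarchically.
@dataclass
class HumanGene:
    symbol: str
    disease: str


@dataclass
class YeastGene:
    symbol: str
    description: str
    human_matches: list = field(default_factory=list)


def build_objects(records):
    genes = {}
    for human, disease, yeast, description in records:
        gene = genes.setdefault(yeast, YeastGene(yeast, description))
        gene.human_matches.append(HumanGene(human, disease))
    return genes


conn = build_relational_db(RECORDS)
objects = build_objects(RECORDS)

print(flat_file_lookup(FLAT_FILE, "CCC2"))
print(relational_lookup(conn, "CCC2"))
print([h.symbol for h in objects["CCC2"].human_matches])
```

All three lookups answer the same question (which human disease genes match the yeast gene CCC2?), but only the flat-file version has to read every record to do so; the relational version relies on the shared key between tables, and the object version follows references from a parent object to its children.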