Bioinformatics: Building Local Genomic Databases and Key Gene Ontology Features, Study notes of Computer Science

These notes cover building local genomic databases using the NCBI Entrez database and other resources. They also cover key features of the Gene Ontology, including functional classification and the three main ontologies: molecular function, biological process, and cellular component.

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

koofers-user-mw8

CAP 5510-2 BIOINFORMATICS
Su-Shing Chen, CISE
8/19/2005

Building Local Genomic Databases
- Genomic research integrates sequence data with gene function knowledge.
- A gene ontology represents the knowledge in local genomic databases.
- Multiple organisms and gene products (e.g., proteins) with their functions.
- NCBI Entrez database with functions collected from other databases: local SEED database, SWISS-PROT, KEGG.

Key Gene Ontology Features
http://www.geneontology.org/doc/gene_ontology_discussion.html
- Where is a gene expressed? Spatial problem: organism's anatomy.
- What is the subcellular localization of a gene product? Subcellular anatomy.
- When is a gene expressed? Temporal problem: organism's ontogeny.
- What is the function of a gene product? Functional classification of gene products.
- Of what larger process is the gene product function a part? Process hierarchy.
- By what process are a gene's activities controlled? Regulatory hierarchy.
- Of what larger complex is this function a component? Parts list of multicomponent complexes.
- What genes in species A have the function of gene X in species B? Functional classification of species A and B.

Gene Ontology Consortium
- GO Consortium: SGD (Saccharomyces), FlyBase (Drosophila), MGD/GXD (Mouse), TAIR (Arabidopsis), Caenorhabditis elegans.
- Goals:
  1. To compile a comprehensive structured vocabulary of terms, synonyms, biological dimensions (DNA metabolism, molecular function, cell).
  2. To describe biological objects using these terms.
  3. To provide tools for querying and manipulating vocabularies.
  4. To provide tools to assign GO terms to biological objects (sequence, annotation, microarray, protein binding experiments).

Ontology Structure & Standards
- The ontologies are structured vocabularies in the form of directed acyclic graphs (DAGs) that represent a network of children and parents (is-a or part-of).
- See http://www.geneontology.org/.

Database Management Systems
- A DBMS is software for keeping computerized records about an enterprise and for querying information in the records.
- DBMS models: hierarchical, network, relational, and object-oriented.
- SQL (Structured Query Language) is a database language.
- Logical database design: entity-relationship and object orientation.
- Physical database design: indexing, storage, organization.

A database is a set of named tables (relations)
- Columns (attributes)
- Rows (tuples)
- A relational schema = the set of attributes of a table

Entity-Relationship for Gene Product
Diagram entities: Gene-Product, Enzyme-Reaction, Metabolic Pathway, Reaction, Species, Genome, Linkage-Group, Map, Locus, Term.
[Screenshot: schema browser (Java applet window) listing object classes, with the object class EnzymeCatalyzedReaction selected. Legible classes include AUDIT_TRAIL, CloneLibrary, Column, Column_Joins, GelPattern, GeneProduct, Journal, LinkageGroup, Locus, Map, Reaction, Species. Legible attributes include Name VARCHAR(120), SubstrateSpecificity VARCHAR(255), OptimumpH FLOAT. Only local attributes are shown in the browser.]

Generalization Hierarchies
- Several types of entities with common attributes can be generalized into a higher-level entity type.
- Conversely, an entity can be decomposed into lower-level entities.

Object Model - Biological Objects
- Genomic Objects
- Enzyme Objects
- Sequence Objects
- Structure Objects
- Experiment Objects
- Variation Objects
- Mapping Objects
- Citation - Literature + References
- Registry - People + Organizations
- External Links - Databases

Dynamic Model
- Biochemical Processes
- Metabolic Pathways
- Signal Transduction Pathways
- Neural Networks

DATA TYPES
An instance or object of the class contains values for the class attributes stored in the database.
- Text (clone name)
- Number (insert size)
- Restricted Value (DNA type)
- List (people)
- Table (complex related attributes)
- Association (gene to gene-product: protein)
- Sequence
- Pointer (other databases)

Object-oriented concepts
- Object and object identity
- Encapsulation
- Message passing
- Complex object
- Object class/type
- Inheritance
- Polymorphism and run-time binding
- Persistence

OBJECT
- Private memory: data + operation
- Public interface: operation spec
Anything (physical object, abstract concept, event, function, process) can be modeled as an object.
- Data: instance variables, attributes, slots.
- Operations: methods, actions, behaviors.

OBJECT CLASS
- Object type declaration: CLASS protein; DATA sequence, structure; OPERATION function
- Container of object instances: the Protein class contains the protein instances

MESSAGE PASSING
- A: source object (sender)
- B: target object (receiver)
- Message = (objectB, methodX, parameter, return value)
- Return message

COMPLEX OBJECT CLASS - Gene Product Class
Diagram labels: gene product, RNA, protein, gene; example values: Trypsin, PRSS1.

COMPLEX OBJECT CLASS - (Biological) Polymorphism Class
Diagram labels: polymorphism object, allele set, fragments in kb, sizes detected in a polymorphism, allele frequency, population, alleles, detection method.

Type:
- Genetic map
- Physical map
- Contig map
- Transcript map
- Radiation hybrid map
- Cytogenetic map
Mapped Entity: Amplimer, Sequencing region, Bin, Syndromic region, Breakpoint, Syntenic region, Chromosome, Cell line, Chromosome reagent, Library, Clone, Contig, CpG island, Cytogenetic marker, EST, Gene, Gene element, Regulatory region, Repeat.

INHERITANCES
- Superclass: Eukaryote (operations: exons, introns)
- Class: Animal, Fungi, Plant (Plant operations: leaves)
- Subclass: Hominidae, Canidae
- Sub-subclass: Man (operations: chromosomeY), Woman, Dog, Wolf, Coyote
- Resulting operation sets shown in the diagram: (exons, introns, chromosomeY) and (exons, introns, leaves)

Advantages of Inheritance
- Reuse of object type declaration.
- Reuse of software implementations.
- Modularization of complex problems.
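
The inheritance hierarchy above can be sketched in Python as a minimal illustration. The class names follow the diagram labels; the `operations` method and its list-of-strings return value are assumptions made for this example, not part of the original notes.

```python
class Eukaryote:
    """Superclass: operations shared by every descendant."""

    def operations(self):
        return ["exons", "introns"]


class Plant(Eukaryote):
    """Class level: adds 'leaves' to the inherited operations."""

    def operations(self):
        return super().operations() + ["leaves"]


class Animal(Eukaryote):
    """Class level: inherits everything from Eukaryote unchanged."""


class Hominidae(Animal):
    """Subclass level: no new operations of its own."""


class Man(Hominidae):
    """Sub-subclass level: adds 'chromosomeY'."""

    def operations(self):
        return super().operations() + ["chromosomeY"]


class Woman(Hominidae):
    """Sub-subclass level: reuses the inherited operations as-is."""


if __name__ == "__main__":
    for organism in (Man(), Woman(), Plant()):
        print(type(organism).__name__, organism.operations())
```

Only `Plant` and `Man` define anything new; every other class reuses the declaration and implementation of its ancestors, which is the reuse the list of advantages refers to.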
POLYMORPHISM - MUTATION
Diagram labels: nucleotide sequence, gene, clone, amplimer (PCR primer), aggregation.
- Relation: amplimers are contained in genes
- Relation: amplimers are contained in clones
- Relation: amplimers from clones overlap genes

Object-Oriented DBMS Architecture
- Components: persistent databases, class libraries, design tools, query tools, API, Database Manager, Object Manager
- Higher-level functions: query, transaction, schema management, concurrency control, type management, versioning, object caching
- Storage-level functions: page management, object locking, disk access, logging, recovery, transaction commit

PHYLOGENETIC DATA
Next grouping is phylogenetic data:
- [family/superfamily classification]
- [species]+[tissue]+[cell type]+[localization in cell]+[state of maturity (embryo, juvenile, adult, unspecified)]
- [genus]
- [phylum]
- [kingdom]
- [cDNA sequence]
- [aa sequence]
- [bibliography for sequences]

MOLECULAR DYNAMICS
Diagram: Molecular Dynamics branches into Expression, Degradation, Turnover.

APPLICATIONS
Next grouping is for applications significance:
- [human or veterinary health significance, if any known]
- [bibliography for human or veterinary health significance]
- [biotech significance, if any known]
- [bibliography for biotech significance]
- [agricultural significance, if any known]
- [bibliography for agricultural significance]
Diagram: Applications branches into Health, Biotech, Agriculture.

STRUCTURAL INFORMATION
Next set of entries is for structural information:
- [experimentally determined structures]
- [bibliography for experimentally determined structures]
- [model-built structures]
- [bibliography for model-built structures]
- [partial structural information: CD spectra, solution NMR, cysteine scanning, antibody labelling, identification of glycosylation or phosphorylation sites, etc.]
- [bibliography for partial structural information]
Diagram: Structural Information branches into Experimental Structures, Model Structures, and Partial Structure Information, each with its own Bibliography. Additional labels: Entity, Primitives, Reactors, CellComponents.

2002 Fall Home Work 3 Due 11/7
- Use the NCBI Entrez structure database to get all structure coordinate data (if available) for your data set (2 bacteria and all BLAST annotations).
- Create flat files of structure data and visual data using Cn3D.

CAP 5510 Bacteria & Fungi Functional Database
Diagram labels: Gene Sequence, RefSeq, Locus, D/E/G, Protein Sequence, CDS, Anno. G. Sequence, Anno. P. Sequence, BLAST (twice), CDS, Functions (three times), GO, Databases, Protein Structure, A. P. Structure.
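
The notes describe a database as a set of named tables and the Gene Ontology as a DAG of is-a/part-of links. A local functional database along those lines might be sketched with Python's built-in `sqlite3` module. Everything in the schema is an assumption for illustration: the table and column names, the helper `terms_for`, and the term IDs `T1` to `T3`, which are placeholders and not real GO accessions. Only the Trypsin/PRSS1 pairing comes from the notes.

```python
import sqlite3

# In-memory database; a real local database would use a file path instead.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene_product (id INTEGER PRIMARY KEY, name TEXT, gene TEXT);
CREATE TABLE term         (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE term_link    (child TEXT, parent TEXT, relation TEXT);
CREATE TABLE annotation   (gene_product_id INTEGER, term_id TEXT);
""")

conn.execute("INSERT INTO gene_product VALUES (1, 'Trypsin', 'PRSS1')")
conn.executemany("INSERT INTO term VALUES (?, ?)", [
    ("T1", "molecular function"),                  # placeholder ID
    ("T2", "peptidase activity"),                  # placeholder ID
    ("T3", "serine-type endopeptidase activity"),  # placeholder ID
])
# Child-to-parent edges of the DAG (is_a or part_of).
conn.executemany("INSERT INTO term_link VALUES (?, ?, ?)", [
    ("T3", "T2", "is_a"),
    ("T2", "T1", "is_a"),
])
conn.execute("INSERT INTO annotation VALUES (1, 'T3')")

# Recursive query: start at the directly annotated terms, then walk up
# the parent links until no new ancestors appear.
ANCESTORS_SQL = """
WITH RECURSIVE ancestors(id) AS (
    SELECT term_id FROM annotation WHERE gene_product_id = ?
    UNION
    SELECT term_link.parent
    FROM term_link JOIN ancestors ON term_link.child = ancestors.id
)
SELECT term.id, term.name
FROM ancestors JOIN term ON term.id = ancestors.id
ORDER BY term.id
"""


def terms_for(gene_product_id):
    """Return the direct and inherited terms of one gene product."""
    return conn.execute(ANCESTORS_SQL, (gene_product_id,)).fetchall()


if __name__ == "__main__":
    for term_id, name in terms_for(1):
        print(term_id, name)
```

Storing the links as child-to-parent rows keeps the ontology a plain relation, so a term with several parents needs no schema change, only extra rows in `term_link`.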