NCBI Tools for Gene and Genome Analysis in Molecular Biosciences, Study notes of Molecular biology

Information about a university course titled 'Functional Genomics With a Focus on Tools in Bioinformatics', offered in the fall of 2008. The course covers gene and genome structure analysis using tools from the National Center for Biotechnology Information (NCBI), including GenBank, Entrez PubMed, and other bioinformatics tools. It also introduces concepts in computational genomics, such as genome mismatch scanning and genome sequence sampling, and the role of bioinformatics and computational biology in functional genomics.

Typology: Study notes

Pre 2010

Uploaded on 08/30/2009

koofers-user-u6x

Gene and Genome Structure: Eukaryotes
Functional Genomics With a Focus on Tools in Bioinformatics

Molecular Biosciences (MBioS 503)
Molecular Biology I
Fall 2008
9/15/2008

National Center for Biotechnology Information (NCBI)
GenBank, Entrez PubMed and other Bioinformatics Tools

Computational Genomics

NCBI Web Links

http://www.ncbi.nlm.nih.gov
http://www.ncbi.nlm.nih.gov/Entrez/
http://www.ncbi.nih.gov/Genbank/
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide
http://www.ncbi.nlm.nih.gov/dbEST/
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
http://www.ncbi.nlm.nih.gov/LocusLink/
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=homologene
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
http://www.ncbi.nlm.nih.gov/PubMed/
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd
http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=snp
http://www.ncbi.nlm.nih.gov/SNP/
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/advancedentrez.html
http://www.ncbi.nlm.nih.gov/geo/
http://www.ncbi.nlm.nih.gov/RefSeq/

FTP:
ftp://ftp.ncbi.nlm.nih.gov/
ftp://ftp.ncbi.nlm.nih.gov/repository/UniGene
ftp://ftp.ncbi.nih.gov/pub/HomoloGene/

NUCLEOTIDE:
http://genome.ucsc.edu/
http://www.embl-heidelberg.de/
http://www.ensembl.org/
http://www.ebi.ac.uk/
http://www.gdb.org/
http://bioinfo.weizmann.ac.il/cards/index.html
http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl

PATHWAYS and NETWORKS:
http://www.genome.ad.jp/kegg/

PROTEIN:
http://us.expasy.org/
ftp://us.expasy.org/
http://www.sanger.ac.uk/Software/Pfam/
http://www.sanger.ac.uk/Software/Pfam/ftp.shtml
http://smart.embl-heidelberg.de/
http://www.ebi.ac.uk/interpro/
http://us.expasy.org/prosite/
ftp://us.expasy.org/databases/prosite/
ftp://ftp.genome.ad.jp/pub/kegg/ (http://www.genome.ad.jp/anonftp/)
http://dip.doe-mbi.ucla.edu
http://dip.doe-mbi.ucla.edu/dip/Download.cgi
http://www.blueprint.org/bind/
http://www.blueprint.org/bind/bind_downloads.html
http://160.80.34.4/mint/index.php
http://160.80.34.4/mint/release/main.php
http://www.hprd.org/
http://www.hprd.org/FAQ?selectedtab=DOWNLOAD+REQUESTS
http://www.pubgene.org/ (also .com)

More Web Links

http://www.bioconductor.org/
http://apps1.niaid.nih.gov/david/
http://www.geneontology.org/
http://discover.nci.nih.gov/gominer/index.jsp
http://pubmatrix.grc.nia.nih.gov/
http://pevsnerlab.kennedykrieger.org/dragon.htm

Some Abbreviations Used in HGP Literature:

AFLP  amplified fragment length polymorphism
APP  amyloid precursor protein
ARS  autonomously replicating sequence
BAC  bacterial artificial chromosome
Bioinformatics
bp  base pair
CAPS  cleaved amplified polymorphic sequences
CEPH  Centre d'Etude du Polymorphisme Humain
Computational Genomics
cM  centimorgan
ct  chloroplast
DIRVISH  direct visual hybridization
DMD  Duchenne muscular dystrophy
EMC  enzyme mismatch cleavage
ESC  embryonic stem cells
EST  expressed sequence tag
FACS  fluorescence activated cell sorting
FISH  fluorescent in situ hybridization
Functional Genomics
GDRDA  genetically directed representational difference analysis
GMS  genome mismatch scanning
GSS  genome sequence sampling
HAEC  human artificial episomal chromosome
HPRT  hypoxanthine phosphoribosyltransferase
kb  kilobase
LINE  long interspersed nuclear element
LOD  logarithm (likelihood) of odds
LTR  long terminal repeat
MAC  mammalian artificial chromosome
Mb  megabase
MHC  major histocompatibility complex
mt  mitochondrial
ORF  open reading frame
PAC  P1 artificial chromosome
PCR  polymerase chain reaction
PFGE  pulsed field gel electrophoresis
QTL  quantitative trait loci
RAPD  randomly amplified polymorphic DNA
RARE  RecA-assisted restriction endonuclease
RDA  representational difference analysis
SNP  single nucleotide polymorphism

Bioinformatics and Computational Biology (sometimes called in silico or systems biology)

• Involves the use or development of techniques, including applied mathematics, informatics, statistics, computer science, artificial intelligence, chemistry, and biochemistry, to solve biological problems, usually at the molecular level.
• The primary goal of bioinformatics is to increase our understanding of biological processes. What sets it apart from other approaches is its focus on developing and applying computationally intensive techniques (e.g., data mining and machine learning algorithms) to achieve this goal.
• Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution.

Functional vs Computational Genomics

• Functional Genomics: Results Obtained from Experiments with Organisms, Cells, Genes, Transcripts and Proteins –> Genomic Predictions
• Computational Genomics: Sequence & Algorithm-Based Knowledge –> Genomic Predictions
• Goal of Functional Genomics: Experimentally linking the sequence information encoded in the genome to the behavior of cells, organisms, and populations

Numerous Problems in Functional Genomics

• Size and complexity of eukaryotic genomes is a major obstacle
• What experimentally defines a “functional” sequence element?
• How to identify “novel” or “unsuspected” functional elements?

How is a gene defined in “wet” biology and in silico?

DNA array probe design:
• Source – cDNA libraries, oligos, clone collections
• Content – UniGene, Celera, Incyte
• Transcript coverage
• Homology to other transcripts
• Hybridization dynamics – hyper-multiplex hyb rxn
• Empirical validation

Seq. from mRNA sample / Seq. on DNA array:
• 3’ bias
• Alt. splicing – known and not
• Alt. start / stop site in same RNA molecule
• Less important: RNA editing, SNPs

Cross-referencing of array probes:
• GenBank <> UniGene <> HomoloGene

Possible mis-referencing:
• Genomic GenBank Acc.#’s
• Referenced ID has more NT’s than probe
• Old DB builds
• DB or table errors

http://www.ncbi.nlm.nih.gov/Entrez/

[Figure: NT's in GenBank (millions), axis ticks from 1 to 100000, years 1984 to 2004]

• Sequence Quality!
• Redundancy!
• Unsurpassed as source of expressed sequence
• Completeness?
• Chaos?!?

[Screenshot: NCBI record for COMT (catechol-O-methyltransferase); text not recoverable]

Entrez Gene Collection

[Screenshot; text not recoverable]

HomoloGene Build Procedure

[Screenshot; text not recoverable]

Computational Genomics

• Our genome encodes an enormous amount of information about ourselves
  – our looks
  – our size
  – how our bodies work
  – ….
  – our health
  – our behaviors
  – … who we are!

gcgtacgtacgtagagtgctagtctagtcgtagcgccgtagtcgatcgtgtgggtagtagctgatatgatgcgaggtaggggataggatagcaaca
gatgagcggatgctgagtgcagtggcatgcgatgtcgatgatagcggtaggtagacttcgcgcataaagctgcgcgagatgattgcaaagragtt
agatgagctgatgctagaggtcagtgactgatgatcgatgcatgcatggatgatgcagctgatcgatgtagatgcaataagtcgatgatcgatgatg
atgctagatgatagctagatgtgatcgatggtaggtaggatggtaggtaaattgatagatgctagatcgtaggtagtagctagatgcagggataaac
acacggaggcgagtgatcggtaccgggctgaggtgttagctaatgatgagtacgtatgaggcaggatgagtgacccgatgaggctagatgcgat
ggatggatcgatgatcgatgcatggtgatgcgatgctagatgatgtgtgtcagtaagtaagcgatgcggctgctgagagcgtaggcccgagagga
gagatgtaggaggaaggtttgatggtagttgtagatgattgtgtagttgtagctgatagtgatgatcgtag
…….
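A raw sequence like the one shown on this slide is the usual starting point for computational genomics. As a minimal sketch (not part of the original slides), the Python below tallies base composition and GC content. The function names `base_composition` and `gc_content` and the short example string are hypothetical. Any symbol other than a, c, g, t is counted under "other"; this includes the `r` near the end of the second row on the slide, which may be an IUPAC ambiguity code or simply a typo.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Count a/c/g/t and lump every other symbol under 'other'."""
    # Remove whitespace so pasted multi-line sequences are handled.
    cleaned = "".join(seq.split()).lower()
    counts = Counter(cleaned)
    result = {base: counts.get(base, 0) for base in "acgt"}
    result["other"] = sum(n for sym, n in counts.items() if sym not in "acgt")
    return result


def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases that are g or c (0.0 for empty input)."""
    comp = base_composition(seq)
    total = comp["a"] + comp["c"] + comp["g"] + comp["t"]
    return (comp["g"] + comp["c"]) / total if total else 0.0


# Hypothetical example: the last 15 symbols of the second row on the slide.
fragment = "gattgcaaagragtt"
print(base_composition(fragment))
print(f"GC content: {gc_content(fragment):.2f}")
```

Excluding ambiguous symbols from the denominator is one possible design choice; counting them in the total would give a slightly lower GC fraction.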
Computational Genomics

• As technologies improve, we are able to extract more and more information encoded in a genome

[Figure: biological data at the level of genes, proteins, complexes, pathways, whole cell, organs, community]

Computational Genomics

• While the ultimate goal of “functional genomics” is to link behavior of cells, organisms, and populations to the information encoded in the genome, “computational genomics” is mainly about identifying and characterizing the parts-lists of complex biological systems

[Figure labels: genomics, transcriptomics, proteomics, metabolomics]

Computational Genomics

• Genetic parts-list encoded in a genome
  – genome sequence and variations
  – genomic structures
  – protein-coding genes
  – RNA-coding genes
  – pseudogenes
  – homologs/orthologs/paralogs
  – promoters/terminators
  – regulatory elements/binding motifs
  – transposable elements
  – …….
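
Protein-coding genes are one entry in the parts-list above, and ORF (open reading frame) appears in the abbreviation list. As a rough illustration only (the slides do not describe a gene-finding method), the sketch below scans the three forward reading frames of a DNA string for stretches that start with ATG and end at the first in-frame stop codon. The function name `find_orfs`, the `min_codons` parameter, and the toy sequence are all hypothetical. Real gene finders for eukaryotic genomes must also deal with introns, the reverse strand, and statistical signals, none of which this sketch handles.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list:
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start/end are 0-based and end-exclusive; the span includes the stop codon.
    min_codons counts codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG in this frame, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close this ORF and look for the next ATG
    return orfs


# Toy sequence (hypothetical): ATG AAA TGA flanked by filler bases.
toy = "ccATGAAATGAgg"
for start, end, frame in find_orfs(toy):
    print(frame, start, end, toy[start:end])
```

The `min_codons` threshold is a simple way to suppress very short ATG-to-stop stretches that occur by chance; the value 2 is chosen only so the toy example produces output.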