Genomics, Bioinformatics and Proteomics - Lecture Slides | SOCR 330

4/20/2018

“Genomics, Bioinformatics, and Proteomics”
• Genomic Analysis
• DNA Analysis and Bioinformatics
• Functional Genomics
• Omics Revolution
• Proteomics

Genomes
Genome: the complete set of DNA sequences in a single cell of an organism. It includes the sequence of each chromosome plus any DNA in organelles.
Prokaryotes
• Large chromosome (most are circular)
• Plasmids (small circular chromosomes)
Eukaryotes
• Linear chromosomes in the nucleus
• Mitochondrial DNA (circular chromosome)
• Chloroplast DNA in plants (circular chromosome)

[Figure: restriction sites labeled BamHI]
Genomic DNA is cut into multiple overlapping fragments by digestion with different restriction enzymes to create a series of contiguous fragments, or “contigs”.
Overlapping sequenced fragments are aligned using computer programs to assemble an entire chromosome.
Fragments are aligned based on identical DNA sequences.
DNA Sequence Alignment
Figure 18.2 Contiguous fragments, or contigs
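The idea of aligning overlapping fragments on identical sequence can be illustrated with a minimal sketch. It assumes error-free reads from a single strand and uses a simple greedy strategy (repeatedly merge the pair with the longest suffix-to-prefix overlap). The function names `overlap` and `assemble`, the `min_len` parameter, and the toy fragments are all hypothetical. Real assembly programs also have to cope with sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge the best-overlapping pair until no overlaps remain."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        # Join the two fragments, writing the shared sequence only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Toy fragments (invented for illustration, not real genomic data).
print(assemble(["ATGGCGT", "GCGTACC", "ACCTTGA"]))  # ['ATGGCGTACCTTGA']
```

Fragments that share no sufficiently long overlap are returned as separate contigs, which mirrors how gaps between contigs arise in practice.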
Landmarks in Sequencing
Efficiency (bp/person/year) | Year | Event
 | 1870 | Miescher: Discovers DNA
 | 1940 | Avery: Proposes DNA as “Genetic Material”
 | 1953 | Watson & Crick: Double Helix Structure of DNA
1 | 1965 | Holley: Sequenced transfer RNA from Yeast
1,500 | 1977 | Maxam & Gilbert: “DNA sequencing by chemical degradation”; Sanger: “DNA sequencing with chain-terminating inhibitors”
 | 1980 | Messing: DNA cloning
15,000 | 1981 | Messing and colleagues developed the “shotgun sequencing” method
 | 1986 | Hood et al.: Partial Automation
 | 1987 | ABI markets the first sequencing platform, ABI 370
[Figure: Human Genome]
Next-Gen Sequencing Platforms
• 454/Roche GS-20/FLX (2005)
• Illumina HiSeq (2007)
• PacBio RS (2009-2010): 3rd generation?
Comparison of NGS Platforms
Technology | Reads per run | Average Read Length | bp per run | Errors
454 (Roche) | 400,000 | 280-1000 bp | 70 Million | Substitution
SOLiD (ABI) | 88-132 Million | 35 bp | 1 Billion |
Illumina HiSeq | 150 Million | 100 bp | 15 Billion | Substitution
[illegible] | | 1000-2000 bp | 45 Million | Insertions and deletions
Figure note: ~2009, with exponential increase
New Sequencing Technology
• Ion Torrent semiconductor sequencing: based on the detection of hydrogen ions that are released during the polymerisation of DNA, as opposed to the optical methods used in other sequencing systems.
• DNA nanoball sequencing: uses rolling circle replication to amplify small fragments of genomic DNA into DNA nanoballs.
• Nanopore DNA sequencing: the DNA is passed through a nanopore, which changes the pore's ion current. This change depends on the shape, size and length of the DNA sequence.
• Tunnelling currents DNA sequencing: measures the electrical tunnelling currents across single-strand DNA as it moves through a channel.

What can we do with sequenced genomes?
Catalog sequences
• Publish known sequences for a gene
• Map sequences to chromosomes
• Map and report unknown sequences
• Share information about sequenced DNA

Annotation to Identify Gene Sequences
• Gene banks such as GenBank acquire, store and share data from researchers.
• GenBank is maintained by the National Center for Biotechnology Information (NCBI).
• NCBI exchanges data with Japan and Europe, and stores and shares >100 billion bases of sequence.
• Basic Local Alignment Search Tool (BLAST search)

BLAST results for a 280 bp contig from rat aligned against a mouse chromosome, for an insulin receptor gene

Functional Genomics
Functional genomics attempts to identify potential functions of genes and other elements of the genome.
Tools: homology to other organisms (homologous genes)
• Orthologs: homologs in different species descended from the same ancestral gene
• Paralogs: homologs within the same species

Homologs of a Mouse and Human Gene (LEP)
Figure 18-5. Sequence similarity between homologs from mouse and human genes.
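Sequence similarity of the kind shown for homologs, and searched for by BLAST, rests on local alignment. The sketch below computes a Smith-Waterman local alignment score, the exact dynamic-programming method that BLAST approximates with faster heuristics. It is not BLAST itself. The function name, the scoring values (match +2, mismatch -1, gap -2), and the toy sequences are assumptions chosen for illustration, not values from the slides.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences (invented; not the real LEP genes). They share the block "ACGT".
print(local_alignment_score("TTACGTTT", "GGACGTGG"))  # 8: four matches at +2 each
```

A higher score relative to sequence length indicates stronger similarity. Database search tools additionally estimate how likely such a score is to arise by chance.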
Human Genome Project: 1990 to 2003
Project goals were to:
• identify all genes in human DNA
• determine the sequences of the 3 billion+ chemical base pairs in the human genome
• store this information in databases
• improve tools for data analysis
• transfer related technologies to the private sector
• address the ethical, legal, and social issues (ELSI) that may arise from the project.
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
[Figure: annotated gene map. Legible labels include: neurotrophin 3; cyclin-dependent kinase inhibitor; hypothetical protein; chromosome 12 open reading frame; protein kinase, AMP-activated; Fas apoptotic inhibitory molecule 2; activin A receptor; heat shock 70 kDa protein; transmembrane protein.]
[Figure: chromosome 21 (50 million bases), with associated genes and disorders: Coxsackie and adenovirus receptor; myeloproliferative syndrome, transient; amyloidosis, cerebroarterial, Dutch type; leukemia, transient, of Down syndrome; enterokinase deficiency; multiple carboxylase deficiency; T-cell lymphoma invasion and metastasis; mycobacterial infection, atypical.]
You can have your genome sequenced!
Features
• High molecular weight DNA (50–250 kb)
• Isolated from the whole blood of disease-free sources
How does it work?
High-quality genomic DNA is isolated from the whole blood of disease-free sources that contain no glycogen, to obtain highly pure, intact genomic DNA.
Cost: free to $3,000. The Harvard Personal Genome Project will sequence your genome for free.

Omics Revolution
• Proteomics
• Metabolomics
• Glycomics
• Toxicogenomics
• Metagenomics
• Pharmacogenomics
• Transcriptomics
• Nutrigenomics
• Others?

Microarrays are widely used to measure mRNA expression

Proteomics
Proteome: the complete set of proteins encoded by a genome.
Proteomics: the study of the complete set of proteins expressed and modified during a cell’s entire lifetime.

In a typical proteomic analysis, cells are exposed to two different conditions (such as growth conditions, drugs, or hormones). After treatment, proteins are extracted and separated by two-dimensional gel electrophoresis (2DGE). The patterns of spots are then compared for evidence of differential gene expression. Spots of interest are cut out from the gel, digested into peptide fragments, and analyzed by mass spectrometry to identify the protein in the spot.

Figure 18-16. The use of mass spectrometry for identifying an unknown protein isolated from a 2D gel.

In Humans
There are ~25,000 genes that code for ~100,000 to 200,000 proteins. In eukaryotic organisms there are many more proteins than genes.

Conclusions
1. There are many more proteins than genes.
2. Gene expression and posttranslational processing allow for the production of complex three-dimensional proteins.
3. Only a portion of the genes are expressed under optimum conditions.
4. There are many genes that are only expressed under certain environments or during certain cell functions.
5. Genes regulate all protein synthesis.
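The mass-spectrometry step described in the proteomics slides (a gel spot is digested into peptides, and the peptide masses are used to identify the protein) can be sketched as a simple peptide mass fingerprint. This is a minimal illustration under stated assumptions: the slides do not name the protease, so trypsin is assumed here (cleavage after K or R, except before P). The function names, the tolerance, and the two candidate sequences are hypothetical. The residue masses are standard monoisotopic values.

```python
# Monoisotopic amino acid residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(protein: str) -> list[str]:
    """Assumed trypsin rule: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        at_end = i == len(protein) - 1
        if residue in "KR" and not at_end and protein[i + 1] != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    peptides.append(protein[start:])  # final peptide up to the C-terminus
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_matches(observed: list[float], protein: str, tolerance: float = 0.01) -> int:
    """Number of observed masses explained by the protein's predicted peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(m - t) <= tolerance for t in theoretical) for m in observed)


def identify(observed: list[float], candidates: dict[str, str],
             tolerance: float = 0.01) -> str:
    """Name of the candidate protein that explains the most observed masses."""
    return max(candidates, key=lambda name: count_matches(observed, candidates[name], tolerance))


# Hypothetical candidate database and simulated measurements from one gel spot.
candidates = {"protA": "AKRPGKM", "protB": "GGGKAAAR"}
observed = [peptide_mass("AK"), peptide_mass("RPGK")]
print(tryptic_digest("AKRPGKM"))       # ['AK', 'RPGK', 'M']
print(identify(observed, candidates))  # protA
```

Real identification software also accounts for ion charge states, missed cleavages, and posttranslational modifications, which shift peptide masses and are one reason there are more distinct proteins than genes.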