Introduction to Bioinformatics, Lecture notes of Bioinformatics

Notes for Lecture 1 of Introduction to Bioinformatics

Typology: Lecture notes

2016/2017
Uploaded on 11/21/2017

dr-maqsood-hayat
Partial preview of the text

CENG 465 Introduction to Bioinformatics
Spring 2011-2012
Tolga Can (Office: B-109)
e-mail: tcan@ceng.metu.edu.tr
Course Web Page: http://www.ceng.metu.edu.tr/~tcan/ceng465_s1112/

Goals of the course
• Working at the interface of computer science and biology
– New motivation
– New data and new demands
– Real impact
• Introduction to main problems in bioinformatics
• Opportunity to interact with algorithms, tools, data in current practice

Course outline
• Protein structures (4 weeks)
– Structure prediction (secondary, tertiary)
– Analyze protein structures for clues regarding function
• Structure alignment
• Microarray data analysis (2 weeks)
– Correlations, clustering
• Gene/Protein networks, pathways (2 weeks)
– Protein-protein, protein/DNA interactions
– Construction and analysis of large scale networks

Grading
• Midterm exam - 30%
• Final exam - 30%
• Assignments (programming) - 40%

Miscellaneous
• Course webpage
– http://www.ceng.metu.edu.tr/~tcan/ceng465_s1112/
– Lecture slides and reading materials
– Assignments
• Newsgroup
– metu.ceng.course.465
– You should follow the newsgroup for course related announcements
– Students from other departments should get a CENG account for this semester (Room: A-210) in order to access the newsgroup
– Teaching assistant: Itir Onal (itir@ceng, BZ19)

Computing versus Biology
• what computer science is to molecular biology is like what mathematics has been to physics ...... -- Larry Hunter, ISMB’94
• molecular biology is (becoming) an information science ....... -- Leroy Hood, RECOMB’00
• bioinformatics ... is the research domain focused on linking the behavior of biomolecules, biological pathways, cells, organisms, and populations to the information encoded in the genomes -- Temple Smith, Current Topics in Computational Molecular Biology

Computing versus Biology: looking into the future
• Like physics, where general rules and laws are taught at the start, biology will surely be presented to future generations of students as a set of basic systems ....... duplicated and adapted to a very wide range of cellular and organismic functions, following basic evolutionary principles constrained by Earth’s geological history. -- Temple Smith, Current Topics in Computational Molecular Biology

Introductory Biology
[Figure: DNA (Genotype), Protein, Phenotype]
[Figure: cell diagram with labels Nucleus, Nucleolus, Mitochondrion, Cytoskeleton, Ribosomes, ... apparatus, Rough endoplasmic reticulum. ©2001 Sinauer Associates, Inc.]

Two kinds of Cells
• Prokaryotes – no nucleus (bacteria)
– Their genomes are circular
• Eukaryotes – have nucleus (animals, plants)
– Linear genomes with multiple chromosomes in pairs. When pairing up, they look like [Figure: chromosome. Middle: centromere; Top: p-arm; Bottom: q-arm]

Molecular Biology Information - DNA
atggcaattaaaattggtatcaatggttttggtcgtatcggccgtatcgtattccgtgca
gcacaacaccgtgatgacattgaagttgtaggtattaacgacttaatcgacgttgaatac
atggcttatatgttgaaatatgattcaactcacggtcgtttcgacggcactgttgaagtg
aaagatggtaacttagtggttaatggtaaaactatccgtgtaactgcagaacgtgatcca
gcaaacttaaactggggtgcaatcggtgttgatatcgctgttgaagcgactggtttattc
ttaactgatgaaactgctcgtaaacatatcactgcaggcgcaaaaaaagttgtattaact
ggcccatctaaagatgcaacccctatgttcgttcgtggtgtaaacttcaacgcatacgca
ggtcaagatatcgtttctaacgcatcttgtacaacaaactgtttagctcctttagcacgt
gttgttcatgaaactttcggtatcaaagatggtttaatgaccactgttcacgcaacgact
• Raw DNA Sequence
– Coding or Not?
– Parse into genes?
– 4 bases: AGCT
– ~1 Kb in a gene, ~2 Mb in genome
– ~3 Gb Human
gcaactcaaaaaactgtggatggtccatcagctaaagactggcgcggcggccgcggtgca
tcacaaaacatcattccatcttcaacaggtgcagcgaaagcagtaggtaaagtattacct
gcattaaacggtaaattaactggtatggctttccgtgttccaacgccaaacgtatctgtt
gttgatttaacagttaatcttgaaaaaccagcttcttatgatgcaatcaaacaagcaatc
aaagatgcagcggaaggtaaaacgttcaatggcgaattaaaaggcgtattaggttacact
gaagatgctgttgtttctactgacttcaacggttgtgctttaacttctgtatttgatgca
gacgctggtatcgcattaactgattctttcgttaaattggtatc
. . . . . .
caaaaatagggttaatatgaatctcgatctccattttgttcatcgtattcaa
caacaagccaaaactcgtacaaatatgaccgcacttcgctataaagaacacggcttgtgg
cgagatatctcttggaaaaactttcaagagcaactcaatcaactttctcgagcattgctt
gctcacaatattgacgtacaagataaaatcgccatttttgcccataatatggaacgttgg
gttgttcatgaaactttcggtatcaaagatggtttaatgaccactgttcacgcaacgact
acaatcgttgacattgcgaccttacaaattcgagcaatcacagtgcctatttacgcaacc
aatacagcccagcaagcagaatttatcctaaatcacgccgatgtaaaaattctcttcgtc
ggcgatcaagagcaatacgatcaaacattggaaattgctcatcattgtccaaaattacaa
aaaattgtagcaatgaaatccaccattcaattacaacaagatcctctttcttgcacttgg

Structure summary
• 3-d structure determined by protein sequence
• Cooperative and progressive stabilization
• Prediction remains a challenge
– ab-initio (energy minimization)
– knowledge-based
• Chou-Fasman and GOR methods for SSE prediction
• Comparative modeling and protein threading for tertiary structure prediction
• Diseases caused by misfolded proteins
– Mad cow disease
• Classification of protein structures

Genes and Proteins
• One gene encodes one* protein.
• Like a program, it starts with a start codon (e.g. ATG), then each group of three bases codes for one amino acid. Then a stop codon (e.g. TGA) signifies the end of the gene.
• Sometimes, in the middle of a (eukaryotic) gene, there are introns that are transcribed and then spliced out of the RNA (as junk). The parts that remain are called exons. Locating genes and their exon boundaries is the task of gene finding.

A.A. Coding Table
(* stands for any of the four bases; AT(*-G) means ATT, ATC, ATA)
Glycine (GLY)         GG*
Alanine (ALA)         GC*
Valine (VAL)          GT*
Leucine (LEU)         CT*, TTA, TTG
Isoleucine (ILE)      AT(*-G)
Serine (SER)          TC*, AGT, AGC
Threonine (THR)       AC*
Aspartic Acid (ASP)   GAT, GAC
Glutamic Acid (GLU)   GAA, GAG
Lysine (LYS)          AAA, AAG
Arginine (ARG)        CG*, AGA, AGG
Asparagine (ASN)      AAT, AAC
Glutamine (GLN)       CAA, CAG
Cysteine (CYS)        TGT, TGC
Methionine (MET)      ATG
Phenylalanine (PHE)   TTT, TTC
Tyrosine (TYR)        TAT, TAC
Tryptophan (TRP)      TGG
Histidine (HIS)       CAT, CAC
Proline (PRO)         CC*
Start                 ATG, CTG, GTG
Stop                  TGA, TAA, TAG

Project

Gene Expression Datasets: the Transcriptome
• Young/Lander, Chips, Abs. Exp.
• Brown, µarray, Rel. Exp. over Timecourse
• Snyder, Transposons, Protein Exp.
• Also: SAGE; Samson and Church, Chips; Aebersold, Protein Expression

Other Whole-Genome Experiments
• Systematic Knockouts
– Winzeler, E. A., Shoemaker, D. D., Astromoff, A., Liang, H., Anderson, K., Andre, B., Bangham, R., Benito, R., Boeke, J. D., Bussey, H., Chu, A. M., Connelly, C., Davis, K., Dietrich, F., Dow, S. W., El Bakkoury, M., Foury, F., Friend, S. H., Gentalen, E., Giaever, G., Hegemann, J. H., Jones, T., Laub, M., Liao, H., Davis, R. W., et al. (1999). Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 285, 901-6
• 2 hybrids, linkage maps
– Hua, S. B., Luo, Y., Qiu, M., Chan, E., Zhou, H. & Zhu, L. (1998). Construction of a modular yeast two-hybrid cDNA library from human EST clones for the human genome protein linkage map. Gene 215, 143-52
– For yeast: 6000 x 6000 / 2 ~ 18M interactions

Human genome
• Genes and gene-related sequences: 900 Mb
– Coding DNA: 90 Mb
– Noncoding DNA: 810 Mb (pseudogenes; gene fragments; introns, leaders, trailers)
• Extragenic DNA: 2100 Mb
– Unique and low-copy number: 1680 Mb
– Repetitive DNA: 420 Mb
– Tandemly repeated (non-coding tandem repeats): satellite DNA, minisatellites, microsatellites
– Dispersed (genome-wide interspersed repeats): LTR elements, LINEs, SINEs, DNA transposons
• Other labels on the figure: single-copy genes, multi-gene families, regulatory sequences

Where to get data?
• GenBank
– http://www.ncbi.nlm.nih.gov
• Protein Databases
– SWISS-PROT: http://www.expasy.ch/sprot
– PDB: http://www.pdb.bnl.gov/
• And many others

Data
• Diversity and size of information
– Sequences, 3-D structures, microarrays, protein interaction networks, in silico models, bio-images
• Understand the relationship
– Similar to complex software design
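
The reading rule from the Genes and Proteins slide (start codon, then one amino acid per three bases, then a stop codon) can be sketched in a few lines of Python. This is only an illustration, not code from the course: the names `CODON_TABLE`, `translate` and `find_orf` are hypothetical, the table is assumed to be the standard genetic code with one-letter amino acid symbols, only ATG is treated as a start codon, and only the forward strand is scanned.

```python
# Standard genetic code, built in TCAG order (assumption: the standard
# table applies; "*" marks a stop codon).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # first base T
         "LLLLPPPPHHQQRRRR"   # first base C
         "IIIMTTTTNNKKSSRR"   # first base A
         "VVVVAAAADDEEGGGG")  # first base G
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(dna):
    """Translate DNA codon by codon from position 0, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for unknown codons
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


def find_orf(dna):
    """Return the first ATG...stop stretch on the forward strand, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return None
    for i in range(start, len(dna) - 2, 3):
        if CODON_TABLE.get(dna[i:i + 3]) == "*":
            return dna[start:i + 3]  # include the stop codon
    return None  # no in-frame stop codon found


if __name__ == "__main__":
    first_line = "atggcaattaaaattggtatcaatggttttggtcgtatcggccgtatcgtattccgtgca"
    print(translate(first_line))
    print(find_orf("ccATGGCATGAtt"))
```

The example input is the first 60 bases of the raw sequence shown on the DNA slide. Real gene finding is much harder than this sketch suggests: it has to consider both strands, all three reading frames, alternative start codons, and introns in eukaryotic genes.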