Lecture Notes on Protein Modeling - Databases and Software | MCB 221B, Study notes of Biology

Material Type: Notes; Class: Mechanistic Enzymology; Subject: Molecular and Cellular Biology; University: University of California - Davis; Term: Summer 2007;

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-urv

At the beginning, there were thoughts, and observation…

MCB-221b lecture 11
Protein Modeling: ‘databases and software’
http://koehllab.genomecenter.ucdavis.edu/

• Data are used to formulate new hypotheses
• Data are stored and disseminated via databases of information, allowing open access to the records held within them (bioinformatics)
• Designing novel algorithms and methods of analysis helps solve biological problems (computational biology)

From hypothesis-driven to exploratory data analysis:

Is there a danger, in molecular biology, that the accumulation of data will get so far ahead of its assimilation into a conceptual framework that the data will eventually prove an encumbrance?
John Maddox, 1988

[Figure labels: Genome DBs; Protein structure DBs; algorithms: classifications (CATH etc.); sequencing; crystallization, NMR; linear aa; predictions (CASP); validations]

• Classical tool in biology: it is easier to think about a representative than to embrace the information of all individuals
  - Aristotle: Plants and Animals
  - Linnaeus: binomial system
  - Darwin: systematic classification that reveals phylogeny

Protein Structure Classification
• Clustering
• Domain Definition
• 3 Major classifications
  - SCOP
  - CATH
  - DDD
• Differences

Delineating protein domains: Looking at secondary structure
Authors: Sowdhamini and Blundell, Protein Science, 4:506 (1995)
Definition of a domain: a cluster of secondary structures.
Method: clustering of the secondary structures in a protein.
“Distance” between secondary structures:

Delineating protein domains: a bottom-up procedure
Author: W.R. Taylor, Protein Engineering, 12, 203-216 (1999)
Idea: classical methods for defining protein domains start from a hypothesis (a definition of what a domain is) and check how well the data verify that hypothesis.
Protein Structural Domains

Protein domain: various definitions exist
1) Regions that display significant levels of sequence similarity
2) The minimal part of a gene that is capable of performing a function
3) A region of a protein with an experimentally assigned function
4) A region of a protein structure that recurs in different contexts and proteins
5) A compact, spatially distinct region of a protein

Why do proteins fold?
[Figure labels: Unfolded State; Folded State]
• Protein backbone is a linear chain
• Chain is self-avoiding
• Protein is closely packed
Amino acid preferences:
- inside (hydrophobic) / outside (hydrophilic)
- Specific interactions
- Interactions with solvent
- Interactions with ions
- concentration of proteins
[Figure labels: solvent; Backbone + Sidechain]

Protein tertiary structure: Packing

Helix-helix packing
Ω: angle between the 2 axes

Sheet-sheet packing
[Figure labels: 20 degrees between sheets; orthogonal]
• Parallel sheets tend to be covered by helices on both sides
• Anti-parallel sheets tend to have one side covered by a sheet: “sandwich-type” structure.
Two types of packing: aligned, or orthogonal
• Because the periodicities of helices and strands are different, there are no regular packing patterns.
• Helices tend to be on both sides of parallel beta sheets.
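The packing angle Ω is described in these notes only as the angle between the 2 axes. As a minimal sketch of how such an angle could be computed, the Python below takes each axis as a direction vector (for example, from the start and end points of a helix axis) and returns the unsigned angle between them. The function names are hypothetical, and the unsigned angle is a simplifying assumption: the notes do not say whether Ω carries a sign or how the axes are fitted.

```python
import math


def unit(vector):
    """Return the vector scaled to length 1."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0.0:
        raise ValueError("axis vector has zero length")
    return tuple(c / norm for c in vector)


def axis_from_endpoints(start, end):
    """Direction of an axis given two points on it (hypothetical helper)."""
    return unit(tuple(e - s for s, e in zip(start, end)))


def packing_angle(axis_a, axis_b):
    """Unsigned angle, in degrees, between two axis direction vectors.

    Assumption: Omega is treated as a plain angle between directions;
    no sign convention is applied.
    """
    a, b = unit(axis_a), unit(axis_b)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))


if __name__ == "__main__":
    # Two made-up axes: one along x, one tilted 20 degrees away in the xy plane.
    helix_1 = axis_from_endpoints((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))
    tilt = math.radians(20.0)
    helix_2 = (math.cos(tilt), math.sin(tilt), 0.0)
    print(round(packing_angle(helix_1, helix_2), 1))
```

The coordinates in the usage example are invented for illustration and do not come from any structure in the lecture.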
Helix-sheet packing

Protein tertiary structure: Architecture classes

alpha folds
- lone helix
- helix-turn-helix
- four helix bundle
[Structures shown: Glucagon; RNA binding protein; Myohemerythrin; RNA binding protein dimer]

beta folds
- beta sandwich (FA binding protein)
- Greek key topology (5-13 strands)
- Jellyroll topology (a Greek key with an extra swirl)
- beta propeller
- beta helix

Classification of Protein Structure: CATH
[Figure labels: C (Class): Alpha, Mixed Alpha Beta, Beta; A (Architecture): Sandwich, Barrel, Super Roll; T (Topology): TIM Barrel, Other Barrel]
http://www.cathdb.info/latest/index.html

Classification of Protein Structure: SCOP
SCOP is organized into 4 hierarchical layers:
1) Classes: similar to CATH
   alpha, beta, alpha/beta, alpha+beta, multi-domain proteins with alpha and beta, membrane and cell surface proteins, small proteins, coiled coils, low resolution proteins
2) Folds: Major structural similarity
   Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections
3) Superfamily: Probable common evolutionary origin
   Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable, are placed together in superfamilies
4) Family: Clear evolutionary relationship
   Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater
http://scop.mrc-lmb.cam.ac.uk/scop/
http://scop.berkeley.edu/

Classification of Protein Structure: SCOP
SCOP: Structural Classification of Proteins. 1.69 release
25973 PDB Entries (1 Oct 2004). 70859 Domains.
1 Literature Reference (excluding nucleic acids and theoretical models)

Class                              | Number of folds | Number of superfamilies | Number of families
All alpha proteins                 | 218             | 376                     | 608
All beta proteins                  | 144             | 290                     | 560
Alpha and beta proteins (a/b)      | 136             | 222                     | 629
Alpha and beta proteins (a+b)      | 279             | 409                     | 717
Multi-domain proteins              | 46              | 46                      | 61
Membrane and cell surface proteins | 47              | 88                      | 99
Small proteins                     | 75              | 108                     | 171
Total                              | 945             | 1539                    | 2845

SCOP: Structural Classification of Proteins. 1.71 release
27599 PDB Entries (18 Jan 2005). 75930 Domains. 1 Literature Reference (excluding nucleic acids and theoretical models)

Class                              | Number of folds | Number of superfamilies | Number of families
All alpha proteins                 | 226             | 392                     | 645
All beta proteins                  | 149             | 300                     | 594
Alpha and beta proteins (a/b)      | 134             | 221                     | 661
Alpha and beta proteins (a+b)      | 286             | 424                     | 753
Multi-domain proteins              | 48              | 48                      | 64
Membrane and cell surface proteins | 49              | 90                      | 101
Small proteins                     | 79              | 114                     | 186
Total                              | 971             | 1589                    | 3004

Protein Structure Comparison
The protein structure is a 3D shape: the goal is to find algorithms that find the optimal match between two shapes.
• Global versus local alignment
• Measuring protein shape similarity
• Protein structure superposition
• Protein structure alignment

[Figure: global alignment; local alignment (motif)]

Protein Structure Prediction
• One popular model for protein folding assumes a sequence of events:
  - Hydrophobic collapse
  - Local interactions stabilize secondary structures
  - Secondary structures interact to form motifs
  - Motifs aggregate to form tertiary structure

Protein Structure Prediction
A physics-based approach:
- find the conformation of the protein corresponding to a thermodynamic minimum (free energy minimum)
- cannot minimize internal energy alone! Needs to include solvent
- simulate folding… a very long process!
Folding times are in the millisecond to second range.
Folding simulations at best run 1 ns in one day…

The CASP experiment
• CASP = Critical Assessment of Structure Prediction
• Started in 1994 (Moult, Pedersen, Judson, Fidelis, Proteins, 23:2-5 (1995))
• First run in 1994; now runs regularly every second year (CASP6 was held last December)
1) Sequences of target proteins are made available to CASP participants in June-July of a CASP year
   - the structure of the target protein is known, but not yet released in the PDB, or even accessible
2) CASP participants have between 2 weeks and 2 months over the summer of a CASP year to generate up to 5 models for each of the targets they are interested in.
3) Model structures are assessed against the experimental structure
4) CASP participants meet in December to discuss results

Homology Modeling: Practical guide
Approach 1: manually (BLAST, then a range of steps you’d need to learn)
Approach 2: submit the target sequence to automatic servers
- Fully automatic:
  - 3D-Jigsaw: http://www.bmm.icnet.uk/servers/3djigsaw/
  - EsyPred3D: http://www.fundp.ac.be/urbm/bioinfo/esypred/
  - SwissModel: http://swissmodel.expasy.org//SWISS-MODEL.html
- Fold recognition:
  - PHYRE: http://www.sbg.bio.ic.ac.uk/~phyre/
- Useful sites:
  - Meta server: http://bioinfo.pl/Meta
  - PredictProtein: http://cubic.bioc.columbia.edu/predictprotein/

Small proteins can be de novo predicted
At least, about 50% at < 5 Å
[Figure labels: Very good; Poor (caught in a local free energy minimum)]
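The notes say that models are assessed against the experimental structure and quote a "< 5 Å" threshold, but they do not name the measure used. One common choice is the root-mean-square deviation (RMSD) over equivalent atoms, and the sketch below assumes that measure. It also assumes the two coordinate sets are already superposed and list the same atoms in the same order; a real assessment would first compute an optimal superposition, which is omitted here. The function name and the coordinates are hypothetical.

```python
import math


def rmsd(model, reference):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumptions: atoms are paired by position in the lists, and the two
    structures have already been superposed; no fitting is done here.
    """
    if len(model) != len(reference):
        raise ValueError("coordinate lists must have the same length")
    if not model:
        raise ValueError("coordinate lists must not be empty")
    squared = sum(
        (a - b) ** 2
        for atom_m, atom_r in zip(model, reference)
        for a, b in zip(atom_m, atom_r)
    )
    return math.sqrt(squared / len(model))


if __name__ == "__main__":
    # Made-up coordinates in angstroms, for illustration only.
    experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    predicted = [(0.0, 1.0, 0.0), (3.8, 1.5, 0.0), (7.6, 2.0, 0.0)]
    value = rmsd(predicted, experimental)
    print(f"RMSD = {value:.2f} A, below 5 A: {value < 5.0}")
```

With a measure like this, a statement such as "about 50% at < 5 Å" would correspond to counting the fraction of targets whose model falls under the threshold.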