
bioinformatics lecture note, Lecture notes of Biology

A lecture note introducing bioinformatics

Typology: Lecture notes

2023/2024

Uploaded on 06/28/2024

bah-14

Introduction to Bioinformatics

Learning Objectives
• Identity, similarity, homology
• Analyze sequence similarity by dotplots (window/stringency)
• Alignment of text strings by edit distance
• Scoring of aligned amino acids
• Gap penalties
• Global vs. local alignment
• Dynamic Programming (Smith-Waterman)
• FASTA method

• The term ‘Bioinformatics’ was first introduced in 1970 by Paulien Hogeweg to refer to the study of information processes in biotic systems. While some people define the term as “An integration of computer, mathematical and statistical methods to manage and analyze biological information”, others view it as “The field of science in which Biology, Computer Science, and Information technology merge into a single discipline”.
• In the present-day context, Bioinformatics involves the use of techniques from applied mathematics, informatics, statistics, computer science, chemistry and biochemistry to solve biological problems, usually at the molecular level. Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution.

Goals
• The ultimate goal of bioinformatics is to better understand a living cell and how it functions at the molecular level.
• By analyzing raw molecular sequence and structural data, bioinformatics research can generate new insights and provide a “global” perspective of the cell.
• The functions of a cell can be better understood by analyzing sequence data because the flow of genetic information is dictated by the “central dogma” of biology: DNA is transcribed to RNA, which is translated to proteins.
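The flow of information described by the central dogma can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is treated as the coding strand (so transcription is just T to U), the standard genetic code is used, translation stops at the first stop codon, and the DNA string and the function names `transcribe` and `translate` are made up for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# for each of the three codon positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(dna):
    """DNA coding strand -> mRNA (assumption: input is the coding strand)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """mRNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # illustrative sequence
rna = transcribe(dna)
print(rna)
print(translate(rna))  # MAIVMGR
```

Real genes need more care (strand choice, reading frame, introns, alternative genetic codes); the sketch only shows the direction of information flow.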
Scope of Bioinformatics
1. Bioinformatics helps to create electronic databases of genome and protein sequences, from single-celled to multicellular organisms.
2. It provides techniques by which three-dimensional models of biomolecules can be understood along with their structure and function.
3. It integrates mathematical, statistical and computational methods to analyse biological, biochemical and biophysical data.
4. Bioinformatics deals with methods for storing, retrieving and analysing biological data such as nucleic acid (DNA/RNA) and protein sequences, structures, functions, pathways and genetic interactions.
5. The computational methods in bioinformatics extend information for probing not only at the genome or protein level but up to the whole-organism or ecosystem level of organization.
6. It provides genome-level data for understanding normal biological processes and explains the malfunctioning of genes, leading to the diagnosis of diseases and the design of new drugs.

Important sub-disciplines within bioinformatics (NCBI)
• “Development of new algorithms and statistics with which to assess relationships among members of large data sets
• Analysis and interpretation of various types of data including nucleotide and amino acid sequences, protein domains, and protein structures
• Development and implementation of tools that enable efficient access and management of different types of information”
• Not all biological computing is bioinformatics; e.g., mathematical modelling is not bioinformatics, even when connected with biology-related problems.

[Figure: computing areas connected to bioinformatics: computer networks, computational theory, databases, software engineering, artificial intelligence, bio-inspired computing, graphic computing, image processing, parallel computing, optimization, the Internet, and data structures]

• Bioinformatics comprises three components.
1. Creation of databases: This involves the organizing, storage and management of the biological data sets.
The databases are accessible to researchers, who can consult the existing information and submit new entries, e.g., protein sequence data for molecular structure.
2. Development of algorithms and statistics: This involves the development of tools and resources to determine the relationships among the members of large data sets, e.g., comparison of protein sequence data with already existing protein sequences.
3. Analysis of data and interpretation: The appropriate use of components 1 and 2 to analyze data and interpret the results in a biologically meaningful manner. The data include DNA, RNA and protein sequences, protein structures, gene expression profiles and biochemical pathways.

Gene therapy
Gene therapy is a process through which genetic material is incorporated into unhealthy cells in order to treat, cure or prevent disease. It is a novel form of drug delivery that enlists the synthetic machinery of the patient’s cells to produce a therapeutic agent. It involves the efficient introduction of functional genes into the appropriate cells of the patient in order to produce a sufficient amount of the protein encoded by the transferred gene. The strategies are gene addition, removal of a harmful gene, and control of gene expression. In the not too distant future, with the use of bioinformatics tools, the potential for using genes themselves to treat disease may become a reality.

Waste Clean up
Another important application of bioinformatics is in waste clean up. Here, the primary objective is to sequence and assess the DNA of bacteria and other microbes in order to use them for sewage cleaning, removing radioactive waste, clearing oil spills, etc. The bacterium Deinococcus radiodurans has the ability to repair damaged DNA and small chromosome fragments by isolating the damaged segments in a concentrated area. Genes from other bacteria have been inserted into D. radiodurans for environmental cleanup.
Deinococcus radiodurans is known as the world’s toughest bacterium and is one of the most radiation-resistant organisms known. Scientists are interested in this organism because of its potential usefulness in cleaning up waste sites that contain radiation and toxic chemicals.

Crop Improvement
Bioinformatics plays a significant role in the development of the agricultural sector, agro-based industries, the utilization of agricultural by-products, and better management of the environment. It makes effective use of proteomic, metabolomic, genetic, and crop-production data to develop stronger, more drought-resistant, and insect-resistant crops, thereby also enhancing the quality of livestock and making them disease resistant.

Health
• Disease prevention:
  - Detect people at risk
  - Change of lifestyle, diet… e.g. risk of cardiovascular diseases: exercise…
  - Study virus evolution, e.g. bird flu virus
• Treatment:
  - Quantitative evaluation of disease spread
  - Rational drug design, e.g. first efficient drug against HIV (Norvir, 1996)
  - Gene therapy, e.g. “bubble” kids with no immune system
  - Animal models, e.g. zebra fish is the new mouse

Aim of bioinformatics
“To improve the quality of life” by understanding how it works

Forensic (DNA fingerprints)
• Criminal suspects (UK: database of 3M people)
• Paternity tests
• Identification of victims (Titanic, earthquakes…)
• Prevent illegal trade (drugs, ivory…)

Paleoanthropology & archaeology
• Human evolution, e.g. where is the first American from?

Food industry
• GMOs (Genetically Modified Organisms): famine buster or Frankenfood?
Other applications

[Figure: Bioinformatics problems/applications arranged by increasing complexity, from genes, gene structures, proteins, RNAs, assembled and complete genomes (predicting protein sequence and function, sequence variation of populations, reconstructing phylogeny, homology and comparative approaches), through predicting three-dimensional structures of proteins and RNAs, modeling structures of multi-molecular complexes, molecular dynamics, biochemical and signal transduction pathways, gene expression networks, and simulation of metabolic, cellular and developmental processes, up to morphogenesis and development, tissue and organismal physiology, and ecological processes and populations]

Bioinformatics Flow Chart (1)
1a. Sequencing
  - Base calling
  - Physical mapping
  - Fragment assembly
1b. Analysis of nucleic acid seq.
  - Gene finding (stretch of DNA coding for protein)
  - Multiple seq alignment → evolutionary tree
  - Analysis of noncoding region of genome
2. Analysis of protein seq.
  - Sequence relationship
3. Molecular structure prediction
  - 3D modeling; DNA, RNA, protein, lipid/carbohydrate
4. Molecular interaction
  - Protein-protein interaction
  - Protein-ligand interaction
5. Metabolic and regulatory networks

Bioinformatics Flow Chart (2)
6. Gene & Protein expression data
  - EST
  - DNA chip/microarray
7. Drug screening
  a) Lead compound binds tightly to binding site of target protein
  b) Lead optimization: lead compound modified to be nontoxic, few side effects, target deliverable
  - Ab initio drug design (drug molecules designed to be complementary to binding sites, with physicochemical and steric restrictions) OR drug compound screening in a database of molecules
8. Genetic variability
  - Now investigated at the genome scale
  - SNP, SAGE

Why is Bioinformatics Important?
• Application areas include
  - Medicine
  - Pharmaceutical drug design
  - Toxicology
  - Molecular evolution
  - Biosensors
  - Biomaterials
  - Biological computing models
  - DNA computing

Bioinformatics Topics: Informatics, Biology, Operating Systems, Programming, Statistics, Data Generation, Data Analysis

Programming
Rarely is there a need to become a truly proficient programmer.
BUT - Sufficient skill to effect basic management of large datasets is important.
AS IS - Sufficient skill to construct simple customised pipelines.
Python is currently the most popular Programming Language for Bioinformatics.
Minimal programming skill levels would allow:
• The construction of small programs.
• The understanding of slightly larger programs.
• The ability to convey program specifications to a specialist.

Statistics
A basic understanding of Statistics is just as vital when designing an experiment (https://en.wikipedia.org/wiki/Ronald_Fisher) as it is when large datasets need to be interpreted, which sensibly demands a working familiarity with a quality Statistical Package.
Bioinformatics software commonly employs statistics to select the most probable answer from a set of many possible answers to a given question.

Data Generation
Experimental data types include:
• Sequences - Typically Next-Generation DNA Sequencing (NGS).
https://www.ebi.ac.uk/training/online/course/ebi-next-generation-sequencing-practical-course/what-you-will-learn/what-next-generation-dna-

Data Analysis
The Alignment of Families of Homologous sequences.
First, find a family of Homologous sequences.
Then, align by inserting “-”s, representing InDels, in each sequence.
Next, identify the columns where Substitutions and/or InDels have been predicted.

Searching for Homologous Sequences in a Sequence Database
• Database searching is the most common Bioinformatics process by far.
• Database searching is pairwise comparison repeated many times.
• Non-optimal comparison methods are essential for practical reasons.
• Database searching seeks “Similarity”. Users seek “Homology”.

Searching for simple sequence patterns in DNA
• Largely a matter of finding short sequences within longer ones.
• Computationally trivial.

Restriction Mapping
• Few Recognition Sites can be simply defined using only the codes A, C, G and T.
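Finding short sequences within longer ones can be sketched with Python's standard `re` module. The example below is a minimal illustration, not a restriction-mapping tool: the DNA string is made up, the function name `find_sites` is hypothetical, and only one strand is searched. It uses two well-known recognition sites: EcoRI (GAATTC), which needs only the four base codes, and HincII (GTYRAC), which is written with IUPAC ambiguity codes (Y = C or T, R = A or G).

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(dna, site):
    """Return the 0-based start of every match of `site` in `dna`."""
    pattern = "".join(IUPAC[code] for code in site.upper())
    # The lookahead (?=...) reports overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", dna.upper())]


dna = "TTGAATTCAAGTCGACTTGTTAACGG"  # made-up sequence for illustration
print(find_sites(dna, "GAATTC"))  # EcoRI  -> [2]
print(find_sites(dna, "GTYRAC"))  # HincII -> [10, 18]
```

The second search matches both GTCGAC and GTTAAC, which is exactly what an ambiguity code is meant to express.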
• Detecting Restriction Enzyme Recognition Sites is complicated by their redundancy.
https://en.wikipedia.org/wiki/Restriction_map
https://www.neb.com/tools-and-resources/selection-charts/alphabetized-list-of-recognition-specificities

Searching for Protein properties with better models
A variety of simple models have been developed (e.g. Position Weight Matrices) for a number of purposes, including:
  - Gene discovery in bacterial genomes (DNA)
  - TATA box detection (DNA)
  - Early versions of 2D protein structure prediction
  - Helix-Turn-Helix (HTH) prediction
  - Transmembrane alpha helix prediction
  - Prediction of coiled coils
https://en.wikipedia.org/wiki/Transmembrane_domain
https://en.wikipedia.org/wiki/TATA_box
https://en.wikipedia.org/wiki/Helix-turn-helix
http://www.ch.embnet.org/software/COILS_form.html
The most powerful and prolific current profiles are Hidden Markov Models (HMMs).
https://en.wikipedia.org/wiki/Hidden_Markov_model

Estimating evolution - Phylogeny
http://www.dictionary.com/browse/phylogeny
Broadly, the estimation of evolutionary history from available evidence.
“Evidence” does not have to be a carefully crafted MSA of Orthologous sequences from a range of organisms. However, in the context of Bioinformatics, it invariably is.

Protein structure prediction
Secondary Structure.
The general principle is: the more information offered, the more reliable the prediction.
Better predictions are obtained from MSA data than from individual protein sequences.
Some systems will automatically generate an MSA if offered a solitary protein sequence. Prediction will then be based on the MSA, computed by iterative database searching.

Predicting Tertiary Structure directly from Primary Structure is not currently practical.
http://www.biology-online.org/dictionary/Primary_structure
Homology modelling requires a reliable Tertiary Structure for a homologous protein.
https://en.wikipedia.org/wiki/Homology_modeling
The Tertiary Structure of a protein is predicted by comparison with the homologous structure.
Homology modelling is hampered by the low volume and uneven spread of available structures.

And now … Once again … Your turn!
Some issues for consideration, discussion and reaction:
• The Bioinformatics topics mentioned here do not constitute a comprehensive list. What would you suggest is missing … in order of importance?
• The term algorithm was mentioned once or twice. There are slightly differing definitions. Pick the one you like best and justify your selection.
http://www.thefreedictionary.com/algorithm
• Define the three terms Homologue, Paralogue and Orthologue, being ever assiduous to ignore offensive American misspellings!
https://en.wikipedia.org/wiki/Homology_(biology)#Sequence_homology
http://homepage.usask.ca/~ctl271/857/def_homolog.shtml
http://classroom.synonym.com/difference-between-orthologous-paralogous-genes-18612.html

APPLICATIONS
• Medicine: Genomics and bioinformatics are now poised to revolutionize our healthcare system by developing personalized and customized medicine. High-speed genomic sequencing coupled with sophisticated informatics technology will allow a doctor in a clinic to quickly sequence a patient’s genome, easily detect potentially harmful mutations, and engage in early diagnosis and effective treatment of diseases.
• Agriculture: Plant genome databases and gene expression profile analyses have played an important role in the development of new crop varieties that have higher productivity and more resistance to disease.

LIMITATIONS
• In many ways, the role of bioinformatics in genomics and molecular biology research can be likened to the role of intelligence gathering in battlefields.
• Fighting a battle without intelligence is inefficient and dangerous.
• Overreliance on poor-quality intelligence can yield costly mistakes if not complete failures.
• The outcome of computation also depends on the computing power available. Many accurate but exhaustive algorithms cannot be used because of the slow rate of computation. Instead, less accurate but faster algorithms have to be used. This is a necessary trade-off between accuracy and computational feasibility. Therefore, it is important to keep in mind the potential for errors produced by bioinformatics programs.

Databases
• One of the hallmarks of modern genomic research is the generation of enormous amounts of raw sequence data.
• The very first challenge in the genomics era is to store and handle the staggering volume of information through the establishment and use of computer databases.
• A database is a computerized archive used to store and organize data in such a way that information can be retrieved easily via a variety of search criteria.
• The chief objective of the development of a database is to organize data in a set of structured records to enable easy retrieval of information.
• Each record, also called an entry, should contain a number of fields that hold the actual data items, for example, fields for names, phone numbers, addresses, dates.
• To retrieve a particular record from the database, a user can specify a particular piece of information, called a value, to be found in a particular field and expect the computer to retrieve the whole data record. This process is called making a query.

TABLE 2.1.
Major Biological Databases Available Via the World Wide Web

Databases and Retrieval Systems | Brief Summary of Content | URL
AceDB | Genome database for Caenorhabditis elegans | www.acedb.org
DDBJ | Primary nucleotide sequence database in Japan | www.ddbj.nig.ac.jp
EMBL | Primary nucleotide sequence database in Europe | www.ebi.ac.uk/embl/index.html
Entrez | NCBI portal for a variety of biological databases | www.ncbi.nlm.nih.gov/gquery/gquery.fcgi
ExPASY | Proteomics database | http://us.expasy.org/
FlyBase | A database of the Drosophila genome | http://flybase.bio.indiana.edu/
FSSP | Protein secondary structures | www.bioinfo.biocenter.helsinki.fi:8080/dali/index.html
GenBank | Primary nucleotide sequence database in NCBI | www.ncbi.nlm.nih.gov/Genbank
HIV databases | HIV sequence data and related immunologic information | www.hiv.lanl.gov/content/index
Microarray gene expression database | DNA microarray data and analysis tools | www.ebi.ac.uk/microarray
OMIM | Genetic information of human diseases | www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
PIR | Annotated protein sequences | http://pir.georgetown.edu/pirwww/pirhome3.shtml
PubMed | Biomedical literature information | www.ncbi.nlm.nih.gov/PubMed
Ribosomal database project | Ribosomal RNA sequences and phylogenetic trees derived from the sequences | http://rdp.cme.msu.edu/html
SRS | General sequence retrieval system | http://srs6.ebi.ac.uk
SWISS-Prot | Curated protein sequence database | www.ebi.ac.uk/swissprot/access.html
TAIR | Arabidopsis information database | www.arabidopsis.org

• Primary databases: There are three major public sequence databases that store raw nucleic acid sequence data produced and submitted by researchers worldwide: GenBank, the European Molecular Biology Laboratory (EMBL) database and the DNA Data Bank of Japan (DDBJ), which are all freely available on the Internet.
• Sequence submission to GenBank, EMBL, or DDBJ is a precondition for publication in most scientific journals, to ensure that the fundamental molecular data are made freely available.
• Together they constitute the International Nucleotide Sequence Database Collaboration.
• Primary databases: For the three-dimensional structures of biological macromolecules, there is only one centralized database, the PDB.
• This database archives atomic coordinates of macromolecules (both proteins and nucleic acids) determined by x-ray crystallography and NMR. It uses a flat file format to represent protein name, authors, experimental details, secondary structure, cofactors, and atomic coordinates.
• The web interface of PDB also provides viewing tools for simple image manipulation.

Secondary Databases
• A recent effort to combine SWISS-PROT, TrEMBL, and PIR led to the creation of the UniProt database, which has larger coverage than any one of the three databases while at the same time maintaining the original SWISS-PROT features of low redundancy, cross-references, and a high quality of annotation.
• https://www.uniprot.org/
• There are also secondary databases that relate to protein family classification according to functions or structures.

Specialized Databases
• Specialized databases normally serve a specific research community or focus on a particular organism. The content of these databases may be sequences or other types of information.
• The sequences in these databases may overlap with a primary database, but may also include new data submitted directly by authors. Because they are often curated by experts in the field, they may have unique organizations and additional annotations associated with the sequences.
• Many genome databases that are taxonomy-specific fall within this category.

PITFALLS OF BIOLOGICAL DATABASES
• One of the problems associated with biological databases is overreliance on sequence information and related annotations, without understanding the reliability of the information.
• Redundancy is another major problem affecting primary databases.
• The National Center for Biotechnology Information (NCBI) has now created a non-redundant database, called RefSeq, in which identical sequences from the same organism and associated sequence fragments are merged into a single entry.
• The other common problem is erroneous annotations. Often, the same gene sequence is found under different names, resulting in multiple entries and confusion about the data. Or conversely, unrelated genes bearing the same name are found in the databases. One remedy is to annotate genes with a controlled, standardized vocabulary; a prominent example of such a system is Gene Ontology.
• Some of the inconsistencies in annotation could be caused by genuine disagreement between researchers in the field; others may result from imprudent assignment of protein functions by sequence submitters.

GenBank Sequence Format
• To search GenBank effectively using the text-based method requires an understanding of the GenBank sequence format.
• A GenBank flat file contains three sections: Header, Features, and Sequence entry.
• The Header section describes the origin of the sequence, identification of the organism, and unique identifiers associated with the record.
• The top line of the Header section is the Locus, which contains a unique database identifier for a sequence location in the database.
• The identifier is followed by the sequence length and molecule type (e.g., DNA or RNA). This is followed by a three-letter code for the GenBank division, for example: PLN for plant, fungal, and algal sequences; PRI for primate sequences; MAM for nonprimate mammalian sequences; BCT for bacterial sequences; and EST for EST sequences.
• Next to the division is the date when the record was made public.
• “DEFINITION” provides the summary information for the sequence record.
• This is followed by an accession number for the sequence.
• A nucleotide accession number traditionally has one of two formats: one letter followed by five digits, or two letters followed by six digits.
• XM_ (mRNA), XR_ (non-coding RNA), and XP_ (protein) [X for model transcript]
• NM_ (mRNA), NR_ (non-coding RNA), and NP_ (protein) [N for curated transcript]
• The accession number is followed by a version number.
• The “Features” section includes annotation information about the gene and gene product, as well as regions of biological significance reported in the sequence, with identifiers and qualifiers. The “Source” field provides the length of the sequence, the scientific name of the organism, and the taxonomy identification number.
• The third section of the flat file is the sequence itself.

Category | Description
NC | Complete genomic molecules
NG | Incomplete genomic region
NM | mRNA
NR | ncRNA
NP | Protein
XM | Predicted mRNA model
XR | Predicted ncRNA model
XP | Predicted protein model (eukaryotic sequences)
WP | Predicted protein model (prokaryotic sequences)

Conversion of Sequence Formats
• In sequence analysis and phylogenetic analysis, there is a frequent need to convert between sequence formats.
• One of the most popular computer programs for sequence format conversion is Readseq.
• It recognizes sequences in almost any format and writes a new file in an alternative format.
• http://iubio.bio.indiana.edu/cgi-bin/readseq.cgi/

Sequence Alignment
Intro to Bioinformatics – Sequence Alignment

Sequence Alignments
• Cornerstone of bioinformatics
• What is a sequence?
  - Nucleotide sequence
  - Amino acid sequence
• Pairwise and multiple sequence alignments
  - We will focus on pairwise alignments
• What alignments can help with
  - Determining the function of a newly discovered gene sequence
  - Determining evolutionary relationships among genes, proteins, and species
  - Predicting the structure and function of proteins

Acknowledgement: These notes are adapted from lecture notes of both Wright State University’s Bioinformatics Program and Professor Laurie Heyer of Davidson College, with permission.
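As a small aside before moving on to alignment, the RefSeq prefix table from the GenBank format section above can be turned into a lookup. This is a rough sketch under stated assumptions: the function name `describe_refseq` is hypothetical, the pattern only accepts the two-letter prefixes listed in these notes followed by an underscore, digits, and an optional version, and the accession strings are examples for illustration.

```python
import re

# RefSeq prefixes as tabulated in these notes.
REFSEQ_PREFIXES = {
    "NC": "complete genomic molecule",
    "NG": "incomplete genomic region",
    "NM": "mRNA",
    "NR": "ncRNA",
    "NP": "protein",
    "XM": "predicted mRNA model",
    "XR": "predicted ncRNA model",
    "XP": "predicted protein model (eukaryotic)",
    "WP": "predicted protein model (prokaryotic)",
}

# Assumed shape: two letters, underscore, digits, optional ".version".
REFSEQ_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def describe_refseq(accession):
    """Return (description, version or None), or None if not recognised."""
    match = REFSEQ_RE.match(accession)
    if match is None or match.group(1) not in REFSEQ_PREFIXES:
        return None
    version = int(match.group(3)) if match.group(3) else None
    return REFSEQ_PREFIXES[match.group(1)], version


print(describe_refseq("NM_000546.6"))  # ('mRNA', 6)
print(describe_refseq("XP_123456"))    # ('predicted protein model (eukaryotic)', None)
print(describe_refseq("U12345"))       # None (not a RefSeq-style accession)
```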
Basic Steps in Sequence Alignment
• Comparison of sequences to find similarity and dissimilarity in the compared sequences
• Identification of gene structures, reading frames, distributions of introns and exons, and regulatory elements
• Finding and comparing point mutations to obtain genetic markers
• Revealing evolutionary and genetic diversity
• Function annotation of genes

Why Compare Sequences?
• Identify sequences found in lab experiments
  - What is this thing I just found?
• Compare new genes to known ones
• Compare genes from different species
  - Information about evolution
• Guess functions for entire genomes full of new gene sequences
• Map sequence reads to a Reference Genome (ChIP-seq, RNA-seq, etc.)
• Inference of Homology
  - Two genes are homologous if they share a common evolutionary history.
  - Evolutionary history can tell us a lot about the properties of a given gene.
  - Homology can be inferred from similarity between the genes.
• Searching for proteins with the same or similar functions

Terms of sequence comparison
Sequence identity
• Exactly the same nucleotide/amino acid in the same position
Sequence similarity
• Substitutions with similar chemical properties
Sequence homology
• General term that indicates evolutionary relatedness among sequences
• Sequences are homologous if they are derived from a common ancestral sequence.

Deletions
• Codon deletion: ACG ATA GCG TAT GTA TAG CCG…
  - Effect depends on the protein, position, etc.
  - Almost always deleterious
  - Sometimes lethal
• Frame shift mutation:
    ACG ATA GCG TAT GTA TAG CCG…
    ACG ATA GCG ATG TAT AGC CG?…
  - Almost always lethal

The Genetic Code
Substitutions are mutations accepted by natural selection.
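Whether a codon substitution changes the encoded amino acid can be checked against the standard genetic code. The sketch below is illustrative only: the function name `classify_substitution` is hypothetical, and the two codon pairs are CGC to CGA and GAU to GAA.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G order
# for each codon position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(codon_before, codon_after):
    """Return (kind, amino acid before, amino acid after); accepts DNA or RNA codons."""
    before = CODON_TABLE[codon_before.upper().replace("U", "T")]
    after = CODON_TABLE[codon_after.upper().replace("U", "T")]
    kind = "synonymous" if before == after else "non-synonymous"
    return kind, before, after


print(classify_substitution("CGC", "CGA"))  # ('synonymous', 'R', 'R')
print(classify_substitution("GAU", "GAA"))  # ('non-synonymous', 'D', 'E')
```

Both CGC and CGA encode arginine, so the protein is unchanged, whereas GAU (aspartate) to GAA (glutamate) alters the residue.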
Synonymous: CGC → CGA
Non-synonymous: GAU → GAA

Indels
• When comparing two genes it is generally impossible to tell whether an indel is an insertion in one gene or a deletion in the other, unless ancestry is known:
    ACGTCTGATACGCCGTATCGTCTATCT
    ACGTCTGAT---CCGTATCGTCTATCT

Sequence Alignment
Input: two sequences over the same alphabet
Output: an alignment of the two sequences
Example:
• GCGCATGGATTGAGCGA
• TGCGCCATTGATGACCA
A possible alignment:
    -GCGC-ATGGATTGAGCGA
    TGCGCCATTGAT-GACC-A

Alignments
    -GCGC-ATGGATTGAGCGA
    TGCGCCATTGAT-GACC-A
Three “components”:
• Perfect matches
• Mismatches
• Insertions & deletions (indels)
Formal definition of alignment:

Things to consider
• To find the best alignment one needs to examine all possible alignments.
• To reflect the quality of the possible alignments one needs to score them.
• There can be different alignments with the same highest score.
• Variations in the scoring scheme may change the ranking of alignments.
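Scoring can be made concrete with the example alignment above. The sketch below counts the three components (matches, mismatches, indels) column by column and combines them into a score. The function name `score_alignment` and the scoring values (+1 match, -1 mismatch, -2 per gap column) are assumptions chosen for illustration, not a scheme prescribed by these notes.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Count matches, mismatches and gap columns; return them with the total score."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            gaps += 1          # insertion or deletion (indel)
        elif a == b:
            matches += 1       # perfect match
        else:
            mismatches += 1    # substitution
    score = matches * match + mismatches * mismatch + gaps * gap
    return matches, mismatches, gaps, score


row_a = "-GCGC-ATGGATTGAGCGA"
row_b = "TGCGCCATTGAT-GACC-A"
print(score_alignment(row_a, row_b))          # (13, 2, 4, 3)
print(score_alignment(row_a, row_b, gap=-1))  # (13, 2, 4, 7)
```

The same alignment scores 3 under one gap penalty and 7 under another, which is why the choice of scoring scheme can change which of several candidate alignments ranks highest.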