Solving Restriction Mapping Problem using Branch and Bounds Algorithm: A Report, Study Guides, Projects, Research of Computer Science

This report explores the use of the branch and bound algorithm to determine the complete digest set of a DNA molecule given a partial digest set. The algorithm recursively places elements from the partial digest set into a solution set and checks whether the distances between the placed elements are present in the partial digest set. The Java implementation of this algorithm is detailed, and its efficiency is compared with that of two other algorithms, Brute Force and Another Brute Force. The report also discusses exceptions and future work.

Typology: Study Guides, Projects, Research

Pre 2010

Uploaded on 08/31/2009

koofers-user-6i9

Partial preview of the text

Report on “Solving Restriction Mapping Problem using Branch and Bounds Algorithm”

Abstract

Restriction enzymes, which are produced by bacteria, cut DNA at specific sites called ‘restriction sites’. Each enzyme has its own specific recognition and cut sites, which are usually 4-6 base pairs long. A DNA molecule with n restriction sites (the complete digest set) therefore produces n+1 fragments when it is cut at every site. If the DNA sample is exposed to the restriction enzyme for only a limited amount of time, it is not cut at all restriction sites, and the experiment generates the set of all possible restriction fragments between every two cuts. This set of fragment lengths, called the ‘partial digest set’, is used to determine the positions of the restriction sites in the DNA sequence. Before DNA sequencing was invented, calculating the complete digest set from the partial digest set was the only way to locate the restriction sites in DNA, and so to learn about the sequence of base pairs.

This project is an experiment based on the Branch and Bound algorithm. Given a partial digest set, the program returns the complete digest set. The only exception is that the same partial digest set can correspond to two different complete digest sets, called homometric sets. The program involves Recursive Selection Sort, with a running time of O(n^2) in the average case.

Introduction

Restriction mapping is the process of obtaining structural information about a DNA sequence with the help of restriction enzymes. Restriction enzymes are endonucleases that recognize specific base sequences in DNA, called ‘restriction sites’, and cleave the DNA there.
Because DNA sequencing technology was developed only recently, restriction maps were, until a few years ago, powerful research tools in molecular biology that helped locate genetic markers. The distance between two restriction sites can be measured experimentally by gel electrophoresis. A DNA molecule can be either fully digested by restriction enzymes or partially digested, in which case each site is cut with a probability less than 1, thereby producing all the possible fragments between every pair of sites. The objective of this project is to obtain the complete digest set from a given partial digest set.

The restriction digest problem can be solved by implementing various algorithms:

1. Brute Force Algorithm: an exhaustive search that always finds the solution but is very slow and needs memory at each step.
2. Another Brute Force Algorithm: more efficient and faster than Brute Force, but still slow.
3. Branch and Bound Algorithm: the most efficient of the three and faster than the two above.

Branch and Bound Algorithm

PartialDigest(L):
    width <- maximum element in L
    DELETE(width, L)
    X <- {0, width}
    PLACE(L, X)

PLACE(L, X):
    if L is empty
        output X
        return
    y <- maximum element in L
    DELETE(y, L)
    if D(y, X) ⊆ L
        Add y to X and remove lengths D(y, X) from L
        PLACE(L, X)
        Remove y from X and add lengths D(y, X) to L
    if D(width-y, X) ⊆ L
        Add width-y to X and remove lengths D(width-y, X) from L
        PLACE(L, X)
        Remove width-y from X and add lengths D(width-y, X) to L
    return

The above algorithm is implemented in Java. The following methods are involved in solving the problem: one main method and ten other class methods. The data application process flow is detailed in Fig. 1.

- main: This is the main method. It first calls the assighnData method, which validates and accepts the user’s input as the partial digest set.
Then it includes 0 and the last element in the complete digest set and removes the same from [...]

[...] program steadily reduces the size of the problem by one and calls itself recursively, so it is also called RECURSIVE SELECTION SORT.

T(n) = T(n-1) + n, T(1) = 1

Therefore,

T(n) = n + T(n-1)
     = n + (n-1) + T(n-2)
     = n + (n-1) + (n-2) + ... + 3 + 2 + T(1)
     = n(n+1)/2
     = O(n^2)

So the upper bound of the algorithm is quadratic time, that is, O(n^2). However, if there are two alternatives at each step, then

T(n) = 2T(n-1) + n ≥ 2^(n-1)

so in that case the algorithm is exponential.

Exception

One exception of this program is that the same partial digest set can have two possible complete digest sets. These two sets are called homometric sets. There is no unique solution for this kind of problem.

Future Works

The approach used for the partial digest algorithm can be applied to the following two problems.

- Motif finding problem: Given a set of DNA sequences, find a set of motifs, one from each sequence, that maximizes the consensus score.
- Median string problem: Given a set of DNA sequences, find a median string.

Both of the above problems can be solved by the Branch and Bound algorithm. These applications are faster than the exhaustive Brute Force approach, but unfortunately their running time is still exponential.

Conclusion

The partial digest problem can be solved by various algorithms, such as Brute Force, Another Brute Force and Branch and Bound. The latter is significantly faster than the previous two on a well-behaved PDP data set. However, the program shows an exception when the data set is homometric, that is, when two complete digest sets are possible from the same partial digest set. The running time is usually O(n^2) in the best or average case, but it becomes exponential in the worst case.

Reference

Neil C. Jones and Pavel A. Pevzner, An Introduction to Bioinformatics Algorithms.
K.N. King, An introduction to Java Programming.
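
The PartialDigest/PLACE recursion from the Branch and Bound Algorithm section can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the report's actual program: the class name PartialDigestSketch and the methods solve, place, tryPosition and deltaX are hypothetical names chosen here. One deliberate difference from the pseudocode: y is left in L until the subset test, because D(y, X) always contains y itself (the distance from y to 0, or from width-y to width), so removing y first would make the test fail whenever y occurs only once in L.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; names are illustrative and not taken from the report.
class PartialDigestSketch {

    // Returns every complete digest set X whose pairwise distances equal the multiset L.
    static List<List<Integer>> solve(List<Integer> lengths) {
        List<Integer> l = new ArrayList<>(lengths);
        List<List<Integer>> solutions = new ArrayList<>();
        if (l.isEmpty()) return solutions;
        int width = Collections.max(l);
        l.remove(Integer.valueOf(width));      // DELETE(width, L)
        TreeSet<Integer> x = new TreeSet<>(Arrays.asList(0, width));
        place(l, x, width, solutions);
        return solutions;
    }

    // PLACE(L, X): try the largest remaining length on the left, then on the right.
    static void place(List<Integer> l, TreeSet<Integer> x, int width,
                      List<List<Integer>> out) {
        if (l.isEmpty()) {
            out.add(new ArrayList<>(x));       // output X
            return;
        }
        int y = Collections.max(l);
        tryPosition(y, l, x, width, out);
        if (width - y != y) {                  // avoid testing the same position twice
            tryPosition(width - y, l, x, width, out);
        }
    }

    // Places p only if D(p, X) is a sub-multiset of L, recurses, then undoes the change.
    static void tryPosition(int p, List<Integer> l, TreeSet<Integer> x, int width,
                            List<List<Integer>> out) {
        if (x.contains(p)) return;
        List<Integer> removed = new ArrayList<>();
        boolean ok = true;
        for (int q : x) {
            Integer dist = Math.abs(p - q);
            if (l.remove(dist)) {              // remove(Object): drops one occurrence
                removed.add(dist);
            } else {
                ok = false;                    // bound: this branch cannot succeed
                break;
            }
        }
        if (ok) {
            x.add(p);
            place(l, x, width, out);
            x.remove(p);
        }
        l.addAll(removed);                     // restore L before returning
    }

    // Partial digest set of X: all pairwise distances, sorted ascending.
    static List<Integer> deltaX(List<Integer> x) {
        List<Integer> d = new ArrayList<>();
        for (int i = 0; i < x.size(); i++) {
            for (int j = i + 1; j < x.size(); j++) {
                d.add(Math.abs(x.get(j) - x.get(i)));
            }
        }
        Collections.sort(d);
        return d;
    }

    public static void main(String[] args) {
        List<Integer> l = Arrays.asList(2, 2, 3, 3, 4, 5, 6, 7, 8, 10);
        for (List<Integer> x : solve(l)) {
            System.out.println(x + " -> " + deltaX(x));
        }
    }
}
```

For the sample input in main, the sketch reports the two sets {0, 2, 4, 7, 10} and {0, 3, 6, 8, 10}, which are mirror images of each other and produce the same partial digest set. This is the kind of ambiguity described under Exception. Returning all solutions instead of stopping at the first one is a design choice made here so that such cases are visible to the caller.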