Notes for Genetic Markers | Applied Bioinformatics | BIT 150, Study notes of Bioinformatics

Material Type: Notes; Class: Applied Bioinformatics; Subject: Biotechnology; University: University of California - Davis; Term: Summer 2006;

Uploaded on 07/30/2009

koofers-user-7h5
CHAPTER 4 GENETIC MARKERS – MORPHOLOGICAL, BIOCHEMICAL AND MOLECULAR MARKERS

A genetic marker is any visible character or otherwise assayable phenotype, for which alleles at individual loci segregate in a Mendelian manner. Genetic markers can be used to study the genetics of organisms, including trees, at the level of single genes. The development of the discipline of genetics would not have been possible without genetic markers such as the visible characters in peas and Drosophila. Trees, unfortunately, do not have a large number of visible Mendelian characters (Chapter 3) and for many years, this was a limitation in forest genetics research. It was not until the early 1970s that biochemical genetic markers such as terpenes and allozymes were developed for trees. These biochemical markers were applied to an array of research problems, most notably the study of amounts and patterns of genetic variation in natural populations of trees and the characterization of mating systems (Chapters 7-10).

A major limitation of biochemical markers, however, is that there are only a small number of different marker loci; therefore, genetic information obtained from such markers may not be very representative of genes throughout the genome. The limitation in the number of markers was overcome beginning in the 1980s with the development of molecular or DNA-based genetic markers. Molecular markers have a wide variety of applications in both basic research and applied tree improvement programs. The goal of this chapter is to briefly describe the basic properties of genetic markers and to introduce some of their uses in forestry. Additional reading on genetic markers can be found in Adams et al. (1992), Mandal and Gibson (1998), Glaubitz and Moran (2000), and Jain and Minocha (2000).

USES AND CHARACTERISTICS OF GENETIC MARKERS

Before describing the many types of genetic markers available for use in forest trees, it is first appropriate to consider the various applications of genetic markers and the desired attributes for such applications. Genetic markers are used to study the genetics of natural and domesticated populations of trees and the forces that bring about change in these populations. Some of the more important applications of genetic markers include: (1) Describing mating systems, levels of inbreeding, and temporal and spatial patterns of genetic variation within stands (Chapter 7); (2) Describing geographic patterns of genetic variation (Chapter 8); (3) Inferring taxonomic and phylogenetic relationships among species (Chapter 9); (4) Evaluating the impacts of domestication practices, including forest management and tree improvement, on genetic diversity (Chapter 10); (5) Fingerprinting and germplasm identification in breeding and propagation populations (Chapters 16, 17, and 19); (6) Constructing genetic linkage maps (Chapter 18); and (7) Marker assisted breeding (Chapter 19).

We describe several different types of genetic markers in this chapter, each of which has different attributes that make it more or less desirable to use in certain applications. Some of the desirable attributes of a given type of genetic marker are that it be: (1) Inexpensive to develop and apply; (2) Unaffected by environmental and developmental variation; (3) Highly robust and repeatable across different tissue types and different laboratories; (4) Polymorphic, i.e. reveal high levels of allelic variability; and (5) Codominant in its expression.

MORPHOLOGICAL MARKERS

As discussed in Chapter 3, very few simple Mendelian morphological characters have been discovered in forest trees that could be used as genetic markers.
Many of the identified morphological markers are mutations observed in seedlings such as albino needles, dwarfing and other aberrations (Franklin, 1970; Sorensen, 1973) (Fig. 5.2). Such mutants have been used to estimate self-pollination rates (Chapter 7) in conifers. These markers, however, have limited application because morphological mutants occur rarely and often are highly detrimental or even lethal to the tree.

BIOCHEMICAL MARKERS

Monoterpenes

Monoterpenes are a subgroup of the terpenoid substances found in resins and essential oils of plants (Kozlowski and Pallardy, 1979). Although the metabolic functions of monoterpenes are not fully understood, they probably play an important role in resistance to attack by diseases and insects (Hanover, 1992). The concentrations of different monoterpenes, such as alpha-pinene, beta-pinene, myrcene, 3-carene and limonene, are determined by gas chromatography and are useful as genetic markers (Hanover, 1966a, b, 1992; Squillace, 1971; Strauss and Critchfield, 1982).

Monoterpene genetic markers have been applied primarily to taxonomic and evolutionary studies (Chapter 9). However, they have also been used to a limited extent to estimate genetic patterns of geographic variation within species (Chapter 8). Although monoterpenes were the best available genetic markers for forest trees in the 1960s and early 1970s, they require specialized and expensive equipment for assay. In addition, there are relatively few monoterpene marker loci available and most express some form of dominance in their phenotypes. Dominant genetic markers have the disadvantage that dominant homozygous genotypes cannot be distinguished from heterozygotes carrying the dominant allele. Monoterpenes were gradually replaced by allozyme genetic markers because allozymes are less expensive to apply, are codominant in expression, and many more marker loci can be assayed.

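
The difference between dominant and codominant markers noted above can be illustrated with a minimal sketch. It assumes a single locus with two alleles; the allele labels (`A`, `a`), the function name `phenotype_classes`, and the phenotype labels are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement


def phenotype_classes(alleles, dominant=None):
    """Group diploid genotypes by the phenotype an assay would reveal.

    alleles  -- iterable of allele labels at one marker locus (assumed: two alleles)
    dominant -- label of the dominant allele, or None for a codominant marker
    """
    classes = {}
    for genotype in combinations_with_replacement(sorted(alleles), 2):
        if dominant is None:
            # Codominant marker: both alleles are visible, so every
            # genotype has its own distinguishable phenotype.
            phenotype = genotype
        elif dominant in genotype:
            # Dominant marker: one copy of the dominant allele is enough,
            # so the homozygote and the heterozygote look alike.
            phenotype = "dominant phenotype"
        else:
            phenotype = "recessive phenotype"
        classes.setdefault(phenotype, []).append(genotype)
    return classes


# Dominant marker: AA and Aa fall into the same class.
print(phenotype_classes(["A", "a"], dominant="A"))
# Codominant marker: AA, Aa and aa are all distinguishable.
print(phenotype_classes(["A", "a"]))
```

With a dominant marker only two phenotype classes are observed for three genotypes, whereas a codominant marker yields one class per genotype, which is the practical advantage attributed to allozymes in the text.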
Allozymes

Allozymes have been the most important type of genetic marker in forestry and are used in many species for many different applications (Conkle, 1981; Adams et al., 1992). Allozymes are allelic forms of enzymes that can be distinguished by a procedure called electrophoresis. The more general term, isozyme, refers to any variant form of an enzyme, whereas allozyme implies a genetic basis for the variant form. Most allozyme genetic markers have been derived from enzymes of intermediary metabolism, such as enzymes in the glycolytic pathway; however, conceivably an allozyme genetic marker could be devel-

Fig. 4.2. The conifer megagametophyte and embryo genetic system: (a) In conifer seeds, the embryo is diploid (2N), while the megagametophyte is haploid (1N) and genetically identical to the egg gamete; and (b) Allozymes from an individual megagametophyte show the product of just one allele. Megagametophytes from a single seed tree, heterozygous for an allozyme locus, are expected to segregate in a 1:1 ratio of fast and slow allozyme bands (alleles), as shown here for six megagametophytes from seed of a heterozygous mother tree. (Photo courtesy of G. Dupper, Institute of Forest Genetics, Placerville, California).

(1) Those based on DNA-DNA hybridization; and (2) Those based on amplification of DNA sequences using the polymerase chain reaction (PCR). Important technical aspects of these two approaches are discussed in detail in the following sections. More comprehensive reviews of molecular markers in forestry are available (Neale and Williams, 1991; Neale and Harry, 1994; Echt, 1997), so only the marker types most often used in forest trees are discussed here.

DNA-DNA Hybridization: Restriction Fragment Length Polymorphism

Genetic marker systems based on DNA-DNA hybridization were developed in the 1970s.
Eukaryotic genomes are very large and there was no simple way to observe genetic polymorphisms of individual genes or sequences. The property of complementary base pairing allowed for methods to be developed whereby small pieces of DNA could be used as probes to reveal polymorphisms in just the sequences homologous to the probe. The genetic system derived using this approach is called restriction fragment length polymorphism (RFLP).

Restriction fragment length polymorphism (RFLP) markers were the first DNA-based genetic markers developed (Botstein et al., 1980). A brief description of the RFLP procedure is shown in Fig. 4.3. To begin, total cellular DNA is digested with a restriction endonuclease (Box 4.1), which reduces the genome to a large pool of restriction fragments of different sizes. Hundreds of restriction endonucleases have been discovered that cleave DNA at specific recognition sites of varying length and sequence. However, just a few of these enzymes (e.g. HindIII, EcoRI, BamHI) are routinely used because they generally provide the best size distribution of DNA fragments and are inexpensive. Restriction endonuclease recognition sites are found throughout the genome, both in coding and non-coding regions, and are a powerful way to sample DNA sequence variation in the genome.

The restriction fragments are then separated by their size on an agarose gel by electrophoresis. It is possible to visualize DNA within such a gel by staining it with ethidium bromide; however, because there are typically so many restriction fragments of all possible sizes, discrete fragments cannot be seen. To overcome this problem, the fractionated DNA is transferred and chemically bound to a nylon membrane by a process called Southern blotting, named after its inventor E.M. Southern (1975). Specific DNA fragments are visualized by hybridizing the DNA fragments bound to the nylon membrane with a radioactively- or fluorescently-labeled DNA probe.
A DNA probe is just a small piece of DNA used to reveal its complementary sequence in the DNA bound to the membrane. The DNA probing relies on complementary base pairing; both the DNA fragments on the nylon membrane and the probe are first denatured so that they are single stranded and available to pair with their complementary DNA sequence.

DNA probes have been developed for genetic marker analyses of all three plant genomes: nDNA, cpDNA and mtDNA. The chloroplast and mitochondrial genomes are relatively small, so it has been possible to digest these genomes with a restriction endonuclease and clone individual sections of these genomes using standard plasmid cloning techniques (Box 4.1). The cpDNA and mtDNA probes have been developed for a number of forest tree species by cloning DNA fragments from those genomes (Strauss et al., 1988; Lidholm and Gustafsson, 1991; Wakasugi et al., 1994a,b). Each of these cloned fragments contains several different genes; therefore, when one clone is used for a probe in RFLP analysis, it is possible to reveal genetic variation for a number of organelle-encoded genes at once.

Developing probes for RFLP analysis of nDNA is more problematic because of the large amount of repetitive DNA in the nuclear genome. Two types of probes are commonly used: genomic DNA (gDNA) probes and complementary DNA (cDNA) probes. Probes are isolated from DNA libraries, which are large collections of cloned fragments, each resulting from a single cloning experiment. The cDNA and gDNA probes are equally easy to use and reveal abundant genetic variation in trees (Devey et al., 1991; Liu and Furnier, 1993; Bradshaw et al., 1994; Byrne et al., 1994; Jermstad et al., 1994). The gDNA probe libraries, however, are much easier to construct than cDNA probe libraries because the difficult task of mRNA isolation is not required.
The cDNA probes are derived from expressed genes because cDNA is derived from mRNA (Box 4.2), whereas gDNA probes generally are not; therefore, cDNA probes are often preferred for many applications of RFLP analysis in trees.

The genetic interpretation of RFLP banding patterns can be difficult especially in conifers whose large genomes often lead to large numbers of fragments revealed by a single probe. Examples illustrating the molecular basis of several RFLP patterns are shown in Fig. 4.4, as well as the Mendelian interpretations of these band patterns.

Fig. 4.3. Restriction fragment length polymorphism (RFLP) analysis. Step 1, isolation of DNA from tree tissues; Step 2, restriction enzyme digestion of DNA to cleave DNA into small fragments; Step 3, electrophoresis of DNA samples on agarose gels to separate DNA fragments by size; Step 4, transfer of DNA fragments from gel to nylon membrane by Southern blotting technique; Step 5, hybridization of nylon membrane with specific radioactively labeled DNA probe; Step 6, exposure of nylon membrane to x-ray film (autoradiography); and Step 7, autoradiogram showing RFLP bands.

Restriction fragment length polymorphism analysis has been applied to both chloroplast and mitochondrial genomes to study: (1) Phylogenetic relationships (Wagner et al., 1991, 1992; Tsumura et al., 1995) (Chapter 9); (2) Genetic variation within species (Ali et al., 1991; Strauss et al., 1993; Ponoy et al., 1994); and (3) Modes of organelle DNA inheritance (Chapter 3). Restriction fragment length polymorphism analysis of nuclear genomes of forest trees has not been as widely applied, due to the technical challenges of performing Southern blot analysis, especially with the large genomes of conifers. Nuclear DNA RFLP probes are available for a few conifer, Populus, and Eucalyptus species. These probes have been used to construct genetic linkage maps for a number of species as discussed in Chapter 18.

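
The digestion step of the RFLP procedure described above can be sketched in a few lines. This is an illustrative toy, not a laboratory protocol: it models only the EcoRI recognition site (GAATTC, cut after the first G) on one strand of a short linear sequence, and it does not model the Southern blot or probe hybridization. The function name `digest` and the two example alleles are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC on the top strand


def digest(sequence, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return the fragment lengths produced by cutting a linear sequence."""
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries are the sequence ends plus every cut position.
    bounds = [0] + cut_positions + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Hypothetical allele 1 carries two EcoRI sites.
allele_1 = "ATGC" * 3 + "GAATTC" + "CCGG" * 4 + "GAATTC" + "TTAA" * 2
# Hypothetical allele 2 has a point mutation (A -> G) that destroys the second site.
allele_2 = "ATGC" * 3 + "GAATTC" + "CCGG" * 4 + "GAGTTC" + "TTAA" * 2

print(digest(allele_1))  # three fragments
print(digest(allele_2))  # two fragments, one of them longer
```

A single base change that removes a recognition site merges two fragments into one longer fragment; a probe homologous to that region would therefore reveal bands of different length for the two alleles, which is the polymorphism that RFLP analysis scores.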
Box 4.1. Restriction enzymes and DNA cloning. (Continued from previous page.)

At this point (step 5), it is now possible to join the foreign DNA fragment with the linearized plasmid. The cleaving of both DNAs with EcoRI leaves a four-base (AATT) single-stranded end. These are called "sticky ends." Through complementary base pairing, the two DNA fragments join together perfectly. All that remains to reconstitute a circular, double-stranded DNA molecule is to form a bond between the adjacent nucleotides that were separated by cleaving with the restriction enzyme. This is accomplished by the enzyme DNA ligase and the process is called a ligation reaction. Finally, the recombinant plasmid is re-introduced into the bacterial host (step 6) by transformation. Now the transformed E. coli can be grown in culture to produce large quantities of plasmid DNA. Cloning of DNA fragments makes it possible to study aspects of the DNA, such as determining its nucleotide sequence, which would not be possible without cloning because large quantities of DNA are generally needed for such assays.

Box 4.2. Complementary DNA (cDNA) cloning.

The ability to obtain millions of copies of individual genes is fundamentally important for molecular genetics research. Molecular biologists developed a very clever method, called complementary DNA (cDNA) cloning, to obtain cloned copies of expressed genes. Complementary DNA cloning usually involves cloning many genes at once, which together form cDNA libraries (Fig. 1). In step 1, mRNA is isolated from one or more tissues. In trees for example, mRNA has been isolated from xylem to obtain a cDNA library of genes expressed in xylem. The mRNA has a tail consisting of many A's (called a polyadenylated tail) which provides a convenient location to attach a primer. In step 2, a poly-T primer is annealed to the poly-A tail.
In step 3, the enzyme reverse transcriptase is added, along with free A, T, C, and G nucleotides, to synthesize a DNA strand complementary to the mRNA strand. In step 4, an enzyme called RNase H is added which digests away the original mRNA template, leaving just the newly synthesized single-stranded DNA copy. In step 5, a DNA strand complementary to the first DNA strand is synthesized by the enzyme DNA polymerase I. Finally in step 6, the double-stranded DNA molecule is inserted into a plasmid or viral vector using the enzyme DNA ligase (Box 4.1). Once the cDNA is put into the vector, the vector can be transformed into a bacterial host that can be cultured to produce large quantities of the cDNA.

Fig. 1. Steps in cDNA cloning.

had to be added each cycle because it was destroyed by high temperature during the denaturing step. The polymerase chain reaction is used not only for DNA marker technology but also for a variety of recombinant DNA assays. Many procedures in molecular biology that previously required cloning of DNA can now be performed by PCR.

Random Amplified Polymorphic DNA

Random amplified polymorphic DNA (RAPD) markers have been the most widely used molecular marker type in forest trees to date. They were the first of the PCR-based markers and were developed independently by Welsh and McClelland (1990) and Williams et al. (1990). The RAPD marker system is easy to apply as no prior DNA sequence information is needed for designing PCR primers as is required for other PCR-based genetic marker systems.

In the RAPD marker system (Fig. 4.6), a PCR reaction is conducted using a very small amount of template DNA (usually less than 10 nanograms) and a single RAPD primer. Primers are usually just 10 bases long (10-mers) and are of random sequence.
There are several thousand primers commercially available, all with a different 10-base sequence, which in theory will all amplify different regions of the target genome. Therefore, the RAPD marker system has the potential to randomly survey a large portion of the genome for the presence of polymorphisms. The small amount of DNA needed is a big advantage of the RAPD technique versus RFLPs, because marker analysis can be applied to haploid conifer megagametophytes as was discussed for allozyme markers (Fig. 4.2).

A specific segment of the genomic DNA is amplified when the RAPD primer finds its complementary sequence at a location in the genome and then again at a second nearby location, but in the opposite orientation from the first priming site. If both chromosomes of a homologous pair each have the forward and reverse priming sites (homozygous +/+), PCR amplification products of identical length are synthesized from both homologues and a RAPD band appears on a gel when the amplification product is electrophoresed (Fig. 4.6). Likewise, if both homologues are missing one or both of the priming sites (homozygous −/−), no amplification products are synthesized and no bands are seen on gels. If one homologue has both priming sites, but the other homologue is missing at least one (heterozygous +/−), then an amplification product results from the first homologue. The heterozygous (+/−) band pattern phenotype cannot be distinguished from the +/+ homozygote; therefore, RAPD markers are dominant, di-allelic (i.e. only two alleles (+ and −) are expressed at each locus), genetic markers.

The + (band) and − (no band) phenotypes are distinguishable in haploid megagametophytes. Therefore, conifer trees heterozygous for a RAPD marker will segregate for + and − phenotypes in megagametophytes of their seeds, while only + phenotypes will be observed in megagametophytes of +/+ homozygotes.
In this manner, +/− heterozygotes can be distinguished from +/+ homozygotes in conifer mother trees.

Since a single RAPD primer can anneal to many locations in the genome, multiple loci are revealed by a single primer. Therefore, it is possible to obtain a large number of RAPD genetic markers in a short amount of time and at relatively low cost.

Carlson et al. (1991) first demonstrated the use of RAPD markers in trees by showing the inheritance of RAPD markers in F1 families of Pseudotsuga menziesii and Picea glauca. In a subsequent paper, Tulsieram et al. (1992) used RAPD markers and megagametophyte segregation analysis to construct a partial genetic linkage map for Picea glauca. Random amplified polymorphic DNA markers have since been used for linkage mapping and marker analyses in dozens of tree species (Cervera et al., 2000). However, as the popularity of RAPD markers increased, difficulty in establishing marker repeatability across laboratories slowly manifested itself. Therefore, although RAPD markers are easy and quick to use, they have less overall value than the earlier allozyme and RFLP markers because of the problems with repeatability.

Fig. 4.6. The random amplified polymorphic DNA (RAPD) marker system involves a small number of steps and all are generally easy to apply in forest trees: Step 1, DNA is isolated; Step 2, DNA is amplified by PCR (Fig. 4.5) using single 10-mer primers; and Step 3, RAPD products are electrophoresed and bands are visualized by staining gels with ethidium bromide.

Amplified Fragment Length Polymorphism

Amplified fragment length polymorphism (AFLP) markers are a recent development (Vos et al., 1995). They are like RAPDs in that many markers can be assayed quickly using PCR and they are generally dominant, but AFLPs appear to be more repeatable than RAPDs. AFLP markers also are similar to RFLPs because they survey the genome for the presence of restriction fragment polymorphisms.

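
Because RAPD and AFLP markers are both generally dominant, the megagametophyte reasoning given for RAPDs above applies to either system. The sketch below is a minimal illustration of that reasoning, assuming error-free band scoring and 1:1 segregation of + and − among the megagametophytes of a heterozygous mother tree. The function name `infer_mother_genotype` and its return convention are hypothetical.

```python
def infer_mother_genotype(megagametophyte_bands):
    """Infer a mother tree's genotype at one dominant marker locus.

    megagametophyte_bands -- iterable of booleans, True when the band is
                             present in a haploid megagametophyte.

    Returns (genotype, chance), where chance is the probability that a
    heterozygous (+/-) mother would show this uniform pattern by chance
    alone, assuming 1:1 segregation. It is 0.0 when both phenotypes are seen.
    """
    bands = list(megagametophyte_bands)
    if not bands:
        raise ValueError("at least one megagametophyte is required")
    n_present = sum(bands)
    if 0 < n_present < len(bands):
        # Both + and - phenotypes observed: the mother must carry both alleles.
        return "+/-", 0.0
    genotype = "+/+" if n_present else "-/-"
    # A heterozygote yields n identical megagametophytes with probability 0.5**n.
    return genotype, 0.5 ** len(bands)


print(infer_mother_genotype([True, False, True, True, False, False]))
print(infer_mother_genotype([True] * 6))
```

The second return value shows why several megagametophytes per tree need to be assayed: with six seeds, a heterozygous mother would still show only bands in about 1.6% of cases under the stated assumptions.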
The first report of the use of AFLPs in trees was by Cervera et al. (1996) who used this marker system to genetically map a disease resistance gene in Populus. Genetic linkage maps based on AFLPs have also been constructed in Eucalyptus globulus and E. tereticornis (Marques et al., 1998) and in Pinus taeda (Fig. 4.7; Remington et al., 1998).

Simple Sequence Repeat

Simple sequence repeat (SSR) markers were first developed for use in genetic mapping in humans (Litt and Luty, 1989; Weber and May, 1989), and are another name for microsatellite DNA (Chapter 2). Short, tandemly-repeated sequences of two, three or four nucleotides are found throughout the genome. For example, the dinucleotide repeat AC is commonly found in Pinus genomes. Since the number of tandem repeats at a locus can vary greatly, SSR markers tend to be amongst the most polymorphic genetic marker types. For example, one allele might have 10 copies of the AC tandem repeat (AC)10, whereas another allele would have 11 copies (AC)11, another 12 copies (AC)12, and so forth.

Simple sequence repeat genetic markers require a considerable investment to develop. Genomic DNA libraries rich in microsatellite sequences must be created and screened for clones containing SSR sequences (Ostrander et al., 1992). The DNA sequence of these clones must be determined (Box 4.3), because the unique sequence regions flanking the SSR are needed to design PCR primers to amplify SSR sequences from individual samples. Once a pair of primers is developed to amplify the SSR region, it must be determined whether there is polymorphism for the SSR and whether band patterns on gels have simple genetic interpretations (Fig. 4.8).

Some of the first SSR markers developed in trees were from the chloroplast genome (Powell et al., 1995; Cato and Richardson, 1996; Vendramin et al., 1996).
Development of these markers was made easier because the complete DNA sequence of the entire chloroplast genome of Pinus thunbergii was known (Wakasugi et al., 1994a,b). The SSR sequences were found by a computer search of the entire cpDNA sequence database (see Chapter 18 for a discussion of database searching and comparison of DNA sequences). Furthermore, since cpDNA sequences are highly conserved among related plant taxa, PCR primers designed from sequences flanking SSR sequences in P. thunbergii should easily amplify homologous sequences in other Pinus species. The cpDNA SSRs are highly polymorphic relative to other types of cpDNA markers and are useful for many types of studies. For example, because cpDNA is paternally inherited in conifers (Chapter 3), cpDNA markers are useful for determining male parentage of offspring (paternity analysis) (Chapter 7) and for following the dispersal of pollen in populations where the SSR genotypes of all male trees are known (Stoehr et al., 1998).

Nuclear DNA SSRs have been developed for several forest trees, including species of Pinus (Smith and Devey, 1994; Kostia et al., 1995; Echt et al., 1996; Echt and May-Marquardt, 1997; Pfeiffer et al., 1997; Fisher et al., 1998), Picea (van de Ven and McNicol, 1996), Quercus (Dow et al., 1995) and Populus (Dayanandan et al., 1998). Each of these studies describes the isolation and cloning of a small number of SSRs, their inheritance patterns, and their utility for related species.

Box 4.3. Methods to determine the nucleotide sequence of a DNA fragment. (Continued from previous page)

Fig. 1. Illustration of the steps required in determining the nucleotide sequence in a fragment of DNA.

Fig. 4.8. The simple sequence repeat (SSR) or microsatellite marker system requires significant time and cost to develop; however, it is fairly easy to apply once markers are developed: (a) Primers complementary to unique sequence regions flanking the SSRs are used to amplify the SSR sequence by PCR; allele 1 has seven copies of the ATC repeat and allele 2 has six copies of the ATC repeat; and (b) A simplified gel pattern of the two homozygous (11 and 22) and the heterozygous (12) SSR genotypes. Since all three genotypes are distinguishable, this is a codominant marker.

All approaches to ESTP detection require a pair of PCR primers to be designed from the EST sequence. The primers are then used to amplify genomic DNA fragments by PCR. The next step is to reveal polymorphism among amplification products and for this a variety of methods have been used. In Picea mariana, Perry and Bousquet (1998) were able to detect length variation among ESTP alleles when amplification products were analyzed on standard agarose gels. This is the fastest and simplest technique for revealing ESTP variation, although it seems unlikely that such length variation can be found for the majority of ESTs. Polymorphisms surely exist as nucleotide substitutions among alleles; however, these polymorphisms are more problematic to detect.
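
Returning to the SSR markers described earlier, two of the ideas in that section lend themselves to a short sketch: the computer search of a known sequence for tandem repeats, and the codominant scoring of amplification products shown in Fig. 4.8. The code below is a minimal illustration under stated assumptions. The function names, the minimum of five repeat copies, and the 20-base flanking regions are hypothetical choices, not values from the studies cited.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not ACAC or AAA)."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(sequence, min_repeats=5, motif_lengths=(2, 3, 4)):
    """Return (start, motif, copies) for tandem repeats of 2-, 3- or 4-nucleotide motifs."""
    seq = sequence.upper()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits


def ssr_band_sizes(flank_length, motif, copies_a, copies_b):
    """Amplicon sizes expected on a gel for a diploid carrying two SSR alleles."""
    return sorted({flank_length + len(motif) * c for c in (copies_a, copies_b)})


# A hypothetical clone containing an (AC)10 microsatellite.
print(find_ssrs("GGTT" + "AC" * 10 + "GGTA"))

# Fig. 4.8 alleles: seven and six copies of ATC, with an assumed 40 bases of flanking sequence.
print(ssr_band_sizes(40, "ATC", 7, 7))  # homozygote: one band
print(ssr_band_sizes(40, "ATC", 7, 6))  # heterozygote: two bands
```

Each repeat-number allele gives an amplification product of a different length, so the heterozygote shows two bands and each homozygote shows one, which is why the caption to Fig. 4.8 describes the marker as codominant.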