IM Model - Biology of Snakes - Cheatsheet | ZOO 6927, Study notes of Zoology

Material Type: Notes; Class: BIOLOGY OF SNAKES; Subject: Zoology; University: University of Florida; Term: Fall 2007;

Typology: Study notes

Pre 2010

Uploaded on 03/10/2009

koofers-user-jpv

IM Cheat Sheet
November 26, 2007

IM model:
IM estimates the following parameters: ancestral effective population size (NA), time of divergence into two populations (t), the splitting parameter (s), present-day effective population sizes of populations 1 (N1) and 2 (N2), and migration rates between the populations (m1 and m2).

Assumptions of the Model:
• Loci are selectively neutral
• No recombination within loci
• Free recombination between loci
• Data fit the mutation model

Mutation Models:
• Infinite Sites (IS) model
• Hasegawa-Kishino-Yano (HKY) model
• Stepwise Mutation Model (SMM)
When multiple loci are used, each locus can have its own mutation model.

Input file:
IM requires a very specific input file format. Here is a brief description of what each line means:
Line 1 - arbitrary text: your title for the analysis
Line with # - comment line (always starts with #)
Line 2 - population 1 name <space> population 2 name
Line 3 - number of loci (entered as an integer)
Line 4 - locus name <space> sample size for n1 <space> sample size for n2 <space> inheritance scalar (1 for diploid loci, 0.25 for haploid) <space> mutation model (I for IS, H for HKY, S for SMM) <space> mutation rate <space> mutation rate range, as in (low end of mutation rate, high end of mutation rate)
Line 5 - data for gene copy 1 from n1; the data must start in the 11th column
Etc…
The data from n2 immediately follow the data from n1. Each new locus needs its own "line 4" describing that locus. Finally, the input file must have a blank line at the end. For more info, refer to IMa Documentation - July 13, 2007, page 5.

Running IM:
For IM to run, the input file (in our case, beast_CI.txt) and the IMrun file must be in the same folder as the IM executable. After you have downloaded beast_CI.txt, move it to the IM folder. Open the command prompt, and change to the IM folder.
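Before running IM, the header layout described in the Input file section can be sanity-checked with a short script. What follows is a minimal Python sketch and not part of IM: the function name parse_im_header and the example text are hypothetical, and it assumes comment lines may be skipped wherever they occur in the header. It reads only the title, the population names, the number of loci, and the first five fields of the first locus line.

```python
def parse_im_header(text):
    """Read the header fields of an IM-style input file (layout as in this cheat sheet)."""
    lines = text.splitlines()
    title = lines[0]                                  # Line 1: arbitrary title text
    # Comment lines always start with '#'; assumption: skip them anywhere in the header.
    body = [ln for ln in lines[1:] if not ln.startswith("#")]
    pop1, pop2 = body[0].split()[:2]                  # Line 2: population names
    n_loci = int(body[1])                             # Line 3: number of loci
    fields = body[2].split()                          # Line 4: first locus description
    name, n1, n2, scalar, model = fields[:5]
    if model not in ("I", "H", "S"):
        raise ValueError("unknown mutation model code: " + model)
    return {
        "title": title,
        "populations": (pop1, pop2),
        "n_loci": n_loci,
        "first_locus": {
            "name": name,
            "sample_sizes": (int(n1), int(n2)),
            "inheritance_scalar": float(scalar),
            "model": model,
            "rate_fields": fields[5:],   # mutation rate and range, kept as raw text
        },
    }


# Hypothetical header, made up for illustration (sequence data lines omitted).
example = """Hypothetical two-population example
# a comment line
popA popB
1
locus1 2 2 1 H 0.000001
"""
header = parse_im_header(example)
print(header["populations"], header["n_loci"], header["first_locus"]["model"])
```

A check like this only catches layout slips such as a missing population name or a mistyped model code; it does not validate the sequence data or the column alignment.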
Enter this command line:
im -ibeast_CI.txt -oIM.bci.0.6.1.out -m18 -m28 -q15 -q25 -qa2 -t84 -u0.055 -b100000 -l0.5 -s30
Each value is written directly after its flag with no space, so -m18 sets -m1 to 8. The flags mean:
-i input file name (no spaces)
-o output file name (no spaces)
-q1 scalar for 4N1u maximum
-q2 scalar for 4N2u maximum
-qA scalar for 4NAu maximum
-m1 maximum migration rate from population 1 to population 2
-m2 maximum migration rate from population 2 to population 1
-t maximum time of population splitting
-u generation time in years
-l duration of run (if floating point, the time in hours between outputs; the run continues until the file IMrun is no longer present)
-s random number seed
Full command line explanations can be found on pages 13-24 of IMa Documentation - July 13, 2007.

Parameter Conversions:
The IM program can be used to generate estimates of the model parameters (θ1, θ2, θA, m1, m2 and t). To convert these to demographic parameters, we need the mutation rate in two different units. Let V be an estimate of the mutation rate per generation and U be an estimate of the mutation rate per year.
• To get N1: θ1 = 4N1V, so N1 = θ1/(4V).
• To get the time of divergence in years: the model's t is the time in years multiplied by U, so the time in years is t/U.
• Migration: the model's m is scaled by the mutation rate (migration events per mutation), so the model's m1 equals M1/u, where M1 is the migration rate per generation and u is the same mutation rate that appears in θ1 = 4N1u. To get the population migration rate (effective migrants per generation): since 4N1u × (M1/u) / 2 = 2N1M1, it is θ1 × m1/2.
For more detail, refer to Introduction to the IM and IMa computer programs - March 5, 2007, pages 12-14.

Interpreting the output:
Use Excel to open the output (IM_pres.out). This will come in handy later. There are 3 sections in the output.
Input and Starting Information - This section lists the starting information from the input file and the command line settings.
Run Information - This section summarizes information on the run, beginning with the length and time of the run and the number of measurements made. A few other things are listed here, but the most important is the table of parameter autocorrelations and ESS (effective sample size) estimates.
These are the same values that appear on the computer screen during the course of the run. ESS values are estimates of the number of independent points that have been sampled for each parameter. They can be a valuable guide to how well the Markov chain is mixing and how long the chain should be run. After the autocorrelations and ESS estimates, the output lists results on differences between parameters. Over the course of the run, all relevant pairwise comparisons between parameters of similar type are made, and a record is kept of the proportion of the time that one is larger than the other. The result is a posterior probability estimate that can be used directly as a statistical assessment of whether or not one parameter is larger than another.
Marginal Histograms - The primary results of the analysis are the records of the residence time for each parameter in each of 1000 evenly sized bins that span the prior distribution for that parameter. A table is first provided that summarizes features of the histogram for each parameter.
• Minbin - the midpoint value of the lowest bin.
• Maxbin - the midpoint value of the highest bin.
• HiPt - the value of the bin with the highest count (i.e. highest residence time).
• HiSmth - the value of the bin with the highest count, after the counts have been smoothed by taking a running average of 9 points centered on each bin. This is what you would use to get a point estimate for any parameter.
• Mean - the mean value of the parameter.
• 95Lo - the estimated point to which 2.5% of the total area lies to the left.
• 95Hi - the estimated point to which 2.5% of the total area lies to the right.
• HPD90Lo - the lower bound of the estimated 90% highest posterior density (HPD) interval. The 90% HPD interval is the shortest span (on the X axis) that contains 90% of the posterior probability. A question mark, ‘?’, is added if the HPD interval did not appear to be
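
The histogram summaries and the parameter conversions in this cheat sheet can be illustrated with a small Python sketch. It is not IM's own code: the function names, the toy histogram, and the mutation rates are hypothetical, bins outside the histogram are assumed to count as zero when smoothing, and the HPD interval is taken as the shortest contiguous run of bins, which assumes a single-peaked histogram.

```python
def smoothed_peak(values, counts, window=9):
    """Bin value with the highest count after a centered running average (HiSmth-style)."""
    half = window // 2
    best_i, best_avg = 0, -1.0
    for i in range(len(counts)):
        lo, hi = max(0, i - half), min(len(counts), i + half + 1)
        avg = sum(counts[lo:hi]) / window   # assumption: bins past the edges count as zero
        if avg > best_avg:
            best_i, best_avg = i, avg
    return values[best_i]


def hpd_interval(values, counts, mass=0.90):
    """Shortest contiguous run of bins holding at least `mass` of the total count."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram is empty")
    target = mass * total
    best, lo, running = None, 0, 0
    for hi in range(len(counts)):
        running += counts[hi]
        # Drop bins from the left while the run still holds enough mass.
        while running - counts[lo] >= target:
            running -= counts[lo]
            lo += 1
        if running >= target and (best is None or hi - lo < best[1] - best[0]):
            best = (lo, hi)
    return values[best[0]], values[best[1]]


def to_demographic(theta1, t, m1, mu_per_gen, mu_per_year):
    """Apply N1 = theta1/(4V), years = t/U, and population migration rate = theta1*m1/2."""
    n1 = theta1 / (4.0 * mu_per_gen)
    years = t / mu_per_year
    pop_migration = theta1 * m1 / 2.0
    return n1, years, pop_migration


# Toy 10-bin histogram (made-up bin midpoints and counts).
values = [i + 0.5 for i in range(10)]
counts = [0, 1, 2, 5, 10, 5, 2, 1, 0, 0]
peak = smoothed_peak(values, counts, window=3)
interval = hpd_interval(values, counts, mass=0.90)

# Made-up estimates and mutation rates, purely for illustration.
n1, years, pop_migration = to_demographic(2.0, 0.5, 0.3, mu_per_gen=1e-5, mu_per_year=1e-6)
print(peak, interval, n1, years, pop_migration)
```

With the 1000 bins that IM reports, the 9-point default window applies; the toy example uses a 3-point window only because it has so few bins.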