Comparing Sunscreen Lotions with Histograms & Hypothesis Testing, Assignments of Statistics

University of Washington (UW) - Seattle Statistics

Solutions to Problem 1 of Stat 421 Homework 6, where the goal is to compare two sunscreen lotions, A and B, by examining their histograms and performing a hypothesis test using the average difference in scores. The document also includes the R code to generate the histograms, estimate the reference distribution, and calculate the p-values based on both the reference distribution and the normal approximation.

Typology: Assignments

Pre 2010

Uploaded on 03/18/2009

koofers-user-dqn 🇺🇸


Stat 421, Fall 2008
Fritz Scholz

Homework 6 Solutions

Problem 1: It is desired to examine the difference between two sunscreen lotions, called A and B. We have n=100 volunteers, willing to try out the lotions. In order to eliminate the natural variation effects from subject to subject, it is decided to apply both lotions to each subject, one lotion on one arm, the other lotion on the other arm. Which arm gets which lotion is decided by a fair coin flip and the subjects do not know the lotion identities for their respective arms. This avoids possible bias coming from subjects, favoring one lotion over the other, by exposing their arms differently. It also avoids biases arising from left and right arm receiving different sun exposure. (For example, basal cell carcinomas occur more frequently on the left side, which is exposed more for a driver of a car.)

After sufficient exposure each arm is assigned a burn score. The scores for arms treated with lotion A are given below in the X.A vector of length 100, those for lotion B are given below in the X.B vector. The scores in position i of each vector belong to the same person i.

On one page (using par(mfrow=c(3,1)) prior to plotting) give the histograms of X.A, X.B and X.A-X.B (using nclass=20) and indicate by a vertical line the position of the mean for each histogram. Based on these histograms what would you say about A and B having different effects?

Now test the hypothesis H0: no difference between A and B against the alternative hypothesis that there is a difference, using as test statistic the average Dbar of the D = X.A-X.B scores (which is the same as mean(X.A)-mean(X.B)). As reference distribution for the test statistic we take all 2^n possible assignments of signs to the n=100 absolute differences |D| and find the average D value (Dbar) for each such assignment.
If this looks familiar to you from a previous homework problem, it is meant to be, and you can reuse any code developed there. Of course, the complete reference distribution is out of our reach and you will need to use a simulated reference distribution to get an estimate for the p-value of the observed Dbar. Before getting the reference distribution invoke set.seed(27) and use Nsim=10000 simulations.

The rationale for taking this reference distribution is as follows: Under H0 it does not matter whether A or B is applied to a specific arm. Whether we see for the first person (based on the data below) the difference X.A-X.B = 80.9 - 77.2 = 3.7 or whether we see X.A-X.B = -3.7 (because the assignment of lotions had gone the other way) would have equal chances ½ and ½. The score for the left arm and the score for the right arm would have remained the same (whether A or B was applied). The results for each person are due to all other factors that might have impacted the final scores for that person.

Show the histogram (using nclass=100, xlim=c(-2.5,2.5)) with superimposed normal distribution for this estimated reference distribution and give the estimated p-values based on this estimated reference distribution and based on the normal approximation, either in your text or as part of your histogram plot. Write a function that does all this and provide the code.
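To make the reference distribution concrete, the following sketch enumerates every sign assignment for a toy case: only the first four subjects from the data of this problem, so that all 2^4 = 16 assignments can be listed exactly. The variable names (X.A4, X.B4, Dbar.all, pval.exact) and the small tolerance are illustrative assumptions, not part of the assignment.

```r
# Toy illustration only: first four subjects of X.A and X.B from this problem.
X.A4 <- c(80.9, 73.1, 39.1, 29.6)
X.B4 <- c(77.2, 65.9, 39.1, 25.4)

D <- X.A4 - X.B4            # paired differences
n <- length(D)
Dbar.obs <- mean(D)         # observed test statistic

# All 2^n sign patterns, one per row (entries are -1 or +1).
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))

# Dbar for every sign assignment applied to |D|.
Dbar.all <- as.vector(signs %*% abs(D)) / n

# Exact two-sided p-value; tol guards against floating-point ties.
tol <- 1e-12
pval.exact <- mean(abs(Dbar.all) >= abs(Dbar.obs) - tol)

print(nrow(signs))    # 16 assignments
print(pval.exact)     # 0.25
```

Here all four differences are nonnegative and one of them is zero, so only the two all-plus and the two all-minus patterns are as extreme as the observed Dbar, giving 4/16. With n=100 the same enumeration would need 2^100 rows, which is why the problem asks for a simulated reference distribution instead.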
X.A = c(80.9, 73.1, 39.1, 29.6, 37.6, 55.8, 51.3, 69.1, 84.6, 53.7, 31.3, 55.5, 74, 51.2, 45.4, 38.8, 28.5, 37.6, 49.7, 54.5, 42.8, 29.3, 48.9, 51.4, 8.3, 31.8, 36.1, 57.2, 72.8, 57.2, 51.5, 40.1, 78.6, 26.4, 50.4, 66.2, 58.5, 60.3, 49.7, 63.4, 19.5, 40.5, 65.6, 85.9, 26.2, 30.1, 23, 55.5, 55.2, 50.6, 39.2, 50.9, 63.9, 29.1, 78.5, 63.7, 58.9, 63.3, 56.5, 70.2, 36.5, 58.1, 51.1, 92.4, 37.3, 34, 55.5, 53.9, 38.1, 61.5, 47.4, 60.2, 37.6, 63.4, 58.9, 43.8, 37.9, 48.4, 61.9, 80.2, 86.4, 58.5, 41.5, 55.2, 41.8, 64.1, 51.9, 68.7, 23.2, 27.4, 49.3, 45.6, 64.5, 49.2, 43.1, 62.8, 48.8, 69.6, 70.7, 48.2)

and

X.B = c(77.2, 65.9, 39.1, 25.4, 37.3, 53.4, 47.3, 64.5, 79, 52.5, 27.3, 45, 68.6, 50.2, 34.9, 32.7, 22.3, 38.8, 52.2, 51.9, 46.6, 32.2, 55.2, 58, 7.1, 31.5, 35.7, 52, 70.6, 54.8, 44.6, 38.9, 79.4, 29.1, 44.8, 67, 61.7, 59.6, 50.4, 66.1, 17.4, 45.6, 74.2, 77.7, 32.9, 33.9, 25.6, 56.9, 59.2, 42.3, 44.7, 50.5, 65.2, 31, 83.6, 62.9, 52.5, 67.3, 57.5, 62.2, 29.3, 55.6, 41.3, 89.7, 38.8, 27, 55, 46.2, 34.9, 56.7, 41.8, 55.7, 32.6, 63.9, 55.3, 44.9, 42.4, 44.6, 52.4, 78.5, 81, 58.5, 42.5, 58.2, 40, 60.8, 52.2, 71.7, 22.7, 23.8, 39.5, 45, 74.8, 45.3, 44.2, 59.7, 44.2, 72.5, 63.9, 43.5)

Solution: The histograms for X.A, X.B and X.A-X.B are shown below.
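The steps requested in Problem 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's solution code: the function names three.hists and signflip.test, their arguments, and the returned list are hypothetical, and the demonstration uses only the first ten subjects to keep the block short. The normal approximation uses mean 0 and standard deviation sqrt(sum(D^2))/n, which follows from giving each |D| an independent, equally likely sign.

```r
# Sketch only: function and argument names are illustrative assumptions.

# Three stacked histograms with a vertical line at each mean.
three.hists <- function(X.A, X.B) {
  old <- par(mfrow = c(3, 1))
  on.exit(par(old))
  panels <- list("X.A" = X.A, "X.B" = X.B, "X.A-X.B" = X.A - X.B)
  for (nm in names(panels)) {
    hist(panels[[nm]], nclass = 20, main = paste("Histogram of", nm), xlab = nm)
    abline(v = mean(panels[[nm]]), col = "red", lwd = 2)
  }
}

# Simulated sign-flip reference distribution for Dbar.
signflip.test <- function(X.A, X.B, Nsim = 10000, seed = 27, plot = TRUE) {
  D <- X.A - X.B
  n <- length(D)
  Dbar <- mean(D)
  set.seed(seed)
  # Each column is one random assignment of signs to the n values |D|.
  signs <- matrix(sample(c(-1, 1), n * Nsim, replace = TRUE), nrow = n)
  Dbar.sim <- colMeans(signs * abs(D))
  # Two-sided p-value estimated from the simulated reference distribution.
  pval <- mean(abs(Dbar.sim) >= abs(Dbar))
  # Normal approximation: mean 0, sd = sqrt(sum(D^2)) / n.
  sd.ref <- sqrt(sum(D^2)) / n
  pval.normal <- 2 * pnorm(-abs(Dbar) / sd.ref)
  if (plot) {
    hist(Dbar.sim, nclass = 100, probability = TRUE,
         main = "Simulated reference distribution", xlab = "Dbar")
    xx <- seq(min(Dbar.sim), max(Dbar.sim), length.out = 200)
    lines(xx, dnorm(xx, 0, sd.ref), col = "blue")
    abline(v = Dbar, col = "red")
  }
  list(Dbar = Dbar, pval = pval, pval.normal = pval.normal, Dbar.sim = Dbar.sim)
}

# Demonstration on the first ten subjects only (subset of the data above).
X.A10 <- c(80.9, 73.1, 39.1, 29.6, 37.6, 55.8, 51.3, 69.1, 84.6, 53.7)
X.B10 <- c(77.2, 65.9, 39.1, 25.4, 37.3, 53.4, 47.3, 64.5, 79, 52.5)
if (interactive()) three.hists(X.A10, X.B10)
res <- signflip.test(X.A10, X.B10, plot = interactive())
print(res[c("Dbar", "pval", "pval.normal")])
```

For the full analysis, pass the complete X.A and X.B vectors and add xlim=c(-2.5,2.5) to the hist call, as the problem requests. Plots are drawn only in an interactive session here so that the sketch also runs quietly in batch mode.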
    if(Dbar>=0){
      text(1.02*Dbar,.97*high,"two-sided p-value",cex=1.3,adj=0)
      text(1.02*Dbar,.9*high,paste("p = ",format(signif(pval,6))),cex=1.3,adj=0)
      text(1.02*Dbar,.80*high,"normal approximation",cex=1.3,adj=0)
      text(1.02*Dbar,.73*high,"two-sided p-value",cex=1.3,adj=0)
      text(1.02*Dbar,.66*high,paste("p = ",format(signif(pval.normal,6))),cex=1.3,adj=0)
    }else{
      text(1.02*Dbar,.97*high,"two-sided p-value",cex=1.3,adj=1)
      text(1.02*Dbar,.9*high,paste("p = ",format(signif(pval,6))),cex=1.3,adj=1)
      text(1.02*Dbar,.80*high,"normal approximation",cex=1.3,adj=1)
      text(1.02*Dbar,.73*high,"two-sided p-value",cex=1.3,adj=1)
      text(1.02*Dbar,.66*high,paste("p = ",format(signif(pval.normal,6))),cex=1.3,adj=1)
    }
    }

Problem 2: As indicated on slide 98 of Stat421NormalPopulation.pdf modify the function sample.size2 (slide 68, same source) so that it can be used to determine the smallest combined sample size N=m+n (with m=n) necessary to get the desired power beta at an alternative delta0=|muY-muX|/sigmau. The hypothesis to be tested is H0: muY=muX against H1: muY ≠ muX. Here sigmau is a known upper bound to the actual common sigma. It is assumed that you deal with random samples from normal distributions with same variance. Provide the code for your modified function. Apply this to the concrete case when you want power = .9 at delta0 = .5. Take alpha = .05. Show the plots that helped you make your determination of that smallest N. (Note: N should be even!)
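The requirement in Problem 2 can also be checked numerically rather than read off a plot. The sketch below is an assumption-laden illustration, not the course's sample.size2 function: power.two.sample, N.even and N.min are hypothetical names. It uses the two-sided two-sample t-test with N-2 degrees of freedom and, for m=n=N/2, the noncentrality parameter delta0/sqrt(1/m+1/n) = sqrt(N/4)*delta0.

```r
# Hypothetical helper (not the course's sample.size2): power of the two-sided
# two-sample t-test with equal group sizes m = n = N/2.
power.two.sample <- function(N, delta0, alpha = 0.05) {
  df <- N - 2
  ncp <- sqrt(N / 4) * delta0          # delta0 / sqrt(1/m + 1/n)
  tcrit <- qt(1 - alpha / 2, df)
  # Probability of rejecting in either tail under the alternative.
  1 - pt(tcrit, df, ncp) + pt(-tcrit, df, ncp)
}

# Search even total sample sizes only, since m = n is required.
N.even <- seq(4, 400, by = 2)
pow <- power.two.sample(N.even, delta0 = 0.5)
N.min <- N.even[which(pow >= 0.9)[1]]  # smallest even N reaching power .9

print(N.min)
print(power.two.sample(c(170, 172), delta0 = 0.5))
```

Restricting the search to even N enforces m=n directly; the search range 4 to 400 is an arbitrary choice that should be widened for smaller delta0.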
Solution: The appropriately modified function is

    sample.size2samp2 = function (delta0 = 1, nrange = 10:100, alpha = 0.05)
    {
      power = NULL
      for (N in nrange) {
        tcrit = qt(1 - alpha/2, N - 2)
        power = c(power, 1 - pt(tcrit, N - 2, sqrt(N/4) * delta0) +
                             pt(-tcrit, N - 2, sqrt(N/4) * delta0))
      }
      plot(nrange, power, type = "l", xlab = "total sample size N = N/2+N/2")
      abline(h = seq(0.01, 0.99, 0.01), col = "grey")
      abline(v = nrange, col = "grey")
      title(substitute(abs(mu[Y] - mu[X])/sigma[u] == delta0 ~ ", " ~ alpha == alpha0,
                       list(delta0 = delta0, alpha0 = alpha)))
      lines(nrange, power, col = "red")
    }

sample.size2samp2(delta0=.5,nrange=100:200) produced the following plot

[Plot: power versus total sample size N = N/2+N/2 for N from 100 to 200; title |muY-muX|/sigmau = 0.5, alpha = 0.05]

with magnification given by sample.size2samp2(delta0=.5,nrange=160:180)

[Plot: power versus total sample size N = N/2+N/2 for N from 160 to 180; title |muY-muX|/sigmau = 0.5, alpha = 0.05]

We seem to be slightly below .9 in power at N=170. Thus we should take N=172, or m=n=86 to guarantee power at least .9.

Problem 3: Assuming the data situation as in Problem 1, but assume that these scores were obtained as though lotion A had been assigned randomly to 100 of the 200 available arms, while the other arms got lotion B. Thus it would have been possible to have many persons with both arms treated with lotion A. It also is possible (but extremely unlikely) that each person got one of each lotion applied to his/her arms. How unlikely is it? Count the number of ways of splitting 200 into two groups of 100 each (use choose(…? …) ) and count the number of ways of giving each person one of each lotion. Again we use as test statistic Dbar = mean(X.A)-mean(X.B). The full reference distribution is unattainable.
Write a function that (after using set.seed(27) ) estimates this reference distribution by generating Nsim=10000 splits of the 200 available numbers Z=c(X.A,X.B), draws the histogram with nclass=100, and gives the estimated two-sided p-value for the observed Dbar using the reference distribution. Show the location of the observed Dbar as a vertical line in the histogram. Also show the approximating normal distribution for this histogram (use probability=T in hist) and get the p-value from it. To find the appropriate normal distribution revisit slide 44 in Stat421DoeFlux.pdf.
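The counting question and the random-split reference distribution of Problem 3 can be sketched as follows. This is an illustration under assumptions, not the course's solution: randsplit.test and its arguments are hypothetical names, the demonstration uses only the first ten subjects, and the normal approximation uses the standard finite-population result that Dbar has mean 0 and variance var(Z)*(1/m+1/n) under random splitting, which may or may not be the form given on the slide cited above.

```r
# Sketch only: names and the small demo data subset are illustrative assumptions.

# Counting: which 100 of the 200 arms get lotion A, versus assignments in
# which every one of the 100 persons gets one arm of each lotion.
n.splits <- choose(200, 100)
n.paired <- 2^100
prob.paired <- n.paired / n.splits
print(prob.paired)

# Simulated reference distribution for random splits of Z = c(X.A, X.B).
randsplit.test <- function(X.A, X.B, Nsim = 10000, seed = 27, plot = TRUE) {
  Z <- c(X.A, X.B)
  N <- length(Z)
  m <- length(X.A)
  n <- N - m
  Dbar <- mean(X.A) - mean(X.B)
  set.seed(seed)
  Dbar.sim <- replicate(Nsim, {
    idx <- sample(N, m)              # arms assigned to lotion A in this split
    mean(Z[idx]) - mean(Z[-idx])
  })
  pval <- mean(abs(Dbar.sim) >= abs(Dbar))
  # Normal approximation: mean 0, variance var(Z) * (1/m + 1/n).
  sd.ref <- sqrt(var(Z) * (1 / m + 1 / n))
  pval.normal <- 2 * pnorm(-abs(Dbar) / sd.ref)
  if (plot) {
    hist(Dbar.sim, nclass = 100, probability = TRUE,
         main = "Random-split reference distribution", xlab = "Dbar")
    xx <- seq(min(Dbar.sim), max(Dbar.sim), length.out = 200)
    lines(xx, dnorm(xx, 0, sd.ref), col = "blue")
    abline(v = Dbar, col = "red")
  }
  list(Dbar = Dbar, pval = pval, pval.normal = pval.normal, Dbar.sim = Dbar.sim)
}

# Demonstration on the first ten subjects of the Problem 1 data.
X.A10 <- c(80.9, 73.1, 39.1, 29.6, 37.6, 55.8, 51.3, 69.1, 84.6, 53.7)
X.B10 <- c(77.2, 65.9, 39.1, 25.4, 37.3, 53.4, 47.3, 64.5, 79, 52.5)
res3 <- randsplit.test(X.A10, X.B10, plot = interactive())
print(res3[c("Dbar", "pval", "pval.normal")])
```

For the full analysis, pass the complete X.A and X.B vectors. Because the split ignores which two scores belong to the same person, the subject-to-subject variation stays in the reference distribution, which is the design difference this problem is meant to expose.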
