Testing the Two Sample Means with t-Test - Business Statistics - Handout, Exercises of Business Statistics

Saylor.org - [Category] Business Administration - [Course] Business Statistics - [Unit 5] Estimation and Hypothesis Testing - [Unit 5.3] Hypothesis Testing: t-Tests - [Reading] College of Micronesia-FSM: Dana Lee Ling's Introduction to Statistics Using OpenOffice.org, LibreOffice.org Calc, 4th edition: “11 Testing the Two Sample Means with the t-Test”

Typology: Exercises

2013/2014

Uploaded on 05/18/2014

Source URL: http://www.comfsm.fm/~dleeling/statistics/text.html#page-111
Saylor URL: http://saylor.org/courses/bus204
Attributed to: [Dana Lee Ling]

11 Testing the two sample means with the t-test

11.1 Paired differences: Dependent samples

Many studies investigate systems where measurements are taken before and after. Usually there is an experimental treatment or process between the two measurements. A typical example is a pre-test and a post-test, with an educational or training event in between. One could examine each student's score on the pre-test and the post-test. Even if everyone did better on the post-test, one would have to show that the difference was statistically significant and not just a random event.

These studies are called "paired t-tests" or "inferences from matched pairs". Each element in the sample is considered as a pair of scores. The null hypothesis is that the average difference for all the pairs is zero: there is no difference. For a confidence interval test, the confidence interval for the mean difference would include zero if there is no statistically significant difference.

If the difference for each data pair is referred to as d, then the mean difference could be written d̄.
The hypothesis test is whether this mean difference d̄ could come from a population with a mean difference μd equal to zero (the null hypothesis). If the mean difference d̄ could not come from a population with a mean difference μd equal to zero, then the change is statistically significant. In the diagram above the mean difference μd is equal to μbefore − μafter.

Confidence interval test

Consider the paired data below. The first column contains female body fat measurements from the beginning of a term. The second column contains the body fat measurements sixteen weeks later. The third column is the difference d for each pair.

p-value: 0.14
Maximum confidence level c: 0.86

The p-value confirms the confidence interval analysis: at a 5% risk of a type I error we fail to reject the null hypothesis. We can have a maximum confidence of only 86%, not the 95% standard typically employed. Some would argue that our concern for limiting the risk of rejecting a true null hypothesis (a type I error) has led to a higher risk of failing to reject a false null hypothesis (a type II error). Some would argue that because of other known factors (the high rates of diabetes, high blood pressure, heart disease, and other non-communicable diseases) one should accept a higher risk of a type I error. The average shows an increase in body fat. Given the short time frame (a single term), some might argue for reacting to this number and intervening to reduce body fat.
They would argue that given other information about this population's propensity towards obesity, 86% is "good enough" to show a developing problem. Ultimately these debates cannot be resolved by statisticians.

11.2 T-test for means for independent samples

One of the more common situations is when one is seeking to compare two independent samples to determine if the means for each sample are statistically significantly different. In this case the samples may differ in sample size n, sample mean, and sample standard deviation.

In this text the two samples are referred to as the x data and the y data. The sample size for the x data is nx. The sample mean for the x data is x̄. The sample standard deviation for the x data is sx. For the y data, the sample size is ny, the sample mean is ȳ, and the sample standard deviation is sy.

Two possibilities exist. Either the two samples come from the same population and the population mean difference is statistically zero, or the two samples come from different populations where the population mean difference is statistically not zero.

Confidence interval test

Each sample has a range of probable values for its population mean μ. If the confidence interval for the sample mean difference includes zero, then there is no statistically significant difference in the means between the two samples. If the confidence interval does not include zero, then the difference in the means is statistically significant.
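The decision rule above can be sketched in Python. This is only an illustrative sketch, not the spreadsheet procedure of the text: the function name and the sample data are invented, the standard error is assumed to be the usual unpooled form √(sx²/nx + sy²/ny), and the t-critical value is passed in by hand (2.776 is the 95% two-tailed value for 4 degrees of freedom, the smaller sample size minus one).

```python
import math
import statistics


def mean_difference_interval(x, y, t_critical):
    """Confidence interval for (mu_x - mu_y) from two independent samples.

    Assumes the unpooled standard error sqrt(sx^2/nx + sy^2/ny).
    t_critical must be looked up separately (for example in a spreadsheet).
    """
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    margin = t_critical * standard_error          # margin of error E
    difference = statistics.mean(x) - statistics.mean(y)
    return difference - margin, difference + margin


# Hypothetical data, invented for illustration only.
x = [10, 12, 14, 16, 18]
y = [14, 16, 18, 20, 22, 24]

low, high = mean_difference_interval(x, y, t_critical=2.776)

# The difference is significant only if the interval excludes zero.
significant = not (low <= 0 <= high)
print(f"interval: ({low:.2f}, {high:.2f})  significant: {significant}")
```

For these invented numbers the interval straddles zero, so the sketch reports no statistically significant difference at the 95% level.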
Note that the margin of error E for the mean difference is still tc multiplied by the standard error. The standard error formula changes to account for the differences in sample size and standard deviation:

standard error = √(sx²/nx + sy²/ny)

Thus the margin of error E can be calculated using:

E = tc × √(sx²/nx + sy²/ny)

For the degrees of freedom in the t-critical tc calculation use n − 1 for the sample with the smaller size. This produces a conservative estimate of the degrees of freedom. Advanced statistical software uses another more complex formula to determine the degrees of freedom.

The confidence interval is calculated from:

(x̄ − ȳ) − E < (μx − μy) < (x̄ − ȳ) + E

Where x̄ is the sample mean of one data set and ȳ is the sample mean of the other data set. Some texts use the symbol x̄d for this difference and μd for the hypothesized difference in the population means. This leads to the more familiar looking formulation:

x̄d − E < μd < x̄d + E

Where:
μd = μx − μy and
x̄d = x̄ − ȳ

As noted above, spreadsheets provide a function to calculate p-values.
If the p-value is less than your chosen risk of a type I error α then the difference is significant.

The function takes as inputs the data for one of the two samples (data_range_x), the data for the other sample (data_range_y), the number of tails, and a final variable that specifies the type of test. A t-test for means from independent samples is test type number three.

=TTEST(data_range_x,data_range_y,number of tails,3)

For the above data, the p-value is given in the following table:

p-value: 0.02
Maximum confidence level c: 0.98

The TTEST function does not use the smaller sample size to determine the degrees of freedom. The TTEST function uses a different formula that calculates a larger number of degrees of freedom, which has the effect of reducing the p-value. Thus the confidence interval result could produce a failure to reject the null hypothesis while the TTEST could produce a rejection of the null hypothesis. This only occurs when the p-value is close to your chosen α.

[Optional material!] If you have doubts and want to explore further, take the difference of the means and divide by the standard error to obtain the t-statistic t. Then use the TDIST function to determine the p-value, using the smaller sample size − 1 to calculate the degrees of freedom.

Note that (μx − μy) is presumed to be equal to zero. Thus the formula is the difference of the means divided by the standard error (given further above).
t = x̄d ÷ (standard error)

Once t is calculated, use the TDIST function to determine the p-value, where n is the smaller sample size:

=TDIST(ABS(t),n−1,2)

Technical side-note: TTEST type three does not presume that the population standard deviations σx and σy are equal. This is in keeping with modern practice and reality. TTEST type two presumes σx = σy. One rarely knows either value, and if one did know those values, why would one not also know the actual population means? With the true population means in hand, any difference would be significant.
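The optional calculation above can also be sketched outside a spreadsheet. The following Python is a rough illustration under stated assumptions, not the text's method: the function names and data are invented, the standard error is assumed to be √(sx²/nx + sy²/ny), and the two-tailed p-value is obtained by numerically integrating the Student t density with Simpson's rule, standing in for =TDIST(ABS(t),n−1,2).

```python
import math
import statistics


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: area outside [-|t|, |t|] (steps must be even)."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3                # Simpson's rule on [0, |t|]
    return max(0.0, 1.0 - 2.0 * area_zero_to_t)


def independent_t(x, y):
    """t-statistic, conservative degrees of freedom, and two-tailed p-value."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    t = (statistics.mean(x) - statistics.mean(y)) / standard_error
    df = min(len(x), len(y)) - 1                  # smaller sample size minus 1
    return t, df, two_tailed_p(t, df)


# Hypothetical data, invented for illustration only.
x = [10, 12, 14, 16, 18]
y = [14, 16, 18, 20, 22, 24]

t_stat, df, p_value = independent_t(x, y)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Because this sketch uses the smaller sample size minus one for the degrees of freedom, its p-value will generally be somewhat larger than the one TTEST type three reports for the same data.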