MATH 399 Week 8: Correlation and Regression

This document introduces correlation and regression, which are used to compare two variables and determine the type of relationship between them. It discusses the limitations of these methods and how they can be used to make predictions. It also covers the properties of the correlation coefficient and the regression equation, the significance of these measures, and how outliers and influential points can affect the regression line.

Typology: Exams

2022/2023

Available from 03/17/2023

Examiner651
CORRELATION AND REGRESSION
• Correlation and regression are used to compare two variables. Correlation asks whether there is a relationship between the variables. Regression determines what type of relationship it is.
• Regression can be linear, quadratic, exponential, or even sinusoidal. This class will focus on linear regression.
• Given a population and two variables in that population, is there a relationship between the two variables? Look at the following table of values:

X:  78  85  92  100  85
Y:  89  93  99  100  84

THIS IS NOT CAUSE AND EFFECT
• What if the data are test grades of subjects before and after formal instruction? Can we say that if you take the formal instruction, you will do better?
• The answer is no! Too many factors can influence the test scores. What if a person was sick that day? What if they were running late all day?
• Correlation and regression can predict values, but they cannot prove cause and effect.

What type of relationship exists?
• A positive relationship has a positive slope: both variables increase or decrease together. We CANNOT say that an increase in one causes an increase in the other.
• A negative relationship has a negative slope: as one variable decreases, the other variable increases, or vice versa.

What kind of predictions can be made from this relationship?
• Examples include weather forecasting, crop predictions, success in college, and sports predictions.
• NOTICE THE WORD PREDICTION. We cannot say that one caused the other; we see a pattern or relationship.

CORRELATION
• A correlation exists between two variables when one of them is related to the other in some way.
• One variable is the independent variable (x) and the other is the dependent variable (y).

CORRELATION COEFFICIENT
• This coefficient measures the strength of the linear relationship.
• Although there are many types of correlation coefficients, we will use the Pearson product-moment correlation coefficient, named after Karl Pearson, who pioneered most of the research in this area.
• The sample correlation coefficient is r. The population correlation coefficient is the Greek letter rho, ρ.

PROPERTIES OF THE CORRELATION COEFFICIENT
• The range of the coefficient is from –1 to +1.
• The closer r is to +1, the stronger the positive linear relationship.
• The closer r is to –1, the stronger the negative linear relationship.
• When there is no linear relationship between the variables, or a very weak one, r will be close to 0.

SIGNIFICANCE OF THE CORRELATION COEFFICIENT
• We can use hypothesis testing to see if there is a significant linear relationship or if r is due to chance.
• The test is about ρ, our population parameter. The sample correlation coefficient r can be used to make inferences about ρ if the following assumptions are met:
• The (x, y) pairs are a random sample and are linearly related.
• The pairs have a bivariate normal distribution, so y is normally distributed for any fixed x.

REGRESSION
• This section describes the relationship between the variables through a regression line, which is the data's line of best fit. The equation of the regression line is the regression equation:

y = a + bx

• This equation is defined by the y-intercept of the regression line, a, and the slope, b.
• We want the equation of the line that minimizes the distances between the observed values and the predicted values (least squares: the sum of the squared vertical distances is as small as possible).

OUTLIERS/INFLUENTIAL POINTS
• A point is influential if the regression line changes substantially when the point is excluded. Influential points are often outliers, but not every outlier is influential.

PREDICTIONS
• We cannot always use the regression line to predict the value of a variable.
• In predicting a value of y based on some given value of x…
• If there is not a significant linear correlation, the best predicted y value is the mean of the observed y values.
• If there is a significant linear correlation, the best predicted y value is found by substituting x into the regression equation.
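
The ideas above can be tried on the small table of X and Y values from the start of these notes. The following is a minimal Python sketch, not part of the original notes: the function names are hypothetical, the slope and intercept use the usual least-squares formulas, and the cutoff 0.878 is an assumed critical value of r taken from a standard table for n = 5 at the 0.05 significance level (the notes themselves do not state a significance level).

```python
from math import sqrt


def pearson_r_and_line(xs, ys):
    """Return (r, a, b), where the least-squares line is y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)   # Pearson correlation coefficient
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # y-intercept
    return r, a, b


def best_prediction(x_new, xs, ys, critical_r):
    """Apply the prediction rule: use the line only if |r| exceeds critical_r.

    critical_r is an assumption supplied by the caller (from a table of
    critical values for the chosen sample size and significance level).
    """
    r, a, b = pearson_r_and_line(xs, ys)
    if abs(r) > critical_r:
        return a + b * x_new
    return sum(ys) / len(ys)    # otherwise, the mean of y is the best guess


X = [78, 85, 92, 100, 85]
Y = [89, 93, 99, 100, 84]

r, a, b = pearson_r_and_line(X, Y)
print(f"r = {r:.3f}, line: y = {a:.2f} + {b:.3f}x")
# prints: r = 0.778, line: y = 37.60 + 0.629x

# Assumed critical value for n = 5, significance level 0.05
print(best_prediction(90, X, Y, critical_r=0.878))
# prints: 93.0
```

Here r is about 0.778, which looks fairly strong, but with only five pairs it does not exceed the assumed critical value, so the sketch falls back to the mean of Y (93.0) instead of using the regression line.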
Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved