Statistics and Probability: Understanding Distributions, Hypothesis Testing, and Sampling, Exams of Statistics

An in-depth exploration of various statistical concepts, including variables, scores, frequency distributions, normal distributions, probability, and hypothesis testing. Learn about different types of distributions, such as symmetrical and skewed distributions, and the importance of z scores in understanding the location of raw scores within a distribution. Discover the role of probability in understanding distributions and hypothesis testing, and the concept of a null hypothesis and comparison distribution in hypothesis testing. Gain insights into the differences between population parameters and sample statistics, and the importance of random sampling and power in hypothesis testing.

Typology: Exams

2023/2024

Available from 04/01/2024

DrShirley

Psych Stats Chapters 1-5
Descriptive statistics - psychologists use these to summarize and describe a group of numbers from a research study.
Inferential statistics - psychologists use these to draw conclusions and make inferences that are based on the numbers from a research study but go beyond those numbers. They make it possible to generalize to a larger population from a study in which a smaller number of individuals took part.
Variable - characteristic that can have different values.
Value - possible number or category that a score can have.
Score - a particular person's value on a variable.
Numeric Variable - variable whose scores are numbers that tell you how much there is of what is being measured. Also known as a quantitative variable. Two kinds: equal-interval and rank-order.
Equal-Interval Variable - variable in which the numbers stand for approximately equal amounts of what is being measured. Grade point average is an equal-interval variable because the difference between 2.0 and 2.3 is the same as the difference between 3.0 and 3.3. Stress scales from 1 to 10 are also treated as equal-interval variables: the difference between 4 and 6 is the same as the difference between 7 and 9.
Rank-Order Variable - variable in which the numbers stand only for relative ranking, for example a student's standing in his or her graduating class. The difference in underlying GPA between being second and third in the class could be very unlike the difference between being 8th and 9th. Unlike an equal-interval variable, a rank-order variable does not tell you the exact difference in what is being measured. Also called an ordinal variable.
Ratio Variable - variable that has order, equal intervals, and a meaningful zero point. A score of 0 means a complete absence of the variable. If the variable has a meaningful 0, it is a ratio variable.
Nominal Variable - variable with values that are categories (they are names rather than numbers).
Also called a categorical variable. Examples: psychiatric diagnosis (major depression, post-traumatic stress disorder, schizophrenia, obsessive-compulsive disorder), gender, sexuality, and religion.
Discrete Variable - variable that has specific values and cannot have values between those specific values. For example, the number of times you went to the dentist can be 1, 2, or 3 but cannot be 1.2, 1.4, or 1.5. Nominal variables such as gender, religious affiliation, and college major are also discrete variables.
Continuous Variable - variable for which, in theory, there are an infinite number of values between any two values. Age could be 19.24 years; weight could be 206.45 lbs. Height, weight, and time are continuous.
Frequency Table - ordered listing of the number of individuals having each of the different values for a particular variable.
Interval - range of values that are grouped together in a grouped frequency table.
Grouped Frequency Table - frequency table in which the number of individuals is given for each interval of values.
Variance - measure of how spread out a set of scores is; the average of the squared deviations from the mean. It is never negative because it is built from squared deviations.
Deviation Score - score minus the mean.
Squared Deviation Score - square of the difference between a score and the mean.
Sum of Squared Deviations - total of each score's squared difference from the mean. For the same number of scores, a bigger sum means a bigger variance.
Standard Deviation - square root of the average of the squared deviations from the mean (the square root of the variance). It is the most common descriptive statistic for variation and is approximately the average amount that scores in a distribution vary from the mean.
Interquartile Range - Q3 - Q1. The interquartile range (IQR) is the difference between the upper quartile and the lower quartile, that is, the range of the middle 50 percent of the data.
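The definitions of deviation score, sum of squared deviations, variance, standard deviation, and interquartile range can be traced step by step in a short Python sketch. The scores and the function name `describe_spread` are made up for illustration, and the variance divides by N (the "average of the squared deviations" wording used above). Quartiles can be computed in several slightly different ways; this sketch simply uses the default method of Python's `statistics.quantiles`.

```python
import math
import statistics


def describe_spread(scores):
    """Return the spread measures defined above for a list of scores."""
    n = len(scores)
    mean = sum(scores) / n

    # Deviation score: score minus the mean.
    deviations = [x - mean for x in scores]
    # Squared deviation score: each deviation squared.
    squared_deviations = [d ** 2 for d in deviations]
    # Sum of squared deviations.
    sum_of_squares = sum(squared_deviations)

    # Variance: average of the squared deviations (dividing by N).
    variance = sum_of_squares / n
    # Standard deviation: square root of the variance.
    sd = math.sqrt(variance)

    # Quartiles (default method of statistics.quantiles; other methods exist).
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1

    return {
        "mean": mean,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "sd": sd,
        "iqr": iqr,
        "semi_iqr": iqr / 2,
    }


# Hypothetical scores chosen so the arithmetic is easy to check by hand.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
spread = describe_spread(scores)
print(spread["mean"], spread["sum_of_squares"], spread["variance"], spread["sd"])
# prints 5.0 32.0 4.0 2.0
```

Dividing by N matches the descriptive definition in these notes; when a sample is used to estimate a population variance, textbooks usually divide by N - 1 instead.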
Semi-Interquartile Range - (Q3 - Q1) / 2. It is half the distance needed to cover the middle half of the scores. The semi-interquartile range is affected very little by extreme scores, which makes it a good measure of spread for skewed distributions. It measures dispersion as the distance from the middle of the distribution to the boundaries that define the middle 50 percent of observations.
Median - the value that has 50 percent of the scores above it and 50 percent of the scores below it.
Mode - most common value in a distribution; a single mode is not a useful summary of bimodal or multimodal distributions.
Limitations of the range - it is influenced by extreme scores and provides no information about the scores between the lowest and highest values.
Z scores - scores expressed in standard deviation units; they map onto normal curve percentages only if the distribution is approximately normal.
Z score - number of standard deviations that a score is above (or below, if it is negative) the mean of its distribution; it is thus an ordinary score transformed so that it better describes the score's location in a distribution.
Raw score - ordinary score (or any number in a distribution before it has been made into a Z score or otherwise transformed).
Normal distribution - frequency distribution that follows a normal curve.
Normal curve - specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it. It can be shown mathematically that in the long run, if the influences are truly random and the number of different influences being combined is large, a precise normal curve will result (central limit theorem).
Why is the normal curve so important? - It is common because any particular score is the result of the random combination of many effects, some of which make the score larger and some of which make the score smaller.
Thus, on average these effects balance out near the middle, with relatively few scores at each extreme, because it is unlikely for most of the increasing and decreasing effects to come out in the same direction.
Population - entire group of people to which a researcher intends the results of a study to apply; the larger group to which inferences are made on the basis of the particular set of people studied.
Sample - scores of the particular group of people studied; usually considered to be representative of the scores in some larger population, so results from the sample can be generalized to that population. A spoonful of beans generalizes to the pot of beans.
Random Selection - method for selecting a sample that uses truly random procedures (usually meaning that each person in the population has an equal chance of being selected). One procedure is for the researcher to begin with a complete list of all the people in the population and select a group of them to study using a table of random numbers.
Population parameter - actual value of the mean, standard deviation, and so on, for the population. Population parameters are usually not known, though often they are estimated based on information in samples. Examples: population mean, population variance, population standard deviation.
Sample statistic - descriptive statistic, such as the mean or standard deviation, figured from the scores in the group of people studied.
Population parameter vs. sample statistic - a population parameter is information about the entire population, while a sample statistic is information about the specific sample.
Sampling without replacement - method of sampling in which members of the sample are not returned to the population before subsequent members are selected.
Probability Sampling - the sample is selected in such a way as to be representative of the population.
Random Sampling - each member of the population is equally likely to be selected for membership in the sample. This usually results in a representative sample.
Thus, findings from the sample generalize to the population. There is no bias involved in the selection of the sample. Any variation between the sample characteristics and the population characteristics is only a matter of chance.
Stratified Random Sampling - the population is divided into categories (strata) on characteristics of importance for the research, for example gender, and is then sampled randomly within each stratum. If 60 percent of the population is female, then 60 percent of the sample is randomly selected from the female population. This requires advance knowledge of the population and can therefore be difficult to construct.
Nonprobability Sampling - the sample is not truly representative. A researcher may use this method because it is not feasible or possible to obtain a random or stratified sample, because such a sample would be too expensive, or because generalizing to a larger population is not a goal.
Increasing the validity of nonprobability sampling - try to approximate random selection and eliminate as many sources of bias as possible.
Three types of nonprobability sampling - quota, purposive, and convenience.
Quota Sampling - the researcher deliberately sets the proportions of categories within the sample, generally to ensure the inclusion of a particular segment of the population. The proportions may or may not differ dramatically from the actual population; the researcher sets the quota independent of population characteristics.
Example of quota sampling - a researcher is interested in the attitudes of members of different religions toward the death penalty. In Iowa, Buddhists are less than 1 percent of the population, so the researcher might set a quota of 10 percent Buddhists in the sample to make sure their views are represented. The sample is then no longer random, and the results cannot be generalized to the population.
Purposive Sample - a nonrepresentative subset of some larger population, constructed to serve a very specific need or purpose.
For example, a researcher may have a specific group in mind, such as runaways who are heroin addicts. It may not be possible to specify the population: its members will not all be known and access will be difficult. The researcher attempts to capture the target group by including whoever is available (getting the sample with a purpose in mind).
Snowball Sample - the sample is picked up along the way, analogous to a snowball accumulating snow: each participant is asked to suggest someone else who might be willing or appropriate for the study. Snowball samples are particularly useful in hard-to-track populations.
Convenience Sampling - selecting a naturally occurring group of people within the population you want to study. It is an accidental sample. Although selection is unguided, it is not necessarily random (using the correct definition that everyone in the population has an equal chance of being selected). Volunteers constitute a convenience sample. Not all convenience samples are created equal: the ability to generalize depends on how the convenience sample was constructed and how representative it is of the population.
For a sample to be random - (1) everyone in the population has an equal chance of being selected; (2) the results can then be generalized to the larger population.
Probability - number of times an outcome occurs divided by the total number of possible outcomes.
Formula for probability - probability = relative frequency = observed outcomes / possible outcomes.
Characteristics of probability - (1) probability is a proportion; (2) it ranges from 0 to 1 (0% to 100% when expressed as a percentage).
Mutually Exclusive - two events are mutually exclusive when the occurrence of one precludes the occurrence of the other.
Independent Events - the occurrence of one has no effect on the occurrence of the other. The result of one coin flip has no impact on the outcome of the second flip.
Complementary - two mutually exclusive outcomes are complementary when the sum of their probabilities is equal to one.
Together, the outcomes constitute 100 percent of the possible outcomes. For example, if we flip a coin, the probability of getting heads or tails is 1.
Conditional Outcome (also called dependent) - two outcomes are conditional when the occurrence of one outcome changes the probability that the other outcome will occur.
Probability Distribution - the distribution of probabilities for each outcome of a random variable.
Logic of hypothesis testing - we decide whether there is an effect by seeing whether it is unlikely that there is no such effect.
Research hypothesis - statement in hypothesis testing about the predicted relation between populations (often a prediction of a difference between population means).
Null hypothesis - statement about a relation between populations that is the opposite of the research hypothesis: a statement that in the population there is no difference (or a difference opposite to that predicted) between populations; a contrived statement set up to examine whether it can be rejected as part of hypothesis testing.
Comparison Distribution - distribution used in hypothesis testing. It represents the population situation if the null hypothesis is true (for example, the mean and SD for babies who do not take the pill, because the null hypothesis states that there is no difference between pill takers and non-pill takers). To decide whether to reject the null hypothesis, check how extreme the score of your sample is on this comparison distribution.
Cutoff sample score / critical value - point on the comparison distribution at which, if reached or exceeded by the sample score, you reject the null hypothesis.
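The comparison-distribution and cutoff ideas above can be made concrete with a small Python sketch. The mean of 100, the SD of 15, and the sample score of 130 are made-up numbers, and the sketch assumes a normal comparison distribution and a directional prediction that the score will be high, tested at the .05 level. It converts the sample score to a Z score and checks whether it reaches the cutoff.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd


def raw_score(z, mean, sd):
    """Convert a Z score back to a raw score on the same distribution."""
    return z * sd + mean


# Hypothetical comparison distribution: the situation if the null hypothesis is true.
COMPARISON_MEAN = 100.0
COMPARISON_SD = 15.0
ALPHA = 0.05  # conventional significance level

# Cutoff Z score: the point that leaves the top 5 percent of a normal curve above it.
cutoff_z = NormalDist().inv_cdf(1 - ALPHA)

# Hypothetical sample score located on the comparison distribution.
sample_z = z_score(130.0, COMPARISON_MEAN, COMPARISON_SD)

# Reject the null hypothesis only if the sample score reaches or exceeds the cutoff.
reject_null = sample_z >= cutoff_z

print(f"cutoff Z = {cutoff_z:.3f}, sample Z = {sample_z:.1f}, reject null: {reject_null}")
```

With these made-up numbers the sample Z of 2.0 is beyond the cutoff of about 1.645, so the null hypothesis would be rejected; a sample score of 115 (Z = 1.0) would not reach the cutoff.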
Conventional levels of significance - p < .05 and p < .01; the levels of significance widely used in psychology.
Statistically significant - conclusion that the results of a study would be unlikely if in fact the sample studied represents a population that is no different from the population in general; an outcome of hypothesis testing in which the null hypothesis is rejected.
When you reject the null hypothesis - all you are saying is that your results support the research hypothesis or are statistically significant. You do not say that the results prove the research hypothesis or show that the research hypothesis is true. Terms such as "prove" and "true" are too strong because the results of research studies are based on probabilities. Specifically, they are based on the probability being low of getting your result if the null hypothesis were true. "Proven" and "true" are acceptable terms in logic and mathematics, but using them in conclusions from scientific research is unprofessional.
Directional Hypothesis - research hypothesis predicting a particular direction of difference between populations, for example a prediction that the population like the sample studied has a higher mean than the population in general.
One-tailed test - hypothesis-testing procedure for a directional hypothesis; the situation in which the region of the comparison distribution in which the null hypothesis would be rejected is all on one side (tail) of the distribution.
Nondirectional Hypothesis - research hypothesis that does not predict a particular direction of difference between the population like the sample studied and the population in general.
Two-tailed test - hypothesis-testing procedure for a nondirectional hypothesis; the situation in which the region of the comparison distribution in which the null hypothesis would be rejected is divided between the two tails of the distribution.
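The difference between one-tailed and two-tailed tests shows up in where the cutoff falls. The Python sketch below assumes a normal comparison distribution and a directional prediction in the upper tail; the function names `cutoff_z_scores` and `is_significant` are illustrative, not taken from the notes.

```python
from statistics import NormalDist


def cutoff_z_scores(alpha, tails):
    """Cutoff Z score(s) on a normal comparison distribution.

    tails=1 assumes the predicted direction is the upper tail.
    tails=2 splits alpha equally between the two tails.
    """
    std_normal = NormalDist()
    if tails == 1:
        return (std_normal.inv_cdf(1 - alpha),)
    if tails == 2:
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tails must be 1 or 2")


def is_significant(z, alpha=0.05, tails=2):
    """True if the sample's Z score falls in the rejection region."""
    cutoffs = cutoff_z_scores(alpha, tails)
    if tails == 1:
        return z >= cutoffs[0]
    lower, upper = cutoffs
    return z <= lower or z >= upper


# The same hypothetical sample Z score judged under both procedures at p < .05.
z = 1.8
print(is_significant(z, alpha=0.05, tails=1))  # True: 1.8 is beyond about 1.645
print(is_significant(z, alpha=0.05, tails=2))  # False: 1.8 is short of about 1.96
```

The example illustrates why the two-tailed test is the more conservative choice: with the rejection region divided between two tails, a result has to be more extreme to reach the cutoff.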
Hypothesis testing - the logical, statistical procedure for determining the likelihood of your study having gotten a particular pattern of results if the null hypothesis is true.
When a result is not extreme enough to reject the null hypothesis, why is it wrong to conclude that your result supports the null? - It is still possible that the research hypothesis is correct but the result in the particular sample was not extreme enough to reject the null hypothesis.
Statistically Significant - when the null hypothesis is rejected, the effect is said to be statistically significant. Statistical significance means only that the null hypothesis is rejected; it does not mean that the effect is important.
Why "fail to reject" rather than "accept" the null - what is tested is the likelihood of getting an outcome if the null hypothesis is true, not the likelihood that the null hypothesis is true.
Strict interpretation of a significant finding - the probability of observing a difference at least as large as the one we observed, given that the null hypothesis is true, is less than 5 percent. Put loosely, a difference this large would arise by chance alone less than 5 percent of the time if the null hypothesis were true.
Can't say "prove" - because hypothesis testing is based on probabilities.
Should hypothesis testing be rejected? - Criticisms: (1) with enough power, one can always find significance; (2) the assumption of a random sample from the population is never met in practice; (3) we cannot draw conclusions about the research (alternative) hypothesis; (4) probabilities are conditional in ways that cannot be documented.
Distribution of means - distribution of the means of all possible samples of a given size from a population.
Confidence interval - the range of scores (that is, scores between an upper and lower value) that is likely to include the true population mean; more precisely, the range of possible population means from which it is not highly unlikely that you could have obtained your sample mean.
Confidence Limit - upper or lower value of a confidence interval.
Greater the confidence - the broader the confidence interval.
95 percent confidence interval - confidence interval in which, roughly speaking, there is a 95 percent chance that the population mean falls within this interval.
99 percent confidence interval - confidence interval in which, roughly speaking, there is a 99 percent chance that the population mean falls within this interval.
What is the best estimate of a population mean? - The sample mean; the sample is more likely to have come from a population with the same mean than from any other population.
95% interval - we are 95% confident that the true population mean falls within a particular range.
What number is used to indicate the accuracy of an estimate of the population mean? - The standard error, which is roughly the average amount that means vary from the mean of the distribution of means.
Why is it wrong to say that the 95 percent confidence interval is the region in which there is a 95 percent probability of finding the true population mean? - Because you do not know the true population mean, you have no way of knowing for sure where to start when figuring the 95 percent probability.
What is the basis for our 95% confidence? - The lower confidence limit is the point at which a population with a true mean any lower would not have a 95 percent probability of including a sample with our mean. The upper confidence limit is the point at which a population with a true mean any higher would not have a 95 percent probability of including a sample with our mean.
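A confidence interval can be figured from the sample mean and the standard error. The Python sketch below uses made-up numbers and assumes that the population standard deviation is known and that the distribution of means is normal, so that Z scores from the normal curve give the limits; the function name `confidence_interval` is illustrative.

```python
import math
from statistics import NormalDist


def confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Return (lower limit, upper limit) for the population mean.

    Assumes a known population SD and a normal distribution of means.
    """
    # Standard error: standard deviation of the distribution of means.
    standard_error = population_sd / math.sqrt(n)
    # Z score that leaves (1 - confidence) split equally between the two tails.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical study: sample mean 220, population SD 48, 64 people in the sample.
low95, high95 = confidence_interval(220.0, 48.0, 64, confidence=0.95)
low99, high99 = confidence_interval(220.0, 48.0, 64, confidence=0.99)

print(f"95% CI: {low95:.2f} to {high95:.2f}")  # 95% CI: 208.24 to 231.76
print(f"99% CI: {low99:.2f} to {high99:.2f}")  # wider than the 95% interval
```

Here the standard error is 48 / 8 = 6, and the 99 percent interval comes out wider than the 95 percent interval, which matches the point above that greater confidence means a broader interval.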
Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved