If you're looking for chapter notes on variability, standardization, and sampling distributions, you're in the right place. These notes are not only for my fellow Bertelsmann Udacity Data Science Scholarship students who are just beginning or want to go through the material once more, but for anyone interested in the subject.
Variability measures how much your scores differ from each other. In other words, it refers to how spread out a group of data is.
Measures of Variability: Range
Range is the simplest measure of variability. You take the smallest number and subtract it from the largest number to calculate the range. This shows the spread of our data. The range is sensitive to outliers, or values that are significantly higher or lower than the rest of the data set, and should not be used when outliers are present.
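To make the outlier problem concrete, here is a minimal sketch in Python. The scores are made up for illustration and do not come from the course material.

```python
# Hypothetical scores, chosen only for illustration.
scores = [4, 8, 15, 16, 23, 42]

# Range = largest value minus smallest value.
data_range = max(scores) - min(scores)
print(data_range)  # 38

# A single extreme value dominates the range.
with_outlier = scores + [400]
outlier_range = max(with_outlier) - min(with_outlier)
print(outlier_range)  # 396
```

One added value moved the range from 38 to 396, even though the other six scores did not change.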
Measures of Variability: IQR
When working with data sets that contain outliers, we can use the IQR (interquartile range). The IQR, or the middle fifty, is the range of the middle fifty percent of the data: the third quartile minus the first quartile.
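The sketch below compares the IQR and the range on made-up data, with and without an outlier. It assumes Python 3.8+ and uses `statistics.quantiles` with its default method. Other tools compute quartiles with slightly different conventions, so exact values can differ between libraries.

```python
import statistics

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

# Hypothetical data, chosen only for illustration.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]

iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)
print(iqr_clean, iqr_outlier)  # 6.0 6.0

# The range, by contrast, is pulled far out by the single extreme value.
range_clean = max(clean) - min(clean)
range_outlier = max(with_outlier) - min(with_outlier)
print(range_clean, range_outlier)  # 10 999
```

The outlier leaves the IQR unchanged because it sits outside the middle fifty percent of the data.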
Measures of Variability: Variance
Variance is the average of the squared deviations from the mean.
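As a rough illustration, the definition can be followed step by step in Python. The scores are made up, and the sketch assumes they are the whole population, so it divides by n. When the scores are a sample used to estimate a population variance, the usual convention is to divide by n - 1 instead, which is what `statistics.variance` does.

```python
import statistics

# Hypothetical scores, treated here as a full population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)                       # 5.0
squared_deviations = [(x - mean) ** 2 for x in scores]
variance = sum(squared_deviations) / len(scores)
print(variance)  # 4.0

# The standard library gives the same population variance.
library_variance = statistics.pvariance(scores)
```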
Measures of Variability: Standard Deviation
The standard deviation is the square root of the variance. Like the variance, it measures how close the scores in the data set are to the mean, but it is expressed in the same units as the scores themselves.
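A short sketch of that relationship, using made-up scores treated as a full population (an assumption for this example):

```python
import math
import statistics

# Hypothetical scores, treated here as a full population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(scores)   # population variance
std_dev = math.sqrt(variance)             # square root of the variance
print(std_dev)  # 2.0

# statistics.pstdev computes the population standard deviation directly.
library_std_dev = statistics.pstdev(scores)
```

A standard deviation of 2 says the scores typically sit about 2 points from the mean of 5.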
Standardization converts individual scores to standard scores, which lets us determine where a score falls in relation to the other scores.
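The sketch below assumes that "standard score" means the usual z-score, z = (score - mean) / standard deviation. The helper name `z_score` and the scores are hypothetical, chosen only for illustration.

```python
import statistics

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical scores, treated here as a full population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.fmean(scores)       # 5.0
std_dev = statistics.pstdev(scores)   # 2.0

z_high = z_score(9, mean, std_dev)
z_low = z_score(4, mean, std_dev)
print(z_high)  # 2.0
print(z_low)   # -0.5
```

A score of 9 is two standard deviations above the mean, while a score of 4 is half a standard deviation below it.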
A sampling distribution is the frequency distribution of a statistic over many random samples from a single population.
Central Limit Theorem
The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger, whatever the shape of the population (provided its variance is finite).
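A small simulation can make sampling distributions and the Central Limit Theorem more tangible. This is only a sketch under stated assumptions: the population is an exponential distribution with mean 1 (chosen because it is clearly skewed), and the helper name `sample_means`, the seed, and the sample counts are arbitrary choices for illustration.

```python
import random
import statistics

def sample_means(sample_size, num_samples, rng):
    """Draw num_samples random samples and return the mean of each one."""
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

rng = random.Random(42)  # fixed seed so the run is repeatable

# Sampling distributions of the mean for a small and a larger sample size.
means_small = sample_means(sample_size=5, num_samples=2000, rng=rng)
means_large = sample_means(sample_size=50, num_samples=2000, rng=rng)

# Both are centered near the population mean of 1.0, but the larger
# sample size gives a much narrower spread of sample means.
center_small = statistics.fmean(means_small)
center_large = statistics.fmean(means_large)
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)
print(center_small, spread_small)
print(center_large, spread_large)
```

Plotting a histogram of `means_large` should show a roughly bell-shaped curve, even though the individual values come from a skewed population.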