Demonstrations#
Below you will find some Python scripts that demonstrate various statistical facts and theorems. We will go over them in class when the time comes.
Conditional Distributions#
This script generates a random bivariate sample of categorical data and then constructs the conditional distribution of one variable given each category of the other. A stacked bar chart of these conditional distributions is then created to visualize the association, or lack thereof, between the two variables.
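As a rough illustration of the conditional-distribution step, here is a minimal sketch that uses only the standard library. The category labels, the sample size, and the function name `conditional_distribution` are assumptions made for this example and are not taken from the course script; the stacked bar chart is omitted.

```python
import random
from collections import Counter


def conditional_distribution(pairs):
    """Return {x: {y: P(Y = y | X = x)}} from a list of (x, y) pairs."""
    joint = Counter(pairs)                   # joint counts n(x, y)
    marginal = Counter(x for x, _ in pairs)  # marginal counts n(x)
    table = {}
    for (x, y), count in joint.items():
        table.setdefault(x, {})[y] = count / marginal[x]
    return table


rng = random.Random(0)
# Hypothetical categories; the two variables are drawn independently here,
# so the rows of the table should look roughly alike.
sample = [
    (rng.choice(["group A", "group B"]), rng.choice(["yes", "no"]))
    for _ in range(200)
]
cond = conditional_distribution(sample)
for x, row in sorted(cond.items()):
    print(x, {y: round(p, 3) for y, p in sorted(row.items())})
```

Each row of the table sums to 1. Rows that look similar across categories suggest no association; rows that differ noticeably suggest one.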
Estimators#
This script contains many useful functions for young statisticians seeking to tame the wild beast of uncertainty.
Measuring Variation#
This script displays a dot plot of a sample of data and illustrates how the different measures of variation are affected by slight alterations to the sample.
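The idea can be sketched numerically without the dot plot. The data values and the helper name `variation_summary` below are made up for illustration, and the quartiles use the default method of `statistics.quantiles`, which may differ from the convention used in class.

```python
import statistics


def variation_summary(data):
    """Compute several common measures of variation for a sample."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # sample quartiles
    return {
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "stdev": statistics.stdev(data),        # sample standard deviation
    }


original = [2, 4, 4, 5, 6, 7, 9]
altered = original[:-1] + [19]  # move only the largest observation

before = variation_summary(original)
after = variation_summary(altered)
for name in before:
    print(f"{name:>8}: {before[name]:.3f} -> {after[name]:.3f}")
```

In this example, moving a single extreme value changes the range, variance, and standard deviation, while the interquartile range stays the same.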
The Effect of Outliers#
This script generates a distribution of grades and visualizes it with a dot plot. It then calculates the sample mean and sample median and plots them as vertical lines, in red and green respectively.
We will alter the distribution in class to see how it affects the sample mean and median.
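A small numeric sketch of the same effect, with the plot left out. The grade values here are hypothetical and fixed by hand rather than generated as in the script.

```python
import statistics

# Hypothetical grades, chosen only for illustration
grades = [72, 75, 78, 80, 83, 85, 88]
with_outlier = grades + [5]  # add one very low grade

for label, data in [("original", grades), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label:>12}: mean = {mean:.2f}, median = {median:.2f}")
```

The single low grade pulls the mean down by several points, while the median shifts by only one point.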
Scatter Plot of Twitter Data#
This script shows how to parse a CSV file and then create a scatter plot from it. To execute this script, you will need to download the Twitter dataset from the Datasets section and place it in the same folder as this script.
This dataset is an example of a negative, non-linear correlation. In other words, even though the two variables are clearly related, a straight line fit by linear regression is not an appropriate model for this dataset.
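The parse-then-plot workflow might look roughly like the sketch below. The column names `x` and `y`, the in-memory stand-in data, and the helper names `load_xy` and `plot_scatter` are all assumptions for illustration; the actual Twitter dataset will have its own file name and column headers.

```python
import csv
import io


def load_xy(handle, x_col, y_col):
    """Read two numeric columns from an open CSV file into parallel lists."""
    xs, ys = [], []
    for row in csv.DictReader(handle):
        xs.append(float(row[x_col]))
        ys.append(float(row[y_col]))
    return xs, ys


def plot_scatter(xs, ys, x_label, y_label):
    """Draw a scatter plot; matplotlib is imported here so parsing works without it."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    plt.show()


# Stand-in data with hypothetical column names. For a real file, pass
# open(path, newline="") to load_xy instead of this StringIO object.
demo = io.StringIO("x,y\n1,9.0\n2,4.5\n3,3.0\n4,2.25\n")
xs, ys = load_xy(demo, "x", "y")
print(xs, ys)
```

Calling `plot_scatter(xs, ys, "x", "y")` afterwards displays the plot, provided matplotlib is installed.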
Die Roll Simulation#
This script simulates rolling m dice n times. The outcomes of the m die rolls are then summed, and a frequency distribution is created for the n experiments. The frequency distribution is visualized with a histogram.
The intent is to show how the random variation of independent, identically distributed random variables leads naturally to the normal distribution. This result is known as the Central Limit Theorem.
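A minimal sketch of such a simulation, using only the standard library and a text histogram in place of a plotted one. The function name `simulate_dice_sums`, the values of m and n, and the seed are assumptions chosen for this example.

```python
import random
from collections import Counter


def simulate_dice_sums(m, n, seed=None):
    """Roll m fair six-sided dice n times and return the n sums."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) for _ in range(m)) for _ in range(n)]


sums = simulate_dice_sums(m=10, n=5000, seed=1)
freq = Counter(sums)  # frequency distribution of the sums

# Crude text histogram: one '#' per 20 occurrences
for total in sorted(freq):
    print(f"{total:3d} {'#' * (freq[total] // 20)}")
```

With m = 10 the sums cluster around 35 (10 times the single-die mean of 3.5) in a roughly bell-shaped pattern. Trying m = 1 gives a flat distribution instead, which makes the contrast easy to see.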
Normal Distribution#
This script shows how to work with the normal distribution in Python. It demonstrates how to calculate percentiles and probabilities. It also demonstrates how the symmetry of the normal distribution manifests numerically via the Law of Complements.
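One way to carry out these calculations is with `statistics.NormalDist` from the standard library, as sketched below. The course script may use a different library, and the cutoff 1.5 and the 90th percentile are arbitrary choices for illustration.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Probabilities: P(Z <= 1.5), and its complement P(Z > 1.5)
p_below = z.cdf(1.5)
p_above = 1 - p_below

# Symmetry: P(Z > 1.5) should match P(Z <= -1.5)
p_mirror = z.cdf(-1.5)

# Percentile: the value below which 90% of the distribution lies
pct_90 = z.inv_cdf(0.90)

print(f"P(Z <= 1.5)  = {p_below:.4f}")
print(f"P(Z > 1.5)   = {p_above:.4f}")
print(f"P(Z <= -1.5) = {p_mirror:.4f}")
print(f"90th percentile = {pct_90:.4f}")
```

The complement of the left-tail probability at 1.5 agrees with the left-tail probability at -1.5, which is the symmetry expressed numerically.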
QQ Plot#
This script shows how to construct a QQ plot to assess the normality of a sample of data.
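The points of a normal QQ plot can be computed as in the sketch below, with the plotting step left out. The plotting positions (i - 0.5) / n are one common convention and an assumption here, as are the function name `qq_points` and the simulated sample.

```python
import random
from statistics import NormalDist


def qq_points(sample):
    """Pair each sorted observation with the standard normal quantile at (i - 0.5) / n."""
    n = len(sample)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(sample)))


rng = random.Random(2)
sample = [rng.gauss(50, 10) for _ in range(30)]  # hypothetical sample
for theo, obs in qq_points(sample)[:5]:
    print(f"theoretical = {theo:6.3f}, observed = {obs:6.2f}")
```

Plotting the observed values against the theoretical quantiles gives the QQ plot. Points that fall close to a straight line are consistent with a normal sample.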
Least Squares Regression#
This script illustrates how the regression parameters for the slope and intercept of the line of best fit are estimated using least squares.
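For reference, the closed-form least squares estimates for simple linear regression can be sketched as follows: the slope is the sum of cross-deviations divided by the sum of squared deviations in x, and the intercept makes the line pass through the point of means. The function name `least_squares_fit` and the data are assumptions for illustration.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the least squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = least_squares_fit(xs, ys)
print(f"slope = {slope}, intercept = {intercept}")
```

Because the example data lie exactly on a line, the fit recovers that line's slope and intercept; with noisy data the same formulas give the line that minimizes the sum of squared vertical residuals.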