Demonstrations#

Below you will find some Python scripts that demonstrate various statistical facts and theorems. We will go over them in class when the time comes.

Conditional Distributions#

This script generates a random bivariate sample of categorical data and then constructs the conditional distribution of one variable given each value of the other. A stacked bar chart is then created to visualize the association, or lack thereof, between the two variables.
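
As a rough illustration of the idea (not the course script itself), the sketch below builds a conditional distribution from a random bivariate sample using only the standard library. The category labels, the sample size, and the helper name `conditional_distribution` are assumptions made for this example, and the stacked bar chart is left out.

```python
import random
from collections import Counter


def conditional_distribution(pairs, given):
    """Relative frequencies of the second variable among pairs whose first value equals `given`."""
    counts = Counter(y for x, y in pairs if x == given)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no observations with first value {given!r}")
    return {category: count / total for category, count in counts.items()}


# Hypothetical categories; both variables are drawn independently,
# so the conditional distributions should look roughly alike.
rng = random.Random(0)
sample = [
    (rng.choice(["A", "B"]), rng.choice(["yes", "no", "maybe"]))
    for _ in range(200)
]

for group in ["A", "B"]:
    print(group, conditional_distribution(sample, group))
```

If the conditional distributions for the two groups differ noticeably, that suggests an association between the variables; here they are generated independently, so any difference is sampling noise.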

Stacked Bar Chart (Conditional Distribution)

Estimators#

This script contains many useful functions for young statisticians seeking to tame the wild beast of uncertainty.

Point Estimators

Measuring Variation#

This script displays a dot plot of a sample of data and illustrates how the different measures of variation are affected by slight alterations to the sample.
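
The effect can be sketched numerically without the dot plot. The example below is only an illustration under assumed data: the two small samples and the helper name `variation_summary` are made up here, and the quartiles use the default method of `statistics.quantiles`, which may differ from the convention used in class.

```python
import statistics


def variation_summary(data):
    """Collect several common measures of variation for a sample."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default 'exclusive' method
    return {
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "std_dev": statistics.stdev(data),      # sample standard deviation
    }


# Hypothetical sample, and the same sample with its largest value moved outward.
original = [4, 5, 5, 6, 6, 6, 7, 7, 8]
altered = original[:-1] + [12]

print("original:", variation_summary(original))
print("altered: ", variation_summary(altered))
```

Changing a single extreme value moves the range and standard deviation considerably, while the interquartile range depends only on the middle of the data.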

Measuring Variation

The Effect of Outliers#

This script generates a distribution of grades and visualizes it with a dot plot. It then calculates the sample mean and sample median and plots them as vertical lines, in red and green respectively.

We will alter the distribution in class to see how it affects the sample mean and median.
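
A minimal numeric sketch of the same comparison, using made-up grades rather than the script's generated distribution, shows how one extreme value pulls the mean much further than the median:

```python
import statistics

# Hypothetical grades, chosen only for illustration.
grades = [72, 75, 78, 80, 82, 85, 88]
with_outlier = grades + [5]  # one very low grade added

for label, data in [("original", grades), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label:>12}: mean = {mean:.3f}, median = {median:.3f}")
```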

The Effects of Outliers

Scatter Plot of Twitter Data#

This script shows how to parse a CSV file and then create a scatter plot from it. To execute this script, you will need to download the Twitter dataset from the Datasets section and place it in the same folder as the script.

This dataset is an example of a negative, non-linear correlation. In other words, although the two variables are clearly associated, a straight line is not an appropriate model, so we should not fit the raw data with simple linear regression.
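
The parsing step might look roughly like the sketch below. The column names `x` and `y`, the helper `read_columns`, and the inline data are all placeholders invented for this example; they are not the columns or values of the Twitter dataset. An in-memory string stands in for the file so the snippet runs on its own.

```python
import csv
import io


def read_columns(handle, x_name, y_name):
    """Read two numeric columns from an open CSV file into parallel lists."""
    xs, ys = [], []
    for row in csv.DictReader(handle):
        xs.append(float(row[x_name]))
        ys.append(float(row[y_name]))
    return xs, ys


# Made-up stand-in for a CSV file; the real dataset has its own columns.
fake_csv = io.StringIO("x,y\n1,10.0\n2,5.0\n4,2.5\n8,1.25\n")
xs, ys = read_columns(fake_csv, "x", "y")
print(list(zip(xs, ys)))
```

With a real file, the handle would come from `open(path, newline="")`, and the two lists could then be passed to a plotting call such as matplotlib's `plt.scatter(xs, ys)`.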

Twitter Data Scatter Plot

Die Roll Simulation#

This script simulates rolling m dice n times. For each of the n experiments, the outcomes of the m dice are summed, and a frequency distribution of the sums is created. The frequency distribution is visualized with a histogram.

The intent is to show how the sum of independent, identically distributed random variables becomes approximately normally distributed as the number of variables grows. This result is known as the Central Limit Theorem.
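
The simulation can be sketched in a few lines. This is an assumed minimal version, not the course script: the function name `simulate_dice_sums`, the choice of fair six-sided dice, and the text histogram (in place of a plotted one) are all choices made for this example.

```python
import random
from collections import Counter


def simulate_dice_sums(m, n, seed=None):
    """Roll m fair six-sided dice n times and count how often each sum occurs."""
    rng = random.Random(seed)
    return Counter(
        sum(rng.randint(1, 6) for _ in range(m))
        for _ in range(n)
    )


freq = simulate_dice_sums(m=5, n=10_000, seed=1)

# Crude text histogram: one '#' per 50 occurrences.
for total in sorted(freq):
    print(f"{total:2d} {'#' * (freq[total] // 50)}")
```

Rerunning with larger values of m should make the bell shape more pronounced, while m = 1 gives a roughly flat distribution.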

Die Roll Simulations

Normal Distribution#

This script shows how to work with the normal distribution in Python. It demonstrates how to calculate percentiles and probabilities. It also demonstrates how the symmetry of the normal distribution manifests numerically via the Law of Complements.
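
One way to try these calculations is with `statistics.NormalDist` from the standard library; the course script may well use a different library, so treat this as an assumed stand-in. The cutoff 1.96 and the 97.5th percentile are arbitrary values picked for the example.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Probabilities: area to the left, and its complement to the right.
p_below = z.cdf(1.96)
p_above = 1 - p_below

# Percentile: the value with 97.5% of the area to its left.
percentile_975 = z.inv_cdf(0.975)

# Symmetry: the left tail below -1.96 matches the right tail above 1.96.
left_tail = z.cdf(-1.96)

print(f"P(Z < 1.96)     = {p_below:.4f}")
print(f"P(Z > 1.96)     = {p_above:.4f}")
print(f"P(Z < -1.96)    = {left_tail:.4f}")
print(f"97.5th percentile = {percentile_975:.4f}")
```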

Normal Distribution

QQ Plot#

This script shows how to construct a QQ plot to assess the normality of a sample of data.
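
The points of a normal QQ plot can be computed by pairing each ordered sample value with a theoretical standard normal quantile. The sketch below assumes the plotting positions (i - 0.5) / n, which is one common convention among several, and the helper name `qq_points` and the simulated sample are made up for this example. Plotting is omitted.

```python
import random
from statistics import NormalDist


def qq_points(sample):
    """Pair each ordered sample value with a standard normal quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical sample drawn from a normal distribution.
rng = random.Random(42)
sample = [rng.gauss(50, 10) for _ in range(20)]

for theo, obs in qq_points(sample):
    print(f"{theo:6.2f}  {obs:6.2f}")
```

If the sample is close to normal, the pairs fall near a straight line when plotted; systematic curvature suggests skewness or heavy tails.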

QQ Plot

Least Squares Regression#

This script illustrates how the regression parameters for the slope and intercept of the line of best fit are estimated using least squares.
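
For simple linear regression, the least squares estimates have a closed form: the slope is the sum of cross-deviations divided by the sum of squared deviations in x, and the intercept makes the line pass through the point of means. The sketch below implements those formulas directly; the function name `least_squares` and the data are assumptions for illustration, not taken from the course script.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return slope, intercept


# Hypothetical, roughly linear data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares(xs, ys)
print(f"y = {intercept:.3f} + {slope:.3f} x")
```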

Least Squares


../../_images/least_squares.png