Graphical Representations of Data#

Definitions#

Frequency

f(x)

The number of times an observation x occurs in a sample S.

Frequency Distributions#

Ungrouped Distributions#

Suppose you ask 10 people their favorite color and the following data set represents their answers,

S = \{ r, b, g, g, r, r, y, o, r, b \}

where

b = response of “blue”

g = response of “green”

o = response of “orange”

r = response of “red”

y = response of “yellow”

An ungrouped frequency distribution is simply a table in which each entry gives the frequency of one possible observation,

x      f(x)
b      2
g      2
o      1
r      4
y      1
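As a minimal sketch, the tally above can be reproduced in Python with `collections.Counter` from the standard library. The variable names (`sample`, `freq`) are illustrative and not part of the original text.

```python
from collections import Counter

# Responses from the favorite-color survey, in the order they were given.
sample = ["r", "b", "g", "g", "r", "r", "y", "o", "r", "b"]

# Counter maps each distinct observation x to its frequency f(x).
freq = Counter(sample)

# Print the ungrouped frequency distribution, sorted by observation.
for x in sorted(freq):
    print(x, freq[x])
```

The frequencies should sum to the sample size, which is a quick sanity check on any frequency table.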

Grouped Distributions#

The steps for constructing a grouped frequency distribution are given below.

Steps
  1. Find the range of the data set.

R = \max(S) - \min(S)

  2. Choose the number of classes n. Typically n is between 5 and 20, depending on the size and type of the data.

  3. Find the class width. Round up, if necessary.

w = \frac{R}{n}

  4. Find the lower and upper class limits LL_i and UL_i for each class i = 1, 2, ..., n.

LL_i = \min(S) + (i-1) \cdot w

UL_i = \min(S) + i \cdot w

i = 1, 2, ... , n

  5. Find the lower and upper class boundaries LB_i and UB_i for each class. The offset of 0.5 is half a unit for data recorded to the nearest whole number; for data recorded to another precision, use half of the smallest recorded increment.

LB_i = LL_i - 0.5

UB_i = UL_i + 0.5

i = 1, 2, ... , n

  6. Sort the data set into classes and tally up the frequency of each class.

Example

Suppose you measure the height of everyone in your class and get the following sample of data, where each observation in the data set is measured in feet,

S = \{ 5.7, 5.8, 5.5, 5.7, 5.9, 6.3, 5.3, 5.5, 5.4, 5.3, 5.7, 5.9 \}

Find the grouped frequency distribution for this sample of data using n = 5 classes.
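One possible way to carry out the range, width, limit, and tally steps for this sample is sketched below. It rests on two assumptions that the text does not state: the heights are converted to tenths of a foot so that all arithmetic is on exact integers, and an observation equal to a shared class limit is counted in the higher class, except that the last class also includes its upper limit. The class boundaries are not computed here.

```python
import math

heights_ft = [5.7, 5.8, 5.5, 5.7, 5.9, 6.3, 5.3, 5.5, 5.4, 5.3, 5.7, 5.9]
n_classes = 5

# Assumption: work in tenths of a foot to avoid floating-point comparisons.
tenths = [round(h * 10) for h in heights_ft]

# Range of the data, then class width rounded up.
lowest, highest = min(tenths), max(tenths)
data_range = highest - lowest
width = math.ceil(data_range / n_classes)

classes = []
for i in range(1, n_classes + 1):
    lower = lowest + (i - 1) * width   # LL_i
    upper = lowest + i * width         # UL_i
    is_last = i == n_classes
    # Assumption: classes are [LL_i, UL_i), and the last class is [LL_n, UL_n].
    freq = sum(1 for x in tenths if lower <= x < upper or (is_last and x == upper))
    classes.append((lower / 10, upper / 10, freq))

for lower, upper, freq in classes:
    print(f"{lower:.1f} to {upper:.1f}: {freq}")
```

A different tie-breaking rule for observations that fall exactly on a class limit would move some counts between neighbouring classes, so the convention should be stated alongside the table.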

TODO

Histograms#

A histogram is a graphical representation of a frequency distribution. The classes or bins are plotted on the x-axis against the frequency of each class on the y-axis.

../../_images/histogram_random.png
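To make the link between a frequency distribution and its histogram concrete, the sketch below draws a rough text histogram, one row per class, with bar length equal to the class frequency. The class labels and frequencies are hypothetical values chosen for illustration, and `text_histogram` is a helper name introduced here.

```python
# Hypothetical class labels and frequencies, used only for illustration.
classes = [
    ("5.3-5.5", 3),
    ("5.5-5.7", 2),
    ("5.7-5.9", 4),
    ("5.9-6.1", 2),
    ("6.1-6.3", 1),
]

def text_histogram(rows):
    """Return one line per class: the label followed by a bar of '#' marks."""
    return [f"{label} | {'#' * freq}" for label, freq in rows]

for line in text_histogram(classes):
    print(line)
```

A plotting library would place the classes on the x-axis and the frequencies on the y-axis; the text version simply turns that picture on its side.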

Variations#

A basic histogram can be modified to accommodate a variety of scenarios, depending on the specifics of the problem. In each case below, the sample’s frequency distribution is used as the basis for constructing the graph.

Bar Charts#

Sometimes the frequency distribution has already been calculated for us. In cases like this, a simple bar chart is all that is required.

../../_images/bar_chart.png

Stem-Leaf Plots#

TODO

Relative Frequency Plots#

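Relative frequency histograms express the frequency of each class as a proportion of the total number of observations in the sample (often reported as a percentage),

p(x_i) = \frac{f(x_i)}{n}

where f(x_i) is the frequency of class i and n here denotes the total number of observations in the sample, not the number of classes.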

../../_images/histogram_relative.png
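As a short sketch, relative frequencies can be obtained by dividing each frequency by the total number of observations. The example reuses the favorite-color responses from the ungrouped distribution; the names `freq` and `rel_freq` are illustrative.

```python
from collections import Counter

sample = ["r", "b", "g", "g", "r", "r", "y", "o", "r", "b"]
n_obs = len(sample)  # total number of observations

# Frequency of each observation, then its share of the whole sample.
freq = Counter(sample)
rel_freq = {x: count / n_obs for x, count in freq.items()}

for x in sorted(rel_freq):
    print(x, rel_freq[x])
```

The relative frequencies should sum to 1 (or 100 percent), up to floating-point rounding.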

Distribution Shapes#

TODO

Uniform#

../../_images/histogram_uniform.png

Normal#

../../_images/histogram_normal.png

Bimodal#

../../_images/histogram_bimodal.png

Skewed#

Skewed Right

../../_images/histogram_skewed_right.png

Skewed Left

../../_images/histogram_skewed_left.png

Ogives#

TODO

../../_images/histogram_and_ogive.png

Note

Your book’s authors call these types of graphs ogives. Be aware, you will almost never see these graphs referred to by that term. In practice, they are almost always called cumulative frequency distributions.

Construction#

  1. Find the relative frequency distribution
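The construction steps are incomplete in the text. As a hedged sketch, the following assumes that the remaining step is a running total of the class frequencies, which is the usual way a cumulative frequency distribution is built. The class frequencies are hypothetical values used only for illustration.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed from the lowest class to the highest.
class_freq = [3, 2, 4, 2, 1]
total = sum(class_freq)

# Running total of the frequencies: observations at or below each class.
cum_freq = list(accumulate(class_freq))

# Cumulative relative frequency: the running total as a share of the sample.
cum_rel_freq = [c / total for c in cum_freq]

for c, r in zip(cum_freq, cum_rel_freq):
    print(c, round(r, 3))
```

The last cumulative relative frequency is always 1, since every observation falls at or below the highest class.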

Distribution Shapes#

TODO

Uniform#

../../_images/ogive_uniform.png

Normal#

../../_images/ogive_normal.png

Bimodal#

../../_images/ogive_bimodal.png

Skewed#

Skewed Right

../../_images/ogive_skewed_right.png

Skewed Left

../../_images/ogive_skewed_left.png

Boxplots#

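While histograms and ogives provide a wealth of information about the sample distribution, they do not give us the whole picture. A boxplot complements them by summarizing the sample with five numbers: the minimum, the first quartile, the median, the third quartile, and the maximum.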

Construction#

  1. Find the maximum observation.

  2. Find the 75th percentile (third quartile).

  3. Find the 50th percentile (median).

  4. Find the 25th percentile (first quartile).

  5. Find the minimum observation.
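The five values above can be computed with Python's standard `statistics` module, as in the sketch below, which uses the height sample from the grouped distribution example. It assumes the default "exclusive" method of `statistics.quantiles`; other quartile conventions exist and can give slightly different first and third quartiles.

```python
import statistics

heights_ft = [5.7, 5.8, 5.5, 5.7, 5.9, 6.3, 5.3, 5.5, 5.4, 5.3, 5.7, 5.9]

# quantiles(..., n=4) returns the three cut points between the four quarters.
q1, median, q3 = statistics.quantiles(heights_ft, n=4)

five_number_summary = {
    "max": max(heights_ft),
    "q3": q3,
    "median": median,
    "q1": q1,
    "min": min(heights_ft),
}

for name, value in five_number_summary.items():
    print(name, round(value, 3))
```

The box of the boxplot spans the first to the third quartile, with a line at the median, and the whiskers extend toward the minimum and maximum.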

Distribution Shapes#

TODO

Uniform#

../../_images/boxplot_uniform.png

Normal#

../../_images/boxplot_normal.png

Bimodal#

../../_images/boxplot_bimodal.png

Skewed#

Skewed Right

../../_images/boxplot_skewed_right.png

Skewed Left

../../_images/boxplot_skewed_left.png

Scatter Plots#

No Correlation

../../_images/scatterplot_no_correlation.png

Positive Correlation

../../_images/scatterplot_positive_correlation.png

Negative Correlation

../../_images/scatterplot_negative_correlation.png

Other Types of Graphs#

TODO

Pie Chart#

TODO

Time Series#

Positive Trend

../../_images/timeseries_positive_trend.png

Negative Trend

../../_images/timeseries_negative_trend.png

No Trend

../../_images/timeseries_no_trend.png