Describing Data

Data in its original form is called raw data. Raw data is difficult to analyze and explain, so statisticians organize it. The most fundamental way to organize data is by frequency.

Frequency Distribution
The frequency of a value is the number of times it appears in the data set.


 * For example, if the grades in a class were 86, 86, 89, 97, and 100 then the frequency of 86 is 2, 89 is 1, 97 is 1, and 100 is 1.

Frequency can also be the number of values that fall within a range.


 * For example, if you organize those same grades by letter-grade range, then the frequency of B's is 3 and the frequency of A's is 2.
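Both kinds of count in the examples above can be reproduced with a short Python sketch. The letter-grade cutoffs (90 and above for an A, 80 to 89 for a B) are an assumption made for illustration; the text does not state them.

```python
from collections import Counter

grades = [86, 86, 89, 97, 100]

# Frequency of each individual value.
value_freq = Counter(grades)

def letter(grade):
    # Assumed cutoffs, not given in the text: 90+ is an A, 80-89 is a B.
    if grade >= 90:
        return "A"
    if grade >= 80:
        return "B"
    return "below B"

# Frequency of each range (letter grade).
range_freq = Counter(letter(g) for g in grades)

print(dict(value_freq))  # {86: 2, 89: 1, 97: 1, 100: 1}
print(dict(range_freq))  # {'B': 3, 'A': 2}
```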

A frequency distribution organizes data into a table of classes and their frequencies. There are three basic types of frequency distribution: categorical, ungrouped, and grouped.

Categorical

 * Example: Final grades of a calculus class:

Ungrouped

 * Example: Dexterity test on 25 different 3rd graders. The time it took the students to complete the task (in minutes) was collected:

Grouped

 * Example: President ages:

Graphing a Frequency Distribution
There are many ways to graph a frequency distribution.

Graphing the data, for example with a histogram, helps the statistician describe it. The data can be described in terms of skewness (shape), centrality (measures of central tendency), and spread.

Histogram
The most common method of graphing a frequency distribution is the histogram. A histogram is similar to a bar graph, except that adjacent bars touch with no gaps between them.

Each bar in the histogram represents the frequency of a value or class. Histograms most commonly have between 6 and 20 bars. It is up to the statistician to determine the number of bars that will be most representative of the data.
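As a rough sketch of how the bars are obtained, the Python below counts the values in equal-width classes and prints a text histogram. The function name `class_frequencies`, the class width of 10, and the starting point of 20 are illustrative choices. The data is the 15-value set from the stem and leaf example on this page.

```python
def class_frequencies(data, low, width, n_classes):
    """Count the values in each class [low + i*width, low + (i+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        i = (x - low) // width
        if 0 <= i < n_classes:  # values outside every class are ignored
            counts[i] += 1
    return counts

data = [20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86]
low, width, n_classes = 20, 10, 7  # illustrative choice: 7 bars
counts = class_frequencies(data, low, width, n_classes)

# One row per class; the number of '#' marks is the height of the bar.
for i, c in enumerate(counts):
    start = low + i * width
    print(f"{start}-{start + width - 1}: {'#' * c}")
```

With these choices the class counts are 5, 5, 1, 2, 0, 0, and 2. Changing `width` changes the number of bars, which is the judgment call described above.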

Stem and Leaf Plot
Less common is the stem and leaf plot. Each value is split into a stem (its leading digits) and a leaf (its last digit). To make this chart, first sort the data in ascending order.


 * Example: Let the data set contain 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, and 86. Then the stem and leaf plot would look like this:
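A plot for this data set can be generated with the Python sketch below. The helper name `stem_and_leaf` is hypothetical, and listing stems that have no leaves (6 and 7 here) is a layout assumption; some presentations omit them.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return the lines of a stem and leaf plot for non-negative integers.

    The stem is the value without its last digit; the leaf is the last digit.
    """
    leaves = defaultdict(list)
    for x in sorted(data):  # ascending order keeps each row's leaves sorted
        leaves[x // 10].append(x % 10)
    lines = []
    # Assumption: stems with no leaves are still listed, to show the gaps.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines

data = [20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86]
plot = stem_and_leaf(data)
print("\n".join(plot))
# 2 | 0 5 5 7 8
# 3 | 1 3 4 6 7
# 4 | 4
# 5 | 0 9
# 6 |
# 7 |
# 8 | 5 6
```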

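Stem and leaf plots can also be used to compare two data sets by placing the leaves of one set to the left of a shared column of stems and the leaves of the other set to the right.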


 * Example: Let data set A contain 22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, and 60. Let data set B contain 9, 9, 22, 32, 33, 39, 39, 42, 49, 52, 58, and 70. Then the stem and leaf plot would look like this:
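One possible back-to-back layout for these two data sets is sketched below in Python. The helper name `back_to_back`, the choice to put set A on the left and set B on the right, and the listing of stems with no leaves are all assumptions made for illustration.

```python
def back_to_back(a, b):
    """Return the lines of a back-to-back stem and leaf plot (a left, b right)."""
    def leaves_by_stem(data):
        table = {}
        for x in sorted(data):
            table.setdefault(x // 10, []).append(str(x % 10))
        return table

    left, right = leaves_by_stem(a), leaves_by_stem(b)
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)
    # Left-hand leaves are read outward from the stem, so reverse them.
    left_rows = {s: " ".join(reversed(left.get(s, []))) for s in stems}
    pad = max(len(row) for row in left_rows.values())
    return [
        f"{left_rows[s]:>{pad}} | {s} | {' '.join(right.get(s, []))}".rstrip()
        for s in stems
    ]

set_a = [22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, 60]
set_b = [9, 9, 22, 32, 33, 39, 39, 42, 49, 52, 58, 70]
lines = back_to_back(set_a, set_b)
print("\n".join(lines))
```

In the result, the row for stem 4 reads `9 7 6 6 6 1 1 | 4 | 2 9`: set A has seven values in the forties and set B has two.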