The Language of Statistics

As if learning statistics isn't difficult enough, there is also a language barrier between statisticians and everyone else. This section covers some of the basic terms in the language of statistics.

Population
Formal Definition: The complete set of observations about which an investigator wishes to draw conclusions.

Think about the population as all of the people you are interested in studying.


 * For example, pretend you want to know if playing sports affects the GPA of college students. The population is all college students.

Sample
Formal Definition: A subset of a population.

The sample is all of the people you actually study. In most cases it is impossible to study everyone in a population (imagine trying to ask everyone in the world even a single question). Instead of trying to do the impossible, use a sample from the population.


 * For example, pretend you want to know if playing sports affects the GPA of college students. A sample could be 50 college students: 25 who play sports and 25 who don't play sports.

The sample size is the number of people (or scores) in your sample. For information on how a sample is obtained, refer to Sampling Methods.
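Drawing a sample and counting its size can be sketched in a few lines of Python. This is only an illustration, assuming a made-up population of 10,000 student ID numbers and a simple random draw, which is just one of several possible sampling methods. The names `population`, `sample`, and `sample_size` are hypothetical.

```python
import random

# Hypothetical population: ID numbers standing in for 10,000 college students.
population = list(range(10_000))

# Fixed seed so the same sample is drawn every time the script runs.
rng = random.Random(42)

# Simple random sample of 50 students, drawn without replacement.
sample = rng.sample(population, k=50)

# The sample size is the number of people in the sample.
sample_size = len(sample)
print(sample_size)
```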

Parameter
Formal Definition: A descriptive index of a population.

Parameters measure populations.

In the behavioral sciences, very few parameters are actually known. A common example is IQ. Enough research has been done with IQ testing that the tests are scored so the average IQ is 100. That average is a parameter because it describes the entire population.

Statistic
Formal Definition: A descriptive index of a sample.

There are two different types of statistics: descriptive and inferential. Descriptive statistics organize and summarize data to make them easier to understand.


 * For example, if you want to know how your test grade compares to your classmates' grades, you may arrange the scores from highest to lowest. Or you could take the average of the class test grades and compare it to your own grade.
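Both ways of describing the class grades can be tried out in Python. The sketch below assumes five made-up test scores and uses the standard library's `statistics.mean`. The names `class_scores` and `your_score` are hypothetical.

```python
from statistics import mean

# Made-up test scores for a class of five; your_score is one of them.
class_scores = [88, 92, 75, 64, 95]
your_score = 88

# Organize: arrange the scores from highest to lowest.
ranked = sorted(class_scores, reverse=True)

# Summarize: take the class average and compare it to your own score.
class_mean = mean(class_scores)
above_average = your_score > class_mean

print(ranked)
print(class_mean, above_average)
```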

The second type of statistics, inferential, is used to draw conclusions about a population based on information drawn from a sample. This type of statistics is what allows us to use a sample instead of a population.


 * For example, to find out if a new cold medicine works, doctors will test it out on a sample. The effects the drug has on the sample are used to make inferences about how the drug would affect the population. So instead of testing the drug on the entire population, the drug is tested on a sample and the results are generalized to the population.
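The relationship between a parameter and a statistic can be illustrated with a small simulation. This is a rough sketch, not a real study: it assumes a simulated population of 100,000 scores generated with a mean of 100 and a standard deviation of 15, and every name in it is hypothetical. In real research the population mean is usually unknown, which is why the sample mean is used to estimate it.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed for a repeatable run

# Simulated population of 100,000 scores (an assumption for illustration,
# generated with mean 100 and standard deviation 15).
population = [rng.gauss(100, 15) for _ in range(100_000)]

# Parameter: describes the whole population (usually unknown in practice).
population_mean = mean(population)

# Statistic: describes only the 200 scores actually sampled.
sample = rng.sample(population, k=200)
sample_mean = mean(sample)

# The statistic serves as an estimate of the parameter.
estimation_error = abs(sample_mean - population_mean)
print(round(population_mean, 1), round(sample_mean, 1), round(estimation_error, 1))
```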

Constants
Formal Definition: A characteristic that does not change or only has one value.

A constant is something that is the same for everyone.
 * For example, if you are interested in only studying females, then gender is a constant because you are only using one option: female.

Variables
Formal Definition: A characteristic that can change or take on more than one value.

A variable is something that is not the same for everyone.
 * For example, not everyone is the same height, so height is a variable.
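One way to check whether a characteristic is a constant or a variable in a data set is to count its distinct values. The sketch below assumes three made-up participant records. The helper `is_constant` and the field names are hypothetical, introduced only for illustration.

```python
# Hypothetical study records: every participant is female, heights differ.
participants = [
    {"gender": "female", "height_cm": 160},
    {"gender": "female", "height_cm": 172},
    {"gender": "female", "height_cm": 165},
]


def is_constant(records, characteristic):
    """Return True if the characteristic has only one value across all records."""
    distinct_values = {record[characteristic] for record in records}
    return len(distinct_values) == 1


print(is_constant(participants, "gender"))     # gender is a constant here
print(is_constant(participants, "height_cm"))  # height is a variable
```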

Discrete
Discrete variables can take on only separate, countable values, such as whole numbers.


 * For example, the number of students in a classroom is a discrete variable.

Continuous
Continuous variables can take on any value within a given range. Technically, the number of possible values is infinite, but since that is not very useful, we like to round. The important thing to remember is that these values have decimal places, and the decimal places can go on forever!


 * For example, the time it takes for Michael Vick to throw the ball. A possible answer is 3.5555559470932 seconds. Or never.

Independent and Dependent
The independent variable is the variable that you either (1) know, (2) manipulate, or (3) believe affects another variable (or other variables).

The dependent variable is the variable that you either (1) don't know or (2) believe is affected by another variable (or other variables). The dependent variable depends on the independent variable.


 * For example, pretend only half of a class receives a study guide for a test. Then the test scores are compared between the half that received the study guide and the half that didn't to see which students did better. The independent variable is whether or not the student received the study guide (because this is what you know). The dependent variable is the test scores (because this is what you don't know).
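The study guide example can be laid out as data, with each record holding one value of the independent variable and one value of the dependent variable. The scores below are invented for illustration, and all of the names are hypothetical.

```python
from statistics import mean

# Made-up records: (received_study_guide, test_score).
# received_study_guide is the independent variable; test_score is the dependent one.
students = [
    (True, 85), (True, 90), (True, 78), (True, 92),
    (False, 70), (False, 82), (False, 75), (False, 69),
]

# Split the dependent variable by the level of the independent variable.
with_guide = [score for got_guide, score in students if got_guide]
without_guide = [score for got_guide, score in students if not got_guide]

mean_with = mean(with_guide)
mean_without = mean(without_guide)
difference = mean_with - mean_without

print(mean_with, mean_without, difference)
```

A difference between the two group averages in a tiny made-up data set like this one does not show that the study guide works. Deciding that takes inferential statistics.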

Types of Data
Data are the qualitative or quantitative values recorded for a variable or set of variables.

Qualitative Data
Qualitative data provides description without numbers. Think quality.


 * For example, color, friendliness, smell, etc.

Quantitative Data
Quantitative data uses numbers to provide a measurement. Think quantity.


 * For example, age, height, temperature, etc.

Scales of Measurement
Measurement is the process of assigning labels to observations. There are four scales of measurement.

Nominal
Formal Definition: Mutually exclusive and exhaustive categories differing in some qualitative aspect.

This scale of measurement does not have any numerical value. You can use numbers as labels, but you can't meaningfully compare them. Think about a phone number: is 867-5309 better than 111-1000? No, they're just identifiers. The measure doesn't have to be a number at all.


 * For example, if your data are colors then your options could be blue, green, red, and other. The 'other' option is necessary in order for the data to be exhaustive.

Ordinal
Also known as 'rank-ordering'.

This scale has the properties of a nominal scale plus the ability to rank order by magnitude. It does not account for the size of the differences between steps on the scale.


 * For example, if you are rating someone's cooking, the choices could be terrible, bad, ok, good, or excellent. Even though you know 'excellent' is better than 'good', the difference between 'excellent' and 'good' may not be the same as the difference between 'ok' and 'good'.
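An ordinal scale can be represented in Python by giving each category a position in an ordered list. This is a minimal sketch using the cooking categories from the example. The five ratings are made up, and the names `levels`, `rank`, and `ratings` are hypothetical.

```python
# Ordered categories from the cooking example, lowest to highest.
levels = ["terrible", "bad", "ok", "good", "excellent"]

# Map each label to its rank; only the order of these numbers is meaningful.
rank = {label: position for position, label in enumerate(levels)}

# Hypothetical ratings collected from five diners.
ratings = ["good", "terrible", "excellent", "ok", "good"]

# Ranking works: ratings can be sorted from worst to best.
sorted_ratings = sorted(ratings, key=rank.get)

# Comparison works too: 'excellent' ranks above 'good'.
excellent_beats_good = rank["excellent"] > rank["good"]

print(sorted_ratings)
print(excellent_beats_good)
```

The rank numbers support sorting and comparing, but subtracting them would be misleading, because the steps between categories are not known to be equal.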

Interval
This scale has all of the properties of an ordinal scale with the addition of equal differences between measures. This scale does not have a true zero, meaning a value of zero is just another point on the scale.


 * For example, the difference between 1 °F and 2 °F is the same as the difference between 50 °F and 51 °F. Also, a temperature of 0 °F does not mean there is no temperature; it just means it's really cold.

Ratio
This scale has all of the properties of the interval scale with the addition of an absolute zero. A value of zero on this scale means an absence of the quantity being measured.


 * For example, when you have $0 then you have no money :(
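The difference between interval and ratio scales shows up when you try to form ratios. The sketch below compares two temperatures on the Fahrenheit scale (interval) and on the Kelvin scale, which starts at absolute zero and is therefore a ratio scale. The temperatures 40 °F and 80 °F are arbitrary choices, and the function name `fahrenheit_to_kelvin` is hypothetical.

```python
def fahrenheit_to_kelvin(degrees_f):
    """Convert a Fahrenheit temperature to kelvins."""
    return (degrees_f - 32) * 5 / 9 + 273.15


# Interval property: equal differences mean the same thing anywhere on the scale.
same_step = (2 - 1) == (51 - 50)

# On the Fahrenheit scale, 80 looks like "twice" 40 ...
fahrenheit_ratio = 80 / 40

# ... but on the Kelvin scale, where zero is absolute zero,
# the same two temperatures differ by only about 8 percent.
kelvin_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(same_step, fahrenheit_ratio, round(kelvin_ratio, 2))
```

Because 0 °F is just another point on the scale, the Fahrenheit ratio of 2 has no physical meaning. Only the ratio taken on a scale with an absolute zero does.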