A frequency distribution is a table that shows how frequently each value of a variable occurs in a set of scores. It is an ordered array of data points (also called observations, measures, measurements, or scores) arranged from highest to lowest or vice versa. Before looking at frequency distributions, two important concepts in research methods need to be discussed: variables and cases.
- Variable: An attribute or property of a person, group, or object that can be measured; the resulting measures are capable of assuming different values.
- Case: An individual person, book, database, library, time period, and so on that can be observed. Cases are sometimes called units of analysis.
An attribute or characteristic of a case can be measured; that is, it can be observed systematically, though not necessarily quantified (e.g., eye colour and gender). It can also vary. Variables can be quantitative or qualitative. The difference between these two types of variables results from how the variables are measured, not from some supposed intrinsic characteristic of any variable. Remember that measurement means observation in some systematic way. For example, height can be measured in inches or by classifying people directly as small, medium, or large. Thus, height is defined by the researcher as, respectively, a quantitative or a qualitative variable.
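As a rough illustration of how the same attribute can be recorded either way, the sketch below stores a few heights in inches (quantitative) and then classifies them as small, medium, or large (qualitative). The cutoff values and the sample heights are hypothetical, chosen only for the illustration; they do not come from the text.

```python
# Hypothetical cutoffs in inches, chosen only for illustration.
def classify_height(inches, small_below=64, large_from=70):
    """Turn a quantitative height into a qualitative category."""
    if inches < small_below:
        return "small"
    if inches >= large_from:
        return "large"
    return "medium"

# Made-up observations: the same attribute measured quantitatively...
heights_in = [61, 66, 72]

# ...and then recorded qualitatively by the researcher's classification.
categories = [classify_height(h) for h in heights_in]
print(categories)
```

Whether height counts as quantitative or qualitative here depends only on which of the two recordings the researcher chooses to keep.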
Example: Start with some raw data, i.e., direct observations of the values that a variable assumed in a research study. A variable is often referred to as “x,” a printed, lower case, Roman letter, not as “χ,” which is a lower case, Greek, script letter. This example will look at a population, as defined by the researcher, consisting of 17 values of the variable. Here, x is defined as the number of monographs on Reserves for the faculty members in a department of classics in one semester. The values the variable assumes, in the order in which they were determined, are below. Remember that all of these values are measured in units of monographs:
10 5 5 0
9 0 10 5
7 0 9
23 7 5
7 0 1
Five columns in the frequency distribution will be used, where
- x= the number of monographs
- freq x= the frequency of x, how often this particular value appears in the data set
- cum freq x= the cumulative frequency of x, the number of observations in the data set at this particular value or lower
- rel freq x= the relative frequency of x, the proportion of observations that assume this value; this proportion is determined algebraically, by dividing the frequency of x (freq x) for each value by the number of observations (N). N is the number of observations in a population, while n is the number of observations in a sample.
- cum rel freq= the cumulative relative frequency of x, the proportion of scores at this value or lower; up to rounding, this value equals the cumulative frequency of x (cum freq x) divided by N.
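The five columns defined above can be computed with a short Python sketch. It uses the 17 raw values from the example, read row by row as printed (the order does not affect the result); the function name `frequency_distribution` and the choice to list x from lowest to highest are assumptions made for this illustration.

```python
from collections import Counter

# Raw data from the example: number of monographs, N = 17 observations.
raw = [10, 5, 5, 0, 9, 0, 10, 5, 7, 0, 9, 23, 7, 5, 7, 0, 1]

def frequency_distribution(values):
    """Return rows (x, freq, cum_freq, rel_freq, cum_rel_freq), lowest x first."""
    n = len(values)            # N, the number of observations
    counts = Counter(values)   # freq x for each distinct value
    rows = []
    cum = 0
    for x in sorted(counts):
        freq = counts[x]
        cum += freq            # observations at this value or lower
        rows.append((x, freq, cum, freq / n, cum / n))
    return rows

table = frequency_distribution(raw)
print(f"{'x':>3} {'freq':>5} {'cum freq':>9} {'rel freq':>9} {'cum rel freq':>13}")
for x, freq, cum, rel, cum_rel in table:
    print(f"{x:>3} {freq:>5} {cum:>9} {rel:>9.3f} {cum_rel:>13.3f}")
```

In this sketch the cumulative relative frequency is computed as cum freq x divided by N; summing relative frequencies that have already been rounded for display can differ slightly from that figure.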