Data Types

At the highest level, two kinds of data exist: quantitative and qualitative.

  • Quantitative data deals with numbers and things you can measure objectively: dimensions such as height, width, and length. Temperature and humidity. Prices. Area and volume.
  • Qualitative data deals with characteristics and descriptors that can’t be easily measured, but can be observed subjectively—such as smells, tastes, textures, attractiveness, and color.

Broadly speaking, when you measure something and give it a number value, you create quantitative data. When you classify or judge something, you create qualitative data. So far, so good. But this is just the highest level of data: there are also different types of quantitative and qualitative data.

Continuous Data and Discrete Data

Quantitative data, also called numeric data, comes in two types: continuous and discrete. As a general rule, counts are discrete and measurements are continuous.

  • Discrete data is a count that can’t be made more precise. Typically it involves integers. For instance, the number of children (or adults, or pets) in your family is discrete data, because you are counting whole, indivisible entities: you can’t have 2.5 kids, or 1.3 pets.
  • Continuous data, on the other hand, could be divided and reduced to finer and finer levels. For example, you can measure the height of your kids at progressively more precise scales—meters, centimeters, millimeters, and beyond—so height is continuous data.

If I tally the number of individual Jujubes in a box, that number is a piece of discrete data. If I use a scale to measure the weight of each Jujube, or the weight of the entire box, that’s continuous data.

Continuous data can be used in many different kinds of hypothesis tests. For example, to assess the accuracy of the weight printed on the Jujubes box, we could weigh 30 boxes and perform a 1-sample t-test that compares their mean weight with the printed value.
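As a rough illustration of that 1-sample t-test, here is a minimal sketch using only Python's standard library. The box weights and the printed weight are made-up numbers, and the function name `one_sample_t` is just a label chosen for this example.

```python
import math
import statistics


def one_sample_t(sample, target):
    """Return (t statistic, degrees of freedom) for H0: population mean == target."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean - target) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical box weights in grams; a real study would weigh about 30 boxes.
weights = [155.8, 156.4, 154.9, 155.2, 156.1, 155.5, 154.7, 155.9]
printed_weight = 156.0  # assumed value printed on the box

t_stat, df = one_sample_t(weights, printed_weight)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

To reach a conclusion, you would compare the absolute value of t with the critical value from a t table for those degrees of freedom, or let a statistics package compute the p-value (for example, `scipy.stats.ttest_1samp`).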

Some analyses use continuous and discrete quantitative data at the same time. For instance, we could perform a regression analysis to see if the weight of Jujube boxes (continuous data) is correlated with the number of Jujubes inside (discrete data).
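The regression idea can be sketched in a few lines. This is a simple least-squares fit written by hand, with invented counts and weights; `fit_line` is a hypothetical helper name, not part of any statistics package.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: Jujubes per box (discrete) and box weight in grams (continuous).
counts = [48, 50, 51, 53, 55]
weights = [150.1, 155.9, 158.8, 165.2, 171.0]

slope, intercept = fit_line(counts, weights)
print(f"weight = {intercept:.2f} + {slope:.2f} * count")
```

The slope estimates how many grams each additional Jujube adds to the box. A full regression analysis would also report how well the line fits and whether the slope is statistically significant.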

Binary Data, Nominal Data, and Ordinal Data

When you classify or categorize something, you create qualitative data, also called attribute data. There are three main kinds of qualitative data.

Binary data places things in one of two mutually exclusive categories: right/wrong, true/false, or accept/reject.

Occasionally, I’ll get a box of Jujubes that contains a couple of individual pieces that are either too hard or too dry. If I went through the box and classified each piece as “Good” or “Bad,” that would be binary data. I could use this kind of data to develop a statistical model to predict how frequently I can expect to get a bad Jujube.
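The simplest model for this kind of binary data is an estimated proportion. The sketch below computes the share of "Bad" pieces and its standard error from a made-up set of labels; `bad_rate` is a hypothetical helper, and the standard error uses the usual formula for a sample proportion.

```python
import math


def bad_rate(labels):
    """Return (proportion of 'Bad' labels, standard error of that proportion)."""
    n = len(labels)
    bad = sum(1 for label in labels if label == "Bad")
    p = bad / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a sample proportion
    return p, se


# Hypothetical classification of 20 pieces from one box.
labels = ["Good"] * 18 + ["Bad"] * 2

p, se = bad_rate(labels)
print(f"estimated bad rate: {p:.2f} (standard error {se:.3f})")
```

With so few pieces the estimate is rough. Classifying more boxes would shrink the standard error.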

When collecting unordered or nominal data, we assign individual items to named categories that do not have an implicit or natural value or rank. If I went through a box of Jujubes and recorded the color of each in my worksheet, that would be nominal data.

This kind of data can be used in many different ways—for instance, I could use chi-square analysis to see if there are statistically significant differences in the amounts of each color in a box.
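A chi-square goodness-of-fit statistic for the color counts can be computed by hand, as in this sketch. It assumes the null hypothesis that every color is equally likely, which is an assumption made for this example rather than anything the manufacturer states, and the colors listed are invented.

```python
from collections import Counter


def chi_square_uniform(observations):
    """Chi-square statistic and degrees of freedom against equal expected counts.

    Only categories that appear at least once are counted, so a color with
    zero pieces in the box would need to be added to the tally by hand.
    """
    counts = Counter(observations)
    expected = len(observations) / len(counts)
    stat = sum((obs - expected) ** 2 / expected for obs in counts.values())
    return stat, len(counts) - 1


# Hypothetical colors recorded from one box.
colors = ["red"] * 14 + ["green"] * 9 + ["yellow"] * 11 + ["orange"] * 6 + ["purple"] * 10

stat, df = chi_square_uniform(colors)
print(f"chi-square = {stat:.2f} with {df} degrees of freedom")
```

The statistic is then compared with the chi-square critical value for those degrees of freedom (9.488 at the 0.05 level when there are 4 degrees of freedom).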

We can also have ordered or ordinal data, in which items are assigned to categories that do have some kind of implicit or natural order, such as “Short, Medium, or Tall.” Another example is a survey question that asks us to rate an item on a 1 to 10 scale, with 10 being the best. This implies that 10 is better than 9, which is better than 8, and so on.

The uses for ordered data are a matter of some debate among statisticians. Everyone agrees it’s appropriate for creating bar charts, but beyond that the answer to the question “What should I do with my ordinal data?” is “It depends.”

Scale: Nominal
  • Definition: Only the presence/absence of an attribute. It can only count items. Data consists of names or categories only. No ordering scheme is possible. It has central location at mode and only information for dispersion.
  • Example: go/no-go, success/fail, accept/reject
  • Statistics: percent, proportion, chi-square tests

Scale: Ordinal
  • Definition: Data is arranged in some order but differences between values cannot be determined or are meaningless. It can say that one item has more or less of an attribute than another item. It can order a set of items. It has central location at median and percentages for dispersion.
  • Example: taste, attractiveness
  • Statistics: rank-order correlation, sign or run test

Scale: Interval
  • Definition: Data is arranged in order and differences can be found. However, there is no inherent starting point and ratios are meaningless. The difference between any two successive points is equal; often treated as a ratio scale even if assumption of equal intervals is incorrect. It can add, subtract and order objects. It has central location at arithmetic mean and standard deviation for dispersion.
  • Example: calendar time, temperature
  • Statistics: correlations, t-tests, F-tests, multiple regression

Scale: Ratio
  • Definition: An extension of the interval level that includes an inherent zero starting point. Both differences and ratios are meaningful. True zero point indicates absence of an attribute. It can add, subtract, multiply and divide. It has central location at geometric mean and percent variation for dispersion.
  • Example: elapsed time, distance, weight
  • Statistics: t-test, F-test, correlations, multiple regression
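The measures of central location listed for the four scales can be tried out with Python's `statistics` module. The sketch below uses small made-up samples; `SIZE_ORDER` and `ordinal_median` are names invented for this example, and the ordinal median is found by ranking the categories in their stated order.

```python
import statistics

SIZE_ORDER = ["Short", "Medium", "Tall"]  # hypothetical ordinal categories, lowest first


def ordinal_median(values, order):
    """Median category of ordinal data, using positions in `order` as ranks."""
    ranks = [order.index(v) for v in values]
    return order[statistics.median_low(ranks)]


colors = ["red", "green", "red", "yellow"]                # nominal: use the mode
heights = ["Tall", "Short", "Medium", "Medium", "Tall"]   # ordinal: use the median
temps_c = [20.5, 22.0, 19.5]                              # interval: arithmetic mean
weights_g = [2.0, 8.0, 4.0]                               # ratio: geometric mean is meaningful

print(statistics.mode(colors))
print(ordinal_median(heights, SIZE_ORDER))
print(statistics.mean(temps_c))
print(statistics.geometric_mean(weights_g))
```

`median_low` is used so that the result is always one of the actual categories rather than a value halfway between two ranks.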

Converting Data Types

Continuous data tends to be more precise because it can be recorded to as many decimal places as the measuring instrument allows, but it sometimes needs to be converted into discrete data. Because continuous data contains more information than discrete data, some of that information is lost in the conversion.

Discrete data cannot be converted back to continuous data: once an item has been recorded only as a count or category, the measurement of how much it deviates from a standard cannot be recovered. Users may still choose to collect or keep discrete data because it is easier to use. Converting variable (continuous) data to attribute (discrete) data may allow a quicker assessment, but the risk is that information will be lost when the conversion is made.
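A short sketch makes the information loss concrete. It converts continuous weights into accept/reject labels against specification limits; the weights, the limits, and the helper name `to_attribute` are all hypothetical.

```python
def to_attribute(weights, lower, upper):
    """Classify each weight as 'accept' if it lies within [lower, upper], else 'reject'."""
    return ["accept" if lower <= w <= upper else "reject" for w in weights]


# Hypothetical box weights in grams and assumed specification limits.
weights = [153.2, 157.9, 160.4]
labels = to_attribute(weights, 153.0, 158.0)
print(labels)  # ['accept', 'accept', 'reject']
```

The first two boxes get the same label even though one sits near the lower limit and the other near the upper limit. Given only the labels, there is no way to recover the original weights.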
