# Design, Conduct and Analysis

## Design, Conduct and Analysis: Division B (Factorial designs)

Factorial designs should be used to assess the joint effects of variables.

### Introduction

i.      There is an old scientific “rule of thumb” recommending investigation of one factor at a time. Consider the following quote from Byte magazine (Halfhill, 1995) which summarises that view as well as any:

…. In any scientific investigation, the goal is to control all variables except the one you are testing. If you compare two microprocessors that implement the same general architecture on the same process technology and both chips are running the same benchmark program at the same clock speed, any difference in performance must be due to the relative efficiencies of their microarchitectures.

ii.      The problem with this prescription is that in most studies there are many factors of interest, not just one. The challenge is then to develop a design that allows assessment of all of the factors with minimal additional cost.

### Rule of Thumb

i.      The effects of two or more factors can be assessed simultaneously by means of factorial designs in which treatment combinations are applied to the observational units. A benefit of such a design is that there may be decreased cost.

### Illustration

i.      Suppose that a comparative study has been designed to study the response of asthmatic subjects to two factors: two pollutant exposures (ozone and air) and two exercise levels (active and rest). Assume that it has been decided to test n subjects at each level of the two factors. Following the advice given in the introduction, the design would involve two separate studies: one study to look at the effect of pollutant and the other to look at the effect of exercise. This would take 4n subjects. An alternative approach is the factorial design, which assigns n/2 subjects to each of the four treatment combinations. This design requires only 2n subjects, as compared with the 4n subjects in the two separate single-factor studies. Comparison of the means in the bottom row of the table reflects the effect of pollutant; a similar comparison of the column margin reflects the effect of exercise. Hence, this experiment contains the same information as the two independent experiments. In addition, a comparison of the means within the table, the cell means, allows examination of the question of whether the effect of pollutant is the same during exercise and during rest.
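The subject counts in this illustration can be checked with a short sketch. The function name `subjects_required` and the requirement that n be even are assumptions made here for illustration; they are not part of the original text.

```python
from itertools import product

POLLUTANTS = ("ozone", "air")
EXERCISE = ("active", "rest")


def subjects_required(n):
    """Compare two single-factor studies with one 2x2 factorial design.

    n is the number of subjects wanted at each level of each factor.
    Returns (separate_total, factorial_total, allocation).
    """
    if n % 2:
        raise ValueError("n must be even so that n/2 subjects fit in each cell")
    # Two separate studies: n subjects at each of the two levels, per study.
    separate_total = len(POLLUTANTS) * n + len(EXERCISE) * n
    # Factorial design: n/2 subjects in each of the four treatment combinations.
    allocation = {cell: n // 2 for cell in product(POLLUTANTS, EXERCISE)}
    factorial_total = sum(allocation.values())
    return separate_total, factorial_total, allocation


separate_total, factorial_total, allocation = subjects_required(20)
print(separate_total, factorial_total)  # 80 40
```

Summing the allocation over either margin still gives n subjects per factor level, which is why the marginal comparisons carry the same information as the separate studies.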

### Basis of the Rule

i.      The variance of the differences (comparing active and rest, or comparing ozone and air) is based on the same number of subjects as the single factor study. The precision is virtually the same as that for the two independent studies; but with half the number of subjects.
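A brief derivation may make this concrete. It assumes independent observations with a common variance, written here as $\sigma^2$ (a symbol introduced for this sketch, not used in the original). In the single-factor study each mean is based on n subjects, so

$$
\operatorname{Var}\left(\bar{y}_{\text{ozone}} - \bar{y}_{\text{air}}\right) = \frac{\sigma^2}{n} + \frac{\sigma^2}{n} = \frac{2\sigma^2}{n}.
$$

In the factorial design each marginal mean pools two cells of n/2 subjects, which is again n subjects per level, so the variance of the marginal difference is the same:

$$
\operatorname{Var}\left(\bar{y}_{\text{ozone}} - \bar{y}_{\text{air}}\right) = \frac{\sigma^2}{n/2 + n/2} + \frac{\sigma^2}{n/2 + n/2} = \frac{2\sigma^2}{n}.
$$

The same argument applies to the comparison of active and rest.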

### Discussions and Extensions

i.      It is appropriate to contrast the above quote with another one by Fisher (1935):

No aphorism is more frequently repeated in connection with field trials, than that we must ask Nature few questions or, ideally, one question at a time. The writer is convinced that this view is wholly mistaken. Nature, he suggests, will best respond to a logical and carefully thought out questionnaire; indeed if we ask her a single question, she will often refuse to answer until some other topic has been discussed.

ii.      What did Fisher mean? Essentially, that researchers are often interested in relationships between factors, or interactions. For example, it could be that the effect of ozone exposure can only be seen when exercising, which is a two-factor interaction. Comparison of the means in the margins, as indicated, reflects the effects of the factors separately. These are called the main effects. Thus, the factorial design provides not only the information of the two separate studies but also the potentially more important information about the relationship between the two factors.
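As a rough sketch of how main effects and the interaction are read off the four cell means, consider the following. The function name `factorial_effects` and the cell means are hypothetical, chosen so that ozone has an effect only during exercise; they are not data from any study.

```python
def factorial_effects(cell_means):
    """Main effects and interaction contrast for a 2x2 design.

    cell_means maps (pollutant, exercise) pairs to the mean response.
    """
    oa = cell_means[("ozone", "active")]
    orr = cell_means[("ozone", "rest")]
    aa = cell_means[("air", "active")]
    ar = cell_means[("air", "rest")]
    return {
        # Difference between the two pollutant marginal means.
        "pollutant": (oa + orr) / 2 - (aa + ar) / 2,
        # Difference between the two exercise marginal means.
        "exercise": (oa + aa) / 2 - (orr + ar) / 2,
        # Ozone effect while active minus ozone effect at rest.
        "interaction": (oa - aa) - (orr - ar),
    }


# Hypothetical cell means (illustrative values only).
means = {
    ("ozone", "active"): 10.0,
    ("ozone", "rest"): 4.0,
    ("air", "active"): 4.0,
    ("air", "rest"): 4.0,
}
print(factorial_effects(means))
# {'pollutant': 3.0, 'exercise': 3.0, 'interaction': 6.0}
```

With these made-up values both main effects are nonzero, yet the nonzero interaction contrast shows that the ozone effect appears only in the active condition, which the two separate single-factor studies could not reveal.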

iii.      It is useful to consider the error terms for the single factor study with n subjects per treatment, and the factorial design. In the single factor study the error term has 2n-2 degrees of freedom since each treatment has n subjects with n-1 degrees of freedom per treatment level. In the factorial experiment each cell contributes n/2-1 degrees of freedom to the error term for a total of 2n-4 degrees of freedom. So two degrees have been lost. Where did they go? One of them went to the comparison of the other factor, and the other is used in estimating the interaction.
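The degrees-of-freedom bookkeeping above can be tallied in a few lines. This is only a sketch; the function names are assumptions, and n is taken to be even so that n/2 is a whole number.

```python
def error_df_single_factor(n):
    """Two treatment levels with n subjects each, n - 1 df per level."""
    return 2 * (n - 1)


def error_df_factorial(n):
    """Four cells with n/2 subjects each, n/2 - 1 df per cell."""
    return 4 * (n // 2 - 1)


def factorial_df_breakdown(n):
    """Split the 2n - 1 total df of the 2x2 factorial (2n subjects)."""
    return {
        "pollutant": 1,
        "exercise": 1,
        "interaction": 1,
        "error": error_df_factorial(n),
    }


n = 20
print(error_df_single_factor(n), error_df_factorial(n))  # 38 36
print(sum(factorial_df_breakdown(n).values()) == 2 * n - 1)  # True
```

Both designs spend one degree of freedom on the factor being compared; the factorial spends two more, one on the other factor and one on the interaction, which accounts for the difference between 2n-2 and 2n-4.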

iv.      Given this rule, why are not all studies factorial? If there are many factors at many levels, it will not be possible to carry out the full factorial analysis. For example, if there were four factors, each at three levels, then 3^4 = 81 observational units are needed for just one run of the study. (But see the following rule of thumb for a possible solution.) Note also that the precision of the estimate of the interaction effect must be high in order to be able to detect it in a factorial design.
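The growth in the number of treatment combinations is easy to see by enumerating them. This is a minimal sketch; `full_factorial` is a helper name introduced here, and levels are simply labelled 0, 1, 2.

```python
from itertools import product


def full_factorial(levels_per_factor):
    """List every treatment combination for the given numbers of levels."""
    return list(product(*(range(k) for k in levels_per_factor)))


# Four factors, each at three levels.
runs = full_factorial([3, 3, 3, 3])
print(len(runs))  # 81
```

Each additional three-level factor multiplies the count by three, so a single replicate quickly becomes impractical as factors are added.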

v.      The three strategies of randomisation, blocking and factorial design are the basic elements of experimental designs. Their creative and judicious use will lead to efficient, robust and elegant investigations.