Demonstration Testing

Product testing, also called consumer testing or comparative testing, is a process of measuring the properties or performance of products. The theory is that, since the advent of mass production, manufacturers produce branded products which they assert and advertise to be identical within some technical standard.

Product testing seeks to ensure that consumers can understand what products will do for them and which products are the best value. Product testing is a strategy to increase consumer protection by checking the claims made during marketing strategies such as advertising, which by their nature are in the interest of the entity distributing the service and not necessarily in the interest of the consumer. The advent of product testing was the beginning of the modern consumer movement.

Product testing might be accomplished by a manufacturer, an independent laboratory, a government agency, etc. Often an existing formal test method is used as a basis for testing. Other times engineers develop methods of test which are suited to the specific purpose. Comparative testing subjects several replicate samples of similar products to identical test conditions.

Prior to acceptance into service, a system should be evaluated to check whether the Reliability exhibited under service conditions exceeds the minimum acceptable requirement. The system may be a new design, a modification of an existing system or an Off-The-Shelf product entering service. Similar evaluation may be required if an existing system is required to operate in a new way or in a new environment. In some cases previous field history will provide sufficient confidence for acceptance. In others, one or more Reliability Demonstrations will have to be conducted.

Reliability Demonstration Testing (RDT) is mainly concerned with measuring whether or not a specified requirement has been met, in terms which can be contractually binding. It may reveal new failure modes which require corrective action, especially if there have been inadequacies in earlier development tests. Development testing, growth and demonstration are related, and these activities must be planned together.

The primary purpose of a Reliability Demonstration Test (RDT) is to decide whether or not the Reliability of a system is good enough. The result of such a demonstration is therefore a decision to accept the item or reject it. The RDT can also provide a quantitative estimate of the true Reliability but this is not its primary purpose.

An RDT shows whether the achievement of given Reliability parameter values can be claimed with a given level of confidence. This definition is deliberately couched in terms of ‘claimed’ and ‘confidence’ because the parameters being addressed are statistical and cannot be measured exactly and repeatably in tests based on small samples. For example, if a system is tested for 500 hours and exhibits 5 failures, it can be claimed (assuming a constant failure rate) that:

  • the MTBF is at least 50 hours with a confidence of 0.93;
  • the MTBF is at least 100 hours with a confidence of 0.38; and
  • the MTBF is at least 150 hours with a confidence of 0.12.

All of these statements are correct from the results quoted. Note that the higher the claim, the lower the level of confidence.
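Figures of this kind can be checked with a short calculation. The sketch below assumes a constant failure rate and a time-terminated test, so that the number of failures in a fixed test time follows a Poisson distribution; the function names are illustrative and do not come from any standard.

```python
import math

def poisson_cdf(k, mean):
    """P(N <= k) for a Poisson-distributed count N with the given mean."""
    term = math.exp(-mean)          # P(N = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i            # P(N = i) obtained from P(N = i - 1)
        total += term
    return total

def mtbf_claim_confidence(test_hours, failures, claimed_mtbf):
    """Confidence that the true MTBF is at least claimed_mtbf.

    If the true MTBF were only claimed_mtbf, the expected number of failures
    would be test_hours / claimed_mtbf. The confidence is the probability
    that such a system would have shown more failures than were observed.
    """
    expected_failures = test_hours / claimed_mtbf
    return 1.0 - poisson_cdf(failures, expected_failures)

# 500 test hours with 5 failures, as in the example above.
for claim in (50, 100):
    print(claim, round(mtbf_claim_confidence(500, 5, claim), 2))
# prints:
# 50 0.93
# 100 0.38
```

The same function can be evaluated at any other claimed MTBF to see how quickly the confidence falls as the claim rises.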

Reliability Demonstrations are generally not conducted unless required by contract, where they form a contractual requirement for design acceptance. This is useful in emphasising the importance of the Reliability requirements. However, their effectiveness is often reduced by a lack of clarity in the specification and an unwillingness to incur the timescale and cost penalties of rework following a reject result.

RDT is commonly based on half the Reliability requirement. This situation has arisen out of the way in which Reliability requirements and RDT requirements have been specified on major projects. The producer’s risk is based on the specified Reliability and the consumer’s risk on a fraction, normally half, of that value. This can be regarded as heavily biased in favour of the producer. To address this and remove any bias, the customer should ensure that the specification and contract are clear, simple and harmonised.

RDT is an expensive activity and is not warranted unless it is clear what course of action will be taken in the event of a reject result. RDT may be associated with contract incentives and/or penalties, but great care is needed at the specification stage and in the contract to ensure that any penalties or re-work requirements can be enforced.

A demonstration test can be any form of test which is agreed between a Contracting Authority and a Contractor to give the necessary assurance that a particular requirement has been met. It is essential, however, that the demonstration requirements are unambiguous and agreed before the demonstration is started.


The basic principle of demonstration is that a ‘sample’ of items is tested under conditions which are considered to be representative of their operational use. Based on the results of such a test, a decision is taken on the acceptability of the ‘population’ of items which the sample represents, that is, future production items.

In any sampling test, there are risks to both the producer and the consumer that a wrong decision can be reached. The degree of risk will vary according to such factors as the sample size and test duration and must therefore be agreed and specified when planning demonstration tests.

Demonstration testing may be carried out under laboratory conditions or as field tests but, to be effective, it must:

  • use items which are declared and accepted to be representative of production items (and so be of fixed build standard throughout the test);
  • represent typical operational use of the test item as closely as practicable; and
  • provide sufficient test observations to produce results which are statistically significant or can be assessed in some other way.

If any of these conditions cannot be fulfilled, then demonstration is unlikely to warrant the costs involved.

Compliance test plans are illustrated by an operating characteristic (OC) curve, which is a graph showing the probability of demonstrating compliance given the true value of the product parameter. For a test with time as the continuous variable, the OC curve has the true value of the required parameter, such as failure rate (λ) or MTBF (m), on the x axis and the probability of demonstrating compliance, P(A), on the y axis. For fixed-trial test plans, the unit reliability R, or the probability of success for each trial, is on the x axis and the probability of demonstrating compliance is on the y axis.

MTBF/MTTF could be used as the compliance value. The accept point for this OC curve is defined by m0 and 1 – α, and the reject point is defined by m1 and β, where α is the producer’s risk and β is the consumer’s risk. Here m0 is the expected (acceptable) MTBF and m1 is the minimum acceptable (rejectable) MTBF. The ratio m0/m1 is the discrimination ratio.
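As an illustration, points on an OC curve for a time-based plan can be computed directly. This is a minimal sketch assuming a constant failure rate and a plan that accepts when no more than a fixed number of failures occur in a fixed cumulative test time; the plan values (1000 hours, 5 failures, MTBFs of 300 and 100 hours) are hypothetical and chosen only for the example.

```python
import math

def prob_accept(true_mtbf, total_test_time, max_failures):
    """P(A): probability of at most max_failures failures in total_test_time."""
    mean = total_test_time / true_mtbf   # expected number of failures
    term = math.exp(-mean)               # probability of zero failures
    total = term
    for k in range(1, max_failures + 1):
        term *= mean / k                 # probability of exactly k failures
        total += term
    return total

# Hypothetical plan: accept if no more than 5 failures occur in 1000 hours.
T, c = 1000.0, 5
m0, m1 = 300.0, 100.0                    # hypothetical accept and reject MTBFs

alpha = 1.0 - prob_accept(m0, T, c)      # risk of rejecting a system with MTBF m0
beta = prob_accept(m1, T, c)             # risk of accepting a system with MTBF m1
print(f"discrimination ratio = {m0 / m1:.1f}")
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")

# Points on the OC curve: true MTBF on the x axis, P(A) on the y axis.
for m in (50, 100, 200, 300, 500):
    print(m, round(prob_accept(m, T, c), 3))
```

Plotting P(A) against the true MTBF for a fine grid of values gives the full curve; the accept and reject points are simply two points read from it.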

Statistical Test Plans

Demonstrations of statistical reliability characteristics are based on statistical considerations. By assuming a particular distribution for the Reliability characteristic of interest, a statistical test plan can be formulated. This enables the accept/reject criteria for agreed values of decision risks to be pre-determined, and stated precisely before testing starts. The standard test plans are based on a negative exponential distribution of times to failure (constant failure rate).

There are two main types of statistical test plan which may be used for demonstration purposes:

Fixed time/failure terminated test plans – Testing is continued until a pre-determined test time has been exceeded (accept) or a pre-determined number of failures has occurred (reject). The advantages of this type of plan are:

  • Either the maximum test time or the maximum number of failures is fixed prior to testing. Therefore, in the former case, the maximum requirements for test equipment and manpower are fixed before testing begins; in the latter case, the maximum number of test items can be determined.
  • The maximum accumulated test time is shorter than for a truncated sequential test based on the same parameters.
  • A better absolute measure of the Reliability characteristic is obtained in most cases.

Its disadvantages are:

  • On average the number of failures and the accumulated test time will exceed those of a similar truncated sequential test.
  • Very good equipment or very bad equipment still has to undergo the maximum accumulated test time or number of failures to reach the agreed point for making a decision.

Both the required amount of test time and the allowable number of failures are determined by the accept and reject points on the OC curve. Once the two points are defined, the amount of cumulative test time T required and the allowable number of failures c are set. The cumulative test time T is the total of the operating time of all the units during the test, including the units that fail as well as the units that do not fail.
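One way to see how the two points fix T and c is to search for the smallest plan that satisfies both risks. The sketch below is an illustrative search under the constant failure rate assumption, not the procedure of any particular standard; the function names and the input values (MTBFs of 300 and 100 hours, both risks set to 0.10) are hypothetical.

```python
import math

def prob_accept(true_mtbf, total_test_time, max_failures):
    """Probability of at most max_failures failures in total_test_time."""
    mean = total_test_time / true_mtbf
    term = math.exp(-mean)
    total = term
    for k in range(1, max_failures + 1):
        term *= mean / k
        total += term
    return total

def fixed_time_plan(m0, m1, alpha, beta, max_c=100):
    """Return (c, T) for the smallest c that meets both risks.

    For each candidate c, T is found by bisection so that the probability of
    accepting a system with MTBF m1 equals beta. The plan is kept once the
    probability of accepting a system with MTBF m0 is at least 1 - alpha.
    """
    for c in range(max_c + 1):
        lo, hi = 0.0, m1
        while prob_accept(m1, hi, c) > beta:   # widen until the risk is bracketed
            hi *= 2.0
        for _ in range(100):                   # P(A) falls as test time rises
            mid = (lo + hi) / 2.0
            if prob_accept(m1, mid, c) > beta:
                lo = mid
            else:
                hi = mid
        if prob_accept(m0, hi, c) >= 1.0 - alpha:
            return c, hi
    raise ValueError("no plan found within max_c allowable failures")

c, T = fixed_time_plan(m0=300.0, m1=100.0, alpha=0.10, beta=0.10)
print(f"allow up to {c} failures in {T:.0f} hours of cumulative test time")
```

A smaller discrimination ratio or tighter risks pushes both c and T upward, which is why these values need to be agreed before the test is costed.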


Example: ten units are on test. The units are not replaced when they fail (non-replacement). One unit fails at t1 = 685 hours, and a second unit fails at t2 = 1690 hours. The test is ended at t = 2500 hours with no additional failures. What is the total accumulated test time?

T = 685 + 1690 + (8)(2500) = 22,375 hours
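The same bookkeeping can be written as a small helper, which is convenient when there are many units or failures. This is a minimal sketch for a non-replacement test; the function and parameter names are illustrative.

```python
def accumulated_test_time(failure_times, units_on_test, end_time):
    """Total operating hours for a non-replacement test ended at end_time.

    Failed units contribute their time to failure; surviving units each
    contribute the full test duration.
    """
    survivors = units_on_test - len(failure_times)
    return sum(failure_times) + survivors * end_time

print(accumulated_test_time([685, 1690], units_on_test=10, end_time=2500))
# prints 22375
```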

Sequential test plans – These are often truncated sequential test plans, in which both test time and failures are compared with established criteria to decide whether to accept the item, reject it or continue testing. Their advantage is that the average number of failures needed to reach a decision is a minimum.

Their disadvantages are:

  • Unless it is truncated, the test has no maximum accumulated test time or number of failures and could, in theory, continue forever.
  • The number of failures, and therefore the test item costs, will vary over a broader range than for a similar time/failure terminated test. This causes administrative problems in scheduling test items, test equipment and manpower.

To decide whether to continue testing, it is necessary to know, as you go along, the actual number of valid failures which have occurred. This may not be easy if the failure definition is hard to interpret or there is a lack of evidence on why a failure happened.
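The accept, reject or continue decision can be sketched with Wald's sequential probability ratio test for exponentially distributed times to failure. This is an untruncated form with Wald's approximate decision boundaries, offered as an assumption for illustration; published test plans may use adjusted boundaries and a truncation rule. The function name and the example values are hypothetical.

```python
import math

def sprt_decision(failures, total_test_time, m0, m1, alpha, beta):
    """Return 'accept', 'reject' or 'continue' for the current test state.

    Compares the log likelihood ratio of MTBF = m1 (rejectable) against
    MTBF = m0 (acceptable) with Wald's approximate boundaries.
    """
    log_ratio = (failures * math.log(m0 / m1)
                 - (1.0 / m1 - 1.0 / m0) * total_test_time)
    if log_ratio >= math.log((1.0 - beta) / alpha):
        return "reject"      # evidence favours the rejectable MTBF
    if log_ratio <= math.log(beta / (1.0 - alpha)):
        return "accept"      # evidence favours the acceptable MTBF
    return "continue"

# Hypothetical plan: m0 = 300 h, m1 = 100 h, both risks 0.10.
plan = dict(m0=300.0, m1=100.0, alpha=0.10, beta=0.10)
print(sprt_decision(0, 200.0, **plan))   # continue
print(sprt_decision(0, 400.0, **plan))   # accept
print(sprt_decision(3, 100.0, **plan))   # reject
```

Each call needs the current count of valid failures, which is why an unambiguous failure definition matters for this type of plan.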
