# Solution 2.2 - Chemometrics: Data Analysis for the Laboratory and Chemical Plant

## Education Article

• Published: Jan 1, 2000
• Channels: Chemometrics & Informatics

1. These values are given by

 x0 x1 x2 x3 x4 x5 x1x2 x1x3 x1x4 x1x5 x2x3 x2x4 x2x5 x3x4 x3x5 x4x5 1 -1 -1 -1 -1 1 1 1 1 -1 1 1 -1 1 -1 -1 1 1 -1 -1 1 -1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 1 1 -1 -1 1 1 -1 -1 1 -1 -1 1 1 -1 -1 1 -1 -1 1 1 1 1 -1 -1 -1 -1 -1 -1 1 1 1 1 1 -1 1 -1 -1 -1 1 -1 -1 -1 1 1 -1 -1 1 1 -1 1 1 -1 -1 -1 -1 1 1 1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
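The interaction columns are simply element-by-element products of the main-factor columns, so the table can be regenerated from the five factor columns alone. The following is a minimal sketch of that calculation; the names `factors` and `build_columns` are illustrative choices, not part of the original text.

```python
from itertools import combinations

# Coded levels of the five factors in the eight experiments,
# copied from the x1..x5 columns of the table above.
factors = {
    "x1": [-1, 1, -1, 1, -1, 1, -1, 1],
    "x2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "x3": [-1, -1, -1, -1, 1, 1, 1, 1],
    "x4": [-1, 1, 1, -1, 1, -1, -1, 1],
    "x5": [1, -1, -1, 1, 1, -1, -1, 1],
}


def build_columns(factors):
    """Return the intercept, main-factor and two-factor interaction columns."""
    n = len(next(iter(factors.values())))
    columns = {"x0": [1] * n}  # intercept column of ones
    columns.update(factors)
    # Each interaction column is the product of two factor columns.
    for a, b in combinations(factors, 2):
        columns[a + b] = [p * q for p, q in zip(factors[a], factors[b])]
    return columns


columns = build_columns(factors)
for name, col in columns.items():
    print(f"{name:>5}", col)
```

Running this prints 16 columns in the same order as the table: the intercept, the five factors and the ten two-factor interactions.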

2. The eight factors

x0, x1, x2, x3, x4, x5, x1x3 and x1x4 are all different.

Confounding is as follows:

1 = 25, 2 = 15, 3 = 45, 4 = 35, 5 = 12 = 34, 13 = 24 and 14 = 23

using the notation of Table 2.24 in the printed text. This can easily be seen by checking the columns in the answer to question 1: for example, the column for x1 is identical to that for x2x5, hence the relationship 1 = 25, and so on.
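The column comparison can also be done mechanically by grouping together every term whose column is identical. The sketch below assumes the factor levels from the answer to question 1 and labels terms by digits (for example "25" for x2x5); the variable names are illustrative.

```python
from itertools import combinations

# Coded levels of factors 1..5, copied from the answer to question 1.
factors = {
    "1": [-1, 1, -1, 1, -1, 1, -1, 1],
    "2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "3": [-1, -1, -1, -1, 1, 1, 1, 1],
    "4": [-1, 1, 1, -1, 1, -1, -1, 1],
    "5": [1, -1, -1, 1, 1, -1, -1, 1],
}

# Main-factor columns plus all two-factor interaction columns.
columns = dict(factors)
for a, b in combinations(factors, 2):
    columns[a + b] = [p * q for p, q in zip(factors[a], factors[b])]

# Terms with identical columns are confounded: group labels by column content.
groups = {}
for label, col in columns.items():
    groups.setdefault(tuple(col), []).append(label)

confounded = [" = ".join(labels) for labels in groups.values()]
print(confounded)
# ['1 = 25', '2 = 15', '3 = 45', '4 = 35', '5 = 12 = 34', '13 = 24', '14 = 23']
```

The seven groups, together with the intercept x0, account for the eight distinct columns of the design.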

3. The design matrix, D, is given by

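| x0 | x1 | x2 | x3 | x4 | x5 |
|---|---|---|---|---|---|
| 1 | -1 | -1 | -1 | -1 | 1 |
| 1 | 1 | -1 | -1 | 1 | -1 |
| 1 | -1 | 1 | -1 | 1 | -1 |
| 1 | 1 | 1 | -1 | -1 | 1 |
| 1 | -1 | -1 | 1 | 1 | 1 |
| 1 | 1 | -1 | 1 | -1 | -1 |
| 1 | -1 | 1 | 1 | -1 | -1 |
| 1 | 1 | 1 | 1 | 1 | 1 |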

4. Hence

$b = (D'D)^{-1} D' y$

giving an equation of

$y = 90.5 + 18.75x_1 + 27.75x_2 + 5.0x_3 - 26.0x_4 + 31.0x_5$
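The coefficients can be reproduced with a few lines of code. This is a sketch rather than a general least-squares routine: it relies on the columns of D being mutually orthogonal, so that D'D equals 8 times the identity matrix and the pseudoinverse formula reduces to D'y/8. The code checks that assumption before using it. The response values are taken from the "True response" figures in the answer to question 5.

```python
# Design matrix D: columns x0, x1, ..., x5 for the eight experiments.
D = [
    [1, -1, -1, -1, -1, 1],
    [1, 1, -1, -1, 1, -1],
    [1, -1, 1, -1, 1, -1],
    [1, 1, 1, -1, -1, 1],
    [1, -1, -1, 1, 1, 1],
    [1, 1, -1, 1, -1, -1],
    [1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, 1, 1],
]
# Measured responses, copied from the answer to question 5.
y = [109, 26, 31, 176, 41, 75, 106, 160]

n = len(D)
columns = list(zip(*D))  # transpose: one tuple per column of D

# D'D: inner products of every pair of columns.
DtD = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]

# Assumption checked here: the design is orthogonal, i.e. D'D = n * I.
is_orthogonal = all(
    DtD[i][j] == (n if i == j else 0)
    for i in range(len(columns))
    for j in range(len(columns))
)
if not is_orthogonal:
    raise ValueError("shortcut b = D'y / n is only valid for an orthogonal design")

# With D'D = n * I, (D'D)^-1 D'y reduces to D'y / n.
b = [sum(d * v for d, v in zip(col, y)) / n for col in columns]
print(b)  # [90.5, 18.75, 27.75, 5.0, -26.0, 31.0]
```

For a design that is not orthogonal, the full matrix inverse (or a library least-squares solver) would be needed instead of this shortcut.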

The significance can be assessed simply by the size of the coefficients, since all factors vary over the same range in the coded data. Note that the overall values of NO vary between 26 and 176 mg MJ⁻¹, a range of 150 mg MJ⁻¹. Hence, for example, the first factor (the load) on average accounts for 2 × 18.75/150 (= 2 × coefficient/range) of the variability between the highest and lowest levels, or 25% of the variability (the factor of 2 arises because the difference between the coded levels is 2 and not 1).
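The same arithmetic can be applied to every factor. The short sketch below uses the coefficients from the fitted equation and the observed response range; taking the absolute value of each coefficient is an assumption made here so that negative effects are reported as a magnitude.

```python
# Coefficients of the fitted equation (intercept omitted).
coefficients = {"x1": 18.75, "x2": 27.75, "x3": 5.0, "x4": -26.0, "x5": 31.0}

# Observed range of the response: highest minus lowest NO value (mg per MJ).
response_range = 176 - 26

# Share of the range = 2 * |coefficient| / range, because the coded
# levels -1 and +1 are two units apart.
share = {
    name: 100 * 2 * abs(coef) / response_range
    for name, coef in coefficients.items()
}

for name, pct in share.items():
    print(f"{name}: {pct:.1f}% of the response range")
```

These shares describe each factor separately and are not expected to sum to 100%.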

It would appear that NH3 has little practical effect and the air/fuel ratio some influence, while the other three factors are all of approximately similar and fairly high significance.

A t-test could also be used, but the errors are unlikely to be normally distributed, and the main aim is to give the experimenter guidance as to which factors are important to control and whether their influence is positive or negative.

5. The calculation is presented below

| True response | Predicted response | Residual |
|---|---|---|
| 109 | 96 | 13 |
| 26 | 19.5 | 6.5 |
| 31 | 37.5 | -6.5 |
| 176 | 189 | -13 |
| 41 | 54 | -13 |
| 75 | 81.5 | -6.5 |
| 106 | 99.5 | 6.5 |
| 160 | 147 | 13 |

| Quantity | Value |
|---|---|
| Sum of squares of residuals | 845 |
| Root mean square residual error | 20.55 |
| Average of raw data | 90.5 |
| Percentage root mean square error | 22.71 |

Note that there are only 2 degrees of freedom (8 experiments less 6 fitted coefficients) for determining the root mean square residual error, hence the value of 20.55 rather than the 10.28 that would be obtained by dividing by 8 rather than 2. The percentage error is 22.71%. Although it is possible to interpret this in a more detailed statistical manner, the nature of the data probably precludes this. It is likely to be sufficient to inform the experimenter that the predictions are accurate to within roughly 20%. A more detailed model would probably require a different experimental strategy.
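The residual calculation can be sketched as follows, assuming the design matrix from the answer to question 3 and the coefficients of the fitted equation from question 4. Variable names are illustrative.

```python
from math import sqrt

# Design matrix D (columns x0..x5) and fitted coefficients b.
D = [
    [1, -1, -1, -1, -1, 1],
    [1, 1, -1, -1, 1, -1],
    [1, -1, 1, -1, 1, -1],
    [1, 1, 1, -1, -1, 1],
    [1, -1, -1, 1, 1, 1],
    [1, 1, -1, 1, -1, -1],
    [1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, 1, 1],
]
b = [90.5, 18.75, 27.75, 5.0, -26.0, 31.0]
y = [109, 26, 31, 176, 41, 75, 106, 160]  # true responses

# Predicted response for each experiment: row of D times b.
predicted = [sum(coef * x for coef, x in zip(b, row)) for row in D]
residuals = [true - pred for true, pred in zip(y, predicted)]

sum_sq = sum(r * r for r in residuals)

# Degrees of freedom: experiments minus fitted coefficients (8 - 6 = 2).
dof = len(y) - len(b)
rms_error = sqrt(sum_sq / dof)

mean_response = sum(y) / len(y)
percent_error = 100 * rms_error / mean_response

print(predicted)
print(residuals)
print(sum_sq, dof, round(rms_error, 3), round(percent_error, 2))
```

Replacing `dof` with `len(y)` gives the smaller figure obtained when dividing by 8 instead of 2.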
