Solution 4.13 - Chemometrics: Data Analysis for the Laboratory and Chemical Plant
Education Article
- Published: Jan 1, 2000
- Channels: Chemometrics & Informatics
1. The eigenvalues are as follows.
PC | Eigenvalue
1 | 363256.56
2 | 1642.37
3 | 735.51
4 | 152.78
5 | 42.05
6 | 21.86
The eigenvalues sum to 365851.1, which equals the sum of squares of the original data.
2. The graph of eigenvalue against component number is as follows.
This graph is not very informative because the data are dominated by the first eigenvalue, which primarily represents size. A plot of the logarithm of the eigenvalue against component number shows the trend much more clearly.
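As a quick check of steps 1 and 2, the sketch below sums the eigenvalues listed in step 1, computes the share taken by the first one, and takes base-10 logarithms, which is what a logarithmic plot displays. The variable names are illustrative, and only the Python standard library is used.

```python
import math

# Eigenvalues as listed in step 1
eigenvalues = [363256.56, 1642.37, 735.51, 152.78, 42.05, 21.86]

total = sum(eigenvalues)               # total sum of squares of the data
share_first = eigenvalues[0] / total   # fraction carried by PC 1
log_eigenvalues = [math.log10(g) for g in eigenvalues]

print(f"sum of eigenvalues: {total:.2f}")
print(f"share of PC 1: {100 * share_first:.2f}%")
for pc, (g, lg) in enumerate(zip(eigenvalues, log_eigenvalues), start=1):
    print(f"PC {pc}: eigenvalue = {g:10.2f}, log10 = {lg:.3f}")
```

On the logarithmic scale the six values span roughly four orders of magnitude, so the later components are no longer squashed against the axis.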
3. This task is performed in several steps.
The loadings when sample 1 is removed are as follows.
The predicted scores for sample 1, using the 5 PC model built from samples 2 to 6, are as follows.
The results of predicting the measurements on sample 1 with an increasing number of PCs are as follows.
This should be compared to the original data.
4. The sums of squared errors for each object and for varying numbers of PCs are summarised below, together with the overall sums.
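The leave-one-out procedure of steps 3 and 4 can be sketched as follows. This is a minimal illustration and not the original calculation: the data matrix `X_demo` is hypothetical (the original measurements are not reproduced here), the function name `loo_sample_errors` is invented for this example, NumPy's SVD stands in for whichever PCA algorithm was actually used, and no mean-centring is applied, on the assumption that the data were analysed uncentred since the eigenvalues sum to the raw sum of squares.

```python
import numpy as np

def loo_sample_errors(X, max_pcs):
    """Leave-one-out squared prediction error per sample and number of PCs.

    No mean-centring is applied (assumption, see the lead-in).
    Returns an array of shape (n_samples, max_pcs).
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    errors = np.zeros((n_samples, max_pcs))
    for i in range(n_samples):
        x_out = X[i]                       # the sample left out
        X_train = np.delete(X, i, axis=0)  # the remaining samples
        # Rows of Vt are the loadings of the reduced model
        _, _, Vt = np.linalg.svd(X_train, full_matrices=False)
        for a in range(1, max_pcs + 1):
            P = Vt[:a]             # loadings, shape (a, n_variables)
            t_hat = x_out @ P.T    # predicted scores of the left-out sample
            x_hat = t_hat @ P      # predicted measurements
            errors[i, a - 1] = np.sum((x_out - x_hat) ** 2)
    return errors

# Hypothetical data: 6 samples, 8 variables, dominated by a size effect
rng = np.random.default_rng(0)
size = rng.uniform(50, 150, size=(6, 1))
profile = rng.uniform(0.5, 2.0, size=(1, 8))
X_demo = size * profile + rng.normal(scale=1.0, size=(6, 8))

errors = loo_sample_errors(X_demo, max_pcs=5)
press = errors.sum(axis=0)   # overall sum for each number of PCs

print(np.round(errors, 2))   # one row per left-out sample
print(np.round(press, 2))    # PRESS for 1 to 5 PCs
```

Each row of `errors` corresponds to one left-out object and each column to a number of PCs; summing down the columns gives the overall PRESS values.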
5. The autopredictive errors are simply the difference between the overall sum of squares and the sum of the eigenvalues of the PCs included in the model, and are given as follows.
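Because the autopredictive error depends only on the eigenvalues, it can be reproduced from the values listed in step 1. The sketch below subtracts the running sum of eigenvalues from their total; the variable names are illustrative.

```python
# Eigenvalues as listed in step 1
eigenvalues = [363256.56, 1642.37, 735.51, 152.78, 42.05, 21.86]

total_ss = sum(eigenvalues)   # overall sum of squares

rss = []         # rss[k] is the residual sum of squares for k + 1 PCs
explained = 0.0
for g in eigenvalues:
    explained += g
    rss.append(total_ss - explained)

for n_pcs, value in enumerate(rss, start=1):
    print(f"{n_pcs} PCs: RSS = {value:.2f}")
```

With all six components included nothing is left unexplained, so the final RSS is zero.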
The RSS and PRESS values are given below.
There are certainly no more than 4 PCs, as the PRESS value for 4 PCs comfortably exceeds the RSS for 3 PCs. Because this ratio comfortably exceeds 1 when using 4 PCs, the fourth PC appears to model noise, which suggests that 3 PCs are adequate for the model. Note that there is a variety of approaches to cross-validation, and different methods may reach slightly different conclusions.
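The decision rule can be written as a small function. This is a sketch under stated assumptions: the ratio is taken to be PRESS for a given number of PCs divided by RSS for one PC fewer, which matches the comparison made above (PRESS for 4 PCs against RSS for 3 PCs). The function name `first_insignificant_pc` is invented for this example, and the numbers in `press_demo` and `rss_demo` are hypothetical, not the values from this problem.

```python
def first_insignificant_pc(press, rss):
    """Return the smallest a >= 2 with PRESS_a / RSS_(a-1) > 1, or None.

    press[k] and rss[k] hold the values for k + 1 components.
    """
    for a in range(2, len(press) + 1):
        previous_rss = rss[a - 2]
        if previous_rss == 0:
            continue   # nothing left to explain, ratio undefined
        if press[a - 1] / previous_rss > 1:
            return a
    return None

# Hypothetical values for 1 to 5 PCs (illustration only)
press_demo = [120.0, 40.0, 9.0, 7.5, 7.0]
rss_demo = [100.0, 30.0, 6.0, 4.0, 2.0]

a = first_insignificant_pc(press_demo, rss_demo)
if a is None:
    print("no component fails the ratio test")
else:
    print(f"PC {a} fails the ratio test, so retain {a - 1} PCs")
```

With these illustrative numbers the ratio first exceeds 1 at the fourth component (7.5 / 6.0 = 1.25), so three PCs would be retained.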