Multivariate numerical analysis of Raoulia subgenus Raoulia
Degree Grantor: University of Canterbury
Degree Name: Doctor of Philosophy
Through the sequential analysis of a data set comprising 86 OTUs of Raoulia subg. Raoulia measured on 98 characters, different multivariate numerical manipulations are performed, compared and assessed. The creation of additional characters derived from sampled characters (ratios) is investigated. The univariate distributional properties of these derived characters are seen to be mainly non-normal, although strategies for minimising this through the choice of numerator and denominator are advocated. Including a ratio and its two constituent characters in a data set will invariably lead to the multiple inclusion of the same information, beyond the level that might occur with normal inter-character correlations in a biological data set. Through the exploration of the multiple and bivariate correlation coefficients it is shown that simple calculations will reveal an optimum strategy for removing one of the three characters. Using three shape ratios independently as examples, and developing a discriminant function for the 13 a priori defined species in the data set based on the three characters (numerator, denominator and ratio), it is shown that the ratio per se has the superior correlation with the grouping variable in each instance. Using the multivariate moments of skewness and kurtosis, the multivariate normality of the taxon based on the 63 continuous characters was assessed. This normality was seen to depend primarily on the number of characters and on the status of the OTUs most distant from the centroid. The repercussions of removing the outlying OTUs, and the status of these six outliers, were further explored by cluster analysis. Using a reduced 39-character data set, the individual group normality of both portions of three successive dichotomies was assessed. It was seen from this that a reasonable range of Mahalanobis D² values about a centroid was essential for multivariate normality.
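The normality assessment described above can be illustrated with a minimal sketch. It assumes the multivariate moments are Mardia's skewness and kurtosis, which the abstract does not name, and it uses the maximum-likelihood covariance matrix; the function name `mardia_moments` and the simulated data are hypothetical and are not taken from the thesis.

```python
import numpy as np

def mardia_moments(X):
    """Return squared Mahalanobis distances from the centroid and
    Mardia's multivariate skewness and kurtosis (assumed definitions)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    centered = X - X.mean(axis=0)          # deviations from the centroid
    S = centered.T @ centered / n          # maximum-likelihood covariance
    S_inv = np.linalg.pinv(S)              # pseudo-inverse guards against singular S
    G = centered @ S_inv @ centered.T      # n x n matrix of cross-products
    d2 = np.diag(G).copy()                 # squared Mahalanobis distance of each OTU
    skewness = (G ** 3).sum() / n ** 2     # Mardia's b1,p
    kurtosis = (d2 ** 2).mean()            # Mardia's b2,p; about p(p+2) under normality
    return d2, skewness, kurtosis

# Hypothetical example: 500 simulated OTUs on 3 continuous characters.
rng = np.random.default_rng(0)
sample = rng.normal(size=(500, 3))
d2, b1, b2 = mardia_moments(sample)
most_distant = np.argsort(d2)[-6:]         # candidates for outlier inspection
```

OTUs with the largest distances dominate the kurtosis term, which is consistent with the abstract's remark that normality depends on the OTUs most distant from the centroid.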
It was further shown, via a pair-wise discriminant analysis on each of the three partitions, that OTUs disturbing the multivariate normality of a single group are not necessarily those that are misclassified within the prescribed groups. The inter-group associations revealed by the canonical variate plots and the jackknife Mahalanobis D² values indicated the possibility of amalgamating a number of the species groups. The detection of outlying OTUs on the basis of large relative minimum jackknife Mahalanobis D² values was compared with the detection via the earlier single-group analyses. This showed that even apparently extreme outliers were providing appreciable stability to their hypothesised groups, and that their removal could be an extreme course of action. In order to reassess the above changes using the entire data set and to reach a conclusive grouping strategy, a new method was proposed as being appropriate to such circumstances. This method allows the independent summation of the univariate χ² values, and of F ratios approximated by χ² values, for each character under a given grouping strategy. These values were recomputed for an alternative strategy and the difference between the sums was compared to the χ² distribution, with the change in the degrees of freedom as the degrees of freedom. Given the 'final' groups defined by the previous analysis, characters were extracted to form a diagnostic hierarchy of dichotomous subdivisions.
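The comparison of grouping strategies by summed statistics might be sketched as follows. This is an illustration under stated assumptions, not the thesis's own procedure: it assumes that each continuous character's one-way ANOVA F ratio is converted to an approximate χ² value as (k - 1)·F on k - 1 degrees of freedom, that each discrete character contributes a Pearson χ² from its group-by-state table, and that characters are treated as independent. The names `summed_chi2` and `compare_groupings` are hypothetical.

```python
import numpy as np
from scipy import stats

def summed_chi2(X_cont, X_disc, groups):
    """Sum per-character statistics and degrees of freedom for one grouping."""
    groups = np.asarray(groups)
    levels = np.unique(groups)
    k = len(levels)
    total, df = 0.0, 0
    for col in np.asarray(X_cont, dtype=float).T:
        F, _ = stats.f_oneway(*[col[groups == g] for g in levels])
        total += (k - 1) * F               # assumed chi-square approximation of F
        df += k - 1
    for col in np.asarray(X_disc).T:
        states = np.unique(col)
        table = np.array([[np.sum((groups == g) & (col == s)) for s in states]
                          for g in levels])
        chi2, _, dof, _ = stats.chi2_contingency(table, correction=False)
        total += chi2
        df += dof
    return total, df

def compare_groupings(X_cont, X_disc, coarse, fine):
    """Refer the difference in summed statistics to a chi-square distribution."""
    t_fine, df_fine = summed_chi2(X_cont, X_disc, fine)
    t_coarse, df_coarse = summed_chi2(X_cont, X_disc, coarse)
    diff, ddf = t_fine - t_coarse, df_fine - df_coarse   # ddf must be positive
    return diff, ddf, stats.chi2.sf(diff, ddf)

# Hypothetical data: 40 OTUs, 2 continuous characters, 1 two-state character.
rng = np.random.default_rng(1)
fine = np.repeat([0, 1, 2, 3], 10)         # four candidate groups
coarse = fine // 2                         # the same OTUs amalgamated into two
X_cont = rng.normal(size=(40, 2)) + 5.0 * fine[:, None]
X_disc = (np.arange(40) % 2).reshape(-1, 1)
diff, ddf, p_value = compare_groupings(X_cont, X_disc, coarse, fine)
```

A small p-value here would favour keeping the finer grouping; a large one would suggest that amalgamating the groups loses little.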