A reference test approach for moderating internal assessment at the upper secondary school level.
Degree Grantor: University of Canterbury
Degree Name: Master of Arts
The main purpose of this study was to investigate the suitability of reference tests for moderating internally assessed national qualifications at the upper secondary school level. In a secondary analysis, the relative merits of two alternative item formats, the open-ended and the cloze, were compared with the multiple-choice item, which has traditionally been utilized in reference tests of this nature. A series of short reference tests, based on the underlying construct of developed abilities, was constructed in four core subject areas (i.e. English, mathematics, science and social studies), along with an additional test of scholastic aptitude. The English, science and social studies tests consisted of a vocabulary and a reading comprehension component, while the mathematics test had a more traditional content, relating to the measurement of general concepts. An essay test was added to the English test analyses. Multiple forms of the developed abilities tests were prepared as separate multiple-choice and open-ended/cloze formats, and to enable a multiple matrix sampling technique to be employed. The validity of the reference tests was evaluated by using the performance of Christchurch fifth formers on the tests to predict their corresponding School Certificate Examination class parameters (i.e. mean and standard deviation). These analyses were based on a sample of 18 classes across four state, co-educational high schools, covering a wide range of ability levels. A series of multiple regression analyses was conducted to provide optimal predictions of the respective class parameters. It was found that each of the subject-based reference tests predicted class ability levels (i.e. means) on the corresponding School Certificate Examinations with a very high degree of sensitivity. The multiple R's generated were 0.97 for mathematics, 0.90 for English, 0.89 for science and 0.80 for social studies. The predictions of the spread of ability for each class (i.e.
standard deviation) proved more difficult, although the results were still sensitive enough for moderation purposes. The addition of the mathematics or scholastic aptitude test to the subject-based reference tests improved the multiple R's on both parameters. The comparison of item types revealed no significant difference in the prediction of class means. However, the open-ended/cloze format failed to predict class standard deviations at a statistically significant level. The findings were discussed with reference to those from earlier studies, to policy implications, to application at a practical level, and to the urgent need for further research.
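The prediction procedure described above can be sketched as an ordinary least-squares multiple regression, where class-level reference test scores predict a corresponding examination class parameter and the multiple R measures predictive sensitivity. The sketch below uses simulated data (the variable names, scales, and coefficients are illustrative assumptions, not the thesis data), with 18 classes to mirror the study's sample size:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 18  # number of classes, matching the study's sample

# Hypothetical class means: a subject reference test and a scholastic
# aptitude test (predictors), and a simulated examination mean (criterion)
# generated as a linear combination plus noise.
ref_test = rng.normal(50, 10, n)
aptitude = rng.normal(100, 15, n)
exam_mean = 0.8 * ref_test + 0.2 * aptitude + rng.normal(0, 3, n)

# Ordinary least-squares fit with an intercept column.
X = np.column_stack([np.ones(n), ref_test, aptitude])
beta, *_ = np.linalg.lstsq(X, exam_mean, rcond=None)

# Multiple R: correlation between predicted and observed class means.
predicted = X @ beta
multiple_R = np.corrcoef(predicted, exam_mean)[0, 1]
print(f"multiple R = {multiple_R:.2f}")
```

Adding the aptitude test as a second predictor parallels the study's finding that it improved the multiple R's over the subject-based reference test alone.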