
STATISTICAL METHODS FOR ANALYZING THE QUALITY OF TEST QUESTIONS AND TASKS

Farzon Nosiri

e-mail: farzon@kth.se

 

During my diploma project at university I helped build a set of programs to automate the testing system at the Khujand Branch of the Technological University of Tajikistan. My task was to assess the quality of test questions and tasks. Using the mathematical model of classical test theory, I analyzed the results of the spring term 2009 exams. Different models and methods for analyzing test questions and tasks were studied, and the appropriate statistical and analytical characteristics were applied to the current system.

Analysis of the tasks by their coefficients of difficulty and easiness. These are the first characteristics included in the analysis module. They can be called the Index of Easiness of the task (IE) and the Index of Difficulty (ID):

 

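$$
IE_j = \frac{x_{avg,j}}{x_{max,j}}, \qquad ID_j = 1 - IE_j, \qquad x_{avg,j} = \frac{1}{N}\sum_{i=1}^{N} x_{ij} \tag{1.1}
$$

where $x_{avg,j}$ is the average of the grades received by all tested students for task j ($x_{ij}$ being the grade of student i on task j),

$x_{max,j}$ is the maximum possible number of marks for task j,

N is the number of tested students.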

A quantitative measure of difficulty matters because, for tasks to differentiate tested students by their level of preparedness, the difficulty of the tasks should match that level. In general, a test should include a range of tasks, from the easiest to the most difficult. However, very easy tasks, which everybody answers correctly, and very difficult tasks, which nobody answers correctly, cannot differentiate tested students by their level of preparedness, and so cannot be considered testing tasks.

The other characteristic is the dispersion (variance) of the results of a test task, which is calculated as:
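The two indices can be sketched in a few lines of Python. This is only an illustration, not the PHP module described here: it assumes IE is the average grade divided by the maximum marks and takes ID as its complement, 1 - IE. The function names and the cut-offs 0.1 and 0.9 used to flag tasks that are too hard or too easy are hypothetical.

```python
def easiness_index(grades, max_marks):
    """IE of one task: mean grade divided by the maximum possible marks."""
    if not grades or max_marks <= 0:
        raise ValueError("need at least one grade and a positive maximum")
    return sum(grades) / len(grades) / max_marks


def difficulty_index(grades, max_marks):
    """ID of one task, taken here as the complement 1 - IE (an assumption)."""
    return 1.0 - easiness_index(grades, max_marks)


def is_usable(grades, max_marks, low=0.1, high=0.9):
    """True unless almost nobody or almost everybody solves the task.

    The cut-offs `low` and `high` are illustrative, not from the article.
    """
    return low <= easiness_index(grades, max_marks) <= high


# Five students answering a task worth 2 marks.
grades = [2, 1, 0, 2, 1]
print(easiness_index(grades, 2), difficulty_index(grades, 2))
print(is_usable(grades, 2))     # a task of moderate difficulty
print(is_usable([1, 1, 1], 1))  # everybody answered correctly
```

A task that every student solves gets IE = 1 and is flagged as unusable, which matches the remark that such tasks cannot differentiate students.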

 

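$$
D_j = \frac{1}{N}\sum_{i=1}^{N}\left(x_{ij} - x_{avg,j}\right)^2 \tag{1.2}
$$

where $x_{ij}$ is the grade of student i on task j.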

 

This characteristic shows the spread of the grades received by the N tested students on a particular task j of the test. If all tested students answer the question identically, the spread of grades is 0. Tasks with zero or very low dispersion have very little ability to separate tested students by their preparedness and therefore need to be excluded from the test. The higher the dispersion, the better the task differentiates between students.

One more very important statistical characteristic of a task's ability to differentiate is the Coefficient of Differentiation (CD). It is calculated as:
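As a small illustration, the dispersion of one task can be computed as follows. The sketch assumes the population form of the variance (division by N rather than N - 1); the function name is hypothetical.

```python
def dispersion(grades):
    """Population variance of the grades on one task (divides by N)."""
    n = len(grades)
    if n == 0:
        raise ValueError("need at least one grade")
    mean = sum(grades) / n
    return sum((x - mean) ** 2 for x in grades) / n


# Everybody received the same grade: no spread, so the task cannot
# separate students by preparedness.
print(dispersion([1, 1, 1, 1]))

# Half of the students solved the task: the largest spread a 0/1 task can have.
print(dispersion([0, 1, 0, 1]))
```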

                                                

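$$
CD_j = \frac{\sum_{i=1}^{N}\left(x_{ij} - x_{avg,j}\right)\left(s_i - S_{avg}\right)}{N\sqrt{D_j\,D_S}}, \qquad D_S = \frac{1}{N}\sum_{i=1}^{N}\left(s_i - S_{avg}\right)^2 \tag{1.3}
$$

where

$D_j$ is the dispersion of the results of task j, as in (1.2);

$D_S$ is the dispersion of the total results of the tested students over all test tasks;

$S_{avg}$ is the average of the marks received by all N tested students for the whole test;

$s_i$ is the sum of the marks of student i over all test tasks.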

This characteristic is the coefficient of correlation between the answers given by the tested students to a particular task and the results of the same students on the whole test. It lies between -1 and +1 and measures the ability of a particular task to tell strong students from weak ones. Positive values correspond to tasks that really do separate "strong" and "weak" students. Negative values show that poorly prepared students answer the task better, on average, than well-prepared students. Obviously, such tasks are poorly stated, cannot serve as test tasks, and should be removed from the test.

The next characteristic of the test analysis is a graphical view, i.e. diagrams. The students are divided into 3 or 4 groups according to their answers in the test. For each task separately, the diagram shows the average percentage of students in each group who answered correctly. A good test task has a diagram that rises diagonally. If the diagram zigzags or diverges, the task can be considered of poor quality.
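Since CD is described as a correlation coefficient, it can be sketched as the Pearson correlation between the scores on one task and the total test scores. The code below is an illustration under that assumption; the function name is hypothetical, and returning None for a task or test with zero dispersion is a choice made here, not something taken from the article.

```python
from math import sqrt


def coefficient_of_differentiation(item_scores, total_scores):
    """Pearson correlation between one task's scores and the test totals.

    Returns None when either series has zero dispersion, because the
    correlation is undefined in that case.
    """
    n = len(item_scores)
    if n == 0 or n != len(total_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    mean_x = sum(item_scores) / n
    mean_s = sum(total_scores) / n
    cov = sum((x - mean_x) * (s - mean_s)
              for x, s in zip(item_scores, total_scores)) / n
    d_x = sum((x - mean_x) ** 2 for x in item_scores) / n
    d_s = sum((s - mean_s) ** 2 for s in total_scores) / n
    if d_x == 0 or d_s == 0:
        return None
    return cov / sqrt(d_x * d_s)


totals = [2, 4, 6, 8]  # total marks of four students
# Only the two strongest students solved the task: positive CD.
print(coefficient_of_differentiation([0, 0, 1, 1], totals))
# Only the two weakest students solved it: negative CD, a suspect task.
print(coefficient_of_differentiation([1, 1, 0, 0], totals))
```

Note that the total score here includes the task itself, as in the description above; excluding it is a common variant but is not what the text describes.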

Students are assigned to groups by one of two criteria: the first divides the students based on their previous grades; the second divides them based on the grades received in the current test.

Based on these methods I wrote a program in PHP. The program was tested and debugged and is currently in use at the Khujand Branch of the Technological University of Tajikistan. Although automated testing is popular at universities elsewhere in the world, it is new for Tajik universities, and other schools can adopt this system too.
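The data behind such a diagram can be sketched as follows. The example ranks students by a score (previous grades or current test totals, matching the two criteria), cuts them into equal groups, and reports the average percentage of marks each group earned on one task. The function names, the equal-size split and the "rising" check are assumptions made for illustration.

```python
def group_profile(item_scores, ranking_scores, max_marks, n_groups=3):
    """Average percentage of marks on one task per group, weakest group first.

    `ranking_scores` may hold previous grades or current test totals.
    """
    n = len(ranking_scores)
    if n != len(item_scores):
        raise ValueError("score lists must have equal length")
    # Student indices sorted from weakest to strongest.
    order = sorted(range(n), key=lambda i: ranking_scores[i])
    profile = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not members:
            raise ValueError("more groups than students")
        mean = sum(item_scores[i] for i in members) / len(members)
        profile.append(100.0 * mean / max_marks)
    return profile


def is_rising(profile):
    """True if the profile never drops and ends higher than it starts."""
    never_drops = all(a <= b for a, b in zip(profile, profile[1:]))
    return never_drops and profile[-1] > profile[0]


totals = [10, 20, 30, 40, 50, 60]
good = group_profile([0, 0, 0, 1, 1, 1], totals, max_marks=1)
zigzag = group_profile([1, 1, 0, 0, 1, 1], totals, max_marks=1)
print(good, is_rising(good))      # rises from the weakest to the strongest group
print(zigzag, is_rising(zigzag))  # drops in the middle group
```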

