Professional Testing, Inc.
Providing High Quality Examination Programs

From the Item Bank

The Professional Testing Blog

 

What is Measurement Error and What is its Relationship to Reliability?

October 13, 2016

When discussing the statistical properties of an exam, psychometricians often use the term “error,” or measurement error. Error can be thought of as anything that contributes to a person’s exam score beyond the person’s true or actual ability. From a computational, classical test theory perspective, error is

E = O – T               or            Error = Observed – True

The Observed score is the actual score on the exam and True score is the person’s actual ability. Error is the difference between observed and true scores.

Error can be random or systematic. According to Crocker and Algina (1986), Introduction to Classical and Modern Test Theory, random error typically occurs during a single administration. Examples include guessing, misreading a question, or a candidate not feeling well. Random error of this kind is unlikely to recur in a subsequent administration. Systematic errors are stable attributes of the person or the exam that recur across administrations. These errors typically have little to do with the content being measured. An example is a wordy item that wraps a simple math problem in excess verbiage: the item is intended to measure the math skill, not the candidate’s ability to sort through the verbiage.
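
The distinction can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true score of 70, the systematic bias of -5 points, and the normally distributed random error are all hypothetical values chosen for illustration, not figures from Crocker and Algina. Averaged over many administrations, the random error washes out while the systematic error remains.

```python
import random
import statistics


def simulate_observed(true_score, bias, error_sd, n_admins, rng):
    """Return observed scores: true score + systematic bias + random error.

    The bias is the same on every administration (systematic error);
    the Gaussian term is redrawn each time (random error).
    """
    return [true_score + bias + rng.gauss(0, error_sd) for _ in range(n_admins)]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
true_score = 70.0               # hypothetical true ability
scores = simulate_observed(true_score, bias=-5.0, error_sd=4.0,
                           n_admins=10_000, rng=rng)

# Average error across administrations: close to the bias, because the
# random component averages toward zero while the systematic one does not.
mean_error = statistics.fmean(scores) - true_score
print(round(mean_error, 1))
```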

Why is measuring error important?

Reliability, theoretically speaking, is the relationship (correlation) between a person’s scores on parallel (equivalent) forms. The more error that is introduced into the observed score, the lower the reliability will be; as measurement error decreases, reliability increases. That said, administering two forms of an exam to each candidate in order to calculate reliability is not practical.
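
To make the link between error and reliability concrete, the following sketch simulates two parallel forms and correlates the scores. Everything here is an assumption made for illustration: the candidate count, the true-score distribution (mean 70, standard deviation 10), and the normally distributed random error are hypothetical, and `parallel_forms_reliability` is not a function from any psychometric library.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def parallel_forms_reliability(error_sd, n_candidates=5000, true_sd=10.0, seed=1):
    """Correlate simulated scores on two parallel forms.

    Each candidate has one true score; each form adds its own
    independent random error with the same standard deviation.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(70, true_sd) for _ in range(n_candidates)]
    form_a = [t + rng.gauss(0, error_sd) for t in true_scores]
    form_b = [t + rng.gauss(0, error_sd) for t in true_scores]
    return pearson(form_a, form_b)


low_error = parallel_forms_reliability(error_sd=3.0)    # little measurement error
high_error = parallel_forms_reliability(error_sd=10.0)  # error as large as true spread
print(round(low_error, 2), round(high_error, 2))
```

Under these assumptions, the form with less error should show a correlation near 0.9 and the noisier form a correlation near 0.5, which matches the pattern described above.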

Because creating perfectly parallel exam forms and administering two forms to a given candidate is not practical, we estimate reliability using a single-form methodology. Coefficient Alpha, along with its variations, has been a popular reliability estimate for single-form administrations.
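
As a rough illustration of a single-form estimate, the sketch below computes Coefficient Alpha from a candidates-by-items score matrix using the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The five-candidate, four-item response matrix is made-up data, and the function name `coefficient_alpha` is chosen here for illustration only.

```python
import statistics


def coefficient_alpha(item_scores):
    """Coefficient Alpha for a list of candidate rows of item scores.

    item_scores: one list per candidate, each holding that candidate's
    score on every item (here 1 = correct, 0 = incorrect).
    """
    k = len(item_scores[0])                                  # number of items
    item_vars = [statistics.variance(col) for col in zip(*item_scores)]
    total_var = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 candidates, 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = coefficient_alpha(responses)
print(round(alpha, 3))  # 0.8
```

An operational exam would have far more candidates and items than this toy matrix; the point is only to show that the estimate needs a single administration of a single form.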

Cureton (1958), as cited in Crocker and Algina (1986), Introduction to Classical and Modern Test Theory, discusses how a single-form reliability estimate is appropriate. A reliability estimate should be the ratio of True score variance to Observed score variance.
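
The variance ratio can be written out directly. This sketch assumes, as classical test theory typically does, that error is uncorrelated with the true score, so that Observed variance is the sum of True variance and Error variance. The variance values are hypothetical.

```python
def reliability_from_variances(true_var, error_var):
    """Reliability as True score variance over Observed score variance.

    Assumes error is uncorrelated with true score, so that
    observed variance = true variance + error variance.
    """
    observed_var = true_var + error_var
    return true_var / observed_var


# Hypothetical values: 80 units of true variance, 20 of error variance.
reliability = reliability_from_variances(true_var=80.0, error_var=20.0)
print(reliability)  # 0.8

# With no error at all, observed and true scores coincide.
perfect = reliability_from_variances(true_var=80.0, error_var=0.0)
print(perfect)  # 1.0
```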

Why is all of this important? Validity.

If systematic errors occur, there is a threat to the validity of the exam program. An exam that measures something other than what it is intended to measure will produce inaccurate exam results and inappropriate score interpretations. This would be considered a major flaw in the program.

Developing a certification test requires many steps, and it is these steps that reduce the probability of systematic error. They include defining the content of the exam (i.e., job analysis), writing and reviewing items with multiple and separate SMEs, conducting standard setting and equating studies, reviewing item and exam form statistics, and following proper scoring procedures. It is very important that organizations develop certification exams using best practices to avoid threats from error.

 

Image Attribution: Onderwijsgek at nl.wikipedia
