Professional Testing, Inc.
Providing High Quality Examination Programs

From the Item Bank

The Professional Testing Blog


Tools of the Trade

February 15, 2017

What is the role of a psychometrician and software in the development and maintenance of a certification examination program?

As another psychometric trade conference approaches, we receive many advertisements for revolutionary software that promises to assist certification bodies, reduce their costs, and run the reports needed to support exam program validation claims. These ads are intriguing, and some of the software is very good.

With that said, and given the high-stakes nature of our business, psychometric results should be reviewed and interpreted by someone with experience. Such interpretations are critical to the continuous improvement of a testing program, and they also help identify threats to it. A psychometrician will help a certification body understand the data, interpret results, and apply findings in a way that benefits the examination within the context of the overall program.

A good example of this is the interpretation of item and test statistics. Many of us are familiar with basic item statistics such as p-values (the percentage of candidates answering an item correctly) and item discrimination indices (statistics indicating an item’s ability to differentiate lower-ability candidates from higher-ability candidates). Many factors affect these statistics, and these statistics in turn affect other facets of an exam’s performance. Below is just a sampling of factors to consider when reviewing item and exam statistics:

  1. What is the sample size and how does that affect statistical interpretations?
  2. What type of item discrimination statistic is used and how does that affect interpretation?
  3. How does item option performance affect item statistics?
  4. How do easy and hard items affect the variability of scores?
  5. How does the variability of scores affect reliability estimates or standard error estimates?
  6. How does the placement of a passing standard within a score continuum affect reliability of decisions?
  7. How can one best optimize the performance of an assessment based on item and test statistics while considering a budget (e.g., item review meetings, new form development, item security, item exposure)?
  8. How does the review of problematic items affect the item bank, the current exam form and scoring decisions?
  9. How do the statistics, and their calculations, affect the reporting of sub-scores and their interpretations?
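
To make a few of the statistics named in the questions above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular software package: the response matrix is hypothetical, the discrimination index shown is a corrected point-biserial (one of several possible choices), and the reliability estimate is KR-20, which assumes dichotomously scored items. The function names are ours, chosen for illustration.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

# Hypothetical scored responses: one row per candidate, one column per item
# (1 = correct, 0 = incorrect). Real analyses need far larger samples.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]


def total_scores(data):
    """Raw total score for each candidate."""
    return [sum(row) for row in data]


def p_values(data):
    """Proportion of candidates answering each item correctly."""
    n_items = len(data[0])
    return [mean(row[i] for row in data) for i in range(n_items)]


def corrected_point_biserial(data, item):
    """Correlation between one item and the total score on the other items."""
    item_scores = [row[item] for row in data]
    rest_scores = [sum(row) - row[item] for row in data]
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        return float("nan")  # undefined when either variable has no variance
    m_item, m_rest = mean(item_scores), mean(rest_scores)
    cov = mean((x - m_item) * (y - m_rest)
               for x, y in zip(item_scores, rest_scores))
    return cov / (sd_item * sd_rest)


def kr20(data):
    """KR-20 reliability estimate for dichotomously scored items."""
    k = len(data[0])
    var_total = pvariance(total_scores(data))
    if k < 2 or var_total == 0:
        return float("nan")
    sum_pq = sum(p * (1 - p) for p in p_values(data))
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def standard_error_of_measurement(data):
    """SEM = SD of total scores * sqrt(1 - reliability)."""
    unreliability = max(0.0, 1 - kr20(data))  # guard against rounding error
    return pstdev(total_scores(data)) * sqrt(unreliability)


if __name__ == "__main__":
    print("p-values:", p_values(responses))
    print("discrimination, item 2:", corrected_point_biserial(responses, 1))
    print("KR-20:", kr20(responses))
    print("SEM:", standard_error_of_measurement(responses))
```

Notice how the pieces connect: the item p-values feed the KR-20 calculation, the spread of total scores drives both the reliability estimate and the standard error, and an item with no variance yields an undefined discrimination index. With only six hypothetical candidates, none of these numbers would be stable enough to act on, which is exactly the kind of judgment the questions above call for.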


Much of the value of data and statistics lies in how they are used within the parameters of a program: the pool of candidates, scoring decisions, volume of test-takers, programmatic goals, etc. The value of a psychometrician lies in understanding the characteristics of a certification body and its program elements and in guiding clients in the right direction. Using data and statistical analyses in a meaningful way helps mitigate threats to validity and strengthen exam programs.

The Standards for Educational and Psychological Testing (2014), developed by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education, provide this general definition: “Validity refers to the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests.” The Standards contain over a dozen chapters providing guidance on the processes and empirical evidence that should be collected to support validity claims. Empirical evidence is only part of the validation process, however, and it must be interpreted correctly. Correct interpretation of the data is what makes a validation strategy meaningful.


