Why Report Scaled Scores? - Professional Testing Blog


November 30, 2017 | By Dustin Shullick

Raw Scores vs. Scaled Scores

When reporting examination scores, one of the big decisions is how to report them: as raw scores or as scaled scores. Most examinations are initially scored with raw scores. A raw score is a score without any adjustment or transformation; it is simply the number of questions the candidate answered correctly. Raw scores do not always present the full picture to candidates because they do not take into account factors such as the difficulty of the questions or performance relative to other candidates.

Raw Scores and Passing Standards

Input from subject matter experts is essential to the standard setting process. These experts review the difficulty of the questions in the item pool relative to the skills and abilities of the target audience, and they provide guidance on where the passing score should be set. Because the passing score reflects the difficulty of the questions, the actual number of questions that a candidate has to answer correctly to pass may vary from one form to another if the difficulty of the question set changes. In other words, if a candidate sees a more difficult set of questions, it is not fair to expect that candidate to answer the same percentage correctly as someone who sees an easier set of questions.

If only percentages are reported, candidates cannot compare their scores across time, because a higher percentage on an easier set of items does not mean better performance than a lower percentage on a more difficult set of items. Without knowing the difficulty of each question, it is impossible to say what a raw score means across different examination forms.

The Benefits of Scaled Scores

To deal with this issue, a candidate's raw score is often transformed into a scaled score, and reported as such, for comparative and interpretive reasons.
Candidates are held to the same passing standard regardless of which examination form they take, so scaled scores are reported instead of raw scores to provide a direct comparison of performance across examination forms and administrations. This process ensures that the passing standard communicated to candidates remains the same.

Suppose that an examination has two forms, and one is more difficult than the other. Equating has determined that a score of 66% on examination form 1 is equivalent to a score of 71% on examination form 2. Scores on both forms can be converted to a scale so that these two equivalent scores have the same reported score. For example, both could be assigned a score of 350 on a scale of 100 to 500 (scales in this range are the most common, but theoretically any scale can be used).

How points are distributed across a scale range depends on where the passing score is set. In the above example, with 350 as the common passing score, the raw points below the passing score are distributed evenly between 100 and 350, while the raw points above it are distributed evenly between 350 and 500. It is important to remember that passing scores are not set arbitrarily and that a scaled score is not a percentage (i.e., 350 does NOT mean 35 percent!).

Another purpose of scaling scores is to let a candidate determine how their performance has changed between attempts. Because the difficulty of the question set that a candidate sees may vary from one attempt to the next, raw scores or percentages cannot tell a candidate whether their performance has improved. A lower raw score or percentage on a more difficult set of questions might actually mean improved performance if the first set of questions was a lot easier. Scaling allows a candidate to see improvements (or the lack of them) by putting all attempts on the same scale or metric.
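The distribution of points around the passing score described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the conversion any particular testing program uses: the function name `scale_score`, the 100-question form length, and the choice to map a raw score of zero to the bottom of the scale are all hypothetical, and operational conversions come out of the equating process rather than a simple formula.

```python
def scale_score(raw, raw_cut, raw_max,
                scale_min=100, scale_cut=350, scale_max=500):
    """Convert a raw score to a scaled score with a two-segment linear map.

    Assumptions (illustrative only):
      - a raw score of 0 maps to scale_min,
      - the raw passing score (raw_cut) maps to scale_cut,
      - the maximum raw score (raw_max) maps to scale_max,
      - points are spread evenly within each segment.
    """
    if not 0 < raw_cut < raw_max:
        raise ValueError("raw_cut must lie strictly between 0 and raw_max")
    if not 0 <= raw <= raw_max:
        raise ValueError("raw must lie between 0 and raw_max")

    if raw <= raw_cut:
        # Segment below the passing score: 0..raw_cut -> scale_min..scale_cut
        scaled = scale_min + (raw / raw_cut) * (scale_cut - scale_min)
    else:
        # Segment above the passing score: raw_cut..raw_max -> scale_cut..scale_max
        scaled = scale_cut + ((raw - raw_cut) / (raw_max - raw_cut)) * (scale_max - scale_cut)
    return round(scaled)


# Hypothetical 100-question forms, so 66% and 71% are raw scores of 66 and 71.
FORM_1 = {"raw_cut": 66, "raw_max": 100}  # the more difficult form
FORM_2 = {"raw_cut": 71, "raw_max": 100}  # the easier form

# The two equivalent passing scores land on the same scaled score.
print(scale_score(66, **FORM_1))  # 350
print(scale_score(71, **FORM_2))  # 350

# A lower raw score on the harder form can be the better performance.
print(scale_score(68, **FORM_2))  # 339 (first attempt, easier form)
print(scale_score(65, **FORM_1))  # 346 (second attempt, harder form)
```

In this sketch the candidate's raw score drops from 68 to 65 between attempts, yet the scaled score rises from 339 to 346, because the second form has the lower raw passing score.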
The biggest drawback of reporting scaled scores is that candidates may be confused about how to interpret them. It is important that the candidate not confuse the scaling process with weighting. Each point earned on an exam is worth 1 point regardless of whether it is earned through a dichotomously scored item (correct or incorrect) or a polytomously scored item (multiple points possible), and those points are scaled through a mathematical conversion that allows for comparisons of a candidate's testing attempts across time. Although it may look as though points are given a weight in this process, they are not weighted; this appearance is one of the few downsides of using scaled scores. A good way to avoid this confusion is to provide candidates with an explanation or rationale of the scaled scoring process along with their score report.

For high stakes testing, using scaled scores in reporting is an industry standard and best practice. It is a vital component in providing candidates with meaningful information that can assist them in interpreting their results and possibly improving their performance on subsequent attempts.

For more information on score reporting, please see "Eight Tips for Reporting Failing Test Scores on Licensing and Certification Tests" and "Revision of the Standards: An Advisory Note on Sub-Score Reporting".