Professional Testing, Inc.
Providing High Quality Examination Programs

From the Item Bank

The Professional Testing Blog

 

Validity, Test Security, and the Standards for Educational and Psychological Testing (AERA/APA/NCME)

March 10, 2016

Simply put, validity refers to the degree to which evidence supports the inferences we make from test scores. Such inferences may concern mastery of a defined content domain, acquisition of targeted skills or abilities, or work readiness. Evidence needed to establish this sometimes nebulous concept may include scores from written or performance-based tests; results from statistical analyses, such as correlation studies or factor analyses; the degree of alignment between the exam’s blueprint and the domain or constructs that the test was intended to target; a clear understanding of the population of examinees; and documentation of how the test form was assembled and scored and how the resulting scores were reported.

Validity is not a product or state of being: it is a process that results in a compelling body of evidence that the test did what it was designed to do, that the design was good, and that the resulting scores have meaning. Anything that corrupts the testing process or undermines the resulting scores also compromises that test’s validity, which is all the more reason to make sure your test is secure. End-users assume that the person eligible and registered to take the test was indeed the person who took it. They assume that the rules for that test’s administration were obeyed, that no unauthorized materials were used or accessed during the test session, and that there was an even playing field among examinees. They further assume that test takers made a good-faith attempt to pass the test and put the proverbial “best foot forward,” with no ill intent, no ulterior motive to gain exposure to test content, and no hidden agendas. Unfortunately, when stakes are high, such assumptions are naïve, if not dangerous: they put at risk the integrity, veracity, and ultimately the interpretability of the resulting scores.

So what are the “rules” of test taking? Where do test developers look for guidance when designing tests and the rules that define them? An important place to start is with the Standards for Educational and Psychological Testing, which is jointly produced by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education (AERA/APA/NCME, 2015).

Updated in 2015, the Standards “provide criteria for the development and evaluation of tests and testing practices and provide guidelines for assessing the validity of interpretations of test scores for the intended purposes” (AERA/APA/NCME, 2015). This collection of criteria and guidelines addresses the rights and responsibilities of test takers and test developers/users, and brings into consideration the issue of cheating and the role of test security in formulating a validity argument. For example, Standard 6.6 specifically states that “active efforts to prevent, detect, and correct scores obtained by fraudulent or deceptive means” should be taken.

Standard 6.6: Reasonable efforts should be made to ensure the integrity of test scores by eliminating opportunities for test takers to attain scores by fraudulent or deceptive means.

It is this standard that launches the conversation about the various policies and procedures that might be considered for a given test program. From establishing eligibility requirements to candidate identification requirements, from design issues such as item types, test administration windows and other operational decisions to exam-day procedures and post-exam data forensics, the Standards provide thoughtful guidance and direction.

How should secure materials be handled? See Standard 7.9. What are reasonable expectations regarding test takers’ responsibilities? See Standard 8.9. How might scores resulting from irregularities or candidate misconduct be handled? See Standard 8.11. Protection of copyrights? See Standards 9.21 and 9.22. Messaging to examinees about disclosing proprietary content? See Standard 9.23.

Regardless of whether testing is conducted for credentialing (licensure or certification), educational, or psychological purposes, the issue of test security quickly becomes an issue of fairness, and as such has direct implications for validity (see the background section of AERA/APA/NCME Chapter 11). Clearly, if one examinee has an unfair advantage over another examinee and a selection (or admission) decision is made using the first examinee’s ill-gotten score, then not only is the playing field uneven, but a less qualified candidate may be perceived as being more qualified than they are. As a result, a sub-optimal decision (promotion, acceptance into a program, graduation, awarding of a credential) may be made, and public or personal health, welfare, or safety may be compromised.

The AERA/APA/NCME Standards are a useful resource that provides respected guidance to test developers and test/score users. Using the Standards as a point of reference when developing an exam program and its associated test security plan aligns policies and procedures with recognized and respected external criteria. This alignment enhances confidence in the inferences made from test scores and the decisions made from those inferences. Test security, validity, and the Standards are (or should be) intrinsically linked.
