Not just psychometric babble
This article is Part 2 of a two-part series related to the development and legal defensibility of competence examinations used by government as one criterion of licensure eligibility. Read Part 1.
Your organization, the Association of Social Work Boards, recently published a significant amount of data related to the entry-level competence examinations that governmental social work boards rely on as one criterion for licensure eligibility. These examinations allow applicants for licensure an opportunity to demonstrate entry-level competence through a validated instrument administered on a uniform basis. It is important to note that exam candidates disclose demographic data voluntarily, and these data are not subject to verification under law. There has been much reaction to the data, which include information about the disparate performance of some self-identified ethnic and racial groups. The purpose of this article is to explain the examination development and validation process. Be assured that your organization adheres to and maintains a valid examination program for use by member boards. The process is highly technical, so bear with me as we navigate the details.
High-stakes licensure examinations
High-stakes licensure examinations have been used in the professions for decades to assess an applicant’s entry-level competence as part of licensure eligibility determinations. Their longevity is attributable to both their accurate assessment of competence and their validation based on industry standards. Virtually all regulated professions use these examinations and follow a process that ensures their defensibility.
As readers are aware, government licensure is intended primarily to protect the public by applying the statutes to the process of licensure. Further, government regulation of the professions creates equity by setting criteria in statute that must be uniformly applied to the process. Licensure examinations adhere to the principles of uniform application and allow all test-takers the opportunity to demonstrate entry-level competence. Uniformity based on industry standards eliminates subjectivity in the decision-making process. Government licensure is bound by the constitutional principles of due process and equal protection. Conversely, the private sector is not bound by such principles and, as a result, introduces subjectivity in its processes.
High-stakes licensure examinations are developed, administered, scored, and maintained following rigorous industry standards (referred to here as the Standards) that validate their ability to measure competence. The National Council on Measurement in Education, the American Educational Research Association, and the American Psychological Association develop and continually update the Standards. Adherence to the Standards provides the legal defensibility necessary to protect the government agencies that have been delegated the authority to issue and renew licensure.
Licensure examinations are criterion referenced, meaning that candidate results are based on the test-taker’s demonstration of competence, not on performance by peers (e.g., on a curve). In a criterion-referenced exam, each test-taker passes or fails based solely on a demonstration of knowledge, skills, and abilities measured against a fixed standard, so in principle every test-taker could pass (or fail). Testing facilities standardize testing to ensure that all test-takers experience the same test center environment, check-in process, test delivery process, on-site scoring, and test result reporting. In fact, the facilities are standardized right down to the floor plan and the placement of plants. This standardization reinforces exam defensibility and fairness to candidates.
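For readers who find a concrete example helpful, the difference between criterion-referenced scoring and grading on a curve can be sketched in a few lines of Python. This is an illustration only, not ASWB's scoring software; the cut score of 100, the cohort scores, and the function names are all hypothetical.

```python
def criterion_referenced(scores, cut_score):
    """Pass every candidate whose score meets a fixed standard."""
    return [s >= cut_score for s in scores]


def norm_referenced(scores, pass_fraction):
    """Pass only the top fraction of the group (grading on a curve)."""
    n_pass = round(len(scores) * pass_fraction)
    if n_pass == 0:
        return [False] * len(scores)
    # The threshold is whatever score the last passing candidate earned.
    threshold = sorted(scores, reverse=True)[n_pass - 1]
    return [s >= threshold for s in scores]


# Hypothetical raw scores for a strong cohort and a hypothetical cut score.
strong_cohort = [120, 118, 115, 110, 105]
HYPOTHETICAL_CUT = 100

fixed_standard = criterion_referenced(strong_cohort, HYPOTHETICAL_CUT)
on_a_curve = norm_referenced(strong_cohort, pass_fraction=0.6)
```

Under the fixed standard, every member of this cohort passes because each one met the standard. Under the curve, two candidates fail even though their scores exceed the cut, simply because their peers scored higher.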
Turning our attention to the examination development process, it all begins with a practice analysis—a survey of the profession identifying what new practitioners do, how often they do it, and how important each task is when first entering the profession. The survey is made available to social workers in the United States and Canada. Based on the results of the survey, the content areas of the exam are identified and used to construct the exam blueprint that forms the basis for item development. (Items are the questions that are asked of the test-takers.)
Item development is a process wherein social workers are chosen to create a pool of item writers that reflects diversity of practice area, geographic location, gender, race, ethnicity, age, licensure category, and other criteria. They write examination questions after receiving training in this specialized discipline. Based on their qualifications, trained item writers are assigned certain content areas and contracted to produce an identified number of items. Writers submit their items for review and editing to consultants experienced both in the field and in item development. Consultants take into consideration all factors related to the test-takers and review for language and words that may be misleading or driven by cultural conceptions. If needed, the consultant returns items to the item writer for further development.
After items have been drafted and revised, they are presented to one of the ASWB examination committees. The committees represent discrete licensure categories and examinations and are composed of content experts with diverse expertise in areas of social work such as academia or other practice environments; practice category; geographic location; gender, ethnicity, race, and age; and other factors. The committee role is to collectively review the items and engage in a discussion that results in the item moving on to pretesting, being returned to the writer for further work, or being rejected and removed from consideration.
Items that graduate from committee review move to the pretesting phase. Pretesting allows the statistical performance of every single item to be analyzed before the item is included on an exam as a scored item. To review, the ASWB examinations contain 170 items, of which 150 are scored for purposes of determining the test-taker’s pass–fail status. The remaining 20 are not scored; instead, they are statistically analyzed for performance. Pretest items that perform well according to psychometric standards eventually become scored items that count toward pass–fail determinations on future exams. Poorly performing pretest items are either deleted or returned to the committee review process.
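To give a sense of what "statistically analyzed for performance" can look like, here is a minimal sketch using two classical item statistics: difficulty (the proportion of test-takers answering correctly) and the point-biserial correlation between the item and the total scored result. The article does not specify which statistics or thresholds ASWB applies, so the screening thresholds and the sample data below are assumptions chosen purely for illustration.

```python
from statistics import mean, pstdev


def item_difficulty(item_responses):
    """Proportion of test-takers who answered the item correctly (0/1 data)."""
    return mean(item_responses)


def point_biserial(item_responses, total_scores):
    """Pearson correlation between a 0/1 item and the total scored result."""
    mx, my = mean(item_responses), mean(total_scores)
    sx, sy = pstdev(item_responses), pstdev(total_scores)
    if sx == 0 or sy == 0:
        return 0.0  # no variation, so the item cannot discriminate
    cov = mean([(x - mx) * (y - my)
                for x, y in zip(item_responses, total_scores)])
    return cov / (sx * sy)


def screen_pretest_item(item_responses, total_scores,
                        min_p=0.20, max_p=0.95, min_rpb=0.15):
    """Return True if the item clears the (hypothetical) thresholds."""
    p = item_difficulty(item_responses)
    rpb = point_biserial(item_responses, total_scores)
    return min_p <= p <= max_p and rpb >= min_rpb


# Hypothetical data: one pretest item and each candidate's score on the
# 150 scored items (the pretest item itself does not count toward it).
pretest_item = [1, 1, 1, 0, 0, 0]
scored_totals = [140, 130, 120, 100, 90, 80]

difficulty = item_difficulty(pretest_item)
discrimination = point_biserial(pretest_item, scored_totals)
keep_item = screen_pretest_item(pretest_item, scored_totals)
```

In this made-up sample, candidates with higher totals are the ones who answered the pretest item correctly, so the item discriminates well and would clear the illustrative thresholds.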
Test-takers do not know which items are scored and which items are in pretest. In addition to the statistical analyses, exam items undergo continual performance analysis based on demographics, referred to as differential item functioning or DIF. Although the test-taker demographics are voluntarily disclosed and therefore unverified, ASWB uses these data in the DIF process as an added measure of validation. Items that show DIF are deleted from the item banks. Of interest, in 2021, ASWB administered more than 60,000 examinations; fewer than 5 percent of items showed DIF and were deleted before becoming scored items. This result is a testament to the rigor of ASWB’s exam development and validation process.
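The article does not say which DIF statistic ASWB uses. One widely used approach is the Mantel-Haenszel procedure, which compares two groups of test-takers after matching them on overall score. The sketch below assumes that procedure; the score bands and counts are invented for illustration.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score bands.

    Each stratum is a tuple of counts:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    A value near 1.0 means matched candidates in both groups perform alike.
    """
    numerator = denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score bands
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def mh_d_dif(odds_ratio):
    """Express the odds ratio on the delta scale (0 means no DIF)."""
    return -2.35 * math.log(odds_ratio)


# Hypothetical counts for two score bands (lower and higher scorers).
balanced_item = [(30, 10, 15, 5), (20, 20, 10, 10)]
suspect_item = [(30, 10, 10, 10), (20, 20, 5, 15)]

balanced_ratio = mantel_haenszel_odds_ratio(balanced_item)
suspect_ratio = mantel_haenszel_odds_ratio(suspect_item)
```

The balanced item yields an odds ratio of 1.0: among candidates with similar overall scores, both groups answer it correctly at the same rate. The suspect item yields a ratio of about 3.0, which is the kind of result that would prompt closer review or removal.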
The described development and item performance process, which adheres to item response theory, ensures that the performance of every single item is analyzed and statistically validated. The final step is populating the examination form with items specific to each content area. ASWB has multiple forms, or versions, of the exams live at any given moment. Using multiple forms protects exam security and prevents multiple test-takers from receiving the same form in a particular test center. As a further exam security measure, the order of items on each examination form is randomized so that test-takers cannot compare items either during or after exam administrations.
A word on exam scoring
Consistent with the Standards, ASWB engages in a standard-setting exercise to establish the passing score for the anchor examination that is created after the blueprints are developed. A passing standard on a high-stakes licensure examination is not based on a percentage of correct responses. Standard-setting is a complex process not easily described in a newsletter article. Suffice it to say that a passing score study panel engages in an iterative process to determine the likelihood that a minimally competent social worker would answer each item correctly. After many rounds of analysis, the panel recommends the number of items that must be correctly answered to ensure that the test-taker possesses entry-level competence. Once the anchor examination passing standard is set, it is used to calibrate the passing standard of every other form based on the difficulty of items on that form as established through the statistical analyses of item response theory. Thus, each form may have a slightly different raw passing score.
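For the curious, the two ideas in this section can be sketched in simplified form. The panel exercise described above resembles what psychometricians call an Angoff-style method, and the form-to-form calibration can be illustrated with a one-parameter (Rasch) model. The article names neither, so both are assumptions made for this sketch, as are the panel ratings and item difficulties; it is not ASWB's actual procedure.

```python
import math


def angoff_cut_score(ratings):
    """Sum the panel's average rating for each item.

    ratings[j][i] is panelist j's estimate of the probability that a
    minimally competent candidate answers item i correctly.
    """
    n_items = len(ratings[0])
    item_means = [sum(panelist[i] for panelist in ratings) / len(ratings)
                  for i in range(n_items)]
    return sum(item_means)


def expected_score(theta, difficulties):
    """Expected raw score at ability theta under a Rasch model."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)


def theta_for_score(target, difficulties, lo=-6.0, hi=6.0, tol=1e-8):
    """Bisection search for the ability whose expected score is the target."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def equated_cut(anchor_cut, anchor_difficulties, new_difficulties):
    """Carry the anchor form's standard over to another form."""
    theta = theta_for_score(anchor_cut, anchor_difficulties)
    return expected_score(theta, new_difficulties)


# Hypothetical panel of two raters judging two items.
panel_cut = angoff_cut_score([[0.6, 0.8], [0.8, 0.6]])

# Hypothetical three-item forms; the second form has harder items.
anchor_form = [-1.0, 0.0, 1.0]
harder_form = [0.0, 1.0, 2.0]
same_form_cut = equated_cut(1.5, anchor_form, anchor_form)
harder_form_cut = equated_cut(1.5, anchor_form, harder_form)
```

The harder form ends up with a lower raw passing score (roughly 0.89 of 3 items rather than 1.5), which is why each form may have a slightly different raw passing score while holding every test-taker to the same level of competence.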
Education and exam: Equally important yet different licensure eligibility criteria
Readers are encouraged to consider the difference between the educational and examination components of a licensure process. Remember that the statutes call for the successful completion of both an educational program and an examination as prerequisites to government-issued licensure. Licensure examinations are not capstones to an educational program. Rather, they assess entry-level competence using content that is based on a survey instrument (the practice analysis) and validated through the process described above. They assess competence at that point in time. An educational program uses academic freedom and student–professor relationships over a multiyear period to prepare students for a lifelong career in the social work profession. Each component is vital to licensure eligibility decisions and, ultimately, public protection.
ASWB is an organization of government social work boards that have been delegated the authority to regulate the profession in the interest of public protection. Regulators are reminded of the need to understand the roles of education and the exam in fulfilling the intent of the legislation.
The ASWB data publication has caused many social workers to jump to unjustified conclusions related to the examination program. Emotionally reactive responses are causing some individuals to point to the examinations as the basis for societal issues that have long existed. Government-mandated licensure is enacted for a reason. Diluting the fundamental criteria necessary for licensure and consumer protection will have consequences. The entire social work community is encouraged to self-reflect on the various roles played and assess how each organization can use its own data to address the societal issues in need of attention. ASWB is continuing its evolution and is excited to take additional action steps, including exploring professional competency, revisiting examination structure, and engaging additional voices in exam development. Please visit the ASWB website for more details.