Organization of Test Reviews for the Mental Measurements Yearbook Series

When preparing your review, please organize it using the following categories. We hope that a common set of categories will make reviews easier for our readers to follow and understand. Using the same categories also makes it more likely that reviews will cover similar information about the reviewed tests, which helps our users compare the different tests they may be considering. We recognize that a category may not apply to a particular test review, but we expect these categories to be appropriate for the majority of tests reviewed in the Mental Measurements Yearbook.

We have identified five general categories for test reviews:

  1. Description
  2. Development
  3. Technical
  4. Commentary
  5. Summary

Review Category Descriptions

Reviewers are asked to organize their test review using five general categories. The purpose of this document is to identify the kinds of information that reviewers usually include under each of these categories. Each category should be concise. The total length of a typical review should be 1,000 to 1,500 words.

1. Description

This section gives a general description of the test. The purposes of the assessment, the target population, and the intended uses of the test could be presented. In addition, information about administration of the test should be summarized, along with information on the scores and scoring procedures.

2. Development

This section reviews how the test was developed, what underlying assumptions or theory guided the decisions about how to define the construct, and how the items were developed. Results of pilot testing should be discussed in this section. In addition, the reviewer might comment on any steps taken in selecting the final set of items for the test and on any evaluations of the appropriateness of these items for measuring the construct(s) of interest.

3. Technical

This section can be divided into three parts: standardization, reliability, and validity.

In the standardization section, information about the norm sample is presented, including how well this sample matches the intended population. Appropriateness of the norms for different gender or ethnic/cultural groups could also be discussed.

In the reliability section, evidence for score consistency should be presented. The types of reliability estimates and their magnitudes should be summarized. Comments about the acceptability of the levels of reliability, the sample used for determining these estimates, and related issues are pertinent to this category.

In the validity section, the interpretations and potential uses of test results should be addressed. Studies designed to gather evidence of valid uses should be summarized. Information about test content and how adequately the test measures the intended construct also should be presented. If the test is intended to be used to make classifications or predictions, evidence supporting those uses should be described in this section. In addition, reviews should examine the differential validity of the test across gender, racial, ethnic, and cultural groups (including differential item functioning if not addressed in the Development section). Comments about the acceptability of the evidence presented to support test score interpretation and use belong in this category. Consistent with current measurement standards, a test is not deemed "valid" in and of itself; rather, validity concerns whether the uses of the test results can be substantiated and how well the results meet the intended purposes of the test.

4. Commentary

This section provides an opportunity for the reviewer to address the overall strengths and weaknesses of the test. The adequacy of the theoretical model supporting test use should be summarized, as should the impact of current research on the test's assumptions.

5. Summary

In no more than six or seven concise sentences, the reviewer is to provide conclusions and recommendations about the quality of the test. The summary should be as explicit as possible. If another test should be considered for use, that test should be listed, cited, and referenced.
