Combining constructed response items and multiple choice items using a hierarchical rater model
by Kim, YoungKoung, Ph.D., COLUMBIA UNIVERSITY, 2009, 162 pages; 3373771

Abstract:

Many large-scale assessments have implemented tests that include both constructed response (CR) items and multiple choice (MC) items. Compared to scoring MC items, scoring CR items requires raters and thus introduces an additional, subjective layer to the scoring process. The use of raters in scoring CR items raises issues with respect to how to use scores from CR items along with scores from MC items.

The present study explores an approach to combining scores from CR and MC items via an extension of a hierarchical rater model (HRM). The extended HRM incorporates a latent class signal detection theory (SDT) model, which provides a useful model of rater behavior, in the first level of the model, whereas the second level relates the latent classes of the SDT model to examinee ability using an item response theory (IRT) model. In addition, scores from MC items can be used as direct indicators of ability in the second level of the HRM-SDT model.
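The two-level structure described above can be illustrated with a small data-generating sketch. The abstract does not give the functional forms, so everything specific here is an assumption made for illustration: a logistic latent class SDT model for ratings, P(Y <= k | eta) = logistic(c_k - d * eta), a generalized-partial-credit-style model for the latent category eta given ability theta, and a two-parameter logistic model for MC items. All function names, parameter names, and parameter values are hypothetical.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_categorical(probs, rng):
    """Draw an index 0..len(probs)-1 with the given probabilities."""
    u = rng.random()
    cum = 0.0
    for k, p in enumerate(probs):
        cum += p
        if u < cum:
            return k
    return len(probs) - 1


def latent_category_probs(theta, a, b_steps):
    """Level 2 (assumed form): partial-credit-style probabilities of the
    latent CR category eta = 0..len(b_steps), given ability theta."""
    logits = [0.0]
    for b in b_steps:
        logits.append(logits[-1] + a * (theta - b))
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(z - m) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def rating_probs(eta, d, criteria):
    """Level 1 (assumed form): latent class SDT rater model.
    d is the rater's detection parameter; criteria must be increasing."""
    cum = [logistic(c - d * eta) for c in criteria] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


def mc_prob(theta, a, b):
    """MC items as direct indicators of ability (assumed 2PL form)."""
    return logistic(a * (theta - b))


def simulate_examinee(theta, cr_items, raters, mc_items, rng):
    """Return (ratings per CR item per rater, MC responses) for one examinee."""
    ratings = []
    for item in cr_items:
        eta = sample_categorical(
            latent_category_probs(theta, item["a"], item["b_steps"]), rng)
        ratings.append([
            sample_categorical(rating_probs(eta, r["d"], r["criteria"]), rng)
            for r in raters
        ])
    mc = [int(rng.random() < mc_prob(theta, it["a"], it["b"]))
          for it in mc_items]
    return ratings, mc


# Hypothetical design: 2 CR items (4 categories), 3 raters, 5 MC items.
rng = random.Random(0)
cr_items = [{"a": 1.0, "b_steps": [-1.0, 0.0, 1.0]} for _ in range(2)]
raters = [{"d": 2.0, "criteria": [1.0, 3.0, 5.0]} for _ in range(3)]
mc_items = [{"a": 1.2, "b": b} for b in (-1.0, -0.5, 0.0, 0.5, 1.0)]
ratings, mc = simulate_examinee(0.3, cr_items, raters, mc_items, rng)
```

In this sketch, raters never observe ability directly: they observe only the latent category of each constructed response, with noise governed by `d` and the criteria, while MC responses depend on ability directly. That is the sense in which MC items act as additional indicators at the second level.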

Simulations and analysis of real world data were conducted to examine the performance of the HRM-SDT. The simulations showed that the rater parameters were accurately recovered for versions of the HRM-SDT with or without MC items. The results also showed that adding MC items improved estimation of the rater parameters, and greatly improved estimation of the CR item parameters. In addition, increasing the number of CR items considerably improved estimation of the CR item parameters, but only for the HRM-SDT without MC items. Thus, one can accurately evaluate CR item characteristics by either including MC items in the model or by adding more CR items.

The study also found that ability estimation using both CR and MC items was noticeably better than when only CR items were used. Compared to other approaches to combining CR and MC items, the approach via HRM-SDT yielded the best estimation of ability. For example, it was found that the HRM-SDT model provided the best weighted composite for MC and CR items, as compared to commonly used weighting schemes. Thus, the HRM-SDT model appears to offer advantages over simply using arbitrary composite weights.
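For readers unfamiliar with composite weighting, the sketch below shows what a fixed-weight composite of MC and CR section scores looks like. The two schemes shown (equal weights and weights proportional to maximum points) are assumed examples of conventional schemes, not necessarily the ones the dissertation evaluated, and the function names and scores are hypothetical. This is not the HRM-SDT composite, which is model-based.

```python
def proportion_scores(mc_correct, mc_max, cr_points, cr_max):
    """Convert raw section scores to proportions of the maximum."""
    return mc_correct / mc_max, cr_points / cr_max


def weighted_composite(mc_score, cr_score, w_mc, w_cr):
    """Fixed-weight linear composite of the two section scores."""
    return w_mc * mc_score + w_cr * cr_score


# Hypothetical examinee: 32 of 40 MC items correct, 6 of 12 CR points.
mc_p, cr_p = proportion_scores(32, 40, 6, 12)

# Scheme 1 (assumed): equal weights for the two sections.
equal = weighted_composite(mc_p, cr_p, 0.5, 0.5)

# Scheme 2 (assumed): weights proportional to maximum points (40 vs. 12).
by_points = weighted_composite(mc_p, cr_p, 40 / 52, 12 / 52)
```

The same examinee receives a different composite under each scheme, which is why the choice of weights matters when it is made arbitrarily rather than derived from a model.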

 
Adviser: Lawrence T. DeCarlo
School: COLUMBIA UNIVERSITY
Source: DAI/A 70-08, p. , Nov 2009
Source Type: Dissertation
Subjects: Educational tests & measurements; Statistics; Quantitative psychology and psychometrics
Publication Number: 3373771
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3373771