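Abstract
In the practice of cognitive diagnostic assessment, validating the reasonableness of the attribute hierarchy is essential, yet existing indices are limited to dichotomously (0-1) scored tests and cannot meet the practical need for diverse test formats and scoring methods. This study extends the hierarchy consistency index for 0-1 scoring (MHCI) to a hierarchy consistency index for polytomous scoring (GHCI). Simulation and empirical results show that: (1) GHCI has the same essential meaning as MHCI and accounts for the multiple possible scores on parent and child items, so that MHCI is subsumed within the GHCI framework; (2) in polytomous or mixed scoring contexts, MHCI loses information, tends to underestimate consistency, and is sensitive to the conversion ratio; (3) GHCI shows good suitability in both simulated and practical settings, and the cutoff value for fit can be set according to the attribute hierarchy.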
In the educational practice of cognitive diagnostic assessment (CDA), it is crucial to validate the reasonableness of the hierarchical structure of attributes, because it directly affects the quality of CDA and the accuracy of examinee classification. Researchers have developed several indices for this purpose. Based on the Attribute Hierarchy Method (AHM), Cui, Leighton, Gierl and Hunka (2006) developed the Hierarchy Consistency Index (HCI) to measure the degree to which an observed examinee response pattern is consistent with the attribute hierarchy; Ding, Mao, Wang and Luo (2011) modified the HCI into a new index, the Modified HCI (MHCI). In addition, Guo (2012) proposed the hierarchy misfit index (HMI) to detect misfitting item response vectors. Although these indices can detect misfit, all of them are suitable only for dichotomous items.
For polytomous items, researchers validating the reasonableness of the attribute hierarchy (AH) generally transform the items into dichotomous ones according to prespecified rules and then calculate the HCI (Kang, Wu, Chen, & Zeng, 2015; Kang, Xin, & Tian, 2013) or the MHCI (Ding et al., 2012). However, transforming polytomous items into dichotomous ones loses some detail, which produces larger errors (Ding, Wang, & Luo, 2014) and underestimates the consistency of the AH. The purpose of this study is to extend the MHCI to a new index suitable for both dichotomous and polytomous items, named the Generalized Hierarchy Consistency Index (GHCI). To evaluate the suitability of the GHCI, a simulation study and an empirical study were conducted.
In the simulation study, to compare the GHCI and the MHCI under different conversion ratios, two independent variables were manipulated: type of AH and proportion of transformation. The AH had 4 levels (linear, convergent, divergent, and unstructured), and the proportion of transformation had 5 levels (GHCI, 60% MHCI, 2/3 MHCI, 75% MHCI, 100% MHCI). The control variables were the number of attributes, K = 5, and the number of examinees, N = 2000. MATLAB R2013a was used to generate the examinees' response matrix and to compute the GHCI and MHCI. Results showed that: (1) both the GHCI and the MHCI were affected by AH type: the linear AH yielded the best GHCI and MHCI values, followed by the convergent AH, then the divergent AH, with the unstructured AH the worst; (2) the GHCI had larger means than the MHCI regardless of AH type and proportion of transformation. The empirical study showed a pattern consistent with the simulation study.
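The sensitivity of a dichotomized index to the conversion ratio can be illustrated with a minimal Python sketch. It does not implement the MHCI or the GHCI of this study, whose formulas are not given in the abstract. It computes the original HCI as defined by Cui and Leighton (2009): for every correctly answered item j, each item g that requires a subset of the attributes of j counts as one comparison, and each such g answered incorrectly counts as one misfit, so that HCI = 1 - 2 * misfits / comparisons. The dichotomization rule (score 1 when the raw score reaches the given proportion of full marks), the function names, the Q-matrix, and the scores below are all assumptions made for illustration.

```python
import numpy as np


def dichotomize(scores, max_scores, ratio):
    """Assumed conversion rule: 1 if score >= ratio * full marks, else 0."""
    scores = np.asarray(scores, dtype=float)
    max_scores = np.asarray(max_scores, dtype=float)
    # Small tolerance so that ratios such as 2/3 are not lost to rounding.
    return (scores >= ratio * max_scores - 1e-12).astype(int)


def hci(x, q):
    """HCI (Cui & Leighton, 2009) for one 0/1 response vector.

    x: responses, length n_items.
    q: Q-matrix, shape (n_items, n_attributes), entries 0/1.
    Returns NaN when no comparison is possible.
    """
    x = np.asarray(x)
    q = np.asarray(q)
    misfits = 0
    comparisons = 0
    for j in range(len(x)):
        if x[j] != 1:
            continue  # only correctly answered items are compared
        for g in range(len(x)):
            if g == j:
                continue
            # Item g requires a subset of the attributes measured by item j.
            if np.all(q[g] <= q[j]):
                comparisons += 1
                misfits += 1 - x[g]
    if comparisons == 0:
        return float("nan")
    return 1 - 2 * misfits / comparisons


# Hypothetical linear hierarchy A1 -> A2 -> A3, one item per level.
Q = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]])
raw = [2, 3, 4]    # hypothetical polytomous scores
full = [4, 4, 4]   # full marks per item

x_50 = dichotomize(raw, full, 0.50)
x_75 = dichotomize(raw, full, 0.75)
hci_50 = hci(x_50, Q)
hci_75 = hci(x_75, Q)
print(x_50.tolist(), hci_50)
print(x_75.tolist(), hci_75)
```

In this toy case the same raw scores yield an HCI of 1 under a 50% ratio and about -0.33 under a 75% ratio, because the partial credit on the first item is discarded once the threshold rises above it.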
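References
Ding, S. L., Mao, M. M., Wang, W. Y., Luo, F., & Cui. (2012). Evaluating the consistency between educational cognitive diagnostic tests and cognitive models. Acta Psychologica Sinica, 44(11), 1535-1546.
Ding, S. L., Wang, W. Y., & Luo, F. (2014). Design of the blueprint for polytomously scored cognitive diagnostic tests: The rooted tree structure. Journal of Jiangxi Normal University (Natural Science Edition), 38(2), 111-118.
Kang, C. H., Ren, P., & Zeng, P. F. (2015). A nonparametric cognitive diagnostic method: Cluster analysis for polytomous scoring. Acta Psychologica Sinica, 47(8), 1077-1088.
Kang, C. H., Wu, H. Y., Chen, J., & Zeng, P. F. (2015). Development of a cognitive diagnostic test for "Figures and Geometry" in primary school mathematics. Educational Measurement and Evaluation, 10, 4-8.
Kang, C. H., Xin, T., & Tian, W. (2013). Development and validation of a cognitive diagnostic test for primary school mathematics word problems. Examinations Research, 6, 24-43.
Kang, C. H., Yang, Y. K., Zhong, X. L., & Zeng, P. F. (2016). Suitability of the Q-matrix for fourth-grade mathematics word problems. Journal of Jiangxi Normal University (Natural Science Edition), 40(4), 369-376.
Li, J., Ding, S. L., & Luo, F. (2013). A generalized distance discriminating method based on the graded response model. Journal of Jiangxi Normal University (Natural Science Edition), 36(6), 636-639.
Luo, H., Ding, S. L., Wang, W. Y., Yu, X. F., & Cao, H. Y. (2010). A polytomous attribute hierarchy method with unequally weighted attributes. Acta Psychologica Sinica, 4, 528-538.
Ma, K. (2014). Understanding of the concept of fractions and its teaching. Master's thesis, Capital Normal University.
Mao, M. M. (2011). Cognitive diagnosis research introducing granular computing and formal concept analysis techniques. Doctoral dissertation, Jiangxi Normal University.
Tian, W., & Xin, T. (2012). The rule space method based on the graded response model. Acta Psychologica Sinica, 44(1), 249-262.
Tu, D. B., Cai, Y., Dai, H. Q., & Ding, S. L. (2010). A polytomous cognitive diagnostic model: Development of the P-DINA model. Acta Psychologica Sinica, 10, 1011-1020.
Zhang, S. M., Bao, Y., & Guo, W. H. (2013). A generalized cognitive diagnostic model for polytomous scoring. Psychological Exploration, 33(5), 444-450.
Zhu, Y. F., & Ding, S. L. (2009). The attribute hierarchy method based on the graded response model. Acta Psychologica Sinica, 41(3), 267-275.
Zhu, Y. F., Wang, L. H., Ding, S. L., & Wang, W. Y. (2015). Development of a multiple-strategy cognitive diagnostic method for polytomous scoring. Journal of Jiangxi Normal University (Natural Science Edition), 39(4), 371-376.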
Alves, C. (2012). Making diagnostic inferences about student performance on the Alberta Education Diagnostic Mathematics Project: An application of the attribute hierarchy method. Unpublished PhD dissertation, University of Alberta.
Bolt, D., & Fu, J. B. (2004). A polytomous extension of the fusion model and its Bayesian parameter estimation. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
Cui, Y., & Leighton, J. P. (2009). The hierarchy consistency index: Evaluating person fit for cognitive diagnostic assessment. Journal of Educational Measurement, 46(4), 429-449.
Cui, Y., Leighton, J. P., Gierl, M., & Hunka, S. (2006). A person-fit statistic for the attribute hierarchy method: The hierarchy consistency index. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco, CA.
Gierl, M. J., Wang, C., & Zhou, J. (2008). Using the attribute hierarchy method to make diagnostic inferences about examinees' cognitive skills in algebra on the SAT. Journal of Technology, Learning, and Assessment, 6(6), 1-53.
Guo, Q. (2012). The hierarchy misfit index: Evaluating person fit for cognitive diagnostic assessment. Unpublished master's dissertation, University of Alberta.
Sun, J., Xin, T., Zhang, S. M., & de la Torre, J. (2013). A polytomous extension of the generalized distance discriminating method. Applied Psychological Measurement, 37(7), 503-521.
Wang, C., & Gierl, M. J. (2011). Using the attribute hierarchy method to make diagnostic inferences about examinees' cognitive skills in critical reading. Journal of Educational Measurement, 48(2), 165-187.