Publication details

Test Versions' Comparability in the Upper-Secondary School Leaving Examination of English



Year of publication 2016
Type Appeared in Conference without Proceedings
Description The presentation outlines a dissertation project focusing on the assurance of test version comparability in the context of high-stakes exams – the upper-secondary school leaving examinations in English as a foreign language – using real data provided by the Slovak institution NÚCEM. The findings, conclusions and suggestions might be useful for similar centralised exams that arose in Central, Eastern and South-Eastern Europe after 1989. The aims of the research are to investigate what comparability means in high- and low-stakes contexts and how it has been dealt with, established and demonstrated by different bodies in the field of language testing (Cambridge ESOL, Goethe-Institut, European universities, national testing bodies, etc.); to provide a theoretical rationale for the development of comparable test versions; and to investigate which methods and approaches would be suitable for the context of Slovak national high-stakes exams given the existing constraints (e.g. legislation, accountability to the stakeholders and to the public in general), and how to implement them in the test development process. The investigation of test version comparability comprises analyses of different aspects of the exam: structural and content equivalence of the construct, psychometric equivalence of the test versions, and structural equivalence of the test-taker population. Several methods will be used, combining qualitative and quantitative analyses: the degree of construct equivalence will be judged using confirmatory factor analysis and/or structural equation modelling; content analysis will be conducted by a panel of expert judges using a descriptive framework based on the CEFR (Council of Europe, 2001) and on the models by Khalifa and Weir (2009) for reading, Buck's model (2001) for listening, and Purpura's model (2004) for grammatical ability. For the content analysis, the degree of agreement will be quantified using Krippendorff's alpha.
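The inter-coder agreement step could, for instance, be computed as in the following minimal sketch of Krippendorff's alpha for nominal categories (the function name and the data layout — one list of category labels per rated unit — are illustrative assumptions, not part of the project):

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(ratings):
    """Krippendorff's alpha for nominal data.

    ratings: list of units; each unit is the list of category labels
    assigned by the coders who rated it (missing coders simply omitted).
    """
    # Coincidence matrix: every ordered pair of values within a unit
    # contributes 1/(m - 1), where m is the number of ratings of that unit.
    coincidences = Counter()
    for unit in ratings:
        m = len(unit)
        if m < 2:
            continue  # a unit rated only once carries no agreement information
        for c, k in permutations(unit, 2):
            coincidences[(c, k)] += 1.0 / (m - 1)

    # Marginal totals per category and grand total.
    n_c = Counter()
    for (c, _), w in coincidences.items():
        n_c[c] += w
    n = sum(n_c.values())

    # Observed and expected disagreement (nominal metric: 0 if equal, 1 otherwise).
    d_o = sum(w for (c, k), w in coincidences.items() if c != k) / n
    d_e = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n * (n - 1))
    if d_e == 0:
        return 1.0  # only one category ever used: no expected disagreement
    return 1.0 - d_o / d_e
```

Perfect agreement across coders yields alpha = 1; agreement at chance level yields alpha ≈ 0. In practice an established implementation would likely be used, but the sketch shows what the statistic conditions on.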
Psychometric analyses will be conducted under the assumptions of both Classical Test Theory and Item Response Theory. Descriptive and inferential statistics will be calculated and comparative analyses conducted in order to evaluate the statistical significance of the results (e.g. score distributions, descriptive test and item statistics, reliability and generalizability will be compared and their differences evaluated). The structure of the populations from the years 2011–2015 will be compared, and the frequencies of different groups and subgroups will be compared and evaluated. In case of any significant difference between populations, the research design would have to be adjusted by sampling from the original populations in order to obtain either randomised samples or samples equivalent in terms of population characteristics such as age, geographical location, type of school, etc. The results and findings should provide insight into the variety of ways in which comparability can be achieved in different testing contexts. Investigating problems with test version comparability might provide a useful platform for discussing their potential sources and solutions under the specific constraints of a particular context. The importance of a well-defined construct, represented by relevant and representative items and tasks, will be emphasised as a crucial step for the subsequent valid interpretation of results and for the validity of decisions and their social consequences. The results of the research might have beneficial implications for fairness in high-stakes language testing in Slovakia and potentially also for the field of language testing in general.
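As an illustration of the Classical Test Theory side of such comparisons, two of the statistics mentioned above — item difficulty (the proportion correct per item) and internal-consistency reliability (Cronbach's alpha) — could be sketched as follows for a dichotomously scored test version. The function names and the 0/1 response-matrix layout (one row per test-taker, one column per item) are assumptions for illustration, not the project's actual pipeline:

```python
def item_difficulties(responses):
    """CTT item difficulty: proportion of correct (1) responses per item.

    responses: list of rows, one per test-taker; each row is a list of
    0/1 item scores. Returns one p-value per item column.
    """
    n = len(responses)
    k = len(responses[0])
    return [sum(row[i] for row in responses) / n for i in range(k)]

def cronbach_alpha(responses):
    """Cronbach's alpha from the item-score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).
    """
    n = len(responses)
    k = len(responses[0])

    def sample_var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [sample_var([row[i] for row in responses]) for i in range(k)]
    total_scores = [sum(row) for row in responses]
    return k / (k - 1) * (1 - sum(item_vars) / sample_var(total_scores))
```

Computing these per test version and per year would give directly comparable difficulty profiles and reliability estimates; the actual analyses would of course add inferential tests of the observed differences.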
