Table of Contents

International Journal of Language Testing
Volume 11, Issue 2, Oct 2021

  • Publication date: 1400/07/25 (Solar Hijri)
  • Number of articles: 10
  • Hossein Khodabakhshzadeh, Roya Shoahosseini * Pages 1-12
    The present study aimed to investigate the relationship between attitudes towards cheating, academic self-confidence, and general language ability among Iranian EFL learners. One hundred and thirty-nine university EFL students participated in this study. Findings showed that attitudes towards cheating correlate negatively with academic self-confidence and with ability as measured by students’ GPA. The relationships between attitudes towards cheating and age, gender, and level of education were also examined. Analyses showed a negative correlation between age and attitudes towards cheating, whereas neither gender nor level of education was related to attitudes towards cheating. Furthermore, the psychometric qualities of the Attitudes towards Cheating Questionnaire were examined: some items had low or negative item discrimination indices, and when these malfunctioning items were deleted, the reliability of the scale improved. Implications of the study for the correlates of cheating among Iranian EFL learners and for the validity of the Attitudes towards Cheating Questionnaire are discussed.
    Keywords: Academic Self-confidence, Attitudes towards Cheating, General Language Ability
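    A minimal sketch of the kind of item analysis the abstract above describes: corrected item-total correlations as discrimination indices and Cronbach's alpha before and after dropping weak items. The simulated responses, the 20-item scale length, and the 0.20 cut-off are illustrative assumptions, not the authors' actual data or criteria.

    ```python
    import numpy as np
    import pandas as pd

    def cronbach_alpha(items: pd.DataFrame) -> float:
        """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    def item_discrimination(items: pd.DataFrame) -> pd.Series:
        """Corrected item-total correlation: each item vs. the sum of the remaining items."""
        total = items.sum(axis=1)
        return pd.Series({col: items[col].corr(total - items[col]) for col in items.columns})

    # Hypothetical Likert-type responses (139 respondents x 20 items); replace with real data.
    rng = np.random.default_rng(0)
    data = pd.DataFrame(rng.integers(1, 6, size=(139, 20)),
                        columns=[f"item{i+1}" for i in range(20)])

    disc = item_discrimination(data)
    weak = disc[disc < 0.20].index          # low or negative discrimination indices
    print("alpha (all items):   ", round(cronbach_alpha(data), 3))
    print("alpha (weak dropped):", round(cronbach_alpha(data.drop(columns=weak)), 3))
    ```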
  • Abbas Zarei *, Hanieh Rahmaty Pages 13-33
    Given the importance of assessment in language education, this study investigated the effects of interventionist and interactionist Dynamic Assessment (DA) on Iranian EFL learners’ perfectionism, willingness to communicate (WTC), and foreign language anxiety (FLA). The participants were 166 pre-intermediate female and male learners at two public schools and one private language institute in Karaj. They were divided into three groups and completed three separate questionnaires measuring their perfectionism, WTC, and FLA before the treatment sessions. During 14 sessions, groups A and B received instruction using interactionist DA and interventionist DA, respectively, while group C was instructed conventionally as the control group. At the end of the treatment sessions, the participants completed the same three questionnaires as the post-test. The collected data were analysed using one-way ANCOVA. The results showed no significant differences among the two DA approaches and the control group in their effect on learners’ perfectionism and willingness to communicate. For FLA, there was no significant difference between the interventionist and interactionist models of DA, but both were significantly more effective than the control condition. The findings suggest that dynamic assessment can reduce learners’ anxiety in language learning and contribute to an effective teaching and learning process. These findings have implications for teachers, students, and materials developers.
    Keywords: Foreign language anxiety, Dynamic assessment, perfectionism, willingness to communicate
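    A compact sketch of a one-way ANCOVA of the kind reported above, with the posttest score as the outcome, the pretest score as the covariate, and group as the fixed factor. The column names and the simulated data frame are assumptions for illustration only.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    # Hypothetical long-format data: one row per learner, with pretest and posttest
    # anxiety scores and group membership (interactionist DA, interventionist DA, control).
    rng = np.random.default_rng(1)
    n = 166
    df = pd.DataFrame({
        "group": rng.choice(["interactionist", "interventionist", "control"], size=n),
        "pre_fla": rng.normal(100, 15, size=n),
    })
    df["post_fla"] = df["pre_fla"] * 0.6 + rng.normal(40, 10, size=n)

    # ANCOVA: posttest FLA as outcome, pretest FLA as covariate, group as fixed factor.
    model = smf.ols("post_fla ~ pre_fla + C(group)", data=df).fit()
    print(anova_lm(model, typ=2))   # Type II sums of squares; the C(group) row is the treatment effect
    ```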
  • Hamed Ghaemi * Pages 34-50
    The Online Interaction Learning Model is founded on constructivist learning theory. It is an input-process-output model in which the inputs (the moderating variables) comprise the characteristics of the courses, the instructors, the EFL learners, and the technology. Because studies examining the pivotal role of the Online Interaction Learning Model are scarce, this study was conducted to validate a newly designed questionnaire via exploratory and confirmatory factor analyses. Two hundred and fifty-nine EFL learners from Iranian universities and higher education institutes participated. The newly developed questionnaire consisted of 35 items measuring the five constructs of the Online Interaction Learning Model (course materials, instructor performance, learning practices, student-to-student interaction, and access to technology). The results of the EFA, CFA, and reliability analyses revealed that the new questionnaire is a valid and reliable instrument for measuring the Online Interaction Learning Model. Moreover, there was a significant positive correlation between each component of the model and EFL learners’ GPA, as well as between the total model score and student academic achievement. Male and female EFL learners’ scores on the online interaction learning model differed considerably.
    Keywords: Confirmatory Factor Analysis, constructivist learning theory, exploratory factor analysis, online Interaction Learning Model, Reliability
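    An exploratory-factor-analysis sketch along the lines described above, using the factor_analyzer package. The simulated item matrix, the oblimin rotation, and the five-factor target mirror the abstract's description but are assumptions, not the authors' exact settings; the confirmatory step would typically be run separately with SEM software.

    ```python
    import numpy as np
    import pandas as pd
    from factor_analyzer import FactorAnalyzer
    from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

    # Hypothetical responses of 259 learners to the 35 questionnaire items.
    rng = np.random.default_rng(2)
    items = pd.DataFrame(rng.integers(1, 6, size=(259, 35)),
                         columns=[f"q{i+1}" for i in range(35)])

    # Sampling adequacy and sphericity checks usually reported before an EFA.
    chi2, p = calculate_bartlett_sphericity(items)
    kmo_per_item, kmo_total = calculate_kmo(items)
    print(f"Bartlett chi2={chi2:.1f}, p={p:.4f}, KMO={kmo_total:.2f}")

    # Five factors to mirror the five hypothesised constructs; oblique rotation
    # because the constructs are expected to correlate.
    fa = FactorAnalyzer(n_factors=5, rotation="oblimin")
    fa.fit(items)
    loadings = pd.DataFrame(fa.loadings_, index=items.columns)
    print(loadings.round(2))
    ```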
  • Mohammad Hasan Razmi *, Masoud Khabir, Shouket Ahmad Tilwani Pages 51-63

    Since the test’s inception in 1949, over 1,500 studies have investigated the validity of the GRE General Test in predicting performance criteria in higher education (Klieger, Bridgeman, Tannenbaum, & Cline, 2016). The present review paper examines the predictive validity of the GRE General Test. Factors affecting predictive validity (e.g., range restriction, compensatory selection, criterion unreliability, substantive and artifactual moderators, bias in testing, coaching effects, socioeconomic status (SES), gender, and a host of other intervening factors such as motivation and communication skills) are discussed. A brief overview of the GRE revised General Test format is also presented. Following a review of the related literature, a critical commentary on the predictive validity of the GRE General Test is offered, with an emphasis on the roles of criterion unreliability and SES effects.

    Keywords: Predictive validity, Graduate Record Examination (GRE), criterion unreliability, range restriction, compensatory selection
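    Two textbook corrections that recur in the GRE predictive-validity debate summarized above are Thorndike's Case II correction for direct range restriction and the classical correction for criterion unreliability (disattenuation). The sketch below applies both; the numerical inputs are purely illustrative, not values from any cited study.

    ```python
    import math

    def correct_range_restriction(r_restricted: float, sd_unrestricted: float,
                                  sd_restricted: float) -> float:
        """Thorndike Case II correction for direct range restriction on the predictor."""
        u = sd_unrestricted / sd_restricted
        return (r_restricted * u) / math.sqrt(1 + r_restricted**2 * (u**2 - 1))

    def correct_criterion_unreliability(r_observed: float, criterion_reliability: float) -> float:
        """Disattenuate an observed validity coefficient for unreliability in the criterion."""
        return r_observed / math.sqrt(criterion_reliability)

    # Illustrative values: a .25 observed GRE-GPA correlation in an admitted (restricted)
    # sample, applicant-pool SD twice the admitted-sample SD, and GPA reliability of .70.
    r_obs = 0.25
    r_rr = correct_range_restriction(r_obs, sd_unrestricted=2.0, sd_restricted=1.0)
    r_full = correct_criterion_unreliability(r_rr, criterion_reliability=0.70)
    print(f"observed={r_obs:.2f}  range-restriction corrected={r_rr:.2f}  "
          f"also disattenuated={r_full:.2f}")
    ```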
  • Mehdi Doosti *, Mohammad Ahmadi Safa Pages 64-90
    This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the examiners’ consideration of the examinees’ expectations have any effect on test-takers’ perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian intermediate EFL learners’ oral performance on the speaking module of the IELTS in two stages (i.e., a pre- and a post-training stage). Furthermore, following Kunnan’s (2004) Test Fairness Framework, a questionnaire on fairness in oral language assessment was developed and, after pilot testing and validation, administered to the examinees at both stages. The examinees’ expectations were taken into account in the second round of the speaking test. The results indicated that rater training is likely to promote inter-rater reliability and, in turn, to enhance the fairness of the decisions made on the basis of the test scores. It was also concluded that considering students’ expectations of a fair test would improve their overall perceptions of being fairly evaluated. The results of this study provide second language teachers, oral test developers, and oral examiners and raters with useful insights into addressing fairness-related issues in oral assessment.
    Keywords: Inter-rater reliability, oral language assessment, Rater training, Test fairness
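    A sketch of the kind of inter-rater reliability check described above, using pingouin's intraclass correlation on long-format ratings collected before and after training. The simulated bands, the column names, and the choice of ICC(2,k) are illustrative assumptions.

    ```python
    import numpy as np
    import pandas as pd
    import pingouin as pg

    # Hypothetical long-format ratings: 31 examinees x 4 raters, one IELTS-style band each,
    # collected once before and once after rater training.
    rng = np.random.default_rng(3)
    rows = []
    for stage in ("pre_training", "post_training"):
        noise = 1.0 if stage == "pre_training" else 0.4   # training should shrink rater noise
        true_level = rng.normal(6.0, 1.0, size=31)
        for rater in ("R1", "R2", "R3", "R4"):
            for examinee, level in enumerate(true_level):
                rows.append({"stage": stage, "examinee": examinee, "rater": rater,
                             "band": round(level + rng.normal(0, noise), 1)})
    ratings = pd.DataFrame(rows)

    for stage, chunk in ratings.groupby("stage"):
        icc = pg.intraclass_corr(data=chunk, targets="examinee",
                                 raters="rater", ratings="band")
        icc2k = icc.loc[icc["Type"] == "ICC2k", "ICC"].iloc[0]
        print(f"{stage}: ICC(2,k) = {icc2k:.2f}")
    ```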
  • Fatemeh Firoozi * Pages 91-108
    Large-scale standardized ESL tests such as the International English Language Testing System (IELTS) are widely used around the world to measure the language proficiency of test takers and to make various decisions based on their scores. Reading comprehension is an integral part of such tests and requires test takers to read passages and answer a set of questions. Although IELTS is a popular standardized test used for making critical decisions about test takers, very few attempts have been made to explore the validity of the exam, especially the reading part of the General Training Module. With this in mind, the purpose of the present study was to use a non-parametric item response theory model, Mokken Scale Analysis (MSA), to examine the validity of the reading part of the General Training module of IELTS. To this end, the item responses of 352 test takers to the reading comprehension test were analyzed. The results of item scalability, total scalability, and item-pair scalability showed that the reading part forms a weak unidimensional scale. Under the Monotone Homogeneity Model (MHM), the monotonicity results also indicated that some items violate the monotonicity assumption, although the violations are not significant. The analysis of unidimensionality using the AISP revealed two scales and four unscalable items in the reading part. Therefore, Mokken scale analysis did not support the unidimensional structure of the reading part of the General Training module of IELTS.
    Keywords: IELTS, General Training Module, Monotone Homogeneity Model, Reading Comprehension Section, Validity
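    A bare-bones sketch of the Loevinger H coefficients that drive a Mokken scale analysis for dichotomously scored items: observed inter-item covariance divided by its maximum given the item marginals, aggregated per item and for the whole scale. The simulated response matrix is a stand-in; a full MSA (monotonicity checks, AISP) would normally be run with dedicated software such as the R mokken package.

    ```python
    import numpy as np

    def mokken_h(X: np.ndarray):
        """Loevinger's H for dichotomous items.

        X is an (n_persons, n_items) 0/1 matrix. Returns (item_H, total_H), each being
        the ratio of observed covariance to the maximum covariance attainable given
        the item difficulties (proportions correct).
        """
        p = X.mean(axis=0)                       # item popularity
        cov = np.cov(X, rowvar=False, ddof=0)    # observed inter-item covariances
        k = X.shape[1]
        cov_max = np.empty_like(cov)
        for i in range(k):
            for j in range(k):
                lo, hi = min(p[i], p[j]), max(p[i], p[j])
                cov_max[i, j] = lo * (1 - hi)    # max covariance given the marginals
        off = ~np.eye(k, dtype=bool)
        item_h = cov[off].reshape(k, k - 1).sum(axis=1) / cov_max[off].reshape(k, k - 1).sum(axis=1)
        total_h = cov[off].sum() / cov_max[off].sum()
        return item_h, total_h

    # Hypothetical 0/1 scores of 352 test takers on 40 reading items; replace with real data.
    rng = np.random.default_rng(4)
    ability = rng.normal(size=(352, 1))
    difficulty = rng.normal(size=(1, 40))
    X = (rng.random((352, 40)) < 1 / (1 + np.exp(-(ability - difficulty)))).astype(int)

    item_h, total_h = mokken_h(X)
    print("total H:", round(total_h, 2))          # H between .30 and .40 is usually read as a weak scale
    print("weak items:", np.where(item_h < 0.30)[0])
    ```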
  • Mohammad Kabir Rasoli * Pages 109-121

    Addressing the limitations and criticisms of cloze tests, Klein-Braley and Raatz suggested a modified version of the cloze test called the C-test, where the ‘C’ stands for cloze, and introduced it as a better representation of the reduced-redundancy principle. In C-tests, several shorter texts involving a larger number of items are completed within a shorter amount of time, whereas cloze tests usually consist of one or two longer texts with fewer items. This study reports the results of a research program carried out to validate the C-test among Afghan EFL learners. One hundred advanced English majors were administered two different language tests, namely a language proficiency test (composed of listening, reading, and writing sections) and a C-test composed of four different texts, to measure the participants’ overall language ability. Various analyses were conducted to measure the validity and reliability of the C-test. The results of the study confirmed that the C-test is a reliable and valid test which can be used as a general English language proficiency test among Afghan students of English as a foreign language.

    Keywords: C-test, Reduced Redundancy Principle, Validation, Reliability, Afghan Learners
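    A small sketch of the concurrent-validity step implied above: correlating C-test totals with scores on an established proficiency test. The score vectors are placeholders; in practice these would be the 100 participants' actual totals on the two measures.

    ```python
    import numpy as np
    from scipy.stats import pearsonr

    # Placeholder score vectors for the 100 participants; swap in the real totals.
    rng = np.random.default_rng(5)
    proficiency = rng.normal(70, 10, size=100)                 # listening + reading + writing total
    c_test = 0.8 * proficiency + rng.normal(0, 6, size=100)    # four C-test texts combined

    r, p = pearsonr(c_test, proficiency)
    print(f"C-test vs. proficiency: r = {r:.2f}, p = {p:.4f}")  # a high r supports concurrent validity
    ```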
  • Hamid Khosravany Fard, Mohammad Davoudi * Pages 122-141
    The present study scrutinizes the inference complexity level (henceforth ICL) of the test items of the Iranian State University TEFL Ph.D. entrance exam (ISUTPEE) from 2010 to 2017 through the lens of Kintsch’s construction-integration (C-I) theory (Kintsch, 1988, 1998). Though there is ample research on inferencing in the field of reading comprehension, the existing literature reveals a serious gap regarding the inferencing complexity of test items in high-stakes exams, which exert profound effects on individuals’ academic achievement. Inferencing is examined in this study to explore the ICL of the test items of the Special Knowledge Test (SKT) according to the three levels of memory representation in Kintsch’s model: the surface model, the textbase, and the situation model. To this end, the test items from eight consecutive years of the ISUTPEE were examined in relation to these three kinds of mental representation. To ensure the reliability of the researchers’ coding, two other specialist coders assessed the ICL of 33% of the items; the intraclass correlation among the three sets of codes was 0.91. The results showed that a large number of questions, accounting for more than 80% of the items, merely activate the surface and textbase levels of information representation in memory. Furthermore, the ICL for each of the four parts of the SKT was examined. This analytical study carries a stark warning about the lack of systematic attention to ICL in the development of test items.
    Keywords: Construction-integration model, Inferencing, TEFL Ph.D. Entrance Exam, Mental representations, Situation model
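    A quick sketch of a coder-consistency check in the spirit of the one described above, using pairwise Spearman correlations among three coders' ICL codes (1 = surface, 2 = textbase, 3 = situation model) as a simpler stand-in for the intraclass correlation the study reports, plus a tally of items per level. The codes below are invented placeholders.

    ```python
    import numpy as np
    import pandas as pd
    from itertools import combinations
    from scipy.stats import spearmanr

    # Placeholder ICL codes from three coders for the double-coded subset of items
    # (1 = surface model, 2 = textbase, 3 = situation model).
    rng = np.random.default_rng(6)
    base = rng.choice([1, 2, 3], size=60, p=[0.45, 0.40, 0.15])
    codes = pd.DataFrame({
        "coder1": base,
        "coder2": np.where(rng.random(60) < 0.9, base, rng.choice([1, 2, 3], size=60)),
        "coder3": np.where(rng.random(60) < 0.9, base, rng.choice([1, 2, 3], size=60)),
    })

    # Pairwise rank correlations as a rough consistency check among the coders.
    for a, b in combinations(codes.columns, 2):
        rho, _ = spearmanr(codes[a], codes[b])
        print(f"{a} vs {b}: Spearman rho = {rho:.2f}")

    # Share of items at each level (the study reports >80% at surface + textbase).
    print((codes["coder1"].value_counts(normalize=True).sort_index() * 100).round(1))
    ```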
  • Xiaoli Yu * Pages 142-167
    This study examined the development of text complexity over the past 25 years of reading comprehension passages in the National Matriculation English Test (NMET) in China. The text complexity of 206 reading passages was measured longitudinally at the lexical, syntactic, and discourse levels and compared across the years. The natural language processing tools used in the study included TAALES, TAALED, TAASSC, and TAACO. To compare differences across the years at the various levels of text complexity, ANOVA and MANOVA tests were conducted. The results suggested that lexical text complexity showed the most evident changes: the lexical sophistication, density, and diversity of the most recent reading passages have increased remarkably compared with the early years. Syntactic text complexity showed a moderate rise toward the more recent passages. At the discourse level, cohesion fluctuated only slightly across the years, and the general trend was not necessarily upward. Combined, the results indicated that the text complexity of the NMET reading comprehension passages had been steadily increasing over the past 25 years through the inclusion of more low-frequency and academic vocabulary, more diverse vocabulary, and more complicated sentence and grammatical structures. The results were further examined against the general curriculum standards and guidelines to analyze whether the changes were reflected in the policies; the exams turned out to require a much larger vocabulary size than the number indicated in the guidelines. Suggestions for test designers and for pedagogical practice are provided accordingly.
    Keywords: corpus linguistics, High Stakes Exam, Natural Language Processing, reading comprehension, Text Complexity
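    The study above relies on dedicated tools (TAALES, TAALED, TAASSC, TAACO) for its indices. As a hedged illustration of the general workflow, the sketch below computes crude proxies only (type-token ratio, mean word length, mean sentence length) and compares them across year groups with a one-way ANOVA; the passages, years, and measures are placeholders, not the study's corpus or indices.

    ```python
    import re
    import numpy as np
    import pandas as pd
    from scipy.stats import f_oneway

    def crude_complexity(text: str) -> dict:
        """Very rough lexical/syntactic proxies (not the TAALES/TAALED/TAASSC indices)."""
        words = re.findall(r"[A-Za-z']+", text.lower())
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        return {
            "ttr": len(set(words)) / len(words),                # lexical diversity proxy
            "mean_word_len": np.mean([len(w) for w in words]),  # lexical sophistication proxy
            "mean_sent_len": len(words) / len(sentences),       # syntactic complexity proxy
        }

    # Hypothetical mini-corpus: passages tagged with the exam year (real passages would be loaded from files).
    passages = pd.DataFrame({
        "year": [1996, 1996, 2008, 2008, 2020, 2020],
        "text": [
            "Tom has a red bike. He rides it to school every day.",
            "The cat sat on the mat. It was warm and sleepy.",
            "Students increasingly rely on digital resources for coursework and revision.",
            "Urban planners must balance growth with environmental protection.",
            "Contemporary assessments of linguistic proficiency necessitate multidimensional frameworks.",
            "Globalisation has reconfigured the socioeconomic landscape of metropolitan regions.",
        ],
    })
    metrics = passages.join(passages["text"].apply(crude_complexity).apply(pd.Series))

    # One-way ANOVA on each proxy across the year groups.
    for m in ("ttr", "mean_word_len", "mean_sent_len"):
        groups = [g[m].values for _, g in metrics.groupby("year")]
        print(m, f_oneway(*groups))
    ```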
  • Seyyed Ali Ostovar-Namaghi *, Abutaleb Iranmehr, Mostafa Morady Moghaddam Pages 168-179
    Some universities in Iran have recently witnessed a shift in admission criteria from performance on the university admission test towards high school records. This sudden change seems unwarranted, since the predictive power of high school records has not been explored. To fill this gap, this study examines the predictive validity of high school records for undergraduate students of English language and literature. To this end, a random sample of undergraduate students studying at Shahrood University of Technology was selected as the participants; the predictor variables were operationally defined as the participants’ grade point averages (GPAs) in three school subjects (English, Persian, and Arabic) together with their overall high school GPA, and the outcome variable was operationalized as the participants’ overall GPA for the first academic year. The results of Pearson correlation revealed a significant but very low correlation between the variables of interest. Moreover, the results of multiple regression analysis revealed that none of the predictor variables is a good predictor of academic success in English language and literature. Although the results of this study are case-specific, they have clear implications for policy makers and interested researchers nationwide.
    Keywords: high school GPA, university admission criteria, predictive power, Academic success
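    A sketch of the correlation-plus-regression design reported above: school-subject GPAs as predictors and first-year university GPA as the outcome, fitted with statsmodels OLS. The column names and the simulated records are illustrative assumptions, not the study's data.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical records for a sample of English language and literature undergraduates.
    rng = np.random.default_rng(7)
    n = 120
    df = pd.DataFrame({
        "english_gpa": rng.uniform(10, 20, n),   # assuming the Iranian 0-20 grading scale
        "persian_gpa": rng.uniform(10, 20, n),
        "arabic_gpa": rng.uniform(10, 20, n),
        "school_gpa": rng.uniform(10, 20, n),
    })
    df["uni_gpa_year1"] = 12 + 0.1 * df["english_gpa"] + rng.normal(0, 2, n)

    # Zero-order (Pearson) correlations between each predictor and first-year GPA.
    print(df.corr()["uni_gpa_year1"].round(2))

    # Multiple regression: how well do the high school records jointly predict first-year GPA?
    model = smf.ols("uni_gpa_year1 ~ english_gpa + persian_gpa + arabic_gpa + school_gpa",
                    data=df).fit()
    print(model.summary().tables[1])   # coefficients, standard errors, p-values
    print("R-squared:", round(model.rsquared, 3))
    ```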