Reliability, practicality, and washback of the multiple-choice format in language testing

    Xuemei Wu

    【Abstract】 This paper discusses the reliability, practicality, and washback of the multiple-choice (MC) format and considers how effective and applicable it is in language testing. It evaluates the reliability and practicality of the MC format and discusses the washback the format has on teaching and learning. The paper concludes that although the reliability and validity of the MC format remain questionable, the format is still quite applicable in certain circumstances.

    【Key Words】 MC format; language testing; reliability; practicality; washback

    【CLC Number】 G64.33 【Document Code】 A 【Article ID】 2095-3089(2017)16-00-01

    1. Introduction

    The multiple-choice (MC) format has been widely used as an instrument for language testing. In spite of the controversy surrounding it in recent years, it still plays a very important role in many language tests, both regional and global. It is therefore crucial to study the effectiveness of the MC format in language testing. To evaluate whether a language test instrument is effective, we have to consider many factors, chiefly reliability, practicality, and washback.

    2. Evaluation

    2.1 Reliability

    "The reliability of a test concerns with its precision as a measuring instrument. Reliability asks whether an assessment instrument administered to the same respondents a second time would yield the same results" (Cohen, 1994:36).

    The MC format is said to have high rater reliability, one of the important components of test reliability. Provided that no careless scoring occurs, the score on an MC test will always be the same. That is to say, the same scorer will give the same score on different occasions, and a different scorer will give the same score to a given candidate. Objective scoring makes the MC format reliable. As Cohen (1994) pointed out, the more objective the scoring is, the higher the rater reliability of a test. In addition, an MC test can include many different items. Test takers only have to make a mark on their answer sheet, so unlike essay writing, which can test only one specific topic, the MC format can cover a wide variety of language applications. Hughes (2003) concluded that the more items a test includes, the more reliable the test will be.
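
    The link between test length and reliability can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that is not named in the sources cited here. The sketch below assumes that any added items are comparable in quality to the existing ones; the starting reliability of 0.60 and the item counts are hypothetical values chosen only for illustration.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after lengthening a test by `length_factor`,
    assuming the added items are comparable to the existing ones."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a 25-item test with reliability 0.60,
# extended to 50 items (factor 2) and 100 items (factor 4).
for items, factor in [(25, 1), (50, 2), (100, 4)]:
    print(items, round(spearman_brown(0.60, factor), 2))
```

    Under these assumptions, doubling the test raises the predicted reliability from 0.60 to about 0.75, and quadrupling it raises it to about 0.86, which is consistent with the claim that more items make a test more reliable.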

    However, guessing is a crucial factor that reduces the reliability of the MC format. It may have an unknown effect on test scores. A student who knows nothing about the test content may score as high as 30 to 40 out of 100, purely by luck. Zhong, a teacher of mine, once conducted a small experiment with her three-year-old daughter, who was asked to circle answers on a multiple-choice answer sheet at random. The teacher then compared her daughter's score with those of her students and found that her daughter scored even higher than some of them. This example may be extreme, but it makes me doubt the reliability of the MC format where guessing is concerned. What is more, MC responses are very simple: the candidate only needs to write down or circle a, b, c, or d. This makes it easy for test takers to communicate with one another. Cheating is very likely to occur in this case, which will certainly affect the reliability of the test.
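
    The size of the guessing effect can be estimated with a simple binomial model. The sketch below assumes 100 independent items worth one point each, four options per item (a, b, c, d), and a test taker who picks every answer at random. The function names are illustrative and do not come from any cited source.

```python
from math import comb, sqrt


def expected_guess_score(n_items: int, n_options: int = 4) -> float:
    """Mean number of correct answers from pure random guessing."""
    return n_items / n_options


def guess_score_sd(n_items: int, n_options: int = 4) -> float:
    """Standard deviation of the number of correct random guesses."""
    p = 1 / n_options
    return sqrt(n_items * p * (1 - p))


def prob_at_least(n_items: int, threshold: int, n_options: int = 4) -> float:
    """Probability of getting `threshold` or more items right by guessing."""
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(threshold, n_items + 1)
    )


print(expected_guess_score(100))   # mean score from guessing alone
print(guess_score_sd(100))         # spread around that mean
print(prob_at_least(100, 30))      # chance of scoring 30 or more
print(prob_at_least(100, 40))      # chance of scoring 40 or more
```

    Under these assumptions the expected score from guessing alone is 25 out of 100, with a standard deviation of about 4.3. A score near 30 is therefore well within reach of a pure guesser, while a score of 40 is far less likely.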

    2.2 Practicality

    According to Bachman and Palmer (1996:36), practicality is "the relationship between resources that will be required in the design, development, and use of the test and the resources that will be available for these activities."

    The MC format is often said to be practical for language testing for the following reasons. Firstly, it can be administered to large groups of people, unlike tests that can only be used with individuals, such as face-to-face interviews, and it is much less time-consuming to administer (Genesee & Upshur, 1996). Secondly, no extra procedures or facilities are needed to implement MC testing. Some tests require special training for examiners, as is the case with the IELTS speaking interview, as well as special facilities and equipment such as recorders. Inevitably, the cost of such tests increases. With MC tests, however, such problems do not exist. Lastly, the scoring of MC tests is indisputably quick and economical. With the help of a scoring machine, the answer sheets of hundreds of test takers can be scored in a few minutes. Further analysis of test takers' responses is also possible with computer software (Oosterhof, 1996).

    On the other hand, the effort and time involved in constructing MC items pose problems for the format's practicality. Successful MC items are very difficult to write. Considerable time is needed to write a good item with plausible and effective alternatives. Professional test writers usually write more MC items than are actually needed, and use them in a real test only after trialing and statistical analysis of performance on the items (Hughes, 2003). This procedure inevitably increases the time and effort required for test preparation.
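
    The statistical analysis of trialed items mentioned above can be sketched with two common item statistics: item facility (the proportion of test takers who answer an item correctly) and an upper-lower discrimination index. This is a minimal illustration, not necessarily the procedure Hughes (2003) describes. The choice of comparing the top and bottom thirds and the trial data are hypothetical assumptions.

```python
def item_facility(item_scores: list[int]) -> float:
    """Proportion of test takers who got the item right (1 = right, 0 = wrong)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores: list[int],
                         total_scores: list[float],
                         groups: int = 3) -> float:
    """Facility in the top-scoring group minus facility in the bottom group.

    Test takers are ranked by total test score; by default the top and
    bottom thirds are compared (an assumption made for this sketch).
    """
    n = len(total_scores)
    group_size = max(1, n // groups)
    ranked = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = ranked[:group_size], ranked[-group_size:]
    upper_facility = sum(item_scores[i] for i in upper) / group_size
    lower_facility = sum(item_scores[i] for i in lower) / group_size
    return upper_facility - lower_facility


# Hypothetical trial data: six test takers, one item.
item = [1, 1, 1, 0, 0, 0]      # right/wrong on this item
totals = [9, 8, 7, 3, 2, 1]    # total test scores
print(item_facility(item))
print(discrimination_index(item, totals))
```

    An item answered correctly mainly by high scorers has a discrimination index near 1, while an index near 0 or below suggests that the item does not separate stronger from weaker candidates and may need revision or replacement.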

    2.3 Washback

    According to Brown and Hudson (1998:667), washback is "the effect of testing and assessment on the language teaching curriculum that is related to it."

    The washback of the MC format can be harmful because it may be the instrument for which "decoding" test-taking strategies are easiest to find. Instead of studying hard, students are likely to focus on test-taking strategies, regarding them as a shortcut. There is a danger that some teachers may also change their teaching goals to cater to the needs of these students, devoting their attention to studying and teaching strategies solely for the sake of coping with various tests. This is quite common among teachers in private language schools in China. They know that students have a strong desire to pass exams such as TOEFL and IELTS. Thus, they have worked out a set of strategies to "decode" the MC format, and special effort has been made to improve educated guessing. Some of these teachers can even make 30 correct choices in a TOEFL listening test without listening to the materials. I doubt that students will try hard to improve their English when they know they can easily get high marks in this way.

    3. Conclusion

    The MC format may still hold an important place in language testing because of its versatility, practicality, and objectivity. However, it is not unproblematic. As the above discussion suggests, some aspects of its reliability and practicality are still questionable. Its value is further reduced by the negative washback it has on learning and teaching. To evaluate a candidate's language ability fully, we need to use a combination of test instruments. Each test instrument has value in itself: if employed appropriately, every instrument can play a positive role in testing a candidate's real language ability.

    References:

    Bachman, L. & Palmer, A., 1996, Language Testing in Practice, Oxford University Press, Oxford.

    Brown, J. & Hudson, T., 1998, The Alternatives in Language Assessment, TESOL Quarterly, Vol. 32, No. 4, pp. 653-675.

    Cohen, A.D., 1994, Assessing Language Ability in the Classroom (2nd edition), Heinle & Heinle Publishers.

    Genesee, F. & Upshur, J.A., 1996, Classroom-based Evaluation in Second Language Education, Cambridge University Press.

    Hughes, A., 2003, Testing for Language Teachers (2nd edition), Cambridge University Press.

    Oosterhof, A., 1996, Developing and Using Classroom Assessments, Prentice-Hall, Inc.

    Weir, C.J., 1990, Communicative Language Testing, Prentice Hall, New York.
