Self-Assessment in Language Testing: Reliability and Validity Issues
Self-assessment is not an alien concept to human behavior.
All human beings are involved, either consciously or subconsciously, in
an on-going process of self-evaluation. Until
recently, however, the value of this human process was largely ignored in
pedagogy. Learners were rarely
asked to assess their performance, much less have a say in the construction of
evaluation instruments. Pedagogically,
the term self-assessment was rendered all but oxymoronic.
In the last decade, with the increased attention to learner-centered curricula,
needs analysis, and learner autonomy, the topic of self-assessment has become of
particular interest in testing and evaluation (Blanche 1988; Oscarsson 1998).
It is now being recognized that learners do have the ability to provide
meaningful input into the assessment of their performance, and that this
assessment can be valid. In fact,
with regard to second and foreign language, research reveals an emerging pattern
of consistent, overall high correlations between self-assessment results and
ratings based on a variety of external criteria (Blanche 1988; Oscarsson 1984,
1997, 1998; Coombe 1992). In spite
of these results, however, issues concerning the validity and reliability of
language self-assessment need to be addressed.
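To make the notion of a "correlation between self-assessment results and external ratings" concrete, the following is a minimal sketch of how such a figure might be computed. The function name `pearson_r`, the 1-5 self-rating scale, and all scores are assumptions invented purely for illustration; they are not data from any of the studies cited here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for eight learners (invented, not from any cited study):
# self-ratings on an assumed 1-5 scale and scores on an assumed external test.
self_ratings = [2, 3, 3, 4, 5, 1, 4, 2]
external_scores = [48, 55, 60, 71, 88, 40, 66, 52]

r = pearson_r(self_ratings, external_scores)
print(f"r = {r:.2f}")
```

A coefficient near 1 would indicate that learners who rate themselves higher also tend to score higher on the external measure. On its own it says nothing about whether learners understood the construct they were rating, which is the validity concern taken up next.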
Although formal or standardized tests have already established construct, predictive, and
concurrent validity and reliability indices, the question of the validity and
reliability of learners' estimates still remains moot.
Because of the complex nature of the language learning process,
constructs of what is being measured need to be clarified.
To be able to validly assess their behavior, learners need to know, in
non-linguistic, simplified and practical terms, exactly what it is that they are
trying to assess. Many language
constructs, such as proficiency and communicative competence, are elusive and
must be clearly and concisely operationalized and communicated to ensure the
validation of assessment among learners. The
criterion by which learners are to assess themselves may be opaque and thus pose
an additional threat to validity. Language
learners in EFL contexts may find self-assessment particularly difficult if no
comparisons to a native speaker are available to them.
They may be able to judge their own fluency and understanding fairly
accurately, but may find it more difficult to assess the accuracy of their speech.
An additional consideration of validity is whether different language skills are
comparable for assessment. They
probably are not, and learners must be made aware of this.
Linguistic analyses may require a different focus than communication
does. Receptive skills may demand
different attention than productive skills.
The degree to which language learners are able to carry out valid
self-assessments will depend on the nature of the skills being assessed and the
relative accuracy with which learners can define and use, in concrete,
behavioral terms, the skills they are to assess.
The reliability of learners’ judgement is subject to variables whose influence on
the learner is difficult to establish. Extraneous
factors, such as parental expectations, career aspirations, amount of exposure
to foreign languages, age, past academic record and lack of training in
self-assessment, affect the accuracy of self-estimates and must, in some way, be
accounted for. Furthermore, because
reliability, like validity, depends on systematic analysis, the question
arises as to whether short-term self-assessments lend themselves to consistency.
They most likely do not. Learners
need to be asked to assess their performance on a regular basis. Their performance must be carefully and closely linked with
the particular skills that they are working on.
Learner ability to accurately self-assess language performance is not
automatic. Therefore, constant
feedback within a formative, as well as summative, framework is a crucial factor
for obtaining reliable self-assessment results.
As previously stated, there is strong evidence that self-assessments yield
consistent and homogeneous results; indeed, research indicates that learner
self-assessment is working in situations that were traditionally reserved for
standardized tests (i.e. placement) (LeBlanc & Painchaud 1985).
Nevertheless, self-assessment is not a panacea for all testing problems,
and the field is fraught with problematic issues, a few of which have been
addressed in this article. Further
research is needed, not only to investigate the many validity and reliability
issues involved, but also to help establish the place of self-assessment in the
complete measurement and evaluation process.
Blanche, P. (1988). The FLIFLC Study. Monterey: The US Department of Defense Language Institute.
Coombe, C. (1992). The Relationship Between Self-assessment Estimates of Functional Literacy Skills and Basic English Skills Test Results in Adult Refugee ESL Learners. Ph.D. Diss. The Ohio State University.
LeBlanc, R. & Painchaud, O. (1985). Self-Assessment as a Second Language Instrument. TESOL Quarterly, 19(4), 673-687.
Oscarsson, M. (1984). Self-Assessment of Foreign Language Skills: A Survey of Research and Development Work. Strasbourg: Council of Europe.
Oscarsson, M. (1997). “Self-Assessment of Foreign and Second Language Proficiency”. In The Encyclopedia of Language and Education, Vol. 7. Kluwer Academic Publishers, pp. 175-187.
Oscarsson, M. (1998). “Learner Self-Assessment of Language Skills”. IATEFL TEA SIG Newsletter, Nov. 1998.
©Christine Coombe 2002. All rights reserved.