Quality Control in Evaluation and Assessment J Charles Alderson, Department of Linguistics and Modern English Language, Lancaster University

Quality Control in Evaluation and Assessment

J Charles Alderson,

Department of Linguistics and Modern English Language,

Lancaster University

“Assessment is central to language learning, in order

• to establish
    – where learners are at present,
    – what level they have achieved,

• to give learners feedback on their learning,

• to diagnose their needs for further development, and

• to enable the planning of curricula, materials and activities.”

Outline

• Current practice

• Assessment for certification
    – Tradition one: teacher-centred, school-based
    – Tradition two: central, quality controlled

• Basic parameters

• What is needed to ensure parameters are met

Current practice

• Quality of important examinations not monitored

• No obligation to show that exams are relevant, fair, unbiased, reliable, and measure relevant skills

• University degree in a foreign language qualifies one to examine language competence, despite lack of training in language testing

• In many circumstances merely being a native speaker qualifies one to assess language competence.

• Teachers assess students’ ability without having been trained.

First tradition

• Teacher-centred

• School/university-based assessment

• Teacher develops the questions

• Teacher's opinion the only one that counts

• Teacher-examiners have no explicit marking criteria

• Assumption that by virtue of being a teacher, and having taught the student being examined, teacher-examiner makes reliable and valid judgements

• Authority, professionalism, reliability and validity of teacher rarely questioned

• Rare for students to fail

Second tradition

• Tests externally developed and administered

• National or regional agencies responsible for development, following accepted standards

• Tests centrally constructed, piloted and revised

• Difficulty levels empirically determined

• Externally trained assessors

• Empirical equating to known standards or levels of proficiency

Basic parameters

• Validity

• Reliability

• Practicality

• Authenticity

• Washback

• Impact

• Currency

“Validity in general refers to the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure. A test is said to be valid to the extent that it measures what it is supposed to measure. It follows that the term valid when used to describe a test should usually be accompanied by the preposition for. Any test may then be valid for some purposes, but not for others.” (Henning, 1987)

Validity

• Rational, empirical, construct

• Internal and external validity

• Face, content, construct

• Concurrent, predictive

• Construct

How can validity be established?

• My parents think the test looks good.

• The test measures what I have been taught.

• My teachers tell me that the test is communicative and authentic.

• If I take the Rigo utca test instead of the FCE, I will get the same result.

• I got a good English test result, and I had no difficulty studying in English at university.

How can validity be established?

• Does the test look valid to the general public?

• Does the test match the curriculum, or its specifications?

• Is the test based adequately on a relevant and acceptable theory?

How can validity be established?

• Does the test yield results similar to those from a test known to be valid for the same audience and purpose?

• Does the test predict a learner’s future achievements?

Note: a test that is not reliable cannot, by definition, be valid
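
One conventional way to make the note above precise comes from classical test theory. The slides give no formula, so the notation here is an assumption introduced for illustration: r_{XY} is the correlation between test X and a criterion Y (a validity coefficient), and r_{XX'} and r_{YY'} are the reliabilities of the test and the criterion.

```latex
r_{XY} \;\le\; \sqrt{r_{XX'}\, r_{YY'}} \;\le\; \sqrt{r_{XX'}}
```

Under this reading, a test with a reliability of 0.64 could correlate at most 0.80 with any criterion, however carefully its content was chosen.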

How can validity be established?

• A test’s items should work well: they should be of suitable difficulty, and good students should get them right, whilst weak students are expected to get them wrong.

• All tests should be piloted, and the results analysed to see if the test performed as predicted
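
The two checks above (suitable difficulty, and good students succeeding where weak students do not) are often summarised per item as a facility value and a discrimination index. The following is a minimal sketch, assuming dichotomous (0/1) item scores from a pilot and an upper/lower comparison on the top and bottom thirds of candidates; the function name `item_analysis` and the `group_fraction` parameter are hypothetical and do not come from the slides.

```python
from statistics import mean


def item_analysis(responses, group_fraction=1 / 3):
    """Return a (facility, discrimination) pair for each item.

    responses: one list per candidate, holding 0/1 scores per item
               (assumed format, for illustration only).
    facility: proportion of all candidates who got the item right.
    discrimination: proportion correct in the top-scoring group minus
                    proportion correct in the bottom-scoring group.
    """
    n_candidates = len(responses)
    n_items = len(responses[0])

    # Rank candidates by total score, strongest first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(n_candidates * group_fraction))
    top, bottom = ranked[:k], ranked[-k:]

    results = []
    for i in range(n_items):
        facility = mean(row[i] for row in responses)
        discrimination = mean(row[i] for row in top) - mean(row[i] for row in bottom)
        results.append((facility, discrimination))
    return results
```

An item with a facility near 0 or 1 tells little about candidates, and a negative discrimination value flags an item that weak candidates answer correctly more often than strong ones, which is the kind of result piloting is meant to catch.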

Factors affecting validity

• Unclear or non-existent theory

• Lack of specifications

• Lack of training of item/ test writers

• Lack of / unclear criteria for marking

• Lack of piloting/ pre-testing

• Lack of detailed analysis of items/ tasks

• Lack of standard setting to CEF

• Lack of feedback to candidates and teachers

Reliability

• If I take the test again tomorrow, will I get the same result?

• If I take a different version of the test, will I get the same result?

• If the test had had different items, would I have got the same result?

• Do all markers agree on the mark I got?

• If a marker marks my test again tomorrow, will I get the same result?

Reliability

• Over time: test – re-test

• Over different forms: parallel

• Over different samples: homogeneity

• Over different markers: inter-rater

• Within one rater over time: intra-rater
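
The slides do not name particular statistics for these five kinds of reliability. As a hedged sketch, two common choices are shown here: a Pearson correlation between two sets of scores for the same candidates (usable for test/re-test, parallel forms, inter-rater and intra-rater comparisons) and Cronbach's alpha as an index of homogeneity across items. The function names `pearson` and `cronbach_alpha` and the input layouts are assumptions made for illustration.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two score lists for the same candidates,
    e.g. first and second sitting, form A and form B, or marker 1 and marker 2."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a candidates-by-items table of scores.

    item_scores: one list per candidate, one score per item (assumed layout).
    """
    k = len(item_scores[0])
    # Variance of each item across candidates.
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of candidates' total scores.
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Values close to 1 suggest consistent results; what counts as acceptable depends on the stakes of the test, and a correlation alone does not show whether two markers differ systematically in severity.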

Factors affecting reliability

• Poor administration conditions – noise, lighting, cheating

• Lack of information beforehand

• Lack of specifications

• Lack of marker training

• Lack of standardisation

• Lack of monitoring

Practicality

• Number of tests to be produced

• Length of test in time

• Cost of test

• Cost of training

• Cost of monitoring

• Difficulty in piloting/ pre-testing

• Time to report results

Factors affecting practicality

• Awareness of complexity and cost

• Time to do the job: ‘quick and dirty’ remains dirty

• Funding to support development, monitoring and further development

• Recognition of need for training – of testers and of teachers

Authenticity

• Genuineness of text

• Naturalness of task

• Naturalness of learners’ response

• Suitability of test for purpose

• Match of test to learners’ needs (if known)

• Face validity

• Expectations of stakeholders and culture

Factors affecting ‘authenticity’

• A test is a test is a test

• Availability of resources

• Training of test developers/ item writers

• Relative importance of reliability over validity

• Purpose of test: proficiency versus progress or diagnosis

Washback

• Test can have positive or negative effects

• Test can affect content of teaching

• Test can affect method of teaching

• Test can affect attitudes and motivation

• Test can affect all teachers and students in same way, or individuals differently

• Importance of test will affect washback

Factors affecting washback

• Extent to which teachers know nature of test

• Extent to which teachers understand rationale of test

• Extent to which teachers consider how best to prepare learners for test

• Nature of teachers’ beliefs about teaching

• Effort teachers are willing to make

• Difficulty of test

Impact

• Effect of test on society

• Effect of test on stakeholders: employers, higher education, parents, politicians

• Intended and unintended

• Beneficial or detrimental

Factors affecting impact

• Extent to which purpose of test is understood and accepted

• Currency of test

• Face validity of test

• Stakes of test

• Availability of information

• Education of stakeholders re complexity of testing

Currency of test

• Extent to which test is valued by stakeholders

• Different stakeholders may have different perspectives: university vs employer; parents vs teachers; teachers vs principals? politicians vs professionals?

Factors affecting currency

• Consequences of passing or failing – stakes

• Extent to which stakeholders take results seriously into consideration

• Beliefs about value of tests in general

• Extent to which test matches expectations about tests in general or language tests in particular

• Difficulty of test

• Institution offering the test

General Issues

• Teacher-based assessment vs central quality control

• Internal vs external assessment

• Quality control of exams (and the associated cost)

• Piloting and pre-testing

• Test analysis and the role of the expert

• The existence of test specifications

• Guidance and training for test developers and markers

General Issues (continued)

• Feedback to candidates

• Pass / fail rates

• The currency of the old and the new traditions

• The relationship with other languages and countries

• The standards of the local exams in terms of "Europe"

Constraints on testing

• Time – much less than for teaching

• Sample – inevitably limited

• Resources always limited – money, infrastructure, trained personnel

• Assessment culture / tradition

• Lack of awareness of problems and solutions

BUT WASHBACK

• Testing is too important to be left to the teacher

• Testing is too important to be left to the tester

• Both are needed, to reflect and influence teaching, validly and reliably.

“Assessment is central to language learning, in order

• to establish
    – where learners are at present,
    – what level they have achieved,

• to give learners feedback on their learning,

• to diagnose their needs for further development, and

• to enable the planning of curricula, materials and activities.”