How could training of language examiners be related to the Common European Framework?
A case study based on the experience of the Hungarian Examinations Reform Teacher Support Project of the British Council
Ildikó Csépes, University of Debrecen, Hungary
Inaugural Conference of EALTA, Kranjska Gora, Slovenia, May 14th-16th 2004
Assessing speaking skills: subjective assessment
The Common European Framework of Reference for Languages (2001, p. 188): Subjectivity in assessment can be reduced, and validity and reliability thus increased, by taking steps such as
• adopting standard procedures governing how the assessments should be carried out (Guideline 1)
• basing judgements in direct tests on specific defined criteria (Guideline 2)
• using pooled judgements to rate performances (Guideline 3)
• undertaking appropriate training in relation to assessment guidelines (Guideline 4)
The training of language examiners has become an important issue in English language education in Europe. There is an increased interest in QUALITY CONTROL.

In this presentation, some important aspects of quality control will be highlighted in relation to training oral examiners:
• the use of the Interlocutor Frame to conduct the speaking exam
• the role of benchmarking in assessor training

A set of suggested training procedures for oral examiner training will also be presented.
The model speaking examination and the interlocutor/assessor training model have been developed and piloted by the Hungarian Examinations Reform Teacher Support Project of the British Council.
The original aim of the Project was to develop a new English school-leaving examination. Now there is only a model exam and related training courses available.
The training model can be easily adapted to other contexts.
According to the CEF (Guideline 1), standard procedures should be adopted to carry out the assessments.

The Interlocutor Frame developed by the Project is a way of standardising the elicitation of oral performances (it helps to conduct the exam in a standard manner, following a standard procedure). It
• describes in detail how the exam should be conducted
• gives standardised wording for beginning the examination, giving instructions, providing transition from one part of the examination to the other, intervening, and rounding off the examination.
Overview of the Model Speaking Examination

Part     Task                                    A2/B1 level    B2 level
Total timing                                     10-15 min.     20 min.
Part 1   Interview                               2-3 minutes    4-6 minutes
Part 2   Individual long turn                    4-6 minutes    7-8 minutes
Part 3   Simulated discussion task (role-play)   4-6 minutes    7-8 minutes
Part 1: focuses on candidates’ general interactional skills and ability to use English for social purposes
Part 2: candidates demonstrate their ability to produce transactional long turns by comparing and contrasting visual prompts and to answer scripted supplementary questions asked by the interlocutor
In Parts 1 and 2, the interlocutor's contributions (questions and instructions) are carefully guided and described in as much detail as possible in the Interlocutor Frame.
Part 3: candidates produce both transactional and interactional short turns
• The interlocutor and the candidate interact with each other in order to reach a decision about a problem that is posed by the interlocutor.
• The candidate has a small number of prompts to work with while the interlocutor has specific guidelines for contributing to the exchange.
In Part 3, the interlocutor's contributions are also carefully guided, but the interlocutor has more freedom to express himself or herself when participating in the simulated discussion task.
According to the CEF (Guideline 2), assessors' judgements should be based on specific defined criteria.

Performances are rated by the assessor according to set criteria, which consist of
• communicative impact
• grammar and coherence
• vocabulary
• sound, stress and intonation
The Analytic Rating Scale

It consists of 8 bands:
• 5 of these bands (0, 1, 3, 5, 7) are defined by band descriptors
• 3 of them (2, 4, 6) are empty bands, which are provided for evaluating performances which are better than the level below but worse than the level above.
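The band structure described above can be sketched as a small data model. This is only an illustration of the defined/empty-band idea; the actual descriptor wording lives in the Project's Speaking Assessment Scales and is not reproduced here.

```python
# Sketch of the Project's 8-band analytic rating scale structure.
# Bands 0, 1, 3, 5 and 7 carry band descriptors; bands 2, 4 and 6 are
# deliberately left empty, for performances that fall between the
# defined level below and the defined level above.

DEFINED_BANDS = {0, 1, 3, 5, 7}
EMPTY_BANDS = {2, 4, 6}

def band_kind(band: int) -> str:
    """Classify a band as 'defined' or 'empty'."""
    if band in DEFINED_BANDS:
        return "defined"
    if band in EMPTY_BANDS:
        return "empty (between the defined levels below and above)"
    raise ValueError(f"band must be 0-7, got {band}")

for b in range(8):
    print(b, band_kind(b))
```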
According to the CEF (Guideline 3), pooled judgements should be used to rate performances.

Pooled judgements are represented as benchmarks for sample performances.
Benchmarked performances can enhance and ensure the reliability of subjective marking.
Benchmarked performances can illustrate band descriptors (different levels of achievement).
Without benchmarked performances, assessors may interpret candidates' performances in their own terms.
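Pooling judgements into a benchmark can be sketched as follows. Note the hedge: the Project's judges actually reached final benchmarks by discussion and voting; the median rule and the marks below are only assumptions for illustration.

```python
from statistics import median

# Hypothetical marks (bands 0-7) from five judges for one candidate,
# one list per assessment criterion.
marks = {
    "communicative impact":         [5, 5, 4, 5, 6],
    "grammar and coherence":        [3, 4, 3, 3, 3],
    "vocabulary":                   [4, 4, 5, 4, 4],
    "sound, stress and intonation": [5, 6, 5, 5, 5],
}

# Pool each criterion's judgements; the median is one simple,
# outlier-robust aggregation rule (an assumption, not the Project's).
benchmark = {criterion: median(ms) for criterion, ms in marks.items()}
print(benchmark)
```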
In the Hungarian context, the benchmarking procedures consisted of four main phases:
1. selecting sample performances and judges
2. home marking by judges
3. live benchmarking
4. editing and standardising justifications
The benchmarking procedures were designed by Charles Alderson (the advisor of the Project).

Phase 1: Selecting Sample Performances and Judges
The Assessor-Interlocutor Training Team members selected a wide range of oral performance samples (12) that had been videoed during pilot examinations. These were subsequently used for the benchmarking exercise.
15 experts were invited, who were thought to have particular expertise in and experience of the assessment of oral performances in English, both at secondary and at tertiary level, and who were expected to have some familiarity with the CEF.
Phase 2: Home Marking
Judges were asked to
• study the documents of the Benchmarking Pack carefully
• view the videoed performances on tape and mark them according to the appropriate rating scale, using the mark sheets provided
• view the videos again once all performances had been marked, and make any necessary adjustments to the marks
• note down any features of each performance that justified the mark for each criterion, always referring to the band descriptors in the scale
• make a list of examples of candidate language, which would contribute to the final list to be compiled after the benchmarking exercise and be used for training assessors in the future.
Mark sheets and notes were sent in an electronic format to Györgyi Együd, the coordinator of the benchmarking exercise. She collated all the marks and notes for each performance sample and assigned an ID number to each judge. For each candidate, a table of results by criterion and judge and a table of justifications were produced.
Phase 3: The Live Benchmarking
STEP 1: Judges viewed and marked each video again without the notes they had made previously. However, they were encouraged to take notes and underline relevant aspects of the scales that led them to their decisions.
STEP 2: Judges were asked to reveal their marks after each video sample.
STEP 3: Judges looked at the table of marks given in the preparation phase together with the collated justifications; in the meantime, the marks were being recorded for the purposes of calculating first and second marks (intra-rater reliability).
STEP 4: The candidate's performance was then discussed with reference to the justifications and the current rating session.
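Intra-rater reliability, mentioned in Step 3, compares each judge's first (home-marking) marks with their second (live) marks for the same performances. A minimal sketch using Pearson correlation on invented marks (the Project's actual statistic and data are not given in this presentation, so both are assumptions here):

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of marks."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical first and second marks by one judge across the
# 12 sample performances; a high correlation indicates that the
# judge marked consistently on the two occasions.
first_marks  = [5, 3, 7, 1, 5, 3, 5, 7, 3, 1, 5, 3]
second_marks = [5, 3, 7, 3, 5, 3, 5, 5, 3, 1, 5, 3]
print(round(pearson(first_marks, second_marks), 2))
```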
STEP 5: Judges voted for the final benchmarks.
STEP 6: The individual mark sheets were handed in for central recording after the performance sample had been benchmarked.
STEP 7: Judges discussed major and minor errors in relation to the benchmarked performance.
The main purpose of the benchmarking workshop: to reach agreement on grades using the Project's scales. Relating the performances to the Common European Framework could only be a supplementary exercise.
For this purpose the 9-point scale (Overall Spoken Interaction) on page 74 of the Framework was used. After each video sample, judges had to indicate which of the 9 levels best described the candidate.
Phase 4: Editing and Standardising Justifications
Reasons for editing and standardising the justifications:
• the justifications had to be worded in harmony with the wording of the Speaking Assessment Scales as much as possible in order to make the assessor training more effective; participants seemed to be more ready to accept the benchmarks when they saw that the justifications used the same terms (printed in bold) as the band descriptors in the scales
• the examples of minor and major mistakes, included in the justifications for support and illustration, had to be selected from the list of examples of candidate language that had been agreed on by all the judges
• the justifications or notes produced by the individual expert judges in the home marking phase were rather varied with respect to both content and format, and so they had to be collated and standardised in terms of layout in order to produce the final justifications for each candidate.
The Use of Benchmarked Performances in the Training of Assessors
The benchmarks and justifications produced by the judges in the benchmarking sessions are used to support the pre-course tasks and the face-to-face assessor training course.
Benchmarked performance samples illustrate candidate performance at different levels of the scales.
When the wording of the assessment scales contains expressions such as 'major and minor mistakes' or 'wide and limited range of vocabulary', only benchmarked performance samples on video, together with standardised written justifications, can help future assessors to come to an agreement about what level of performance the band descriptors actually refer to.
In the face-to-face training phase, the benchmarks and justifications are revealed to course participants in different ways at different stages of the training.
A Colour-coded Overview of the Techniques
(Shown as a diagram in the original slides; the stages and steps were:)
Stage 1: Step 1 individual assessor's decision → Step 2 justifications → Step 3 benchmarks
Stage 2: Step 1 individual assessor's decision → Step 2 group decision → Step 3 justifications → Step 4 benchmarks
Stage 3: Step 1 individual assessor's decision → Step 2 group decision (revealed) → Step 3 justifications → Step 4 benchmarks
Stage 4: Step 1 individual assessor's decision (revealed) + taking notes → Step 2 groups write justifications → Step 3 justifications → Step 4 benchmarks
According to the CEF (Guideline 4), future oral examiners should undertake appropriate training.

The training procedures developed by the Project have the following aims:
• to provide participants with sufficient information about the model speaking examination they are going to be trained for (outline, task types, mode)
• to familiarise participants with standard interlocutor behaviour
• to familiarise participants with the main principles and procedures of assessing speaking performances

Further aims:
• to introduce the idea and practice of using analytic rating scales for assessing oral performances
• to enable participants to develop the necessary interlocuting and assessing skills
• to ensure valid and reliable assessment of live performances through standardisation
• to equip trainees with transferable skills (there is a special need for this in Hungary)
The Outline of the Training Model

Stage 1: pre-course distance learning
• self-study of an Introductory Training Pack with a pre-course video
• accomplishing the pre-course tasks (analysing and marking sample video performances)

The Introductory Training Pack contains
• an overview of the speaking examination
• guidelines for interlocutor behaviour
• guidelines for assessor behaviour
• pre-course tasks
• self-assessment questions
• appendices (e.g. Benchmarks & Justifications for the Sample Speaking Tests, Examples of Candidate Language, CEF Scales, Glossary)
Stage 2A: live interlocutor training course (a series of workshop sessions, Day 1)
• discussing the experiences of the distance phase
• analysing video samples of both standard and non-standard interlocutor behaviour
• standardisation of the administration procedure through simulated examination situations (role plays)

Stage 2B: live assessor training course (a series of workshop sessions, Day 2)
• discussing the experiences of the distance phase
• introduction to assessing oral performances: modes and techniques of assessment
• familiarisation with the analytic rating scale
• standardisation of the assessment procedure
• comparing performances at different levels
Stage 3: a distance phase
Practical application of the acquired skills in mock speaking tests:
• Participants do the mock exams in co-operation with another course participant, thus taking the role of both the interlocutor and the assessor.
• They can observe each other and share their experiences.
• They have to report on their experience in detail.
Sample Materials from the Interlocutor Training Model

Sample 1: Analysing non-standard interlocutor behaviour
After seeing and discussing standard interlocutor behaviour, participants are asked to compare it with non-standard performances.
• They have to identify instances where the interlocutor's behaviour deviates from the Interlocutor Frame and the suggested guidelines.

Sample 2: Simulating difficult examination situations
Participants role play difficult examination situations in groups of three:
• an observer
• the candidate
• the interlocutor
For each part of the model speaking exam, there are three role-play tasks, so all participants will experience all three roles by the end of the training.
Role-play Cards for Part 1 (The Interview)

Candidate: You are a shy, not very talkative candidate who tends to wait for guiding questions. You often reply with one or two short sentences only.
Interlocutor: You are the interlocutor who asks the questions of the first part of the speaking test. You have to elicit as much speech from the candidate as possible. Please remember to ask the questions listed in the Interlocutor Frame.
Conclusions
• It is impossible to become a competent interlocutor and assessor without formal training.
• Training should involve both distance and face-to-face elements to ensure that future interlocutors and assessors go through each and every phase of the difficult and complex standardisation process.
• One training course is not enough. Only further practice and the monitoring of interlocutor and assessor behaviour can ensure that candidates' speaking ability is assessed in a standard manner and that the assessments are valid and reliable.
INTO EUROPE
Series Editor: J. Charles Alderson
The Speaking Handbook, Ildikó Csépes & Györgyi Együd
The Handbook is accompanied by a 5-hour DVD.
Published by Teleki László Foundation & The British Council
Distributor: Libro Trade; info: books@librotrade.hu
Email: icsepes@delfin.unideb.hu