This article was downloaded by: [University of Hong Kong Libraries] on 12 November 2014, at 07:19. Publisher: Routledge. Informa Ltd, registered in England and Wales, Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK.
International Journal for Academic Development. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rija20
Evaluating teaching effectiveness and teaching improvement: A language for institutional policies and academic development practices. Lynn McAlpine & Ralph Harris, Center for University Teaching and Learning, McGill University, Canada. Published online: 10 Dec 2010.
To cite this article: Lynn McAlpine & Ralph Harris (2002) Evaluating teaching effectiveness and teaching improvement: A language for institutional policies and academic development practices, International Journal for Academic Development, 7:1, 7-17, DOI: 10.1080/13601440210156439
To link to this article: http://dx.doi.org/10.1080/13601440210156439
Evaluating teaching effectiveness and teaching improvement: A language for institutional policies and academic development practices

Lynn McAlpine and Ralph Harris, Center for University Teaching and Learning, McGill University, Canada.

ABSTRACT
Demands for institutional accountability in higher education have been increasing and have led to greater attention to the evaluation of teaching, the assumption being that improved teaching will result in enhanced learning. In our work as academic developers, we are increasingly helping academic managers make explicit teaching policies and practices that seem fair and equitable. To help us in this work, we have developed a framework for evaluating the practice of teaching. What is unique about this framework is the language it provides to differentiate aspects of teaching. For instance, it provides a basis for differentiating and linking criteria to standards, i.e. the level of achievement desired or expected. Standards are critical if the evaluation of teaching is to be seen as fair and equitable, yet they are often unexamined in other representations of the evaluation of teaching. Although the original intent of our efforts was to provide a framework for academic managers, we have come to find it useful in our own work as university teachers and as academic developers. Examples of all three uses are provided in the paper.

The International Journal for Academic Development, ISSN 1360-144X print/ISSN 1470-1324 online © 2002 Taylor & Francis Ltd, http://www.tandf.co.uk/journals, DOI: 10.1080/13601440210156439

Introduction – context

Over the past decade in North America, Europe and Australia, there has been increasing public concern with the value or impact of money being spent in post-secondary education, e.g., the Smith (1991) report in Canada called for action at the post-secondary level; the Staff and Educational Development Association in the United Kingdom established standards for practice at the national level; external calls for accountability and constant improvement are ongoing in Australia (Robertson, 1998).

This emphasis on accountability is centrally driven in some countries. Dow (2001) notes the current regulatory protocols for quality assurance common to all levels of government in Australia. Also, Randall (2001) reports on the impact of the Quality Assurance Agency for Higher Education across the UK. Yet, during the same period of time in the US and Canada, the call for accountability has not been experienced in a standards-based national model, but in a more decentralized manner. Evaluation of the quality of the practices of post-secondary institutions occurs through professional associations accrediting particular programmes (e.g., counselling psychology, engineering), and the charters of institutions being granted by individual states and provinces. Calls for action are more evident in North America through the media, for example, an annual survey of universities by a Canadian national news magazine, as well as budget cuts that force universities to consider both the nature of what they do and how outsiders perceive it. Concurrently, there are calls from academics in North America to promote the recognition and evaluation of teaching: for example, in the US, Boyer's (1990) proposal for the scholarship of teaching and Edgerton, Hutchings and Quinlan's (1991) for teaching portfolios; in
Canada, academic developers promoting a national agenda to make teaching count (Smith, 1997).
These social and political demands for action have put pressure on universities to emphasize the improvement of teaching, the assumption being that improved teaching will result in enhanced learning. Concurrently, we know that teachers and academic managers consistently report they want teaching to be recognized, valued and rewarded institutionally. Eight years ago in the US, 35,000 academics in 2- and 4-year colleges were canvassed regarding their view of teaching in the university; 98% reported that being a good teacher was one of their personal goals (Centra, 1993). However, only 10% believed that their universities recognized and rewarded teaching. Centra also reports another US study of 47 research and doctorate-granting institutions in which teachers and academic managers were surveyed. All groups polled supported a balance between research and undergraduate teaching. When teachers and academic managers were grouped separately, it was found that faculty members believed academic managers favoured research, although academic managers claimed they favoured teaching (clear evidence of a lack of discourse on teaching!). More recently, an international study of the perceptions of university teachers, academic managers and academic developers confirmed this finding (Wright, 1998). It revealed a strong belief among all three groups (regardless of country of origin – Canada, US, UK, Australia) that the greatest potential to improve the quality of teaching in one's university was through teaching being evaluated in personnel decisions.
If teaching is to be fairly and accurately evaluated in such decisions, then efforts are required in two areas to provide the impetus for change. One is academic development – providing resources and the opportunity for teachers to improve in order to be able to meet the expectations. The other area is development of policies and institutional structures that define explicit ways of documenting and assessing what will be evaluated.
Nevertheless, undertaking institutional change of this kind is extremely challenging, requiring institutional will as well as human and financial resources – the commitment of all in the institution: academic managers, teachers and staff (Biggs, 2001). We believe the most important ingredient in change of this kind is a clear and shared conceptualization of what is meant by effective teaching. Centra (1993, p. 42) says 'effective teaching produces beneficial and purposeful student learning'. Cashin (1989, p. 4) describes effective teaching as 'all of those teacher behaviours which help students learn'. Both definitions share with Ramsden (1992) a focus on learning as the critical purpose in teaching. Yet, these definitions do not make clear the nature of teaching itself.
Robertson (1998) noted that difficulties in defining the term 'teaching' may be due to the lack of valid and reliable measures of teaching effectiveness. The focus of our paper is a step towards addressing this lack. It provides a language and a framework for communicating the practice of teaching and the criteria and standards that can be used to evaluate the full range of this practice. As we present the framework, we provide instances of how it can be used for administrative purposes, for personal efforts at teaching improvement, and as a means to evaluate our own efforts at academic development.
A framework and a language for evaluating teaching

In our academic development work both in our home university and internationally (e.g., Chile, Indonesia), we find academic managers who have taken up the challenge to create policies that value teaching and will enable a fair evaluation of teaching. For instance, as regards tenure and promotion in North America, academic administrators are concerned with how to use student course evaluations effectively, or what should constitute an appropriate discipline-specific teaching portfolio. Six years ago we were asked a question by a group of administrators that led to a lot of reflection: 'On what basis should they make distinctions about teaching expectations in terms of academic rank?' We realized that we did not ourselves have a comprehensive, coherent way of answering that question. This led us to analyse our personal accumulated knowledge as well as to review some of the literature on evaluation of teaching for administrative purposes, in order to develop a language and a framework to structure our work. The ERIC search at that time, 1996, produced a limited number of documents. (See asterisks in reference list.) Through reflection, discussion and reading, we elaborated a language for those wanting to evaluate teaching in a comprehensive fashion.
Overview
We articulated five aspects to the framework: teaching categories, criteria for evaluation, artefacts that provide evidence, sources of evidence, and standards. (Table 1 summarizes these.) In this paper, we define the categories representing the practice of teaching. Then, within each of these categories, we provide examples of the criteria for evaluation that we found. Next, we enumerate examples of the artefacts: the types of objects and documents that could provide evidence for evaluating the criteria for the different categories. The fourth concept that emerged was sources: the individuals who are likely to be able to provide or evaluate evidence; the full range of these is described. The fifth concept, standards, represents statements about expectations. Standards are critical if the evaluation of teaching is to be seen as fair and equitable. In this paper, we propose a way to specify standards in relation to criteria. (Table 2 provides examples of this.)
What resulted from the review of the literature and our reflection on our experience is, we believe, useful for a number of purposes, since it enabled us to make explicit the many aspects of what we have long termed the invisible, hidden aspects of the teaching 'iceberg'. What most students experience (and others see) of our practice of teaching is 'only the tip of the iceberg'. There are many aspects of teaching hidden under the waterline that are essential for the visible part to be experienced as meaningful by students. Thus, the framework is valuable, first, for personal reflection by individuals (ourselves or other teachers) interested in improving their own teaching. Second, as academic developers, we can use the framework in our activities to ensure we include support for the full range of ways in which teaching is lived out in the university. Also, as academic developers we can use the framework to evaluate the extent to which our work has an impact beyond individual teachers and leads to benefits for the institutional climate that students experience. Lastly, the framework provides a relatively explicit mechanism for academic managers and leaders to evaluate teaching in situations such as tenure and promotion.
Categories of teaching tasks: teaching effectiveness and improvement

In reflecting on our experience and reviewing the documents we examined, we looked for categories descriptive of the major tasks that were used to define teaching. We began with four described by Cashin (1989) and expanded these to seven by the time we had finished our review of the literature.
The first category, subject matter expertise, refers to the individual's grasp of the field, sub-discipline and discipline that he/she has studied.

The second category, design skills, refers to the conceptualization, planning and organization of instruction at the course, programme or curriculum levels. We included all three levels of instructional development here since the skills drawn on are the same.

Delivery skills, the third category, refers to the implementation of instructional plans, including instructional strategies, evaluation techniques, and availability to students in courses as well as to those one is advising. Included are both in- and out-of-class activities: tasks students are expected to engage in regardless of when they occur.

The fourth category, management skills, refers to the organizational abilities necessary for instruction to move smoothly. These skills can include arranging for media, negotiating with the library, and meeting deadlines for grading.

The next category, mentoring/supervision, refers directly to the one-on-one relationship between academics and the students they are supervising in undergraduate honours papers, graduate monographs or theses, as well as credited practica. This is distinct from academic advising, providing advice about programmes or courses of study.

So far, the categories are best conceptualized as relating directly to the outcome of teaching, i.e., student learning. The sixth and seventh categories that emerged broaden the concept of the practice of teaching by incorporating responsibility for teaching improvement. The inclusion of improvement activities in the practice of teaching affirms that 'good teaching' is not just the result of one's efforts with students, but also includes one's efforts in learning how to teach.

The sixth category, personal and professional development, refers to the individual's ability to conceptualize and carry out activities which further personal growth in teaching. It acknowledges that the practice of teaching entails the intention to experiment, practice, get feedback, and reflect over time (e.g., Ho, 1998; McAlpine & Weston, 2002).

Departmental development refers to planned activities and policies that members of a department might implement to further the quality of teaching in their unit, for instance, by defining 'acceptable' teaching expectations and/or
Table 1 Framework for evaluating teaching (with sample criteria, artefacts and sources)

[Seven columns, one per teaching category: subject matter expertise, design skills, delivery skills, management skills, mentoring students, personal/professional development, and departmental development. Four rows give each category's definition, sample criteria, sample artefacts, and sample sources (self, students, peers, colleagues, academic managers, and others such as employers).]
supporting professional development activities. The emergence of this category is consistent, we believe, with Scriven's (1988) concern regarding the teacher as an ethical professional involved in a community of equals, as well as with Boyer's (1990) conception of teaching as scholarship and Shulman's (1993) notion of teaching as community property. Since faculty members sit on departmental committees that set departmental policies and practices, it is they who, through their conversations, can initiate departmental development efforts and help to constitute a culture or climate that enhances student learning (Laurillard, 2002).
Some of the categories may seem self-evident. Reference to the small sample of literature showed that generally design and delivery skills were seen as essential aspects of teaching, followed by personal and professional development and subject matter expertise. Categories referred to less frequently in this review were: management skills, mentoring/supervision, and departmental development. Management skills are largely invisible, yet are critical to seamless instruction.
As for mentoring, it may be given less attention because it is largely an undertaking associated with graduate work; yet in some cases it makes up the bulk of an academic's teaching practice. By treating it as a separate category, the distinct nature of this one-on-one relationship is highlighted and can potentially be evaluated in an appropriate fashion. For instance, after we developed this framework, we realized we had not addressed graduate supervision as an academic development activity, and responded by offering workshops. Concurrently, graduate students initiated the formation of a university committee to explore the evaluation of graduate supervision. We sat with them on the committee to provide expert advice, and after broad consultation a policy is now in place in the university. Thus, teachers now have a better understanding of supervision and its relation to their teaching, and there is a university policy that will ultimately enhance graduate student learning.
Similarly, by making explicit the category of departmental development, there is formal recognition that collegial will and commitment are crucial. Now, we are more attentive to this aspect of teaching in our academic development work and seek to help individual academics to work collectively to influence departmental practices, and to document and report their efforts for personnel decisions.
Overall, the review of the literature highlighted the importance of a broad view of the practice of teaching, one that goes beyond effectiveness with students to include efforts to improve our personal and collegial practices of teaching. Our belief is that one can only conduct a comprehensive evaluation of teaching if all seven categories are considered in the evaluation process.
Criteria for evaluation of teaching effectiveness and improvement

For each of the seven categories of teaching, we used the sample of literature to generate criteria for evaluation: aspects of the specific teaching category which could be evaluated. For instance, for the category of subject matter expertise, 'currency' and 'comprehensiveness' were cited, and for personal and professional development, 'conducting classroom research' and 'workshop attendance' were noted. Criteria for categories other than design skills, delivery skills and personal and professional development were not addressed consistently across the documents. However, it was possible to acquire examples of criteria for all categories. We are sure, though, that the lists could be expanded to make them more comprehensive. (See Table 1 for examples.)
Artefacts to evaluate teaching effectiveness and improvement

Artefacts refer to specific documents or objects, either primary sources, such as course outlines and videotapes of classes, or secondary sources, summarized or critiqued analyses of primary sources, e.g., a colleague's written review of a course outline.
Artefacts can be reviewed to evaluate teaching effectiveness and efforts at teaching improvement within each of the categories and in relation to the evaluation criteria. In other words, individual artefacts may cut across categories. For instance, a teaching portfolio could provide evidence for the following categories: subject matter expertise, design skills, personal and professional development. In addition, an artefact can provide evidence of several different criteria. For example, the teaching portfolio could be used to evaluate the criteria of 'comprehensiveness' and 'currency' as regards subject matter expertise. It could also be used to evaluate 'conducting classroom research' and 'workshop attendance' in the category of personal and professional development (see Table 1).
Many universities use a particular artefact, student evaluation questionnaires or course ratings, to evaluate teaching (Robertson, 1998). Our framework enables a critique of this practice. Although there is plentiful research evidence to support the thoughtful use of these questionnaires (d'Appollonia & Abrami, 1997; Marsh & Roche, 1997), an important consideration should be the extent to which any single artefact is comprehensive in its evaluation of teaching. Since questionnaires can only provide information about a limited range of teaching categories, it is inappropriate to use them as the sole or even principal artefact in personnel decisions, since this reduces the range of criteria that are being evaluated. Thus, an important consideration in making personnel decisions is that other artefacts representing the other categories of teaching be included. An additional incentive for using a variety of artefacts in evaluating teaching is that one can triangulate information across artefacts about specific criteria within each category of teaching. Our analysis provided a reminder of the range of artefacts on which one can draw. Nevertheless, we realise that decisions made regarding what to collect or require will be influenced by institutional culture, by feasibility and cost, etc. We return to this point later.
Sources of information about teaching improvement and effectiveness

Sources refer to the individuals who provide information to be used in evaluating instruction (for example, Weston, McAlpine & Bordonaro, 1995). The analysis we did was useful in expanding our conception of the range of sources, since often students appear to be the primary source used. In total, seven were named. 'Self' refers to the university teacher. 'Students' refers to those they teach, advise or supervise. 'Peers' indicates those with the same subject matter expertise as the university teacher. 'Colleagues' are academics in other fields. 'Administrators' were defined as the individuals most directly responsible for the evaluation of teaching, in the case of North America, chairs and deans.
In the literature, 'instructional administrators' referred to individuals such as librarians or programme directors who are responsible for aspects of instruction. We noted that support or allied staff were not mentioned in any of the documents, although these individuals often have a very clear perspective on the teaching abilities of academics. So we expanded the definition of 'instructional administrators' to include support staff. A last source we named 'other'. This refers to individuals who are not a part of the university but who may be able to provide useful information, e.g., alumni, community members sitting on committees.
Overall, we found a strong reliance on students as a source of evaluation data. This can be useful in an environment seeking to foster student-centred learning. However, it does overlook the fact that, for instance, alumni may have a very different view of their learning than they did when they were students. In addition, students may not have the kind of expertise that is required to evaluate, for instance, subject matter or personal/professional development. The framework focuses attention beyond students as sources and providers of artefacts, and recognizes academic responsibility to make teaching a more collegial and public enterprise in which a range of sources can provide insight into one's practice of teaching. (See Table 1 for examples.)
And what of standards?

The above framework provides a beginning language with which to discuss and analyse teaching effectiveness and teaching improvement. However, when concerned with evaluating the teaching of a range of individuals for personnel decisions, or trying to evaluate our own efforts at teaching or academic development, we are often faced with the following types of questions:

• What are the possible or desired levels of achievement?
• How do we distinguish acceptable from unacceptable teaching?
• What would be exceptional?

These questions relate to standards – expectations which can be used as goals and also to evaluate progress in either a criterion- or norm-referenced manner.
In terms of academic development, Weimer and Firing Lenze (1994) proposed five levels of impact one might consider in setting standards: teacher attitude, teacher knowledge, teacher skill, student attitude and student learning. This scheme has value in representing the impact of academic development in terms of both instructor and student learning. Instructor learning can be assessed in terms of changes in affect, cognition
and action, and student learning can be assessed in terms of changes in affect and cognition. However, this scheme is not designed to track the impact of the practice of teaching as represented in commitment to departmental development.

An alternative scheme of four levels, developed for non-academic settings, i.e. business and industry, was proposed by Kirkpatrick (1982, 1998). The progression is similar to that of Weimer and Firing Lenze (1994), with the added specification that the learning can be used in a range of settings. These levels can be used to assess either student or teacher learning. In addition, the analysis moves beyond Weimer and Firing Lenze and includes, as a last level, the extent to which there is institutional benefit or change as a result of individual efforts at change. We believe this last level is important since we all collectively share responsibility for departmental development, creating an environment in which students can learn effectively (Biggs, 2001). Thus, the Kirkpatrick schema is useful in evaluating the practice of teaching in terms of the extent to which:
• learners felt engaged or enjoyed the learning activity, whether formal or informal, e.g., for students or for teachers this could be through self-report [level 1];
• actual learning occurred as measured by appropriate means, e.g., for students in a course this might be exams or projects; for teachers, this might be self-report [level 2];
• there is ability to use the learning outside of the activity in which the learning occurred, e.g., for students, the ability to use the learning in the next course as observed in class by the new teacher; for teachers, the ability to use the learning effectively in a range of different courses [level 3];
• there is institutional benefit that accrues as a result of the activity, e.g., enhanced climate for student learning as represented in student reports, increase in registrations for a programme/course, higher completion rates, etc. [level 4].
It struck us that the levels of impact might be used as a structure to begin to differentiate 'acceptable' from 'very good' from 'exemplary' teaching (see Figure 1). For instance, one might describe 'acceptable' teaching as including impact at levels 1 and 2 across the seven categories; that is, there is evidence of positive affective outcomes and actual learning. 'Very good' teaching would incorporate an emphasis on levels 2 (actual learning) and 3 (transfer of learning to other settings). 'Exemplary' teaching would focus on levels 3 and 4 (transfer and institutional benefit). The nature of the category would make it clear whether the focus of the learning was students or teachers, e.g., evidence of student learning would be necessary for delivery skills and evidence of teacher learning for personal/professional development.
Possible standards for evaluation
In Table 2, for each teaching category, we have chosen one criterion and provided examples of the artefacts that could provide evidence of 'acceptable', 'very good' and 'exemplary' teaching, in order to show the potential of this approach. So in considering the category of delivery skills and the criterion of 'fairness of grading practices', for example, 'acceptable' expectations might be a self-report analysing principles and decisions. Characterization as 'very good' could be represented in a formative evaluation to get student responses to grading practices. 'Exemplary' could be seen in a written summary of these responses which was then critiqued by a colleague in order to find ways to improve the grading practices.
In considering the category of personal/professional development and the criterion of 'workshop attendance', 'acceptable' expectations might be represented in a report on a workshop. Characterization as 'very good' for the same criterion might be seen in a written plan for professional development that elaborated how questions or concerns about teaching that emerged in the workshop were being addressed through other academic development activities. 'Exemplary' teaching might be recognized through a report about discussions with colleagues or the use of the learning to shift departmental teaching practices.
EVALUATING TEACHING EFFECTIVENESS AND IMPROVEMENT 13
Figure 1 Teaching standards and equivalent ‘levels of impact’: acceptable – mostly levels 1 and 2; very good – emphasis on levels 2 and 3; exemplary – emphasis on levels 3 and 4.
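The correspondence between standards and levels of impact summarized in Figure 1 can be expressed as a simple lookup table. The following sketch is purely our own illustration and not part of the authors' framework; the names `STANDARD_LEVELS` and `meets_standard` are assumptions introduced here:

```python
# Illustrative sketch (not from the article): Figure 1's mapping of
# teaching standards to Kirkpatrick-style 'levels of impact'.
STANDARD_LEVELS = {
    "acceptable": {1, 2},   # positive affective outcomes and actual learning
    "very good": {2, 3},    # actual learning and transfer to other settings
    "exemplary": {3, 4},    # transfer and institutional benefit
}

def meets_standard(evidence_levels, standard):
    """Return True if the levels of impact evidenced by a dossier
    cover the emphasis of the given standard."""
    return STANDARD_LEVELS[standard] <= set(evidence_levels)

# A dossier evidencing impact at levels 1, 2 and 3 would satisfy
# 'very good' but not 'exemplary' under this reading of Figure 1.
```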
Table 2 Application of levels of impact to framework

Category | Sample criterion | Acceptable | Good/excellent | Exemplary
Subject matter | comprehensiveness | set of lecture notes that demonstrates attention to comprehensiveness – 1*, 2 | self-report on efforts to ensure comprehensiveness of subject matter – 1, 2 | peer review of lecture notes or concept map of course – 2, 3
Design skills | course organization | course outline with elements required by policy – 1 | elaborated course outline, clear evaluation criteria – 1 | elaborated course outline with planning, alignment of objectives and evaluation critiqued by peer or colleague – 2, 3
Delivery skills | fairness of tests/grading practices | self-report analysing principles and decisions – 1 | conducting formative evaluation to get student response to grading practices – 1, 2 | written summary of student responses re their perception of fairness, and critique by peer or colleague of self-report and other relevant documents – 2, 3
Management skills | preparation for each class | reports from students and peers that no disruptions of class time – 1 | reports from students and peers that uses aids beyond the blackboard which enhance learning – 1, 2 | reports from peers and others that different strategies used in an organized, integrated manner, e.g., field trips, multidisciplinary projects, team teaching – 2, 3, 4
Mentoring | evaluation of quality of students’ skills | good external reviews of dissertation – 2 | departmental/university awards – 3 | national association awards – 4
Personal/professional development | participation in improvement activities | evidence of attending workshops on teaching – 1, maybe 2 | written plan for professional development – 3 | presentation of results of professional innovation at local/national development conference – 4
Departmental | supports departmental efforts | sits on curriculum committees – 1, 2 | chairs committees or coordinates team teaching – 2, 3 | initiates new programmes or development/teaching projects – 3, perhaps 4

* Numbers refer to level of impact
You will note that in providing examples of standards, we have not attended to the weighting that might be important in order to differentiate a) a range of personal factors, e.g., teaching responsibilities, prior experience, or b) a change in professional expectations, e.g., responsibilities of assistant, associate or full professors in North America. In terms of personal factors, years of teaching experience could, for instance, be incorporated such that those new to teaching might be expected to focus on subject matter, design, delivery and management skills, and more experienced teachers on an increasingly broad range of categories.
As regards professional expectations, in North American terms, one might presume ‘acceptable’ teaching as the level of expertise for someone seeking promotion to the associate level, when one moves from tenure track to permanency. ‘Very good’ would be the standard expected of an individual seeking promotion to full professor, when an associate professor submits a dossier showing sustained excellence over an extended period. In fact, this framework has been used in evaluating submissions for university teaching prizes in which awards were given for each rank. It appeared to make the comparison of submissions, which is, in our experience, a very difficult task, somewhat simpler.
The same notion of standards can be used to set academic development goals and to evaluate the impact of our work. For instance, for the category of departmental development and the criterion of ‘supports departmental development’, ‘acceptable’ expectations might be that as an academic developer one actively seeks to learn the teaching development needs of the unit. The artefact might be a report or proposal to a teaching committee or the chair or head of the department. Academic development characterized as ‘very good’ might be facilitating the implementation of an appropriate response to these needs, including an evaluation of the impact. This could be represented in a follow-up report to the relevant committee, one that presents the results of the evaluation. ‘Exemplary’ academic development might be recognized in disseminating the results to senior administration and other units through presentations or reports as examples of ‘best practice’.
Implementing policies that recognize and reward teaching

We began this analysis to understand how we might conceptualize a system for evaluating the practice of teaching. What has emerged is an explicit framework that is comprehensive and comprehensible. Does the framework suggest areas where there is room for change in your institution’s practices? Could any part of the framework make it easier for academic managers to recognize and reward teaching? If so, then we repeat our earlier caveat: implementing a university policy that supports the recognition and reward of teaching is a political activity that can be fraught with resistance and difficulties. Such policies involve human and financial resource commitment to a) academic development that enables university teachers to meet and move beyond acceptable expectations, and b) administrative structures that can implement the accountability policies in ways that are perceived to be reliable and credible. For instance, control of information – who has access to it – is a crucial concern.
This balance between support and accountability is a critical one. Based on public school experience (Dube, 1995), an over-emphasis on accountability will lower morale and lead to a focus on meeting the minimum standards, whereas an over-emphasis on development will lead to a situation where only those already interested in teaching take advantage of the opportunities. Further, workload can increase. McInnis (1996) compared workload patterns for academics in Australia from 1977 to 1993 and found little change for teaching and research. However, there was an additional workload – the result of institutional competition and requirements for accountability and quality assurance. In North America, this may not be as true given the different approach being used for accountability. Nevertheless, it is evident that any scheme requires time and other resources to implement effectively, and such implementation can be not only difficult but very much influenced by institutional differences (Vidovich & Porter, 1999).
In closing: Improving one’s own teaching or academic development activities
We believe this framework is unique in providing a language for differentiating aspects of the practice of teaching as well as criteria, artefacts, sources and standards for evaluating teaching.
Although the original intent of the analysis was to provide a framework for academic managers, we find it useful for other purposes since it makes explicit the hidden aspects of the teaching ‘iceberg’. What students and others see and associate with teaching is similar to the small part of the iceberg that is visible above the water line. What students experience would be impossible without the 80% to 90% of the part that is hidden, underwater so to speak. Until now, the ‘underwater’ part has often been described only to the extent of saying that it is the planning and evaluation of teaching that occur here. As a result of this analysis, we can now name ‘delivery skills’ as the visible part of teaching and the other six categories as making up the invisible part of teaching. Thus, as academic developers in our work with university teachers, we can use the framework to provide them, particularly those who are new, with a concrete and explicit description of the full range of ways in which teaching is lived out in the university. As well, we can use it as a basis for considering our own work – the extent to which we provide support for all aspects of the practice of teaching, the extent to which we have an impact beyond individuals and are able to facilitate institutional change. This framework can also serve as a reflective tool for us (and for others) in our own teaching: to set personal goals and evaluate personal learning and accomplishments as we engage in undergraduate and graduate teaching.
It was serendipitous that what emerged for us from this analysis was a clearer understanding of our own work, both in undergraduate and graduate teaching and in academic development. It made us both realize what we had been attending to and what we had been overlooking or ignoring in setting personal teaching (and academic development) goals and evaluating our accomplishments. It has led us to ask questions of each other that others might find equally fruitful to respond to. For instance:
• Does the framework of categories, criteria, artefacts and sources broaden your perspective of what effective teaching (and academic development) is?
• Does it help you explain your teaching (or academic development) activities to yourself?
• Does the blending of the levels of impact with the teaching categories provide you with concrete ways of evaluating your teaching (or academic development) efforts and setting new goals?
• If you pause and think about your own teaching (or academic development activities), have you any artefacts showing the standard you have achieved in each of the categories?
• If ‘yes’, would these convince others?
• If ‘no’, what kinds of evidence can you see yourself collecting?
• Have you tried to collect evidence from a range of sources, i.e., not just students (or, in the case of academic development, university teachers)?
• Would the combination of levels and categories of teaching help you explain your teaching (or your academic development activities) to your institution?
References (those included in the review are asterisked)
Association of University Teachers. (1996). Professional accreditation of university teaching. Supplement to AUT Bulletin, London: Association of University Teachers, January.*
Biggs, J. (2001). The reflective institution: Assuring and enhancing the quality of teaching and learning. Higher Education, 42, 221–238.
Blizzard, A., & Lockhart, P. (1994). A crude model for assessing teaching quality. Waterloo, Canada: Instructional Development Center, McMaster University.*
Boyer, E. (1990). Scholarship reconsidered: Priorities of the professoriate. Princeton, NJ: Princeton University Press.
Braskamp, L., & Ory, J. (1994). Assessing faculty work: Enhancing individual and institutional performance. San Francisco: Jossey Bass.*
Cashin, W. (1989). Defining and evaluating college teaching. IDEA Paper #21. Kansas, KS: Kansas State University, Center for Faculty Evaluation and Development, Sept. ED339791.*
Centra, J. (1993). Reflective faculty evaluation. San Francisco: Jossey Bass.
D’Apollonia, S., & Abrami, P. (1997). Navigating student ratings of instruction. American Psychologist (Special issue on student ratings of professors), 52 (11), 1198–1208.*
Dow, K. (2001). Strengthening quality assurance in Australian higher education. In D. Dunkerley & W. Wong (Eds.), Global perspectives on quality in higher education (pp. 123–142). Aldershot, England: Ashgate Publishing.
Dube, D. (1995). Teacher evaluation policy. Albany, NY: State University of New York Press.
Edgerton, R., Hutchings, P., & Quinlan, K. (1991). The teaching portfolio: Capturing the scholarship in teaching. Washington, DC: American Association for Higher Education.
Ho, A. (1998). An example of a conceptual change approach to staff development. Paper presented at the International Consortium for Educational Development Conference, Austin, TX, April.
Kirkpatrick, D. (1982). How to improve performance through appraisal and coaching. New York: Amacom.
Kirkpatrick, D. (1998). Evaluating training programs: The four levels (2nd Edn.). San Francisco: Berrett-Koehler Publishers.
Laurillard, D. (2002). Rethinking university teaching (2nd Edn.). London: Routledge/Falmer.
Marsh, H., & Roche, L. (1997). Making students’ evaluations of teaching effectiveness effective: The critical issues of validity, bias and utility. American Psychologist (Special issue on student ratings of professors), 52 (11), 1187–1198.
McAlpine, L., & Weston, C. (2002). Reflection: Issues related to improving professors’ teaching and students’ learning. In N. Hativa & P. Goodyear (Eds.), Teachers’ thinking, beliefs and practices (pp. 59–78). Dordrecht, The Netherlands: Kluwer.
McInnis, C. (1996). Change and diversity in the work patterns of Australian academics. Higher Education Management, 8 (2), 105–117.
Narang, H. (1992). Evaluating faculty teaching: A proposal. Unpublished manuscript. Saskatchewan, Canada. Available: ED349906.*
Ramsden, P. (1992). Learning to teach in higher education. London: Routledge.
Randall, J. (2001). Academic review in the United Kingdom. In D. Dunkerley & W. Wong (Eds.), Global perspectives on quality in higher education (pp. 57–69). Aldershot, UK: Ashgate Publishing.
Robertson, M. (1998). Benchmarking teaching performance in universities: Issues of control, policy, theory and ‘best practice’. In J. Forest (Ed.), University teaching: International perspectives (pp. 275–303). New York: Garland Publishing.
Scriven, M. (1988). Duty-based teacher evaluation. Journal of Personnel Evaluation in Education, 1, 319–334.
Shulman, L. (1993). Teaching as community property. Change, Nov/Dec, 6–7.
Smith, R. A. (1997). Making teaching count in Canadian higher education: Developing a national agenda. Newsletter of the Society for Teaching and Learning in Higher Education (STLHE), 21, 1–9.
Smith, S. (1991). Report of the Commission of Inquiry on Canadian University Education. Ottawa, Canada: Association of Universities and Colleges of Canada.
Vidovich, L., & Porter, P. (1999). Quality policy in Australian higher education of the 90s: University perspectives. Journal of Educational Policy, 14 (6), 567–586.
Weimer, M., & Firing Lenze, L. (1994). Instructional interventions: A review of the literature on efforts to improve instruction. In K. Feldman & M. Paulsen (Eds.), Teaching and learning in the college classroom (pp. 653–682). Needham Heights, MA: Simon & Schuster Custom.
Weston, C., McAlpine, L., & Bordonaro, T. (1995). A model for understanding formative evaluation in instructional design. Educational Technology, Research and Development, 43 (3), 29–46.
Wright, A. (1998). Improving teaching by design: Preferred policies, programs and practices. In J. Forest (Ed.), University teaching: International perspectives (pp. 3–17). New York: Garland.
The authors

Lynn McAlpine is an Associate Professor in the Department of Educational and Counselling Psychology and the Director of the Center for University Teaching and Learning at McGill University. Ralph Harris, also at McGill, is an Associate Professor in the Department of Mining and Metallurgy and an Affiliate Member of the Center for University Teaching and Learning. The two have worked jointly on a number of academic development activities.
Address: Center for University Teaching and Learning, McGill University, 3700 McTavish, Montreal, Quebec, Canada H3A 1Y2. Phone: 514-398-6648. Email: [email protected]