This article was downloaded by: [University of Hong Kong Libraries] on 12 November 2014, at 07:19. Publisher: Routledge. Informa Ltd, registered in England and Wales, Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK.
International Journal for Academic Development. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rija20
Evaluating teaching effectiveness and teaching improvement: A language for institutional policies and academic development practices. Lynn McAlpine & Ralph Harris, Center for University Teaching and Learning, McGill University, Canada. Published online: 10 Dec 2010.
To cite this article: Lynn McAlpine & Ralph Harris (2002) Evaluating teaching effectiveness and teaching improvement: A language for institutional policies and academic development practices, International Journal for Academic Development, 7:1, 7-17, DOI: 10.1080/13601440210156439
To link to this article: http://dx.doi.org/10.1080/13601440210156439
Evaluating teaching effectiveness and teaching improvement: A language for institutional policies and academic development practices

Lynn McAlpine and Ralph Harris, Center for University Teaching and Learning, McGill University, Canada.

ABSTRACT
Demands for institutional accountability in higher education have been increasing and have led to greater attention to the evaluation of teaching, the assumption being that improved teaching will result in enhanced learning. In our work as academic developers, we are increasingly helping academic managers make explicit teaching policies and practices that seem fair and equitable. To help us in this work, we have developed a framework for evaluating the practice of teaching. What is unique about this framework is the language it provides to differentiate aspects of teaching. For instance, it provides a basis for differentiating and linking criteria to standards, i.e. the level of achievement desired or expected. Standards are critical if the evaluation of teaching is to be seen as fair and equitable, yet they are often unexamined in other representations of the evaluation of teaching. Although the original intent of our efforts was to provide a framework for academic managers, we have come to find it useful in our own work as university teachers and as academic developers. Examples of all three uses are provided in the paper.

The International Journal for Academic Development, ISSN 1360-144X print/ISSN 1470-1324 online © 2002 Taylor & Francis Ltd, http://www.tandf.co.uk/journals, DOI: 10.1080/13601440210156439

Introduction – context

Over the past decade in North America, Europe and Australia, there has been increasing public concern with the value or impact of money being spent in post-secondary education, e.g., the Smith (1991) report in Canada called for action at the post-secondary level; the Staff and Educational Development Association in the United Kingdom established standards for practice at the national level; external calls for accountability and constant improvement are ongoing in Australia (Robertson, 1998).

This emphasis on accountability is centrally driven in some countries. Dow (2001) notes the current regulatory protocols for quality assurance common to all levels of government in Australia. Also, Randall (2001) reports on the impact of the Quality Assurance Agency for Higher Education across the UK. Yet, during the same period of time in the US and Canada, the call for accountability has not been experienced in a standards-based national model, but in a more decentralized manner. Evaluation of the quality of the practices of post-secondary institutions occurs through professional associations accrediting particular programmes (e.g., counselling psychology, engineering), and the charters of institutions being granted by individual states and provinces. Calls for action are more evident in North America through the media, for example, an annual survey of universities by a Canadian national news magazine, as well as budget cuts that force universities to consider both the nature of what they do and how outsiders perceive it. Concurrently, there are calls from academics in North America to promote the recognition and evaluation of teaching: for example, in the US, Boyer's (1990) proposal for the scholarship of teaching and Edgerton, Hutchings and Quinlan's (1991) for teaching portfolios; in
Canada, academic developers promoting a national agenda to make teaching count (Smith, 1997).
These social and political demands for action have put pressure on universities to emphasize the improvement of teaching, the assumption being that improved teaching will result in enhanced learning. Concurrently, we know that teachers and academic managers consistently report they want teaching to be recognized, valued and rewarded institutionally. Eight years ago in the US, 35,000 academics in 2- and 4-year colleges were canvassed regarding their view of teaching in the university; 98% reported that being a good teacher was one of their personal goals (Centra, 1993). However, only 10% believed that their universities recognized and rewarded teaching. Centra also reports another US study of 47 research and doctorate-granting institutions in which teachers and academic managers were surveyed. All groups polled supported a balance between research and undergraduate teaching. When teachers and academic managers were grouped separately, it was found that faculty members believed academic managers favoured research, although academic managers claimed they favoured teaching (clear evidence of a lack of discourse on teaching!). More recently, an international study of the perceptions of university teachers, academic managers and academic developers confirmed this finding (Wright, 1998). It revealed a strong belief among all three groups (regardless of country of origin – Canada, US, UK, Australia) that the greatest potential to improve the quality of teaching in one's university was through teaching being evaluated in personnel decisions.
If teaching is to be fairly and accurately evaluated in such decisions, then efforts are required in two areas to provide the impetus for change. One is academic development – providing resources and the opportunity for teachers to improve in order to be able to meet the expectations. The other area is development of policies and institutional structures that define explicit ways of documenting and assessing what will be evaluated.
Nevertheless, undertaking institutional change of this kind is extremely challenging, requiring institutional will as well as human and financial resources – the commitment of all in the institution: academic managers, teachers and staff (Biggs, 2001). We believe the most important ingredient in change of this kind is a clear and shared conceptualization of what is meant by effective teaching. Centra (1993, p. 42) says 'effective teaching produces beneficial and purposeful student learning'. Cashin (1989, p. 4) describes effective teaching as 'all of those teacher behaviours which help students learn'. Both definitions share with Ramsden (1992) a focus on learning as the critical purpose in teaching. Yet, these definitions do not make clear the nature of teaching itself.
Robertson (1998) noted that difficulties in defining the term 'teaching' may be due to the lack of valid and reliable measures of teaching effectiveness. The focus of our paper is a step towards addressing this lack. It provides a language and a framework for communicating the practice of teaching and the criteria and standards that can be used to evaluate the full range of this practice. As we present the framework, we provide instances of how it can be used for administrative purposes, for personal efforts at teaching improvement, and as a means to evaluate our own efforts at academic development.
A framework and a language for evaluating teaching

In our academic development work both in our home university and internationally (e.g., Chile, Indonesia), we find academic managers who have taken up the challenge to create policies that value teaching and will enable a fair evaluation of teaching. For instance, as regards tenure and promotion in North America, academic administrators are concerned with how to use student course evaluations effectively, or what should constitute an appropriate discipline-specific teaching portfolio. Six years ago we were asked a question by a group of administrators that led to a lot of reflection: 'On what basis should they make distinctions about teaching expectations in terms of academic rank?' We realized that we did not ourselves have a comprehensive, coherent way of answering that question. This led us to analyse our personal accumulated knowledge as well as to review some of the literature on evaluation of teaching for administrative purposes, in order to develop a language and a framework to structure our work. The ERIC search at that time, 1996, produced a limited number of documents. (See asterisks in reference list.) Through reflection, discussion and reading, we elaborated a language for those wanting to evaluate teaching in a comprehensive fashion.
Overview
We articulated five aspects to the framework: teaching categories, criteria for evaluation, artefacts that provide evidence, sources of evidence, and standards. (Table 1 summarizes these.) In this paper, we define the categories representing the practice of teaching. Then, within each of these categories, we provide examples of the criteria for evaluation that we found. Next, we enumerate examples of the artefacts: the types of objects and documents that could provide evidence for evaluating the criteria for the different categories. The fourth concept that emerged was sources: the individuals who are likely to be able to provide or evaluate evidence; the full range of these is described. The fifth concept, standards, represents statements about expectations. Standards are critical if the evaluation of teaching is to be seen as fair and equitable. In this paper, we propose a way to specify standards in relation to criteria. (Table 2 provides examples of this.)
What resulted from the review of the literature and our reflection on our experience is, we believe, useful for a number of purposes, since it enabled us to make explicit the many aspects of what we have long termed the invisible, hidden aspects of the teaching 'iceberg'. What most students experience (and others see) of our practice of teaching is 'only the tip of the iceberg'. There are many aspects of teaching hidden under the waterline that are essential for the visible part to be experienced as meaningful by students. Thus, the framework is valuable, first, for personal reflection by individuals (ourselves or other teachers) interested in improving their own teaching. Second, as academic developers, we can use the framework in our activities to ensure we include support for the full range of ways in which teaching is lived out in the university. Also, as academic developers we can use the framework to evaluate the extent to which our work has an impact beyond individual teachers and leads to benefits for the institutional climate that students experience. Lastly, the framework provides a relatively explicit mechanism for academic managers and leaders to evaluate teaching in situations such as tenure and promotion.
Categories of teaching tasks: teaching effectiveness and improvement

In reflecting on our experience and reviewing the documents we examined, we looked for categories descriptive of the major tasks that were used to define teaching. We began with four described by Cashin (1989) and expanded these to seven by the time we had finished our review of the literature.
The first category, subject matter expertise, refers to the individual's grasp of the field, sub-discipline and discipline that he/she has studied.

The second category, design skills, refers to the conceptualization, planning and organization of instruction at the course, programme or curriculum levels. We included all three levels of instructional development here since the skills drawn on are the same.

Delivery skills, the third category, refers to the implementation of instructional plans, including instructional strategies, evaluation techniques, and availability to students in courses as well as to those one is advising. Included are both in- and out-of-class activities: tasks students are expected to engage in regardless of when they occur.

The fourth category, management skills, refers to the organizational abilities necessary for instruction to move smoothly. These skills can include arranging for media, negotiating with the library, and meeting deadlines for grading.

The next category, mentoring/supervision, refers directly to the one-on-one relationship between academics and the students they are supervising in undergraduate honours papers, graduate monographs or theses, as well as credited practica. This is distinct from academic advising, providing advice about programmes or courses of study.

So far, the categories are best conceptualized as relating directly to the outcome of teaching, i.e., student learning. The sixth and seventh categories that emerged broaden the concept of the practice of teaching by incorporating responsibility for teaching improvement. The inclusion of improvement activities in the practice of teaching affirms that 'good teaching' is not just the result of one's efforts with students, but also includes one's efforts in learning how to teach.

The sixth category, personal and professional development, refers to the individual's ability to conceptualize and carry out activities which further personal growth in teaching. It acknowledges that the practice of teaching entails the intention to experiment, practice, get feedback, and reflect over time (e.g., Ho, 1998; McAlpine & Weston, 2002).

Departmental development refers to planned activities and policies that members of a department might implement to further the quality of teaching in their unit, for instance, by defining 'acceptable' teaching expectations and/or
Table 1 Framework for evaluating teaching (with sample criteria, artefacts and sources)

[Seven columns, one per teaching category: subject matter expertise, design skills, delivery skills, management skills, mentoring students, personal/professional development, and departmental development. Four rows give each category's definition, sample criteria, sample artefacts, and sample sources (self, students, peers, colleagues, academic managers, and others such as employers).]
supporting professional development activities. The emergence of this category is consistent, we believe, with Scriven's (1988) concern regarding the teacher as an ethical professional involved in a community of equals, as well as with Boyer's (1990) conception of teaching as scholarship and Shulman's (1993) notion of teaching as community property. Since faculty members sit on departmental committees that set departmental policies and practices, it is they who, through their conversations, can initiate departmental development efforts and help to constitute a culture or climate that enhances student learning (Laurillard, 2002).
Some of the categories may seem self-evident. Reference to the small sample of literature showed that generally design and delivery skills were seen as essential aspects of teaching, followed by personal and professional development and subject matter expertise. Categories referred to less frequently in this review were: management skills, mentoring/supervision, and departmental development. Management skills are largely invisible, yet are critical to seamless instruction.
As for mentoring, it may be given less attention because it is largely an undertaking associated with graduate work; yet in some cases it makes up the bulk of an academic's teaching practice. By treating it as a separate category, the distinct nature of this one-on-one relationship is highlighted and can potentially be evaluated in an appropriate fashion. For instance, after we developed this framework, we realized we had not addressed graduate supervision as an academic development activity, and responded by offering workshops. Concurrently, graduate students initiated the formation of a university committee to explore the evaluation of graduate supervision. We sat with them on the committee to provide expert advice, and after broad consultation a policy is now in place in the university. Thus, teachers now have a better understanding of supervision and its relation to their teaching, and there is a university policy that will ultimately enhance graduate student learning.
Similarly, by making explicit the category of departmental development, there is formal recognition that collegial will and commitment are crucial. Now, we are more attentive to this aspect of teaching in our academic development work and seek to help individual academics to work collectively to influence departmental practices, and to document and report their efforts for personnel decisions.
Overall, the review of the literature highlighted the importance of a broad view of the practice of teaching, one that goes beyond effectiveness with students to include efforts to improve our personal and collegial practices of teaching. Our belief is that one can only conduct a comprehensive evaluation of teaching if all seven categories are considered in the evaluation process.
Criteria for evaluation of teaching effectiveness and improvement

For each of the seven categories of teaching, we used the sample of literature to generate criteria for evaluation: aspects of the specific teaching category which could be evaluated. For instance, for the category of subject matter expertise, 'currency' and 'comprehensiveness' were cited, and for personal and professional development, 'conducting classroom research' and 'workshop attendance' were noted. Criteria for categories other than design skills, delivery skills and personal and professional development were not addressed consistently across the documents. However, it was possible to acquire examples of criteria for all categories. We are sure, though, that the lists could be expanded to make them more comprehensive. (See Table 1 for examples.)
Artefacts to evaluate teaching effectiveness and improvement

Artefacts refer to specific documents or objects, either primary sources, such as course outlines and videotapes of classes, or secondary sources, summarized or critiqued analyses of primary sources, e.g., a colleague's written review of a course outline.
Artefacts can be reviewed to evaluate teaching effectiveness and efforts at teaching improvement within each of the categories and in relation to the evaluation criteria. In other words, individual artefacts may cut across categories. For instance, a teaching portfolio could provide evidence for the following categories: subject matter expertise, design skills, personal and professional development. In addition, an artefact can provide evidence of several different criteria. For example, the teaching portfolio could be used to evaluate the criteria of 'comprehensiveness' and 'currency' as regards subject matter expertise. It could also be used to evaluate 'conducting classroom research' and 'workshop attendance' in the category of personal and professional development (see Table 1).
Many universities use a particular artefact, student evaluation questionnaires or course ratings, to evaluate teaching (Robertson, 1998). Our framework enables a critique of this practice. Although there is plentiful research evidence to support the thoughtful use of these questionnaires (d'Appollonia & Abrami, 1997; Marsh & Roche, 1997), an important consideration should be the extent to which any single artefact is comprehensive in its evaluation of teaching. Since questionnaires can only provide information about a limited range of teaching categories, it is inappropriate to use them as the sole or even principal artefact in personnel decisions, since this reduces the range of criteria that are being evaluated. Thus, an important consideration in making personnel decisions is that other artefacts representing the other categories of teaching be included. An additional incentive for using a variety of artefacts in evaluating teaching is that one can triangulate information across artefacts about specific criteria within each category of teaching. Our analysis provided a reminder of the range of artefacts on which one can draw. Nevertheless, we realise that decisions made regarding what to collect or require will be influenced by institutional culture, by feasibility and cost, etc. We return to this point later.
Sources of information about teaching improvement and effectiveness

Sources refer to the individuals who provide information to be used in evaluating instruction (for example, Weston, McAlpine & Bordonaro, 1995). The analysis we did was useful in expanding our conception of the range of sources, since often students appear to be the primary source used. In total, seven were named. 'Self' refers to the university teacher. 'Students' refers to those they teach, advise or supervise. 'Peers' indicates those with the same subject matter expertise as the university teacher. 'Colleagues' are academics in other fields. 'Administrators' were defined as the individuals most directly responsible for the evaluation of teaching, in the case of North America, chairs and deans.
In the literature, 'instructional administrators' referred to individuals such as librarians or programme directors who are responsible for aspects of instruction. We noted that support or allied staff were not mentioned in any of the documents, although these individuals often have a very clear perspective on the teaching abilities of academics. So we expanded the definition of 'instructional administrators' to include support staff. A last source we named 'other'. This refers to individuals who are not a part of the university but who may be able to provide useful information, e.g., alumni, community members sitting on committees.
Overall, we found a strong reliance on students as a source of evaluation data. This can be useful in an environment seeking to foster student-centred learning. However, it does overlook the fact that, for instance, alumni may have a very different view of their learning than they did when they were students. In addition, students may not have the kind of expertise that is required to evaluate, for instance, subject matter or personal/professional development. The framework focuses attention beyond students as sources and providers of artefacts, and recognizes academic responsibility to make teaching a more collegial and public enterprise in which a range of sources can provide insight into one's practice of teaching. (See Table 1 for examples.)
And what of standards?

The above framework provides a beginning language with which to discuss and analyse teaching effectiveness and teaching improvement. However, when concerned with evaluating the teaching of a range of individuals for personnel decisions, or trying to evaluate our own efforts at teaching or academic development, we are often faced with the following types of questions:

• What are the possible or desired levels of achievement?
• How do we distinguish acceptable from unacceptable teaching?
• What would be exceptional?

These questions relate to standards – expectations which can be used as goals and also to evaluate progress in either a criterion- or norm-referenced manner.
In terms of academic development, Weimer and Firing Lenze (1994) proposed five levels of impact one might consider in setting standards: teacher attitude, teacher knowledge, teacher skill, student attitude and student learning. This scheme has value in representing the impact of academic development in terms of both instructor and student learning. Instructor learning can be assessed in terms of changes in affect, cognition
and action, and student learning can be assessed in terms of changes in affect and cognition. However, this scheme is not designed to track the impact of the practice of teaching as represented in commitment to departmental development.

An alternative scheme of four levels, developed for non-academic settings, i.e. business and industry, was proposed by Kirkpatrick (1982, 1998). The progression is similar to that of Weimer and Firing Lenze (1994), with the added specification that the learning can be used in a range of settings. These levels can be used to assess either student or teacher learning. In addition, the analysis moves beyond Weimer and Firing Lenze and includes, as a last level, the extent to which there is institutional benefit or change as a result of individual efforts at change. We believe this last level is important since we all collectively share responsibility for departmental development, creating an environment in which students can learn effectively (Biggs, 2001). Thus, the Kirkpatrick schema is useful in evaluating the practice of teaching in terms of the extent to which:
• learners felt engaged or enjoyed the learning activity, whether formal or informal, e.g., for students or for teachers this could be through self-report [level 1];
• actual learning occurred as measured by appropriate means, e.g., for students in a course this might be exams or projects; for teachers, this might be self-report [level 2];
• there is ability to use the learning outside of the activity in which the learning occurred, e.g., for students, the ability to use the learning in the next course as observed in class by the new teacher; for teachers, the ability to use the learning effectively in a range of different courses [level 3];
• there is institutional benefit that accrues as a result of the activity, e.g., enhanced climate for student learning as represented in student reports, increase in registrations for a programme/course, higher completion rates, etc. [level 4].
It struck us that the levels of impact might be used as a structure to begin to differentiate 'acceptable' from 'very good' from 'exemplary' teaching (see Figure 1). For instance, one might describe 'acceptable' teaching as including impact at levels 1 and 2 across the seven categories; that is, there is evidence of positive affective outcomes and actual learning. 'Very good' teaching would incorporate an emphasis on levels 2 (actual learning) and 3 (transfer of learning to other settings). 'Exemplary' teaching would focus on levels 3 and 4 (transfer and institutional benefit). The nature of the category would make it clear whether the focus of the learning was students or teachers, e.g., evidence of student learning would be necessary for delivery skills and evidence of teacher learning for personal/professional development.
Possible standards for evaluation
In Table 2, for each teaching category, we have chosen one criterion and provided examples of the artefacts that could provide evidence of 'acceptable', 'very good' and 'exemplary' teaching, in order to show the potential of this approach. So in considering the category of delivery skills and the criterion of 'fairness of grading practices', for example, 'acceptable' expectations might be a self-report analysing principles and decisions. Characterization as 'very good' could be represented in a formative evaluation to get student responses to grading practices. 'Exemplary' could be seen in a written summary of these responses which was then critiqued by a colleague in order to find ways to improve the grading practices.
In considering the category of personal/professional development and the criterion of 'workshop attendance', 'acceptable' expectations might be represented in a report on a workshop. Characterization as 'very good' for the same criterion might be seen in a written plan for professional development that elaborated how questions or concerns about teaching that emerged in the workshop were being addressed through other academic development activities. 'Exemplary' teaching might be recognized through a report about discussions with colleagues or the use of the learning to shift departmental teaching practices.
EVALUATING TEACHING EFFECTIVENESS AND IMPROVEMENT 13
Figure 1 Teaching standards and equivalent ‘levels of impact’: acceptable – mostly levels 1 and 2; very good – emphasis on levels 2 and 3; exemplary – emphasis on levels 3 and 4.
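The correspondence between standards and levels of impact summarized in Figure 1 can be expressed as a simple lookup table. The following sketch is purely our own illustration and not part of the authors' framework; the names `STANDARD_LEVELS` and `meets_standard` are assumptions introduced here:

```python
# Illustrative sketch (not from the article): Figure 1's mapping of
# teaching standards to Kirkpatrick-style 'levels of impact'.
STANDARD_LEVELS = {
    "acceptable": {1, 2},   # positive affective outcomes and actual learning
    "very good": {2, 3},    # actual learning and transfer to other settings
    "exemplary": {3, 4},    # transfer and institutional benefit
}

def meets_standard(evidence_levels, standard):
    """Return True if the levels of impact evidenced by a dossier
    cover the emphasis of the given standard."""
    return STANDARD_LEVELS[standard] <= set(evidence_levels)

# A dossier evidencing impact at levels 1, 2 and 3 would satisfy
# 'very good' but not 'exemplary' under this reading of Figure 1.
```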
Table 2 Application of levels of impact to framework

Category | Sample criterion | Acceptable | Good/excellent | Exemplary
Subject matter | comprehensiveness | set of lecture notes that demonstrates attention to comprehensiveness – 1*, 2 | self-report on efforts to ensure comprehensiveness of subject matter – 1, 2 | peer review of lecture notes or concept map of course – 2, 3
Design skills | course organization | course outline with elements required by policy – 1 | elaborated course outline, clear evaluation criteria – 1 | elaborated course outline with planning, alignment of objectives and evaluation critiqued by peer or colleague – 2, 3
Delivery skills | fairness of tests/grading practices | self-report analysing principles and decisions – 1 | conducting formative evaluation to get student response to grading practices – 1, 2 | written summary of student responses re their perception of fairness, and critique by peer or colleague of self-report and other relevant documents – 2, 3
Management skills | preparation for each class | reports from students and peers that no disruptions of class time – 1 | reports from students and peers that uses aids beyond the blackboard which enhance learning – 1, 2 | reports from peers and others that different strategies used in an organized, integrated manner, e.g., field trips, multidisciplinary projects, team teaching – 2, 3, 4
Mentoring | evaluation of quality of students’ skills | good external reviews of dissertation – 2 | departmental/university awards – 3 | national association awards – 4
Personal/professional development | participation in improvement activities | evidence of attending workshops on teaching – 1, maybe 2 | written plan for professional development – 3 | presentation of results of professional innovation at local/national development conference – 4
Departmental | supports departmental efforts | sits on curriculum committees – 1, 2 | chairs committees or coordinates team teaching – 2, 3 | initiates new programmes or development/teaching projects – 3, perhaps 4

* Numbers refer to level of impact
You will note that in providing examples of standards, we have not attended to the weighting that might be important in order to differentiate a) a range of personal factors, e.g., teaching responsibilities, prior experience, or b) a change in professional expectations, e.g., responsibilities of assistant, associate or full professors in North America. In terms of personal factors, years of teaching experience could, for instance, be incorporated such that those new to teaching might be expected to focus on subject matter, design, delivery and management skills, and more experienced teachers on an increasingly broad range of categories.
As regards professional expectations, in North American terms, one might presume ‘acceptable’ teaching as the level of expertise for someone seeking promotion to the associate level, when one moves from tenure track to permanency. ‘Very good’ would be the standard expected of an individual seeking promotion to full professor, when an associate professor submits a dossier showing sustained excellence over an extended period. In fact, this framework has been used in evaluating submissions for university teaching prizes in which awards were given for each rank. It appeared to make the comparison of submissions, which is, in our experience, a very difficult task, somewhat simpler.
The same notion of standards can be used to set academic development goals and to evaluate the impact of our work. For instance, for the category of departmental development and the criterion of ‘supports departmental development’, ‘acceptable’ expectations might be that as an academic developer one actively seeks to learn the teaching development needs of the unit. The artefact might be a report or proposal to a teaching committee or the chair or head of the department. Academic development characterized as ‘very good’ might be facilitating the implementation of an appropriate response to these needs, including an evaluation of the impact. This could be represented in a follow-up report to the relevant committee, one that presents the results of the evaluation. ‘Exemplary’ academic development might be recognized in disseminating the results to senior administration and other units through presentations or reports as examples of ‘best practice’.
Implementing policies that recognize and reward teaching

We began this analysis to understand how we might conceptualize a system for evaluating the practice of teaching. What has emerged is an explicit framework that is comprehensive and comprehensible. Does the framework suggest areas where there is room for change in your institution’s practices? Could any part of the framework make it easier for academic managers to recognize and reward teaching? If so, then we repeat our earlier caveat: implementing a university policy that supports the recognition and reward of teaching is a political activity that can be fraught with resistance and difficulties. Such policies involve human and financial resource commitment to a) academic development that enables university teachers to meet and move beyond acceptable expectations, and b) administrative structures that can implement the accountability policies in ways that are perceived to be reliable and credible. For instance, control of information – who has access to it – is a crucial concern.
This balance between support and accountability is a critical one. Based on public school experience (Dube, 1995), an over-emphasis on accountability will lower morale and lead to a focus on meeting the minimum standards, whereas an over-emphasis on development will lead to a situation where only those already interested in teaching take advantage of the opportunities. Further, workload can increase. McInnis (1996) compared workload patterns for academics in Australia from 1977 to 1993 and found little change for teaching and research. However, there was an additional workload – the result of institutional competition and requirements for accountability and quality assurance. In North America, this may not be as true given the different approach being used for accountability. Nevertheless, it is evident that any scheme requires time and other resources to implement effectively, and such implementation can be not only difficult but very much influenced by institutional differences (Vidovich & Porter, 1999).
In closing: Improving one’s own teaching or academic development activities
We believe this framework is unique in providing a language for differentiating aspects of the practice of teaching as well as criteria, artefacts, sources and standards for evaluating teaching.
Although the original intent of the analysis was to provide a framework for academic managers, we find it useful for other purposes since it makes explicit the hidden aspects of the teaching ‘iceberg’. What students and others see and associate with teaching is similar to the small part of the iceberg that is visible above the water line. What students experience would be impossible without the 80% to 90% of the part that is hidden, underwater so to speak. Until now, the ‘underwater’ part has often been described only to the extent of saying that it is the planning and evaluation of teaching that occur here. As a result of this analysis, we can now name ‘delivery skills’ as the visible part of teaching and the other six categories as making up the invisible part of teaching. Thus, as academic developers in our work with university teachers, we can use the framework to provide them, particularly those who are new, with a concrete and explicit description of the full range of ways in which teaching is lived out in the university. As well, we can use it as a basis for considering our own work – the extent to which we provide support for all aspects of the practice of teaching, the extent to which we have an impact beyond individuals and are able to facilitate institutional change. This framework can also serve as a reflective tool for us (and for others) in our own teaching: to set personal goals and evaluate personal learning and accomplishments as we engage in undergraduate and graduate teaching.
It was serendipitous that what emerged for us from this analysis was a clearer understanding of our own work, both in undergraduate and graduate teaching and in academic development. It made us both realize what we had been attending to and what we had been overlooking or ignoring in setting personal teaching (and academic development) goals and evaluating our accomplishments. It has led us to ask questions of each other that others might find equally fruitful to respond to. For instance:
• Does the framework of categories, criteria, artefacts and sources broaden your perspective of what effective teaching (and academic development) is?
• Does it help you explain your teaching (or academic development) activities to yourself?
• Does the blending of the levels of impact with the teaching categories provide you with concrete ways of evaluating your teaching (or academic development) efforts and setting new goals?
• If you pause and think about your own teaching (or academic development activities), have you any artefacts showing the standard you have achieved in each of the categories?
• If ‘yes’, would these convince others?
• If ‘no’, what kinds of evidence can you see yourself collecting?
• Have you tried to collect evidence from a range of sources, i.e., not just students (or, in the case of academic development, university teachers)?
• Would the combination of levels and categories of teaching help you explain your teaching (or your academic development activities) to your institution?
References (those included in the review are asterisked)
Association of University Teachers. (1996). Professional accreditation of university teaching. Supplement to AUT Bulletin, London: Association of University Teachers, January.*
Biggs, J. (2001). The reflective institution: Assuring and enhancing the quality of teaching and learning. Higher Education, 42, 221–238.
Blizzard, A., & Lockhart, P. (1994). A crude model for assessing teaching quality. Waterloo, Canada: Instructional Development Center, McMaster University.*
Boyer, E. (1990). Scholarship reconsidered: Priorities of the professoriate. Princeton, NJ: Princeton University Press.
Braskamp, L., & Ory, J. (1994). Assessing faculty work: Enhancing individual and institutional performance. San Francisco: Jossey Bass.*
Cashin, W. (1989). Defining and evaluating college teaching. IDEA Paper #21. Kansas, KS: Kansas State University, Center for Faculty Evaluation and Development, Sept. ED339791.*
Centra, J. (1993). Reflective faculty evaluation. San Francisco: Jossey Bass.
D’Apollonia, S., & Abrami, P. (1997). Navigating student ratings of instruction. American Psychologist (Special issue on student ratings of professors), 52 (11), 1198–1208.*
Dow, K. (2001). Strengthening quality assurance in Australian higher education. In D. Dunkerley & W. Wong (Eds.), Global perspectives on quality in higher education (pp. 123–142). Aldershot, England: Ashgate Publishing.
Dube, D. (1995). Teacher evaluation policy. Albany, NY: State University of New York Press.
Edgerton, R., Hutchings, P., & Quinlan, K. (1991). The teaching portfolio: Capturing the scholarship in teaching. Washington, DC: American Association for Higher Education.
Ho, A. (1998). An example of a conceptual change approach to staff development. Paper presented at the International Consortium for Educational Development Conference, Austin, TX, April.
Kirkpatrick, D. (1982). How to improve performance through appraisal and coaching. New York: Amacom.
Kirkpatrick, D. (1998). Evaluating training programs: The four levels (2nd Edn.). San Francisco: Berrett-Koehler Publishers.
Laurillard, D. (2002). Rethinking university teaching (2nd Edn.). London: Routledge/Falmer.
Marsh, H., & Roche, L. (1997). Making students’ evaluations of teaching effectiveness effective: The critical issues of validity, bias and utility. American Psychologist (Special issue on student ratings of professors), 52 (11), 1187–1198.
McAlpine, L., & Weston, C. (2002). Reflection: Issues related to improving professors’ teaching and students’ learning. In N. Hativa & P. Goodyear (Eds.), Teachers’ thinking, beliefs and practices (pp. 59–78). Dordrecht, The Netherlands: Kluwer.
McInnis, C. (1996). Change and diversity in the work patterns of Australian academics. Higher Education Management, 8 (2), 105–117.
Narang, H. (1992). Evaluating faculty teaching: A proposal. Unpublished manuscript. Saskatchewan, Canada. Available: ED349906.*
Ramsden, P. (1992). Learning to teach in higher education. London: Routledge.
Randall, J. (2001). Academic review in the United Kingdom. In D. Dunkerley & W. Wong (Eds.), Global perspectives on quality in higher education (pp. 57–69). Aldershot, UK: Ashgate Publishing.
Robertson, M. (1998). Benchmarking teaching performance in universities: Issues of control, policy, theory and ‘best practice’. In J. Forest (Ed.), University teaching: International perspectives (pp. 275–303). New York: Garland Publishing.
Scriven, M. (1988). Duty-based teacher evaluation. Journal of Personnel Evaluation in Education, 1, 319–334.
Shulman, L. (1993). Teaching as community property. Change, Nov/Dec, 6–7.
Smith, R. A. (1997). Making teaching count in Canadian higher education: Developing a national agenda. Newsletter of the Society for Teaching and Learning in Higher Education (STLHE), 21, 1–9.
Smith, S. (1991). Report of the Commission of Inquiry on Canadian University Education. Ottawa, Canada: Association of Universities and Colleges of Canada.
Vidovich, L., & Porter, P. (1999). Quality policy in Australian higher education of the 90s: University perspectives. Journal of Educational Policy, 14 (6), 567–586.
Weimer, M., & Firing Lenze, L. (1994). Instructional interventions: A review of the literature on efforts to improve instruction. In K. Feldman & M. Paulsen (Eds.), Teaching and learning in the college classroom (pp. 653–682). Needham Heights, MA: Simon & Schuster Custom.
Weston, C., McAlpine, L., & Bordonaro, T. (1995). A model for understanding formative evaluation in instructional design. Educational Technology, Research and Development, 43 (3), 29–46.
Wright, A. (1998). Improving teaching by design: Preferred policies, programs and practices. In J. Forest (Ed.), University teaching: International perspectives (pp. 3–17). New York: Garland.
The authors

Lynn McAlpine is an Associate Professor in the Department of Educational and Counselling Psychology and the Director of the Center for University Teaching and Learning at McGill University. Ralph Harris, also at McGill, is an Associate Professor in the Department of Mining and Metallurgy and an Affiliate Member of the Center for University Teaching and Learning. The two have worked jointly on a number of academic development activities.
Address: Center for University Teaching and Learning, McGill University, 3700 McTavish, Montreal, Quebec, Canada H3A 1Y2. Phone: 514-398-6648. Email: [email protected]