
EVALUATING TRAINING EFFECTIVENESS: AN INTEGRATED PERSPECTIVE IN MALAYSIA


Lim Guan Chong
Master of Business Administration (Finance)

International Graduate School of Management
Division of Business and Enterprise
University of South Australia

Submitted on this 5th of August in the year 2005 in partial fulfilment of the requirements of the degree of Doctor of Business Administration


University of South Australia

DOCTOR OF BUSINESS ADMINISTRATION

PORTFOLIO SUBMISSION FORM

Name: Lim Guan Chong Student Id No: 0111487H

Dear Sir/Madam

To the best of my knowledge, the portfolio contains all of the candidate's own work completed under my supervision, and is worthy of examination.

I have approved for submission the portfolio that is being submitted for examination.

Signed:

Dr Travis Kemp/Professor Dr Leo Ann Mean Date

Supported by:

Dr Ian Whyte
Chair: Doctoral Academic Review Committee
International Graduate School of Business


Date


DBA Portfolio Declaration

I hereby declare that this paper submitted in partial fulfillment of the DBA degree is my own work and that all contributions from any other persons or sources are properly and duly cited. I further declare that it does not constitute any previous work, whether published or otherwise. In making this declaration I understand and acknowledge that any breaches of the declaration constitute academic misconduct which may result in my expulsion from the program and/or exclusion from the award of the degree.

Signature of candidate:


Lim Guan Chong Date: 5th August 2005


TABLE OF CONTENTS

Portfolio Submission Form
Portfolio Declaration
Acknowledgements

Overview 1

1 Research Paper 1: Methodological Issues in Measuring Training Effectiveness 3
1.1 Abstract 4
1.2 Introduction 4
1.3 Approaches to Training Evaluation 6
1.3.1 Discrepancy Evaluation Model 7
1.3.2 Transaction Model 10
1.3.3 Goal-Free Model 10
1.3.4 Systemic Evaluation 12
1.3.5 Quasi-Legal Approach 13
1.3.6 Art Criticism Model 13
1.3.7 Adversary Model 14
1.3.8 Contemporary Approaches: Stufflebeam's Improvement-Oriented Evaluation (CIPP) Model, 1971 14
1.3.9 Cervero's Continuing Education Evaluation, 1984 15
1.3.10 Kirkpatrick Model, 1959a, 1959b, 1960a, 1960b, 1976, 1979, 1994, 1996a, 1996b, 1998 16
1.4 Critical Review 22
1.5 Future Research 27
1.5.1 The Transfer Component 27
1.5.2 Evaluating Beyond the 4 Levels 28
1.5.3 Incorporating Competency-based Approach into Training Evaluation 29
1.5.4 Multi-Rater System in Training Evaluation 31
1.6 Conclusion 33
1.7 References for Paper One 34

2 Research Paper 2: Evaluating Training Effectiveness: An Empirical Study of Kirkpatrick Model of Evaluation in the Malaysian Training Environment for the Manufacturing Sector 43
2.1 Abstract 44
2.2 Introduction 44
2.3 Training Practices in Malaysia 45
2.4 Practice of Evaluation in Training 46
2.5 Training Evaluation Practices in Malaysia 49
2.6 Methodology of Study 53
2.6.1 Questionnaire Construction 54
2.6.2 The Sample and Sampling 55
2.6.3 Questionnaire Responses 56
2.7 Findings and Discussion 56
2.8 Limitations of Study 63
2.9 Conclusion 64
2.10 References for Paper Two 65
2.11 Appendix A: The Questionnaire for Research Paper Two 69

3 Research Paper 3: Multi-rater Feedback for Training and Development: An Integrated Perspective 74
3.1 Abstract 75
3.2 Introduction 75
3.3 The Use of Multi-rater Feedback 76
3.4 The Effectiveness of Multi-rater Feedback for Development 79
3.5 The Effectiveness of Multi-rater Feedback for Appraisal 81
3.6 The Variation of Multi-rater Feedback Information 81
3.7 Multi-rater Feedback Practices in Malaysia 83
3.8 Integrating Multi-rater Feedback with a Developmental Tool 86
3.9 Multi-rater Feedback: Process Consultation as a Developmental Tool 87
3.10 Micro Perspective of Conversation Theory in Process Consultation 89
3.11 An Integrated Approach for Post Multi-rater Feedback Development 91
3.12 Conclusion 96
3.13 References for Paper Three 97


Acknowledgements

I am sincerely grateful to my supervisors, Dr Travis Kemp and Professor Leo Ann Mean, who have been so supportive, taking the time to look through my papers and giving me tremendously useful feedback and suggestions.

First and foremost, thanks to my spouse, Linda Liew Mei Ling, who acted as my research assistant and put in late nights and thoughtful moral support throughout this endeavor.

Special thanks are reserved for my friends who acted as my proof readers and never let me produce less than the best I had to offer.

In particular, my sincerest thanks to my respondents, relatives, families and other parties who have supported me along the way and helped me find the time to complete my thesis.

Finally, my utmost appreciation to the University of South Australia, International Graduate School of Management, for their support and enthusiasm for achieving excellence in education.

Lim Guan Chong


Overview

The majority of organizations realize that training must be a worthwhile effort; there must be

returns in labour productivity after training. Evaluation is possibly the least developed aspect of the training cycle. This research portfolio looks at the effectiveness of Kirkpatrick's four levels of evaluation, with emphasis on the assessment of the methodology within the

training perspective.

Evaluating training is typically linked with measuring change and quantifying the degree of

change which leads to performance. Measuring gains in organizational effectiveness that result from training interventions is probably the most difficult task in training evaluation.

This research portfolio, as a partial fulfillment of the requirement of the degree of Doctor of

Business Administration, develops a series of ideas that expand on traditional approaches to

training evaluation. The research portfolio is divided into three papers.

Paper 1 critically reviews the methodological problems faced when adopting the evaluation

model developed by Donald L Kirkpatrick in 1959. A series of industrial studies shows little application of this definite approach. The literature provides little understanding of the transfer-of-learning component when the Kirkpatrick model is used to determine training effectiveness, and most current researchers agree that future research on training evaluation should focus on how effectively the skills learned are transferred. Through an anatomy of this classical theory, the objective of this research portfolio is to address these weaknesses by re-focusing on transfer of learning as the key to unlocking the model's practicality and validity.

Paper 2 adopts a survey method to track the history, rationale, objectives, implementation and

evaluation of training initiatives in the Malaysian manufacturing sector. It utilizes the survey

research to triangulate reliable and convincing findings.

The research examines the extent to which the Kirkpatrick model is practised in the Malaysian manufacturing sector. This paper reports on the practice of Kirkpatrick's 4 levels of evaluation and the effectiveness of this evaluation model within that sector.


Paper 3 concerns the effective use of the multi-rater feedback system in providing multi-source information and creating self-awareness based on individual strengths and weaknesses. One underlying rationale for such a system is its potential impact on the individual's self-awareness, which is thought to enhance performance at the development stage.

This conceptual paper studies how multi-rater feedback could effectively lead to a successful developmental process through process consultation in the context of the Malaysian training environment. Through the years, a training evaluation culture in Malaysia has not been properly developed. A comprehensive approach is necessary for organizations to see the benefits of conducting pre-training analysis. This should be followed by an effective development plan so that a comprehensive training approach can be instilled in the Malaysian environment.

The process consultant holds the key to an effective development process by using a multi-rater

assessment as a pre-training gap analysis. Process consultation provides the opportunity to

'check and balance' the degree of learning and development activities through reflections,

problem solving capabilities and application of theory throughout the developmental process.

Good conversation was introduced as an intervention tool to complement double loop

learning during process consultation.

This portfolio systematically discusses the issue of training evaluation faced by the Malaysian

manufacturing sector. It is recommended that an integrated approach comprising preliminary and post-assessment using multi-rater feedback, followed by a developmental process using process consultation, complemented by good conversation as an intervention tool, may serve as a rational balance between financial outlays on training and development outcomes.


Research Paper 1

METHODOLOGICAL ISSUES IN MEASURING TRAINING EFFECTIVENESS

Lim Guan Chong
Master of Business Administration (Finance)
University of Hull

International Graduate School of Management
University of South Australia


Methodological Issues in Measuring Training Effectiveness

Lim Guan Chong
International Graduate School of Management

University of South Australia

1.1 Abstract

This literature review examines the effectiveness and the methodological issues related to

Kirkpatrick's four-level model of evaluation and its application to training. The paper first

measures the extent to which Kirkpatrick's evaluation model has been used by organizations to measure learning outcomes, reactions towards development, transfer of learning, change in behaviour and return on investment after training. Research was conducted to determine the weaknesses of this model as experienced by most practitioners. An examination of this classical theory was carried out to address those weaknesses by re-focusing on transfer of learning as a key to unlocking the model's practicality and validity.

1.2 Introduction

Training evaluation is regarded as an important human resource development strategy.

However, there seems to be widespread agreement that systematic evaluation is the least well

carried out training activity. Chen and Rossi (1992) commented that evaluation knowledge

found in the literature has not been fully utilized in program evaluation. This reveals that

training evaluation has not been culturally embedded in most organizations. The first reason

could be that companies lack the knowledge to conduct training evaluation. Secondly, the available training evaluation models do not provide a total approach for

effective training evaluation. This is further evidenced by a study on the benefits of training

in Britain, which revealed that 85 percent of British companies make no attempt to assess the

benefits gained from undertaking training (HMSO, 1989).


Since evaluation started in the area of education, most of the early definitions were in that

area. Tyler (1949) was the first researcher to define evaluation as a process of determining to

what extent the educational objectives are actually being realized by the curriculum and

instruction. The early researchers emphasized the need to look at attaining objectives as an

important process in determining the effectiveness of any programs. This was found in the

study by Steel (1970), who compared the effectiveness of the program with its cost. Boyle and

Jahns (1970) defined evaluation as the determination of the extent to which the desired

objectives have been attained or the amount of movement that has been made in the desired

direction. Further study by Provus (1971) conceptualized the need to have a certain standard

of performance as an objective-based criterion to judge the success of the program. His

model made comparisons between this preset standard and what actually exists. Noe (2000)

defined evaluation by referring to training evaluation as the process of analyzing the

outcomes needed to determine if training was effective. However, Goldstein and Ford (2002)

were of the opinion that evaluation is a systematic collection of descriptive and judgemental

information necessary to make effective training decisions which are related to the selection,

adoption, value, and modification of various activities.

After many in-depth studies were conducted on training evaluation, and with high expectations of cost-effectiveness from training, the term evaluation has been given a broader perspective in which it no longer focuses on achieving program objectives but mainly covers the methodology element of evaluation (Brinkerhoff, 1988; Goldstein, 1986; Junaidah, 2001; Shadish & Reichardt, 1987; Stufflebeam & Shinkfield, 1985). The goal-based process formed only part of the overall evaluation process, unlike in the past when researchers used one preferred methodological principle to assess the degree to which training had attained its goals. With the availability of a wider range of philosophical

principles and scientific methodologies, many social scientists emphasized scientific rigor in

their evaluation models, and this is reflected in their definition of the field (Junaidah, 2001).

The evaluation model of these social scientists involves primarily the application of scientific

methodologies to study the effectiveness of the programs. These evaluators emphasized the

importance of experimental designs (Goldstein & Ford, 2002), quantitative measures (Rossi

& Freeman 1993) and qualitative assessment (Wholey, Hatry & Newcomer, 1994).

Contemporary social scientists such as Cascio (1989), Mathieu and Leonard (1987), Morrow, Jarrett


and Rupinski (1997), and Tesoro (1998) even adopted utility analysis in evaluating the worthiness

and effectiveness of the programs.

In brief, the concept of evaluation consists of two distinct definitions: congruent and

contemporary definitions (Junaidah, 2001). The congruent definition is more concerned with

meeting the desired objectives. It is a process of collecting information, judging the worth or

value of the program and ensuring training objectives are met. The contemporary definition

of evaluation places emphasis on scientific investigation to facilitate decision-making.

Stufflebeam (1971) mentioned that evaluation is the process of delineating, obtaining and

providing useful information for judging decision alternatives. This can be seen from the

evolution of the early 70s models to the current contemporary evaluation models.

1.3 Approaches to Training Evaluation

Evaluation in its modern form has developed from attempts to improve the educational

process (Bramley, 1996). Evaluating the effectiveness of people became popular at about the

same time as scientific management, and school officials began to see the possibility of

applying these concepts to school improvement (Bramley, 1996). Tyler's (1949) model is generally considered an early prominent evaluation model, designed to compare the value of progressive high-school curricula with that of more conventional ones (Stufflebeam &

Shinkfield, 1985).

Tyler (1949) introduced the Basic Principles of Curriculum and Instruction, which is

organized around four main concerns:

What educational purposes should the organization seek to attain?

How to select learning experiences that are likely to be useful in achieving these

purposes?

How can the selected learning experiences be organized for effective instruction?

How can the effectiveness of these learning experiences be evaluated?

Tyler laid the foundation for an objective-based style of evaluation. Objectives were seen as

being critical because they were the source for planning, guiding the instruction and


preparing the test and measurement procedures. Tyler's objective-based evaluation model

concentrates on clearly stated objectives by changing the evaluation from appraisal of

students to appraisal of programs. He defined evaluation as assessing the degree of

attainment of the program objectives. Decisions made on any program had to be based on the

goal congruence between the objectives and the actual outcomes of the program (Stufflebeam

& Shinkfield, 1985).

1.3.1 Discrepancy Evaluation Model

The Discrepancy Evaluation Model, developed by Provus (1971), is used in situations where a

program is examined through its development stages with the understanding that each stage

(which Provus defines as design, installation, process, product and cost-benefit analysis) is

measured against a set of performance standards (objectives). The cost-benefit analysis

identifies the potential benefits of the training before it is carried out. The expected

behaviours which result from the training are agreed upon between the trainer and the

trainees. The analysis also establishes training objectives, which are defined as changes in

work behaviour and increased levels of organizational effectiveness (Bramley & Kitson,

1994). The program developers had certain performance standards in mind regarding how the

program should work and how to identify if it were working. The discrepancies that are

observed between the standards and the developed design are communicated back to the

relevant parties for review or further corrective action. A discrepancy evaluator's role is to

determine the gap between what is and what should be. This model helps the evaluators to

make decisions based on the difference between preset standards and what actually exists

(Boulmetis & Dutwin, 2000).

Provus's Discrepancy Evaluation Model can be considered an extension to Tyler's earlier

objective-based model where a set of performance standards must be derived to serve as the

objectives on which the evaluation of the program is based. Furthermore, the model may be

also viewed as having properties of both the formative and summative evaluation (Boulmetis

& Dutwin, 2000). The design stage comprises the needs analysis and program planning

stages; installation and process are parts of the implementation stage where formative


evaluation is done; and the product and cost-benefit analysis stages comprise the summative

evaluation stage.

Formative evaluation focuses on the process criteria to provide further information to

understand the training system so that the intended objectives are achieved (Goldstein &

Ford, 2002). Brown, Werner, Johnson and Dunne (1999) note several potential benefits of

formative evaluation. The program could be assessed half way through to see whether it is

on track, effectively performed, and whether the activities are meeting the needs of the

training. The evaluator determines the extent to which the program is running as planned,

measures the program progress in attaining the stated goals, and provides recommendations

for improvement. The evaluation findings in these reports and the monitoring data could be

used to end a program in midstream (Goldstein & Ford, 2002). Unlike formative evaluation,

summative evaluation is fairly stable and does not allow adjustments during the program

cycle. Summative evaluation involves evaluating and determining whether the program has

experienced any unplanned effects. It helps organizational decision makers decide whether

to use the program again or improve it in some way. Campbell (1988) discriminates between

two types of summative evaluations: the first evaluation simply questions whether a

particular training program produces the expected outcome. The second evaluation compares

and investigates the benefits and viability of programmed instruction procedures. By

comparing the two evaluations, it was found that programmed instruction produces quicker

mastery of the subject, but the eventual level of learning retention is the same with either

technique (Campbell, 1988).

Provus's Discrepancy Evaluation Model provides information for establishing measures of

training success by determining whether the actual content of the training material would

develop knowledge, skill and ability (KSA) and eventually lead to a successful job

performance. However, too many subjective issues exist, especially in the setting up of the performance criterion. The chosen criterion is based on the relevance of

three components: knowledge, skill and ability which are necessary to succeed in the training

and eventually on the job. Considering that modern approaches to assessing training

programs must be examined with a multitude of measures, including participant reactions,

learning, performance, and organizational objectives, it is necessary for training evaluators to

view the performance criteria as multidimensional (Goldstein & Ford, 2002). Training can

best be evaluated by examining many independent performance dimensions. However, the


relationship between measures of success should be closely scrutinized because the

inconsistencies that occur often provide important insights into training procedures

(Goldstein & Ford, 2002). Decisions and feedback processes depend on the availability of all

sources of information. There are many different dimensions in which the performance

criteria can vary. Issues like relevance and reliability of the criterion are important to consider

should one wish to adopt this discrepancy evaluation model. There are several considerations

in the evaluation of the performance criteria. These include acceptability to the organization,

networks and coalitions that can be built between trainees, and realistic measures (Goldstein &

Ford, 2002).

Responsive approaches used in the goal-free model are better evaluative approaches as there

is considerable variation in what the objectives of a program are thought to be. Responsive

approaches are a form of action research which involves the stakeholders in the data

collection process (Bramley, 1996). The intention is not to attribute causality, but to gain a

sense of the value of the program from different perspectives. The term "responsive evaluation"

was first used by Stake (1977) to describe a strategy in which the evaluator is less concerned

with the objectives of the program than its effect in relation to the concerns of interested

parties, namely the stakeholders.

The responsive approach involves protracted negotiations with a wide range of stakeholders

in constructing the report. It is thus more likely to reflect their reality and be useful for them.

However, the underlying philosophy of responsive evaluation is different from the goal-based

approach. Evaluators are seen as subjective partners and the evaluation is based upon a joint-

collaborative effort which results in findings being constructed rather than revealed by the

investigation. Truth is a matter of consensus among informed parties. Facts have no meaning

except within some value framework. Phenomena can only be understood in the context in

which they are studied; generalization is not possible.

The suggested method intends to achieve progressive focus by giving more attention to

emerging issues rather than seeking the truth. Legge (1984) introduced a model similar to

goal free evaluation which evaluates planned organizational change. The evaluation is a joint,

collaborative process, which results in something more constructed than revealed by the

investigation. Legge (1984) suggests that instead of attempting evaluation as a thoroughly


monitored research, a contingency approach should be adopted. The contingency approach

is used to decide which approach is more appropriate or best matches the functional

requirements of the evaluation exercise. Campbell (1988) revealed that internal validity of

the scientific approach may not be so crucial. To increase internal validity, the legitimate

stakeholders should agree on the evaluation approach. The emphasis on internal validity in the scientific approach will frequently imply controlling key aspects of the context and many

organizational variables. This may lead to rather simplified information which clients find

difficult to use because it does not reflect their perception of organizational reality. Due to

this strong bipolarity between practitioners and academics, not many responsive evaluations

have been described in the training literature (Bramley, 1996).

1.3.2 Transaction Model

The Transaction Model developed by Stake (1977) affords a concentration of activity among

the evaluator, participants and the project staff (Madaus, Scriven & Stufflebeam, 1986). This

model combines monitoring with process evaluation through regular feedback sessions

between evaluator and staff. The evaluator uses a variety of observational and interview

techniques to obtain information and the findings will be shared with all the relevant parties

to improve the overall program. The evaluator participates in and contributes to project activities.

Besides trying to obtain objectivity, the evaluators use subjectivity in the transaction model.

This model may have a goal-free or a goal-based orientation. Findings are shared with the

staff of all the projects in order to improve both individual and overall projects (Boulmetis &

Dutwin, 2000).

1.3.3 Goal-Free Model

Unlike early models, the goal-free model developed by Michael Scriven is a model that

involves methodological studies and processes (Popham, 1974). The evaluation model

examines how the program is performing and how the program could address the needs of the

client population. Program goals are not the criteria on which evaluation is based. However,

it is a data gathering process which studies actual happenings and evaluates the effectiveness


of the program in meeting the client's needs. The evaluator has no preconceived notions

regarding the outcome of the program (as opposed to the goal-based model). Categories of

evaluation naturally emerge from the evaluator's actual observation. Once the data have been

collected, the evaluator attempts to draw conclusions about the impact of the program in

addressing the needs of the stakeholders.

However, this model has a weakness in its subjective measures. There are preconceived notions that the evaluator must be an expert in the respective field, while some say

no expertise is better (Rossi & Freeman, 1993). Some researchers said that an evaluator who

is not familiar with the nuances, ideologies and standards of a particular professional area

will presumably not be biased when observing and collecting data on the activities of a

program. They maintain, for example, that a person who is evaluating a program to train

dental assistants should not be a person trained in the dental profession. But other researchers

allege that a person who is not aware of the nuances, ideologies and standards of the dental

profession may miss a good deal of what is important to the evaluation. Both sides agree that

an evaluator must attempt to be an unbiased observer and be adept at observation and capable

of using multiple data collection methods (Wholey, Hatry & Newcomer, 1994). This is a

topic of debate among many experts. Scriven suggested using two goal-free evaluators, each

working independently to address the preconceived issues and reduce possible bias in

evaluation (Scriven, 1991).

A study by O'Leary (1972) illustrates the importance of considering other dimensions of the

criteria. She used a program of role-playing and group problem-solving sessions with hard-

core unemployed women. At the conclusion of the program, the trainees had developed

positive changes in attitude toward themselves. However, it also turned out that these

changes were not matched by positive attitudes toward their tedious and structured jobs.

These trainees apparently raised their levels of aspiration and subsequently sought

employment in a working setting consistent with their newly found expectations. It was

obvious that the trainees were leaving the job as well as experiencing positive changes in

attitude. However, there are many other cases in which the collection of a variety of criteria

related to the objectives is the only way to effectively evaluate the training program

(Goldstein & Ford, 2002). This has caused goal-based evaluation lost ground during the last

20 years because of the growing conviction that evaluation is actually a political process and


that the various values held in the society are not represented by an evaluative process which

implies that a high degree of consensus is possible (Bramley, 1996).

Further studies by Parlette and Hamilton (1977) rejected the classical evaluation system,

which focuses on an objective reality assumed to be equally relevant to all stakeholders, for failing to acknowledge the diversity posed by different interest groups. They suggested

"illuminative evaluation", with description and interpretation rather than with measurement

and prediction.

1.3.4 Systemic Evaluation

Systemic evaluation analyses the effectiveness of the whole system and enhances the

interfaces between the sub-systems in such a way as to increase the effectiveness of the

system. That is what the "system approach" sets out to do (Rossi & Freeman 1993). The most

comprehensive purpose of systemic evaluation is to find out to what extent training has

contributed to the business plans of various parts of the organization and consider whether the

projected benefits obtained outweigh the likely cost of training.

The main questions, which this strategy sets out to answer, are (Bramley, 1996):

Is the program reaching the target population?

Is it effective?

How much does it cost?

Is it cost effective?

These questions are used to derive facts about the evaluation, such as the size of the target population and the proportion that has attended the training, rather than opinions about whether useful learning has taken place. Effectiveness is difficult to measure as the word may

imply different meanings to different people. However, the model seems to measure quantity

rather than the quality of what is being done.

In the system analysis model, the evaluator looks at the program in a systematic manner,

studying the input, throughput and output (Rivlin, 1971).


Inputs are the elements that come into the system (i.e. clients, staff, facilities and resources).

Throughput consists of things that occur as the program operates, for example, activities,

client performance, staff performance, and adequacy of resources such as money, people and

space. Output is the result of the program, for example staff effectiveness, adequacy of activities, etc. The

evaluator mainly examines the program efficiency in light of these categories.

1.3.5 Quasi-Legal Approach

Quasi-legal evaluation operates in a court of inquiry manner. Witnesses are called to testify

and tender evidence. Great care and attention is taken to hear a wide range of evidence

(opinions, values and beliefs) collected from the program. This approach is basically used to

evaluate social programs rather than formally evaluate training or development activities.

Quasi-legal evaluation was reported to be flawed by Porter and McKibbin (1988) in the area of

management education in the USA. The substantial information received from stakeholders

was analysed by a small group of professors from a business school. The students were

basically satisfied with the qualification they had obtained and found the course worthwhile and useful. However, the researchers criticized the fact that young graduates who attend MBA courses have never worked in an organization and thus do not understand the sort of issues which should be the basic discussion material of MBA courses. A similar problem

arose with Constable and McCormick's (1987) report on the demand for and supply of

management education and training in the UK. The researchers found that judgement by

insufficiently impartial judges in the quasi-legal approach may be irrelevant, biased or

inconclusive (Bramley, 1996).

1.3.6 Art Criticism Model

In the Art Criticism Model developed by Eisner (1997), the evaluator is a qualified expert in

the nuances of the program and becomes the expert judge of the program's operation. The

success of this model depends heavily upon the evaluator's judgment. The intended outcome

may come in the form of critical reflection and/or improved standard. This model could be


used when a program wishes to conduct a critical review of its operation prior to applying for

funding or accreditation.

1.3.7 Adversary Model

In Owen's Adversary Model, the evaluator facilitates a jury that hears evidence from

individuals on particular program aspects (Madaus, Scriven & Stufflebeam, 1986). The jury

uses multiple criteria to "judge" evidence and make decisions on what has happened. This

model can be used when there are different views of what is actually happening in a program

such as arguments for and against program components.

1.3.8 Contemporary Approaches - Stufflebeam's Improvement-Oriented

Evaluation (CIPP) Model, 1971

Stufflebeam considers that the most important purpose of evaluation is not to prove but to improve (Stufflebeam & Shinkfield, 1985). The four basic types of evaluation in this model

are context (C), input (I), process (P) and product (P).

Context evaluation defines relevant environment and identifies training needs and

opportunities of specific problems. Input evaluation provides information to determine usage

of resources in the most efficient way to meet program objectives. The results of input

evaluation are often seen as policies, budgets, schedules, proposals and procedures. Process

evaluation provides feedback to individuals responsible for implementation. It is

accomplished through providing information for preplanned decisions during implementation

and describing what actually occurs. This includes reaction sheets, rating scales and content

analysis. Ultimately, product evaluation measures and interprets the attainment of program

goals. Contemporary approaches could take place both during and after the program with the

aim to improve program evaluation by expanding the scope of evaluation through its four

basic types of evaluation (Madaus, Scriven & Stufflebeam, 1986).

The CIPP model was conceptualized as a result of attempts to evaluate projects that had been

funded through the Elementary and Secondary Education Act of 1965 (Stufflebeam, 1983). To conduct

CIPP model evaluation, the evaluator needs to design preliminary plans and deal with a wide


range of choices pertaining to evaluation. This requires collaboration between clients and

evaluators as a primary source for identifying the interest of the various stakeholders.

1.3.9 Cervero's Continuing Education Evaluation, 1984

In Cervero's book titled "Effective continuing education for professionals", he suggested

seven categories of evaluation questions organized around seven criteria to determine

whether the programs were worthwhile (Cervero, 1988). The seven criteria are (a) program

design and implementation, (b) learner participation, (c) learner satisfaction, (d) learner

knowledge, skills and attitudes, (e) application of learning after the program, (f) impact of application of learning and (g) program characteristics associated with outcomes.

Program design and implementation is concerned with what was planned, what was actually

implemented and the congruence between the two. Factors such as the activities of learners

and instructors and the adequacy of the physical environment for facilitating learning are

common questions which are asked in this category.

Learner participation has both quantitative and qualitative dimensions. The quantitative

dimension deals with evaluative questions that are most commonly asked in any formal

program. The data is not used to infer answers in the other categories. Qualitative data is

collected in an anecdotal fashion by unobtrusively observing the proceedings of the

educational activities.

Learner satisfaction is concerned with the participants' reaction and is collected according to

various dimensions, such as content, educational process, instructor's performance, physical

environment and cost.

Learner knowledge, skills and attitudes focus on changes in the learner's cognitive,

psychomotor and affective goals. Normally, the evaluator will adopt a pen and paper test to

judge the effectiveness of these categories.

Application of learning addresses the degree of skill transfer to the actual workplace. The

impact of application of learning focuses on the second-order effects, which means the

transfer and impact on the public (Cervero, 1988).


Program characteristics are associated with the outcome of the program. There are two kinds

of evaluative questions: the implementation questions and the outcome questions.

Implementation questions are useful for determining what happened before and during the

program. Outcome questions are useful for determining what occurred as a result of the

program.

The seven categories in this model are not viewed as a hierarchy (Junaidah, 2001). Cervero's

ideas have several antecedents in the evaluation literature. His framework was influenced by

Kirkpatrick's (1959) and Tyler's (1949) models. It is considered to be a comprehensive

model as it covers all the stages involved, from the program design stage to the outcome stage. However, this evaluation model may be viewed as too tedious to implement due to its complexity. The author is too immersed in capturing facts about the entire

process and ignores the efficiency of the whole evaluation process. This makes the model

more summative than formative in nature.

1.3.10 The Kirkpatrick Model, 1959a, 1959b, 1960a, 1960b, 1976, 1979,

1994, 1996a, 1996b, 1998

One of the most widely used models for classifying the levels of evaluation, used by Barclays Bank PLC (Reeves, 1996) and others, was developed by Kirkpatrick. His model looks at four levels of evaluation, from the basic reaction of the participants to the training through to its impact on the organization. The intermediary levels examine what people learned from the

training and whether learning has affected their behaviour on the job. Level one (Level 1)

concerns itself with the most immediate reaction of participants and is easily measured by

simple questionnaires after the training. Level two (Level 2) is harder to measure and is

concerned with measuring what people understood and how they were able to demonstrate

their learning in the work environment. Level two (Level 2) can be measured by pen and

paper tests or through job simulations. Level three (Level 3) looks at the changes in people's

behaviour towards the job. For example, after a writing skills course, did the individual make

fewer grammatical and spelling errors and were their memos easier to understand? Level

four (Level 4) measures the "result" gained from the training. It focuses on the impact of the

training on the organization rather than the individual.


Kirkpatrick (1959) developed this coherent evaluation model by producing what was thought to be a hierarchical system of evaluations which indicates effectiveness through:

Level 1 (Reaction)

Level 2 (Learning)

Level 3 (Behaviour)

Level 4 (Results)

Kirkpatrick's (1994) Training Evaluation Model

Reaction: How did the participants react to the training?

Learning: What information and skills were gained?

Behaviour: How have participants transferred knowledge and skills to their jobs?

Results: What effect has training had on the organization and the achievement of its objectives? (Timely and quality performance appraisals are a corporate goal.)

Kirkpatrick was the first researcher to develop a coherent evaluation strategy by producing

what was thought to be a hierarchy of evaluations, which would indicate benefit (Plant &

Ryan, 1994).

Level 1: Reaction Evaluation

Kirkpatrick proposed the use of a post course evaluation form to quantify the reactions of

trainees. Evaluation at this level is associated with the terms "happiness sheet" or "smile

sheet" because reaction information is usually obtained through a participatory

questionnaire administered near or at the end of a training program (Smith, 1990).

Studies on evaluation mechanisms have shown that such evaluation sheets are not held in

high esteem, despite their general use by trainers of many organizations and in institutions

of higher learning (Bramley, 1996; Clegg, 1987; Love, 1991; Rae, 1986). Clegg (1987)

found that training evaluation was conducted for 75 percent of training programs done in


organizations. A study by Dawson (1993) found that Level 1 evaluation sheets were

ubiquitous.

Level 2 Learning Evaluation

The learning level is concerned with measuring the learning principles, facts, techniques

and skills presented in a program (Kirkpatrick, 1994). Tyler (2002) found that 32

percent of companies in America have carried out post-training evaluation on Level 2.

Another study, conducted by Mathews, Ueno, Kekale, Repka, Pereira and Silva (2001) on 450 companies in the UK, Portugal and Finland and focused on training quality and training evaluation, showed that 40 percent of UK companies, 31 percent of Finnish companies and 51 percent of Portuguese companies conduct formal assessment of learning

of the principles, facts, skills and attitudes which were specified as training objectives.

This level evaluates the knowledge, skills development and attitudinal changes that have

taken place. Examination of both knowledge and attitudinal outcomes is important to

increase coverage of training impacts because the pattern of change can vary between the

pre-test and post-test (Basadur, Graen & Scandura, 1986; Kraiger, Ford & Salas, 1993).

Researchers either assessed change before and after a program (Basadur et al., 1986;

Bretz & Thompsett, 1992), or they look merely at the post-training attainment score

(Davis & Mount, 1984; Warr & Bunce, 1995). Measures of learning should be objective,

with quantifiable indicators of how new requirements are understood and absorbed. This

data is used to confirm that participant learning has occurred as a result of the training

initiative (Phillips & Stone, 2002).

Level 3 Behavioural Evaluation

Job performance after training is referred to as behaviour by Kirkpatrick (1959, 1976) and as transfer by Alliger, Tannenbaum, Bennett, Traver and Shotland (1997). Level 3

evaluates the extent to which the "transfer" of knowledge, skills and attitudes has


occurred. Tyler (2002) reported that only 9 percent of American industries have carried out post-training evaluation at this level. The focal point is on performance at work after

a program. It is essential to record before and after performance, but sometimes self-reports are obtained if such information is unavailable to an evaluator (Wexley & Baldwin,

1986). It determines the extent of change in behaviour that has taken place and how this

behaviour would be transferred to the workplace. It further encourages one to take into

account the possible factors in the job environment that could prevent the application of

the newly learned knowledge and skills since a positive climate is important for

transfer.

Level 4 Results Evaluation

The evaluation of a particular training program becomes more complex as one progresses through each level of the Kirkpatrick model. Results can be defined as the final results that

occurred because the participants attended the training program. This includes increased

production, improved quality, increased sales and productivity, higher profits and return

on investment. Level 4 evaluation observes changes in the performance criteria (i.e. key

results area) of organizational effectiveness. This level anticipates the gains the

organization can expect from a training event. This level of evaluation is made more

difficult as organizations often demand that the explanation be given in financial terms

with measurable quantifiers (Redshaw, 2001).

In the decades since Kirkpatrick's first idea was published in 1959, much debate has been recorded on this model. Despite criticism, the Kirkpatrick model is still the most generally

accepted by academics (Blanchard & Thacker, 1999; Dionne, 1996; Kirkpatrick, 1996a;

1996b; 1998; Phillips, 1991). However, research conducted in the United States has

suggested that US organizations generally have not adopted all of Kirkpatrick's 4-level

evaluation (Geber, 1995; Holton, 1996). This is especially true for the last two, more

difficult, levels of Kirkpatrick's hierarchy (Geber, 1995). In a survey of training in the USA,

Geber (1995) reported that for companies with 100 or more employees, only 62 percent

assessed behavioural change. Geber's (1995) results also indicated that only 47 percent of

US companies assess the impact of training on organizational outcomes. This poses a good


research question about the model's methodology and it forms the basis for epistemological

studies around the methodology.

Kirkpatrick's work has received a great deal of attention within the field of training

evaluation (Alliger & Janek, 1989; Blanchard & Thacker, 1999; Campion & Campion, 1987;

Connolly, 1988; Dionne, 1996; Geber, 1995; Hamblin, 1974; Holton, 1996; Kirkpatrick,

1959; 1960; 1976; 1979; 1994; 1996a; Newstrom, 1978; Phillips, 1991). His concept calls

for four levels of evaluation, namely reaction, learning, behaviour and results. His four levels

of training effectiveness stimulated a number of supportive and conflicting models of varying

levels of sophistication (Alliger & Janek, 1989; Campion & Campion, 1987). There are

models and methods that incorporate financial analyses of training impact (Swanson &

Holton, 1999). However, Warr, Allan and Birdi (1999) conducted a longitudinal study of the

first three levels of training evaluation. The study examined the relationships between evaluation levels, individual and organizational predictors of each level, and the differential prediction of attainment versus change scores. The study showed that immediate and

delayed learning were predicted by the trainee's motivation, confidence and use of learning

strategies. The researchers highlighted that it is preferable to measure training outcomes in

terms of change from pre-test to post-test, rather than merely through attainment (post-test)

scores (Warr, Allan & Birdi, 1999).
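To make the attainment-versus-change distinction concrete, here is a minimal sketch (the notation is illustrative, not taken from Warr, Allan and Birdi): if a trainee scores $x_{\text{pre}}$ on the knowledge test at the start of the course and $x_{\text{post}}$ at the end, the attainment measure is simply $x_{\text{post}}$, whereas the change score is

$$\Delta = x_{\text{post}} - x_{\text{pre}}.$$

A trainee who already knew most of the material records a high attainment score but a change score near zero, which is why change scores are the better indicator of what the training itself contributed.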

A review of the most popular procedures used by US companies to evaluate their training

programs showed that over half (52 percent) used assessments of participants' satisfaction with the training, 17 percent assessed application of the trained skills to the job, and 13 percent evaluated changes in organizational performance following the training. Five percent tested for skill acquisition immediately after training, while 13 percent of American

companies carried out no systematic evaluation of their training programs (Mann &

Robertson, 1996). Many of these procedures reflect Kirkpatrick's four levels of reactions,

learning, behaviour and results of which will be further discussed.

More than 50 available evaluation models use the framework of the Kirkpatrick model (Phillips, 1991). Currently, the majority of employee training is evaluated at Level 1. Evaluation at

Level 1 is associated with the terms smile sheet or happiness sheet, because reaction

information is usually obtained through a participatory questionnaire administered near the

end or at the end of a training program (Smith, 1990). The specific indications of the smile


sheet or happiness sheet are enjoyment of the training, perceptions of its usefulness and its

perceived difficulty (Warr & Bunce, 1995).

Phillips and Stone (2002) enhanced the popularity of the Kirkpatrick model by inserting a fifth level into the existing 4-level model, having argued that the existing model was inadequate in capturing the return-on-investment aspect of the training outcome. Phillips and

Stone's (2002) 5-level evaluation model was seen as an extension of Kirkpatrick's 4-level

evaluation model, as different companies have their own definitions of pay-offs to measure the

training results. Return on investment compares the training's monetary benefits with the

cost of the training, so that the true value of the training to the organization can be assessed.

Converting data to monetary values is the first phase in putting training initiatives on the

same level as other investments that organizations make (Phillips, 2002). However, ROI alone cannot capture other variables that may affect the results (i.e. culture, productivity, etc.). Kirkpatrick

(1994) refuted this idea by claiming that there are many ways to measure training results.

This raises the question of whether training evaluation should be valued only in terms of financial benefits. Lewis and Thornhill (1994) are of the opinion that there should be 5 levels of

evaluation measuring the training effects on the department (i.e. Level 4) and its effects on

the whole organization (i.e. Level 5). Lewis and Thornhill (1994) emphasized the need to

look at values and organizational culture as variables for measuring training

effectiveness.
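To make the Level 5 return-on-investment idea above concrete, the usual arithmetic (a standard formulation, not quoted from Phillips and Stone) compares net monetary benefits with program costs:

$$\text{ROI (\%)} = \frac{\text{monetary benefits} - \text{program costs}}{\text{program costs}} \times 100$$

For illustration only, a course costing RM50,000 that is credited with RM80,000 of monetary benefits would show an ROI of ((80,000 - 50,000) / 50,000) x 100 = 60 percent; as noted above, the contested step is converting outcomes such as culture or productivity into those monetary values in the first place.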

In recent times others have tried to make the system easier to deal with. Warr et al. (1999)

came up with the context, input, reaction and outcome (CIRO) evaluation system with the

context part going some way towards front-loading the evaluation and partly towards mirroring the Kirkpatrick model. Dyer (1994) proposed an evaluation system that suits all

organizations, irrespective of size or diversity of operation. It is a system that is relatively

easy to come to terms with and can be implemented at all the hierarchical stages of an

organization. It fits the individual and it fits the whole organization. The system puts

Kirkpatrick's evaluation system against a mirror. The benefits of using Kirkpatrick's Mirror

should be self-evident to anyone involved in management. Application of the paradigm

allows the individual to become more business focused, and if adopted universally should

provide efficient and effective training throughout any organization (Dyer, 1994).


A different model was used in a study by Shireman (1991) on the evaluation of a hospital

based health education program. The study adopted the CIPP model in examining the type of

evaluation which was being conducted in the hospital. A structured questionnaire was sent to

a stratified random sample of 160 hospitals of four different sizes in four mid-western states.

The result showed that 48 percent of the respondents reported that product evaluations were

usually done and less than 25 percent reported that other types (i.e. context, input, process) of

evaluations were done. The product evaluation is outcome-based and quite similar to

Kirkpatrick's end process evaluation. Both types of evaluations require appropriate data

collection activities.

The Kirkpatrick model has been used by most researchers as an initial framework for generating evaluation models. This paper addresses the methodological issues surrounding the taxonomy of the Kirkpatrick model as an area for epistemological study. The theoretical and empirical literature on the Kirkpatrick model will be critically evaluated and further research opportunities

will be outlined.

1.4 Critical Review

Phillips (1991) concluded that out of more than 50 evaluation models available, the

evaluation framework that most training practitioners used is the Kirkpatrick model. Though

the model seems to have weathered well, it has also limited our thinking on training evaluation

and possibly hindered our ability to conduct meaningful training evaluation (Bernthal, 1995).

More than ever, training evaluation must demonstrate improved performance and financial

results. But in reality, according to Garavaglia (1993), training evaluation often assesses only whether the immediate objectives have been met; specifically, how many items were answered correctly on the post-test. Some base their evaluation only on trainee reaction, the first level of the Kirkpatrick model developed in 1959 (Brinkerhoff, 1988). Such information gives organizations no basis for making strategic business decisions (Davidove & Schroeder, 1992). Most practitioners are familiar with Kirkpatrick's 4-level evaluation model but many never seem to get beyond Levels 1 and 2 (Regalbutto, 1992). Numerous organizations have adapted the model presented by Kirkpatrick to suit their own situations, a solution that seems to have caused the growth of generic models (Dyer, 1994).


Kirkpatrick called for a definite approach to the evaluation model. All 4 levels must be

measured to ensure effectiveness of the whole evaluation system since each level provides

different kinds of evidence.

This view was supported by Hamblin (1974), who suggested that reaction leads to learning

and learning leads to change in behaviour, which subsequently leads to changes in the

organization. He further stated that this chain can be broken at any link and that a positive reaction is necessary to create positive learning. According to Bramley and Kitson (1994),

there is not much evidence to support this linkage. Further research carried out by Alliger

and Janek (1989) found only 12 articles which attempted to correlate the various levels

advocated by Kirkpatrick. Although there are problems of external validity with such a small data set, the tentative conclusion was that there was no relationship between reaction and the other three levels of evaluation criteria. A correlation study run on these four levels of evaluation showed non-significant results. A literature search based on Kirkpatrick's name yielded 55 articles, but only 8 described evaluation results and none described correlations between levels (Toplis, 1993). This led to the conclusion that good reactions did not

predict learning, behaviour or results.

A series of industrial surveys conducted in the last 30 years shows little application of all 4 levels of the Kirkpatrick model. Surveys conducted since 1970 showed that most industrial trainers rely on student reaction, fewer on testing learning, and almost none on testing application and benefit (Brandenburg, 1982; Plant & Ryan, 1994; Raphael & Wagner, 1972). In the last

20 years, a number of writers claimed to have performed a full Kirkpatrick evaluation;

however, the linkages described in connecting the training event with the outcome are

subjective and tenuous (Salinger & Deming, 1982; Sauter 1980).

A survey conducted by the Bureau of National Affairs and the American Society for Training and Development (ASTD) in 1969 using questionnaires indicated that most companies conducted Level 1 evaluation and took unsystematic approaches to Level 2 evaluation (Raphael & Wagner, 1972). The survey indicated that problems of evaluation at the higher levels were mainly due to a lack of understanding of the approach used. The Kirkpatrick model seems to offer a one-size-fits-all solution for measuring training effectiveness. However, there is little evidence of the model's contribution or reliability despite the great industrial emphasis in this area.


The Kirkpatrick model focuses mainly on immediate outcomes rather than on the processes leading to the results. The following questions were never successfully addressed, yet the improvement of these processes is the main driver of effectiveness (Murk, Barrett & Atchade, 2000):

- How well a person's motivation level affects learning behaviour
- The degree of superiors' support after the training
- The extent to which training interventions were appropriate for meeting needs
- The longer-term effects of the training, and the pay-off in determining a course's overall impact and cost-effectiveness
- The conduciveness of the training environment

An empirical study by Warr, Allan and Birdi (1999) showed that external processes such as increasing trainees' confidence and motivation levels, as well as the use of certain learning strategies, are important contributing factors towards training effectiveness. A 2-day training course was studied on 23 occasions over a 7-month period by researchers at the Institute of Work Psychology, UK. Technicians who attended the training courses, which involved operating electronic tools, were asked to complete a knowledge test questionnaire on arrival and at the end of the course. A follow-up questionnaire was mailed to the trainees one month later, and more than 70 percent of the respondents returned it. The questionnaire was designed to capture what the researchers defined as third factors (i.e. confidence, perception, motivation, learning strategies, age, etc.). The results showed a non-significant correlation between reactions towards the course and job behaviour. Perceptions of course difficulty were significantly negatively associated with frequency of use of the equipment. Correlations between Level 2 and Level 3 evaluations were small. Learning scores and changes in those scores (Level 2) were strongly predicted by trainees' specific reactions to the course, but those reactions were not significantly associated with later job behaviour (Level 3) (Warr, Allan & Birdi, 1999).

Alliger and Janek (1989) carried out a meta-analysis of studies in which reaction measures had been related to measures of learning (11 studies) and to changes in behaviour (9 studies). They found that positive reactions did not predict learning gains better than negative ones (the average correlation between reactions and amount of learning was .02), nor were reactions any better at predicting changes in behaviour after the program (an average correlation of .07).
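By way of illustration only (this sketch is not drawn from any of the studies cited, and every figure and variable name in it is invented), the following minimal Python example shows the kind of level-to-level correlation Alliger and Janek examined: pairing each trainee's Level 1 reaction rating with a Level 2 learning-gain score and computing a Pearson correlation.

```python
# Hypothetical sketch: correlating Kirkpatrick Level 1 (reaction) scores with
# Level 2 (learning gain) scores for a small group of trainees.
# All values are invented for illustration only.
from statistics import correlation  # Pearson's r; available in Python 3.10+

reaction_scores = [4.2, 3.8, 4.5, 2.9, 3.6, 4.8, 3.1, 4.0]  # end-of-course reaction ratings (1-5 scale)
learning_gains = [12, 5, 9, 7, 4, 15, 6, 10]                # post-test minus pre-test scores

r = correlation(reaction_scores, learning_gains)
print(f"Correlation between reaction and learning gain: {r:.2f}")
```

A value near zero, as the meta-analysis reported, would indicate that favourable reactions tell an evaluator little about how much was actually learned.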

Bramley and Kitson (1994) asserted that measuring learning is problematic because designing

a reliable measuring instrument is difficult and the necessary skills are often not available.

Grove and Ostroff (1990) pointed out that training directors often do not possess the essential

skills to conduct training evaluation. This could be part of the reason why companies are

reluctant to evaluate their training effectiveness.

Though Kirkpatrick's traditional assessment methods are widely used for Level 1 and Level 2 evaluations, the benefits of collecting data at each level are unclear. This uncertainty may result in organizations failing to evaluate training completely or selecting forms of evaluation that may not be reliable. The inadequacy of the Kirkpatrick model at each level forces one to look for other possible measures. Therefore, one may argue that to make the Kirkpatrick model definitive, a more detailed assessment method must be conducted at each level to ensure practicality, validity and applicability (Mann & Robertson, 1996).

Mann and Robertson (1996) undertook to investigate the utility of various methods used in

evaluating training programs. Twenty-nine subjects were selected from a three-day training

seminar for the European National Run in Geneva, Switzerland. The seminar was a computer

training event (on e-mail and the Internet) for youth workers, and trainees were asked to

complete training evaluation forms before and after the training program and by post one

month later. Sixteen people returned this final questionnaire. Each questionnaire contained

three sets of questions designed to measure knowledge, attitudes and self-efficacy. The results cast doubt on the value of the data obtained at the reaction and learning levels. Recommendations were made based on the following findings:

- Measuring learning (Level 2) as a method of evaluating training effectiveness is important. The study showed that not all of what is learned immediately after training is retained one month later. This means the practitioner should be aware that training effectiveness may be short-term.
- To ensure a more realistic evaluation at Level 2, one must be cautious about the pre- and post-course evaluation method proposed by Kirkpatrick. The time frame for learning to take place was never specified, so an appropriate measuring model is necessary to determine the extent to which learning has taken place. In other words, the Kirkpatrick model lacks longitudinal considerations.
- Measuring changes in learning through data collection as prescribed by Kirkpatrick (in absolute terms) had no value in predicting how well a person could perform the skills attained from the training after a one-month period.
- A positive attitude bears no relation to how well a person can perform a trained task after a month. Reaction evaluations that show a positive attitude have no direct linkage to performance.
- However, individual self-efficacy did not decrease over time, and empirical studies have shown that self-efficacy correlates with actual performance (Kraiger, Ford & Salas, 1993). One might therefore look at the possibility of measuring self-efficacy instead of relying on reaction evaluation; in other words, self-efficacy offers more tangible results than reaction evaluation.

The reason for the Kirkpatrick model's failure at Level 3 and Level 4 evaluation is the lack, since its first introduction 40 years ago, of a defined framework and of specific tools appropriate for measuring the transfer of learning. It is necessary, at the most basic level, to have a body of case studies from which generalizations can be drawn and hypotheses formed. However, this body of information has not been published (Bramley & Kitson, 1994).

The issue here is whether or not the knowledge taught during training is being transferred or

demonstrated by the trainees on the job. The transfer component of training evaluation was

examined by Olsen (1998) in a study conducted in 1996. Transfer is evidence of whether

what has been learned is actually being used on the job for which it was intended.

The survey asked questions regarding how Kirkpatrick's 4-level evaluations were performed, what percentage of payroll was spent on training, how much training was actually transferred to the job, and what specific items would enhance the level of transfer. A content analysis was carried out on the 138 survey comments received on how the respondents made estimates of the percentage of transfer they reported. Follow-up interviews were also undertaken to provide additional clarification on responses and to record impressions and opinions about the data collection. The results showed that the percentage of transfer depended on the type of training: technical training showed the best rate of transfer, while soft (interpersonal) skills do not transfer as readily and are not easily observed. Transfer is not so readily apparent in the affective work areas (Olsen, 1998).

Bramley (1996) offered an explanation of why evaluation is not being carried out at the behaviour and results levels. Traditionally, most trainers use individual and educational models of the training process. This approach has its limitations, as the emphasis is on encouraging individuals to learn something rather than on finding uses (if any) for the learning.

1.5 Future Research

Bramley and Kitson (1994) argued that the problems of evaluation at Levels 3 and 4 are not well understood because not enough evaluation of this kind has been carried out. This is because effective measurement methods for Levels 3 and 4 are not available and setting up the criteria for measuring these two levels is time-consuming. It is apparent that the incompleteness of the Kirkpatrick model lies in its Level 3 and Level 4 evaluations.

1.5.1 The Transfer Component

The transfer component is a potential area for future research. Transfer of training can be defined as 'the application of knowledge, skills and attitudes learned from training on the job and the subsequent maintenance of them over a certain period of time' (Baldwin & Ford, 1988; Xiao, 1996). This process does not appear to have received much attention, since most organizations were apparently looking primarily at Level 1 and Level 2 evaluations. Early studies lacked a theoretical framework to guide these investigations (Baldwin & Ford, 1988).


A survey conducted by Cheng and Ho (1998) revealed inconsistent findings on the variables that promote positive training transfer. The main intention of further

research is to develop common variables that are critical to different training and transfer

situations, including the establishment of common scales or instruments that can be used in

different research settings.

The current approach, which uses variables such as individual ability, motivation and environmental favourability, has had a profound effect on training transfer research (Noe & Schmitt, 1986). However, this approach raises the question of application, because individual differences (e.g. self-efficacy and locus of control) are expected to exert considerable influence on transfer outcomes (Cheng & Ho, 1998).

A longitudinal study would be a better way of measuring the effectiveness of the transfer of learning. It is argued that trainees who show similar levels of transfer performance after a short period of training may differ substantially in the long run (Kraiger, Ford & Salas, 1993).

Therefore, another major aspect of transfer research is to examine the level of newly acquired

knowledge, skills or behaviour retained in the transfer settings after a longer period of time.

For example, research should record the changes in terms of levels of skill proficiency as a

function of time after training.

1.5.2 Evaluating Beyond the 4 Levels

Considering the above studies, an effective evaluation should measure beyond the aspects of reaction, learning, behaviour and results. Lewis and Thornhill (1994) suggested that an effective training evaluation needs to be integrated with, and matched to, the culture of the organization. This integrated, culturally based approach is advocated because it can minimize the risk of not meeting the objectives of carrying out training at the input stage, as well as evaluating reactions and impact at the outcome stage. It also brings a more strategic approach to identifying and prioritizing training needs in relation to organizational objectives.


To justify training evaluation results, we may consider Brinkerhoff's (1987) criticism of the Kirkpatrick model, which concentrates only on the outcomes of training. This is further supported by Bernthal (1995), who found it necessary to look for a broader linkage between training and the organizational context. Bernthal (1995) introduced the training-impact tree method for measuring organizational context. This is done by listing the barriers to training and the factors that facilitate training next to their associated values and practices, which are aligned with the organizational objectives.

Although the Kirkpatrick model focuses on the attainment of tangible outcomes, it is important to note that the question of measuring intangible outcomes related to training effectiveness must not be ignored. Kirkpatrick (1994) revisited his 4-level evaluation model and stated that as long as the evidence collected is beyond a reasonable doubt, one should be satisfied with it. Perhaps an experienced training practitioner may want to explore the possibility of integrating the absolute 4-level evaluation model with other process models. As a result, the gap that exists between short- and long-term measures of training evaluation may be minimized. Future research may be built on deriving an integrated model that complements both absolute and process evaluations of training effectiveness.

1.5.3 Incorporating Competence-based Approach into Training Evaluation

The aim of future research is to develop a comprehensive training evaluation approach by combining the absolute Kirkpatrick model with a competence-based process. The

competence-based assessment system could be used in collecting sufficient evidence to

determine whether individuals are performing competently in their jobs.

Strebler, Robinson and Heron (1997) identified two different meanings of the term competency: behaviours that an individual needs in order to perform a job, and minimum standards of performance. The term competency has been used here to refer to both of these meanings, behaviours and performance standards. Competence-based assessment is helpful in providing a behaviourist framework for learning in training evaluation. A behaviourist approach to learning provides simpler tasks for the trainer and clarity of outcome for the learner (Hoffmann, 1999). Another definition of competencies is the quality of outcomes, which may be used to evaluate gains in productivity or efficiency in the workplace as a result of training (Strebler et al., 1997).

Further research by Sternberg and Kolligian (1990) defined competency as the underlying attributes of a person, such as their knowledge, skills or abilities. The use of this definition created a focus on the inputs required of individuals in order for them to produce competent performances. This is aligned with the traditional training evaluation approach of measuring a person's knowledge, skills and abilities after training. Rowe (1995) suggested that competence-based assessment, which evaluates the whole process of learning, should consist of:

- Objective: the trainer should exhibit clear learning objectives and methods for obtaining those objectives.
- Evidence: evidence must be provided to indicate competent performance.
- Observation: an assessor looks out for competent performance.
- Peers' comments: comments are obtained from work colleagues, peers and customers.

The key point is that a competence-based model supplements knowledge-based achievement. Programs can be designed so that competence-based models build on knowledge-based achievement. In this way knowledge supports work, learning supports skill and theory supports practice (Rowe, 1995).

The competence-based method would be able to assess whether knowledge and skills learned

are being effectively applied in the workplace and whether the trainee can now be described

as competent after completion of a training program.

This integrated model could also be used prior to designing a training program in order to

establish development needs and to determine training program content.


1.5.4 Multi-Rater Feedback System in Training Evaluation

There does not appear to be a single individual who founded or invented this process and, according to Moses, Hollenbeck and Sorcher (1993), the term multi-rater feedback is misleading as it suggests a newly discovered concept, whereas perceptions of people have been available for as long as there have been people to observe them.

Nowack (1993) presents a useful summary of some of the reasons for the increased use of

multi-rater feedback in organizations:

- The need for a cost-effective alternative to assessment centers;
- The increasing availability of assessment software capable of summarizing data from multiple sources into customized feedback reports;
- The need for continuous measurement of improvement efforts;
- The need for job-related feedback for employees affected by career plateauing; and
- The need to maximize employee potential in the face of technological change, competitive challenges and increased workforce diversity.

From the organizational perspective, multi-rater feedback can be used solely for

developmental purposes. Romano (1994) and Atwater et al. (1993) found that the most

common use is in the area of training and development. The overall net effect of training and development should be to enhance organizational performance.

From the individual perspective, the feedback is invaluable because it comes from numerous

sources, providing multiple perspectives and opinions. Each opinion and perspective may

provide relevant yet different feedback (Atwater et al., 1993; Hazucha et al., 1993; Tornow, 1993). This form of feedback can increase the reliability, fairness and acceptance of the data by the person being rated (London, Wohlers & Gallagher, 1990). This occurs because the feedback is received from multiple sources and not just from a single rater.

One of the advantages of using multi-rater feedback is that it provides the opportunity for

individuals who are being assessed to compare their self perceptions against the perceptions

of others regarding their behaviour (Rosti & Shipper, 1998).


The difference in perspective between the rater and the ratee is not treated as an error but is a

source of information which can enhance personal learning. Ratees can learn from the

discrepancy between their self-ratings and the ratings of others.

The use of multi-rater feedback provides a natural method for both enhancing participants' learning and improving the evaluation process. Feedback is seen as a critical element in effecting change (Bennis, Benne & Chin, 1969). Multi-rater feedback could be used to serve as an unfreezing process in Lewin's (1948) model of change. This would enhance the ratee's learning by creating doubt about the ratee's current performance standard and providing an opportunity for prospective development. Most training evaluation models emphasize the absolute outcome of training. Multi-rater feedback, however, engages the change process, in which the resultant behaviour involves reinforcement of past performance while also providing an opening for future learning. Thus, collecting multi-rater feedback before and after training will enhance learning and provide at least part of the data needed to evaluate training.

Moses et al. (1993) provided the following criticisms of multi-rater feedback:

- It relies on generalized traits, as there is a limited or non-existent frame of reference for making rater/observer judgments.
- It is based on an individual's memory, which often yields incomplete descriptions of past performance.
- The observer may be unable to interpret behaviours.
- It relies on the instrument designers' scoring system, factor analysis or data collection methods to interpret the information for the participant.

The main argument of Moses et al. (1993) is that multi-rater feedback is based on other people's observations and that such observations are often incomplete descriptions of past performance because the observer does not know what to look for. The unresolved issue is what behaviours to study. Multi-rater feedback has been used to identify the behaviours of effective management, but there is a lack of sufficient definitional detail to study managerial proficiency or the effectiveness of training (Morrison & McCall, 1978; Schriesheim & Kerr, 1977). Yukl (1994) argued that further refinement of these constructs is needed by identifying


the specific skills which make up each construct. Hence, developing each construct and establishing its validity is important prior to training.

Multi-rater feedback is widely used in managerial and leadership development programs (Cacioppe, 1998; Cacioppe & Albrecht, 2000; Garavan, Morley & Flynn, 1997; McCauley & Moxley, 1996; Thach, 2002). However, its usage in other fields needs further research and exploration. This is further supported by Rosti and Shipper (1998) in their study on the impact of training in a management development program based on multi-rater feedback.

1.6 Conclusion

It is widely acknowledged that the Kirkpatrick evaluation model has provided the most basic thinking on training evaluation throughout this decade. However, industry's application of Kirkpatrick's 4-level evaluation model appears to be incomplete. No significant success has been identified from the use of the 4-level evaluation model by the majority of organizations that have conducted training evaluations.

Based on this literature review, it may be concluded that the Kirkpatrick model has not reached a stage of clarity that allows in-depth training evaluation to be carried out. The model provides training managers with a systematic idea of what training evaluation is; however, the training measurement methods were not well explored or detailed.

While training has been conceptualized as a continually evolving process, the existing

literature appears to have failed to provide adequate strategies for organizations wanting to

evaluate the immediate, as well as the long-term, effectiveness and value of their training

efforts.

At face value, the literature suggests that the full Kirkpatrick evaluation strategy is being widely applied; however, more detailed analysis found that none of the reported applications were able to demonstrate Level 4 evaluation, and of those that claimed evaluation at Levels 2 or 3, none were able to demonstrate a systematic approach to the problem.


Arguably, the dilemma in adopting Kirkpatrick's taxonomy as a comprehensive and integrated approach to evaluation lies in both the qualitative and quantitative attempts, which may or may not provide good phenomenological studies. Further analysis of the method shows considerable confusion as to what is, or is not, a valid indicator for evaluation. Clearly, there has been little change in the level of confidence in the reliability of training evaluation, notwithstanding greater emphasis on this key organizational development process.

The weaknesses of the Kirkpatrick model have opened opportunities for future research on incorporating competencies and a multi-rater feedback approach into the long-term evaluation of training. These weaknesses have also opened up opportunities for further research into the transfer of learning, especially studies of its longitudinal and application effects.

1.7 References for Paper One

Alliger, G.M. & Janek, E.A. 1989, 'Kirkpatrick's levels of training criteria: thirty years later', Personnel Psychology, vol. 42, pp. 331-342.
Alliger, G.M., Tannenbaum, S.I., Bennett, W., Traver, H. & Shotland, A. 1997, 'A meta-analysis of the relations among training criteria', Personnel Psychology, vol. 50, pp. 341-358.
Atwater, L., Roush, P. & Fishthal, A. 1993, The Impact of Upward Feedback on Self and Follower Ratings of Leaders, Centre for Creative Leadership, New York.
Baldwin, T.T. & Ford, J.K. 1988, 'Transfer of training: a review and directions for future research', Personnel Psychology, vol. 41, pp. 63-105.
Basadur, M., Graen, G.B. & Scandura, T.A. 1986, 'Training effects on attitudes toward divergent thinking among manufacturing engineers', Journal of Applied Psychology, vol. 71, pp. 612-617.
Bennis, W.G., Benne, K.D. & Chin, R. 1969, The Planning of Change, 2nd edn, Holt, Rinehart & Winston, New York.
Bernthal, P.R. 1995, 'Education that goes the distance', Training and Development, vol. 49, no. 9, p. 41.


Blanchard, P.N. & Thacker, J.W. 1999, Effective Training: Systems, Strategies and Practices, Prentice Hall Publisher, New Jersey.
Blanchard, P.N., Thacker, J.W. & Way, S.A. 2000, 'Training evaluation: perspectives and evidence from Canada', International Journal of Training and Development, vol. 4, no. 4, pp. 295-303.
Boulmetis, J. & Dutwin, P. 2000, The ABCs of Evaluation: Timeless Techniques for Program and Project Managers, Jossey-Bass Publisher, San Francisco.
Boyle, P.G. & Jahns, I. 1970, 'Program development and evaluation', in Handbook of Adult Education, eds Smith, R.M., Aker, G.F. & Kidd, J.E., Macmillan Company, New York, p. 70.
Bramley, P. & Kitson, B. 1994, 'Evaluating training against business criteria', Journal of European Industrial Training, vol. 18, no. 1, pp. 10-14.
Bramley, P. 1996, Evaluating Training Effectiveness, McGraw-Hill, Maidenhead and New York.
Brandenburg, D. 1982, 'Training evaluation: what is the current status?', Training and Development Journal, pp. 14-19.
Bretz, R.D. & Thompsett, R.E. 1992, 'Comparing traditional and integrative learning methods in organizational training programs', Journal of Applied Psychology, vol. 77, pp. 941-951.
Brinkerhoff, R.O. 1987, Achieving Results from Training, Jossey-Bass Publisher, San Francisco.
Brinkerhoff, R.O. 1988, 'An integral evaluation model for human resource development', Training and Development Journal, vol. 42, no. 2, pp. 66-68.
Brown, K.G., Werner, M.N., Johnson, L.A. & Dunne, J.T. 1999, Formative evaluation in Industrial/Organization Psychology: further attempts to broaden training evaluation, presented at a symposium on training evaluation: advances and new directions for research and practice, Society of Industrial and Organizational Psychology, Atlanta.
Cacioppe, R. 1998, 'An integrated model and approach for the design of effective leadership development programs', Leadership and Organization Development Journal, vol. 19, no. 1, pp. 44-53.
Cacioppe, R. & Albrecht, S. 2000, 'Using 360-degree feedback and the integral model to develop leadership and management skills', Leadership and Organization Development Journal, vol. 21, no. 8, pp. 390-404.
Campbell, J.P. 1988, Training Design for Performance Improvement, in Productivity in Organizations, eds Campbell, J.P. & Campbell, R.J., Jossey-Bass Publisher, San Francisco.


Cascio, W.F. 1989, Using utility analysis to assess training outcomes, in Training and Development in Organizations, ed. I.L. Goldstein, Jossey-Bass, San Francisco.
Cervero, R.M. 1988, Effective Continuing Education for Professionals, Jossey-Bass Publisher, San Francisco.
Campion, M.A. & Campion, J.E. 1987, 'Evaluation of an interview skills training program in a natural field setting', Personnel Psychology, vol. 40, no. 4, pp. 675-91.
Chen, H.T. & Rossi, P.H. 1992, Using Theory to Improve Program and Policy Evaluations, Greenwood Press, Westport, CT.
Cheng, E. & Ho, D. 1998, 'The effects of some attitudinal and organizational factors on transfer outcome', Journal of Managerial Psychology, vol. 13, no. 5/6, pp. 309-317.
Clegg, W.H. 1987, 'Management training evaluation: an update', Training and Development Journal, vol. 41, no. 2, pp. 65-71.
Connolly, M.S. 1988, 'Integrating evaluation, design and implementation', Training and Development Journal, vol. 42, no. 2, pp. 20-23.
Constable, J. & McCormick, R. 1987, The Making of British Managers, BIM, CBI, London.
Davidove, A.E. & Schroeder, P.A. 1992, 'Demonstrating ROI of training', Training and Development Journal, vol. 46, no. 8, pp. 70-71.
Davis, B.L. & Mount, M.K. 1984, 'Effectiveness of performance appraisal training using computer assisted instruction and behaviour modeling', Personnel Psychology, vol. 37, pp. 439-452.
Dawson, R.P. 1993, Model of evaluations of equal opportunities training in local government with special reference to women, unpublished PhD thesis, South Bank University, London.
Dionne, P. 1996, 'The evaluation of training activities: a complex issue involving different stakes', Human Resource Development Quarterly, vol. 7, pp. 279-86.
Dyer, S. 1994, 'Kirkpatrick's mirror', Journal of European Industrial Training, vol. 18, no. 5, pp. 31-32.
Eisner, E.W. 1997, The Enlightened Eye: Qualitative Inquiry and the Enhancement of Educational Practice, 2nd edn, Merrill, New York.
Garavaglia, L.P. 1993, 'How to ensure transfer of training', Training & Development Journal, vol. 47, no. 10, pp. 63-68.
Garavan, T.N., Morley, M. & Flynn, M. 1997, '360-degree feedback: its role in employee development', Journal of Management Development, vol. 16, no. 2, pp. 134-147.


Geber, B. 1995, 'Does your training make a difference? Prove it!', Training and Development Journal, vol. 3, pp. 27-34.
Goldstein, L.I. 1986, Training in Organizations: Needs Assessment, Development and Education, Cole Publishing Company, California.
Goldstein, L.I. & Ford, J.K. 2002, Training in Organizations: Needs Assessment, Development and Evaluation, Thomson Learning, Wadsworth, Canada.
Grove, E.A. & Ostroff, C. 1990, Program evaluation, in Developing Human Resources, eds Wexley, K. & Hinnicks, J., BNA Books, Washington D.C.
Hamblin, A.C. 1974, Evaluation and Control of Training, McGraw-Hill Publisher, New York.
Hazucha, J.F., Hezlett, S.A. & Schneider, R.J. 1993, 'The impact of 360-degree feedback on management skills development', Human Resource Management, vol. 32, pp. 325-351.
HMSO 1989, Training in Britain: A Study of Funding, Activity and Attitudes, Her Majesty's Stationery Office, London.
Hoffmann, T. 1999, 'The meanings of competency', Journal of European Industrial Training, vol. 23, no. 6, pp. 275-285.
Holton, E.F. III 1996, 'The flawed four-level evaluation model', Human Resource Development Quarterly, vol. 7, pp. 5-21.
Junaidah, H. 2001, 'Training evaluation: clients' roles', Journal of European Industrial Training, vol. 25, no. 7, pp. 374-379.
Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction', Journal of American Society for Training and Developing, vol. 13, pp. 3-9.
Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning', Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.
Kirkpatrick, D.L. 1960a, 'Techniques for evaluating training programs: part 3 - behaviour', Journal of American Society for Training and Developing, vol. 14, no. 1, pp. 13-18.
Kirkpatrick, D.L. 1960b, 'Techniques for evaluating training programs: part 4 - results', Journal of American Society for Training and Developing, vol. 14, no. 2, pp. 28-32.
Kirkpatrick, D.L. 1976, Evaluation of Training, Training and Development Handbook: A guide to human resource development, 2nd edn, Craig, R.L.O., McGraw-Hill Publisher, New York.
Kirkpatrick, D.L. 1979, 'Techniques for evaluating training programs', Training and Development Journal, vol. 33, pp. 78-92.


Kirkpatrick, D.L. 1994, Evaluating Training Programs: The Four Levels, Berrett-Koehler Publishers, San Francisco.
Kirkpatrick, D.L. 1996a, 'Great ideas revisited', Training and Development Journal, vol. January, pp. 54-59.
Kirkpatrick, D.L. 1996b, 'Invited reaction: reaction to Holton article', Human Resource Development Quarterly, vol. 7, pp. 23-24.
Kirkpatrick, D.L. 1998, Evaluating Training Programs: The Four Levels, Berrett-Koehler Publishers, San Francisco.
Kraiger, K., Ford, J.K. & Salas, E. 1993, 'Application of cognitive, skill-based and affective theories of learning outcomes to new methods of training evaluations', Journal of Applied Psychology, vol. 78, no. 2, pp. 311-328.
Legge, K. 1984, Evaluating Planned Organizational Change, Academic Press, London.
Lewin, K. 1948, Resolving Social Conflicts, Harper & Bros Publishers, New York, NY.
Lewis, P. & Thornhill, A. 1994, 'The evaluation of training: an organizational culture approach', Journal of European Industrial Training, vol. 18, no. 8, pp. 25-32.
London, M., Wohlers, A.J. & Gallagher, P. 1990, '360-degree feedback surveys: a source of feedback to guide management development', Journal of Management Development, vol. 9, pp. 17-31.
Love, A.J. 1991, Internal Evaluation: Building Organizations From Within, Sage Publication, California, CA.
Madaus, G.F., Scriven, M.S. & Stufflebeam, D.L. 1986, Evaluation Models: Viewpoints on Educational and Human Services Evaluation, Kluwer-Nijhoff Publishing, Boston.
Mann, S. & Robertson, I.T. 1996, 'What should training evaluation evaluate?', Journal of European Industrial Training, vol. 20, no. 9, pp. 14-20.
Mathieu, J.E. & Leonard, R.L. Jr. 1987, 'Applying utility concepts to a training program in supervisory skills: a time-based approach', Academy of Management Journal, vol. 30, pp. 316-335.
Mathews, B.P., Ueno, A., Kekale, T., Repka, M., Pereira, Z.L. & Silva, G. 2001, 'Quality training: needs and evaluation - findings from a European survey', Total Quality Management, vol. 12, no. 4, pp. 483-490.
McCauley, C.D. & Moxley, R.S. Jr. 1996, Developmental 360: How Feedback Can Make Managers More Effective, Jossey-Bass Publisher, San Francisco.
Morrison, A.M. & McCall, J.D. 1978, Feedback to Managers: A Comprehensive Review of Twenty-four Instruments, Centre for Creative Leadership, Greensboro, NC.


Morrow, C.C., Jarrett, M.Q. & Rupinski, M.T. 1997, 'An investigation of the effect and economic utility of corporate-wide training', Personnel Psychology, vol. 50, pp. 91-119.
Moses, J., Hollenbeck, G.P. & Sorcher, M. 1993, 'Other people's expectations', Human Resource Management, vol. 32, Summer/Fall.
Murk, P., Barrett, A. & Atchade, P. 2000, 'Diagnostic techniques for training and education: strategies for marketing and economic development', Journal of Workplace Learning, vol. 12, no. 7, pp. 296-306.
Noe, R.A. & Schmitt, N. 1986, 'The influence of trainee attitudes on training effectiveness: test of a model', Personnel Psychology, vol. 39, pp. 497-523.
Noe, R.A. 2000, Employee Training and Development, McGraw-Hill Publisher, New York.
Nowack, K. 1993, '360-degree feedback: the whole story', Training and Development Journal, vol. 47, no. 1, pp. 69-73.
Newstrom, J.W. 1978, 'The problem of incomplete evaluation of training', Training and Development Journal, vol. 32, no. 11, pp. 22-24.
O'Leary, V.E. 1972, 'The Hawthorne effect in reverse: effects of training and practice on individual and group performance', Journal of Applied Psychology, vol. 56, pp. 491-494.
Olsen, J.H. Jr. 1998, 'The evaluation and enhancement of training transfer', International Journal of Training and Development, vol. 2, no. 1, pp. 61-75.
Parlette, M. & Hamilton, D. 1977, 'Evaluation as a new approach to the study of innovative programmes', in Beyond the Numbers Game, eds Hamilton, D. et al., Macmillan, London.
Phillips, J.J. 1991, Handbook of Training Evaluation and Measurement Methods, Gulf Publishing Company, Houston, TX.
Phillips, J.J. 2002, Return on Investment in Training and Performance Improvement Programs, 2nd edn, Butterworth-Heinemann, Woburn, MA.
Phillips, J.J. & Stone, R.D. 2002, How to Measure Training Results: A Practical Guide to Tracking the Six Key Indicators, McGraw-Hill Publisher, New York.
Plant, R.A. & Ryan, R.J. 1994, 'Who is evaluating training?', Journal of European Industrial Training, vol. 18, no. 5, pp. 27-30.
Popham, W.J. 1974, Evaluation in Education: Current Applications, McCutchan, Berkeley, California.
Porter, L. & McKibbin, L. 1988, Future of Management Education and Development: Drift or Thrust into the 21st Century?, McGraw-Hill Publisher, New York.


Provus, M. 1971, Discrepancy Evaluation, McCutchan, Berkeley, California.
Rae, L. 1986, How to Measure Training Effectiveness, Gower Publications, Aldershot, London.
Raphael, M. & Wagner, E. 1972, 'Training surveys surveyed', Training and Development Journal, vol. 26, pp. 10-14.
Redshaw, B. 2001, 'Evaluating organizational effectiveness', Measuring Business Excellence, vol. 5, no. 1, pp. 16-18.
Regalbutto, G.A. 1992, 'Targeting the bottom line', Training and Development Journal, vol. 46, no. 4, pp. 29-32.
Rivlin, A.M. 1971, Systematic Thinking for Social Action, Brookings Institution, Washington.
Romano, C. 1994, 'Conquering the fear of feedback', Human Resource Focus, vol. 71, no. 3.
Rossi, P.H. & Freeman, H.E. 1993, Evaluation: A Systematic Approach, 5th edn, Sage Publication, California.
Rosti, R.T. Jr. & Shipper, F. 1998, 'A study of the impact of training in a management development program based on 360 feedback', Journal of Managerial Psychology, vol. 13, no. 1/2, pp. 77-89.
Rowe, C. 1995, 'Incorporating competence into the long term evaluation of training and development', Industrial and Commercial Training, vol. 27, no. 2, pp. 3-9.
Salinger, R. & Deming, R. 1982, 'Practical strategies for evaluating education', Training and Development Journal, vol. 4, pp. 20-29.
Sauter, J. 1980, 'Purchasing public sector executive development', Training and Development Journal, vol. 34, no. 4, pp. 92-98.
Schriesheim, C.A. & Kerr, S. 1977, 'Theories and measurement of leadership: a critical appraisal of present and future directions', in Leadership: The Cutting Edge, eds Hunt, J.G. & Larson, L.L., Southern Illinois University Press, Carbondale, IL.
Scriven, M. 1991, Evaluation Thesaurus, Sage Publication, Newbury Park, California.
Shadish, W.R. & Epstein, R. 1987, 'Patterns of program evaluation practice among members of the evaluation research society and evaluation network', Evaluation Review, vol. 11, no. 5, pp. 555-590.
Shadish, W.R. & Reichardt, C.S. 1987, 'Evaluation studies', Evaluation Review, vol. 12, pp. 13-30.


Shireman, J.A.R. 1991, Utilization of program evaluation for decision making regarding hospital based patient/client focused health education programs, doctoral dissertation, University of Iowa, Dissertation Abstracts International, 52/12A, AA C9212928.
Smith, A.J. 1990, 'Evaluation of management training: subjectivity and the individual', Journal of European Industrial Training, vol. 14, no. 1, pp. 12-15.
Stake, R. 1977, 'Responsive evaluation', in Beyond the Number Game, eds Hamilton, D., Jenkins, D., King, C., MacDonald, B. & Parlett, H.M., Macmillan, London.
Steel, S. 1970, 'Program evaluation: a broader definition', Journal of Extension, vol. 13, pp. 13-20.
Sternberg, R. & Kolligian, J. Jr. 1990, Competence Considered, Yale University Press, New Haven, CT.
Strebler, M., Robinson, D. & Heron, P. 1997, 'Getting the best out of your competencies', Institute of Employment Studies, University of Sussex, Brighton.
Stufflebeam, D.L. 1971, Education Evaluation: Decision Making, by the PDK National Study Committee on Education, Itasca, Ill.: F.E. Peacock Publisher Inc, Boston.
Stufflebeam, D.L. 1983, 'The CIPP model for program evaluation', in Evaluation Models, eds Madaus, G.F., Scriven, M.S. & Stufflebeam, D.L., Kluwer-Nijhoff Publishing, Boston, pp. 117-141.
Stufflebeam, D.L. & Shrinkfield, J.A. 1985, Systematic Evaluation, Kluwer-Nijhoff Publishing, Boston.
Swanson, R.A. & Holton, E.F. 1999, Results: How to Assess Performance, Learning and Perceptions in Organizations, Berrett-Koehler Publishers, San Francisco.
Tesoro, F. 1998, 'Implementing an ROI measurement process at Dell Computer', Performance Improvement Quarterly, vol. 11, pp. 103-114.
Thach, E.C. 2002, 'The impact of executive coaching and 360-feedback on leadership effectiveness', Leadership and Organization Development Journal, vol. 23, no. 4, pp. 205-214.
Toplis, J. 1993, 'Training evaluation: reflections on the first steps', European Work Organization Psychology, vol. 2, no. 2, pp. 146-152.
Tornow, W.W. 1993, 'Perceptions or reality: is multiple-perspective measurement a means or an end?', Human Resource Management, vol. 32, no. 2 & 3, pp. 209-408.
Tyler, R.W. 1949, Basic Principles of Curriculum and Instruction, University of Chicago Press, Chicago.
Tyler, R.W. 2002, 'Evaluating evaluations', Human Resource Magazine, vol. June, pp. 85-93.


Warr, P. & Bunce, K. 1995, 'Employee age and voluntary development activity', International Journal of Training and Development, vol. 2, pp. 190-204.
Warr, P., Allan, C. & Birdi, K. 1999, 'Predicting three levels of training outcome', Journal of Occupational and Organizational Psychology, vol. 72, pp. 351-375.
Wexley, K.N. & Baldwin, T.T. 1986, 'Post-training strategies for facilitating positive transfer: an empirical exploration', Personnel Psychology, vol. 29, pp. 503-520.
Wholey, J.S., Hatry, H.P. & Newcomer, K.E. 1994, Handbook of Practical Program Evaluation, Jossey-Bass Publisher, San Francisco.
Xiao, J. 1996, 'The relationship between organizational factors and the transfer of training in the electronics industry in Shenzhen, China', Human Resource Development Quarterly, vol. 7, no. 1, pp. 55-73.
Yukl, G.A. 1994, Leadership in Organizations, 2nd edn, Prentice Hall Publisher, Englewood Cliffs, New Jersey.


Research Paper 2

EVALUATING TRAINING EFFECTIVENESS: AN EMPIRICAL STUDY OF KIRKPATRICK MODEL OF EVALUATION IN THE MALAYSIAN TRAINING ENVIRONMENT FOR THE MANUFACTURING SECTOR

Lim Guan Chong
Master of Business Administration (Finance)

University of Hull

International Graduate School of Management
University of South Australia


Evaluating Training Effectiveness: An Empirical Study of Kirkpatrick Model of Evaluation in the Malaysian Training Environment for the Manufacturing Sector

Lim Guan Chong
International Graduate School of Management

University of South Australia

2.1 Abstract

This research adopted an empirical approach to track the history, rationale, objectives and the

implementation of training evaluation initiatives in Malaysia's manufacturing sector. Since

the establishment of the Human Resource Development Fund, training activities in Malaysia

have increased. The majority of Malaysian organizations that conduct training are doubtful about how training activities add value to organizational performance and justify their training investment. This research provides an understanding of the training evaluation culture within the Malaysian manufacturing sector and of the effectiveness of Kirkpatrick's 4-level evaluation model as applied to that sector.

2.2 Introduction

The Malaysian government is committed to education, training and human resource development. The government recognizes the importance of human resource development in its quest to achieve fully developed nation status. This commitment has translated into the establishment and growth of training practice in the country.

Previously the sole provider of training, the government has adopted a policy of involving private enterprises in all aspects of training. Training has become crucial to the development of capital-intensive and value-added industries. Apart from involving enterprises to make training more market-driven, there is a need for enterprises to share the burden of training. In the Seventh Malaysia Plan, the private sector was expected to play a more active role in upgrading the qualifications and skills of its workers (Junaidah, 2001).

2.3 Training Practices in Malaysia

Training activities within Malaysian companies lag behind those in countries such as Singapore, Japan and Korea, and are mainly conducted by large multinational companies. A 1997 study by the International Labour Organization ranked Malaysia 12th in terms of the provision of in-company training (Junaidah, 2001).

The Malaysian government passed a new Act of Parliament, the Human Resources Development Act, in 1992 to encourage and stimulate the private sector to introduce training and development for its employees (HRDC, 1992). The objective of this Act is to set aside accumulated funds to promote training activities within organizations. Under this Act, companies with more than 50 employees must contribute 1 percent of their total staff's monthly salary to the Ministry of Human Resources through the Human Resources Development Council (HRDC). The fund, known as the Human Resources Development Fund (HRDF), was launched in January 1993. The government set up the HRDC to manage this fund by identifying systematic training needs and approving relevant training programs required by organizations. The levy is partially refunded to the respective organizations under special schemes, known as the Training Aid Scheme and the Approved Training Program (ATP) Scheme, once the training program is completed. The policy lays down the parameters for a human-resource-oriented development strategy designed to mobilize national effort to increase technological capabilities and competitiveness as well as to create a highly skilled, productive, disciplined and efficient workforce. This strategy would aid Malaysia's transition into an industrialized economy. Private sector companies are also expected to enhance their training activities by utilizing the HRDF and participating in skill development programs run by the state governments (MEPU, 1996). Since the establishment of the HRDC, how has the Malaysian manufacturing sector gained from the training conducted? Information on how training benefits organizations would help the Malaysian government to chart the progress and the expected time frame needed for Malaysia to transform into an industrialized economy.
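As a purely illustrative aside (the 50-employee threshold and 1 percent rate come from the description above, but the function name and the example figures are hypothetical and not drawn from the Act or any cited source), the levy rule can be sketched as a simple calculation.

```python
# Hypothetical sketch of the HRDF levy rule described above: companies with more
# than 50 employees contribute 1 percent of their total monthly staff wages.
def hrdf_monthly_levy(num_employees: int, total_monthly_wages: float) -> float:
    """Return the monthly HRDF contribution (0.0 for firms at or below the threshold)."""
    return 0.01 * total_monthly_wages if num_employees > 50 else 0.0

# Example with invented figures: a firm of 120 employees with RM 500,000 in monthly wages
print(hrdf_monthly_levy(120, 500_000))  # 5000.0, i.e. RM 5,000 per month
```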

The need to develop a highly trained workforce is evident from the fact that more than 200 management consulting and training institutions, professional associations and management schools now operate in Malaysia (Arthur Anderson & Co, 1991). The number of employees who return to formal education and training has increased consistently since 1972 (Ahmad, 1998). The government also set up the National Institute of Public Administration Malaysia (INTAN), which is responsible for training government employees in administration and management (Junaidah, 2001).

There are some real difficulties in assessing the full extent of skill development for

government training in Malaysia even after conducting evaluation (Mirza & Juhary, 1995).

Firstly, much of skill development takes place in the private sector. Most skills, even those involving advanced manual skills, are acquired on the job. Secondly, skill development

during employment tends to be demand-driven (Pillai, 1994). Workers gain experience on

the job and upgrade their skills when they are exposed to a higher skill level. A study by

Pillai and Othman (1994) showed that the budget for training and education in Malaysia has

increased by 40 percent. Company emphasis has been on improving the quality of training to help develop a competent labour force that improves the competitiveness of the Malaysian industrial sector. This new demand will force employers to further develop employee

competencies. Saiyadain (1995) found that as many as 82.6 percent of organizations

sponsored their managers for training, and on average these organizations spent 4.65 percent

of the managerial payroll on training managers. This suggests that the number of knowledge workers and new knowledge-based opportunities can be expected to increase dramatically in the next few years.

2.4 The Practice of Evaluation in Training

Although the methodology of evaluating training effectiveness may appear sound, in practice it is difficult to subject it to rational criticism. A survey by Wagel (1977) found that 75 percent of companies had no formal method for evaluating training effectiveness. In a subsequent


survey by Easterby-Smith (1985), the results showed that out of 15 organizations with between 320 and 300,000 employees, only one conducted some form of evaluation on a regular basis, which

was a post-course questionnaire. According to Rowe (1992), although every training manual

gives lip service to evaluation, it is notoriously difficult to carry out effectively. The

extensive survey by Plant and Ryan (1994) served to further underline the lack of widespread

sophistication in evaluation. They pointed to budget cutting and economic pressures as being

possible explanations. A recent study by Blanchard, Thacker and Way (2000) on 202

organizations in Canada reported that more than half of the organizations are not

comprehensively evaluating their training.

According to Carnevale and Schulz (1990), research by the American Society for Training and Development (ASTD) indicated that the most popular reasons for evaluation are to gather information to help decision makers improve the training process and to facilitate participants' job performance. This explains why the outcome-based Kirkpatrick model is so widely used. Evaluation also helps to measure the degree of improvement in application and to assess how well the learner achieves the established goals (Attkinsson, Sorenson,

Hargreaves & Hororwitz, 1978).

For the past 30 years the Kirkpatrick model has been considered the most prominent training evaluation model (Bernthal, 1995). Phillips (1991) concluded that, out of more than 50 evaluation models available, the evaluation framework that most training practitioners use is the Kirkpatrick model. It is easy to find firms that practice training evaluation; however, most firms only conduct post-course evaluation using Kirkpatrick's Level 1 evaluation.

Another important purpose for training evaluation is to meet the accountability requirements

of funding groups or clients (Rossi & Freeman, 1993). The demand for accountability has

been the major impetus for program evaluation since the 1980s. Fiscal constraints have increased the competition among companies' activities for available funds and raised the

question of value for money from their activities (Ruthman & Mowbray, 1983).

Training evaluation is more than a set of empirical methods governed solely by the standards

of social science. Judgments on the quality of program evaluation must also be based on

criteria that are meaningful both to immediate users and the larger system in which the

program is embedded (Corday & Lipsey, 1986).


Phillips (1991) stated that when it comes to training evaluation, there still appears to be more

talk than action. In many organizations, training evaluation is either ignored or approached in

an unsystematic manner. Previous literature (Davidove & Schroeder, 1992; Shelton &

Alliger, 1993; Smith, 1990) demonstrated that training evaluation is unsystematic and based

on simple means. Gutek (1988) stated that there was little or no demand on the part of the

organization to seriously evaluate a training program. Most organizations evaluate their

training programs by emphasizing one or more levels of Kirkpatrick model (Chen & Rossi,

1992). The researchers, however, commented that evaluation knowledge found in the

literature is not being fully utilized in evaluation practices.

Admittedly it is difficult to completely ascertain a training program's effectiveness. What

works at a particular time at a particular training location with a group of participants may not

necessarily work as well when transferred to another time, setting and group (Junaidah,

2001).

Bramley and Kitson (1994) asserted that measuring learning is problematic because it is

difficult to design a reliable measuring instrument. There are also few people who possess the necessary skills to evaluate training, and such skills are often not available within organizations. Grove

and Ostroff (1990) mentioned that training directors often do not possess the necessary skills

to conduct training evaluation. However, Bramley (1996) mentioned that the lack of training

evaluation skills could be due to the methodological weakness embedded within the

Kirkpatrick model of evaluation.

In addition to the unavailability of a reliable measuring instrument, Barron (1996) commented that management does not demand evaluation because it believes that training will be reflected in an employee's work performance. The research by Smith and Piper (1990) supported this view and showed that trainers openly said, "We do just what we are asked to do: deliver training. We do not do what we are not asked to do: improve human performance in the workplace". Smith and Piper (1990) also mentioned this as one of the reasons for providing training but not evaluation: their research found that clients did not request an evaluation. This could be the reason why training providers do not evaluate their products.


A study by the ASTD in 1990 showed that most companies now conduct some form of evaluation of their training programs. Practitioners tend to use different methodologies and

approaches. In examining evaluation methods in business-education partnerships, Erickson

(1991) found that there is little standardization in the methodology. Shadish and Epstein

(1987) conducted a study to look at program evaluations among members of the Evaluation

Research Society and Evaluation Network. They found that practitioners had different

methodologies as well as different assumptions about evaluation. In their study, three patterns of practice emerged, which they labeled the academic pattern, the decision-driven pattern and the outcome pattern.

Heneman and Schurab (1986) stated that the evaluation of training programs in practice differs from the theory and models in the literature. Many authors have commented that once participants leave the training setting, program providers seldom attempt to determine the effect of their program. Indeed, the word evaluation raises all sorts of emotional defense reactions. Such responses indicate a low level of commitment among training professionals toward evaluation. Most of the time, the practices are informal, unsystematic and based on one popular model. However, in the study by Junaidah (2001) on Malaysian training

evaluation practices, it was found that evaluation was moderately formal, comprehensive and

systematic but could be further improved. Nevertheless, it is uncertain whether this so-called

comprehensive approach to training evaluation is within the taxonomy of the Kirkpatrick

framework. Currently, there is little literature on the evaluation system within the Malaysian

context.

2.5 Training Evaluation Practices in Malaysia

Validation of the effectiveness and benefits of training and development programs has gained importance in the public and private sectors in Malaysia. The Malaysian government places great emphasis on program evaluation and has appointed two federal agencies to be

responsible for evaluation. They are the National Institute of Evaluation and the Evaluation

Unit at the Prime Minister's Department. This unit is responsible for evaluating special

governmental projects and programs (Maimunah, 1990). Another evaluating body is the

Publication and Consultancy Bureau which carries out evaluation for government training.


There are three types of evaluation process currently practiced in the agency: formal training evaluation using standard evaluation questionnaires, oral evaluation in the form of informal discussions, and informal evaluation conducted during training (Junaidah, 2001).

The reasons why Malaysian organizations do not evaluate training may lie in the inability to

develop relevant measuring tools or the difficulty in determining which performance

outcomes are attributable to training.

The rise in the awareness of training evaluation during the Malaysian economic downturn in

1997 has increased the pressure for organizations to justify the investment cost placed on

training (Junaidah, 2001). Organizations realized that training must be a worthwhile effort

and this raises the need for measuring training effectiveness. Evaluating training

effectiveness does not seem to be part of the culture of most organizations in Malaysia. Thousands of training programs have been conducted in Malaysia since the rise of the HRDF (Mirza & Juhary, 1995). However, effectiveness in terms of productivity, skills improvement, increase in performance standards and return on investment is still unknown. Training should be evaluated to identify the weaknesses of the training program, and the selection criteria for evaluation should be able to identify improvements in the participants' work performance.

The need for greater quality management during the economic downturn forced Malaysian companies to upgrade their International Organization for Standardization (ISO) certification to ISO 9001:2000, which emphasizes documenting the training evaluation process. Companies that pursued this latest version of ISO are required to justify their training efforts and money spent by linking skill development with the quality philosophy of the company.

As organizations pursue the latest version of ISO, evaluating training ranks high among top

management as a means of justifying training investment (Junaidah, 2001). The opportunity

cost of foregoing training commitment has become extremely high. More than ever, training

evaluation must demonstrate improved performance and financial results. As the investment

spent on training is costly, it is understandable why top managers wish to see value for

money and demand justification for training cost. Training providers need to show clients that

they are getting good returns on their investment in training. This demand for accountability

has been the major impetus for training evaluation in the past few years (Junaidah, 2001).


Most organizations in Malaysia have sufficient training facilities. Most managers are

sponsored to attend training programs on production, general management and human

resources management for an average of 2 days (Mirza & Juhary, 1995). On average

organizations spend 4.65 percent of the managerial payroll on training (Saiyadain, 1995).

The measurement of training effectiveness varies from organization to organization. A few

organizations have developed systematic plans to follow up on training. The top

management's attitude towards training has been identified as a critical factor in effective

operationalization of training (Mirza & Juhary, 1995). In organizations where the top and

middle management have been perceived to be supportive, training seems to have contributed

to the overall growth. However, the extent to which the evaluation process has been conducted to demonstrate this growth is still questionable. In order to improve the overall effectiveness of training, all organizations should undertake training evaluation effectively. As mentioned by Brinkerhoff (1988), training functions need to adopt evaluation and measurement systems that can improve the feedback mechanism in order to build their response capacity. A system of pre-course evaluation followed by post-course evaluation may help in setting relevant expectations for improvement.

A serious gap in the Malaysian training context is the insufficient information on the number,

nature and content of training facilities in the country. The skill-level at which the output

would fit into the labour market is not known while the syllabus, duration and quality of

training vary from one agency to another. This is due to the lack of collaboration and consultation between industry and training institutions. The quality of training is not up to the mark: trainees have theoretical knowledge but little practical experience (Pillai, 1994).

There has been limited study on training evaluation practices in Malaysia. A training

evaluation study by Shamsuddin (1995) examined the contextual factors associated with evaluation practices of selected adult and continuing education providers in Malaysia. According to him, even though management directed an evaluation to be conducted, it was only for a narrow purpose: to demonstrate program success by showing how good the training was and how many people received the training, which is merely Level 1 evaluation. The wider purpose of program evaluation, such as measuring the acquired

learning (Level 2), program impact (Level 3) and cost effectiveness (Level 4) was not the

management priority. According to Shamsuddin (1995), the clients were not aggressive

stakeholders who cared and demanded accountability from the training providers. Their


behaviour and characteristics did not push the training provider to examine the real effect of

the programs in terms of learning gain and program effectiveness.

Besides Shamsuddin's (1995) study, four other studies conducted locally included the

element of evaluation practice. The first study by Hamid, Mohd, Muhamad and Ismail

(1987) asked 235 organizations if management education in Malaysia significantly provides

candidates with a set of skills. Organizations found that 67.6 percent of management

programs offered by local universities and colleges are too theoretical. Out of 121

respondents, 60.3 percent indicated that training is important while the rest felt the contrary.

This study focused on reaction evaluation (Level 1) to study the participants' satisfaction

level towards the overall programs. Another study conducted by Asma (1994) examined the

design of training practices of four training providers in Malaysia and found that the evaluation practiced by the trainers does not conform to any theory and that most of the evaluations used were ad hoc and informal.

Mirza and Juhary (1995) conducted a study on local and multinational organizations and found that in the majority of these organizations, even though managers returning from training may write a report, no formal systematic mechanism exists to assess how well they utilize their training in the organization. The research further found that participants were only encouraged to apply learning at work, but no effort was taken to find out what caused any change. The result indicates that measuring training effectiveness is not a widespread practice. Organizations feel that if learning does not take place, it would show in the next appraisal report, and that participants who have learned something should have applied it, so it is not considered necessary to track changes in performance.

Mirza and Juhary's (1995) study also revealed that most organizations in Malaysia evaluate

training effectiveness on a superficial level. Some encourage their managers to try out new

ideas while others do not show the same kind of support. Unfortunately for most companies,

measuring training effectiveness may not be practiced organization-wide. This is because measuring training effectiveness has never been a policy in most organizations. A lack of support by department heads deters most organizations from carrying out post-training evaluation. Most organizations felt that with a more supportive top management they could have established systems for measuring training effectiveness.


The most recent study was by Junaidah (2001) on training evaluation practices by training

institutions in Malaysia. The study showed moderately formal training evaluation practices

by Malaysian training practitioners. However, the researcher was uncertain whether these

training practitioners applied the taxonomy of the Kirkpatrick model in their training evaluation practices.

Generally, training evaluation in Malaysia is either not done or, if done, does not follow any theory suggested in the literature. There is a paucity of detailed evidence of direct causal links between investment in training and the resultant return in the form of increased performance. Brandenburg (1982) suggested that part of the reason is that training practitioners tended not to conduct evaluation or, if they did, relied heavily on soft evaluation methods and did not disseminate the results widely. Pauzi (1985) felt that part of the problem lies in the attitude of top management, who do not show full commitment to the evaluation process.

A further study is needed to examine current training evaluation practices in Malaysia and to understand how these practices have developed. It is important to understand training effectiveness in Malaysia, and it is worthwhile to analyze the training evaluation processes the country has undergone. This study would contribute to the existing body of knowledge as current

information on training evaluation is inadequate. Since a large number of professional

associations, private consultants and management schools in universities are organizing

training programs in Malaysia, the results of the study would indicate areas where training

evaluation could be practiced for different training programs.

2.6 Methodology of Study

Most recent surveys of training and evaluation practices in Malaysia were conducted by

Hamid et al. (1987), Asma (1994), Mirza and Juhary (1995), Shamsuddin (1995) and

Junaidah (2001). The dearth of published materials on training and development activities of

managers in Malaysia has prompted this study.


This exploratory study was conducted to understand the evaluation culture and the extensiveness of training evaluation practices in Malaysia. The lack of baseline information prevented the evaluation of transfer of learning, which prompted the use of an empirical approach in this study. The study evaluates the perceptual effects of training programs in the manufacturing sector at both management and non-management levels. The survey asked about the level of training evaluation performed, the percentage of payroll spent on training, the impediments to training and the percentage of training transferred to the job. Follow-up interviews were also undertaken to provide additional clarification and interpretation of responses and enabled impressions and opinions about the data to be recorded accurately.

2.6.1 Questionnaire Construction

A comprehensive survey of the literature was done to establish the degree of training evaluation being conducted by training practitioners in Malaysia. The survey questions asked the degree to which training evaluation practices were conducted in Malaysia, based on Kirkpatrick's four levels of evaluation (Kirkpatrick, 1959a, 1959b, 1960a, 1960b, 1976, 1979). Examples of questions are:

Reaction: How did the participants react to the training?
Learning: What information and skills were gained?
Behavior: How have participants transferred knowledge and skills to their jobs?
Results: What effect has training had on the organization and the achievement of its objectives?

The instrument was designed primarily based on the published work of Blanchard, Thacker

and Way (2000) with modification based on the Malaysian training environment. The

modifications from Blanchard et al.'s questionnaire included rephrasing and simplifying question structure to suit local linguistic understanding. Words which were ambiguous or misunderstood were replaced. These modifications were applied in order to encourage a more

accurate response. Care was taken to ensure that simple and clear questions were used to


seek information on significant areas of training evaluation activity in Malaysia. The questionnaire items are listed in Table 4, and the full questionnaire appears in Appendix A.

The questionnaire is made up of 34 questions: 8 in Level 1, 5 in Level 2, 13 in Level 3 and 8 in Level 4. Level 3 was constructed with the most questions as it asks about practices for measuring transfer of learning. Practitioners can use a variety of assessments to measure transfer of learning, hence the survey questions probe the detailed practices undertaken by practitioners.

The questions in the questionnaire were randomly ordered to avoid bias caused by the order of the questions. The survey questions used a 5-point Likert scale to permit good scale discrimination.

A panel of experts consisting of training professionals from the Malaysian Institute of Management was used to evaluate the items in the questionnaire. Extensive pilot testing was undertaken by the training professionals to ensure that the questions were easily understood. Internal consistency was determined using the Cronbach alpha method; the Cronbach alpha coefficient was 0.8458.
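By way of illustration only, the sketch below shows the standard Cronbach alpha calculation (the ratio of summed item variances to the variance of respondents' total scores) applied to a small hypothetical matrix of Likert responses. The figures are not study data, and Python with numpy is assumed here purely for demonstration; the study itself relied on a statistical package.

    import numpy as np

    def cronbach_alpha(scores):
        # scores: respondents x items matrix of Likert ratings
        scores = np.asarray(scores, dtype=float)
        n_items = scores.shape[1]
        item_variances = scores.var(axis=0, ddof=1)      # variance of each item
        total_variance = scores.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
        return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

    # Hypothetical responses from 6 respondents to 4 items rated 1-5
    responses = [
        [4, 5, 4, 4],
        [3, 3, 2, 3],
        [5, 5, 4, 5],
        [2, 2, 3, 2],
        [4, 4, 4, 5],
        [3, 2, 3, 3],
    ]
    print(round(cronbach_alpha(responses), 4))

A coefficient close to 1 indicates that the items are internally consistent, which is the property the reported value of 0.8458 is intended to demonstrate.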

2.6.2 The Sample and Sampling

To improve the effectiveness and efficiency in terms of time and resources, a purposeful

sampling technique was employed. The sample comprised manufacturing based companies found in the HRDC Directory. The HRDC Directory listed approximately 5000 organizations, but only 40 percent of the listing are manufacturing based companies. The questionnaires

were sent to 2000 manufacturing based companies with more than 50 employees. The

questionnaires were posted between December 2003 and January 2004. The questionnaires

were addressed to the Personnel and Human Resources Managers of the organizations. A

self-addressed stamped envelope was enclosed to maintain anonymity on the return of the

completed questionnaires through the postal service.


2.6.3 Questionnaire Response

The questionnaires were posted to 2000 manufacturing organizations in Malaysia found in the HRDC Directory. The covering appeal highlighted the focus of the study, i.e. training evaluation activities that relate to the benefits of training.

Of the 2000 questionnaires posted, 94 were returned with a note that the organizations had closed down or moved to a new address. This reduced the original sample of 2000 to 1906. Reminder notes were sent out three weeks after the first posting in order to encourage a greater response rate. However, only 109 completed questionnaires were returned.

The overall lack of organizational response can be attributed to a variety of causes: low

interest, lack of time to respond, current restructuring of the organization, unavailable contact

person, and outdated addresses.

2.7 Findings and Discussion

Data were analysed using SPSS for Windows XP (Version 13). Statistical significance was accepted at the 0.05 level. A total of 109 questionnaires, or 5.5 percent of the 2000 posted, were returned. Part 1 of the questionnaire gathered information on the background of the

companies. It was found that out of the 109 companies, 46 percent are multinational

companies while 54 percent are Malaysian companies. Part 2 of the questionnaire gathered

information on the organization's commitment to training. The results are shown in Table 1.


Table 1. Commitment to Training (n = 109)

Does your organization conduct training programs for employee development?
  Yes = 100 percent; No = 0 percent

Does your organization conduct training needs analysis before conducting any training programs?
  Yes = 41.3 percent (multinational companies = 39, Malaysian companies = 6); No = 58.7 percent

What type of training is conducted by your organization?
  Management (e.g. leadership, supervisory, managing change, communication, human relations and interpersonal skills) = 45.9 percent
  Organization specific (e.g. training programs related to policies, values, cultures, goals and objectives of the whole organization) = 18.9 percent
  Technical (e.g. quality, productivity, product training, IT training, accounting system and job related training) = 64.3 percent
  Personal improvement (e.g. motivation, time management, self development, managing self, presentation skills and business communication skills) = 24.2 percent
  Others = 0 percent

A total of 41.3 percent of organizations agreed that a training needs analysis was conducted prior to conducting any training program. The rest of the organizations conduct training to meet immediate needs of the organization, such as low productivity or a morale problem or a reaction to a crisis, and such training is frequently not coordinated with other functions of the organization. The lack of baseline information prevents evaluation, and no meaningful comparison of the participants' performance before and after training can occur.

The results indicate that 64.3 percent of organizations organized technical training. A large number of organizations felt the need to upgrade the technical competence of their employees in the areas of quality, productivity, product training, IT training, accounting systems and job related training. Of all the organizations interviewed, 65 percent reported that they have


extended their range of products during the last two years and 88 percent had made changes

to machinery and equipment.

Management training was ranked second at 45.9 percent, followed by personal development at 24.2 percent. One fifth of the organizations are also concerned with management training. Many feel that skills such as leadership, supervision, managing change, communication, human relations and interpersonal skills are needed for management development. Although organization specific training is an emerging area, only about 18.9 percent of the organizations feel the need to impart training in this field.

Table 2 shows the level of evaluation conducted on management and non-management

training by the organization.

Table 2. Training Evaluation Practices in Organization (n = 109)

Level 1 reaction evaluation            35 percent
Level 2 learning evaluation            25 percent
Level 3 behavioural evaluation         16.5 percent
Level 4 results evaluation             11 percent
No training evaluation practices       12.5 percent

The results indicate that out of 109 companies, 35 percent of organizations conducted Level 1

evaluation by measuring the participant's reactions towards the training program while 25

percent of the organizations conducted Level 2 evaluation by measuring the participant's

degree of learning as the result of the training initiatives. Only 16.5 percent of organizations

conducted Level 3 evaluation by measuring the changes in the participant's behaviour

towards the job after each training program. However, 11 percent of organizations

quantified the results of training and calculated the return on investment in training, which is classified as Level 4 evaluation. The remaining 12.5 percent of organizations have never

conducted training evaluation after each training program. The results indicate that more

58

Page 65: Evaluating training effectiveness

than half of the organizations do not evaluate their training at the behavioural or the results

levels. The reason for this is that the training function is sometimes seen as an isolated and peripheral function, which is not truly integrated into the job setting (Olsen, 1998).

The means and standard deviations of the four levels of training evaluation for all 109

companies are shown in Table 3.

Table 3: Means and Standard Deviations of the Four Levels of Training Evaluation

Level      Mean ± SD
Level 1    3.63 ± 0.62
Level 2    3.41 ± 0.62
Level 3    3.26 ± 0.63
Level 4    2.99 ± 0.68

Note: Likert scale: where 5 = strongly agree; 4 = agree; 3 = neutral; 2 = disagree; 1 = strongly disagree

The majority of the organizations agree that they conduct Level 1 evaluation after each training program: the average for Level 1 evaluation is 3.63, which suggests that the majority of organizations conduct Level 1 evaluation. The average score for Level 2 evaluation is 3.41, which indicates that some companies conduct Level 2 evaluation selectively, mostly on technical training. The average for Level 3 evaluation is 3.26. This result indicates that measuring behavioural changes on the job after training is not that popular among these manufacturing organizations, which could be due to the unavailability of specific tools to measure the subjective changes in behaviour. The average score for Level 4 evaluation is 2.99, indicating that the majority of these manufacturing organizations do not conduct results evaluation. This was further confirmed by an interview which mentioned that the benefits of training are not easily measured in quantitative terms and most benefits cannot be measured immediately.

The means and standard deviations of the 34 questions in the instrument for all 109 companies are shown in Table 4.


Table 4: Means and Standard Deviations of 34 Questions in the Instrument

Level 1 - Reaction Evaluation (Mean Score, SD)
1. Departmental heads collected opinions from participants with regards to the training program conducted (4.12, 0.689)
2. Evaluate perceptions of participants on key benefits and value arising from training (3.03, 0.934)
3. Conduct training environmental audit to track participants' satisfaction after training (4.03, 0.724)
4. Focus on perception of trainees towards the training program (4.38, 0.862)
5. Measure trainers' competency and credibility after each training program (2.74, 0.908)
6. Most training programs conduct post course reaction evaluation after training (3.89, 0.715)
7. Always make an effort to ask participants whether they enjoy attending the training programs (4.20, 0.815)
8. Measure the accuracy of the training program in addressing the exact requirement of the job (2.67, 1.021)

Level 2 - Learning Evaluation (Mean Score, SD)
1. Allow participants to write down what they have learned which might be useful for their work (3.69, 0.641)
2. Conduct pen and paper test for measuring the amount of knowledge gained from a training program (4.28, 0.703)
3. Administer a test before and after training with regards to the knowledge gained from a training program (3.41, 0.912)
4. Identify the principles, facts and techniques learned by participants (2.98, 1.090)
5. Participants were asked if there were any barriers preventing them from using what they have learned (2.69, 0.932)

Level 3 - Behavioral Evaluation (Mean Score, SD)
1. Develop performance-based tests as part of the training evaluation (2.89, 0.909)
2. Assess the level of transfer of learning to the job (3.04, 0.994)
3. Measure the success rate of participants performing each item learned (3.23, 0.089)
4. Define an action plan for participants and evaluate the implementation success rate (3.43, 0.745)
5. Identify specific skill improvement as a result of a training program (3.93, 1.079)
6. Measure positive changes in personnel efficiency and effectiveness after training (3.77, 0.931)
7. Measure the behavior changes resulting from the training program (3.51, 1.099)
8. Organize the trainer's follow up session to track the participant's behavioral change after training (3.28, 1.141)
9. Use observation techniques to monitor changes of behavior and attitudes resulting from the training program (2.62, 1.062)
10. Conduct work performance evaluation in the workplace after training (2.71, 0.703)
11. Observing and documenting the practice of knowledge and skills learned by the trainee in the workplace (3.32, 0.773)
12. Assess the increase in knowledge and skills as well as attitude change of trainees (2.84, 0.842)
13. Conduct a preview session with your trainee to specify the expected objectives to achieve from the training (3.79, 0.952)

Level 4 - Results Evaluation (Mean Score, SD)
1. Measure the level of productivity before and after a training program (2.56, 0.721)
2. Link effectiveness of training to financial benefit (2.91, 0.668)
3. Conduct cost-benefit analysis on training programs conducted (3.10, 0.711)
4. Measuring the worthiness of attending training in terms of cost and time away from work (3.35, 0.823)
5. Measure the tangible cost in terms of reduced cost and improved quality after training (2.82, 0.913)
6. Calculate the cost of training and its impact towards organization improvements (2.71, 0.793)
7. Compare the cost of training program with benefits obtained from it (3.24, 0.894)
8. Finding evidence of direct links between training investment and returns from training (3.18, 0.615)

Note: Likert scale: where 5 = strongly agree; 4 = agree; 3 = neutral; 2 = disagree; 1 = strongly disagree

The results indicate that Level 1 evaluation (reaction) seems to be the most significant training evaluation practice. A high mean score of 4.38 indicates that the majority of Malaysian manufacturing companies focus on the perception of trainees towards the training program. Managers play an active role in conducting Level 1 evaluation by collecting opinions from participants with regard to the training program conducted. Measuring the accuracy of a training program in addressing the exact requirement of the job is the least practiced, indicated by a low mean score of 2.67.

The practice of pen and paper tests before and after a training program is the most popular Level 2 practice among these manufacturing companies, shown in the mean score of 4.28. The lowest mean score for Level 2 evaluation, 2.69, indicates that organizations seldom ask participants if there were any barriers which prevented them from using what they have learned.

Level 3 evaluation is modestly practiced by manufacturing companies in Malaysia. The highest mean score of 3.93 indicates that the majority of these manufacturing companies identified specific skill improvement as a result of a training program. The use of observation techniques to monitor changes of attitude and behaviour as a result of the training program shows the lowest mean score of 2.62.

The apparent lack of practice in Level 4 evaluation (results) is probably due to the effort and potential complexities involved, which entail much more work. This is reflected in the


survey result, which indicates low interest in conducting cost-benefit analysis of training among these organizations. Measuring the worthiness of attending training in terms of cost and time away from work showed a mean score of 3.35 and is regarded as one of the most popular Level 4 evaluation practices among these organizations. Calculating the cost of training and its impact on organization improvements showed the lowest mean score of 2.71.

Independent t-tests were used to test for significant differences in the four levels of training evaluation conducted by multinational and Malaysian companies. Significant differences were found between multinational companies (N = 50) and Malaysian companies (N = 59) at Level 1, Level 2, Level 3 and Level 4 at p < 0.05. See Table 5.
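As an illustration of how such a comparison can be computed (the study's own analysis was carried out in SPSS), the sketch below runs an independent-samples t-test on two small sets of hypothetical Level 4 scores; scipy is assumed here only for demonstration and the numbers are not study data.

    import numpy as np
    from scipy import stats

    # Hypothetical Level 4 (results evaluation) scores for the two groups of companies
    multinational = np.array([3.4, 3.6, 3.2, 3.8, 3.5, 3.7, 3.3, 3.6])
    malaysian = np.array([2.4, 2.7, 2.5, 2.9, 2.3, 2.6, 2.8, 2.5])

    t_stat, p_value = stats.ttest_ind(multinational, malaysian)
    print("t = %.3f, p = %.4f" % (t_stat, p_value))
    if p_value < 0.05:
        print("The difference between the group means is significant at the 0.05 level")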

Table 5: Summary of t-tests of the four levels of training evaluation for multinational companies and Malaysian companies

Company         Level 1 (Mean ± SD)   Level 2 (Mean ± SD)   Level 3 (Mean ± SD)   Level 4 (Mean ± SD)
Multinational   3.78 ± 0.52           3.69 ± 0.56           3.63 ± 0.48           3.50 ± 0.53
Malaysian       3.49 ± 0.67           3.17 ± 0.57           2.94 ± 0.56           2.56 ± 0.46
t-value         2.635*                4.758*                6.794*                9.838*

* p < 0.05

The results indicate that the majority of multinational companies operating in Malaysia have a clearer objective of what ought to be done and have enshrined this in mission statements on training. These multinational companies provide training and development for all employees in all areas of operations, with substantial investment and serious attempts to produce a competent and quality workforce. The results show that multinational companies judge training effectiveness starting with the immediate reaction to training, and they apply formal and systematic procedures and processes to assess training effectiveness as compared to Malaysian companies.

The results show that the majority of Malaysian companies did not conduct Level 3 and Level 4 evaluations. Most Malaysian companies seem to lack the formal mechanism to


assess training effectiveness. The results of the t-tests were further confirmed by interviews

which suggested relatively mild commitment of top management to training and some

resistance by middle management to the function of training in Malaysian companies.

Training seems to be a low priority area and training evaluation is conducted on an ad hoc

basis. Part of this could be because identifying individual performance improvement after

training is regarded as a tedious and lengthy process.

Only six Malaysian companies had conducted training needs analysis prior to conducting training. The result was further confirmed by interviews, which mentioned that at times managers send employees to training programs just to fill a quota. These employees are not the intended participants of the training program and return without learning much. Since they have little commitment to learning after training, Level 2, Level 3 or Level 4 evaluation cannot meaningfully take place. This trend is reflected in fewer Malaysian companies practising Level 2, Level 3 and Level 4 evaluation as compared to multinational companies.

2.8 Limitations of Study

The number of respondents was relatively low. Even though the majority of manufacturing companies with more than 50 employees are registered with the Human Resource Development Council, the actual number of organizations that actively participate in training and development is rather low.

Out of the 2000 manufacturing based organizations listed in the HRDC Directory, less than 40 percent conducted at least one training program in a year (HRDC, 2003). Details of companies that do not participate in training were not disclosed by HRDC, because HRDC does not want training providers to seek out organizations with high unused funds. It was therefore decided to send the survey to all 2000 manufacturing based organizations, as the details of the organizations that do not conduct training programs were not known. Hence, the majority of these organizations that do not conduct training could not answer the questionnaire.


The success of the study depended on the willingness of respondents to cooperate. Some may not have seen the value in participation, while others may have viewed the topic as sensitive or irrelevant, despite reminder notes being sent out three weeks after the first posting to encourage a greater response rate. A comparison between respondents and non-respondents would have been helpful; unfortunately, data were not available for making such comparisons in this study.

2.9 Conclusion

The Kirkpatrick model has been considered one of the most prominent models of evaluation practised in Malaysia, yet the application of its four levels of evaluation is not well adopted. This study reveals that training evaluation carried out by most organizations in Malaysia is mainly to judge trainees' reactions. A culture of 'fill in one of these before you go' typically pervades training evaluation. Most organizations lack the formal and systematic mechanisms to assess training effectiveness. Many companies remain blissfully unaware of how much they spend on training and whether it is effective. Indeed, even the use of expensive external trainers does not appear to trigger detailed evaluations.

The majority of Malaysian organizations show little or no interest in conducting training evaluation, and have even less interest in the results of evaluation as a method of evaluating effectiveness. Some find evaluation difficult, as it is almost impossible to determine which participant efforts are attributable to training and which are not.

Although the Kirkpatrick model of evaluation addresses the outcomes of training, most practitioners do not know what evaluation criteria to look for. Confusion over the actual outcomes possibly hindered the ability to conduct Level 3 and Level 4 evaluations meaningfully.

Hence this research gap presents the opportunity to examine in detail the specific outcomes required from training and the transfer component of training. This study will determine what strategies might be most helpful in maximizing the transfer of learning and in constructing an appropriate model for evaluation.


2.10 References for Paper Two

Ahmad, R.H. 1998, 'Educational development and reformation in Malaysia: past, present and future', Journal of Educational Administration, vol. 36, no. 5, pp. 462-475.

Attkinson, C.C., Sorenson, J.E., Hargreaves, W.A. & Hororwitz, M.J. 1978, Evaluation of Human Service Programs, Academic Press, London.

Arthur Anderson & Co. 1991, Professional Services in Malaysia, Arthur Anderson & Co., Kuala Lumpur, Malaysia.

Asma, A. 1994, Training design development: the practice of four development agencies in Malaysia, unpublished PhD dissertation, University Pertanian Malaysia, Serdang.

Barron, T. 1996, 'A new wave in training funding', Training and Development Journal, vol. 50, no. 5, pp. 28-32.

Bernthal, P.R. 1995, 'Evaluation that goes the distance', Training and Development, vol. 49, no. 9, pp. 41-49.

Blanchard, P.N., Thacker, J.W. & Way, S.A. 2000, 'Training evaluation: perspectives and evidence from Canada', International Journal of Training and Development, vol. 4, no. 4, pp. 295-303.

Bramley, P. 1996, Evaluating Training Effectiveness, McGraw-Hill, Maidenhead and New York.

Bramley, P. & Kitson, B. 1994, 'Evaluating training against business criteria', Journal of European Industrial Training, vol. 18, no. 1, pp. 10-14.

Brandenburg, D.C. 1982, 'Training evaluation: what's the current status', Training and Development Journal, vol. 36, pp. 28-29.

Brinkerhoff, O.R. 1988, 'An integrated evaluation model of HRD', Training and Development Journal, vol. 42, no. 2, pp. 66-68.

Carnevale, A.P. & Schulz, E.R. 1990, 'Return on investment: according to training', Training and Development Journal, vol. 44, no. 7, pp. 1-32.

Chen, H.T. & Rossi, P.H. 1992, Using Theory to Improve Program and Policy Evaluations, Greenwood Press, Westport, CT.

Davidove, A.E. & Schroeder, P.A. 1992, 'Demonstrating ROI of training', Training and Development Journal, vol. 46, no. 8, pp. 70-71.

Erickson, M.R.C. 1991, Business-education partnerships: a study of evaluation methods, doctoral dissertation, The George Washington University, Dissertation Abstracts International, vol. 52/07A, AAC9133008.

Easterby-Smith, M. 1985, 'Training course evaluation from an end to a means', Personnel Management, April, pp. 25-27.

Gutek, S.P. 1988, 'Training program evaluation: an investigation of perceptions and practice in non-manufacturing business organizations', doctoral dissertation, Western Michigan University, Kalamazoo, MI, Dissertation Abstracts International, vol. 49/05A, AAC8811388.

Groove, E.A. & Ostroff, C. 1990, 'Program evaluation', in Developing Human Resource, eds Wexley, K. & Himicks, J., BNA Books, Washington D.C.

Heneman, H.G. & Schurab, D.P. 1986, Human Resource Management, Irwin, Illinois.

Hamblin, A.C. 1974, Evaluation and Control of Training, McGraw-Hill, New York.

Hamid, A.A., Mohd, S., Muhamad, A.H. & Ismail, Z. 1987, 'Management education in Malaysia', in Developing Managers in Asia, eds Tan Jing Hee & You Poh Seng, Addison-Wesley, Singapore.

Human Resource Development Council 1992, Human Resource Development Act 1992, Ministry of Human Resource, Kuala Lumpur, Malaysia.

Human Resource Development Council 2003, Ministry of Human Resource, Kuala Lumpur, Malaysia.

Junaidah, H. 2001, 'Training evaluation: clients' role', Journal of European Industrial Training, vol. 25, no. 7, pp. 374-379.

Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction', Journal of American Society for Training and Developing, vol. 13, pp. 3-9.

Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning', Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.

Kirkpatrick, D.L. 1960a, 'Techniques for evaluating training programs: part 3 - behaviour', Journal of American Society for Training and Developing, vol. 14, no. 1, pp. 13-18.

Kirkpatrick, D.L. 1960b, 'Techniques for evaluating training programs: part 4 - results', Journal of American Society for Training and Developing, vol. 14, no. 2, pp. 28-32.

Kirkpatrick, D.L. 1976, 'Evaluation of training', in Training and Development Handbook: A Guide to Human Resource Development, 2nd edn, ed. Craig, R.L.O., McGraw-Hill, New York.

Kirkpatrick, D.L. 1979, 'Techniques for evaluating training programs', Training and Development Journal, vol. 33, pp. 78-92.

Malaysia Economic Planning Unit 1996, Seventh Malaysia Plan: 1996-2000, Government Printer, Kuala Lumpur, Malaysia.

Maimunah, I. 1990, Extension: Implication to Community Development, 2nd edn, Dewan Bahasa and Pustaka, Kuala Lumpur, Malaysia.

Mirza, S.S. & Juhary, H.A. 1995, Managerial Training and Development in Malaysia, Malaysian Institute of Management, Malaysia.

Olsen, J.H. 1998, 'The evaluation and enhancement of training transfer', International Journal of Training and Development, vol. 2, no. 1, pp. 61-75.

Pauzi, M. 1985, 'Training nuisance', 12th ARTDO International Conference, Petaling Jaya, Malaysia, 22-27 July.

Phillips, J.J. 1991, Handbook of Training Evaluation and Measurement Methods, Gulf Publishing Company, Houston, TX.

Pillai, P. 1994, Industrial Training in Malaysia: Challenge and Response, ISIS Publication, Setiakawan Printers Sdn Bhd, Malaysia.

Pillai, P. & Othman, R. 1994, 'Learning to work, working to learn', Institute of Strategic and Institutional Studies, Kuala Lumpur.

Plant, R.A. & Ryan, R.J. 1994, 'Who is evaluating training?', Journal of European Industrial Training, vol. 18, no. 5, pp. 27-30.

Rowe, C. 1992, 'How useful was it? The problem of evaluating in-house training programs', Industrial and Commercial Training, vol. 24, no. 7, pp. 14-18.

Rossi, P.H. & Freeman, H.E. 1993, Evaluation: A Systematic Approach, 5th edn, Sage Publications, California.

Ruthman, L. & Mowbray, G. 1983, Understanding Program Evaluation, Sage Publications, London.

Saiyadain, M.S. 1995, 'Perceptions of sponsoring managers, training organizations, and top management attitude toward training', Malaysian Management Review, vol. 30, no. 4, pp. 69-74.

Shadish, W.R. & Epstein, R. 1987, 'Patterns of program evaluation practice among members of the Evaluation Research Society and Evaluation Network', Evaluation Review, vol. 11, no. 5, pp. 555-590.

Shamsuddin, A. 1995, 'Contextual factors associated with evaluation practices of selected adult and continuing education providers in Malaysia', unpublished PhD dissertation, University of Georgia, Athens, GA.

Shelton, S. & Alliger, G. 1993, 'Who's afraid of level of evaluation?', Training and Development Journal, vol. 47, no. 6, pp. 43-46.

Smith, A. 1990, 'Evaluation of management training subjectivity and the individual', Journal of European Industrial Training, vol. 14, no. 1, pp. 12-15.


Smith, A.J. & Piper, J.A. 1990, 'The tailor-made training maze: a practitioner's guide to evaluation', Journal of European Industrial Training, vol. 14, no. 8, pp. 2-24.

Wagel, H.W. 1977, 'Evaluating management development and training programmes', Personnel Management, vol. 54, no. 4.


2.11 Appendix A The Questionnaire for Research Paper Two

This survey is about training evaluation practices in Malaysia.

Enormous resources of time, money, and energy are invested in every imaginable kind of training and development program. Little effort is invested in discovering how well those programs work, how they might be improved or, indeed, if they work at all. It is important for organizations that use training and development activities to seek practical ways of evaluating those activities.

With greater emphasis by the Ministry of Human Resources since the enactment of the Human Resources Development Act 1992, there is a need to improve the effectiveness of training activities in Malaysia in order to achieve greater productivity among the workforce. However, effective evaluation requires the examination of training outcomes at several levels of evaluation. This research study is designed to examine to what extent the Malaysian manufacturing sector has carried out training evaluation and how these organizations have benefited from the training event. The information you provide will help us better understand the quality and effectiveness of the training evaluation systems that have so far been carried out within the Malaysian context.

Because you are the one who can give us a correct picture of how you experience conducting training evaluation, I wish to invite you to participate in this research study. The results will be presented in an aggregate and untraceable manner.

If you have any enquiry about this research or the questionnaire, feel free to contact me, Lim Guan Chong, at No. 54, Jalan SS2167, 47300 Petaling Jaya, Selangor Darul Ehsan, or my cell phone 019-4781553, or my e-mail [email protected]. You can also contact my supervisors, to verify this survey and my doctoral candidateship: Dr. Travis Kemp (e-mail: [email protected]) or Professor Dr. Leo Ann Mean (e-mail: [email protected]).

Part 1: Tell us about your Organization

Name of organization:

Type of company:

Multinational

Malaysian companies

Nature of business:

Manufacturing

Service

Others, please specify


Part 2: Commitment to Training

Does your organization conduct training programs (in-house training programs, public programs and on-the-job training) for employee development?

Yes

No

Does your organization conduct training needs analysis before conducting any training programs?

Yes

No

What types of training programs are conducted by your organization?

Management (e.g. leadership, supervisory, managing change, human relations and interpersonal skills, communication)

Organization specific (e.g. training programs related to whole organization policies, values, culture, goals and objectives)

Technical (e.g. quality, productivity, product training, IT training, accounting system and job related training)

Personal improvement (e.g. motivation, time management, self development, managing self, presentation skills and business communication skills)

Others. Please specify

Part 3: Training Evaluation Practices

Instructions: Please indicate the agreement or disagreement that truly represents the practice in your organization, on a scale of 5 (strongly agree), 4 (agree), 3 (neutral), 2 (disagree) to 1 (strongly disagree), to express your view.

Training Evaluation Practices
(Each item below is rated on the scale: 5 = Strongly Agree, 4 = Agree, 3 = Neutral, 2 = Disagree, 1 = Strongly Disagree)

1. Most training programs conduct post course reaction evaluation after training

2. Always make an effort to ask participants whether they enjoy attending the training programs


3. Departmental heads collected opinions from participants with regards to the training program conducted

4. Participants were asked if there were any barriers preventing them from using what they have learned

5. Allow participants to write down what they have learned which might be useful for their work

6. Define an action plan for participants and evaluate the implementation success rate

7. Conduct pen and paper test for measuring the amount of knowledge gained from a training program

8. Administer a test before and after training with regards to the knowledge gained from a training program

9. Develop performance-based tests as part of the training evaluation

10. Identify specific skill improvement as a result of a training program

11. Measure positive changes in personnel efficiency and effectiveness after training

12. Measure the behaviour changes resulting from the training program

13. Conduct a preview session with your trainee to specify the expected objectives to achieve from the training

14. Organize the trainer's follow up session to track the participant's behavioural change after training

15. Measuring the worthiness of attending training in terms of cost and time away from work


16. Use observation techniques to monitor changes of behaviour and attitudes resulting from the training program

17. Measure the level of productivity before and after a training program

18. Link effectiveness of training to financial benefit

19. Conduct cost-benefit analysis on training programs conducted

20. Evaluate perceptions of participants on key benefits and value arising from training

21. Identify the principles, facts and techniques learned by participants

22. Measure the tangible cost in terms of reduced costs and improved quality after training

23. Measure the accuracy of the training program in addressing the exact requirement of the job

24. Measure the success rate of participants performing each item learned

25. Conduct training environmental audit to track participants' satisfaction after training

26. Measure productivity improvement after each training

27. Calculate the cost of training and its impact towards organization improvement

28. Conduct work performance evaluation in the workplace after training

29. Measure focus on perception of trainees towards the training program


30. Assess the increase in knowledge and skills as well as attitude change of trainees

31. Compare the cost of training program with benefits obtained from it

32. Observing and documenting the practice of knowledge and skills learned by the trainee in the workplace

33. Assess the level of transfer of learning to the job

34. Measure trainers' competency and credibility after each training program

35. Finding evidence of direct links between training investment and returns from training

Thank you for your participation!


Research Paper 3

MULTI-RATER FEEDBACK FOR TRAINING AND DEVELOPMENT: AN INTEGRATED PERSPECTIVE

Lim Guan Chong
Master of Business Administration (Finance)
University of Hull

International Graduate School of Management
University of South Australia


Multi-Rater Feedback For Training And Development: An Integrated Perspective

Lim Guan Chong
International Graduate School of Management

University of South Australia

3.1 Abstract

This paper looks at the difference between success and failure of multi-rater feedback in enhancing employee self awareness and encouraging employees to engage in development programs. Multi-rater feedback is basically used as an unfreezing process in which employees are motivated to rethink their behaviour and its impact on others. Multi-rater feedback provides employees with good data from multiple perspectives, encouraging openness to listening to and accepting their weaknesses for development. A comprehensive integral model encompassing process consultation and good conversation is used to facilitate effective development after multi-rater feedback. Process consultation provides an approach to help employees recognize and accept responsibility for differences in perception. The flexibility of process consultation should be enhanced by integrating good conversation to promote ideal communication and interaction between the process consultant and employees, which will eventually build trust for open learning and development. Post-development multi-rater feedback is introduced to assess the degree of performance improvement resulting from the development program.

3.2 Introduction

Multi-rater or 360-degree feedback has gained wide acceptance and usage to support

development of leadership and management skills (Cacioppe & Albrecht, 2000). The multi-

rater feedback process provides comprehensive feedback collected from the people around ratees in the workplace. One underlying rationale for such a system is its potential impact on


the target ratee's self awareness: increasing self awareness is thought to enhance development

(Ashford, 1993; Mount, Judge, Scullen, Sytsma & Hezlett 1998). According to Van Veslor

et al. (1993), the number of multi-rater feedback instruments has increased significantly in

the past 15 years. It is estimated that American companies spent $152 million in 1992 on this

form of feedback for development (Hoffman, 1995). Multi-rater feedback was first

introduced to the UK in the early 1990s, and has spread quickly across public and private

sector organizations (Fletcher & Baldry, 2000). The spread is based on the perceived benefits

of fairer and greater accuracy in representing performance, which creates development and learning potential that can consequently motivate changes in behaviour. In a review of 20

organizations responding to the delivery of multi-rater feedback, London and Smither (1995)

found that 40 percent of the respondents always linked multi-rater feedback to specific

developmental activity. According to Moses, Hollenbeck and Sorcer (1993), there does not

appear to be a distinct individual who founded or invented the multi-rater feedback process.

They argued that the term multi-rater feedback has been mistaken to be a newly discovered

concept, as perceptions of people have been available as long as there were people around to

observe them.

3.3 The Use of Multi-rater Feedback

The implementation of multi-rater feedback varies among organizations. The widespread

adoption of multi-rater feedback and other multi-source feedback is based on the perceived

benefits of fairer and greater accuracy in representing performance because it offers a more

rounded assessment of the individual, not just the top-down perspective of conventional

appraisal. It is an empowering mechanism, which allows subordinates to exert some

influence over the way they are managed. The same is true for peers, who can reflect on and help improve a colleague's role performance as a team member. Moreover, the multi-rater feedback

system provides a natural method for both enhancing learning and improving performance.

As the complexity of job function increases in the workplace, it is crucial for employees to

receive feedback from a variety of constituencies and not only the traditional superior-

subordinate appraisal approach. This feedback facilitates self awareness by enabling

participants to compare their own perceptions of their skills and personal style with the

perceptions of important observers in their work environment. Multi-rater systems are


assumed to improve performance by increasing self awareness through diversified information from multi-rater feedback (Borman, 1997). Ratees who receive feedback or appraisal on their performance from a variety of sources are expected to improve current job performance by continuing to add value to the organization's needs and to be prepared for the future.

Continuing development has been the key priority for most organizations in keeping the

workforce updated with on-going technological changes. Measuring and improving worker

performance has become increasingly important for organizations to stay competitive.

According to Nowack (1993), the increased use of multi-rater feedback in organizations was

mainly due to the increasing need for continuous measurement of improvement efforts; the

need for job-related feedback for employees affected by career plateauing; and the need to

maximize employee potential in the face of technological changes, competitive challenges

and increased workforce diversity.

Multi-rater feedback has also been seen to increase reliability, fairness and acceptance of the

data by the person being rated (London, Wohlers & Gallagher, 1990). This is because

feedback is received from multiple sources and not just from one. A study conducted in an

American company showed that only 3.9 percent of staff felt that feedback should come

solely from the superior, while 94.8 percent felt that feedback should come from both

superior and co-workers (Cacioppe & Albrecht, 2000). This result indicates that there is an

on-going trend in American companies to have performance feedback from multiple sources

despite the fact that there may always be variation in terms of perceptual differences between

the self and others on the feedback results. Although the multi-rater feedback system provides

the ratee with greater information as a base for development, Tornow and London (1998)

suggested that multiple feedback sources require balancing, as multiple sources may

potentially offer conflicting viewpoints to the ratee. However, the differences in perspective

between the rater and the ratee should not be treated as an assessment error. This is further

supported by Ashford (1993) who found that multi-rater feedback is important to the ratee

because the information could further stimulate the ratee's cognitive reactions that would

likely give impact to subsequent behavioral changes. A multiple feedback system is a source

of information, which can enhance personal learning by providing the opportunity to ratees

who are being assessed to compare their self-perceptions against the perceptions of others


regarding their behaviour. Multi-rater feedback is simply a set of performance-related

information which is essential for learning and development.

Bennis, Benne and Chin (1969) are of the view that multi-rater feedback is a critical element in effecting change in performance evaluation. According to Zemke and Zemke (1995), adults undertake learning experiences when they see a need to acquire a new or different skill or knowledge. Multi-rater feedback provides the opportunity for open communication between rater and ratee to discuss the ratee's past behaviour and weaknesses, encouraging openness to hearing and accepting feedback. Such feedback and open communication is instrumental in an unfreezing process in which the ratee is motivated to rethink his or her weaknesses and strengths (Shipper & John, 1992). McCauley and Moxley (1996) also viewed multi-rater

feedback as an instrument in an unfreezing process, in which ratees will have a chance to re-

think their previous and current behaviour based on the discrepancies of the results between

self and others and how their weaknesses would create impact on others. Hence, conducting

multi-rater feedback before and after training provides an avenue for the training provider to

evaluate performance changes. This shows that the feedback received can be used both as

reinforcement of past learning and also an opportunity for future learning (Rosti & Shipper,

1998).

Evidence from various settings has demonstrated an association between self awareness and

performance outcome (Fletcher, 1997). It has been found that high self awareness is associated with high performance ratings in various aspects (Atwater, Ostroff, Yammarino & Fleenor, 1998; Bass & Yammarino, 1991; Furnham & Stringfield, 1994). Nasby (1989) is of the opinion that ratees with high self awareness are better able to integrate various feedback into their self-perception in order to reach a higher performance outcome. Ashford (1984) found that

people with low self awareness are more likely to ignore or discount feedback about them

and will have a negative attitude towards work. This indicates that a highly self-aware ratee

is likely to exert self-positivism to accept feedback and show self-motivation for

improvement (Fletcher & Bailey, 2003).

However, according to Greguras, Ford and Brutus (2003), further research needs to be

conducted to investigate the effectiveness of multi-rater feedback systems in increasing one's

self awareness, which would lead to eventual improvement in performance, as the ratee may react differently to the different sources of information received. Information obtained from different sources would have different effects on the ratee's self awareness. If the feedback proves to be accurate, how would the ratee react to this perceptual reality? Conway and

Huffcutt (1997) commented that although different raters present different information, not

much research has explored how the ratee attends to, integrates, and uses the information

from the various raters. Although a multi-rater feedback system increases one's self awareness and leads to further improvement, a specific supporting mechanism needs to be included

(Hazucha, Hezlett & Schneider, 1993; Reilly, Smither & Jasilopoulos, 1996; Walker &

Smither, 1999). The specific mechanism refers to identification of appropriate personality

traits, skills or competency needed by the ratees; establishing an appropriate feedback rating

approach and selection of relevant raters. This proposed mechanism must be embedded

within the feedback system prior to the implementation. This will increase ratee readiness in

accepting the multi-rater feedback and lead to individual self awareness for further

improvement. Many studies have been conducted to discover specific interventions of the

multi-rater feedback mechanism.

3.4 The Effectiveness of Multi-rater Feedback for Development

Fletcher and Bailey (2003) found that multi-rater feedback provides the opportunity for the

ratee and raters to agree on the level of competence that is needed. Church (1997) also

supports this view and suggested that multi-rater feedback provides both the rater and ratee

with the opportunity to agree on the development needs of a required performance standard,

competency and skills necessary for the ratee. Both the rater and ratee would have an

opportunity to clarify their respective expectations in order to develop a psychological

contract or agreement. The ratee would be more focused on what is needed by the people working

around them and raters would have a better understanding of the ratee's strengths and

weaknesses. In certain multi-rater feedback systems this is known as a gap analysis process

and other multi-rater literature refers to it as congruence-d (Warr & Bourne, 1999). Edwards (1993, 1994) stated that d is the difference score between the ratee's rating and the other raters' ratings. However, Fletcher and Bailey (2003) commented that telling a ratee the d score is of no use unless the rater can provide specific and meaningful information to reduce the gap between the ratee's and the raters' scores. Congruence-d is obtained by subtracting the average score of the other raters from the self-rating for each feedback questionnaire item, and dividing that value by the standard deviation of the raters' and ratee's scores (Warr & Bourne, 1999); a small illustrative sketch of this calculation appears at the end of this paragraph. The level of self awareness is signified by the d score. If the d score is equal to 0, this signifies complete agreement between the self rating and the others' ratings on all items. Disagreement between the ratee's

and the raters' ratings generally showed low correlations between source ratings. This showed that different rater sources actually provide different information (Conway and Huffcutt, 1997; Harris and Schaubroeck, 1988). Ashford (1993) and Brutus, London and Martineau (1999) conducted studies on the relative impact of different rater sources and showed that different raters have different implications for the development of the ratee. The studies discovered that subordinate ratings had the largest impact on goal selection, followed by peer and then superior ratings. This shows that the selection of information from different rater sources is important for ratees to decide

which rater source is most qualified and which feedback is important for further improvement

(Kluger & Denisi, 2000). Mount (1984) supports the validity of subordinate ratings and

indicated that the majority of ratees show approval for subordinate ratings for developmental

purposes (Bernardin, Dahmus & Redmon, 1993; Facteau et al., 1997, 1998; London et al.,

1990; McEvoy, 1990).
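To make the congruence-d calculation concrete, the following is a minimal Python sketch of the computation described above. It assumes item-level ratings on a simple numeric scale; the function name and the sample scores are purely illustrative and are not drawn from Warr and Bourne's (1999) instrument.

    from statistics import mean, pstdev

    def congruence_d(self_rating, other_ratings):
        # Congruence-d for one questionnaire item: the self-rating minus the
        # mean of the other raters' scores, divided by the standard deviation
        # of all scores (raters and ratee) for that item.
        all_scores = [self_rating] + list(other_ratings)
        spread = pstdev(all_scores)
        if spread == 0:
            return 0.0  # identical scores, so complete self-other agreement
        return (self_rating - mean(other_ratings)) / spread

    # Hypothetical ratings on a 1-5 scale for one item.
    print(congruence_d(5, [3, 4, 3, 2]))  # positive d: self-rating exceeds the others' view
    print(congruence_d(3, [3, 3, 3, 3]))  # d = 0: complete agreement on this item

A positive d on an item suggests that the ratee rates himself or herself more favourably than others do, while a d near zero signals self-other agreement on that item.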

A further study by Greguras, Ford and Brutus (2003) examined 213 managers using a policy-capturing design that allowed factors (i.e. leading others, general administrative performance, building working relationships and overall performance) to be manipulated. The study showed that

superior ratings would be weighted more heavily than peer or subordinate ratings for the

ability to lead others, general administrative performance, building working relationship and

overall performance. Ratees will attend more to peer ratings than subordinate ratings for

general administration of roles and responsibilities because peers are more likely to

understand the ratee's duties, which are similar to their own. However, a study by Atwater,

Roush and Fischthal (1995) showed that ratees attend more to subordinate ratings than to peer ratings for the ability to lead others, as subordinates have first-hand

experience with the ratee's leadership behaviour. Selection of feedback information is tied

closely to the ratee's perception. Needless to say, ratee development success is closely related

to the ratee's perception of the source of information. Users should consider whether multi-rater feedback is best used for development or whether it only provides different dimensions of reference for ratees.


3.5 The Effectiveness of Multi-rater Feedback for Appraisal

There are debates on the use of multi-rater feedback for appraisal and development (Bracken,

Dalton, Jako, McCauley & Pollman, 1997). According to London et al. (1990) and Antonioni

(1994), respondents will answer questions differently if it is for appraisal purposes. A study

by London and Smither (1995) showed that 40 percent of people who provided multi-rater

feedback ratings said they would have altered those ratings if the company planned to use

them for evaluation or appraisal. According to McEvoy and Buller (1987), ratees view the

process as most useful when used for development as opposed to appraisal. London and

Beatty (1993) found evidence to support this. They reported that 34 percent of the

respondents in their study would rate their superior differently if the feedback were shared

with their superior. Hence, there is still an element of fear for individuals to appraise their

superiors honestly. Further studies should be carried out to determine the capability of

multi-rater feedback that is used in performance appraisal. A few researchers agree that multi-rater feedback is useful solely for developmental purposes, as it is also widely used in managerial and

leadership development programs (Cacioppe, 1998; Cacioppe and Albrecht, 2000; Garavan,

Morley & Flynn, 1997; McCauley & Moxley, 1996; Thach, 2002). O'Reilly (1994)

suggested that when multi-rater feedback is used for development purposes, scores from

raters do not vary much. However, this was not the case for formal performance appraisals.

3.6 The Variation of Multi-rater Feedback Information

A study by Kluger and DeNisi (1996) on the effectiveness of multi-rater feedback

interventions showed that only one-third actually yielded positive improvements in

performance. There is an urgent need to take a closer look at the effectiveness of multi-rater

feedback in performance development. Feedback is invaluable to the ratee as it comes from

multiple sources, and provides multiple perspectives. Each opinion or perspective may

provide relevant yet different feedback for the ratee to focus upon (Atwater & Yammarino,

1993; Hazucha et al., 1993; Tornow, 1993). Ghorpade (2000) commented that having more

information does not necessarily mean higher accuracy, and information provided by a superior alone is not necessarily partial. If the source does not have an opportunity to

observe the ratee's behaviour, or does not recognize the requirements of a particular


performance dimension, feedback from the source may be inaccurate for the ratee's

development. Therefore the quality of ratings from different sources for a particular

dimension should be assessed (Kluger & DeNisi, 1996). London and Smither (1995) stated

that ratings provided by different raters are likely to be inconsistent, which may create much confusion and disagreement over the results and may not promote future development.

According to Moses et al. (1993) multi-rater feedback relies solely on the instrument scoring

system or data collection methods to interpret the information for ratees. Moses et al. (1993)

argued that multi-rater feedback is based on people's observations and the observer may not

know what behaviour to look for. If the primary purpose of multi-rater feedback is to

identify developmental opportunities, then a set of competent performance behaviours has to

be identified and communicated to all raters prior to the process. This would enable the rater

to understand the required habits, behaviors or styles so that a proper and fair judgment

towards the ratee's performance is ensured. The rater's feedback is important and may have

an impact on the ratee's subsequent developmental priorities.

Issues in the rater's feedback such as perception bias, cultural issues and gender should also be given special attention (Cacioppe & Albrecht, 2000). An example of perception bias is the assumption that a man will show better leadership than a woman. A study of three organizations, with a total of over

20,000 employees, showed that there was a positive correlation between performance and age

until the age of 45 (Cacioppe & Albrecht, 2000). The study indicates that raters are likely to

stereotype younger ratees as performing better than older ratees, or older ratees may have

better experience compared to younger ratees. However, by looking at the cultural

dimension, Leslie, Gryskiewicz and Dalton (1998) argued that multi-rater feedback might not

necessarily be well accepted by cultures in certain countries. Some cultures do not subscribe

to the same notion that feedback is valuable and can guide manager development. For

instance, cultures such as the French may place more value on lineage or social class than

developing managers. Different cultures may find it a shock to be asked for personal information regarding their superiors. American managers find it difficult to get those who report directly to them to give negative feedback (Wilson et al., 1996). Another example is the Asian value of face-saving, where a request for the information needed in a multi-rater feedback exercise may come across as offensive (Wilson et al., 1996). An organization that wishes to conduct multi-rater feedback needs to take a closer look at the ages, cultures and genders of the raters and ratees, as these may affect the effectiveness of the multi-rater feedback process.


Honey and Mumford (1982) reflected that in the event of self assessment, most managers are

poor reflectors. They prefer to charge on with new ideas rather than look backwards and

reflect on how things might have gone better. However, according to Waldman, Atwater and

Antonian (1998) individuals who rated themselves higher are likely to have higher self-

esteem and self-concept. Disagreement over the result could be a threat to the ratee's self

esteem and weaken their motivation for further development. Special caution needs to be taken in designing multi-rater feedback in order to minimize the potential of the ratee becoming pessimistic and to ensure that the ratee's self-image is converted into productive behavioural

change (Wood, Allen, Pillenger & Kahn, 1999). The feedback process should be designed as

a tool to ensure effective interpretation of information received from multi-rater feedback to

stimulate individual and organization improvement in attaining strategic business objectives

(Heisler, 1996). Information from multi-rater feedback is mainly used for developing people

but increasingly, it is being used for strategic planning in training and development (Romano,

1994; Atwater et al., 1993). A study conducted with 48,000 participants indicated that

multi-rater feedback could successfully contribute to the effectiveness of training and

development (Cacioppe & Albrecht, 2000).

3.7 Multi-rater Feedback Practices in Malaysia

In the Malaysian training environment, multi-rater feedback could be used as one of the

assessment models for training and development. Alimo-Metcalf (1998) commented that

multi-rater feedback should only be used in the context of assessment for development.

Payne (1998) supported the view that multi-rater feedback could be a potentially powerful

and even dangerous tool. Therefore it should be confined to the developmental arena and

used by people who know what they are doing.

Training evaluation and assessment practices in Malaysia are still considered at an

elementary stage. A study conducted by Zakaria and Rodzhan (1993) on 94 manufacturing

and service organizations in Malaysia found that only 44 percent of respondent organizations

conducted formal training. Of those who conducted formal training, 23 per cent did not

conduct any training needs assessment. The main reason was lack of expertise to perform


assessment. Among these respondents, the main source of information for training needs

assessment was the problems faced by their organizations. This evidence shows the lack of

attention given to transference of skills in training evaluation and feedback. Therefore, it is

wise to establish and instill the right approach to training and development as jobs today are

increasingly complex, and the traditional method of having a superior rate a subordinate's performance is inadequate for providing the quality information needed to improve performance and skills.

The training culture in Malaysia has been indirectly influenced by multinational companies

operating in Malaysia. This is supported by a survey by Wan Aziz (1994), which showed that the majority of multinational companies operating in Malaysia brought a training culture with them. A

survey by Zakaria and Rodzhan (1993) on 108 manufacturing companies, suggested that

about 67 percent of the multinational companies interviewed conducted general and specific

training programs for all levels of staff. These multinational companies in Malaysia need to

conduct training because they require highly-skilled manpower who are able to operate new

and sophisticated machinery or to carry out research on product improvement. A study by Wan Aziz (1994)

on 120 companies showed that 55.6 percent of Malaysian-owned companies conduct training.

This shows that Malaysian-owned companies are emulating the training culture of

multinational companies in order to cope with a challenging environment. A research by

Junaidah (1999) showed that Malaysian companies feel discouraged when undertaking

training, as they are not able to track the progress of development after training. The main reason may lie in their inability to see the tangible benefits of training (Saiyadain & Juhary, 1995). The majority of Malaysian companies assess training needs only on a general

basis. Zakaria and Rodzhan (1993) found that only 16 per cent of Malaysian companies

indicated that their training needs assessment was based on the strategic plan of the

organization. This indicates a lack of strategic orientation in the way training was conducted

in Malaysian companies. Components of training and development in an organization need

to cohere with one another in supporting organizational strategy.

During the pre-training stage, a needs assessment is crucial in identifying relevant skills

needed by the individual to contribute to the strategic objectives of the company. Multi-rater

feedback would complement the needs analysis by providing the ratee with multi-source feedback for further development. The ratee will be given an opportunity to understand their strengths and weaknesses from different sources and to focus on reducing weaknesses and maintaining strengths. In Malaysia, training needs assessments are not conducted by measuring


individual skill deficiency but through the general perceptions of a few top executives in the whole department or organization. Organizations see training as an organizational need

rather than an individual need. Mirza and Juhary (1995) found that training organizations in

Malaysia offered training programs that were relevant to the needs of the organizations but were too theoretical, one-shot with no follow-up and not interactive. Organizations have neither the professional competence nor the resources to identify training needs and mount relevant training programs. Mirza and Juhary (1995) indicated that the stated flaws of

training could be attributed to the partial training culture brought by multinational companies

in Malaysia. They commented that assessment is difficult; it is almost impossible to

determine which employee weakness can be addressed by training. The culture of

conducting training evaluation among Malaysian companies was simply not popular or encouraging.

According to June and Rodzhan (2000), without proper pre- and post-training evaluation, the organization will be constrained in its ability to link training with strategic objectives. It would be difficult for training and development to have a meaningful impact on

organizational effectiveness. Their study also provided the argument that multi-rater

assessment is not practiced by Malaysian-owned companies for development. If Malaysian

companies wish to conduct complete training and proper assessment, it is wise to use multi-

rater feedback on the training needs assessment so that organizations would be more focused

in the development process and able to measure its effectiveness.

Shipper and John (1992) found that multi-source information may be a mechanism for open communication among diverse groups to establish a proper psychological contract and clarify expectations. This is supported by Luthans and Farner's (2002) study, which used the Kirkpatrick (1994) training evaluation framework integrated with multi-rater feedback on 409 expatriate workers from 49 multinational companies to examine whether transfer of learning on the job was well received. This training evaluation framework may be applied to local managers who work in multinational companies in Malaysia and who are not clear about the cultures brought in by expatriates or the expectations of their foreign counterparts within the company. Therefore, multi-rater feedback, which has been described as a needs analysis process, will clarify the ratee's expectations with the people working around them (Fletcher & Bailey, 2003).


Instilling multi-rater feedback as part of the pre-training needs analysis in Malaysia can

bring practical benefits for the organization by focusing on a particular behaviour or key

competency that is necessary for employee development. Employees will also have the

chance to audit self-perception against others through this self awareness mechanism, which

will result in higher work performance (Atwater et al., 1998; Bass & Yammarino, 1991;

Furnham & Stringfield, 1994). The information received from the multi-rater feedback would impact on the targeted individual's self awareness and lead to the achievement of

agreed developmental needs (Fletcher & Bailey, 2003). Indeed, research has confirmed that

the use of multi-rater feedback is one of the best methods to promote ratees' self awareness of

their strengths and skill deficiencies (Hagberg, 1996; Rosti & Shipper, 1998; Shipper &

Dillard, 2000). Multi-rater feedback has been defined as an information gathering process

from relevant observers and is linked to specific business needs or objectives. Therefore, a

multi-rater feedback refers to the practice of providing an employee with perceptions of his or

her performance competencies from numerous sources (Cacioppe & Albrecht, 2000). By

reviewing different perceptions of their performance competencies, ratees can confirm their

strengths as well as identify their blind spots, habits, behaviours and styles, which may have

an adverse impact on others and their developmental priorities. This process helps a ratee to

focus on and develop performance competencies through a well-structured development

process. Waldman et al. (1998) were concerned about the lack of research examining the

effectiveness of multi-rater feedback on the performance developmental cycle.
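As a rough illustration of how perceptions gathered from numerous sources might be tallied into the kind of strengths-and-blind-spots summary described above, the short Python sketch below aggregates hypothetical competency ratings by source. The competencies, the one-point threshold and the scores are invented for illustration only and are not taken from any instrument cited in this paper.

    from statistics import mean

    # Hypothetical competency ratings (1-5) for one ratee from several sources.
    ratings = {
        "leading others":         {"self": 5, "superior": 3, "peers": [4, 3], "subordinates": [2, 3]},
        "building relationships": {"self": 3, "superior": 4, "peers": [4, 4], "subordinates": [4, 5]},
    }

    for competency, item in ratings.items():
        others_avg = mean([item["superior"], *item["peers"], *item["subordinates"]])
        self_score = item["self"]
        if self_score - others_avg >= 1:
            label = "possible blind spot (self-rating well above others)"
        elif others_avg - self_score >= 1:
            label = "possible hidden strength (others rate higher than self)"
        else:
            label = "broad self-other agreement"
        print(f"{competency}: self={self_score}, others={others_avg:.1f} -> {label}")

Such a summary is only a starting point; as noted above, the quality of each source's ratings and the ratee's readiness to accept them still determine whether the feedback translates into development.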

3.8 Integrating Multi-rater Feedback with Development Tool

Organizations need to look at development as a continuous process by incorporating

a development model in the multi-rater feedback system. If the purpose of having multi-rater feedback is not clear and it is not integrated with the developmental systems, it will come across as merely a passing trend. This is shown by Judge and Cowell (1997), who used executive coaching as a

development process after conducting multi-rater assessment. The study showed that the

combination of multi-rater feedback coupled with individual coaching as a developmental

process increased leadership development effectiveness by 60 percent. This was based on the

direct report and peer post-survey feedback. Another study by Heisler (1996a) used a

comprehensive combined model which integrated multi-rater feedback with the leadership


and management skills development process. This approach was applied to a sample of 304

superiors and more than 1000 subordinates. The result showed that the ratees felt an increase

of ownership towards their personal and professional development. Ratees reported

improved communication and interaction with their superiors, peers and subordinates

(Heisler, 1996a). Effective communication and interaction between raters and ratee will

reduce possible multi-rater feedback drawbacks.

The developmental process of multi-rater feedback involves a great deal of cognitive

complexity and acknowledgement of the validity and legitimacy of the feedback. It also

requires balancing multiple or conflicting perspectives and balancing a sense of self with the

larger context and role requirements. There should be some mechanism to address the

discrepancy between the ratee's and rater's feedback in order to make it into a coherent

developmental tool. The identified discrepancies can be used to assist ratees in developing

their personal action plan for development. Research is needed to clarify and validate the

most effective concept design to develop ratee after multi-rater feedback.

3.9 Multi-rater Feedback: Process Consultation as a Development Tool

Process consultation is an ideal support tool for development (Schein, 1997). The process

consultation session should be conducted after multi-rater feedback so that it turns out to be a

very positive experience, regardless of discrepancies in the results. Process consultation is an

ongoing development system approach in which a skilled third party (the process consultant) works with ratees, helping them learn about their competency gaps from the multi-rater feedback process. The process consultant should emphasize the ratee's strengths and improve on what the ratee does best, not what he or she does worst. In spite of this, if a differing world-view arises between the process consultant and the ratee (client), the process consultant may use

non-directive techniques in order to help the client recognize and accept responsibility for the

deficiency (Hall, Otazo & Hollenbeck, 1999; Judge & Cowell 1997; Thach & Heinselman,

1999).

Process consultation is based on the idea that ownership of the issues of concern remain with

the ratee, who has actively participated in defining the key issues resulting from multi-rater


feedback and formulating a solution that is culturally appropriate (Schein, 1987). The role of

the consultant revolves around facilitation and engaging in a helpful relationship with the

ratee, rather than simply being a provider of expertise. The process consultant's role is more

nondirective and questioning as he or she gets the groups to solve their own problems

(French & Bell, 1999). This approach increases the likelihood of confronting the most pressing issues and helps the ratee benefit from the problem-solving skills needed for ongoing

organizational change.

Schein (1987) commented that process consultation is not one single thing the process consultant does; rather, it is the overarching goal of helping the ratees (clients) achieve change and resolve key issues of concern through different interventions. Although information on the stages of change (Lovelady, 1989) and the focus of intervention (Fagenson & Burke, 1990) is important, it reveals too little about the specific activities that

process consultants engage in, and the skills they need to accomplish them successfully.

Schein (1987) concluded that process consultants make interventions in the following order:

agenda setting, feedback of observations or other data, counseling and coaching, and

structural suggestions, if any. During the process consultation, ratees (clients) who wish to change their traditional practices and behaviours need to be given the opportunity to reflect on a wide range of meaningful feedback. Without reflection, changing the ratee's behaviour or performance is mere lip service. Process consultants will also be given the opportunity to reflect on their feelings, thoughts and perceptions of the ratee's development. Through the

reflection process, process consultants will be able to evaluate the degree of reaction and

learning of the ratee. This is supported by Kolb's (1984) learning theory which states that an

individual will learn effectively if he or she is able to reflect on the feedback received.

It is important to take a closer look at a process consultant's actual intervention role which

involves intertwining events, issues, thoughts, emotions and human interactions. Schein

(1987) and Weisbord (1988) showed appreciation for the complex role and behavioural

repertoire required by the process consultant. Most research does not distinguish between the

different settings and contexts for consultancy practice (Chapman, 1998). The question arises

as to whether the process consultant engages in different activities in their work within the organization and, if so, what particular skills are required for them to successfully develop a ratee (client). This question is of interest to many people, including the process consultants

themselves. Chapman (1998) asserted that a successful facilitation process requires building


emotional ties between the process consultant and client through good communication and

interpersonal skills. According to Kirkpatrick (1959), adults must be motivated to learn.

Hence through effective communication and interpersonal interactions, development of

psychological contract and emotional ties between both parties will motivate them to

participate in the development plan (Wolfe & Kolb, 1984).

3.10 Micro Perspective of Conversation Theory in Process Consultation

Pask's (1975) work in developing a human learning system through conversation theory may

be used to enhance facilitation between the process consultant and client. Conversation theory underpins a framework for intervention analysis called the conversations model, developed by Ledington in 1989. The elements of the framework are the individuals, groups or organizations involved. The framework is used to manage an intervention, and the intervener is free to

construct a social group or community. Conversation is a means of knowledge acquisition

and is a process in gaining self-understanding and mutual understanding. It is also a way to

achieve predetermined objectives by using specific strategies (Navarro, 2001). The specific

strategies could be used in a fair manner by trying to genuinely convince the other

participants in a good conversation session.

The conversation model would be able to guide process consultants on how to go through

various intervention strategies so that the client is stimulated to tell his or her story with

minimal disruption of either the process or content. This can be done if every learning

conversation is followed by reflections by both the process consultant and client. Pask's

(1976a) conversation theory mentioned that reflection would bring about a desired emergent

behaviour which shows what the participants have learned and achieved, how the participants

have interacted interpersonally, and what the participants need to learn in the future.

Pask's (1976a) conversation theory contemplated the phenomenon of human learning as the

result of an emergent process of conversation such as linguistic interaction based on

conscious, conceptual resonance between several P-individuals. These P-individuals can be

distinct points of view within a biological individual, different biological individuals or even

specific groups of them. A P-individual is an effective participant in a conversation and can connect with many other P-individuals. He suggested the existence of a close relationship between these two aspects: P-individuals as perspectives and P-individuals as participants.

According to Pask's (1976a) conversation theory, a process of communication can also be

considered as a P-individual: a strict conversation is a prototypical P-individual. There are

three different conceptual aspects coexisting in the idea of a P-individual: the concept of a

cognitive perspective, the concept of a participant (in a conversation) and that of a whole

conversation. The conversation is a P-individual, and so are the participants who converse

with each other (Pask, 1961).

Hence, good conversation is an important concept derived from Pask's conversation theory that forms the basis for effective process consultation by encouraging the process consultant and his or her client to produce new behaviour through mutual information transfer and a shared network of concepts. Pask (1975) adopted simple alphabetical notation to explain his conversation process, in which A (the process consultant) converses with B (the client) and both commit themselves to some dependency or relationship T (the agreed course of action).

The commitment of A and B to T is sought because this supposedly leads to desired

outcomes. In an analogous manner, the performance of a client in a conversation will

potentially involve the whole personality and not just the epistemic resource. Pask (1975) did

not consider information and conversation as a pre-selection of interaction but as a

consequence of the emergence of new realities when a given system interacts with other

systems. This emergence is due to the synchronization effect of the two systems.

The creation of the learning context requires self awareness as well as a social context for

intentional interaction (Black & Mendenhall, 1990). The learning context can be facilitated

by developing good conversation between the process consultant and his or her client. Good

conversation creates a form of conversation between the process consultant and client where

norms of discourse are developed consensually, values and assumptions can be surfaced and

tested, and all voices can be heard (Schuurman & Veermans, 2001). Through good

conversation the process consultant can enhance transfer learning and proceed with the

development process for his or her client.

Good conversation creates a cycle of effective transaction between the process consultant and

client who come into conversation and learn from each other. Research on stereotyping has

found an association between the level of self-acceptance a client feels and the tendency to


stereotype or accept others (Adorno, Frenkel-Brunswik, Levinson & Stanford, 1950; Rubin,

1967). Therefore, we expect that as clients increasingly accept themselves, they are more

able to let go of their prejudices and stereotypes of the process consultant. When clients are

fully free to speak, and feel they are genuinely being heard, the affirmation they experience

enhances self-acceptance. This enables them to listen more completely and allows for a synergistic cycle of being heard and experiencing increased self-acceptance (Rogers, 1970). Through good conversation, the possibility of stereotyping will be minimized, allowing the process consultant to play his or her role more effectively.

3.11 An Integrated Approach for Post Multi-rater Feedback Development

Post multi-rater feedback development starts with a contact client, also known as the ratee, with whom the process consultant meets concerning his or her performance deficiencies. Whether or not that client admits to owning the performance deficiencies that are to be worked on during development, the process consultant would not want to be prematurely

perceived as an expert. The process consultant would want the client to feel helped after a

few meetings. The client should feel that every conversation is helpful especially during

early interactions.

According to Schein (1997), the process consultant and client have something to learn from

each other during the development process. He came up with eight general principles to

improve the flexibility of process consultation. They are: always be helpful, always deal with

reality, access your ignorance, everything you do is an intervention, it is the client who owns

the problem, go with the flow, be prepared for surprises and learn from them, share the

problem. These eight general principles govern the process consultant's roles and

relationship with the client. Chapman (1998) said that a process consultant should adopt

flexible consulting roles. Some clients may need a mentor and adviser on general

management matters as much as they require a facilitator and project manager. He further

commented that good process consultants help to identify the real issues and challenges

facing the organization as well as discuss a tailor-made process for constructive change.


Although Schein's eight general principles were intended to enhance the flexibility of the process consultant's role, they do not mention how an effective dialogue session could be established between the process consultant and his or her client. Vygotsky (1978) mentioned that a psychological contract could be established through effective dialogue and that it will help the process consultant and client reach higher levels of understanding. The establishment of a psychological contract would be an opportunity for the process consultant and client to learn

important new things about a situation when they explore it together.

According to Schein (1997), the client owns the problem and has to live with the

consequences of the problem and the solution. Therefore the consultant must not take the problem away from the client, because the client is the best person to understand and appreciate what the next best steps would be. The client's involvement depends on their

willingness to openly discuss issues they are facing and the trust they give the consultant.

Sometimes the client hides the real problem because he or she is testing the consultant to

determine whether the relationship is characterized by sufficient trust to reveal what may be

very intimate and personal information. Trust building therefore requires greater 'good

conversation' between both parties to explore their commitment and intention.

Learning about good conversations and adjusting our responses to different individuals,

groups and issues appropriately, can have a dramatic impact on outcomes for individuals,

teams and whole organizations. Any significant human learning is not just cognitive

information-processing but also moral and aesthetic co-construction of parts of our life-world

(Boyd, 2001). The process consultant and client may have conflicting views and feelings if

both parties hold strongly to their beliefs or worldview. The resolution to this conflicting

belief or worldview demands that new realities be generated through synchronisation of

perceptual differences (Navarro, 2001). Reflections provide an avenue for both parties to

understand each other's views and learn from each other's differences. The process

consultant may use reflections to help his or her client to focus on one behavioral change they

would like to make as a result of their experience: "What do you want to work on, and are

you willing to make a commitment to change?" This gives their reflection an action

component which is often beneficial.

Learning occurs in two forms: single-loop and double-loop (Argyris, 1994). Single-loop

learning asks a one-dimensional question to elicit a one-dimensional answer. Double-loop


learning takes an additional step, or more often than not, several additional steps. It turns the

question back on the questioner. It asks what the media call follow-up questions. A double-loop process might also ask why the current setting was chosen in the first place. Because double-loop learning depends on questioning one's own assumptions and behaviour, an apparently benevolent strategy that avoids such questioning is actually anti-learning (Argyris, 1994). Admittedly, being considerate and positive can contribute to the solution of single-loop problems, for example cutting costs.

But it will never help people figure out why they lived with problems for years, why they

covered up, why they were so good at pointing to the responsibility of others and so slow to

focus on their own. The notion of good conversation expands the phenomenon of ideal

speech to include ideal listening and promote interaction. For ideal listening to occur, the

individuals must feel secure and accepting enough of themselves to be open to new

possibilities. Enhanced self-acceptance can contribute to the possibility of valuing the

diversity of others. Thus, the responsibility of the process consultant includes nurturing

clients' self-acceptance and inspiring a sense of personal power among people around them.

This framework promotes a deep commitment to empathetic interaction between process

consultant and client to construct a shared reality as a common setting for development

pathway (Navarro, 2001).

In good conversation theory, learning is approached from an inside-out perspective based on

personal experience (Hunt, 1987). Under the process consultation practice, individual

personal experience is required to be reflected on. Reflection pinpoints and dramatizes what

individuals have learned and achieved, how individuals have interacted interpersonally, and

what individuals need to learn in the future. Through valuing each person's individual

experience, the uniqueness of every person is assumed and considered a resource. With the

more typical outside-in approach to learning, the dissimilarity of each person is considered a

problem to be solved (Hunt, 1987). The point of departure for learning in 'good

conversation' is not only the presumption that each individual is different, but that diversity is

an inherent resource. In this consensual and self-reflective process, as more and more diverse

reflections become fully heard within the group, the values and perspectives of each member

influence others, and the process of mutual socialization evolves.

The process consultant must become the reflective practitioner cum learner besides helping

the individual benefit from the double loop processes (Argyris & Schon, 1978). They must

diagnose the issues and take action to improve the practice by involving themselves directly


and fully, preparing to investigate such experiences from as many different perspectives as

possible and patterning their observations into meanings through reflection. Documentation

of the agreed proposed course of action was not mentioned by Schein (1997). In the absence

of documented agreement between the process consultant and client, commitment in fulfilling

the course of action is unlikely to happen. Schuurman and Veermans (2001) derived two

classes of consequences from conversation theory: the weak consequence and the strong

consequence. The weak consequence stresses observation, the strong consequence stresses

control. The weak consequence concentrates on record keeping of the experimental subject,

closed conversation and topic of exchanges. The strong consequence notes the records of

agreements derived from the outcome of negotiations between two parties. Both these

consequences will bring about total commitment between the process consultant and the

client. The documentation process that records the reflections made between the two parties during the learning process turns them into an obligation to be fulfilled. Besides this,

the record keeping will also provide both parties with a flow of progression towards their

development goals.
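As a loose sketch of the kind of record keeping argued for above, the Python fragment below logs the topics, reflections and agreements of each process consultation session so that both parties can track progress against the agreed course of action. The class name, field names and example entries are hypothetical and are offered only to illustrate the idea of documenting agreements, not as a prescribed format.

    from dataclasses import dataclass, field
    from datetime import date

    @dataclass
    class SessionRecord:
        # One process consultation session: what was discussed and what was agreed.
        held_on: date
        topics: list          # the issues worked on (object-level exchanges)
        reflections: list     # each party's reflections on the session
        agreements: list = field(default_factory=list)  # agreed actions (meta-level)

    log = []
    log.append(SessionRecord(
        held_on=date(2005, 3, 14),
        topics=["gap between self and subordinate ratings on leading others"],
        reflections=["client: surprised by the size of the gap",
                     "consultant: client seems open to revisiting delegation habits"],
        agreements=["client to seek informal feedback from two subordinates before the next session"],
    ))

    # A simple progress view: how many agreed actions are on record so far.
    print(sum(len(s.agreements) for s in log), "agreed action(s) on record")

Keeping such a record alongside the conversation itself is one way of turning the parties' reflections into an obligation to be fulfilled, as discussed above.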

Pask (1961, 1965, 1975a, 1975b, 1976a, 1976b) introduced both the object-language and

meta-language to explain the required exchanges during the learning process. He stressed the

need for researchers to distinguish object-language from meta-language at any of the

learning interfaces (Schuurman & Veermans, 2001). The object-language comprises a system

of expression (i.e. sentences during conversation) belonging to the object of study. These

sentences should be internal expressions of the object, that is, they should reflect properties of the object, and these expressions should conform to well-defined rules (in this case, the

developmental pathway undertaken by the process consultant and the client). Within the

meta-language, a new object-language can be proposed. If the new object-language fits the

purpose (i.e. learning objective) better than the original object-language then it can be

replaced. However, this is only possible if the process consultant and/or the client knows

what to replace. Keeping apart object-language and meta-language allows revisions to be

tracked. This is a crucial prerequisite for systematic inquiry (De Zeeuw, 1995). The record keeping process holds a very important key to making this a success (Schuurman &

Veermans, 2001). Pask (1965) considered interaction between object-language and meta-

language pivotal in learning and human performance in general. Pask observed object-

language and meta-language interactions so as to study how conversations are punctuated by

agreements (including agreements to disagree). According to Pask, the researcher needs to keep a proper record of the interaction between object-language and meta-language and to mark all agreements. The agreements serve as the true hard data of the controlled conversation. He argued that psychological experiments start with basic meta-language interactions: the experimenter and experimental subject have to agree on their respective roles (Schuurman &

Veermans, 2001). The meta-language interactions that Pask strongly advocated should serve

as the whole basis of process consultation sessions where emergent behaviour for learning is

likely to happen.

However, there are a few drawbacks to conversation theory: the theory itself eliminates some basic traits of the human mind and human interaction and ignores other aspects of human reality (Navarro, 2001). Other factors which prevent a straightforward application of conversation theory to the study of social realities are its strong dependence on Pask's own theory rather than on the study of real social life, specifically of human interactions. To address the weakness of conversation theory in a real social environment and natural conversation situation, the strength of the theory depends on its ability to bring a specific aspect of the world into sharp focus. Massaro and Cowan (1993) suggested that in building

a community of good conversation, people are required to put themselves in the shoes of

others and to empathise if they are to arrive at consensually developed norms. Through

empathy for others, they can begin to understand and bring to life the feelings of others, and even accommodate the consequences of each other's norms that are distinct from their own. It has to be an attempt

to truly see the world as the other sees it, understand the real life situation of the other and

adopt other's perspectives and values (Massaro & Cowan, 1993). One assumption of good

conversation is its essential dynamic quality and process, resisting a tendency to control for

predictability. This form of conversation implicitly and explicitly sets the conditions for

valuing individuals or organizations through the integration of affective and cognitive modes

of experience and learning (Argyris, 1994). In this initial phase of the process, the process

consultant's role is particularly important. As part of the norm creation process, the process

consultant needs to be continually modeling a respectful and inclusive approach throughout

to foster a safe, receptive space for the conversation to unfold (Argyris, 1994).


3.12 Conclusion

Although research on multi-rater feedback assessment indicates that different rater sources

provide different information, the multi-rater feedback technique is still useful at the preliminary stage to provide information and create self awareness of individual strengths, weaknesses or blind spots. One underlying rationale for such systems is their potential impact on the target individual's self awareness, and increased self awareness is thought to enhance performance. This paper provides a concept of how multi-rater feedback can lead to a

successful developmental process through process consultation in Malaysia. Through the

years, the training evaluation culture in Malaysia has not been properly practiced; hence it is recommended that a proper approach be used to enable organizations to see the benefits of holding pre-training needs analysis and an effective development approach, so that comprehensive training and development can be instilled in the Malaysian environment. The process consultant holds the key to an effective development process, using multi-rater assessment as a pre-training gap analysis.

Process consultation provides the opportunity to check and balance the degree of learning and

development activities through reflection, problem solving capabilities and application of

theories throughout the developmental process. The flexibility of process consultation should

be enhanced by integrating conversation theory, using good conversation and documentation of the pre-agreed commitments to action that emerge from reflection. This will promote ideal communication and interaction between the process consultant and client, which will

eventually build trust for open learning and development. Good conversation is an important

intervention tool that has potential for applying effective human communication, decision

making, and policy making in the development process through single loop and double-loop

learning.

The multi-rater feedback approach also gathers information from various sources in order to evaluate the level of transfer of learning of an individual at the end of the development stage of process consultation. It is recommended that an integrated and comprehensive model be adopted, comprising preliminary multi-rater feedback assessment followed by a developmental process using process consultation and good conversation, in an effort to facilitate the transfer of learning to the organization.


3.13 References for Paper Three

Adorno, T.W., Frenkel-Brunswik, E., Levinson, D.J. & Stanford, R.N. 1950, The authoritarian personality, Harper and Brother, New York.

Alimo-Metcalf, B. 1998, 'Editorial: 360-degree assessment and feedback', Professional Forum, vol. 6, no. 1, pp. 16-18.

Antonioni, D. 1994, 'Designing an effective 360-degree appraisal feedback system', Personnel Psychology, vol. 47, pp. 349-356.

Argyris, C. & Schon, D. 1978, Organizational Learning: A Theory of Action Perspective, Addison-Wesley, Reading, MA.

Argyris, C. 1994, 'Good conversation that blocks learning', Managerial Excellence, Harvard Business Review, vol. 15, pp. 303-317.

Ashford, S.J. 1984, 'Self-assessments in organizations: a literature review and integrative model', Research in Organizational Behaviour, vol. 11, pp. 133-174.

Ashford, S.J. 1993, 'The feedback environment: an exploratory study of cue use', Journal of Organizational Behaviour, vol. 14, pp. 201-224.

Atwater, L.E. & Yammarino, F.J. 1993, 'Personal attributes as predictors of superiors' and subordinates' perceptions of military academy leadership', Human Relations, vol. 46, pp. 645-668.

Atwater, L.E., Ostroff, C.M., Yammarino, F.J. & Fleenor, J.W. 1998, 'Self-other agreement: does it really matter?', Personnel Psychology, vol. 51, no. 3, pp. 577-598.

Atwater, L.E., Roush, P. & Fischthal, A. 1995, 'The influence of upward feedback on self- and follower ratings of leadership', Personnel Psychology, vol. 48, pp. 35-49.

Bass, B.M. & Yammarino, F.J. 1991, 'Congruence of self and others' leadership ratings of naval officers for understanding successful performance', Applied Psychology: An International Review, vol. 40, no. 4, pp. 437-454.

Bennis, W.G., Benne, K.D. & Chin, R. 1969, The Planning of Change, 2nd edn, Holt, Rinehart and Winston, New York, NY.

Bernardin, H.J., Dahmus, S.A. & Redmon, G. 1993, 'Attitudes of first line supervisors towards subordinate appraisals', Human Resource Management, vol. 32, pp. 315-324.

Black, J.S. & Mendenhall, M. 1990, 'Cross-cultural training effectiveness: a review and a theoretical framework for future research', Academy of Management Review, vol. 15, pp. 113-136.


Borman, W.C. 1997, '360-degree ratings: an analysis of assumptions and research agenda for evaluating their validity', Human Resource Management Review, vol. 7, pp. 315-324.

Boyd, G. 2001, 'Reflections on the conversation theory of Gordon Pask', Kybernetes, vol. 30, no. 5/6, pp. 560-570.

Bracken, D.W., Dalton, M.A., Jako, R., McCauley, C.D. & Pollman, V.A. 1997, Should 360-degree Feedback Be Used Only for Developmental Purposes?, Center for Creative Leadership, Greensboro, NC.

Brutus, S., London, M. & Martineau, J. 1999, 'The impact of 360-degree feedback on planning for career development', Journal of Management Development, vol. 18, pp. 676-693.

Cacioppe, R. 1998, 'An integrated model and approach for the design of effective leadership development programs', Leadership and Organization Development Journal, vol. 9, no. 1, pp. 44-53.

Cacioppe, R. & Albrecht, S. 2000, 'Using 360-degree feedback and the integral model to develop leadership and management skills', Leadership and Organization Development Journal, vol. 21, no. 8, pp. 390-404.

Cacioppe, R. & Albrecht, S. 2000, 'Differing perceptions of managers: behaviours using the holon leadership-management model', in Parry, K. (ed.), forthcoming.

Chapman, J. 1998, 'Do process consultants need different skills when working with nonprofits?', Leadership and Organization Development Journal, vol. 19, no. 4, pp. 211-215.

Church, A.H. 1997, 'Managerial self awareness in high performing individuals in organizations', Journal of Applied Psychology, vol. 82, pp. 281-292.

Conway, J.M. & Huffcutt, A.I. 1997, 'Psychometric properties of multi-source performance ratings: a meta-analysis of subordinate, supervisor, peer and self-ratings', Human Performance, vol. 10, pp. 331-360.

De Zeeuw, G. 1995, 'Values, science and the quest for demarcation', Systems Research, pp. 15-24.

Edwards, J.R. 1993, 'Problems with the use of profile similarity indices in the study of congruence in organizational research', Personnel Psychology, vol. 46, pp. 641-665.

Edwards, J.R. 1994, 'The study of congruence in organizational behaviour research: critique and proposed alternative', Organizational Behaviour and Human Decision Processes, vol. 58, pp. 51-100.


Facteau, J.D., Facteau, C.I., McGonigle, T.P. & Fredholm, R.I. 1997, Characteristics of feedback and managers' reactions in multi-source appraisal systems, paper presented at the 12th annual conference of the Society of Industrial and Organizational Psychology, St. Louis, MO.

Fagenson, E. & Burke, W. 1990, 'Organization development practitioners' activities and interventions in organizations during the 1980s', Journal of Applied Behavioural Science, vol. 26, no. 3, pp. 285-297.

Fletcher, C. 1997, 'Self awareness: a neglected attribute in selection and assessment?', International Journal of Selection and Assessment, vol. 5, no. 3, pp. 183-187.

Fletcher, C. & Baldry, C. 2000, 'A study of individual differences and self awareness in the context of multi-source feedback', Journal of Occupational and Organizational Psychology, vol. 73, pp. 303-319.

Fletcher, C. & Bailey, C. 2003, 'Assessing self awareness: some issues and methods', Journal of Managerial Psychology, vol. 18, no. 5, pp. 395-404.

French & Bell 1999, Organization Development: Behavioural Science Interventions for Organization Improvement, 6th edn, Prentice-Hall Publisher, New Jersey.

Furnham, A. & Stringfield, P. 1994, 'Correlates of self and subordinate ratings of managerial practices as a correlate of supervisor evaluation', Journal of Occupational and Organizational Psychology, vol. 67, no. 1, pp. 57-67.

Garavan, T.N., Morley, M. & Flynn, M. 1997, '360-degree feedback: its role in employee development', Journal of Management Development, vol. 16, no. 2, pp. 134-147.

Ghorpade, J. 2000, 'Managing five paradoxes of 360-degree feedback', Academy of Management Executive, vol. 14, pp. 140-150.

Greguras, G.J., Ford, J.M. & Brutus, S. 2003, 'Managers' attention to multi-source feedback', Journal of Management Development, vol. 22, no. 4, pp. 345-361.

Hagberg, R. 1996, 'Identify and help executives in trouble', Human Resource Magazine, vol. 41, no. 8, pp. 88-92.

Hall, D., Otazo, K. & Hollenbeck, G. 1999, 'Behind closed doors: what really happens in executive coaching', Organizational Dynamics, vol. 27, no. 3, pp. 39-58.

Harris, M.M. & Schaubroeck, J. 1988, 'A meta-analysis of self-supervisor, self-peer and peer-supervisor ratings', Personnel Psychology, vol. 41, pp. 43-62.

Hazucha, J.F., Hezlett, S.A. & Schneider, R.J. 1993, 'The impact of 360-degree feedback on management skills development', Human Resource Management, vol. 32, pp. 325-351.

Heisler, W.J. 1996a, '360-degree feedback: an integrated perspective', Career Development International, vol. 1, no. 3, pp. 20-23.


Hoffman, R. 1995, 'Ten reasons you should be using 360-degree feedback', HumanResource Management Magazine, vol. 40, no. 4, pp. 82-86.

Honey, P. & Mumford, A. 1982, Manual ofLearning Styles, Honey Publication,Maidenhead.

Hunt, D.E. 1987, Beginning With Ourselves, Brookliine Books, Cambridge, MA.

Judge, W. & Cowell, J. 1997, 'The brave new world of executive coaching', BusinessHorizons, vol. 40, no. 4, pp. 71.

Junaidah, H. 1999, Training Management: A Malaysian Perspective, Prentice-HallPublisher, Pearson Education, Malaysia.

June, M.L. P. & Rodzhan, 0. 2000, 'Management training and development practices ofMalaysian organizations', Journal of the Malaysian Institute of Management,Malaysian Management Review, vol. 35, no. 2, pp. 77-85.

Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction',Journal of American Society for Training and Developing, vol. 13, pp. 3-9,

Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning',Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.

Kirkpatrick, D.L. 1994, Evaluating Training Programs The Four Levels, Berrett-KoehlerPublishers, San Francisco.

Kluger, A.N. & Denisi, A.D. 1996, 'The effects of feedback interventions on performance:historical review, a meta-analysis and a preliminary feedback intervention theory',Psychological Bulletin, vol. 119, pp. 254-284.

Kluger, A.N. & DeNisi, A.D. 2000, 'Feedback effectiveness: can 360-degree appraisals be improved?', Academy of Management Executive, vol. 14, pp. 129-139.

Kolb, D. 1984, Experiential Learning, Prentice-Hall Publisher, New Jersey.

Leslie, J., Gryskiewicz, N. & Dalton, M. 1998, 'Understanding cultural influences on the 360-degree feedback process', in Maximizing the Value of 360-Degree Feedback: A Process for Successful Individual and Organizational Development, eds Tornow, W. & London, M., Jossey-Bass, San Francisco, pp. 196-216.

London, M. & Beatty, R.W. 1993, '360-degree feedback as a competitive advantage', Human Resource Management, vol. 32, no. 2-3, pp. 353-372.

London, M. & Smither, J.W. 1995, 'Can multi-source feedback change perceptions of goal accomplishment, self evaluations and performance-related outcomes? Theory-based applications and directions for research', Personnel Psychology, vol. 48, pp. 803-839.

London, M., Wohlers, A.J. & Gallagher, P. 1990, '360-degree feedback surveys: a source of feedback to guide management development', Journal of Management Development, vol. 9, pp. 17-31.

Lovelady, L. 1989, 'The process of organization development: a reformulated model of the change process, Part 1', Management Decision, vol. 27, no. 4, pp. 143-154.

Luthans, K.W. & Farner, S. 2002, 'Expatriate development: the use of 360-degree feedback', Journal of Management Development, vol. 21, no. 10, pp. 780-793.

Massaro, D.W. & Cowan, N. 1993, 'Information processing models: microscopes of the mind', Annual Review of Psychology, vol. 44, pp. 383-425.

McCauley, C.D. & Moxley, R.S. Jr. 1996, Development 360: How Feedback Can Make Managers More Effective, Jossey-Bass Publisher, San Francisco.

McEvoy, G.M. 1990, 'Public sector managers' reactions to appraisals by subordinates', Public Personnel Management, vol. 19, pp. 201-212.

McEvoy, G.M. & Buller, P.F. 1987, 'User acceptance of peer appraisals in an industrial setting', Personnel Psychology, vol. 40, pp. 785-797.

Mirza, S.S. & Juhary, H.A. 1995, Managerial Training and Development in Malaysia, Malaysian Institute of Management, Malaysia.

Mount, M.K. 1984, 'Psychometric properties of subordinate ratings of managerial performance', Personnel Psychology, vol. 37, pp. 687-701.

Mount, M.K., Judge, T.A., Scullen, S.E., Sytsma, M.R. & Hezlett, S.A. 1998, 'Trait, rater, and level effects in 360-degree performance ratings', Personnel Psychology, vol. 51, pp. 557-576.

Moses, J., Hollenbeck, G.P. & Sorcher, M. 1993, 'Other people's expectations', Human Resource Management, vol. 32, Summer/Fall.

Nasby, W. 1989, 'Private self-consciousness, self awareness and the reliability of self-reports', Journal of Personality and Social Psychology, vol. 56, no. 6, pp. 950-957.

Navarro, P. 2001, 'The limits of social conversation', Kybernetes, vol. 30, no. 5/6, pp. 771-788.

Nowack, K. 1993, '360-degree feedback: the whole story', Training and Development Journal, vol. 47, no. 1, pp. 69-73.

O'Reilly, B. 1994, '360-degree feedback can change your life', Fortune Magazine, vol. 130, no. 8, pp. 93-97.

Pask, G. 1961, An Approach to Cybernetics, Hutchinson, London.

Pask, G. 1965, Inleiding tot de Cybernetica [Introduction to Cybernetics], Het Spectrum, Utrecht.

101

Page 108: Evaluating training effectiveness

Pask, G. 1975a, The Cybernetics of Human Learning and Performance, Hutchinson, London.

Pask, G. 1975b, Conversation, Cognition and Learning: A Cybernetic Theory and Methodology, Elsevier, Amsterdam.

Pask, G. 1976a, Conversation Theory: Applications in Education and Epistemology, Elsevier, Amsterdam and New York.

Pask, G. 1976b, 'Revisions in the foundations of cybernetics and general systems theory as a result of research in education, epistemology and innovation (mostly in man-machine systems)', Proceedings of the 8th International Congress on Cybernetics, Namur, September, vol. 6, no. 11, pp. 83-109.

Payne, T. 1988, 'Editorial: 360-degree assessment and feedback', International Journal of Selection and Assessment, vol. 6, no. 1.

Reilly, R.R., Smither, J.W. & Vasilopoulos, N.L. 1996, 'A longitudinal study of upward feedback', Personnel Psychology, vol. 49, pp. 599-612.

Rogers, C.R. 1970, Encounter Groups, Harper and Row, New York.

Romano, C. 1994, 'Conquering the fear of feedback', Human Resource Focus, vol. 71, no. 3.

Rosti, R.T. Jr & Shipper, F. 1998, 'A study of the impact of training in a management development program based on 360-degree feedback', Journal of Managerial Psychology, vol. 13, pp. 77-89.

Rubin, I. 1967, 'The reduction of prejudice through laboratory training', Journal of Applied Behavioural Science, vol. 3, no. 1.

Saiyadain, M.S. & Juhary, A. 1995, 'Managerial training and development in Malaysia', Journal of the Malaysian Institute of Management, Malaysian Management Review, vol. 5, pp. 23-36.

Schein, E.H. 1987, Process Consultation, vol. 2, Addison-Wesley, MA.

Schein, E.H. 1997, 'The concept of "client" from a process consultation perspective', Journal of Organizational Change Management, vol. 10, no. 3, pp. 202-216.

Schuurman, J.G. & Veermans, K. 2001, 'Conversation and research', Kybernetes, vol. 30, no. 7/8, pp. 881-890.

Shipper, F. & Dillard, J.E. Jr. 2000, 'A study of impending derailment and recovery of middle managers across career stages', Human Resource Management, vol. 39, no. 4, pp. 331-345.

Shipper, F. & John, J. 1992, 'Employees' feedback: its use for management development and the results in a government organization', in Fargher, J.S. (ed.), Proceedings of the Symposium on Productivity and Quality Improvement with a Focus on Government, Industrial Engineering and Management Press, Washington, DC.

Thach, E.C. 2002, 'The impact of executive coaching and 360-degree feedback on leadership effectiveness', Leadership and Organization Development Journal, vol. 23, no. 4, pp. 205-214.

Thach, L. & Heinselman, T. 1999, 'Executive coaching defined', Training and Development Journal, vol. 53, pp. 34-39.

Tornow, W.W. 1993, 'Perceptions or reality: is multi-perspective measurement a means or an end?', Human Resource Management, vol. 32, no. 2-3, pp. 221-230.

Tornow, W.W. & London, M. 1998, Maximizing the Value of 360-Degree Feedback: A Process for Successful Individual and Organizational Development, Jossey-Bass Publisher, San Francisco.

Van Velsor, E., Taylor, S. & Leslie, J.B. 1993, 'An examination of the relationship among self-perception accuracy, self awareness, gender and leaders' effectiveness', Human Resource Management, vol. 32, no. 2/3, Summer/Fall, pp. 249-263.

Vygotsky, L.S. 1978, Mind in Society, Harvard University Press, Cambridge.

Waldman, D.A., Atwater, L.E. & Antonioni, D. 1998, 'Has 360-degree feedback gone amok?', Academy of Management Executive, vol. 12, no. 2, pp. 86-94.

Walker, A.G. & Smither, J.W. 1999, 'A five-year study of upward feedback: what managers do with their results matters', Personnel Psychology, vol. 52, pp. 393-423.

Wan Aziz, W.A. 1994, 'Transnational corporations and human resource development', Personnel Review, vol. 23, no. 5, pp. 50-69.

Warr, P. & Bourne, A. 1999, 'Factors influencing two types of congruence and similarity as related to interpersonal evaluation in manager-subordinate dyads', Academy of Management Journal, vol. 23, pp. 320-330.

Weisbord, M. 1988, 'Towards a new practice theory of OD: notes on sharpshooting and moviemaking', in Research in Organizational Change and Development, eds Pasmore, W. & Woodman, R., JAI Press, Greenwich, CT, vol. 2, pp. 59-96.

Wilson, M.S., Hoppe, M.H. & Sayles, R.S. 1996, Managing Across Cultures: A Learning Framework, Center for Creative Leadership, Greensboro, NC.

Wolfe, D.M. & Kolb, D.A. 1984, 'Career development, personal development and experiential learning', in Organizational Psychology: Readings on Human Behaviour in Organizations, 4th edn, Prentice-Hall, NJ.

Wood, R., Allen, T., Pillenger, T. & Kahn, N. 1999, '360-degree feedback: theory, research and practice', in Human Resource Strategies: An Applied Approach, eds Travaglione, T. & Marshall, V., McGraw-Hill, Sydney.

Zakaria, I. & Rodzhan, O. 1993, 'Human resource development practice in the manufacturing sector in Malaysia: an empirical assessment', paper presented at the Seminar on Human Resource Management, Faculty of Business Management, Universiti Kebangsaan Malaysia.

Zemke, R. & Zemke, S. 1995, 'Adult learning: what do we know for sure?', Training and Development Journal, vol. 32, pp. 31-37.
