
Journal of English for Academic Purposes 7 (2008) 180–190
www.elsevier.com/locate/jeap

Identifying academic language needs through diagnostic assessment

John Read*

Department of Applied Language Studies and Linguistics, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand

Abstract

The increasing linguistic diversity among both international and domestic students in English-medium universities creates new challenges for the institutions in addressing the students' needs in the area of academic literacy. In order to identify students with such needs, a major New Zealand university has implemented the Diagnostic English Language Needs Assessment (DELNA) programme, which is now a requirement for all first-year undergraduate students, regardless of their language background. The results of the assessment are used to guide students to appropriate forms of academic language support where applicable. This article examines the rationale for the assessment programme, which takes account of some specific provisions governing university admission in New Zealand law. Then, drawing on the test validation framework by Read and Chapelle [Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32], the article considers in some detail: 1) the way in which DELNA is presented to staff and students of the university, and 2) the procedures for reporting the results. It also considers the criteria by which the programme should be evaluated.
© 2008 Elsevier Ltd. All rights reserved.

Keywords: Language assessment; English for academic purposes; Diagnosis; University admission; Undergraduate students; Language support

1. Introduction

The internationalisation of education in the major English-speaking countries has long created the need to provide various forms of academic language support for those international students who have been admitted to the institution, but whose proficiency is still not fully adequate to meet the language demands of their degree studies. Language support most often takes the form of English for academic purposes (EAP) courses targeting specific skills such as writing or listening, but it can also include adjunct language classes linked to a particular content course, writing clinics, peer editing programmes, self-access centres, and so on. A typical strategy is to require incoming international students to take an in-house placement test, the results of which are used either to exempt individuals from the EAP programme or to direct them into the appropriate courses to address their needs. Accounts of tests designed broadly for this purpose at various universities can be found in Brown (1993), Fox (2004), Fulcher (1997), and Wall, Clapham, and Alderson (1994).

* Tel.: +64 9 373 7599 x87673; fax: +64 9 308 2360.

E-mail address: [email protected]

1475-1585/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.

doi:10.1016/j.jeap.2008.02.001


At the same time, it is now well recognised that many students who are not on student visas also have academic language needs. This may result from the success of policies to recruit students from indigenous ethnic or linguistic minority groups which have traditionally been underrepresented in tertiary education. Another major category consists of relatively recent migrants or refugees, who have received much if not all of their secondary education in the host country and thus have met the academic requirements for university admission, but who still experience difficulties with academic reading and writing in particular (Harklau, Losey, & Siegal, 1999). The term Generation 1.5 has been coined in the US to refer to the fact that these students are separated from the country of their birth but often not fully integrated – linguistically, educationally or culturally – into their new society. Beyond these two identifiable categories, there is a broader continuum of academic literacy needs within the student body in the contemporary English-medium university, including many students who are monolingual in English.

Although various forms of language support may be available to these domestic students on campus, the issue is how to identify the ones who need such support and to what extent they should be required to take advantage of it. There can be legal or ethical constraints on directing students into language support on the basis of their language background or other demographic characteristics. It may also be counterproductive to make it obligatory for students to participate in a support programme when they have no wish to be set apart from their peers and are reluctant to acknowledge that they have language needs. One way to address the situation is to introduce some form of diagnostic assessment, comparable to the in-house placement tests for international students. In fact, one of the tests cited above (Fulcher, 1997) was designed to be administered at the University of Surrey in the UK to all incoming students, regardless of their immigration status or language background. A similar solution is emerging at the university which is the subject of the present article.

Having regard for these various considerations, it is necessary to give some careful thought to the development of an assessment procedure for this purpose. There are technical issues, such as how to assess native and non-native speakers by means of a common metric and how to reliably identify those with no need of language support within the minimum amount of testing time. However, the focus of this discussion will be on the need to present the assessment to the students and to the university community in a manner that will achieve its desired goals while at the same time avoiding unnecessary compulsion.

2. The context

The particular case to be considered here is a programme called Diagnostic English Language Needs Assessment (DELNA), which has been implemented at the University of Auckland in New Zealand. The programme was introduced to address concerns that developed through the 1990s with the influx of students who are now collectively identified as having English as an additional language (EAL). During that decade New Zealand tertiary institutions vigorously recruited international students, particularly from East Asia. These students were required to demonstrate their proficiency in English as a condition of admission. However, the typical requirement for undergraduates of Band 6.0 in IELTS came to be recognised as a relatively modest level of English proficiency, particularly for students whose cultural background and previous educational experience made it difficult to meet the academic expectations of their lecturers and tutors (Read & Hayes, 2003). In the absence of any moves to raise the minimum English requirement for entry, then, the University of Auckland – like other New Zealand universities and polytechnics – needed to provide various forms of ongoing language support for international students.

The liberalisation of immigration policy in the late 1980s also opened up opportunities for skilled migrants and business investors to migrate to New Zealand with their families. This led to an inflow of new immigrants from Taiwan, China, South Korea, India and Hong Kong, peaking in 1995 but continuing at lower levels to this day. The vast majority of the new immigrants settled in the Auckland metropolitan area and in time these communities produced substantial numbers of students for tertiary institutions in the region, and for the University of Auckland in particular. The students from these communities had quite similar linguistic, educational and cultural profiles to international students; many students in both categories had attended a New Zealand secondary school for one, two or more years before entering the university. However, there was one crucial difference. Under New Zealand law (the Education Act 1989), permanent residents are classified as domestic students for the purpose of university admission and cannot be subjected to any entry requirement that is not also imposed on citizens of the country. This means specifically that new migrants cannot be targeted to take an English proficiency test or enrol in ESL classes as a condition of being admitted into a university.


Another provision in the Education Act creates further challenges. The law allows any domestic student who has reached the age of 20 to apply for special admission to a New Zealand university, regardless of their level of prior educational achievement. Thus, in principle adult migrants as well as citizens have had open entry to tertiary education, although in practice their choices have been constrained by admission requirements for particular degree programmes, and those lacking a New Zealand secondary school qualification are likely to be strongly counselled to initially take on a light, part-time workload.

Students accepted for special admission have diverse language needs. Whereas those from the East Asian migrant communities may resemble international students linguistically and culturally, others are mature students from English-speaking backgrounds who may not lack proficiency in the language as such but rather academic literacy. These students include members of the Pacific Nations communities (particularly from Samoa, Tonga, the Cook Islands, Niue and the Tokelau Islands) who may have native proficiency in general conversational English but whose low level of achievement in their secondary schooling would have excluded them from further educational opportunity, had the special admission provision not been available. Although the Pacific communities are long established in New Zealand, it has only been in more recent years that the universities have made systematic efforts to recruit Pasifika students, with a particular emphasis on programmes in Education, Health Sciences and Theology.

Thus, through the 1990s the University of Auckland faced various challenges in responding to the growing linguistic diversity of its student body, not least because of the constraints imposed by the Education Act. Proposals from two leading professors (Ellis, 1998; Ellis & Hattie, 1999) that the university should introduce an entrance examination in English for students who could not produce evidence of adequate competence in the language received support from the Faculty of Arts and were accepted by the central administration of the university. The development and piloting of the DELNA instruments took place in 2000–01 (Elder & Erlam, 2001) and the programme became operational in 2002.

3. DELNA: its philosophy and design

Before looking at how DELNA operates in practice, it is useful to outline several basic principles underlying its development. To some extent, the principles reflect the constraints imposed on the university by the Education Act, but they can also be seen as a positive commitment by the institution to enhancing the educational opportunities of the whole student body.

• One principle was that the test results would not play any role in admissions decisions; students were to be assessed only after they had been accepted into the university for their chosen degree programme. In this sense, then, the administration of DELNA represents a "low-stakes" situation, although from another point of view the stakes are higher for students who are at serious risk of failing courses or not achieving their academic potential as a result of their limited proficiency in the language. The university, too, has a stake in preserving academic standards and maintaining good completion rates, particularly on equity grounds for Māori, Pasifika and other students from historically underrepresented groups on the campus.

• As a means of emphasising the point that DELNA was not IELTS under another guise, it was deliberately called an "assessment" rather than a test, and the individual components are known as measures.

• There was to be an important element of personal choice for students in their participation in DELNA and their subsequent uptake of opportunities for language support and enhancement. In practice, particular departments and degree programmes have required their students either to take DELNA and/or to participate in some form of language support, but the principle remains that students should be strongly encouraged to take advantage of this initiative rather than being compelled to do so against their will.

• DELNA represented a recognition by the university that it shares with students a joint responsibility to address academic language needs. This contrasts with the situation of international students applying for admission, where the onus is on the students to demonstrate, by paying a substantial fee for an international English test, that they have adequate competence in the language. For students and for departments, DELNA is free of charge and several of the language support options are available to students at no additional cost to them.

In operation, DELNA involves two phases of assessment, Screening and Diagnosis, as shown in Table 1. The Screening measures were designed to provide a quick and efficient means of separating out native speakers and other proficient users of the language who were unlikely to encounter difficulties with academic English, and exempting them from further assessment.

Table 1
The structure of DELNA

Screening (30 min)
• Vocabulary
• Speed Reading

Diagnosis (2 hours)
• Listening to a mini-lecture
• Reading academic-type texts
• Writing an interpretation of a graph

Both of the Screening measures are computer-based. One is a vocabulary test, assessing knowledge of a sample of academic words by means of a simple word–definition matching format (Beglar & Hunt, 1999). The other, variously known in the literature as a speed reading (Davies, 1975, 1990) or cloze-elide (Manning, 1987) format, is a kind of reverse cloze procedure. In each line of an academic-style text an extraneous word is inserted and the test takers must identify each inserted word under a speeded condition which means that only the most proficient students complete all 73 items within the time available. In a validation study (Elder & Erlam, 2001), the reliability estimates were 0.87 for Vocabulary and 0.88 for Speed Reading. The two tests correlated with a composite listening, reading and writing score from the Diagnosis (see below) individually at 0.74 (vocabulary) and 0.77 (speed reading), and collectively at 0.82.
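To clarify the relationship between these figures, the following minimal sketch (in Python, with simulated scores rather than the actual DELNA data) shows the difference between the individual correlations of the two Screening measures with the Diagnosis composite and their "collective" correlation, i.e. the multiple correlation obtained when both measures predict the composite together. All numbers and distributions here are invented for illustration.

    import numpy as np

    # Simulated stand-ins for the validation sample; the reported values
    # (0.74, 0.77 and 0.82) came from real candidates.
    rng = np.random.default_rng(0)
    n = 353
    composite = rng.normal(50, 10, n)              # Diagnosis composite
    vocab = 0.8 * composite + rng.normal(0, 7, n)  # Screening: vocabulary
    speed = 0.8 * composite + rng.normal(0, 6, n)  # Screening: speed reading

    def pearson(x, y):
        return float(np.corrcoef(x, y)[0, 1])

    # Individual correlations of each Screening measure with the composite.
    r_vocab = pearson(vocab, composite)
    r_speed = pearson(speed, composite)

    # The collective figure is the multiple correlation R: the correlation
    # between the composite and its best linear prediction from both
    # Screening scores together.
    X = np.column_stack([np.ones(n), vocab, speed])
    beta, *_ = np.linalg.lstsq(X, composite, rcond=None)
    r_multiple = pearson(X @ beta, composite)

    print(r_vocab, r_speed, r_multiple)  # R exceeds either individual r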

For students who score below a threshold level on the Screening, the three measures in the Diagnosis phase provide a more extensive, task-based assessment of their academic language skills. Unlike the computerised Screening measures, they are all paper-based instruments. In the Listening test (30 min), the students hear an audio-recorded mini-lecture on a non-specialist topic and respond to short answer, multiple-choice and information transfer items. The Reading test (45 min) is based on one or two reading texts on topics of general interest totalling about 1200 words. Various item types are used, including cloze, information transfer, matching, multiple-choice, true-false and short answer. For the Writing task (30 min), the candidates write 200 words of commentary on a social trend, as presented to them in the form of a simple table or graph. Their writing is rated on three analytic scales: fluency, content, and grammar and vocabulary.

The Diagnosis phase takes 2 hours to administer, as compared to 30 min for the Screening, and is obviously more expensive in other respects, in that it requires manual scoring and, in the case of the writing task, double rating on the three scales by trained examiners (for research on the training procedures, see Elder, Barkhuizen, Knoch, & von Randow, 2007; Knoch, Read, & von Randow, 2007). The Elder and Erlam (2001) validation study obtained reliability estimates of 0.82 for Listening and 0.83 for Reading. In the case of Writing, the two recent studies just cited (Elder et al., 2007; Knoch et al., 2007) produced estimates of 0.95–0.97 for the reliability of candidate separation, using the FACETS program.

Further details of the two phases of the DELNA assessment, including sample items and tasks, can be found in the DELNA Handbook, which is downloadable from the programme website: www.delna.auckland.ac.nz.

Set out this way, DELNA looks very much like a conventional language test. Certainly the Diagnosis tasks are similar to those found in IELTS and other EAP proficiency tests. However, the intended purpose of the instrument is different and this means that it needs to be presented in a distinctive manner, in keeping with the principles outlined at the beginning of this section.

4. An analysis of test purpose

A useful framework for analysing how test purpose should influence test design and delivery is that developed by Read and Chapelle (2001). Although the framework is exemplified in terms of vocabulary testing, it has general applicability to various forms of language assessment. As shown in Fig. 1, the framework has numerous components and it is beyond the scope of the present article to consider them all in detail.

At the top level of the framework, test purpose is decomposed into three components – inferences, uses and intended impacts – which in turn lead to validity considerations and mediating factors. It is the second and third mediating factors which are of particular concern here, but it is also necessary to address the first component briefly.

[Fig. 1. A framework for incorporating a systematic analysis of test purpose into test validation (adapted from Read & Chapelle, 2001, p. 10). The diagram links the three components of TEST PURPOSE (inferences, uses, intended impacts) to the corresponding VALIDITY CONSIDERATIONS (construct validity, relevance and utility, actual consequences) and MEDIATING FACTORS (construct definition, performance summary and reporting, test presentation), which feed into TEST DESIGN (decisions about the structure and formats of the test) and VALIDATION (arguments based on theory, evidence and consequences).]

4.1. Construct definition

The inferences to be made on the basis of performance in DELNA can be defined in terms of academic literacy in English: the ability of incoming undergraduate students to cope with the language demands of their degree programme. Although ultimately the assessment is targeted at students for whom English is an additional language (EAL), the construct is broader than academic literacy in English as an additional language because many of those to be assessed come from English-speaking backgrounds, and the whole function of the initial Screening phase of DELNA is to separate out students for whom adequate academic literacy is unlikely to be at issue. Designing a test for students with English as both a first and an additional language creates a special challenge because it cannot be assumed that items and tasks will perform the same way for the two groups. Elder, McNamara, and Congdon (2003) used Rasch analysis to investigate this issue and found a somewhat complex pattern, whereby each of the DELNA tasks except the vocabulary measure exhibited some significant bias in favour of either native or non-native speakers. However, since the bias was in both directions and relatively small in magnitude overall, the researchers considered that it was within tolerable limits for a low-stakes assessment of this kind.

Read and Chapelle (2001) distinguish three levels of inference: whole test, sub-test and item. For DELNA, item-level inferences are not appropriate. In the Screening phase, the construct is defined specifically in terms of efficient access to academic language knowledge and it is sufficient to make inferences at the level of the whole test. Thus, the vocabulary and speed reading scores are combined into a single result to determine whether the student should proceed to the Diagnosis phase.

Elder and von Randow (in press) have investigated the validity of inferences based on the Screening score, examining its suitability as a basis for determining whether students needed to proceed to the Diagnosis. Their study involved an analysis of the performance of 353 students who took both the Screening and Diagnosis measures. A minimum criterion score was set on the basis of performance in the listening, reading and writing tests of the Diagnosis phase. Then, by means of regression analysis, an optimum cut score (combining the vocabulary and speed reading scores) was established for the Screening phase. This cut score successfully identified 93% of the students whose performance fell below the criterion level in the Diagnosis phase. However, it also meant that relatively few students would be exempted from taking the costly Diagnosis measures and so, with financial considerations in mind, a lower cut score was set. The lower score identified only 81% of the students who were under the criterion level, but on the other hand it resulted in less than 1% of "false negatives": students below the cut score who nevertheless had achieved the criterion level in the Diagnosis. Therefore, for operational purposes it is only students whose Screening performance falls under the lower cut score who are required to proceed to the Diagnosis. Those who are between the two cut scores receive a general recommendation to seek academic language support (see 4.3 below).
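The trade-off between the two cut scores can be made concrete with a small sketch. The Python code below tabulates, for a set of candidate Screening cut scores, the proportion of below-criterion students identified, the proportion of "false negatives" in the sense just defined, and the proportion of students exempted from the Diagnosis. The data, cut values and variable names are all illustrative; the actual study set its cut scores by regression analysis on real candidate performance.

    import numpy as np

    # Simulated stand-ins for the 353 students who took both phases.
    rng = np.random.default_rng(1)
    n = 353
    screening = rng.normal(60, 12, n)  # combined Screening score
    below_criterion = (screening + rng.normal(0, 10, n)) < 55  # Diagnosis outcome

    def evaluate_cut(cut):
        flagged = screening < cut      # would be sent on to the Diagnosis
        # Share of genuinely below-criterion students that the cut identifies.
        identified = (flagged & below_criterion).sum() / below_criterion.sum()
        # "False negatives" as defined above: flagged students who had
        # nevertheless achieved the criterion level in the Diagnosis.
        false_neg = (flagged & ~below_criterion).sum() / flagged.sum()
        exempted = (~flagged).mean()   # spared the two-hour Diagnosis
        return identified, false_neg, exempted

    # A higher cut catches more at-risk students but exempts fewer of them.
    for cut in (50, 55, 60, 65):
        print(cut, evaluate_cut(cut))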


For those who complete the Diagnosis, sub-test inferences are desirable so that students can be advised on whether they should seek language support in each of the three skill areas of listening, reading and writing. This means that each sub-test needs to provide a reliable measure of the skill involved. The reliability estimates quoted in Section 3 are very satisfactory from this perspective.

4.2. Test presentation

Although test presentation comes third in the Read and Chapelle framework, it is more appropriate to discuss it next in this account of DELNA. Presentation is a mediating factor that comes from a consideration of the impact of a test (Messick, 1996). Read and Chapelle (2001) point out that most research on impact in language testing has focused on the washback effects of existing tests and examinations (see, e.g. Alderson, 1996; Cheng & Watanabe, 2004). However, Read and Chapelle argue that if the consequences of implementing a test are to be seen as an integral element in evaluating its quality, a statement of the intended impact of the instrument needs to be included in the specification of test purpose early in the development of a new test. Thus, the actual consequences of putting the test into operation can be evaluated by reference to the prior statement of intended impact. This means in turn that the test developers should consider how the intended impact can be achieved through the way that the test is presented.

Test presentation is a concept that has not received much attention in the literature and it deserves some consideration here. It consists of a series of steps, taken as part of the process of developing and implementing the test, to influence its impact in a positive direction. Since there are numerous stakeholders in assessment, particularly when the stakes are high, "[t]est developers choose to portray their tests in ways that will appeal to particular audiences" (Read & Chapelle, 2001, p. 18). These can include educational administrators, teachers, parents, users of the test results, and of course the test takers, who need to be familiar with the test formats and willing to accept that the test is a fair assessment of their language abilities.

Seen in this light, test presentation has a strong connection to that much maligned concept in testing, face validity. Authors of introductory texts on language testing, starting with Lado (1961), have generally dismissed this concept as not being a credible form of evidence to support a validity argument, since it is based on "simple inspection" (Lado, 1961, p. 321) or the judgment of "an untrained observer" (Davies et al., 1999, p. 59).

However, this rejection of the concept has generally been accompanied by an acknowledgement that, although the term may be a misnomer, it represents a matter of genuine concern in testing. That is to say, test developers are confronted with a real problem if – regardless of the technical merits of the test – one or more of the stakeholder groups are not convinced that the content or the formats are suitable for the assessment purpose. Thus, Alderson, Clapham, and Wall (1995) give face validity a positive gloss as meaning "acceptable to users" (p. 173), echoing Carroll (1980), who had earlier proposed acceptability as one of the four desirable characteristics (along with relevance, comparability and economy) of a communicative test. In addition, Bachman and Palmer (1996, p. 42) and Davies et al. (1999, p. 59) refer to the even more positive notion of "test appeal."

Thus, test presentation can be seen as a proactive approach to promoting the acceptability of the test to the various stakeholders, and above all to the test takers, in order to achieve the intended impact. The test developer needs to ensure that the purpose and nature of the assessment is clearly understood, that it meets stakeholder expectations as much as possible, and that the test takers in particular engage with the test tasks in a manner that will help produce a valid measure of their language ability. Major proficiency tests generate a strong external motivation for students because of the stakes involved, whereas with a programme like DELNA it is more important to create a positive internal motivation based on a recognition of the benefits that the results of the assessment may bring for the student.

4.2.1. Presentation of DELNA to students

The general principles underlying the presentation of DELNA are those that were introduced in Section 3 above: the fact that the results are never used for admissions purposes; the term "assessment" is preferred to "test"; there is a significant element of personal choice for students; and the university shares with its students the responsibility for addressing their academic language needs. The slogan "Increases your chance of success," which has featured in DELNA publicity, is also intended to express the positive intent of the programme.

There have been two main pathways to DELNA for students entering the university each semester. The first was literally by invitation. In the admissions office the records of incoming domestic students (citizens and permanent residents) were reviewed to identify those who had not provided evidence of their competence in English for tertiary-level study. Students coming directly from secondary school in the last few years hold the National Certificate of Educational Achievement (NCEA), which includes a "literacy requirement" to demonstrate proficiency in academic reading and writing in English. However, mature students and recently arrived immigrants who enter the university under special admission often lack a recognised secondary school qualification or any other evidence of academic literacy.

Students thus identified received a letter inviting them to take the DELNA Diagnosis. For statutory reasons, as previously explained, the university could not make it mandatory but, that consideration aside, the wording of the letter had a positive tone which emphasised the intended role of DELNA in enhancing the student's study experience. Initially, the uptake of these invitations was relatively low, but by 2005 it had reached about 40% (116/295).

The other main pathway, which has now essentially superseded the first one, results from decisions by departments or faculties to require all students in designated first-year courses to take DELNA. Initially, this applied to programmes which attracted a high proportion of EAL students, such as the Bachelor degrees in Business and Information Management, and in Film, TV and Media Studies. However, from 2007 it has officially become a requirement for almost all first-year students, regardless of their language background, to take the DELNA Screening. This not only observes the legal niceties but also highlights the important role of the Screening phase in efficiently separating out academically literate students for exemption from further assessment.

In 2007 a total of 5427 students were assessed through the DELNA programme. These students are estimated to represent around 70% of all the first-year students at the university that year, although the percentage is higher if groups such as transferring and exchange students are excluded. Of those who completed the Screening, 1208 were recommended to return for the Diagnosis phase; however, only 504 (42%) did so. This shortfall is discussed in Section 5 below.

In terms of presentation, as DELNA assessment has become the norm for first-year students, it is increasingly accepted as just another part of the experience of entering the university. Students are informed of the assessment requirement in their department's handbook and can obtain further information from the programme website, including the downloadable DELNA Handbook, with its sample versions of the assessment tasks and advice on completing them. In addition, it is easy for students to book a DELNA session online at their preferred day and time. One other appealing feature of the Screening measures in particular is that they are computer-administered, which adds a novelty value for students who may never have taken such a language test before.

4.2.2. Presentation of DELNA to staff

Much of the initial impetus for the development of DELNA came from the concerns of teaching staff across the university in the 1990s about the academic literacy needs of students in their classes. This created a positive environment for the acceptance of a programme like DELNA to address those needs, but of course that is not the same as an understanding of how this particular programme works.

The establishment of DELNA saw the formation of a Reference Group chaired by the Deputy Vice-Chancellor (Academic) and composed of representatives from all the university faculties as well as from the various language support programmes. The group meets regularly to discuss policy issues, monitor the implementation of DELNA and provide a channel of communication from the programme staff to the faculties. The departments which were the early participants in DELNA are well represented on the group but, as the assessments have expanded, it has been necessary to open new avenues of communication to academic and administrative staff across the university to ensure that: a) an informed decision is made when departments or faculties decide to require their first-year students to take the assessment; b) the relevant staff correctly interpret the DELNA results when they receive them; and c) effective follow-up action is taken to give students access to language support if they need it.

In 2005 an information guide for staff was produced in pamphlet form and it has been followed by an FAQ document. However, experience has shown that the printed material must be backed up by face-to-face meetings with key staff members responsible for DELNA administration in particular faculties or departments.

4.3. Performance summary and reporting

This brings us back to the second mediating factor of the Read and Chapelle (2001) framework, performance summary and reporting, which relates to the intended use of the test. The assessment results are used to identify students who may be at risk because of their low academic literacy and then to advise them on suitable forms of language support and enhancement. Where participation in DELNA is a course requirement, the results also go to the academic programme or department for follow-up action as appropriate. Thus, the two main recipient groups for the results are the students and their departments.

Given that the whole purpose of the programme is to address academic literacy needs, the reporting of student performance includes not only the assessment result but also a recommendation for language enhancement where appropriate. At this point, then, it is useful to list the main language support options available on the campus.

• Credit courses in academic language skills: ESOL100–102 for EAL students, and ENGWRIT101, a writing course for students from English-speaking backgrounds.

• Workshops, short non-credit courses, individual consultations and self-access study facilities offered by the Student Learning Centre (SLC, available to all students) and the English Language Self-Access Centre (ELSAC, specialising in services for EAL students).

• Discipline-specific language tutorials linked to particular courses (following a kind of adjunct model) which have for some time attracted a high proportion of EAL students. Currently these courses are in Commerce, Health Sciences, Theology, and Film, TV and Media Studies.

The Screening phase of DELNA is primarily intended to exempt highly proficient students from further assessment. Thus, the scores from the vocabulary and speed reading measures are combined to divide the test-takers into three categories with deliberately non-technical labels:

Good – no language enrichment required.
Satisfactory – some independent activity at SLC or ELSAC recommended.
Recommended for Diagnosis – should take the DELNA Diagnosis.

The Screening result is sent individually to each student by email and, when DELNA is a departmental requirement, an Excel file of results for each course is forwarded to a designated staff member. Until 2006, the Screening reports included the two actual test scores for vocabulary and speed reading. However, the fact that the cut scores for the three categories varied according to which form of the test each student took caused some confusion and, in addition, there were indications that the scores were being used in at least one academic programme as quasi-proficiency measures to assign students to tutorial groups according to their language ability. This led to the current policy of reporting just the student's category.
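The categorisation logic might be sketched as follows. The form labels and cut scores below are invented for illustration, since the actual values vary by test form and are not published; under the current policy only the resulting category, not the raw total, would appear in any report.

    # Hypothetical (lower_cut, upper_cut) pairs for the combined Screening
    # total on each test form; the real values differ by form.
    FORM_CUTS = {"Form A": (55, 70), "Form B": (52, 68)}

    def screening_category(form: str, vocab_score: int, speed_score: int) -> str:
        total = vocab_score + speed_score
        lower_cut, upper_cut = FORM_CUTS[form]
        if total >= upper_cut:
            return "Good"
        if total >= lower_cut:
            return "Satisfactory"
        return "Recommended for Diagnosis"

    # Only the category is reported, which avoids the cross-form confusion
    # and the quasi-proficiency use of raw scores described above.
    print(screening_category("Form A", vocab_score=34, speed_score=30))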

In the case of the Diagnosis phase, a scale modelled on the IELTS band scores (from a top level of 9 down to 4) has been used for rating performance and reporting the results to students. However, for reporting to staff a simpler A-B-C-D system is used for each of the three skills (listening, reading and writing). The A and B grades correspond to the Good and Satisfactory categories respectively in the Screening, and students whose three-grade average is at one of these levels receive an email report. On the other hand, students averaging in the C and D range, who are considered to be at significant risk, are sent an email request to collect their results in person from the DELNA language adviser. As with the Screening, the results are also sent to the designated staff member when the Diagnosis is required by the department.
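As a rough sketch of this reporting pathway: the correspondence assumed below between the 9-to-4 band scale and the A-B-C-D grades is illustrative only, since the article does not specify the mapping.

    def band_to_grade(band: float) -> str:
        # Assumed mapping from bands (9 high to 4 low) onto letter grades.
        if band >= 8:
            return "A"
        if band >= 7:
            return "B"
        if band >= 6:
            return "C"
        return "D"

    def diagnosis_report(listening: float, reading: float, writing: float):
        bands = {"listening": listening, "reading": reading, "writing": writing}
        grades = {skill: band_to_grade(b) for skill, b in bands.items()}
        mean_band = sum(bands.values()) / 3
        if band_to_grade(mean_band) in ("A", "B"):
            channel = "emailed report"   # Good / Satisfactory average
        else:                            # C/D average: considered at risk
            channel = "collect results in person from the language adviser"
        return grades, channel

    print(diagnosis_report(7.0, 6.5, 5.5))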

The appointment of the language adviser, beginning in 2005, resulted from a recognition that students scoring low in the Diagnosis were generally not accessing the recommended language support. A small-scale study by Bright and von Randow (2004), involving follow-up interviews with eighteen DELNA candidates at the end of the academic year, found that only four of them had taken the specific advice given in the results notification. Although most of the participants had in fact passed their courses, they acknowledged that they had really struggled to meet the language demands of their studies. One strong message from the interviews was that the students would have greatly appreciated the opportunity to discuss their DELNA results and their language support options face-to-face, rather than just receiving the impersonal emailed report. Thus, the language adviser now meets with each student individually, goes over their profile of results, and directs them to the most appropriate form(s) of support. She often follows up the initial meeting with ongoing monitoring of their progress through the semester or even longer.

Thus, performance summary and reporting in this case involves not simply the form of the report but also, for the less proficient students, the medium by which the result is communicated to them.


5. Evaluating the programme

The extended discussion in Section 4, drawing on the Read and Chapelle (2001) framework, has shown how the purpose of the assessment has been worked out through the design and delivery mechanisms of DELNA. At the time of writing, the programme is still being rolled out. It has yet to achieve full participation by the incoming first-year student population in the Screening phase and, furthermore, in 2006 only 30% of students (444 out of 1340) who were recommended for Diagnosis on the basis of their Screening results actually went on to the second phase of the assessment. Higher levels of participation will depend on the extent to which faculties and departments enforce the requirement that their students should take one or both phases of DELNA. Some academic programmes have introduced specific incentives for students to take the assessment, for instance by withholding the first essay grade or subtracting a few percent from the final course grade of students who do not comply.

However, the point of the exercise is not just to assess the students but rather to address their academic language needs where appropriate. As noted in the previous section, there is now provision within the DELNA programme itself, through the work of the language adviser, to provide intensive counselling for those students whose results in the Diagnosis phase show that they have the most serious language needs. Some academic units have introduced their own follow-up measures for such students. For example, the Bachelor's degree in Business and Information Management has a well-established Language and Communication Support Programme (LanCom), which integrates various forms of support into the delivery of its courses. In the Faculty of Engineering, students who score below a minimum level in the DELNA Diagnosis must undertake a quasi-course with its own course code, involving attendance at 15 hours of workshops at the Student Learning Centre (SLC) and satisfactory completion of another 15 hours of directed study at the English Language Self-Access Centre (ELSAC).

With the expansion of DELNA assessment into the Faculties of Arts and Science, it is more of a challenge to respond to the language needs of students enrolled for a degree which includes courses offered by several different departments. In the first instance, the Screening results may simply provide course conveners with a broad profile of the language needs of their students, who may be several hundred in number. Many departments lack the resources to offer specialised language support to their students. One realistic option for them is the introduction of systematic procedures for referring students in need to SLC or ELSAC; another option may be to review their teaching and assessment practices to avoid creating unnecessary difficulties for EAL students in their courses.

Returning briefly to the Read and Chapelle (2001) framework, one element in the validation of a test or assessment procedure is an investigation of its actual consequences as compared to its intended impact. At the institutional level, the intended impact can be defined in terms of levels of academic literacy in the student population. The implementation of DELNA is supposed to lead to a meaningful reduction over time in the number of students whose academic performance is hampered by language-related difficulties. The question is what kind of data counts as evidence that the goal is being achieved for the undergraduate student body as a whole.

Davies and Elder (2005) took up this point in their review of current theory and practice in language test validation, using DELNA as a case study. They formulated a series of eight hypotheses that can be investigated to build an argument for the validity of DELNA. Most of the hypotheses relate either to the technical qualities of the DELNA tests as measures of academic literacy or to the utility of the scores to the users. However, the final hypothesis takes up the issue of the wider impact of the programme:

H.8 The student population will benefit from the information that the test provides. (2005, p. 805).

Davies and Elder highlight a number of challenges in, first, defining the nature of the benefit and then gathering evidence in support of the hypothesis. One way to address the hypothesis would be to define the benefit as an increase in academic literacy, as measured by a further assessment of the students' language proficiency after, say, a semester or two of study. However, DELNA is set up as a one-time assessment for each student and the system blocks them from taking it more than once. In addition, there are currently no plans to introduce an exit test of English proficiency for graduating students.

This means that we need to look for benefits in other ways. One kind of evidence relates to student uptake of the DELNA advice by accessing the various language support options available to them. If they enrol in an ESOL credit course or attend a tutorial linked to one of their subject courses, their progress and end-of-course achievement will be assessed by their tutors. On the other hand, it is more of a challenge to monitor the benefit gained by students who participate in the support programmes at SLC and ELSAC. Students are required to register when they first access these programmes and records are kept of their attendance at workshops and individual consultations, but that is not the same as assessing the benefit of these language support opportunities in improving the students' academic language proficiency.

A broader approach to the situation is to look at grade point averages and retention rates for whole undergraduate cohorts, particularly in courses with large EAL student enrolments. As Davies and Elder (2005) point out, though, it is difficult to separate out language proficiency from academic ability, motivation, sociocultural adjustment and the range of other factors that influence student achievement in their university studies, particularly if underachievement is represented not just by dropout or failure rates but also by lower grades than the student might otherwise have achieved. The issues involved are reminiscent of those which have complicated research on the predictive validity of major proficiency tests like TOEFL and IELTS (see, e.g., Hill, Storch, & Lynch, 1999; Light, Xu, & Mossop, 1987).

Thus, global university-wide measures of impact may prove to be less useful than more focused investigations of particular groups of students. One such study, being conducted by the DELNA Programme in conjunction with the Department of Film, TV and Media Studies, is tracking a cohort of students through their three years of study towards a BA major in FTVMS. The data include annual interviews with the students as well as the quantitative measures provided by the initial DELNA results and their course grades. Through this kind of targeted research, it will become possible to develop a validity argument that combines rich qualitative evidence with more objective measures of students' language proficiency and academic achievement.

6. Conclusion

The DELNA assessment programme has a number of features that differentiate it from other tests of English for academic purposes. First, it does not function as a gatekeeping device for university admission, and students cannot be excluded from the institution on the basis of their results in either phase of the assessment. The fact that some students find this hard to believe helps to account for the relatively low participation rate in the Diagnosis phase of DELNA among those who are recommended to take it. Secondly, it is not simply a placement procedure to direct students into one or more courses within a required EAP programme according to their level and areas of need. Rather, there is a range of language support options in which students are recommended to participate as appropriate. A related feature is the distinctive philosophy behind the programme, which holds that students should retain a degree of personal choice as to whether they take advantage of the opportunities for language and study support which are available to them. Although it partly reflects the constraints imposed by national education legislation, this approach is also based on the assumption that academic language support will be more effective if students recognise for themselves the extent of their language needs and make a commitment to attend to them.

One other important characteristic of DELNA is that it is centrally funded, with a direct management line to the office of the Deputy Vice-Chancellor (Academic). Although its offices are located in the Department of Applied Language Studies and Linguistics, the programme has always been conceived as a university-wide initiative. This helps to avoid any perception that DELNA is just serving the interests of a particular department or faculty. It is an issue that has emerged in discussions with staff from other New Zealand universities about the possibility of introducing a DELNA programme on their own campuses. Initial enquiries have typically come from student learning advisers or ESOL tutors who have thought in terms of purchasing a set of diagnostic tests for their own institution. However, a briefing on the full scope of DELNA and its associated language support provisions reveals how much more is involved, with a firm commitment by senior management being a crucial element in the successful operation of the programme at Auckland.

The DELNA programme is moving to a consolidation phase after the considerable expansion in the coverage of incoming undergraduate students over the past couple of years. There is a consequent need to ensure that effective use is made of the DELNA results and that an increasing proportion of the targeted students participate in the appropriate forms of language support and enhancement. The position of DELNA as a centrally funded programme is secure for the foreseeable future, although it remains to be seen to what extent the university will be able to commit sufficient resources to meet the range of language needs that the assessment results are revealing. Other related issues may yet emerge, such as the need to set language proficiency standards for students graduating from Bachelors programmes or concerns about the academic literacy of postgraduate students. For now, though, it is widely accepted within the institution that DELNA is a very worthwhile means of addressing the language needs of incoming undergraduates.


References

Alderson, J. C. (Ed.). (1996). Washback in language testing [Special issue]. Language Testing, 13(3).
Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing, 16, 131–162.
Bright, C., & von Randow, J. (2004, September). Tracking language test consequences: The student perspective. Paper presented at the Ninth National Conference on Community Languages and English for Speakers of Other Languages (CLESOL), Christchurch, New Zealand.
Brown, J. D. (1993). A comprehensive criterion-referenced language testing project. In D. Douglas & C. Chapelle (Eds.), A new decade of language testing research (pp. 163–184). Washington, DC: TESOL.
Carroll, B. J. (1980). Testing communicative performance. Oxford: Pergamon.
Cheng, L., & Watanabe, Y. (2004). Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum Associates.
Davies, A. (1975). Two tests of speed reading. In R. L. Jones & B. Spolsky (Eds.), Testing language proficiency (pp. 119–130). Arlington, VA: Center for Applied Linguistics.
Davies, A. (1990). Principles of language testing. Oxford: Blackwell.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). A dictionary of language testing. Cambridge: Cambridge University Press.
Davies, A., & Elder, C. (2005). Validity and validation in language testing. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 795–813). Mahwah, NJ: Lawrence Erlbaum.
Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24, 37–64.
Elder, C., & Erlam, R. (2001). Development and validation of the Diagnostic English Language Needs Assessment (DELNA): Final report. Auckland: Department of Applied Language Studies and Linguistics, University of Auckland.
Elder, C., McNamara, T., & Congdon, P. (2003). Rasch techniques for detecting bias in performance assessments: An example comparing the performance of native and non-native speakers on a test of academic English. Journal of Applied Measurement, 4, 181–197.
Elder, C., & von Randow, J. (in press). Exploring the utility of a web-based English language screening tool. Language Assessment Quarterly.
Ellis, R. (1998). Proposal for a language proficiency entrance examination. Unpublished manuscript, University of Auckland, New Zealand.
Ellis, R., & Hattie, J. (1999). English language proficiency at the University of Auckland: A proposal. Unpublished manuscript, University of Auckland, New Zealand.
Fox, J. (2004). Test decisions over time: Tracking validity. Language Testing, 21, 437–465.
Fulcher, G. (1997). An English language placement test: Issues in reliability and validity. Language Testing, 14, 113–139.
Harklau, L., Losey, K. M., & Siegal, M. (Eds.). (1999). Generation 1.5 meets college composition: Issues in the teaching of writing to U.S.-educated learners of ESL. Mahwah, NJ: Lawrence Erlbaum.
Hill, K., Storch, N., & Lynch, B. (1999). A comparison of IELTS and TOEFL as predictors of academic success. In R. Tulloh (Ed.), IELTS research reports (Vol. 2, pp. 52–63). Canberra: IELTS Australia.
Knoch, U., Read, J., & von Randow, J. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing Writing, 12, 26–43.
Lado, R. (1961). Language testing. London: Longman.
Light, R. L., Xu, M., & Mossop, J. (1987). English proficiency and academic performance of international students. TESOL Quarterly, 21, 251–261.
Manning, W. H. (1987). Development of cloze-elide tests of English as a second language (TOEFL Research Report No. 23). Princeton, NJ: Educational Testing Service.
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 241–256.
Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32.
Read, J., & Hayes, B. (2003). The impact of IELTS on preparation for academic study in New Zealand. In R. Tulloh (Ed.), IELTS research reports 2003 (Vol. 4, pp. 153–205). Canberra: IELTS Australia.
Wall, D., Clapham, C., & Alderson, J. C. (1994). Evaluating a placement test. Language Testing, 11, 321–344.

John Read is Head of the Department of Applied Language Studies and Linguistics at the University of Auckland. His primary research interests are in vocabulary assessment and testing English for academic and professional purposes. He is the author of Assessing Vocabulary (Cambridge, 2000) and has been co-editor of Language Testing.