THE VALIDITY ANALYSIS OF READING TEST ITEMS …eprints.ums.ac.id/68416/10/PUBLICATION ARTICLE.pdfii STATEMENT OF AUTHORSHIP Hereby the writer declares that the thesis entitled “The


THE VALIDITY ANALYSIS OF READING TEST ITEMS ON
NATIONAL STANDARD SCHOOL FINAL EXAMINATION
FOR 12th GRADE OF MAN 1 SEMARANG

Submitted to the Department of Language Studies,

Graduate School of Universitas Muhammadiyah Surakarta

in partial fulfilment of the requirements for

the degree of Master of Education

By :

MUHAMMAD NURYANTO

S200160074

GRADUATE PROGRAM

MAGISTER OF LANGUAGE STUDIES

UNIVERSITAS MUHAMMADIYAH SURAKARTA

2018


SUPERVISOR’S APPROVAL FORM

THE VALIDITY ANALYSIS OF READING TEST ITEMS ON NATIONAL
STANDARD SCHOOL FINAL EXAMINATION FOR 12TH GRADE OF
MAN 1 SEMARANG

MUHAMMAD NURYANTO

S200160074

This publication article has been approved by the advisors to be examined by the board of examiners.

First Advisor

Muamaroh, Ph.D.

Second Advisor

Mauly Halwat Hikmat, Ph.D.


STATEMENT OF AUTHORSHIP

The writer hereby declares that the thesis entitled “The Validity Analysis of Reading Test Items on National Standard School Final Examination for 12th Grade of MAN 1 Semarang” is the writer's own work and does not contain materials written or published by other people, except for information taken from the references. The thesis satisfies the rules and regulations of Universitas Muhammadiyah Surakarta with respect to plagiarism.

The writer is prepared to be held accountable for this thesis if it is proved in the future to contain others' ideas or to imitate another person's thesis. This declaration is made by the writer, and he hopes that it can be understood.

Surakarta, 16th of October, 2018

The Writer

Muhammad Nuryanto


THE VALIDITY ANALYSIS OF READING TEST ITEMS ON
NATIONAL STANDARD SCHOOL FINAL EXAMINATION
FOR 12th GRADE OF MAN 1 SEMARANG

Abstrak (translated from the Indonesian)

The objectives of this study were (1) to examine whether the NSSFE test items match the materials, (2) to examine whether the NSSFE test items match the expected learning objectives, (3) to examine whether the NSSFE test items meet the teachers' perception, and (4) to examine whether the NSSFE test items meet the students' perception. This is a descriptive qualitative study supplemented with statistical analysis. It was an investigation of a single case or a collective case intended to capture the complexity of the object of the study. The object of the study was the reading items of the National Standard School Final Examination (Ujian Sekolah Berstandar Nasional, USBN) for the 12th grade of MAN 1 Semarang. The subjects were three English teachers and 40 natural science (IPA) students from the 12th grade of MAN 1 Semarang. The data sources were documents and informants. The data were collected through document analysis, interviews, and questionnaires. Data triangulation was used to obtain accountable information from a variety of sources. The findings showed that the test fulfilled the criteria for content validity: (1) the USBN test items match the materials, (2) the USBN test items match the learning objectives, (3) the USBN test items meet the teachers' perception, and (4) the USBN test items meet the students' perception.

Keywords: National Examination, English Test, Validity, Reliability.

Abstract

The objectives of this study were (1) to examine whether the test items of the NSSFE match the materials, (2) to examine whether the test items of the NSSFE match the expected learning objectives, (3) to examine whether the test items of the NSSFE meet the teachers' perception, and (4) to examine whether the test items of the NSSFE meet the students' perception. The research used a descriptive qualitative method. It was an investigation of a single case or a collective case intended to capture the complexity of the object of the study. The object of the study was the reading test items of the National Standard School Final Examination (NSSFE) for the 12th grade of MAN 1 Semarang. The subjects of the study were three English teachers and 40 natural science students from the 12th grade of MAN 1 Semarang. The data sources were documents and informants. The data were collected through document analysis, interviews, a test, and questionnaires. Data triangulation was used to obtain valid information from a variety of sources. The findings showed that the test fulfilled the criteria for content validity: (1) the test items of the NSSFE match the materials, (2) the test items of the NSSFE match the learning objectives, (3) the test items of the NSSFE meet the teachers' perception, and (4) the test items of the NSSFE meet the students' perception.

Keywords: National Examination, English Test, Validity, Reliability.

1. INTRODUCTION

In education, teaching and testing are among the most important activities, and they are inseparable. They are interdependent and interrelated processes. The success of an educational setting is reflected in the success of its assessment. That is why assessment is a central part of every teaching and learning process, including English. The process starts with planning, continues with teaching and learning in the classroom and with evaluation, and ends with assessment. The quality of any assessment in any educational setting depends on the quality of its instruments.

Reading, as one of the language skills, is the practice of understanding the meaning of texts effectively and comprehensively (Anderson et al., 1985; Nakamoto et al., 2008, in Muamaroh et al., 2018). Reading can also be defined as the process of constructing meaning through dynamic interaction among the reader, the written language, and the reading situation. Reading is clearly an important activity in any language class, not only as a source of information and pleasure but also as a means of consolidating and extending one's knowledge of a language. Patel and Jain (2008: 113) state that reading is an active process which consists of recognition and comprehension skills. Among the four language skills, reading is the most necessary and important (Paul and Bruder, 1982, in Muamaroh et al., 2018). This means that reading is an important activity in life through which readers can update their knowledge. Reading can also improve learners' vocabulary and grammar, and even their understanding of content. Reading skill is an important tool for academic success.

Based on these viewpoints, it is worthwhile to carry out an inquiry whose main goal is to answer the question: To what extent is the test instrument used in the English National Standard School Final Examination (NSSFE) at MAN 1 Semarang valid and reliable? The roles, the importance, and the issue of authenticity of the NSSFE are not discussed in detail, as they are beyond the scope of this research. Instead, the research focuses on the content validity, construct validity, reliability, and item analysis of the 25 multiple-choice items and 5 essay items of the Senior High School NSSFE in the academic year 2017/2018. Based on this background, the writer is interested in conducting research on this topic. It is an analysis of a test used to evaluate students, entitled “The Validity Analysis of Reading Test Items on National Standard School Final Examination for 12th Grade of MAN 1 Semarang”.

A lot has been written about evaluation, a great deal of which is misleading and confused. Many informal educators are suspicious of evaluation because they see it as something imposed from outside. Braun et al. (2006) assert that evaluation is a process of examining student performance, comparing and judging its quality, and determining whether and how well the learner has met the course objectives. Through evaluation, teachers can assess the success of their students and adjust their lesson plans for the coming teaching days. Zimmaro (2004) adds that evaluation is based on the process of gathering, describing, or quantifying information about performance. Stufflebeam (2000) defines evaluation as a study designed and conducted to assist some audience to assess an object's merit or worth. According to Scriven (1991), evaluation is the process of determining the merit, worth, and value of things, and evaluations are the products of that process.

Brown (1998: 420) states that assessment is an integral part of the teaching-learning cycle that influences and communicates the curriculum. Brown (2004: 4) also says that assessment is a process that covers a much wider domain, namely general competence in all skills of a language. He adds that assessment is not a product, whereas a test is one form of assessment. Assessment is a process for ensuring that the objectives and goals of the course or class are met. For example, the tasks given to students should suit the objectives of the teaching and learning process. Linn and Gronlund (2000) define assessment as “the various methods used to determine the extent to which students are achieving the intended learning outcomes of instruction.”

In general, language tests can be designed as either norm-referenced or criterion-referenced tests (Braun et al., 2006). Brown (2005) claims that norm-referenced tests are designed to place examinees along a mathematical continuum in rank order. Thus, they are not designed to determine how much students have learned in a specific setting, but rather to compare students with other similar students (Thomas, Allman & Beech, 2004). Furthermore, Linn and Gronlund (2000) point out that norm-referenced evaluation typically covers a large domain of requirements with a few tasks used to measure mastery, emphasizes discrimination among individuals, favors tasks of average difficulty, omits very easy and very hard tasks, and requires a clearly defined group of persons for interpretation.

Validity is a key term in evaluation and assessment. Pearson (2012: 35) says that a valid test is an instrument that measures what it is intended to measure. The ability of a test to measure what it is supposed to measure can be improved in several ways. Pearson (2012) also notes that it can be improved by carefully matching a test with the course objectives and teaching methods, increasing the selection of objectives and content areas included in any given test, using methods that are appropriate for the specified objectives, employing a range of test methods, ensuring adequate security and supervision to avoid cheating, and improving the reliability of the test. Brown (2004: 22) states that validity is a complex criterion of a good test: the extent to which the test actually measures what it is intended to measure.

Gronlund (1998: 226), in Brown (2004), states that validity is the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment. Validity has been identified as the most important quality of a test. It concerns the extent to which meaningful inferences can be drawn from test scores (Bachman, 1990). A valid test of reading ability actually measures reading ability, not 20/20 vision, nor previous knowledge in a subject, nor some other variable of questionable relevance (Brown, 2004). In other words, a valid test measures what it should measure and assesses what it should assess.

2. METHOD

This research used a descriptive qualitative design. Brown (2005) states that a descriptive qualitative method allows the researcher to describe the data and examine relationships between variables that provide information about conditions, situations, and events that occur in the present. The subjects of the research were twelfth-grade students of MAN 1 Semarang in the academic year 2017/2018. Because of time constraints, the present study investigated only two groups of test takers, each consisting of twenty natural science students. The basic data sources were the reading test items of the English National Standard School Final Examination (NSSFE) of MAN 1 Semarang in the academic year 2017/2018, which was managed by the teachers' association (MGMP) in Kabupaten Semarang, and the English syllabus of the 2013 curriculum.

The data were taken from document analysis, interviews, questionnaires, and a test. The primary data source was the English NSSFE of MAN 1 Semarang in the academic year 2017/2018; the second was the English syllabus. The data sources were thus informants and documents. To obtain the data, the researcher first needed to determine the data collection methods to be used. To analyze the content validity of the English NSSFE, the researcher analyzed the test instrument and the English syllabus. The researcher also interviewed and distributed questionnaires to three English teachers and forty twelfth-grade students of MAN 1 Semarang. Four instruments were used in the content validation process: document analysis, interview, questionnaire, and test.

To ensure data validity, triangulation was applied in the current study in order to reduce bias. The researcher needed a range of data related to the validity analysis of the reading test items in the English NSSFE of MAN 1 Semarang. Therefore, the researcher used several data collection techniques: document analysis, interview, questionnaire, and test. The data were analyzed through description and statistical analysis.

3. FINDINGS AND DISCUSSION

This discussion is based on an analysis of the research findings. In this part, the writer compares the present findings with previous studies and then considers their compatibility with the theories.

The current research found that 27 items (90%) of the English NSSFE of MAN 1 Semarang in the academic year 2017/2018 represented the materials in the syllabus, and the remaining 3 items (10%) did not match the materials stated in the syllabus. In percentage terms, 90% of the English NSSFE items reflect materials in the syllabus. On this basis, the English NSSFE is 90% valid in terms of content validity. This is broadly in line with Sugianto's (2016) finding that all items of the English National Final Examination for junior high school represent the materials in the syllabus.
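The percentages reported above follow from simple proportions over the 30 reading items (27 matching and 3 not matching). A minimal Python sketch of that arithmetic is given here for illustration only; the function name `coverage_percentage` and the constants are assumptions introduced for this example and are not part of the study's instruments.

```python
def coverage_percentage(matched_items: int, total_items: int) -> float:
    """Return the share of test items that match the syllabus, in percent."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= matched_items <= total_items:
        raise ValueError("matched_items must be between 0 and total_items")
    return 100 * matched_items / total_items


TOTAL_ITEMS = 30    # 27 matching + 3 non-matching items, as reported
MATCHED_ITEMS = 27  # items judged to represent the syllabus materials

matched_pct = coverage_percentage(MATCHED_ITEMS, TOTAL_ITEMS)
unmatched_pct = coverage_percentage(TOTAL_ITEMS - MATCHED_ITEMS, TOTAL_ITEMS)
print(f"matched: {matched_pct:.0f}%, unmatched: {unmatched_pct:.0f}%")
```

Note that the figure describes the proportion of items that match the syllabus; it does not by itself show what proportion of the syllabus materials is sampled by the test.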

It also corresponds to Fiktorius's (2014) finding; he concluded that the test had fulfilled the criteria for content validity. However, the test developers needed to consider revising items with very low or very high item difficulty and very low item discrimination, and further action was needed to revise the implausible distractors. The present study likewise found that 90% of the test items covered the content of the materials in the syllabus. However, the remaining test items of the English NSSFE of MAN 1 Semarang need to be revised with reference to the materials stated in the syllabus.
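Item difficulty and item discrimination, which Fiktorius (2014) recommends checking, are usually computed from scored responses. The sketch below is an illustration only, not the procedure used in that study or in the present one: the response matrix is hypothetical, the function names are invented for this example, and the upper and lower groups are taken as the top and bottom 27% of test takers, a commonly used convention that is assumed here.

```python
def item_facility(responses, item):
    """Proportion of test takers who answered the item correctly (0 to 1)."""
    return sum(student[item] for student in responses) / len(responses)


def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower discrimination index for one item (-1 to 1).

    responses: one list of 0/1 item scores per student.
    fraction:  share of students in each extreme group (assumed 27%).
    """
    # Rank students by total score, highest first (sorted() is stable).
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(len(ranked) * fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    upper_correct = sum(student[item] for student in upper)
    lower_correct = sum(student[item] for student in lower)
    return (upper_correct - lower_correct) / group_size


# Hypothetical data: 6 students x 4 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
]

for item in range(4):
    p = item_facility(responses, item)
    d = discrimination_index(responses, item)
    print(f"item {item + 1}: facility={p:.2f}, discrimination={d:.2f}")
```

In this made-up data the first item is answered correctly by everyone, so it is very easy and cannot separate stronger from weaker test takers; items of that kind are the ones a revision would target.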

This is in line with the theory that content validity concerns how well the content of an instrument represents the entire universe of content that it is intended to measure. Brown (2004) notes that more emphasis is given to content validity in the assessment of what has been learnt because it is one of the major aspects required in sampling educational content and learning outcomes. As the name implies, content validity is concerned with whether the content of the test is sufficiently representative and comprehensive for the test to be a valid measure of what it is supposed to measure, which can best be examined with a table of specifications. If the analysis shows that each English test item suits the test specifications, then the test is claimed to be valid (Brown, in Kindeya, 2002).

In contrast, Kindeya (2002) concluded that, on the whole, the test items he examined did not match the syllabus contents in either the materials or the learning objectives. The current study differs: most of the test items (90%) match the content of the syllabus in terms of both the objectives and the materials, since the test instrument was constructed regionally by the teachers' association. Kindeya (2002) added that little attention had been paid to preparing a valid test in the EGSEC English examination he studied. Moreover, Rukmini's (2015) findings showed that the English National Examination does not have content-evidence of validity, because the test items do not cover all sections required by the curriculum. Validity has been identified as the most important quality of a test. It concerns the extent to which meaningful inferences can be drawn from test scores (Bachman, 1990). “A valid test of reading ability actually measures reading ability – not 20/20 vision, nor previous knowledge in a subject, nor some other variable of questionable relevance” (Brown, 2004). In other words, a valid test measures what it should measure and assesses what it should assess. In the case of the NSSFE, the test items should be able to measure the materials stated in the syllabus.

In this part, the writer compares the findings of the current study with previous studies in order to position the research. The current study found that 27 items (90%) of the English NSSFE represent the learning objectives stated in the syllabus, and the remaining 3 items (10%) did not match the syllabus. In percentage terms, 90% of the English NSSFE items reflect the learning objectives in the syllabus. By comparison, Sugianto (2016) found that all items represent the indicators in the syllabus. In other words, the English National Final Examination covers 100% of the indicators in the syllabus, which means that it is valid in terms of construct validity.

Moreover, Rukmini (2015) showed that the English National Examination does not have content-evidence of validity, since the test items do not cover all sections required by the curriculum. In contrast, the current study found that 90% of the test items of the English NSSFE of MAN 1 Semarang cover the materials and objectives of the syllabus. The test items also met the teachers' and students' perceptions. A test should, therefore, always be constructed on an explicit specification which addresses both the cognitive and linguistic abilities involved in activities in the language use domain of interest, as well as the context in which these abilities are performed. There are two major threats to construct validity: construct under-representation and construct irrelevance (Messick, 1989). Test developers need to ensure that the constructs elicited are precisely those intended and that these are not contaminated by other irrelevant variables. If important constructs are under-represented in a test, this may have an adverse backwash effect on the teaching that precedes the test.

As stated in the findings above, the interviews with the English teachers of MAN 1 Semarang in the academic year 2017/2018 showed that two teachers (66.67%) believed that the test items of the NSSFE of MAN 1 Semarang match the materials, and one teacher (33.33%) said that the test items do not match the materials. This is comparable with Aprianto's (2013) finding that the largest share of teachers (47.37%) believed that the English test (ET) in the National Exam (NE) had improved students' language ability, while a relatively high percentage of teachers (36.84%) did not really believe that it supported students' improvement. They said that the ET in the NE did not assess all skills, so it was far from a complete description of students' improvement. The other 15.79% of teachers gave essentially the same reason as those who did not really believe that the ET in the NE improved students' competences: it only assessed receptive skills (reading and listening), which were not communicative because not all skills were included, whereas the skills are actually inseparable.

The present study found that 31 students said that the NSSFE test items of MAN 1 Semarang in the academic year 2017/2018 match the materials, 6 students believed that the test items do not match the materials, and the remaining 3 students were not sure whether the test items match the materials in the syllabus. Regarding the compatibility between the NSSFE test items and the learning objectives, 30 students said that the test items match the learning objectives, 4 students believed that they do not, and the remaining 6 students were not sure. Validity has been identified as the most important quality of a test. It concerns the extent to which meaningful inferences can be drawn from test scores (Bachman, 1990). “A valid test of reading ability actually measures reading ability – not 20/20 vision, nor previous knowledge in a subject, nor some other variable of questionable relevance” (Brown, 2004). In other words, a valid test measures what it should measure and assesses what it should assess. In the case of the NSSFE, most of the students believed that the test items of the NSSFE of MAN 1 Semarang match the materials and the learning objectives, as they should, since the test items are meant to measure the materials and learning objectives stated in the syllabus.
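The respondent counts reported for the teachers and the 40 students can be converted to percentages for easier comparison. The following Python sketch uses the counts given in the text; the function name `to_percentages` and the dictionary labels are assumptions made for this illustration, and rounding to two decimals is a presentation choice rather than something stated in the study.

```python
def to_percentages(counts):
    """Convert a mapping of response label -> respondent count into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no respondents")
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Counts as reported in the study (3 teachers, 40 students).
teachers_materials = {"match": 2, "do not match": 1}
students_materials = {"match": 31, "do not match": 6, "not sure": 3}
students_objectives = {"match": 30, "do not match": 4, "not sure": 6}

for name, counts in [
    ("teachers / materials", teachers_materials),
    ("students / materials", students_materials),
    ("students / objectives", students_objectives),
]:
    print(name, to_percentages(counts))
```

On these counts, 77.5% of the students consider that the items match the materials and 75% that they match the learning objectives, which is what the phrase "most of the students" summarizes.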

A test fulfils content validity if it measures the materials that have been programmed and taught. The programmed materials are those described in the curriculum. So, to achieve content validity, the test items must be constructed on the basis of the materials programmed in the curriculum, the details of which can be seen in the syllabus. This agrees with Fulcher and Davidson (2007: 67), who state that content validity is defined as any attempt to show that the content of the test is a representative sample from the domain that is to be tested. In our example of the academic reading test it would be necessary to show that the texts selected for the ...

4. CONCLUSION

This part presents the research findings and the discussion of the findings on the English reading test items of the National Standard School Final Examination (NSSFE) of MAN 1 Semarang in order to answer the research questions posed at the outset. The findings can be summed up as follows:

Most of the test items of the English NSSFE of MAN 1 Semarang represented the materials and the learning objectives stated in the syllabus. Moreover, the interviews and questionnaires with the English teachers and students of MAN 1 Semarang indicate that most respondents considered the test items of the English NSSFE to match the objectives and materials of the syllabus. Since most of the test items match the learning objectives and materials of the syllabus, it can be said that the test items of the English NSSFE met the teachers' and students' perceptions.

In addition, in terms of face validity, the analysis determined that 27 of the items were valid and the remaining 3 items were not. The validity analysis was carried out for two groups of students, the first class and the second class, each consisting of twenty students. The results also showed that the test items of the English NSSFE of MAN 1 Semarang in the academic year 2017/2018 were reliable, since the data taken from one class and the other showed consistent results.

The current study concluded that the English NSSFE instrument of MAN 1 Semarang in the academic year 2017/2018 is a valid instrument. Since it is valid in terms of content validity, it should be able to measure students' achievement. Therefore, teaching techniques need to be improved. Test items that are constructed according to the syllabus and the magnitude of its contents make a good test that can measure and assess students' achievement in the teaching and learning process. This study is expected to provide benefits and implications for learning English in general, and specifically for the accuracy of national examination instruments in assessing students' ability and learning outcomes in English. By equipping schools with language instructional materials, continuous assessment can be introduced in grades 10 and 11 so that the skills that are difficult to assess are covered there, and a certain percentage of the students' results can be counted towards the final NSSFE exam. The present findings show that the NSSFE test items match the materials and learning objectives; great attention should therefore continue to be given to test construction by the teachers' association.

BIBLIOGRAPHY

Aprianto, K. (2013). Validity and Washback of English Test in National Examination. English Education Journal EEJ 3 (1) (2013), 7.

Bachman, L. F. (1990). Building and supporting a case for test use. Language Assessment Quarterly.

Braun, H., Kanjee, A., Bettinger, E., & Kremer, M. (2006). Improving education through assessment, innovation, and evaluation. Cambridge: American Academy of Arts and Sciences.

Brown, H. Douglas. (2004). Language Assessment: Principles and Classroom Practices. Pearson Education: NY.

Brown, H. Douglas. (2005). Managing Change and Innovation in Public Service Organizations. New York: Routledge. Osborne.

Fiktorius, T. (2014). A Validation Study on National English Examination of Junior High School in Indonesia. Postgraduate thesis of Masters Study Program of English Language Education Teacher Training and Education Faculty. Tanjungpura University Pontianak.

Kindeya, N. T. (2002). The Content Validity of the Ethiopian General Secondary Education Certificate English Examination. A Thesis of The School of Graduate Studies Addis Ababa University, 111.

Linn, R., Gronlund, E. (2000). Measurement and Assessment in Teaching Edition 8. illustrated. Publisher

Patel, M. E. and Jain, P. M. (2008). English Language Teaching (Methods, Tools & Techniques). Jaipur: Sunrise Publishers & Distributors.

Rukmini, D. (2015). Does the National Examination of English Subject in Indonesia Test What Should Be Tested?. Ragam Jurnal Pengembangan Humaniora Vol. 15 No. 3, Desember 2015

Scriven, M. (1991). Evaluation Thesaurus. SAGE, Aug 27, 1991 - Language Arts & Disciplines - 391 pages.

Stufflebeam, D. L. (2000). Evaluation Models: Viewpoints on Educational and Human Services Evaluation (2nd ed., pp. 280-317). Boston, MA: Kluwer Academic.

Sugianto, A. (2016). An analysis of English National Final Examination for Junior High School in Term of Validity and Reliability. Journal on English as a Foreign Language p-ISSN 2088-1657 Volume 6, Number 1, 12.

Thomas, J., Allman, C., & Beech, M. (2004). Assessment for the diverse classroom: A handbook for teachers. Tallahassee, FL: Florida Department of Education, Bureau of Exceptional Education and Student Services. Retrieved from http://www.fldoe.org/ese/pdf/assess_diverse.pdf. Volante and Fazio.

Yibrah, M. (2017). Assessing Content Validity of the EGSEC English Examinations. International Journal of Innovations in TESOL and Applied Linguistics Vol. 3, No. 1; 2017 ISSN 2454-6887, 66.
