THE VALIDITY ANALYSIS OF READING TEST ITEMS ON
NATIONAL STANDARD SCHOOL FINAL EXAMINATION
FOR 12th GRADE OF MAN 1 SEMARANG
Submitted to the Department of Language Studies,
Graduate School of Universitas Muhammadiyah Surakarta
in partial fulfilment of the requirements for
the degree of Master of Education
By :
MUHAMMAD NURYANTO
S200160074
GRADUATE PROGRAM
MAGISTER OF LANGUAGE STUDIES
UNIVERSITAS MUHAMMADIYAH SURAKARTA
2018
SUPERVISOR’S APPROVAL FORM
THE VALIDITY ANALYSIS OF READING TEST ITEMS ON NATIONAL
STANDARD SCHOOL FINAL EXAMINATION FOR 12TH GRADE OF
MAN 1 SEMARANG
MUHAMMAD NURYANTO
S 200160074
This publication article has been approved by the advisors to be examined by the
board of examiners.
First Advisor
Muamaroh, Ph.D.
Second Advisor
Mauly Halwat Hikmat, Ph.D.
STATEMENT OF AUTHORSHIP
The writer hereby declares that the thesis entitled “The Validity Analysis
of Reading Test Items on National Standard School Final Examination for
12th Grade of MAN 1 Semarang” is his own work and does not contain material
written or published by other people, except for information taken from the
references. The thesis complies with the rules and regulations of Universitas
Muhammadiyah Surakarta with respect to plagiarism.
The writer is prepared to take responsibility for this thesis if in the future
it is proved to contain others' ideas or to imitate another thesis. This
declaration is made by the writer in the hope that it will be understood.
Surakarta, 16th of October, 2018
The Writer
Muhammad Nuryanto
THE VALIDITY ANALYSIS OF READING TEST ITEMS ON
NATIONAL STANDARD SCHOOL FINAL EXAMINATION
FOR 12th GRADE OF MAN 1 SEMARANG
Abstrak
The objectives of this study were (1) to examine whether the NSSFE test items
match the materials, (2) to examine whether the NSSFE test items match the
expected learning objectives, (3) to examine whether the NSSFE test items meet
the teachers' perception, and (4) to examine whether the NSSFE test items meet
the students' perception. This is a descriptive qualitative study complemented
by statistical analysis. It was an investigation of a single case or collective
case to capture the complexity of the object of the study. The object of the
study was the reading items of the Ujian Sekolah Berstandar Nasional (USBN) for
the 12th grade of MAN 1 Semarang. The subjects were three English teachers and
40 science-major (IPA) students from the 12th grade of MAN 1 Semarang. The data
sources were documents and informants. The data collection techniques were
document analysis, interviews, and questionnaires. Data validity was established
through data triangulation, in order to obtain accountable information from a
variety of sources. The findings show that the test meets the criteria for
content validity, since (1) the USBN test items match the materials, (2) the
USBN test items match the learning objectives, (3) the USBN test items meet the
teachers' perception, and (4) the USBN test items meet the students' perception.
Keywords: National Examination, English Test, Validity, Reliability.
Abstract
The objectives of this study were (1) to examine whether the test items of NSSFE
match the materials, (2) to examine whether the test items of NSSFE match the
expected learning objectives, (3) to examine whether the test items of NSSFE
meet the teachers' perception, and (4) to examine whether the test items of
NSSFE meet the students' perception. The study used a descriptive qualitative
research method. It was an investigation of a single case or collective case to
capture the complexity of the object of the study. The object of the study was
the reading test items on the National Standard School Final Examination for the
12th grade of MAN 1 Semarang. The subjects of the study were three English
teachers and 40 Natural Science major students from the 12th grade of MAN 1
Semarang. The data sources were documents and informants. The data collection
techniques were document analysis, interview, test and questionnaire. Data
validity was established through data triangulation, to gain valid information
from a variety of sources. The findings showed that the test fulfilled the
criteria for content validity, since (1) the test items of NSSFE match the
materials, (2) the test items of NSSFE match the learning objectives, (3) the
test items of NSSFE meet the teachers' perception, and (4) the test items of
NSSFE meet the students' perception.
Keywords: National Examination, English Test, Validity, Reliability.
1. INTRODUCTION
In education, teaching and testing are among the most important activities, and
they are inseparable: the two processes are interdependent and interrelated. The
success of an educational programme is reflected in the success of its
assessment. That is why assessment is a central part of every teaching and
learning process, including English. The process starts with planning, continues
with teaching and learning in the classroom and evaluation, and ends with
assessment. The quality of any assessment in any educational setting results
from the quality of its instruments.
Reading, as one of the language skills, is the practice of understanding the
meaning of texts effectively and comprehensively (Anderson et al, 1985; Nakamoto
et al, 2008 in Muamaroh et al, 2018). Reading can also be defined as the process
of constructing meaning through dynamic interaction between the reader and the
written language, in line with the reading situation. Clearly, reading is an
important activity in any language class, not only as a source of information
and pleasure but also as a means of consolidating and extending one's knowledge
of a language. Patel and Jain (2008:113) state that reading is an active process
which consists of recognition and comprehension skills. Among the four language
skills, reading is the most necessary and important (Paul and Bruder, 1982 in
Muamaroh et al, 2018). This means that reading is an important activity in life
through which readers can update their knowledge. Reading can also improve
learners' vocabulary and grammar, and even their understanding of content.
Reading skill is an important tool for academic success.
Based on these viewpoints, it is worthwhile to carry out an inquiry whose main
goal is to answer the question: To what extent is the test instrument used in
the English National Standard School Final Examination (NSSFE) in MAN 1 Semarang
valid and reliable? The roles, the importance, and the issue of authenticity of
the NSSFE are not discussed in detail, as they are beyond the scope of this
research. Instead, the research focuses on the content validity, construct
validity, reliability, and item analysis of the 25 multiple-choice test items
and 5 essay test items of the Senior High School NSSFE in the academic year
2017/2018. Based on the background above, the writer was interested in
conducting research on this topic. It is an analysis of a student evaluation
test, with the title: “The Validity Analysis of Reading Test Items on National
Standard School Final Examination for 12th Grade of MAN 1 Semarang”.
A lot has been written about evaluation, a great deal of which is misleading and
confused. Many informal educators are suspicious of evaluation because they see
it as something that is imposed from outside. Braun, et al. (2006) assert that
evaluation is a process of examining student performance, comparing it, judging
its quality, and determining whether and how well the learner has met the course
objectives. Through evaluation, teachers can assess the success of the students
and adjust the lesson plans for the coming teaching days. Zimmaro (2004)
additionally explains that evaluation is based on the process of gathering,
describing, or quantifying information about performance. According to
Stufflebeam (2000), evaluation is a study designed and conducted to assist some
audience to assess an object's merit or worth. According to Scriven (1991),
evaluation is the process of determining the merit, worth and value of things,
and evaluations are the products of that process.
Brown (1998: 420) stated that assessment is an integral part of the
teaching-learning cycle that influences and communicates the curriculum. Brown
(2004: 4) also said that assessment is a process that covers a much wider
domain, namely general competence in all skills of a language. He added that
assessment is not a product, whereas a test is one form of assessment.
Assessment is a process to ensure that the course or class objectives and goals
are met. For example, the tasks given to students should suit the objectives of
the teaching and learning process. Linn and Gronlund (2000) defined assessment
as “the various methods used to determine the extent to which students are
achieving the intended learning outcomes of instruction.”
In general, language tests can be designed as either norm-referenced or
criterion-referenced tests (Braun, et al., 2006). Brown (2005) claims that
norm-referenced tests are designed to place examinees along a mathematical
continuum in rank order. Thus, they are not designed to determine how much
students have learned in a specific setting, but rather to compare students with
other similar students (Thomas, Allman & Beech, 2004). Furthermore, Linn and
Gronlund (2000) point out that norm-referenced evaluation typically covers a
large domain of requirements with a few tasks used to measure mastery,
emphasizes discrimination among individuals, favors tasks of average difficulty,
omits very easy and very hard tasks, and requires a clearly defined group of
persons for interpretation.
A key concept in evaluation and assessment is validity. Pearson (2012: 35) said
that a valid test is an instrument that measures what it is intended to measure.
The ability of a test to measure what it is supposed to measure can be improved
in several ways. Parson (2012) notes that it can be improved by carefully
matching a test with the course objectives and teaching methods, increasing the
selection of objectives and content areas included in any given test, using
methods that are appropriate for the specified objectives, employing a range of
test methods, ensuring adequate security and supervision to avoid cheating, and
improving the reliability of the test. Brown (2004: 22) stated that validity is
the most complex criterion of a good test: whether the test actually measures
what it is intended to measure. Gronlund (1998: 226) in Brown (2004) stated that
validity is the extent to which inferences made from assessment results are
appropriate, meaningful, and useful in terms of the purpose of the assessment.
Validity has been identified as the most important quality of a test. It
concerns the extent to which meaningful inferences can be drawn from test scores
(Bachman, 1990). A valid test of reading ability actually measures reading
ability, not 20/20 vision, nor previous knowledge in a subject, nor some other
variable of questionable relevance (Brown, 2004). In other words, a valid test
measures what should be measured and assesses what it should assess.
2. METHOD
The research used a descriptive qualitative design. Brown (2005) stated that the
descriptive qualitative research method allows the researcher to describe the
data and examine relationships between variables that provide information about
conditions, situations, and events that occur in the present. The subjects of
the research were the twelfth grade students of MAN 1 Semarang in the academic
year of 2017/2018. Because of time constraints, the present study investigated
only two groups of test takers, each consisting of twenty students from the
Natural Science major. The basic data sources were the reading test items of the
English National Standard School Final Examination (NSSFE) of MAN 1 Semarang in
the academic year of 2017/2018, which was managed by the teachers association
(MGMP) in Kabupaten Semarang, and the English syllabus of the 2013 curriculum.
The data were taken from document analysis, interview, questionnaire, and test.
The first basic data source was the English National Standard School Final
Examination (NSSFE) of MAN 1 Semarang in the academic year of 2017/2018; the
second was the English syllabus. The data sources were informants and documents.
To obtain the data, the researcher first needed to determine the data collection
methods to be used. To analyze the content validity of the English NSSFE, the
researcher analyzed the test instrument and the English syllabus. The researcher
also interviewed, and distributed questionnaires to, three English teachers and
forty twelfth grade students of MAN 1 Semarang. Four instruments were used in
the content validation process: document analysis, interview, questionnaire, and
test.
To ensure data validity, triangulation was applied in the current study in order
to reduce bias. The researcher needed a range of data related to the validity
analysis of the reading test items in the English National Standard School Final
Examination (NSSFE) of MAN 1 Semarang. Therefore, the researcher used several
data collection techniques to make the data valid, as follows: document
analysis, interview, questionnaire, and test. The data were analyzed through
description and statistical analysis.
3. FINDINGS AND DISCUSSION
The discussion is based on an analysis of the research findings. In this part,
the writer compares the present findings with those of previous studies and then
examines their compatibility with the theories.
The current research found that 27 items (90%) of the English National Standard
School Final Examination (NSSFE) of MAN 1 Semarang in the academic year of
2017/2018 represented the materials in the syllabus, and the remaining 3 items
(10%) did not match the materials stated in the syllabus. Expressed as a
percentage, the English NSSFE covers 90% of the materials in the syllabus, which
means that it is 90% valid in terms of content validity. This finding is broadly
in line with that of Sugianto (2016), who found that all items of the English
National Final Examination for junior high school represent the materials in the
syllabus.
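The percentages reported above follow from a simple proportion. The short Python sketch below is only an illustration of that arithmetic, not the procedure used in the study; the function name coverage_percentage is hypothetical, and the item counts are those reported in the findings (27 matching items out of 30).

```python
def coverage_percentage(matching_items: int, total_items: int) -> float:
    """Return the share of test items that match the syllabus, in percent."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    return 100.0 * matching_items / total_items


# Counts taken from the findings: 30 items in total, 27 judged to match.
total = 30
matched = 27

coverage = coverage_percentage(matched, total)
unmatched = coverage_percentage(total - matched, total)

print(f"matched: {coverage:.0f}%, unmatched: {unmatched:.0f}%")
```

The same calculation applies to the learning objectives, for which the study reports the same counts.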
It also corresponds to the finding of Fiktorius (2014), who concluded that the
test he examined fulfilled the criteria for content validity, but that the test
developers needed to consider revising items with very low or very high item
difficulty and very low item discrimination, and to revise the implausible
distractors. The present study likewise found that the test items covered 90% of
the content of the materials in the syllabus, but that the test items of the
English NSSFE of MAN 1 Semarang still need to be revised with reference to the
materials stated in the syllabus.
This is in line with the theory that content validity is applied to see how well
the content of the instrument represents the entire universe of content that it
aims to measure. Brown (2004) notes that more emphasis is given to content
validity in the assessment of what has been learnt, because it is one of the
major aspects required in sampling educational contents and learning outcomes.
As the name implies, content validity is concerned with whether or not the
content of the test is sufficiently representative and comprehensive for the
test to be a valid measure of what it is supposed to measure; this can best be
examined with the table of specifications. If the analysis shows that each
English test item suits the test specifications, then the test is claimed to be
valid (Brown, in Kindeya, 2002).
In contrast, Kindeya (2002) concluded that, on the whole, the test items he
examined did not match the syllabus contents in either the materials or the
learning objectives. This differs from the current study, in which most of the
test items (90%) match the content of the syllabus in terms of both the
objectives and the materials, since the test instrument was constructed
regionally by the teachers association in MAN 1 Semarang. Kindeya (2002) added
that, for the EGSEC English examination he studied, little attention was paid to
preparing a valid test. Moreover, Rukmini's (2015) finding showed that the
English National Examination does not have content evidence of validity: the
test items do not cover all sections required by the curriculum. Validity is the
most important quality of a test and concerns the extent to which meaningful
inferences can be drawn from test scores (Bachman, 1990); a valid reading test
measures reading ability rather than some other variable of questionable
relevance (Brown, 2004). In the case of NSSFE, the test items should be able to
measure the materials stated in the syllabus.
In this part, the writer compares the findings of the current study with
previous studies in order to position the research. The current study found that
27 items (90%) of the English National Standard School Final Examination (NSSFE)
represent the learning objectives stated in the syllabus, and the remaining 3
items (10%) did not match the syllabus. Expressed as a percentage, the English
NSSFE covers 90% of the learning objectives in the syllabus. By comparison,
Sugianto (2016) found that all items represent the indicators in the syllabus.
In other words, the English National Final Examination covers 100% of the
indicators in the syllabus, which means that it is valid in terms of construct
validity.
Moreover, Rukmini (2015) showed that the English National Examination does not
have content evidence of validity, since the test items do not cover all
sections required by the curriculum. In contrast, the current study found that
90% of the test items of the English NSSFE of MAN 1 Semarang cover the materials
and objectives of the syllabus. The items also met the teachers' and students'
perceptions. A test should, therefore, always be constructed on an explicit
specification which addresses both the cognitive and linguistic abilities
involved in activities in the language use domain of interest, as well as the
context in which these abilities are performed. There are two major threats to
construct validity: construct under-representation and construct irrelevance
(Messick, 1989). Test developers need to ensure that the constructs elicited are
precisely those intended and that they are not contaminated by other irrelevant
variables. If important constructs are under-represented in a test, this may
have an adverse backwash effect on the teaching that precedes the test.
As stated in the findings above, the interviews with the English teachers of MAN
1 Semarang in the academic year of 2017/2018 showed that two teachers (66.67%)
believed that the test items of NSSFE of MAN 1 Semarang match the materials,
while one teacher (33.33%) said that they do not match the materials. This can
be compared with Aprianto's (2013) finding that the largest share of teachers
(47.37%) believed that the English test (ET) in the National Exam (NE) had
improved students' language ability, while a relatively high percentage of
teachers (36.84%) did not really believe that it supported students'
improvement. The latter group said that the ET in the NE did not assess all
skills, so it was far from giving a complete description of students'
improvement. The remaining 15.79% of teachers said essentially the same thing as
those who did not really believe that the ET in the NE improved students'
competences: it assessed only receptive skills (reading and listening), which is
not communicative because not all skills are included, whereas the skills are in
fact inseparable.
The present study found that 31 students said that the NSSFE test items of MAN 1
Semarang in the academic year of 2017/2018 match the materials, 6 students
believed that the test items do not match the materials, and the remaining 3
students were not sure whether the test items match the materials in the
syllabus. Regarding the compatibility between the NSSFE test items and the
learning objectives, 30 students said that the test items match the learning
objectives, 4 students believed that they do not, and the remaining 6 students
were not sure. Validity is the most important quality of a test and concerns the
extent to which meaningful inferences can be drawn from test scores (Bachman,
1990); a valid reading test measures reading ability rather than some other
variable of questionable relevance (Brown, 2004). In the case of NSSFE, most of
the students believed that the test items of NSSFE of MAN 1 Semarang match the
materials and the learning objectives, which is what is required of test items
that are meant to measure the materials and learning objectives stated in the
syllabus.
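To make the student responses easier to compare, the counts above can be converted into percentages of the 40 respondents. The Python sketch below is an illustration only and was not part of the study; the function name tally_percentages and the response labels are hypothetical, while the counts are those reported above.

```python
def tally_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Convert response counts into percentages of all respondents."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to tally")
    return {label: round(100.0 * n / total, 2) for label, n in counts.items()}


# Student responses reported in the findings (40 students in total).
materials = {"match": 31, "do not match": 6, "not sure": 3}
objectives = {"match": 30, "do not match": 4, "not sure": 6}

materials_pct = tally_percentages(materials)
objectives_pct = tally_percentages(objectives)

for name, result in (("materials", materials_pct), ("objectives", objectives_pct)):
    print(name, result)
```

On these counts, 77.5% of the students judged that the items match the materials and 75% that they match the learning objectives.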
A test fulfils content validity if it measures the materials that have been
programmed and taught. The programmed materials are those described in the
curriculum. So, to achieve content validity, the test items must be constructed
on the basis of the materials programmed in the curriculum, the details of which
can be seen in the syllabus. This is in line with Fulcher and Davidson (2007:
67), who state that content validity is defined as any attempt to show that the
content of the test is a representative sample from the
domain that is to be tested. In our example of the academic reading test it would
be necessary to show that the texts selected for the.
4. CONCLUSION
This part summarizes the research findings and their discussion for the English
reading test items of the National Standard School Final Examination (NSSFE) of
MAN 1 Semarang, in order to answer the research questions posed in the
introduction. The findings can be summed up as follows:
Most of the test items of the English National Standard School Final Examination
(NSSFE) of MAN 1 Semarang represented the materials and the learning objectives
stated in the syllabus. Moreover, the interviews and questionnaires completed by
the English teachers and students of MAN 1 Semarang indicate that most of the
test items of the English NSSFE match the objectives and materials of the
syllabus. Since most of the test items match the learning objectives and
materials of the syllabus, it can be said that the test items of the English
NSSFE met the teachers' and students' perceptions.
In addition, the analysis showed that, in terms of face validity, 27 of the
questions were valid items and the remaining 3 items were not valid. The
validity analysis was carried out for two groups of students, the first class
and the second class, each consisting of twenty students. The results also
showed that the test items of the English National Standard School Final
Examination (NSSFE) of MAN 1 Semarang in the academic year of 2017/2018 were
reliable, since the data taken from one class and the other showed consistent
results.
The current study concluded that the instrument of the English National Standard
School Final Examination of MAN 1 Semarang in the academic year of 2017/2018 was
a valid instrument. Since it was valid in terms of content validity, it should
be able to measure students' achievement. Teaching techniques nevertheless still
need to be improved. Test items that are constructed according to the syllabus
and the magnitude of its contents make a good test that can measure and assess
students' achievement in the teaching and learning process. This study is
expected to provide benefits and implications for learning English in general,
and specifically for the accuracy of national examination instruments in
assessing the ability and learning outcomes of students in English learning. If
schools are equipped with language instructional materials, continuous
assessment can be introduced in grades 10 and 11, so that skills that are
difficult to assess are covered there and a certain percentage of those results
can be counted towards the final NSSFE result. The present findings show that
the NSSFE test items match the materials and learning objectives; the teachers
association should therefore continue to give great attention to test
construction.
BIBLIOGRAPHY
Aprianto, K. (2013). Validity and Washback of English Test in National
Examination. English Education Journal EEJ 3 (1) (2013), 7.
Bachman, L. F. (1990). Building and supporting a case for test use. Language
Assessment Quarterly.
Braun, H., Kanjee, A., Bettinger, E., & Kremer, M. (2006). Improving education
through assessment, innovation, and evaluation. Cambridge: American
Academy of Arts and Sciences.
Brown, H. Douglas. (2004). Language Assessment: Principles and Classroom
Practices. Pearson Education: NY.
Brown, H. Douglas, (2005). Managing Change and Innovation in. Public Service
Organizations. New York: Routledge. Osborne.
Fiktorius, T. (2014). A Validation Study on National English Examination of
Junior High School in Indonesia. Postgraduate thesis of Masters Study
Program of English Language Education Teacher Training and Education
Faculty. Tanjungpura University Pontianak.
Kindeya, N. T. (2002). The Content Validity of the Ethiopian General Secondary
Education Certificate English Examination. A Thesis of The School of
Graduate Studies Addis Ababa University, 111.
Linn, R., & Gronlund, E. (2000). Measurement and Assessment in Teaching (8th
ed., illustrated).
Patel, M. E. and Jain, P. M. (2008). English Language Teaching (Methods, Tools
& Techniques). Jaipur: Sunrise Publishers & Distributors.
Rukmini, D., (2015). Does the National Examination of English Subject in
Indonesia Test What Should Be Tested?. Ragam Jurnal Pengembangan
Humaniora Vol. 15 No. 3, Desember 2015
Scriven, M. (1991). Evaluation Thesaurus. SAGE.
Stufflebeam, D. L. (2000). Evaluation Models: Viewpoints on Educational and
Human Services Evaluation (2nd ed., pp. 280-317). Boston, MA: Kluwer
Academic.
Sugianto, A. (2016). An analysis of English National Final Examination for Junior
High School in Term of Validity and Reliability. Journal on English as a
Foreign Language p-ISSN 2088-1657 Volume 6, Number 1, 12.
Thomas, J., Allman, C., & Beech, M. (2004). Assessment for the diverse
classroom: A handbook for teachers. Tallahassee, FL: Florida
Department of Education, Bureau of Exceptional Education and Student
Services. Retrieved from http://www.fldoe.org/ese/pdf/assess_diverse.pdf.
Volante and Fazio.
Yibrah, M. (2017). Assessing Content Validity of the EGSEC English
Examinations. International Journal of Innovations in TESOL and Applied
Linguistics Vol. 3, No. 1; 2017 ISSN 2454-6887, 66.