東京外国語大学 博士学位論文 Doctoral thesis (Tokyo University of Foreign Studies)

CHAPTER 5 RESULTS

5.1 Descriptive Statistics

5.1.1 Test Set A

5.1.1.1 Overall Results

     Table 5-1 shows the number of test takers, mean, standard deviation, maximum score, minimum score, and KR-20 of Test Set A with regard to all test takers. Overall, the present author finds no problematic results in the descriptive statistics. The KR-20 may seem a bit low, considering that GTEC administered in a formal situation usually bears a KR-20 higher than 0.9. However, since the test employed in the present study had only 27 test items with a sufficient, but limited, number of test takers, it cannot be helped that the reliability becomes lower than that of GTEC. Taking this, and the fact that the test items were written by the present author with no particular means of item banking, as the causes of the decreased reliability, the reliability of 0.768 seems acceptable for the present situation.

Table 5-1 Descriptive statistics and reliability coefficients of Test Set A

# of Test Takers   Mean   S.D.   Minimum   Maximum   KR-20
573                17.7   4.5    4         27        0.768
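For readers who want to see how a KR-20 coefficient such as the one in Table 5-1 is obtained, the following is a minimal sketch in Python. It assumes a matrix of dichotomous (0/1) item responses and uses the population variance of the total scores. The thesis does not state which software or variance convention produced the reported value, so the function name `kr20` and the toy response matrix are purely illustrative.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored (0/1) items.

    `responses` is a list of rows, one row per test taker and one
    column per item. Population variance is assumed for the total
    scores; this convention is an assumption, not taken from the thesis.
    """
    n_people = len(responses)
    k = len(responses[0])  # number of items
    # Proportion correct (facility value) for each item.
    p = [sum(row[i] for row in responses) / n_people for i in range(k)]
    # Sum of the item variances p * (1 - p).
    sum_pq = sum(pi * (1 - pi) for pi in p)
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data: 4 test takers, 3 items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(toy), 3))
```

Because the factor k / (k - 1) and the ratio of summed item variances to total-score variance both depend on the number of items, a short test will generally show a lower coefficient than a long one built from comparable items.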

     There was no minimum score of zero, though the maximum score was a full mark. Ideally, there should be no full marks in a proficiency test, because a full mark indicates that the test did not accurately measure the test taker's ability, which could have been beyond what was tested. However, since this test was not quite a proficiency test, and since only one test taker obtained a full mark, it was judged that this aspect of the results posed no threat to the reliability of the present study.

     The histogram in Figure 5-1 shows the distribution of the test takers' scores as a whole. The statistics bore -0.808 for skewness and 0.227 for kurtosis, which allows the present author to determine that the curve is close enough to a normal curve, although the peak lies slightly to the right. This problem was solved when different ability groups were determined.

Figure 5-1 Histogram of overall results for Test Set A

[Histogram: x-axis SCORES (0 to 30), y-axis FREQUENCY]
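The skewness and kurtosis figures quoted for Figure 5-1 can be illustrated with a short moment-based calculation. The sketch below assumes the population (divide-by-n) definitions and reports excess kurtosis, so that a normal curve gives values near zero. The thesis does not say which estimator its software used, and the function name and sample scores are hypothetical.

```python
def skewness_and_kurtosis(scores):
    """Return (skewness, excess kurtosis) from central moments.

    Uses g1 = m3 / m2**1.5 and g2 = m4 / m2**2 - 3 with population
    (divide-by-n) moments; statistical packages may apply small-sample
    corrections, so their results can differ slightly.
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    if m2 == 0:
        raise ValueError("scores have zero variance")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


# Hypothetical scores with a longer tail on the low side,
# which produces a negative skewness (peak toward the right).
toy_scores = [4, 12, 15, 17, 18, 19, 20, 20, 21, 22]
skew, kurt = skewness_and_kurtosis(toy_scores)
print(f"skewness={skew:.3f}, kurtosis={kurt:.3f}")
```

A negative skewness corresponds to the situation described in the text, where the bulk of the scores sits to the right of the centre and a thinner tail extends toward the low scores.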

5.1.1.2 Item Validation

     Facility value (percentage correct) and discrimination index (point-biserial coefficient) for each item in Test Set A are provided in Table 5-2. Items 1, 2, 7 and 16 seem a little problematic when their point-biserial coefficients are examined. One reason could be that, among the four options, the answer in each item was unclear and hard to distinguish from the other options due to defective construction. Furthermore, the fact that the percentages correct for items 2, 7 and 16 are especially low could mean that they were so difficult that even those who had scored well on the test as a whole tended to get them wrong, resulting in low point-biserial coefficients. However, overall, the figures seemed satisfactory for a test instrument to be employed in this study, and it was decided to employ all the items in the analyses that follow.

Table 5-2 Percentage Correct and Point-biserial Coefficient of Items in Test Set A

ITEM#     PC     PBs
1         0.66   0.16
2         0.34   0.07
3         0.37   0.26
4         0.92   0.36
5         0.84   0.30
6         0.46   0.23
7         0.39   0.00
8         0.58   0.25
9         0.36   0.28
13        0.68   0.49
14        0.72   0.54
15        0.90   0.40
16        0.37   0.04
17        0.70   0.55
18        0.71   0.47
19        0.81   0.53
20        0.71   0.42
21        0.51   0.46
22        0.84   0.51
23        0.71   0.50
24        0.74   0.54
25        0.76   0.49
26        0.78   0.57
27        0.87   0.40
28        0.71   0.55
29        0.68   0.58
30        0.63   0.47
Average   0.66   0.39
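The two indices reported in Table 5-2 can be sketched as follows. Facility value is the proportion of correct answers, and the point-biserial coefficient is computed here as the Pearson correlation between a 0/1 item score and the total score. Whether the thesis removed each item from the total before correlating is not stated, so this sketch simply uses the total as given; the function names and toy data are hypothetical.

```python
from math import sqrt


def facility_value(item_scores):
    """Proportion of test takers answering the item correctly."""
    return sum(item_scores) / len(item_scores)


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item and the total score.

    Returns 0.0 when either variable has no variance. The total is
    used as given (the item is not removed from it); that choice is
    an assumption of this sketch.
    """
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_t = sum(total_scores) / n
    cov = sum((x - mean_x) * (t - mean_t)
              for x, t in zip(item_scores, total_scores)) / n
    var_x = sum((x - mean_x) ** 2 for x in item_scores) / n
    var_t = sum((t - mean_t) ** 2 for t in total_scores) / n
    if var_x == 0 or var_t == 0:
        return 0.0
    return cov / sqrt(var_x * var_t)


# Hypothetical toy data: rows are test takers, columns are items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
totals = [sum(row) for row in toy]
for i in range(3):
    item = [row[i] for row in toy]
    pc = facility_value(item)
    pbs = point_biserial(item, totals)
    print(f"item {i + 1}: PC={pc:.2f}, PBs={pbs:.2f}")
```

An item whose coefficient is near zero, such as item 7 in Table 5-2, is one on which high and low scorers were about equally likely to answer correctly.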

5.1.1.3 Predetermining the Ability Groups

     As explained in 4.3.1, the ability groups, Group A-Low and Group A-High, were predetermined based on the overall descriptive statistics of Test Set A. Examining Figure 5-2, which is the score distribution table for the whole population of Test Set A, the present author detected something obscure about the population who scored 9 and under. They seem to deviate from the rest of the group since they form a small normal curve of their own. Furthermore, when the distribution was reviewed from 11 (items correct) to 26, it seemed to form a nearly perfect normal curve. Since the median is 19 (items correct), those test takers who fall in the zone between lines b and c would be considered as having marginal ability, between those who would be considered as having high ability and those considered as having low ability. When the population percentage was calculated for each zone, the population that fell in the zone between lines a and b was 31.6%, between lines b and c 28.1%, and below line c 31.4%. Since the grouping seemed to allot roughly the same percentage of people to each group, the present author decided that those who had scored between 11 and 17 would be predetermined to be in Group A-Low and those between 21 and 27 in Group A-High.

Figure 5-2 Score distribution table for Test Set A

Number    Freq-    Cum
Correct   uency    Freq    PR   PCT
. . . No examinees below this score . . .
   3         0        0     1     0   |
   4         2        2     1     0   |
   5         5        7     1     1   |#
   6         4       11     2     1   |#
   7         9       20     3     2   |##
   8        10       30     5     2   |##
   9         6       36     6     1   |#
  10        15       51     9     3   |###             51 people (8.9%)
------------------------------------------------------------------------ a
  11         7       58    10     1   |#
  12        24       82    14     4   |####
  13        18      100    17     3   |###
  14        24      124    22     4   |####            181 people (31.6%)
  15        30      154    27     5   |#####           [Group A-Low]
  16        24      178    31     4   |####
  17        54      232    40     9   |#########
------------------------------------------------------------------------ b
  18        51      283    49     9   |#########       161 people (28.1%)
  19        43      326    57     8   |########        ← median
  20        67      393    69    12   |############
------------------------------------------------------------------------ c
  21        73      466    81    13   |#############
  22        46      512    89     8   |########
  23        31      543    95     5   |#####           180 people (31.4%)
  24        17      560    98     3   |###             [Group A-High]
  25         8      568    99     1   |#
  26         4      572    99     1   |#
                                      +----+----+----+----+----+
                                           5   10   15   20   25
                                      Percentage of Examinees

(The row for 27 items correct is omitted from the table because its frequency was 1, fewer than the 4 examinees needed for a # to be printed.)

     Table 5-3 shows the number of test takers, mean, standard deviation, maximum score and minimum score for Group A-Low and Group A-High.

Table 5-3 Descriptive statistics for Group A-Low and Group A-High

Group    # of Test Takers   Mean   S.D.   Minimum   Maximum
A-Low    181                14.8   1.9    11        17
A-High   180                22.2   1.3    21        27
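The group sizes, percentages and means reported for the predetermined groups can be checked directly from the score frequencies. The sketch below is illustrative only: the frequencies are read from Figure 5-2, with the rows for 18 to 20 items correct derived from the cumulative frequencies and the single score of 27 taken from the figure's footnote. The zone labels other than A-Low and A-High, and all identifiers, are hypothetical.

```python
# Frequencies by number of items correct, read from Figure 5-2.
# The rows for 18-20 are derived from the cumulative frequencies,
# and the single full score of 27 comes from the figure's footnote.
FREQ = {
    3: 0, 4: 2, 5: 5, 6: 4, 7: 9, 8: 10, 9: 6, 10: 15,
    11: 7, 12: 24, 13: 18, 14: 24, 15: 30, 16: 24, 17: 54,
    18: 51, 19: 43, 20: 67,
    21: 73, 22: 46, 23: 31, 24: 17, 25: 8, 26: 4, 27: 1,
}

# Inclusive score ranges; labels other than A-Low and A-High
# are hypothetical names used only in this sketch.
ZONES = {
    "below line a": (3, 10),
    "A-Low": (11, 17),
    "marginal": (18, 20),
    "A-High": (21, 27),
}


def summarize(freq, low, high):
    """Return (count, mean score) for scores in [low, high]."""
    count = sum(f for s, f in freq.items() if low <= s <= high)
    total = sum(s * f for s, f in freq.items() if low <= s <= high)
    return count, total / count


n_all = sum(FREQ.values())
for name, (low, high) in ZONES.items():
    count, mean = summarize(FREQ, low, high)
    share = 100 * count / n_all
    print(f"{name}: n={count}, share={share:.1f}%, mean={mean:.1f}")
```

Cutting at fixed score boundaries rather than at percentiles keeps the marginal zone around the median out of both groups, which is the design the text describes.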

5.1.2 Test Set B

5.1.2.1 Overall Results

     Table 5-4 shows the number of test takers, mean, standard deviation, maximum score, minimum score, and KR-20 of Test Set B. Overall, the present author finds no problematic results in the descriptive statistics. The KR-20 may seem a bit too low, considering that GTEC and TOEIC administered in a formal situation usually bear a KR-20 higher than 0.9. However, since the tests employed in the present study had only 27 test items, far fewer than the number of items included in the original tests, with a sufficient but limited number of test takers, the drop in the index seems unavoidable. Assessing this as an undisturbing element in the present situation, the present author decided to proceed with this result.

Table 5-4 Descriptive statistics and reliability coefficients of Test Set B

# of Test Takers   Mean   S.D.   Minimum   Maximum   KR-20
257                16.3   4.0    4         25        0.675
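The argument that a 27-item test will show a lower reliability than a much longer test can be illustrated with the Spearman-Brown prophecy formula from classical test theory. This formula is not used in the thesis itself; it is offered here only as a hedged illustration, and it assumes that any added items would be parallel to the existing ones. The lengthening factor of 3 is a hypothetical choice, not the actual ratio between these test sets and the original tests.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by
    `length_factor`, assuming the added items are parallel to the
    existing ones (classical test theory)."""
    return (length_factor * reliability) / (
        1 + (length_factor - 1) * reliability
    )


# KR-20 values reported in Tables 5-1 and 5-4.
for label, kr20_value in [("Test Set A", 0.768), ("Test Set B", 0.675)]:
    # A factor of 3 (27 -> 81 items) is a hypothetical choice
    # made only for illustration.
    predicted = spearman_brown(kr20_value, 3)
    print(f"{label}: {kr20_value:.3f} -> {predicted:.3f}")
```

Under these assumptions the predicted coefficient rises with test length, which is consistent with the text's explanation that the short test length accounts for part of the drop in the index.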

     The histogram in Figure 5-3 shows the distribution of the test takers' scores. The statistics bore -0.340 for skewness and 0.250 for kurtosis, which allows the present author to determine that the curve is close enough to a normal curve. There was no minimum score of zero, and the maximum score was 25. From this, along with the mean score of 16.3 and the score distribution in Figure 5-3, it was presumed that Test Set B had worked well to illustrate the reading ability of the population who had worked on this test set.

Figure 5-3 Histogram of results for Test Set B

[Histogram: x-axis SCORES (0 to 30), y-axis FREQUENCY]

5.1.2.2 Item Validation

     Facility value (percentage correct) and discrimination index (point-biserial coefficient) for each item in Test Set B are provided in Table 5-5. When their point-biserial coefficients are examined, items 2 and 4 seem to bear problems. One explanation could be that, among the four options, the answer in each item was unclear and difficult to choose from the given options due to defective construction. However, overall, the figures seemed satisfactory for a test instrument to be employed in this study, and all the items were included in the analyses that follow.

Table 5-5 Percentage Correct and Point-biserial Coefficient of Items in Test Set B

ITEM#     PC     PBs
1         0.73   0.24
2         0.24   0.19
3         0.38   0.26
4         0.95   0.15
5         0.81   0.26
6         0.51   0.23
10        0.65   0.28
11        0.61   0.30
12        0.72   0.36
13        0.72   0.41
14        0.42   0.34
15        0.88   0.30
16        0.40   0.22
17        0.53   0.38
18        0.78   0.26
19        0.44   0.29
20        0.68   0.33
21        0.45   0.46
22        0.63   0.38
23        0.40   0.31
24        0.55   0.25
25        0.77   0.48
26        0.73   0.49
27        0.70   0.46
28        0.66   0.44
29        0.54   0.41
30        0.46   0.30
Average   0.61   0.33

5.2 Factor Analytic Studies

5.2.1 Group A-Low

     A full-information factor analysis was applied to all the items in Test Set A with the responses of Group A-Low. Here, a two-factor solution was adopted because of its interpretability. The correlation between the first and second factors is low, .341, which indicates that the orthogonal (VARIMAX) analysis is preferable. Table 5-6 illustrates the factor loadings for this group.

     For inspection of the loadings on each factor, the factor loadings of the items were rearranged in order from those that bear high loadings to those that bear low loadings on the first factor. The predetermined question type (see note #11) of each item is indicated in the table as "Q-TYPE" so that the relationship between factor loadings and question types might be sought. The numbers under "P-#" in the table show the passage to which each test item belongs.

Table 5-6 Factor Loadings for Test Set A by Group A-Low

Note 11: In order to make the reference to the terms simpler, "global-inferential" will be presented as "GI", "local-literal" as "LL" and "local-inferential" as "LI" in the tables and in the discussion from here on.

     The first thing noticed is that the text (context) characteristics did not affect the extraction of factors. Items from different passages had significant loadings on the same factors, and the loadings of the items that came from the same prompt varied greatly.

     In looking for particularity in the loadings on the first factor, one notices that the items that bear rather high loadings on the first factor are the items that have smaller item numbers. In other words, they are the items that appear early in the test set. In the same way, the items that bear negatively high loadings are the items that appear later in the test set. What this indicates is that the first factor in the factor loadings for Test Set A via the performance of Group A-Low could be determined as "where a test item is located in the test set" or "item position". This point will be further discussed in Chapter 6.

     As for the interpretation of the second factor in the present analysis, the possibility arises that a "literal" type of reading is an attribute. The items that load heavily on the second factor are items 2 and 7. The predetermined question type varies between the two, so further analyses of the two items were done.

     Item 7, which was originally categorized as a GI ("global-inferential", see note #11) item, is presented in the test as follows:

7. What is the main topic of this passage?
(A) The possibility of space colonies
(B) Space travel in the twenty-first century
(C) How to become an astronaut
(D) What people think about space exploration

     The correct option (D) could be chosen if the test taker could observe that the explanations about different percentages introduced in the passage are all about "what people think about space exploration", option (D), and that this is the theme of the passage. However, at the same time, it could be supposed that some test takers look at the first sentence in the passage, "During the last forty years, many studies have been done to learn people's opinions about space exploration," and match the phrase "people's opinions about space exploration" with what is said in option (D). If it could be presumed that this type of reading was done to reach the answer, this is a literal matching of limited (or local) information, and item 7 could be considered an LL ("local-literal", see note #11) item.

     Item 2 is also an LL item:

2. The speed at which the seafloor is spreading is
(A) about an inch in 200 million years.
(B) changes according to the year.
(C) half as fast as human fingernails grow.
(D) slower than the scientists can process.

     This is indeed an LL item since the correct option (C) would be chosen when the test taker notices that the last sentence in the passage, "This spreading occurs in half of a speed of how fast fingernails grow," perfectly matches the phrases in option (C). Thus, it is possible that the "local-literal" element explains the second factor.

     To further confirm this interpretation, another thing to be pointed out is that there are quite a few items that bear negatively high loadings on the second factor: items 14, 17, 18, and 19.

     Items 17 and 18 share the same passage, and they were originally categorized as an LL item and an LI ("local-inferential", see note #11) item, respectively.

17. According to this passage, what do scientists now believe about the ocean depths?
(A) There are many dark-shaded jellies.
(B) Sea color changes with the seasons.
(C) A kind of desert exists in some parts.
(D) Most of the living things there are jellies.

18. We can guess from the passage that
(A) scientists have found that deep-sea is like a watery desert.
(B) scientists don't know why deep-sea jellies have bright colors.
(C) many jellies in the underwater have their common clear color.
(D) many fish in the deep sea have very bright colors.

     For item 17, the correct answer is (D). The sentence, "Scientists now believe that jelly animals may be one of the most common types of animal life in the ocean depths," starting from the seventh line, was to be matched with the question, "what do the scientists now believe about the ocean depths?" and with what is written in option (D). However, it seems that, for Group A-Low test takers, making a link between the phrase "the most common types of animal life" from the passage and the phrase "Most of the living things" in option (D) was an "inferential" type of reading, rather than a "matching", which makes us identify item 17 as an LI item for this group.

     With regard to item 18, the correct option (B) would be chosen if the test taker could locate the last sentence, "The reason for these bright colors is a mystery," and infer that "a mystery" means that nobody knows why jellies in the deep sea have bright colors. In other words, this item was constructed with the intention of testing test takers' ability to make an inference after understanding a small amount of information, and, therefore, it was labeled as an LI item. If this was what was done by the test takers, it might be possible to explain that the second factor is indeed a "local-literal", or at least a "literal", element, on account of the fact that the items that load negatively high on the same factor are perceived to present an "inferential" feature, a feature that would be at the opposite end from "literal".

     This proposition is further confirmed when items 14 and 19 are consulted. Item 19 is presented as:

19. What is the main idea of this passage?
(A) Scientists work very hard to make new discoveries.
(B) Important things can be discovered accidentally.
(C) Making scientific discoveries is an easy thing to do.
(D) Sticky materials are useful in today's world.

     This item was predetermined to be a GI item because the whole passage was about how Art Fry had come up with the idea of stick-ons by luck, giving the message that "(B) Important things can be discovered accidentally." This answer could also be reached if the first sentence of the passage, "Some discoveries have come as a result of luck - an accident that causes a scientist to look differently at what has occurred," is located, and the phrases "as a result of luck" and "an accident" from the sentence are correctly linked with "accidentally" in option (B). Making this link might require a bit of inferring, so this item could be determined to be an "inferential" item, whether it is categorized as a GI or an LI item.

     Item 14 was predetermined to be an LL item:

14. Why was the Great Smoky Mountains National Park built?
(A) People in the East needed a place to take a walk for exercises.
(B) Many kinds of birds and trees were discovered in Smoky Mountains.
(C) Many parks in the West were becoming too crowded with cars.
(D) There were few national parks in the eastern part of the US.

     The correct answer is (D), which gives the same explanation as the first sentence in the passage, "In the early 1920s, the new United States National Park Service realized that most of its parks were in the West," in a slightly different expression. This item was first categorized as an LL item in the item-writing process because the present author thought that this "matching" was of a "literal" nature. However, given the population of Group A-Low, it might be considered that the nature of the matching here is something "inferential", which further suggests that the attribute of the second factor is whether the item elicits a "literal" or an "inferential" type of reading.


5.2.2 Group A-High

     A full-information factor analysis was applied to all the items in Test Set A with the responses of Group A-High. The present author adopted a two-factor solution for its interpretability. The correlation of .054 between the first factor and the second factor is rather low, so the orthogonal (VARIMAX) analysis seemed more appropriate. Table 5-7 illustrates the factor loadings for this group.

Table 5-7 Factor Loadings for Test Set A by Group A-High

     For inspection of the loadings on each factor, the factor loadings of the items were rearranged in order from those that bear high loadings to those that bear low loadings on the first factor. The question type (Q-TYPE) of each item is indicated in the table, along with the passage number (P-#), so that the relationship between the factor loadings and the question types and passage numbers might be sought.

     The first thing noticed is that the text (context) characteristics did not affect the extraction of factors. Items from different passages had significant loadings on the same factors, and the loadings of the items that came from the same prompt varied greatly.

     In seeking particularity in the loadings on the first factor, it seems that the items that bear rather high loadings on the first factor are the items that have smaller item numbers and those with low loadings have larger item numbers, as was the case with Group A-Low. However, at the same time, one can also observe that the items that bear rather high loadings on the first factor are the items that are labeled GI for question type. These are the questions that ask for the main idea of the given passage. To make sure that these items actually elicit a GI type of reading, items 7, 4, and 22 were revisited with some test takers after the test implementation, and it was confirmed that they do.

     Item 6, labeled as an LI item, also bears a rather high loading on the first factor. Item 6 is presented as:

6. We can guess from the passage that
(A) some trees in Muir Woods existed 1,200 years ago.
(B) the redwood trees have been discovered just recently.
(C) redwood trees are very popular in the US.
(D) cutting down of the redwood is not allowed in the US.

This question is given with the intention that, if the test taker could locate the sentence, "Some are about 1,200 years of age," on the fifth line of the passage (refer to Test Set A in Appendix A), option (A) would be chosen after inferring that if the trees are 1,200 years old, they should have existed in Muir Woods 1,200 years ago. This item was constructed with the intention of testing a test taker's ability to make an inference after understanding a small amount of information.

     However, in closely examining item 6, one thing to which attention is drawn is that, compared to other LI items in Test Set A, in order to discard incorrect options for this item, the test takers had to read and refer to a rather large amount of information. This could be considered a type of reading predetermined for a GI type of question, and, therefore, it could be said that item 6 had worked as a GI item in the present analysis, allowing us further to interpret the first factor to be the ability to read a rather large amount of information and make inferences from its comprehension.

     Item 5, another item which loads heavily on the first factor, shares the passage

with item 6 and asks:

5. What is one reason why redwood trees have existed so long?
(A) They form an unusual forest just outside San Francisco.
(B) They have special covers that protect themselves.
(C) They are very tall, so the fire can't reach the whole tree.
(D) They are officially protected by the State of California.

The correct answer (B) could be chosen if the test taker could locate the sentence, "They contain chemicals which protect them against fire, decay, and insects," that starts from the eighth line of the passage. This item was labeled as an LL (local-literal) type and was constructed to test a test taker's ability to understand a small amount of information with little or no inferring. However, when option (B) is closely reviewed, to choose it correctly, the test takers had to comprehend (and maybe infer) that the word "cover" in option (B) means the bark of the tree. Furthermore, the correct option could also be chosen when the sentence, "One reason is that they are not easily harmed by fire because they have very thick bark, and there is much water in their wood," starting from the sixth line of the passage, is located, and the same inference about the "bark" is made by the test taker, which would make this item "LI".

     At the same time, one notices that, although the correct answer could be reached by an LI type of reading as examined above, what is asked in item 5 is actually the central theme of the passage. On the sixth line, the passage presents a question, "How do these trees live so long?" and the rest of the text is focused on answering this question. Although this question was given in the middle of the passage, because the explanation of why redwood trees have lived so long seems to dominate the main discussion in the passage, it could be judged that item 5 is asking for the main idea to be comprehended. Therefore, it could be deduced that item 5 might as well be categorized as a GI item, which would allow us further to conclude that the first factor in the present analysis is the ability to present a GI type of reading comprehension.

     If the first factor could be explained by a GI nature of reading, items that hold negatively high loadings could be perceived as items that elicit non-GI, and perhaps LL, types of reading performance. These are items 17, 19, 28, and 30, in the order of how negatively high their loadings are.

     Item 28 was predetermined to be a GI item, as is clear from the question given.

28. What is the main idea of this passage?
(A) The popularity of national parks is creating problems.
(B) National parks are built as children's playground.
(C) Pollution is a problem in national parks.
(D) The cost of visiting a national park is increasing.

     The passage was about how national parks in the US have problems because too many people are visiting them. A similar proposition is expressed in option (A), which should be chosen if the test taker had correctly comprehended the passage. However, even if the whole passage was not read globally, the correct option could be chosen after reading the first sentence, "The U.S. National Parks Service is trying to solve a difficult problem," along with an earlier part of the second sentence, "Many national parks have become too popular." If this was the case, it might be more appropriate to consider this item as testing an LL type, or at least a local type, of reading.

     A similar case holds true for item 19:

19. What is the main idea of this passage?

(A) Scientists work very hard to make new discoveries.

(B) Important things can be discovered accidentally.

(C) Making scientific discoveries is an easy thing to do.

(D) Sticky materials are useful in today’s world.

     This item was termed to be a GI item in the factor analytic study done for

Group A-Low. However, with the population of Group A-High, because of their

higher ability and the ease with which they read English, the matching of “Some

discoveries have come as a result of luck-an accident that causes a scientist to look

differently at what has occurred,” from the first sentence and option (B) had an

LL nature rather than GI. In the same respect, item 17, which was categorized to be an LI

item with Group A-Low, could now be considered to be an LL item.

     Item 30, which shares the same passage with item 29 above, was constructed

with the intention to elicit a test taker’s LI reading performance.

30. Why is it necessary for some parks to limit the number of visitors?

(A) There aren’t enough parking spaces for all the visitors around the parks.

(B) Having too many visitors has bad influences on the living things in the parks.

(C) They don’t have enough money to hire people as the guides in the parks.

(D) There would be too much traffic on the roads inside and around the parks.

     In order to correctly choose (B) as an answer, the test taker was to locate the

second to last sentence, “The large number of visitors is also a threat to the plant and

animal life of the parks,” and infer that if something is “a threat to the plant and

animal life of the parks”, it “has bad influences on the living things.” When the

item was revisited with some of the test takers, this seemed to be the case.

     Nevertheless, what became clear as items 17, 19, 28, and 30 were revisited was

that they were certainly not GI items; in fact, they seem to have elicited a type of

reading that could be considered as an opposite of GI, and shared a common feature

of a “local” nature. Therefore, it might as well be concluded that the first factor in

the present analysis is indeed the “global”, if not global-inferential, element of

reading performance.

     As for the second factor in the factor loadings presented in Table 5-6, items 16,

23 and 30 bear high factor loadings. They are from different reading prompts, so the

text features cannot be a factor. Furthermore, all three items bear different

predetermined question types. Therefore, each item was reexamined to seek a

common feature that would help interpret this factor.

     Item 16 was predetermined to be a GI question:

16. What is the main idea of this passage?

(A) Scientists believe that the deep sea is like a desert in water.

(B) Scientists learned a lot about jellies in the sea from the sailors.

(C) Scientists discovered a lot about jellies in the ocean depths.

(D) Scientists were surprised to find so many jellies in the deep-sea.

This item is given with the intention to elicit a test taker’s global comprehension of

the passage. The test taker is to read the whole passage and infer that the main idea

presented by the author is (C). When this item was revisited with some test takers,

more or less, this was what was done to reach (C) as an answer, which confirms that

item 16 was indeed a GI item. They said that the second sentence, “But with new

ways to explore the ocean’s depths, we are finding that they are much richer in life

than we ever expected,” had worked as a clue to infer that the theme of this passage

was how scientists are “discovering a lot about jellies in the ocean depths”, and that

the rest of the passage was giving examples to support this theme.

     Looking at item 23, which was originally labeled as an LL item, the correct

answer (D) would be chosen if the test taker could locate the first sentence in the

passage, “About 85% of all animal life consists of insects,” and match the phrase

“85%” with the phrase “the most part” and “consists of” with “forms” in option (D).

23. According to the passage,

(A) some insects eat other insects for food.

(B) some insects make food from oil pool.

(C) insects are usually found near the water.

(D) insects form the most part of animal life.

However, in the analysis subsequent to the data collection, it was perceived that what

had been presumed to be a “literal matching” (an LL type of reading) was actually an

“inferring.” In other words, interpreting “the most part” in option (D) to mean

“85%” and “forms” to mean “consists of” in the original sentence could actually be

considered as an “inferring” rather than a “literal matching”. If this is true, item 23

should be called an LI item, and now, the common feature that items 16 and 23 share

is an “inferential” element. Here, the possibility that an attribute that explains the

second factor is an “inferential element” arises.

     This proposition is further confirmed when item 30 is examined. Item 30, in

the analysis that was done for the first factor, was determined to be an LI item, a

question type that holds an “inferential” element. Therefore, this leads us to affirm

that an “inferential” element is the attribute that explains the feature of the second

factor.

     Conversely, if the second factor could be explained by an “inferential” nature of

reading, items that hold negatively high loadings could be perceived as items that

elicit a “non-inferential”, or “literal”, type of reading performance. These are items

18, 26, and 27, in the order of how strongly negative the loading on each item is.

     Item 18 was termed to be an LI item in the factor analytic study done for Group

A-Low.

18. We can guess from the passage that

(A) scientists have found that deep-sea is like a watery desert.

(B) scientists don’t know why deep-sea jellies have bright colors.

(C) many jellies in the underwater have their common clear color.

(D) many fish in the deep sea have very bright colors.

     However, as was the case with items 17 and 19 in examining the nature of the

first factor in the present analysis, with the population of Group A-High, because of

their higher ability and the ease with which they read English, what was determined

as the “inferential” matching of the last sentence, “The reason for these bright colors

is a mystery,” with the correct option (B) for the test takers of Group A-Low had

an LL nature rather than LI for those in Group A-High.

     Items 26 and 27 share the same passage and are presented as:

26. Mendez could succeed because his parents

(A) helped him travel around the world.

(B) brought him up very strictly.

(C) put in much money and time.

(D) taught him many kinds of sports.

27. Robert Mendez is

(A) a father of two children

(B) a fisherman from California

(C) a tennis player

(D) a TV star

     Item 26 was predetermined to be an LL item because the correct option (C)

could be reached if the test taker could locate the sixth and seventh sentences in the

passage, “Robert traces his success to his parents’ sacrifices. They invested every

spare penny and every spare moment in their sons’ future,” and follow that, in essence,

what is said by these sentences is what is said in option (C), which makes this an LL

item.

     In the same way, item 27 is an LL question because the reading performance

elicited by this item is the “literal” matching of “a world-class player” and “a tennis

racket” from the first two sentences of the passage with option (C). Although item

27 was predetermined to be an LI question because the matching above was thought

to hold an “inferential” nature, in reality, the case above seems to hold true, which

allows the present author to conclude that the second factor in the present analysis is

well explained by the “inferential/literal” nature that an item exhibits.

5.2.3 Group B

     All the responses of Group B working on Test Set B were analyzed using a

full-information factor analysis. The present author decided to employ a

two-factor solution after consulting the results since it seemed the most appropriate.

The correlation between the first and second factors was not too high (.549), which

indicates that the orthogonal (VARIMAX) analysis is preferable. Table 5-8

illustrates the factor loadings for this group.

     For the purpose of inspecting the loadings on each factor, the factor loadings of

each item were rearranged in order from those that bear high loadings to low on

the first factor. To help seek the relationship between the factor loadings and

question types along with passage numbers, the predetermined question type

(Q-TYPE) of each test item and the number of the passage for which each item was

answered (P-#) are indicated in the table.

     It could be said that the text (context) characteristics did not affect the

extraction of factors. Items from different passages had significant loadings on the

same factors, and there were sufficient variations in the loadings for the three items

that were constructed for the same prompt.

     In seeking particularity in the loadings on the first factor, one notices that the

items with larger item numbers bear rather high loadings on the first factor. In other

words, the items that load heavily on the first factor are the items that appear later in

the test set. In the same way, the items that appear early in the test set bear

negatively high loadings. What this indicates is that the first factor in the factor

loadings for Test Set B via the performance of Group B could be determined as

“where a test item is located in the test set”, or “item position”, as was the case with

Group A-Low. This will further be discussed in Chapter 6.

Table 5-8 Factor Loadings for Test Set B by Group B

     As for the interpretation of the second factor in the present analysis, the

possibility of an “inferential” type of reading being an attribute arises. The items

that load heavily on the second factor are items 14, 15, and 21. The predetermined

question types are LL for item 14, LI for items 15 and 21.

     Items 14 and 15 share the same passage and are presented in the test set as

follows:

14. Ptarmigan keep warm in the winter by

(A) huddling together on the ground with other birds

(B) building nests in trees

(C) burrowing into dense patches of vegetation

(D) digging tunnels into the snow

15. The author mentions kinglets in line 17 as an example of birds that

(A) protect themselves by nesting in holes

(B) nest with other species of birds

(C) nest together for warmth

(D) usually feed and nest in pairs

     In order to correctly answer item 15, which shows the highest loading on the

second factor, the test takers are to locate the last and the second to the last sentences,

“Body contact reduces the surface area exposed to the cold air, so the birds keep each

other warm. Two kinglets huddling together were found to reduce their heat losses by

a quarter, and three together saved a third of their heat.” They are to integrate the

information given in these sentences to deduce that (C) is the correct answer, and this

leads us to confirm that item 15 is indeed an LI item.

     Item 14 was an item that was constructed with the intention to elicit a test

taker’s LL type of reading performance. The fifth sentence, “Solitary roosters

shelter in dense vegetation or enter a cavity-horned larks dig holes in the ground

and ptarmigan burrow into snow banks,” is the key in choosing the correct option (D),

and it was supposed that the test takers in this group would try to match “burrow into

snow banks” from the original sentence with “digging tunnels into the snow” in

option (D). However, at the same time, it could be presumed that this matching had

required a bit of inferring since the words used in the targeted phrases are slightly

different, which makes this an LI item.

     Item 21 can also be confirmed as an LI item:

21. Why does the author mention Joseph Pulitzer and William Randolph

Hearst?

(A) They established New York’s first newspaper.

(B) They published comic strips about the newspaper war.

(C) Their comic strips are still published today.

(D) They owned major competitive newspapers.

     The information from the two sentences from the passage, “The first full-color

comic strip appeared in January 1894 in the New York World, owned by Joseph

Pulitzer,” and “The first regular weekly full-color comic supplement, similar to

today’s Sunday funnies, appeared two years later, in William Randolph Hearst’s rival

New York paper, the Morning Journal,” as well as the phrase, “between giants of the

American press” from the first sentence and “Both were immensely popular,” from

the first sentence of the second paragraph are integrated to infer that these two people

“owned major competitive newspapers” (option (D)).

     The fact that items 14, 15, and 21 are all considered to be LI items allows us to

claim that the second factor in this analysis can be explained by the “local-inferential”

element of reading performances.

5.3 Item Analyses

5.3.1 Selecting items to be analyzed in this part of study

     It is clear from the results of factor analytic studies in section 5.2 that some of

the question types that were predetermined for each item did not function in the way

they were expected to. However, at the same time, through the qualitative analyses of

each item that were done to specify the nature of each factor in sections 5.2.1, 5.2.2,

and 5.2.3, new question types were assigned to the items which revealed a great

particularity to each factor. In investigating relationships between question types

and item difficulty, it seems necessary to proceed with this part of analysis with the

items for which the question types became explicit and coherent in the factor analytic

studies above.12 For this reason, the items which are incorporated in this part of

analysis are listed in Table 5-9.
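
The selection rule stated in footnote 12 (factor loadings of .400 and above, or -.400 and below) can be sketched as follows. This is only a minimal illustration: the function name `select_items` and the loading values are hypothetical and are not the loadings reported in the factor loading tables.

```python
# Sketch of the item-selection rule: keep items whose loading on a factor
# is at or above +.400 or at or below -.400.
# NOTE: the loadings below are made-up values for illustration only.

THRESHOLD = 0.400


def select_items(loadings, threshold=THRESHOLD):
    """Split items into high-positive and high-negative loaders on one factor."""
    positive = sorted(item for item, value in loadings.items() if value >= threshold)
    negative = sorted(item for item, value in loadings.items() if value <= -threshold)
    return positive, negative


# Hypothetical loadings on a single factor, keyed by item number.
example_loadings = {16: 0.62, 23: 0.45, 30: 0.41, 18: -0.55, 26: -0.48, 20: 0.12}

positive_items, negative_items = select_items(example_loadings)
print("high positive:", positive_items)
print("high negative:", negative_items)
```

Items that fall between the two cut-offs (item 20 in the made-up data) are simply left out of the later comparison, which mirrors the way only explicitly loading items are carried forward.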

     In Table 5-9, Group B is excluded because items from Test Set B cannot be

incorporated in this part of the analysis owing to the fact that, for the second factor,

only a few items showed high loadings and that no item showed a strong negative

loading.

Table 5-9 Items adopted for item analyses by their question types and ability groups

Question Type          Group A-Low        Group A-High
GI   (inferential)     14, 17, 18, 19     16, 23, 30
     (global)                             4, 5, 6, 7, 22
LI   (local)                              17, 19, 28, 30
LL   (literal)         2, 7               18, 25, 26, 27

     Furthermore, in the factor analytic studies, the items in both Group A-Low and

Group A-High exhibited only partial aspects of question types that were defined

12 The items with factor loadings of .400 and above or -.400 and below were selected as items

that had explicit features of question types and were employed for further analyses with each ability

group.

earlier in the present thesis. Therefore, the present author could only specify the

question types according to the literal/inferential or local/global dimensions, rather than by

their “question types” (e.g. local-inferential). This is why, for Group A-High, item

30 appears twice in Table 5-9: once as an inferential item and again as a local item.

5.3.2 Group A-Low

     For each test item in Test Set A, item difficulty was calibrated via Rasch

Analysis based on the test performances of the test takers in Group A-Low.

RASCAL converged after 3 Loops. The final parameter estimates are presented in

Table 5-10. Raw score conversion table, item by person distribution map, test

characteristic curve, and test information curve are in Appendix C. The present

author found nothing problematic with the test characteristic curve and test

information curve, and item by person distribution map indicated that the difficulty of

items in Test Set A was generally equal to the ability estimates of the test takers in

Group A-Low.

     The value for item difficulty can vary between -3.00 and 3.00, with -3.00 being

the easiest and 3.00 the most difficult. The numbers in the “Rank” column indicate the

difficulty ranking of each of the 27 items included in Test Set A.
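
For reference, under the Rasch model the probability of a correct response depends only on the difference between the test taker's ability and the item difficulty, both expressed on the same logit scale. The sketch below merely evaluates that model for two difficulty estimates taken from Table 5-10; it is not the RASCAL estimation procedure, and the names `rasch_probability` and `ability` are assumptions introduced for illustration.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct answer under the Rasch (one-parameter logistic) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Difficulty estimates for Group A-Low, taken from Table 5-10.
hardest = 1.265   # item 3 (rank 1)
easiest = -2.045  # item 4 (rank 27)

ability = 0.0  # hypothetical test taker at the centre of the scale (in logits)
print(round(rasch_probability(ability, hardest), 3))
print(round(rasch_probability(ability, easiest), 3))
```

A test taker whose ability equals an item's difficulty has a probability of exactly 0.5 of answering it correctly, which is why a higher difficulty value means a harder item.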

     In investigating the relationship between item difficulty and question type, the

mean scores of item difficulty for items selected in Table 5-9 with reference to their

question types were calculated and are presented in Table 5-11. The items employed

in this part of analysis were limited to the items from Table 5-9 because they were the

items that loaded heavily on each factor in the factor analytic study and bore explicit

features of each question type.
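
The group means reported in Table 5-11 can be reproduced from the difficulty estimates in Table 5-10. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
from statistics import mean

# Difficulty estimates for Group A-Low, taken from Table 5-10 (item number -> difficulty).
literal = {2: 0.763, 7: 0.405}
inferential = {14: 0.035, 17: 0.253, 18: 0.100, 19: -0.549}

literal_mean = mean(literal.values())
inferential_mean = mean(inferential.values())

print(f"literal mean:     {literal_mean:.3f}")
print(f"inferential mean: {inferential_mean:.3f}")
```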

     For a precise examination of the difference in the means of difficulties in these

two groups of items, a t-test was carried out (p < 0.1 [p = 0.090]). From this result, it

can be seen that, for the population of Group A-Low, “literal” items pose more

difficulty than “inferential” items. No analysis of relationship between question

type and item difficulty could be done for “local/global” items since factors from

factor analytic studies did not indicate this feature.

Table 5-10 Final Parameter Estimates of test items for Group A-Low

Item #  Difficulty  Rank  Std. Error  Chi sq.  df  Sc. Diff
   1      -0.076     18     0.152     24.899    6      99
   2       0.763      4     0.155      5.555    6     107
   3       1.265      1     0.168     10.742    6     112
   4      -2.045     27     0.255     22.284    6      81
   5      -1.270     25     0.193      7.301    6      88
   6       0.740      5     0.154      4.057    6     107
   7       0.405      9     0.151     25.349    6     104
   8       0.188     12     0.150      9.565    6     102
   9       1.238      2     0.167      1.796    6     111
  13       0.122     13     0.151      5.810    6     101
  14       0.035     15     0.151      5.123    6     100
  15      -1.650     26     0.219      4.003    6      85
  16       0.604      6     0.152     12.121    6     105
  17       0.253     11     0.150      6.744    6     102
  18       0.100     14     0.151      8.806    6     101
  19      -0.549     22     0.162      5.841    6      95
  20      -0.032     17     0.152      9.894    6     100
  21       1.238      3     0.167      8.678    6     111
  22      -0.818     23     0.171      2.087    6      93
  23      -0.076     19     0.152      2.394    6      99
  24      -0.010     16     0.152      8.168    6     100
  25      -0.424     21     0.159      7.846    6      96
  26      -0.189     20     0.154      8.111    6      98
  27      -1.095     24     0.184      4.450    6      90
  28       0.318     10     0.150     12.019    6     103
  29       0.449      8     0.151      4.310    6     104
  30       0.515      7     0.151      4.856    6     105

Table 5-11 Means of item difficulty for each question type in Group A-Low

Literal
Item #  Difficulty
   2       0.763
   7       0.405
Mean       0.584

Inferential
Item #  Difficulty
  14       0.035
  17       0.253
  18       0.100
  19      -0.549
Mean      -0.040

5.3.3 Group A-High

     Based on the test performances of the test takers in Group A-High, the item difficulty of

each test item in Test Set A was calibrated via Rasch Analysis. RASCAL

converged after 3 Loops. The final parameter estimates are presented in Table 5-12,

and raw score conversion table, item by person distribution map, test characteristic

curve, and test information curve are in Appendix D. Nothing problematic was

found with the test characteristic curve and the test information curve, and the item by

person distribution map indicated that the difficulty of items in Test Set A was

generally lower than the ability estimates of the test takers in Group A-High. The

numbers in the “Rank” column indicate the difficulty ranking of each item out of the 27

items included in Test Set A.

Table 5-12 Final Parameter Estimates of test items for Group A-High

Item #  Difficulty  Rank  Std. Error  Chi sq.  df  Sc. Diff
   1       0.982      7     0.180      8.249    5     109
   2       2.575      2     0.157      4.634    5     123
   3       1.995      5     0.155      4.524    5     118
   4      -1.866     25     0.563      1.565    5      83
   5      -0.312     14     0.279      3.488    5      97
   6       1.740      6     0.157      5.201    5     116
   7       2.456      3     0.155      5.399    5     122
   8       0.823      9     0.187      4.118    5     107
   9       2.041      4     0.154      1.467    5     119
  13      -0.105     13     0.256      4.752    5      99
  14      -0.770     17     0.339      2.884    5      93
  15      -1.583     23     0.492     87.938    5      86
  16       2.647      1     0.158      5.898    5     124
  17      -0.662     16     0.323      5.814    5      94
  18      -0.390     15     0.288      8.646    5      96
  19      -1.583     24     0.492      4.278    5      86
  20       0.126     11     0.235     14.538    5     101
  21       0.951      8     0.181      4.244    5     109
  22      -2.263     27     0.682      0.783    5      79
  23       0.016     12     0.245      4.522    5     100
  24      -0.770     18     0.339      6.776    5      93
  25      -0.890     19     0.357      2.161    5      92
  26      -1.362     22     0.444      3.335    5      88
  27      -1.866     26     0.563      4.698    5      83
  28      -1.180     21     0.408     12.565    5      89
  29      -1.025     20     0.380      5.544    5      91
  30       0.276     10     0.222      6.360    5     103

     In Table 5-13, the means of item difficulty for test items selected in Table 5-9

according to their question types are indicated so that the relationship between item

difficulty and question type of the items could be investigated. Only the items from

Table 5-9 were employed in this part of analysis because they were the items that

loaded heavily on each factor in the factor analytic study and bore explicit features of

each question type.

Table 5-13 Means of item difficulty for each question type in Group A-High

Literal
Item #  Difficulty
  18      -0.390
  25      -0.890
  26      -1.362
  27      -1.866
Mean      -1.127

Inferential
Item #  Difficulty
  16       2.647
  23       0.016
  30       0.276
Mean       0.980

Local
Item #  Difficulty
  17      -0.662
  19      -1.583
  28      -1.180
  30       0.276
Mean      -0.787

Global
Item #  Difficulty
   4      -1.866
   5      -0.312
   6       1.740
   7       2.456
  22      -2.263
Mean      -0.049

     A t-test was carried out for the precise examination of the difference in the

mean difficulties of these two pairs of item groups. The difference was significant

between “literal” and “inferential” items (p < 0.05 [p = 0.045]) but not between

“local” and “global” items (p > 0.1 [p = 0.533]). From this result, it can be seen that,

for the population of Group A-High, “inferential” items pose more difficulty than

“literal” items, but no difference could be found with regard to the local/global nature

of reading performance.
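
The comparisons above can be illustrated with a short sketch using the difficulty values in Table 5-13. The thesis does not state which variant of the t-test was used; the code assumes an independent-samples, pooled-variance (Student's) t-test with a two-tailed p-value, and under that assumption the results come out close to the reported p-values of .045 and .533. The function names and the numerical integration of the t density are choices made for this illustration, not the procedure of the original analysis.

```python
import math
from statistics import mean


def pooled_t(sample_a, sample_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = mean(sample_a), mean(sample_b)
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value: 1 minus the t density integrated over [-|t|, |t|].

    The integral is approximated with Simpson's rule (steps must be even).
    """
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += pdf(i * h) * (4 if i % 2 else 2)
    return 1 - 2 * (total * h / 3)


# Difficulty estimates for Group A-High, taken from Table 5-13.
literal = [-0.390, -0.890, -1.362, -1.866]
inferential = [2.647, 0.016, 0.276]
local = [-0.662, -1.583, -1.180, 0.276]
global_items = [-1.866, -0.312, 1.740, 2.456, -2.263]

t_li, df_li = pooled_t(literal, inferential)
p_li = two_tailed_p(t_li, df_li)
t_lg, df_lg = pooled_t(local, global_items)
p_lg = two_tailed_p(t_lg, df_lg)

print(f"literal vs inferential: t = {t_li:.3f}, df = {df_li}, p = {p_li:.3f}")
print(f"local vs global:        t = {t_lg:.3f}, df = {df_lg}, p = {p_lg:.3f}")
```

With only three to five items per group, the degrees of freedom are very small, so the outcome is sensitive to each individual item difficulty.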

5.3.4 Group B

     Based on the test performances of the test takers in Group B, item difficulty

was calibrated via Rasch Analysis for each test item in Test Set B. RASCAL

converged after 3 Loops. The final parameter estimates are presented in Table 5-14.

Appendix E includes the raw score conversion table, the item by person distribution

map, the test characteristic curve, and the test information curve. No problems were

found with the test characteristic curve and the test information curve, and the item by

person distribution map indicated that the difficulty of items in Test Set B was

generally equal to the ability estimates of the test takers in Group B. The numbers

in the “Rank” column of Table 5-14 indicate the difficulty ranking of each item out of the 27

items included in Test Set B.

Table 5-14 Final Parameter Estimates of test items for Group B

     As was explained in Section 5.3.1, items from Test Set B cannot be

incorporated in the analysis of the relationship between question types and item

difficulty since the “item position” factor (or “where a test item is located in the test

set”), which accounted for the first factor in the factor analytic study of Group B

performances on Test Set B, was very strong, and only a few items showed high

loadings on the second factor.
