Ed8 Assessment of Learning 2

ADVANCED METHODS in EDUCATIONAL ASSESSMENT and EVALUATION

ASSESSMENT OF LEARNING 2

EDDIE T. ABUG, BSE-TLE 4A

DR. REBECCA AMAGSILA, Ph.D.

CHAPTER 1

REVIEW OF PRINCIPLES OF

HIGH QUALITY

ASSESSMENT


CHAPTER OVERVIEW

A. CLARITY OF LEARNING TARGETS
1. Cognitive Targets
2. Skills, Competencies and Abilities Targets
3. Products, Outputs and Projects Targets

B. APPROPRIATENESS OF ASSESSMENT METHODS
1. Written-Response Instruments
2. Product Rating Scales
3. Performance Tests
4. Oral Questioning
5. Observation and Self Reports

C. PROPERTIES OF ASSESSMENT METHODS
1. Validity
2. Reliability
3. Fairness
4. Practicality and Efficiency
5. Ethics in Assessment

A. CLARITY OF LEARNING TARGETS

Assessment can be made precise, accurate and dependable only if what is to be achieved is clearly stated and feasible.

We consider learning targets involving knowledge, reasoning skills, products and effects.

Learning targets need to be stated in behavioral terms, that is, terms that denote something which can be observed through the behavior of the student.

1. Cognitive Targets
2. Skills, Competencies and Abilities Targets
3. Products, Outputs and Projects Targets

1. COGNITIVE TARGETS

As early as the 1950s, Bloom (1954) proposed a hierarchy of educational objectives at the cognitive level. These are:

KNOWLEDGE

Refers to the acquisition of facts, concepts and theories.

Ex: Knowledge of historical facts, like the date of the EDSA revolution.

Ex: Knowledge about the “discovery” of the Philippines: Magellan, March 16, 1521.

Knowledge forms the foundation of all other cognitive objectives, for without knowledge it is not possible to move up to the next higher level of thinking skills in the hierarchy of educational objectives.

COMPREHENSION

Refers to the same concept as “understanding”.

It is a step higher than mere acquisition of facts and involves a cognition or awareness of the interrelationships of facts and concepts.

Ex: The Spaniards ceded the Philippines to the Americans in 1898 (knowledge of facts).

In effect, the Philippines declared independence from Spanish rule only to be ruled by yet another foreign power, the Americans (comprehension).

APPLICATION

Refers to the transfer of knowledge from one field of study to another, or from one concept to another within the same discipline.

Ex: Pavlov's classic experiment on dogs showed that animals can be conditioned to respond in a certain way to certain stimuli.

The same principle can be applied in the context of teaching and learning, in behavior modification for school children.

ANALYSIS

Refers to the breaking down of a concept or idea into its components and explaining the concept as a composition of these components.

Ex: Poverty in the Philippines, particularly at the barangay level, can be traced back to the low income levels of families in such barangays and the propensity for large households with an average of about 5 children per family. (Note: Poverty is analyzed in the context of income and number of children.)

SYNTHESIS

Refers to the opposite of analysis and entails putting together the components in order to summarize the concept.

Ex: The field of geometry is replete with examples of synthetic lessons. From the relationship of the parts of a triangle, for instance, one can deduce that the sum of the angles of a triangle is 180°.

EVALUATION AND REASONING

Refers to valuing and judgment, or putting the “worth” of a concept or principle.

Students make judgments about the value of ideas, items, materials, and more. Students are expected to bring in all they have learned to make informed and sound evaluations of material.

Key words for the evaluation category: evaluate, appraise, conclude, criticize, critique

Ex: Watch a stage play and write a critique of the actors’ performance.

Evaluate: Are the actors professionals, amateurs, or students?

Criticize: Are the actors capable of dealing with the script's requirements?

(Be fair to the actors in your assessment of their talents and the level of their "craftsmanship.")

2. SKILLS, COMPETENCIES AND ABILITIES TARGETS

Skills refer to specific activities or tasks that a student can proficiently do, e.g. skills in coloring, language skills.

Skills can be clustered together to form specific competencies, e.g. birthday card making.

Related competencies characterize a student’s ability (DACUM, 2000).

Abilities can be roughly categorized into cognitive, psychomotor and affective abilities.

The ability to work well with others and to be trusted by every classmate (affective ability) is an indication that the student can most likely succeed in work that requires leadership abilities.

Other students are better at doing things alone, like programming and web designing (cognitive ability), and therefore they would be good at highly technical individualized work.

3. PRODUCTS, OUTPUTS AND PROJECTS TARGETS

Products, outputs and projects are tangible and concrete evidence of a student’s ability.

A clear target for products and projects needs to specify the level of workmanship of such projects, e.g. expert level, skilled level or novice level.

Once the learning targets are clearly set, it is now necessary to determine an appropriate assessment procedure or method.

B. APPROPRIATENESS OF ASSESSMENT METHODS

1. Written-Response Instruments
2. Product Rating Scales
3. Performance Tests
4. Oral Questioning
5. Observation and Self Reports

1. WRITTEN-RESPONSE INSTRUMENTS

Written-response instruments include objective tests (multiple choice, true-false, matching or short answer), essays, examinations and checklists.

OBJECTIVE TESTS
a. Multiple Choice
b. True-False
c. Matching or Short Answer

Objective tests are appropriate for assessing the various levels of the hierarchy of educational objectives. They require a user to choose or provide a response to a question whose correct answer is predetermined. Such a question might require a student to:
a. select a solution from a set of choices (multiple choice, true-false, matching)
b. identify an object or position (graphical)
c. supply brief numeric or text responses

What is higher-level thinking?

Benjamin Bloom described six levels of cognitive behavior, listed here from the most complex (Evaluation) at the top to the most basic (Knowledge) at the bottom:

Evaluation
Synthesis
Analysis
Application
Comprehension
Knowledge

1. MULTIPLE CHOICE TEST

Multiple choice tests in particular can be constructed in such a way as to test higher order thinking skills.

Example:

Tim’s second grade teacher is concerned because of the following observations about Tim’s behavior in class:

Withdraws from peers on the playground and during group work
Often confuses syllables in words (ex: says mazagine instead of magazine)
Often confuses b and d, p and q, etc. when writing or recognizing letters

The teacher has arranged a meeting with Tim’s mother to discuss these concerns. Which of the following statements is best for the teacher to say to Tim’s mother?

a. Tim needs extra practice reading and writing problematic letters and words at home at least 30 minutes per day.

b. Please discuss the importance of schoolwork to Tim so that he will increase his efforts in classwork.

c. These are possible symptoms of dyslexia so I would like to refer him to a specialist for diagnosis.

d. Please adjust Tim’s diet because he is most likely showing symptoms of ADHD due to food allergies.

Explanation: C is the best answer because the behaviors could be symptoms of dyslexia.

Students must evaluate multiple pieces of evidence and then apply that evidence to solve a problem: the student must select the best action to take with the evidence.

2. ESSAYS

Essays, when properly planned, can test the student’s grasp of the higher level cognitive skills, particularly in the areas of application, analysis, synthesis, and judgment.

The questions must be precise and the parameters properly defined.

Ex: Instead of only “Write an essay about the first EDSA revolution”, give additional requirements to provide focus: “Focus on the main characters and their respective roles in the revolution.”

2. PRODUCT RATING SCALES

A teacher is often tasked to rate products such as:
1. Book reports
2. Maps
3. Charts
4. Diagrams
5. Notebooks
6. Essays
7. Creative endeavors

Example of a Product Rating Scale: the classic “Handwriting” Scale used in the California Achievement Test (CAT), Form W (1957)

Purpose: The CAT is often administered to determine a child's readiness for promotion to a more advanced grade level and may also be used by schools to satisfy state or local testing requirements.

The test report includes a scale score, which is the basic measurement of how a child performs on the assessment.

Scale score: determined by the total number of test items correct or through item-pattern scoring.

3. PERFORMANCE TESTS

One of the most frequently used measurement instruments is the checklist.

A performance checklist consists of a list of behaviors that make up a certain type of performance (e.g. using a microscope, typing a letter, solving a mathematics problem and so on).

It is used to determine whether or not an individual behaves in a certain (usually desired) way when asked to complete a particular task.

If a particular behavior is present when an individual is observed, the teacher places a check opposite it on the list.

4. ORAL QUESTIONING

The ancient Greeks used oral questioning extensively as an assessment method. Socrates himself, considered the epitome (perfect example of a particular quality) of a teacher, was said to have handled his classes solely based on questioning and oral interactions.

Oral questioning is an appropriate assessment method when the objectives are: a.) to assess the student’s stock knowledge and/or b.) to determine the student’s ability to communicate ideas in coherent (logical and consistent) verbal sentences.

Of particular significance are the student’s state of mind and feelings: anxiety and nervousness in making oral presentations could mask the student’s true ability.

5. OBSERVATION AND SELF REPORTS

Observation and self reports are useful supplementary (additional) assessment methods when used in conjunction with oral questioning and performance tests.

A tally sheet is a device often used by teachers to record the frequency of student behaviors, activities or remarks.

A self-checklist is a list of several characteristics or activities presented to the subjects of a study.

C. PROPERTIES OF ASSESSMENT METHODS

1. Validity
2. Reliability
3. Fairness
4. Practicality and Efficiency
5. Ethics in Assessment

The quality of the assessment instrument and method used in education is very important since the evaluation and judgment that the teacher gives on a student are based on the information he obtains using these instruments.

1. VALIDITY

Validity is defined as the instrument’s ability to measure what it purports (intends) to measure.

It is also defined as referring to the appropriateness, correctness, meaningfulness and usefulness of the specific conclusions that a teacher reaches regarding the teaching-learning situation.

Content validity refers to the content and format of the instrument:
How appropriate is the content? How comprehensive is it?
How adequately does the sample of items or questions represent the content to be assessed?
Is the format appropriate?
Does the instrument logically get at the intended variable or factor?

Content and Format

The content and format must be consistent with the definition of the variable or factor to be measured. Questions to ask about each item:

1. Do students have adequate experience with the type of task posed by the item?

2. Did the teachers cover sufficient material for most students to be able to answer the item correctly?

3. Does the item reflect the degree of emphasis received during instruction?

Two (2) Forms of Content Validity

FORM A: ITEM VALIDITY

CRITERIA                                                    | ITEM
                                                            | 1 | 2 | 3 | 4 | 5 | 6
1. Material covered sufficiently.                           |   |   |   |   |   |
2. Most students are able to answer item correctly.         |   |   |   |   |   |
3. Students have prior experience with the type of task.    |   |   |   |   |   |
4. Decision: Accept or Reject                               |   |   |   |   |   |

FORM B: ENTIRE TEST

KNOWLEDGE/SKILLS AREA | ESTIMATED PERCENT OF INSTRUCTION | PERCENT OF ITEMS COVERED IN TEST
1. Knowledge          |                                  |
2. Comprehension      |                                  |
3. Application        |                                  |
4. Analysis           |                                  |
5. Synthesis          |                                  |
6. Evaluation         |                                  |



Based on Form B, adjustments in the number of items that relate to a topic can be made accordingly.
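The Form B comparison can be sketched in a few lines of Python. This is only an illustration: the function names and all percentages and item counts below are hypothetical, not taken from the lecture. It converts the number of items per level into a percent of the test and subtracts the estimated percent of instruction, so a positive gap suggests a level is over-represented in the test and a negative gap suggests it is under-represented.

```python
# Hypothetical Form B data for illustration only (not from the lecture).
LEVELS = ["Knowledge", "Comprehension", "Application",
          "Analysis", "Synthesis", "Evaluation"]


def percent_of_items(item_counts):
    """Convert item counts per level into percent of the whole test."""
    total = sum(item_counts.values())
    return {level: 100.0 * count / total for level, count in item_counts.items()}


def form_b_gaps(instruction_percent, item_counts):
    """Percent of items minus percent of instruction, per level.

    Positive gap: the level has more test items than its share of instruction.
    Negative gap: the level has fewer test items than its share of instruction.
    """
    items_pct = percent_of_items(item_counts)
    return {level: items_pct[level] - instruction_percent[level]
            for level in instruction_percent}


# Assumed teacher estimates of instruction time (percent) and a 50-item test.
instruction_percent = {"Knowledge": 30, "Comprehension": 20, "Application": 20,
                       "Analysis": 10, "Synthesis": 10, "Evaluation": 10}
item_counts = {"Knowledge": 20, "Comprehension": 10, "Application": 5,
               "Analysis": 5, "Synthesis": 5, "Evaluation": 5}

gaps = form_b_gaps(instruction_percent, item_counts)
for level in LEVELS:
    print(f"{level:15s} gap = {gaps[level]:+.1f} percentage points")
```

With these assumed numbers, Knowledge is over-represented by 10 percentage points and Application under-represented by 10, so a teacher might move some Knowledge items to Application.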



While content validity is important, there are two (2) other types of validity to consider:

1. Face Validity: the outward appearance of the test; the lowest form of test validity.

2. Criterion-Related Validity: the test item is judged against a specific criterion, e.g. by correlating the test with a known valid test.

1. Face Validity

A test can be said to have face validity if it "looks like" it is going to measure what it is supposed to measure. For instance, if you prepare a test to measure whether students can perform multiplication, and the people you show it to all agree that it looks like a good test of multiplication ability, you have shown the face validity of your test.

2. Criterion-Related Validity (the more important type)

The test item is judged against a specific criterion. It can also be measured by correlating the test with a known valid test (as a criterion).

A test needs to possess construct validity.

A “construct” is another term for a factor, and we already know that a group of variables that correlate highly with each other form a factor.

Example: Let us say we are conducting a study on success in college. If we find out there is a high correlation between student grades in high-school math classes and their success in college (which can be measured by many possible variables), we would say there is high criterion-related validity between the intermediate variable (grades in high-school math classes) and the ultimate variable (success in college). Essentially, the grades students received in high-school math can be used to predict their success in college.

http://en.wikipedia.org/wiki/Correlation

http://en.wikipedia.org/wiki/Validity_(statistics)

2. RELIABILITY

The reliability of an assessment method refers to its consistency. It is also a term that is synonymous with dependability or stability.

Stability or internal consistency as reliability measures can be estimated in several ways:

a. The Split-half Method (using the Spearman–Brown prophecy formula)

b. The Kuder-Richardson formula

a. The Split-half Method

Involves scoring two halves of a test separately for each person and then calculating a correlation coefficient for the two sets of scores.

The coefficient indicates the degree to which the two halves of the test provide the same results and hence describes the internal consistency of the test.

Splitting a test to estimate reliability

Example: A 10-item test can be split into two (2) subtests in two ways.

A. First half (items 1-5) vs. second half (items 6-10). Responses in the first half may differ from those in the second half because of the increase in item difficulty and fatigue.

B. Odd items vs. even items. This guarantees that each half will contain an equal number of items from the beginning, middle, and end of the original test.

The reliability of the test is calculated using the Spearman–Brown prediction formula, also known as the Spearman–Brown prophecy formula.

The method was published independently by Spearman (1910) and Brown (1910).

$$\text{Reliability of test} = \frac{2 \times r_{\text{half}}}{1 + r_{\text{half}}}$$

where $r_{\text{half}}$ = reliability of half of the test.

Charles Edward Spearman (Father of the True Score Theory of Reliability)

http://en.wikipedia.org/wiki/Charles_Spearman

http://en.wikipedia.org/wiki/William_Brown_(psychologist)

The equation for the correlation coefficient is:
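Assuming the Pearson product-moment correlation is the coefficient meant here, it can be written as follows, where $x$ and $y$ are the scores of each student on the two halves of the test (for example, odd items and even items) and $n$ is the number of students. The symbols $x$, $y$ and $n$ are notation chosen for this sketch.

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}$$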

Correlation score between the two halves

Example:
Five (5) students
Test: 10 items
Split-half: odd vs. even
Result: $r_{\text{half}} = 0.1336$

Applying the Spearman–Brown prophecy formula:

$$R = \frac{2 \times r_{\text{half}}}{1 + r_{\text{half}}} = \frac{2 \times 0.1336}{1 + 0.1336} = \frac{0.2672}{1.1336} = 0.2357$$
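The split-half procedure can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function names are made up for this example, the correlation is assumed to be Pearson's, and the five students' item responses are hypothetical (the lecture gives only the resulting correlation of 0.1336, not the raw scores).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_scores(responses):
    """Score the odd-numbered and even-numbered items of one student.

    responses is a list of 0/1 item scores; item 1 is at index 0.
    """
    odd_score = sum(responses[0::2])   # items 1, 3, 5, ...
    even_score = sum(responses[1::2])  # items 2, 4, 6, ...
    return odd_score, even_score


def spearman_brown(r_half):
    """Step up the half-test correlation to a full-test reliability."""
    return 2 * r_half / (1 + r_half)


# The lecture's worked example: r_half = 0.1336.
print(round(spearman_brown(0.1336), 4))  # 0.2357

# Hypothetical responses of five students to a 10-item test (1 = correct).
students = [
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],
]
halves = [split_half_scores(s) for s in students]
odd_scores = [h[0] for h in halves]
even_scores = [h[1] for h in halves]
r_half = pearson_r(odd_scores, even_scores)
print("half-test correlation:", r_half)
print("estimated full-test reliability:", spearman_brown(r_half))
```

The odd/even split is used here because, as noted above, it spreads items from the beginning, middle and end of the test across both halves.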


b. The Kuder-Richardson formula is the more frequently employed formula for determining internal consistency, particularly KR20 (more difficult to calculate; requires a computer program) and KR21.

Dr. Frederic Kuder (1903-2000) was one of the premier innovators of vocational assessments.

His 1938 Kuder Preference Record became one of the most-used career guidance instruments in schools and colleges, and was taken by more than a million people worldwide over the course of several decades.

The Kuder-Richardson Formulas:

$$KR_{20} = \frac{K}{K - 1}\left(1 - \frac{\sum pq}{\text{Variance}}\right)$$

where
K = number of items in the test
p = proportion of students who answered the item correctly
q = proportion of students who answered the item wrongly = 1 – p
pq = variance of a single item scored dichotomously (right/wrong)

$$KR_{21} = \frac{K}{K - 1}\left(1 - \frac{M(K - M)}{K \times \text{Variance}}\right)$$

where
K = number of items on the test
M = mean of the test scores
Variance = variance of the test scores

The mean of a set of scores is simply the sum of the scores divided by the number of scores; the variance is given by:

$$\text{Variance} = \frac{\sum (x_i - M)^2}{n - 1}$$

where $x_i$ is an individual score and n is the number of test takers.
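A short Python sketch of both Kuder-Richardson formulas follows. The function names and the response matrix are hypothetical, chosen only for illustration. The variance uses the n – 1 divisor given above; other references divide by n, which changes the result slightly.

```python
def sample_variance(scores):
    """Variance of total scores with the n - 1 divisor used in the lecture."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((s - mean) ** 2 for s in scores) / (n - 1)


def kr20(item_matrix):
    """KR20 from a matrix of 0/1 item scores (one row per student)."""
    n_students = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    variance = sample_variance(totals)
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n_students  # proportion correct
        sum_pq += p * (1 - p)                                # item variance pq
    return (k / (k - 1)) * (1 - sum_pq / variance)


def kr21(k, mean, variance):
    """KR21 needs only the number of items, the mean and the variance."""
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical responses of five students to a 10-item test (1 = correct).
responses = [
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],
]
totals = [sum(row) for row in responses]
mean_score = sum(totals) / len(totals)
print("KR20:", kr20(responses))
print("KR21:", kr21(len(responses[0]), mean_score, sample_variance(totals)))
```

KR21 assumes all items are of roughly equal difficulty, which is why it can be computed from the mean and variance alone, while KR20 needs the proportion correct for every item.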

c. The Test-retest Method of estimating reliability

Reliability of a test may also mean the consistency of test results when the same test is administered at two different time periods.

The estimate of test reliability is then given by the correlation of the two test results.

The correlation is affected by the amount of time that elapses between the two administrations.

The shorter the period between the first and second administrations of the test to the same set of examinees, the higher the correlation. The longer the gap between the two tests, the lower the correlation.

3. FAIRNESS

An assessment procedure needs to be fair.

Students need to know exactly what the learning targets are and what method of assessment will be used. If students do not know what they are supposed to be achieving, then they could get lost in the maze of concepts being discussed in the class. Likewise, students have to be informed how their progress will be assessed in order to allow them to strategize and optimize their performance.

Assessment has to be viewed as an opportunity to learn rather than an opportunity to weed out poor and slow learners.

Fairness also implies freedom from teacher stereotyping (biases). Ex. Boys are better than girls in Math, or girls are better than boys in Language.

04/13/2023

4. PRACTICALITY AND EFFICIENCY

Another quality of a good assessment procedure is practicality and efficiency.

It should be practical in the sense that the teacher is familiar with it, and it should not require too much time (implementable).

A complex assessment procedure tends to be difficult to score and interpret, resulting in a lot of misdiagnosis or too long a feedback period, which may render the test inefficient.

5. ETHICS IN ASSESSMENT

The term “ethics” refers to questions of right and wrong.

When teachers think about ethics, they need to ask themselves if it is right to assess a specific knowledge or investigate a certain question.

Are there some aspects of the teaching-learning situation that should not be assessed?

Here are some situations in which assessment may not be called for:
Requiring students to answer a checklist of their sexual fantasies;
Asking elementary pupils to answer sensitive questions without the consent of their parents;
Testing the mental abilities of pupils using an instrument whose validity and reliability are unknown.


When a teacher thinks about ethics, the basic question to ask is:

“Will any physical or psychological harm come to anyone as a result of assessment or testing?”

Naturally, no teacher would want this to happen to any of his/her students.


Ethical (behavior): “conforming to the standards of conduct of a given profession or group” (Webster)

The fundamental responsibility of a teacher, and the most important ethical consideration of all, is to do all in his/her power to ensure that participants in an assessment program are protected from physical or psychological harm, discomfort or danger that may arise due to the testing procedure.

Ex: A teacher who wishes to test physical endurance may ask students to climb a very steep mountain, thus endangering them physically.


Test results and assessment results are confidential results.

They should be known only by the student concerned and the teacher.

Deception (3rd ethical issue in assessment)

There are instances in which it is necessary to conceal the objective of the assessment from the students in order to ensure fair and impartial results.

Teacher’s Special Responsibility

In such cases, the teacher has a special responsibility to:

a. Determine whether the use of such techniques is justified by the educational value of the assessment;

b. Determine whether alternative procedures are available that do not make use of concealment; and

c. Ensure that students are provided with sufficient explanation as soon as possible.

Finally, the temptation to assist certain individuals in class during assessment or testing is ever present. In this case, it is best if the teacher does not administer the test himself/herself if he/she believes that such assistance may, at a later time, be considered unethical.
