River City Introduction
Using Emerging Technologies to Improve Student Achievement: The Potential of Virtual Performance Assessments
Chris Dede
Harvard University
[email protected]
www.gse.harvard.edu/~dedech/
Flawed Assessments Undercut Student (and Teacher) Achievement
“Drive-by” high-stakes tests frighten many students into suboptimal performance, which cumulatively leads to disengagement, low self-efficacy, and alienation
Students are rightly wary of investing in knowledge that tacitly is not valued because it is not measured or rewarded.
Teachers are forced to emphasize test performance rather than domain mastery
Current Summative Tests Undercut Achievement and Motivation
Paper-and-pencil item-based tests are inexpensive, reliable, and practical – but not valid for higher order thinking skills, such as scientific inquiry, or 21st century skills, such as mediated collaboration.
Physical performance assessments are more valid for sophisticated skills, but unreliable, impractical, expensive, and limited in types and number of tasks possible
The Assessment Triangle
Cognition: model of how students represent knowledge & develop competence in the domain
Observations: tasks or situations that allow one to observe students’ performance
Interpretation: methods for making sense of the data
Figure: the assessment triangle, with vertices Observation, Interpretation, and Cognition (Reasoning from Evidence)
Mediated Performances are an Untapped Resource
Cognition is distributed across human minds, tools/media, groups of people, and space/time; dispersed physically, socially, and symbolically
Event-logs of performances and communications provide insights
Distributed learning: collaborative, mediated, scaffolded, and data-generating
Types of Rich Datastreams
Multi-User Virtual Environments: Immersion in virtual contexts with digital artifacts and avatar-based identities
Wikis and other forms of Web 2.0 media
Asynchronous Discussions
Intelligent Tutoring Systems
Games
Augmented Realities
What is a MUVE?
An “Alice in Wonderland” experience where users enter a virtual space that has been configured for learning
Learners represent themselves through graphical avatars to communicate with others’ avatars and computer-based agents, as well as to interact with digital artifacts and virtual contexts
River City
Figure 1: Lab Equipment inside the University
Figure 2: River Water Sampling
http://muve.gse.harvard.edu/rivercityproject
Evidence of Student Work
Assessment data: pre-post content; pre-post affective; embedded assessments (formative); performance assessment (summative)
Contextual data: attendance records; demographic data; school data; observations; interviews
Active data: team chat; notebook entries; tracking of in-world activities (data-gathering strategies, pathways, inquiry processes)
Event Logs as Observational Data
Indicates with timestamps:
Where students went
With whom they communicated and what they said
What artifacts they activated
What databases they viewed
What data they gathered using virtual scientific instruments
What screenshots and notations they placed in team-based virtual notebooks
Unobtrusive observational data
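To make the idea of a timestamped event log concrete, here is a minimal Python sketch of what one log row and two simple queries might look like. The slides do not describe the actual River City log format, so the `LogEvent` fields, the action names, and the toy rows are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class LogEvent:
    """One timestamped row of a hypothetical event log (field names are assumed)."""
    timestamp: datetime
    student: str
    action: str                   # assumed labels: "move", "chat", "sample", ...
    target: str                   # place, person, artifact, or instrument involved
    detail: Optional[str] = None  # chat text, reading, or notebook note


def timeline(events, student):
    """Return one student's events in chronological order."""
    return sorted((e for e in events if e.student == student),
                  key=lambda e: e.timestamp)


def places_visited(events, student):
    """List the places a student went, in order, skipping consecutive repeats."""
    path = []
    for e in timeline(events, student):
        if e.action == "move" and (not path or path[-1] != e.target):
            path.append(e.target)
    return path


# Toy rows, invented for illustration only; deliberately out of order.
log = [
    LogEvent(datetime(2009, 3, 2, 9, 5), "s1", "sample", "river", "water sample taken"),
    LogEvent(datetime(2009, 3, 2, 9, 1), "s1", "move", "university"),
    LogEvent(datetime(2009, 3, 2, 9, 4), "s1", "move", "river"),
    LogEvent(datetime(2009, 3, 2, 9, 2), "s2", "chat", "s1", "meet at the river?"),
]

print(places_visited(log, "s1"))
```

Because every row carries a timestamp, the same records can be re-sorted and filtered later to answer questions that were not planned when the data were collected, which is what makes this kind of data unobtrusive.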
Student’s Role in the River City MUVE
Travel back in time 6 times between 1878-79
Bring 21st century skills and technology to address 19th century problems
Help town understand and solve part of the puzzle of why so many residents are becoming ill
Work as a research team
Keep track of clues that hint at causes of illnesses
Form and test hypotheses in a controlled experiment
Make recommendations based on experimental data
Capturing Data on Change over Time
Visit 1: Fall, 1878
Visit 2: Winter, 1879
Visit 3: Spring, 1879
Visit 4: Summer, 1879
Students visit the same places and see how things change over time. They spend an entire class period in an individual season, gathering data.
“Evidence Gathering”
An important, generic inquiry process:
amount (how much evidence per time spent)
range (coverage/balance among all the types of evidence)
saliency (importance of the evidence in understanding causality in the situation)
clustering (grouping of evidence based on its causal affiliation)
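The first two dimensions above lend themselves to simple counts over logged data. The following Python sketch shows one possible way to score them; the slides do not give formulas, so the per-minute rate, the coverage fraction, and the use of normalized Shannon entropy for balance are assumptions, and the evidence type names are invented. Saliency and clustering would need an expert-coded causal model and are not attempted here.

```python
from collections import Counter
from math import log


def amount(n_items, minutes):
    """Evidence items gathered per minute spent (0.0 if no time was logged)."""
    return n_items / minutes if minutes > 0 else 0.0


def coverage(types_seen, all_types):
    """Fraction of the available evidence types sampled at least once."""
    valid = set(all_types)
    if not valid:
        return 0.0
    return len(set(types_seen) & valid) / len(valid)


def balance(types_seen, all_types):
    """Normalized Shannon entropy of the evidence mix (an assumed measure):
    1.0 when evidence is spread evenly over all types, 0.0 for a single type."""
    valid = set(all_types)
    counts = Counter(t for t in types_seen if t in valid)
    total = sum(counts.values())
    if total == 0 or len(valid) < 2:
        return 0.0
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(valid))


# Hypothetical evidence types and one student's gathered items.
ALL_TYPES = ["water", "insects", "hospital_records", "interviews"]
gathered = ["water", "water", "insects", "interviews"]

print(amount(len(gathered), 20), coverage(gathered, ALL_TYPES), balance(gathered, ALL_TYPES))
```

Keeping coverage and balance separate is a design choice: a student can touch every evidence type once (high coverage) while still spending nearly all effort on one of them (low balance).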
“Evidence Gathering”
Foundational for other inquiry processes: hypothesis formation, experimental design, and argumentation
Related to student attributes: self-efficacy, metacognition, engagement, and content knowledge
Virtual Performance Assessments
Funded by the Institute of Education Sciences (IES)
Three-year grant
Design three virtual performance assessments to assess middle grade (6th and 7th) students' science inquiry learning in a standardized testing setting
http://virtualassessment.org
NSES Model of Inquiry
Identify questions that can be answered through scientific investigation (not independent of knowledge)
Design and conduct a scientific investigation
Use appropriate tools and techniques to gather, analyze, and interpret data
Develop descriptions, explanations, predictions, and models using evidence
Think critically and logically to make the relationships between evidence and explanations
Recognize and analyze alternative explanations and predictions
Communicate scientific procedures and explanations
Use mathematics in all aspects of scientific inquiry
Authentic Environments
A Challenge on which Every Student has Roughly Equal Familiarity
Assessment Platform
3-D Immersive Environment for Science Experimentation
Based on Authentic Setting
Highly Secure, Cross Platform Application Built in the Unity Framework
Realistic Complex Causal Model For Science Experimentation
Back End Architecture
Real-Time Analysis of Student Paths
All Interactions are Logged for Future Research
Ensure Data Integrity by Encrypting Data Along the Way
Complex Student Work Product is Recorded as XML, which can be tokenized
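As a rough illustration of what tokenizing an XML work product could mean, the Python sketch below flattens a small XML record into a sequence of tokens using the standard library parser. The slides do not show the actual schema or tokenizer, so the element names, attribute names, and the `tokenize` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical work-product record; tag and attribute names are invented.
SAMPLE = """<workproduct student="s1">
  <note>Water looks cloudy</note>
  <evidence source="water_probe" value="2.1"/>
  <evidence source="interview"/>
</workproduct>"""


def tokenize(xml_text):
    """Flatten an XML document into (kind, value) tokens in document order."""
    tokens = []
    for elem in ET.fromstring(xml_text).iter():
        tokens.append(("tag", elem.tag))
        for name, value in elem.attrib.items():
            tokens.append(("attr", f"{name}={value}"))
        text = (elem.text or "").strip()
        if text:
            # Split free text into lowercase words.
            tokens.extend(("word", w.lower()) for w in text.split())
    return tokens


tokens = tokenize(SAMPLE)
for kind, value in tokens:
    print(kind, value)
```

A flat token stream like this can be counted, compared across students, or fed to downstream scoring code without that code needing to know the XML layout.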
EcoMUVE (www.ecomuve.org)
Formative/Diagnostic
Formative, diagnostic assessment provides more leverage for improvement than summative measures
Formative, diagnostic assessment is richer and more accurate than summative measures
Potentially, formative, diagnostic assessment could substitute for summative measures.
Module 1: Pond Ecosystem
Modeled after Black’s Nook Pond in Cambridge, MA
“Submarine” Tool
Instruction and Assessment based on Learning Trajectories
Table 1: Forces as Interactions facet cluster (Krauss & Minstrell, 2002)
00 All forces are the result of interactions between two objects. Each object in the pair interacts with the other object in the pair. Each influences the other.
01 All interactions involve equal magnitude and oppositely directed action and reaction forces that are on the two separate interacting bodies.
40 Equal force pairs are identified as action and reaction but are on the same object. For the example of a book at rest on a table, the gravitational force on the book and the force by the table on the book are identified as an action-reaction pair.
50 Effects (such as damage or resulting motion) dictate relative magnitudes of forces during interaction.
51 At rest, therefore interaction forces balance.
52 "Moves", therefore interacting forces unbalanced.
53 Objects accelerate, therefore interacting forces unbalanced.
60 Force pairs are not identified as having equal magnitude because the objects are somehow different.
61 The “stronger” object exerts a greater force.
62 The moving object or the one moving faster exerts a greater force.
63 More active/energetic exerts more force.
64 Bigger/heavier exerts more force.
90 Inanimate objects cannot exert a force.
Related Initiatives
Cisco-Intel-Microsoft global initiative on assessing 21st century skills
Advances in European measures, such as PISA
Evolution of US tests, such as NAEP
Numerous other scholars working on games and simulations for learning and assessment
A Breakthrough in the Next Few Years, But Don’t Wait!
“Disruptive” Assessment
Rewarding Achievement Useful in Real World
Students see academic learning as relevant
Quality is measured in sophisticated ways along multiple dimensions
Rote teaching and learning are exposed as tragically inadequate
Learning and formative assessment are richly interwoven in engaging ways
Call for New Measures of Inquiry
Paper-and-pencil tests, such as the National Assessment of Educational Progress (NAEP), Third International Math and Science Study (TIMSS), and New Standards Science Reference Exams (NSSRE), don’t measure inquiry well and aren’t aligned with the NSES standards
NAEP published their framework for establishing a new science assessment in 2009 that calls for multiple modes of assessment, including interactive computer assessments
“Immersive” Interfaces for Learning
Virtual Reality: Full sensory immersion via head-mounted displays or CAVEs
Multi-User Virtual Environments: Immersion in virtual contexts with digital artifacts and avatar-based identities
Ubiquitous Computing: Wearable wireless devices coupled to smart objects for “augmented reality”
Affordances of Immersive Interfaces
The types of behaviors immersive interfaces can enable:
Complex situations with tacit clues
Simulated scientific instruments
Virtual experimentation
Simulated collaboration in a team
Adaptive responses to student choices
Documented in Event-logs and Chat-logs
Traditional Evaluation of Quality
Inferential methods:
On average, students in the River City treatment scored 0.2 points higher on the post self-efficacy in general science inquiry section of the affective measure (t=2.22, p<.05).
On average, students in this sample who saw higher gains in self-efficacy in general science inquiry scored higher on the post-test. These gains were higher for students in the River City project (n=358).
Yet these results tell us nothing about the patterns, behaviors, and processes that lead to inquiry. We are also limited by the number of variables we can build into our inferential models.
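For readers unfamiliar with the kind of statistic reported above, the Python sketch below computes a two-sample t statistic from two groups of scores. The slides do not say which t-test was used, so Welch's unequal-variance form is an assumption here, and the scores are toy numbers invented for illustration, not River City data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(treatment, control):
    """Welch's t statistic for the difference between two group means."""
    # Standard error of the mean difference, using each group's sample variance.
    se = sqrt(variance(treatment) / len(treatment)
              + variance(control) / len(control))
    return (mean(treatment) - mean(control)) / se


# Toy post-survey scores, invented for illustration only.
control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]

t = welch_t(treatment, control)
print(t)
```

The statistic reduces each group to a mean and a variance, which illustrates the limitation noted on this slide: a single summary number says nothing about the sequence of actions a student took.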
Goals of IES VPA Project
Proof of Concept for Immersive Virtual Performance Assessments (IVPAs) that Measure Sophisticated Intellectual/Social Skills
Establish higher validity than physical performance assessments (PPAs)
  No challenges of physical materials
  Virtual worlds enable performances impossible in classrooms
Establish higher reliability and usability than PPAs, as well as lower cost
  Detailed tracking of participant behaviors
  Respectable psychometrics compared to paper-and-pencil item-based tests
Establish that student engagement leads to every participant working hard to succeed
  The importance of shifts in identity
Research Questions
Can we construct a virtual assessment that measures scientific inquiry, as defined by the National Science Education Standards (NSES)?
What is the evidence that our assessments are designed to test NSES inquiry abilities?
Are these assessments reliable?
Research Methods
Alignment studies
Cognitive analysis studies (think-alouds with students)
Generalizability study across three instances of the same assessment
Assessment Framework
Evidence Centered Design
I. Domain Analysis
II. Domain Modeling
III. Conceptual Assessment Framework
IV. Assessment Implementation
V. Assessment Delivery
VI. Refinement
Design Process is Not Linear
Domain Analysis
Domain Modeling
Conceptual Assessment Framework
Assessment Implementation
Domain Analysis
We analyzed different models for science inquiry:
NSES Standards (National Research Council, 1996)
Inquiry Cycle (White & Frederiksen, 1998)
Novice-expert models (Chi, Feltovich, & Glaser, 1981)
Scientific Discovery as Dual Search (SDDS) (Klahr, 2000)
Epistemological & Strategic (Kuhn & Pease, 2008)
NAEP Framework (NAEP, 2008)
Inquiry Models
“The whole of science is nothing more than a refinement of everyday thinking.” -- Einstein, 1936 (quoted in Klahr, 2000)
Inquiry is the way we think. Some people do it better.
Experts are doing something cognitively different in their head.
Enhanced Assessment Platform
Use Performance Palettes to Collect Student Work
Minimize the Prediction of Language Art Skills via use of Audio Instruction and Visual Cues
Enable Realistic Use of Tools Anywhere in the World
Map of the Context
Can vary the causal model, so the assessment can differ from one student or class to another, as long as each model has an equivalent amount of evidence collectable with equivalent time and effort
Back End Architecture