Human Algorithmic Stability and Human Rademacher Complexity
European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Mehrnoosh Vahdat, Luca Oneto, Alessandro Ghio, Davide Anguita, Mathias Funk and Matthias Rauterberg
University of Genoa - DIBRIS - SmartLab
April 22-24 | [email protected] | www.lucaoneto.com
Outline
Goal of Our Work
Introduction
Machine Learning: Rademacher Complexity, Algorithmic Stability
Human Learning: Rademacher Complexity, Algorithmic Stability
Experimental Design
Results
Conclusions
Goal of Our Work
In Machine Learning, the learning process of an algorithm, given a set of evidence, is studied via complexity measures. We ask whether the same measures can characterize Human Learning:
Algorithmic Stability ↔ Human Algorithmic Stability
Introduction
• Exploring the way humans learn is a major interest in Learning Analytics and Educational Data Mining1
• Recent works in cognitive psychology highlight how the cross-fertilization between Machine Learning and Human Learning can be widened to how people tackle new problems and extract knowledge from observations2
• Measuring the ability of a human to capture information, rather than simply memorize it, is a way to optimize Human Learning
• A recent work proposed applying the Rademacher Complexity approach to estimate the human capability of extracting knowledge (Human Rademacher Complexity)3
• As an alternative, we propose to exploit Algorithmic Stability in the Human Learning framework, computing the Human Algorithmic Stability4
• Human Rademacher Complexity and Human Algorithmic Stability are measured by analyzing the way a group of students learns tasks of different difficulty
• Results show that using Human Algorithmic Stability leads to a more informative analysis
1Z. Papamitsiou and A. Economides. Learning analytics and educational data mining in practice: A systematicliterature review of empirical evidence. Educational Technology & Society, 17(4):49-64, 2014.
2N. Chater and P. Vitanyi. Simplicity: A unifying principle in cognitive science? Trends in cognitive sciences,7(1):19-22, 2003.
3X. Zhu, B. R. Gibson, and T. T. Rogers. Human rademacher complexity. In Neural Information ProcessingSystems, pages 2322-2330, 2009.
4O. Bousquet and A. Elisseeff. Stability and generalization. The Journal of Machine Learning Research,2:499-526, 2002.
Machine Learning
Machine Learning studies the ability of an algorithm to capture information rather than simply memorize it.5
Let $\mathcal{X}$ and $\mathcal{Y} \in \{\pm 1\}$ be, respectively, an input and an output space. We consider $m$ sets $\mathcal{D}_m = \{S^1_n, \ldots, S^m_n\}$ of $n$ labeled i.i.d. data $S^j_n = \{Z^j_1, \ldots, Z^j_n\}$ ($j = 1, \ldots, m$), where $Z^j_i = (X^j_i, Y^j_i)$ with $X^j_i \in \mathcal{X}$ and $Y^j_i \in \mathcal{Y}$, sampled from an unknown distribution $\mu$. A learning algorithm $\mathcal{A}$ is used to train a model $f: \mathcal{X} \to \mathcal{Y}$, $f \in \mathcal{F}$, based on $S^j_n$. The accuracy of the selected model is measured with reference to the hard loss function $\ell(\cdot, \cdot)$, which counts the number of misclassified examples.
Several approaches6 in the last decades dealt with the development of measures to assess the generalization ability of learning algorithms, in order to minimize the risk of overfitting.
5V. N. Vapnik. Statistical learning theory. Wiley-Interscience, 1998.
6O. Bousquet, S. Boucheron, and G. Lugosi. Introduction to statistical learning theory. Advanced Lectures on Machine Learning. Springer Berlin Heidelberg, 2004.
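The setup above can be made concrete with a small numerical sketch. The toy distribution $\mu$, the concept $Y = \mathrm{sign}(X)$, and all names below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# m sets of n labeled examples drawn i.i.d. from a toy distribution mu:
# X is standard normal, Y = sign(X) in {+1, -1}.
m, n = 5, 20
X = rng.normal(size=(m, n))
Y = np.where(X > 0, 1, -1)

def hard_loss(y_pred, y_true):
    """Hard loss: fraction of misclassified examples."""
    return float(np.mean(np.asarray(y_pred) != np.asarray(y_true)))

# A model f: X -> Y; here it equals the true concept, so the loss is zero.
def f(x):
    return np.where(x > 0, 1, -1)

print(hard_loss(f(X[0]), Y[0]))  # 0.0
```

The hard loss is reported here as a fraction rather than a raw count, which matches how the averaged quantities later in the deck are plotted on a [0, 1] scale.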
Rademacher Complexity7
Rademacher Complexity (RC) is a now-classic measure:

$$R_n(\mathcal{F}) = \frac{1}{m} \sum_{j=1}^{m} \left[ 1 - \inf_{f \in \mathcal{F}} \frac{2}{n} \sum_{i=1}^{n} \ell\left(f, (X^j_i, \sigma^j_i)\right) \right]$$

where the $\sigma^j_i$ (with $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, m\}$) are $\pm 1$-valued random variables for which $\mathbb{P}[\sigma^j_i = +1] = \mathbb{P}[\sigma^j_i = -1] = 1/2$.
7P. L. Bartlett and S. Mendelson. Rademacher and gaussian complexities: Risk bounds and structural results.The Journal of Machine Learning Research, 3:463-482, 2003.
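A minimal sketch of estimating $R_n(\mathcal{F})$ by Monte Carlo, assuming a hypothetical function class $\mathcal{F}$ of 1D threshold classifiers (the class choice and all names are illustrative; the inf over $\mathcal{F}$ is computed by brute force over data-induced thresholds):

```python
import numpy as np

rng = np.random.default_rng(0)

def min_error_thresholds(x, sigma):
    """Smallest 0/1 error rate over threshold classifiers
    f(x) = s * sign(x - t), with s in {+1, -1} and t over the data."""
    thresholds = np.concatenate(([-np.inf], np.sort(x)))
    best = 1.0
    for t in thresholds:
        for s in (1, -1):
            pred = np.where(x > t, s, -s)
            best = min(best, float(np.mean(pred != sigma)))
    return best

def rademacher(x_sets, n_mc=200):
    """Monte Carlo estimate of R_n(F), averaged over the m sets:
    for each random labeling sigma, accumulate 1 - 2 * inf error."""
    vals = []
    for x in x_sets:
        for _ in range(n_mc):
            sigma = rng.choice([-1, 1], size=len(x))
            vals.append(1 - 2 * min_error_thresholds(x, sigma))
    return float(np.mean(vals))

m, n = 3, 10
x_sets = rng.normal(size=(m, n))
print(round(rademacher(x_sets), 2))
```

Note that $1 - \frac{2}{n}\sum_i \ell(f,(X_i,\sigma_i))$ equals the usual correlation $\frac{1}{n}\sum_i \sigma_i f(X_i)$, so the code accumulates exactly the bracketed term of the formula above.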
Algorithmic Stability8,9
In order to measure Algorithmic Stability, we define the sets $S^{j \setminus i}_n = S^j_n \setminus \{Z^j_i\}$, where the $i$-th sample is removed. Algorithmic Stability is computed as

$$H_n(\mathcal{A}) = \frac{1}{m} \sum_{j=1}^{m} \left| \ell(\mathcal{A}_{S^j_n}, Z^j_{n+1}) - \ell(\mathcal{A}_{S^{j \setminus i}_n}, Z^j_{n+1}) \right|.$$
8O. Bousquet and A. Elisseeff. Stability and generalization. The Journal of Machine Learning Research,2:499-526, 2002.
9L. Oneto, A. Ghio, S. Ridella, and D. Anguita. Fully empirical and data-dependent stability-based bounds. IEEETransactions on Cybernetics, 10.1109/TCYB.2014.2361857 :in-press, 2014.
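A minimal sketch of this quantity, assuming a hypothetical learner $\mathcal{A}$ (1-nearest-neighbour) and a toy concept; the learner, distribution, and names are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_nn(train_x, train_y, x):
    """A simple learner A: 1-nearest-neighbour prediction in 1D."""
    return train_y[np.argmin(np.abs(train_x - x))]

def stability(m=50, n=15):
    """Estimate H_n(A): mean absolute change of the 0/1 loss on a
    fresh point Z_{n+1} when one training sample is removed."""
    diffs = []
    for _ in range(m):
        x = rng.normal(size=n + 1)
        y = np.where(x > 0, 1, -1)      # toy concept Y = sign(X)
        i = rng.integers(n)             # index of the removed sample
        keep = np.delete(np.arange(n), i)
        loss_full = float(one_nn(x[:n], y[:n], x[n]) != y[n])
        loss_loo = float(one_nn(x[keep], y[keep], x[n]) != y[n])
        diffs.append(abs(loss_full - loss_loo))
    return float(np.mean(diffs))

print(stability())
```

With the hard loss, each term is either 0 (the removal does not change the outcome on $Z^j_{n+1}$) or 1 (it flips it), so $H_n(\mathcal{A})$ is the fraction of perturbations that change the prediction's correctness.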
Human Learning
Inquiry-guided HL focuses on contexts where learners are meant to discover knowledge rather than memorize it passively10,11: the instruction begins with a set of observations to interpret, and the learner tries to analyze the data or solve the problem with the help of guiding principles12, resulting in a more effective educational approach13.

Measuring the ability of a human to capture information, rather than simply memorize it, is thus key to optimize HL.
10A. Kruse and R. Pongsajapan. Student-centered learning analytics. In CNDLS Thought Papers, 2012.
11V. S. Lee. What is inquiry-guided learning? New Directions for Teaching and Learning, 2012(129):5-14, 2012.
12V. S. Lee. The power of inquiry as a way of learning. Innovative Higher Education, 36(3):149-160, 2011.
13T. De Jong, S. Sotiriou, and D. Gillet. Innovations in STEM education: the Go-Lab federation of online labs. Smart Learning Environments, 1(1):1-16, 2014.
Rademacher Complexity
A previous work14 depicts an experiment targeted at identifying the Human Rademacher Complexity, defined as a measure of the capability of a human to learn noise, and thus of the propensity to overfit data.

Unfortunately, Rademacher Complexity has several disadvantages:
• We need to fix, a priori, the class of functions (which, for a human, may not even exist)
• We have to assume that every human always performs at their best (always finds the best solution they are able to find)
• We have to discard the labels (different tasks with different difficulty levels, in a fixed domain, result in the same Human Rademacher Complexity)
14X. Zhu, B. R. Gibson, and T. T. Rogers. Human rademacher complexity. In Neural Information ProcessingSystems, pages 2322-2330, 2009.
Algorithmic Stability
We designed and carried out a new experiment with the goal of estimating the average Human Rademacher Complexity and Human Algorithmic Stability for a class of students: the latter is defined as a measure of the capability of a learner to understand a phenomenon even when given slightly perturbed observations.

In order to measure the Human Algorithmic Stability, we do not need to make any such hypothesis.
Experimental Design (I/V)
Shape Problem
[Figure: example shape stimuli omitted]

Two problems:
• Shape Simple: (Thin) vs (Fat)
• Shape Difficult: (Very Thin + Very Fat) vs (Middle)
Experimental Design (II/V)
Word Problem: Wisconsin Perceptual Attribute Ratings Database15
Word Simple: (Long) vs (Short)
• pet
• cake
• cafe
• …
• honeymoon
• harmonica
• handshake

Word Difficult: (Positive Emotion) vs (Negative Emotion)
• rape
• killer
• funeral
• …
• fun
• laughter
• joy
15D. A. Medler, A. Arnoldussen, J. R. Binder, and M. S. Seidenberg. The Wisconsin Perceptual Attribute RatingsDatabase. http://www.neuro.mcw.edu/ratings/, 2005.
Experimental Design (III/V)
Human Error
Four problems:
• Shape Simple (SSS)
• Shape Difficult (SSD)
• Word Simple (SWS)
• Word Difficult (SWD)

Procedure, for each domain:
• Extract a dataset $S^j_n$
• Extract a sample $\{X^j_{n+1}, Y^j_{n+1}\}$
• Give the student $S^j_n$ for learning
• Ask the student to predict the label of $X^j_{n+1}$

Filler task after each task.

$$L_n(\mathcal{A}) = \frac{1}{m} \sum_{j=1}^{m} \ell(\mathcal{A}_{S^j_n}, Z^j_{n+1}).$$
Human Rademacher Complexity
Two problems (labels are disregarded):
• Shape (RS)
• Word (RW)

Procedure, for each domain:
• Extract a dataset $S^j_n$
• Randomize the $n$ labels
• Give the student $S^j_n$ for learning
• Ask the student to predict the labels of the same $S^j_n$ in a different order

Filler task after each task.
Human Algorithmic Stability
Four problems: SSS, SSD, SWS, SWD.

Procedure, for each domain:
• Extract a dataset $S^j_n$
• Extract a sample $\{X^j_{n+1}, Y^j_{n+1}\}$
• Give one student $S^j_n$ for learning
• Ask that student to predict the label of $X^j_{n+1}$
• Give another student $S^j_{n-1}$ (the set with one sample removed) for learning
• Ask that student to predict the label of $X^j_{n+1}$

Filler task after each task.
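Given the recorded answers, the two per-student quantities reduce to simple averages. A minimal sketch, assuming hypothetical response tables (all values below are made up for illustration):

```python
import numpy as np

def zero_one(pred, true):
    """0/1 loss per questionnaire: 1.0 where the prediction is wrong."""
    return (np.asarray(pred) != np.asarray(true)).astype(float)

# Hypothetical recorded answers for m = 5 questionnaires: the true label
# of X_{n+1}, the prediction of the student trained on S^j_n, and the
# prediction of the student trained on the set with one sample removed.
true_labels    = [1, -1,  1, 1, -1]
pred_full      = [1, -1, -1, 1, -1]  # students given S^j_n
pred_perturbed = [1,  1, -1, 1, -1]  # students given the reduced set

L_n = zero_one(pred_full, true_labels).mean()
H_n = np.abs(zero_one(pred_full, true_labels)
             - zero_one(pred_perturbed, true_labels)).mean()
print(L_n, H_n)  # 0.2 0.2
```

Here $L_n(\mathcal{A})$ is the mean error of the unperturbed students, while $H_n(\mathcal{A})$ averages the absolute change in correctness between the paired students, mirroring the definitions given earlier.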
Results (I/III)
• $L_n(\mathcal{A})$ is smaller for simple tasks than for difficult ones
• While the error of ML models usually decreases with $n$, results on HL are characterized by oscillations, even for small variations of $n$ (small sample considered; only a subset of the students are willing to perform at their best when completing the questionnaires)
• Oscillations in terms of $L_n(\mathcal{A})$ mostly (and surprisingly) affect the simple tasks (SSS, SWS)
• Errors on the Shape domain are generally smaller than those recorded for the Word domain
Figure: $L_n(\mathcal{A})$ as a function of $n$ (curves: SWD, SWS, SSD, SSS; plot omitted)
Results (II/III)
• HAS for simple tasks is characterized by smaller values of $H_n(\mathcal{A})$
• HAS for the Shape domain is generally smaller than for the Word domain
• HAS offers interesting insights on HL, because it raises questions about the ability of humans to learn in different domains
Figure: $H_n(\mathcal{A})$ as a function of $n$ (curves: SWD, SWS, SSD, SSS; plot omitted)
Results (III/III)
• HRC is not able to grasp the complexity of the task, since labels are neglected when computing $R_n(\mathcal{F})$
• The two assumptions underlying the computation of HRC do not hold in practice: the learning process of an individual should be seen as a multifaceted problem, rather than a collection of factual and procedural knowledge targeted at minimizing a "cost"16
• HRC decreases with $n$ (as in ML), and this trend is substantially uncorrelated with the errors for the considered domains
Figure: $R_n(\mathcal{F})$ as a function of $n$ (curves: RW, RS; plot omitted)
16D. L. Schacter, D. T. Gilbert, and D. M. Wegner. Psychology, Second Edition. Worth Publishers, 2010.
Conclusions
• Human Algorithmic Stability is more powerful than Human Rademacher Complexity for understanding human learning
• Future work will:
  • analyze students' level of attention through the results of the filler task
  • include other heterogeneous domains (e.g. mathematics) and additional rules of increasing complexity
  • explore how the domain and the task complexity influence Human Algorithmic Stability, and how this relates to the error made in classifying new samples
• The dataset of anonymized questionnaires will be made publicly available