解決正確的問題 - 如何讓數據發揮影響力?

Preview:

Citation preview

解決正確的問題:如何讓數據發揮影響力?Pei-shen Wu, MD

peishen.wu@gmail.com

Outline• 是什麼阻礙了組織運用數據資料的力量?• 怎麼解決?從 practitioner的角度出發• 重新思考數位時代下的學習

資料分析需求的分佈

因果關係驗證

資料探索、假設建立、相關性探究大多數的分析需求在這

與資料人員的對話「近期增加了產品品項,但客戶人數不見起色,想問最近用戶的活躍狀況?」「這些用戶的狀況跟去年同期比較如何?」「這些用戶是否有地區差異?」「與這些用戶同班的用戶的狀況?」…

回應單一個問題需要的功夫SELECT keyID,userID,dot_email,gender,points,user_nickname,username,userEmail,userSendableEmail,userRole,userCity,userSchool,userGrade,joinedTime,userBirthdate,match_score FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,pre.userEmail AS userEmail,pre.userSendableEmail AS userSendableEmail,pre.userRole AS userRole,pre.userCity AS userCity,pre.userSchool AS userSchool,pre.userGrade AS userGrade,pre.joinedTime AS joinedTime,pre.userBirthdate AS userBirthdate,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,post.userEmail AS userEmail,post.userSendableEmail AS userSendableEmail,post.userRole AS userRole,post.userCity AS userCity,post.userSchool AS userSchool,post.userGrade AS userGrade,post.joinedTime AS joinedTime,post.userBirthdate AS userBirthdate FROM (SELECT * FROM (SELECT keyID, userID, dot_email, gender, points, user_nickname, username FROM (SELECT __key__.name AS keyID, user_id AS userID, user_email AS underline_email, current_user.email AS cu_email, user.email AS dot_email, gender, points, user_nickname, username FROM junyi_20161212.UserData_20161212)) AS pre INNER JOIN EACH (SELECT userEmail, userSendableEmail, userId AS userID, userRole, userCity, userSchool, userGrade, joinedTime, userBirthdate FROM FinalTable.UserFinalTmpInfo) AS post ON pre.userID = post.userID)) AS pre INNER JOIN EACH (SELECT keyID,match_score FROM (SELECT keyID, match_score FROM (SELECT keyID,prob1*prob2 AS match_score FROM (SELECT pre.keyID AS keyID,pre.prob1 AS prob1,post.prob2 AS prob2 FROM (SELECT * FROM (SELECT keyID, prob1 FROM (SELECT keyID, match_score AS prob1 FROM (SELECT keyID, match_score FROM (SELECT keyID, match_score_A*cdf AS match_score FROM (SELECT keyID, MAX(output) AS cdf, match_score_A FROM (SELECT keyID, IF(metric >= input,1,0) AS compare, output, match_score_A FROM (SELECT pre.keyID AS keyID,pre.comparekey AS comparekey,pre.metric AS metric,pre.match_score_A AS match_score_A,post.input AS input,post.output AS output FROM (SELECT * FROM (SELECT keyID, 1 AS comparekey, metric, match_score AS match_score_A FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT teacher_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT classID, MAX(match_score) AS match_score FROM (SELECT classID,match_score FROM (SELECT classID, MAX(match_score) AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT keyID AS student_keyID, match_score FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID)) AS post ON pre.student_keyID = post.student_keyID)) GROUP BY classID)) GROUP BY classID) AS post ON pre.classID = post.classID)) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))) AS pre INNER JOIN EACH (SELECT input, AVG(output) AS output, 1 AS comparekey FROM ( SELECT input, output/101 AS output FROM ( SELECT input, ROW_NUMBER() OVER() AS output FROM ( FLATTEN(( SELECT QUANTILES(metric,101) AS input FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))), input)))) GROUP BY input) AS post ON pre.comparekey = post.comparekey))) WHERE compare == 1 GROUP BY keyID, match_score_A))))) AS pre INNER JOIN EACH (SELECT keyID, prob2 FROM (SELECT keyID, match_score AS prob2 FROM (SELECT keyID, match_score FROM (SELECT keyID, timing_match_denumerator/timing_match_numerator AS match_score FROM (SELECT keyID, SUM(match_condition) OVER (PARTITION BY keyID) AS timing_match_denumerator, COUNT(match_condition) OVER (PARTITION BY keyID) AS timing_match_numerator, FROM (SELECT keyID, date, month, weekday, hour, minute, IF((hour == 9 OR hour == 10 OR hour == 11 OR hour == 12 OR hour == 13 OR hour == 14 OR hour == 15),1,0) AS match_condition FROM (SELECT keyID, date, month,

試圖以三週努力滿足三秒的好奇

回答速度趕不上問題產生的速度

資料人員第一線人員

是否可以更快?

需求提出 規格確認 初步實作 驗證除錯

Scale in complexity,not size

IT-centric Business Intelligence

因果關係驗證

資料探索、假設建立、相關性探究

資料人員

第一線人員

User-centric Business Intelligence

因果關係驗證

資料探索、假設建立、相關性探究

資料人員

第一線人員

市場上既已存在工具方案 (eg. Tableau)又何必自行研究?

Complexity is both domain + infrastructure

specific

How do you encode & implement this part?

即便是用 Google Bigquery仍是如此SQL 是

Business Logic 的組合語言

如果可以這樣該有多麼美好?SELECT keyID,userID,dot_email,gender,points,user_nickname,username,userEmail,userSendableEmail,userRole,userCity,userSchool,userGrade,joinedTime,userBirthdate,match_score FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,pre.userEmail AS userEmail,pre.userSendableEmail AS userSendableEmail,pre.userRole AS userRole,pre.userCity AS userCity,pre.userSchool AS userSchool,pre.userGrade AS userGrade,pre.joinedTime AS joinedTime,pre.userBirthdate AS userBirthdate,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,post.userEmail AS userEmail,post.userSendableEmail AS userSendableEmail,post.userRole AS userRole,post.userCity AS userCity,post.userSchool AS userSchool,post.userGrade AS userGrade,post.joinedTime AS joinedTime,post.userBirthdate AS userBirthdate FROM (SELECT * FROM (SELECT keyID, userID, dot_email, gender, points, user_nickname, username FROM (SELECT __key__.name AS keyID, user_id AS userID, user_email AS underline_email, current_user.email AS cu_email, user.email AS dot_email, gender, points, user_nickname, username FROM junyi_20161212.UserData_20161212)) AS pre INNER JOIN EACH (SELECT userEmail, userSendableEmail, userId AS userID, userRole, userCity, userSchool, userGrade, joinedTime, userBirthdate FROM FinalTable.UserFinalTmpInfo) AS post ON pre.userID = post.userID)) AS pre INNER JOIN EACH (SELECT keyID,match_score FROM (SELECT keyID, match_score FROM (SELECT keyID,prob1*prob2 AS match_score FROM (SELECT pre.keyID AS keyID,pre.prob1 AS prob1,post.prob2 AS prob2 FROM (SELECT * FROM (SELECT keyID, prob1 FROM (SELECT keyID, match_score AS prob1 FROM (SELECT keyID, match_score FROM (SELECT keyID, match_score_A*cdf AS match_score FROM (SELECT keyID, MAX(output) AS cdf, match_score_A FROM (SELECT keyID, IF(metric >= input,1,0) AS compare, output, match_score_A FROM (SELECT pre.keyID AS keyID,pre.comparekey AS comparekey,pre.metric AS metric,pre.match_score_A AS match_score_A,post.input AS input,post.output AS output FROM (SELECT * FROM (SELECT keyID, 1 AS comparekey, metric, match_score AS match_score_A FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT teacher_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT classID, MAX(match_score) AS match_score FROM (SELECT classID,match_score FROM (SELECT classID, MAX(match_score) AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT keyID AS student_keyID, match_score FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID)) AS post ON pre.student_keyID = post.student_keyID)) GROUP BY classID)) GROUP BY classID) AS post ON pre.classID = post.classID)) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))) AS pre INNER JOIN EACH (SELECT input, AVG(output) AS output, 1 AS comparekey FROM ( SELECT input, output/101 AS output FROM ( SELECT input, ROW_NUMBER() OVER() AS output FROM ( FLATTEN(( SELECT QUANTILES(metric,101) AS input FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))), input)))) GROUP BY input) AS post ON pre.comparekey = post.comparekey))) WHERE compare == 1 GROUP BY keyID, match_score_A))))) AS pre INNER JOIN EACH (SELECT keyID, prob2 FROM (SELECT keyID, match_score AS prob2 FROM (SELECT keyID, match_score FROM (SELECT keyID, timing_match_denumerator/timing_match_numerator AS match_score FROM (SELECT keyID, SUM(match_condition) OVER (PARTITION BY keyID) AS timing_match_denumerator, COUNT(match_condition) OVER (PARTITION BY keyID) AS timing_match_numerator, FROM (SELECT keyID, date, month, weekday, hour, minute, IF((hour == 9 OR hour == 10 OR hour == 11 OR hour == 12 OR hour == 13 OR hour == 14 OR hour == 15),1,0) AS match_condition FROM (SELECT keyID, date, month,

class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")

class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher”, student_UserObj = class.students)

Regular_Preparation_Public.teachers <- UserObj_ANDmerge(con_probability_AgreaterthanB(dataset,UserObj.A = class.teachers,UserObj.B = class.students,metric = "seconds”, video = T, prob = T,window.begin = window.begin, window.end = window.end),

con_activity_timing(dataset, prob = T, video = T,complete.mission = T, assign.mission = T,hour = 9:15, window.begin = window.begin, window.end = window.end))

Regular_Preparation_Public.teachers <- view_User_FinalTable(dataset, Regular_Preparation_Public.teachers, output = "data")

如果可以這樣該有多麼美好?class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")

class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher”, student_UserObj = class.students)

Regular_Preparation_Public.teachers <- UserObj_ANDmerge(con_probability_AgreaterthanB(dataset,UserObj.A = class.teachers,UserObj.B = class.students,metric = "seconds”, video = T, prob = T,window.begin = window.begin, window.end = window.end),

con_activity_timing(dataset, prob = T, video = T,complete.mission = T, assign.mission = T,hour = 9:15, window.begin = window.begin, window.end = window.end))

Regular_Preparation_Public.teachers <- view_User_FinalTable(dataset, Regular_Preparation_Public.teachers, output = "data")

有班級的使用者( 學生 ) 所屬的老師

ViewFunction

比較 A 與 B課堂時間影片跟習題的秒數回傳 A

以邏輯模組來操作已建置的 SQL

有班級的使用者( 學生 ) 所屬的老師

ViewFunction

比較 A 與 B課堂時間影片跟習題的秒數回傳 A

AB

邏輯模組的其他好處• Reusable code• Focus on business logic itself• Enabling very complex queries = Allowing rapid hypothesis generation and alignment with data

• Makes validation & debugging easier

Having data is not enough

Too much data is

Counter-productive

Lost in strange trends (not big pictures)and on numbers that lead to No action

It makes no senseif you can’t do or

no need to do anything about it.

7 天前來過,每次平均消費 3000NTD總共來過 4 次Recency = 7

Frequency = 4Monetary Value = 3000

Monetary Value

Frequency

常來 + 每次來花很多

少來 + 做不到業績

偶爾來一旦來就花很多

常來但每次來都很省

Monetary Value

Frequency

常來 + 每次來花很多

少來 + 做不到業績

偶爾來一旦來就花很多

常來但每次來都很省

來店有禮 吸引你常來蒞臨

Monetary Value

Frequency

常來 + 每次來花很多

少來 + 做不到業績

偶爾來一旦來就花很多

常來但每次來都很省

買滿千送百 給你湊數的誘因

同一件商品,若可以辨識你是價格不敏感者 賣高價品價格敏感者 打折賣

= “Actions we can take”

其他可以用來辨識「價格敏感度」的資料 ?

收入?學生?資產狀況小孩人數?Coupon的使用?是否是當地人?品牌忠誠度?市場熟悉程度?同類型產品,買哪種價位的?

Ignore coupons Offer high price

MaximizeRevenue

Theory: 消費行為受 Price-sensitivity 影響

“Data” “Action” “Outcome”

要倒回來想:主詞是能做什麼(Action)才能產生想要的

Outcome?

Data Action Outcome

Asking a relevant questionis domain specific skill

Take Action = P( 生理數據 ) 非專業人士:

Take Action = P( 生理數據 | 病史 ) 專業人士:

專業人士:P( HR<60 | 不規律呼吸 , 高血壓) = 腦壓上昇

醫療文獻稱這三者合併出現為 Cushing’s reflex

善用資料改善線上教育 slide (https://goo.gl/Yd6ajF)

善用資料改善線上教育 slide (https://goo.gl/Yd6ajF)

https://www.junyiacademy.org/

https://www.junyiacademy.org/

https://www.junyiacademy.org/

善用資料改善線上教育 slide (https://goo.gl/Yd6ajF)

老師及所屬學生的使用行為分析 翻轉教學成就孩子Theory: 好的 practice是可以被複製跟運用的

“Data” “Action” “Outcome”

找出領頭羊以便觀摩

醫師 老師病史詢問、各式檢查執行治療規劃治療方針衛教

評估先備知識安排學習情境、教材評估學習成效給予指導、輔導跟支持選擇教學策略

Data Action Outcome

What is the theory?What cause-effect is already known?

必須深入有關學習的理論才能有更多可能性

學生學習的行為 成就孩子

Learning theories

“Data” “Action” “Outcome”

協助挑選合適的教學策略

三種主流的觀點BehaviorismCognitivismConstructivism

數據最容易應用的思維

Behaviorism 行為主義刺激 行為 後果

刺激 行為 後果古典制約 Classical Conditioning

刺激 行為 後果操作制約 Operant Conditioning

行為主義的觀點Feedback loops + 合適的 Reward搭配可以系統化的影響學習行為

Behaviorism 行為主義• 被動的接受知識• 遺忘是因為久了沒用• 學習就是–行為機率變化 : P(刺激 ) = 行為

• Generalization是因為刺激源類似• 強調practice、 cues、 hints、 rewards …

行為主義的侷限認知資源低的學習 = 很強大、很有效–寫選擇題、填空、背誦、配對比對

認知資源高的學習 = 解釋力很差–學會能賴以工作的外語–做別人沒做過的事、 ill-defined problems–創作、申論題

What can we learn from other theories?

Cognitivism 認知主義–內在 knowledge state變化 State & Path analysis

學習者必須主動的把知識 encode到記憶中–強調新知識跟舊知識的關係 倒帶行為

0:10 – 0:60Peak at 0:30

2:00-2:20

3:50 -

「質數判別方法」影片劇本• 找出 1 to 10哪些是質數?– [0:10] 1 不是質數,也不是合數– [0:25] 合數就是除了質數以外的數– [0:40] 0也不是質數,也不是合數– [0:50] 2 是質數唯一的偶數,其他質數都是奇數

• [2:00] 再次強調,只有 2 是偶數的質數• [2:10] 13 是否為質數?• [3:50] 33 是否為質數?• [4:10] 一個數一定可以被自己跟 1整除• [4:30] 33 可以被 3整除,所以不是質數

把影片內容依主題切成九段

從頭看到尾,只看一次 只看前面的觀念,不做 13,33的練習

前面的觀念看兩次後,才做 13,33練習 重複從頭到尾看很多次

能對教材「倒帶」「暫停」「跳過」是少數能呈現學生對自己的學習做選擇的結果

學生所做的選擇反映他如何建構他自己的知識

Issues not addressed by Behaviorism

Context 情境Intrinsic motivation(內在動機 )–Competence = 成就感–Autonomy = 自我控制感,可以自己做選擇–Relatedness = 社群、夥伴、團隊

使用者 long-term engagement 關鍵1.社群的存在2.內在動機驅動3.Experience本身會 evolve over time ,而非短期的 funnel 或 habit loops

創造一個會隨時間逐漸 unfold 的體驗一個會隨使用者的投入與技能成長改變的體驗

Recommended