Couture Curricula - BD2K Data Science Tailored to Your Needs

William Hersh 1, David A. Dorr 1,2, Melissa Haendel 1,3, Nicole Vasilevsky 1,3, Bjorn Pederson 1, Jackie Wirz 1,3, Shannon McWeeney 1

1 Department of Medical Informatics and Clinical Epidemiology, 2 Department of Medicine, 3 Ontology Development Group, Library
Oregon Health & Science University, Portland, OR

Why?

- Major challenge: how to manage, analyze and interpret the vast amounts of data being generated in biomedical research
- One goal of the NIH Big Data to Knowledge (BD2K) initiative: provide training for students and researchers to address this
- A research team in the Department of Medical Informatics and Clinical Epidemiology (DMICE) is developing skills courses

Approach

① Develop Open Educational Resources (OERs)
② Teach Skills Courses

OERs and courses connect the dots that help researchers understand how to apply data science techniques in the context of their whole research life cycle.

- Skills course and OER topics aim to fill specific gaps
- Teaching students how to ask the question and follow through

Skills Course Topics

Defining The Problem
- Problems amenable to analytics
- Importance of question
- Team definitions
- Scope
- When we do this wrong: methods don't match

Data Identification And Resources
- Finding the right data
- Search methods
- Use of metadata
- Data management

Wrangling Data
- Exploratory Data Analysis
- Data Dictionary
- As you touch data, what can go wrong?

Methods, Tools And Analysis
- Visualization
- Matching algorithms to problems

Scientific Communication
- Reporting Findings and Limitations
- Giving "Elevator Speech" on ideas of how to approach problem
- Critique of related problem

Course Offerings

Intro Course
- Week-long course in Summer 2015
- Offered to interns and undergraduates
- Taught basics of data science in the context of the research life cycle

Data After Dark
- Two-evening course in January 2016
- Offered to OHSU students, staff and faculty
- Taught basics of data science

Advanced Course
- Four-evening course to be offered in May 2016
- Offered to OHSU students, staff and faculty
- Will teach advanced topics in data science

OER Modules

01 | Biomedical Big Data Science
02 | Introduction to Big Data in Biology and Medicine
03 | Ethical Issues in Use of Big Data
04 | Terminology of Biomedical, Clinical, and Translational Research
05 | Computing Concepts for Big Data
06 | Clinical Data and Standards Related to Big Data
07 | Basic Research Data Standards
08 | Public Health and Big Data
09 | Team Science
10 | Secondary Use (Reuse) of Clinical Data
11 | Publication and Peer Review
12 | Information Retrieval
13 | Version control and identifiers
14 | Data annotation and curation
15 | Data and tools landscape
16 | Ontologies 101
17 | Data modeling
18 | Data metadata and provenance
19 | Semantic data interoperability
20 | Semantic Web data
21 | Context-based selection of data
22 | Translating the Question
23 | Implications of Provenance and Pre-processing
24 | Data tells a story
25 | Choice of Algorithms and Algorithm Dynamics
26 | Statistical Significance, P-hacking and Multiple-testing
27 | Displaying Confidence and Uncertainty
28 | Visualization and Interpretation
29 | Replication, Validation and the spectrum of Reproducibility
30 | Regulatory Issues in Big Data for Genomics and Health
31 | Hosting data dissemination and data stewardship workshops
32 | Guidelines for reporting, publications, and data sharing

Available at: http://skynet.ohsu.edu/bd2k

Challenges

- Scope: How to scope generic curricula for different levels of users
- Style: How to translate diverse teaching styles into general materials
- Dissemination: How to maximize dissemination while protecting intellectual property
- Images: How to incorporate images and other copyrighted materials into open resources

Students felt the course

Acknowledgements

This work is supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.

