

  • Understanding Analyst Workflow through Baseball Analytics

    Justin Middleton, Kathryn T. Stolee, Emerson Murphy-Hill

    North Carolina State University

    jamiddl2@ncsu.edu

    Analytics + Software Development

    A modern match! But how have the co-developments of these fields influenced how the analyst learns and practices the analytic workflows to fulfill their job?


    Who are they?

    10 analysts:

    • 3 independent consultants

    • 3 university students

    • 2 MLB employees (current and former)

    • 1 hobbyist

    • 1 professor

    Of the participants, 9 of 10 have a formal background in math or statistics. Also, 5 of 10 have some computer science training, and 3 of 10 have some economics training.

    Future Work

    What do they use?

    How do they work?

    How do they learn?

    • Value of a research question guided by resulting wins and runs.

    • Data often from personal, community web scrapers of MLB sources.

    • Process rarely formalized; iterations quick, agile, and exploratory.

    • Self-awareness of lack of formal software training often used to explain lack of documenting and testing.

    • Biggest barriers: lack of time, lack of experience, lack of complex data.

    • Learn by doing (10/10); some use baseball as an educational sandbox for useful techniques.

    • Blogs (10/10), with and without code.

    • For some, books are essential; for others, ineffective.

    • Code examples unanimously desired; efficiency descriptions often not.

    • Resource must be credible, accessible, and for some, open-source.

    • Are these descriptions representative? Create surveys to reach a broader, more varied population of analysts, baseball and not.

    • What is it about blogs that makes them effective? Perform a feature analysis of statistical tutorials.

    A Focus on a Specific Case

    Baseball analytics combines a long, documented history of analysis with free access to much of its data. This has allowed professional and amateur analysts alike to pose and answer questions, like those below, in public.

    The Methodology for Community Analysis

    We recruited and interviewed 10 analysts from baseball conferences, online communities, and references to ask about:

    1. The experiences and resources, formal and informal, by which they developed their workflow, and

    2. How the parts of their workflow support or hurt their search for results.

    We applied techniques of grounded theory to analyze and aggregate the themes common among the community.

