Sallie Keller Director and Professor of Statistics Social and Decision Analytics Laboratory http://sdal.vbi.vt.edu/ Social and Decision Analytics Lab

SDAL overview: Sallie Keller


DESCRIPTION

SDAL addresses social science in new ways that will transform how we understand the world. Among our goals: creating smart and resilient cities, combatting homelessness, understanding the spread of disease and developing effective public health responses, identifying innovation drivers, and meeting the demand for educated graduates in the field.


Page 1: Sdal overview  sallie keller

Sallie Keller Director and Professor of Statistics

Social and Decision Analytics Laboratory http://sdal.vbi.vt.edu/

Social and Decision Analytics Lab

Page 2: Sdal overview  sallie keller

Social and Decision Analytics Lab

•  Part of Virginia Bioinformatics Institute
•  Central to “Information Biology” theme
   –  Study of massively interacting systems, from molecular to social phenomena
•  Collaboration across VT and beyond
   –  Embrace VBI mantra of transdisciplinary team science

Page 3: Sdal overview  sallie keller

Social and Decision Analytics Lab Mission

•  Unite statisticians and social scientists to create quantitative methods for evidence-based social and behavioral analyses in support of policy decision-making
•  Focus on social issues that will transform our understanding of the world
   –  Resilient community ecosystems
   –  Influences of public health crises and how to respond to them
   –  Economic consequences of an educated society
   –  Causes of social disasters such as increased urbanization and homelessness
   –  Drivers of innovation

Page 4: Sdal overview  sallie keller

Social and Decision Analytics Lab Thrusts

1.  Developing the Science of “All” Data
    •  Theory and methods for integrating new and traditional sources of data for understanding the social condition
2.  Metropolitan Analytics
    •  Developing analytics to understand community responses to shocks and the subsequent impact on the well-being of the community
3.  Health Measurement Analytics
    •  Integration of public and private sector sources of data on medical care spending to unravel costs and impacts for a “healthy” population
4.  Education and Labor Force Analytics
    •  Developing analytics to understand and shape learning processes, workforce dynamics, and the impact of education on labor force outcomes
5.  Industrial Innovation Analytics
    •  Partner with industry to move beyond business analytics and develop information integration technologies able to accelerate innovation

Page 5: Sdal overview  sallie keller

Science of Big Data Origins: Bioinformatics

•  Modern “bioinformatics” has grown to encompass interactions among many levels and properties of individuals, groups and environments
•  It is a computationally enabled science
   –  Increasingly blurring the distinction between social, environmental, socio-technical and biological domains
   –  Unprecedented uptake of diverse unstructured information
   –  In-silico capture and generation of detailed interaction dynamics
•  Relevance to strategic & tactical decision making

Page 6: Sdal overview  sallie keller

Thrust 1: Developing the Science of Big Data

Big Data and Social Systems
•  Research has traditionally been based on statistically designed experiments and surveys
   –  Clean, well-controlled, limited in scale (~10^3) and/or resolution
•  Bringing “Big Data” to bear
   –  Data-informed high-performance computing
   –  Complex socio-technical systems
   –  Quantitative social science methods and practice at scale

Page 7: Sdal overview  sallie keller

New Data Flows for Social Behavioral Observing

•  Data collected faster, while individuals are in the act of behaving in real-life situations
   –  Increase in the sensitivity of our instruments
   –  Increased granularity and speed of data collection
   –  Capture behavior that occurs infrequently
•  Adapt methods to make the best use of these data
•  New data streams produce new discoveries but should not be allowed to degrade the scientific approach

Page 8: Sdal overview  sallie keller

Big Data - Doesn’t matter what it’s called, only matters what you do with it

•  It is not just about size
   –  “All data revolution”
   –  Traditional and new sources
•  Statistics / analytics
   –  Replication, reproducibility, representativeness
   –  Bias and precision
   –  Descriptive, association, inference
   –  Change point detection
•  Modeling and simulation
   –  Scenario development
   –  Counterfactual analysis

Page 9: Sdal overview  sallie keller

Thrust 2: Metropolitan Analytics

Page 10: Sdal overview  sallie keller

Increasing Urbanization – Opportunities for Innovation

[Figure: Urban and rural population by regions, 1950, 2011 and 2050 (per cent of total population)]

Page 11: Sdal overview  sallie keller

Growth in Cities – Accelerating Role for Urban Leadership

•  Planning and design
•  Enhancing productivity through education
•  Enhancing the quality of life and social engagement
•  Optimizing adaptability to population trends and impact of technology
•  Managing factors influencing climate change
•  Controlling poverty

Page 12: Sdal overview  sallie keller

Data is a New Asset Class – Fuel for the City

Characteristics
•  Multi-sourced
•  Observational
•  Noisy
•  Multi-purposed

Data Infrastructure
•  For whom?
•  With whom?
•  To what end?

Source: Joel Gurin, NYU

Page 13: Sdal overview  sallie keller

Instrumenting a Metropolitan Area

•  Environment: meteorology, pollution, noise, flora, fauna
•  People: relationships, location, economics, communication, activities, health
•  Infrastructure: condition, operations

Page 14: Sdal overview  sallie keller

Arlington County Virginia Tech Partnership

•  Leverage ConnectArlington fiber optics
   –  60 miles of fiber optics that connect over 90 sites
•  Goal: Improve the county’s quality of life and services while accelerating the county’s efficiency and resiliency
•  Initial projects:
   –  Emergency 911 calls, enhancing situational awareness
   –  Relationship of the network of non-profit providers during emergency events and the social cohesion of neighborhoods
   –  Develop a mapping of the “age” of the county
•  Foundation for an urban data infrastructure platform
   –  Useful for research, practice, and policy option development

Page 15: Sdal overview  sallie keller

Build the Infrastructure – Synthetic Information Platform

•  Synthetic refers to information synthesis
•  Ingests information
   –  Data
   –  Judgment
   –  Models (procedures)
•  Protect information
•  Capture interaction patterns
•  Capture and generate dynamics
•  Result: layers of information projected onto a dynamically changing coordinate system

Page 16: Sdal overview  sallie keller

Arlington is But One Example

•  Questions should guide the development of metropolitan analytics if they are to inform decision making
•  Analytics – Provide guidance for metropolitan strategies
   –  To prioritize issues and create solutions
   –  To track performance of system components, outcomes, and impact
   –  To continuously assess the city’s position relative to the competition and discern the factors that result in a competitive advantage for the city
   –  To benchmark against previous performance, other cities, stated goals, other characteristics
•  Infrastructure will make or break the success

Page 17: Sdal overview  sallie keller

Thrust 3: Health Measurement Analytics

“In attempting to arrive at the truth, I have applied everywhere for information, but scarcely an instance have I been able to obtain hospital records fit for any purpose of comparison. If they could be obtained, they would enable us to decide many other questions besides the ones alluded to. They would show subscribers how their money was spent, what amount of good was really being done with it, or whether their money was not doing mischief rather than good.”

Florence Nightingale (1864)

Page 18: Sdal overview  sallie keller

Health and the Environment

Goal: Identify links between air pollution and acute health events at the community level
•  Pathophysiological link between out-of-hospital cardiac arrest and ozone level
•  Case-crossover, time-stratified design
   –  Houston, 2004-2011
   –  11,754 cases
   –  Predictor variable is aggregate ozone over a 3-hour window leading up to the event
   –  A 20 ppb increase in ozone over the previous 1 to 3 hours was associated with a 4.4% increased risk

Ensor et al., Circulation, Volume 127(11):1192-1199
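
To put the reported effect size in context, the following minimal sketch rescales it to other ozone increments. It assumes a log-linear dose-response relationship, which is an assumption made here for illustration and is not stated on the slide; the function name `relative_risk` and its parameters are hypothetical.

```python
# Sketch: rescale the reported association (4.4% increased risk per 20 ppb ozone)
# to other increments. ASSUMPTION: risk is log-linear in ozone concentration.

def relative_risk(delta_ppb, rr_ref=1.044, ref_ppb=20.0):
    """Relative risk for an ozone increase of delta_ppb under log-linearity."""
    return rr_ref ** (delta_ppb / ref_ppb)

if __name__ == "__main__":
    for delta in (5, 10, 20, 40):
        pct = (relative_risk(delta) - 1.0) * 100.0
        print(f"+{delta} ppb ozone -> {pct:.2f}% increased risk")
```

Under this assumption the effect compounds multiplicatively, so a 40 ppb increase corresponds to slightly more than twice the 20 ppb effect.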

Page 19: Sdal overview  sallie keller

Growth in Health Care Spending

•  Health care predicted to reach 20% of GDP by 2020

•  Health care expenditures increased ~29% (2002-2006)

•  Developing a satellite account on medical care spending

•  Data include public and private sources

Source: Congressional Budget Office.

[Figure: Spending on Health Care as Percentage of GDP (1960-2035). x-axis: Year; y-axis: Percentage of GDP; legend includes “All Other Health Care”.]

Aizcorbe et al., Survey of Current Business, June 2012:34-47

Page 20: Sdal overview  sallie keller

Growth in spending varies by disease

Growth in Medical Care Spending, 2002-2006

Category                          Percent
Endocrine                            70.2
Blood                                68.9
Complications of pregnancy           68.9
Residual codes and unclassified      42.5
Musculoskeletal system               38.6
Injury and poisoning                 34.2
Genitourinary system                 30.5
Digestive system                     28.2
Circulatory system                   25.6
Nervous system                       25.3
Neoplasms                            24.0
Mental illness                       16.7
Respiratory system                   14.8
Skin                                  5.8
Symptoms and ill-defined              2.4
Congenital anomalies                 -8.3
Infectious and parasitic             -8.7
Certain perinatal conditions        -28.1
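
The cumulative 2002-2006 growth figures can be converted to average annual rates for easier comparison. The sketch below treats 2002-2006 as four annual compounding periods, which is an assumption; `annualized_growth` is a hypothetical helper, and only a few categories from the table are included.

```python
# Sketch: convert cumulative growth over 2002-2006 into an average annual rate.
# ASSUMPTION: four annual compounding periods between 2002 and 2006.

def annualized_growth(total_pct, years=4):
    """Average annual growth (percent) implied by a cumulative growth (percent)."""
    return ((1.0 + total_pct / 100.0) ** (1.0 / years) - 1.0) * 100.0

# A few categories copied from the table of growth in medical care spending.
growth_2002_2006 = {
    "Endocrine": 70.2,
    "Circulatory system": 25.6,
    "Skin": 5.8,
    "Certain perinatal conditions": -28.1,
}

if __name__ == "__main__":
    for category, total in growth_2002_2006.items():
        print(f"{category}: {total:+.1f}% total, {annualized_growth(total):+.2f}% per year")
```

Applied to the overall ~29% increase in health care expenditures, the same calculation gives roughly 6.6% per year.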

Page 21: Sdal overview  sallie keller

Thrust 4: Education and Workforce Analytics

Management of costs and expenditures
•  Be less reliant on public funding

Achieve higher standards and greater outcomes
•  Produce graduates with job-ready skills whose collective productivity can immediately impact the economy

Training for higher cognitive capabilities
•  Utilize a greater diversity of learning ecosystems
•  Across diverse backgrounds, abilities, and aspirations

Page 22: Sdal overview  sallie keller

State-of-the-Art: Learning Analytics Dashboards

Page 23: Sdal overview  sallie keller

Moving Past the State-of-the-Art

“This eclectic approach is both a strength and a weakness: it facilitates rapid development and the ability to build on established practice and findings, but it – to date – lacks a coherent articulated epistemology of its own.”
Clow, 2012

“What you have is a thermometer with no theory of action behind it. If I have a fever, nothing here is going to tell me how to deal with the fever. All it’s going to do is tell me I have a fever.”
Mark Schneider, AIR, 2013

Integration of theory into analytics
•  Education, pedagogy, learning, and development

Page 24: Sdal overview  sallie keller

Key questions for education analytics

Descriptive
•  Past: What happened? (Historical Reporting)
•  Present: What is happening now? (Assessment Reporting: Alerts)
•  Future: What will happen? (Extrapolations: Alerts)

Associative Inference
•  Past: What variables best explain what happened? (Relationships and Modeling)
•  Present: What interventions seem reasonable? (Options)
•  Future: What is the best/worst that can happen? (Optimization, Simulation)

Causative Inference
•  Past: How and why did it happen? (Theory-based Relationships and Modeling)
•  Present: What interventions are prescribed? (Recommendations)
•  Future: What does theory suggest can happen? (Theory-based Predictions, Optimization, Simulation)

Page 25: Sdal overview  sallie keller

Project Talent

American Institutes for Research Projects in 2014+:
•  Data collection 50 years later
•  Study of veterans, twins, and other groups
•  Role of social media and technology on lives (VBI SDAL)

Page 26: Sdal overview  sallie keller

Child HANDS – Helping Analyze Needed Data Securely

•  Tracking early childhood-related data across public agencies.

•  Tracks individuals over time without tracking personal information

•  Flexible; built to bring in additional data and connect with Virginia Longitudinal Data System
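
One common way to follow individuals across agencies without storing personal identifiers is to replace them with a keyed hash. The sketch below illustrates that general technique only; the slide does not describe how Child HANDS works internally, and the `pseudonym` function, the key, and the sample records are all hypothetical.

```python
# Sketch of keyed-hash pseudonymization for record linkage.
# ASSUMPTION: this is an illustration of a general technique, not the actual
# Child HANDS design. The key and the records are made up.
import hashlib
import hmac

def pseudonym(secret_key: bytes, first: str, last: str, dob: str) -> str:
    """Derive a stable token from identifying fields using HMAC-SHA256."""
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"hypothetical-shared-secret"

# Each agency stores only the token, never the name or birth date.
health_record = {"id": pseudonym(key, "Ada", "Smith", "2010-04-01"), "visit": "well-child check"}
school_record = {"id": pseudonym(key, " ada ", "SMITH", "2010-04-01"), "program": "pre-K"}

# Records for the same child link on the token alone.
linked = health_record["id"] == school_record["id"]

if __name__ == "__main__":
    print("records linked:", linked)
```

Normalizing the fields before hashing lets records match despite differences in case or spacing, and without the secret key the tokens cannot be recomputed from names alone.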

Page 27: Sdal overview  sallie keller
Page 28: Sdal overview  sallie keller

Thrust 5: Industrial Innovation Analytics

Partner with industry to move beyond business analytics and develop information integration technologies able to accelerate innovation through
–  Demand-driven and transparent supply chains
–  Continuous feedback & tracking
–  Dynamic risk analysis
–  Knowledge discovery through social media
–  Diffusion of technology

Page 29: Sdal overview  sallie keller

It is a Brave New World
•  We need a 10-year strategy
   –  The landscape is ever changing: tweets today, what tomorrow?
•  What should government be responsible for?
•  How can non-government data influence government statistics development and official reporting?
•  What alliances need to be shaped and built?
   –  Why should the private sector “openly” participate?
•  Transparency and information sharing are critical
•  Data quality and standards of use need to be formalized
•  Needs are motivated by “common good questions”