SDAL addresses social science in new ways that will transform how we understand the world. Among our goals: creating smart and resilient cities, combatting homelessness, understanding the spread of disease and developing effective public health responses, identifying innovation drivers, and meeting the demand for educated graduates in the field.
Sallie Keller Director and Professor of Statistics
Social and Decision Analytics Laboratory http://sdal.vbi.vt.edu/
Social and Decision Analytics Lab
• Part of Virginia Bioinformatics Institute
• Central to “Information Biology” theme
  – Study of massively interacting systems, from molecular to social phenomena
• Collaboration across VT and beyond
  – Embrace VBI mantra of transdisciplinary team science
Social and Decision Analytics Lab Mission
• Unite statisticians and social scientists to create quantitative methods for evidence-based social and behavioral analyses in support of policy decision-making
• Focus on social issues that will transform our understanding of the world
  – Resilient community ecosystems
  – Influences of public health crises and how to respond to them
  – Economic consequences of an educated society
  – Causes of social disasters such as increased urbanization and homelessness
  – Drivers of innovation
Social and Decision Analytics Lab Thrusts
1. Developing the Science of “All” Data
  • Theory and methods for integrating new and traditional sources of data to understand the social condition
2. Metropolitan Analytics
  • Developing analytics to understand community responses to shocks and the subsequent impact on the well-being of the community
3. Health Measurement Analytics
  • Integration of public and private sector sources of data on medical care spending to unravel costs and impacts for a “healthy” population
4. Education and Labor Force Analytics
  • Developing analytics to understand and shape learning processes, workforce dynamics, and the impact of education on labor force outcomes
5. Industrial Innovation Analytics
  • Partner with industry to move beyond business analytics and develop information integration technologies able to accelerate innovation
Science of Big Data Origins: Bioinformatics
• Modern “bioinformatics” has grown to encompass interactions among many levels and properties of individuals, groups and environments
• It is computationally enabled science
  – Increasingly blurring the distinction between social, environmental, socio-technical and biological domains
  – Unprecedented uptake of diverse unstructured information
  – In-silico capture and generation of detailed interaction dynamics
• Relevance to strategic & tactical decision making
Thrust 1: Developing the Science of Big Data
Big Data and Social Systems
• Research has traditionally been based on statistically designed experiments and surveys
  – Clean, well-controlled, limited in scale (~10^3) and/or resolution
• Bringing “Big Data” to bear
  – Data informed high performance computing
  – Complex socio-technical systems
  – Quantitative social science methods and practice at scale
New Data Flows for Social Behavioral Observing
• Data collected faster, while individuals are in the act of behaving in real-life situations
  – Increase in the sensitivity of our instruments
  – Increased granularity and speed of data collection
  – Capture behavior that occurs infrequently
• Adapt methods to make the best use of these data
• New data streams produce new discoveries but should not be allowed to degrade the scientific approach
• It is not just about size
  – “All data revolution”
  – Traditional and new sources
• Statistics / analytics
  – Replication, reproducibility, representativeness
  – Bias and precision
  – Descriptive, association, inference
  – Change point detection
• Modeling and simulation
  – Scenario development
  – Counterfactual analysis
Big Data - Doesn’t matter what it’s called, only matters what you do with it
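The slide lists change point detection without naming a method. As a minimal sketch only, the following Python finds a single shift in the mean of a series by trying every split and keeping the one with the smallest within-segment squared error. The function name `single_change_point` and the sample data are hypothetical and are not taken from the deck.

```python
from statistics import fmean


def single_change_point(series):
    """Return the index k (1 <= k < n) at which splitting the series into
    series[:k] and series[k:] minimizes the total squared deviation of each
    segment from its own mean. Assumes at most one shift in the mean."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        left, right = series[:k], series[k:]
        left_mean, right_mean = fmean(left), fmean(right)
        cost = (sum((x - left_mean) ** 2 for x in left)
                + sum((x - right_mean) ** 2 for x in right))
        if cost < best_cost:  # strict '<' keeps the earliest best split
            best_k, best_cost = k, cost
    return best_k


# Hypothetical daily counts with a level shift partway through.
counts = [10, 11, 9, 10, 10, 20, 21, 19, 20]
k = single_change_point(counts)
print("estimated change point index:", k)
```

An exhaustive search like this is quadratic in the series length, which is acceptable for short series; large streaming sources would need an incremental method instead.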
Thrust 2: Metropolitan Analytics
Urban and rural population by region, 1950, 2011 and 2050 (per cent of total population)
Increasing Urbanization – Opportunities for Innovation
• Planning and design
• Enhancing productivity through education
• Enhancing the quality of life and social engagement
• Optimize adaptability to population trends and impact of technology
• Managing factors influencing climate change
• Controlling poverty
Growth in Cities – Accelerating Role for Urban Leadership
Source: Joel Gurin, NYU
Characteristics
• Multi-sourced
• Observational
• Noisy
• Multi-purposed
Data Infrastructure
• For whom?
• With whom?
• To what end?
Data is a New Asset Class – Fuel for the City
• Environment: meteorology, pollution, noise, flora, fauna
• People: relationships, location, economics, communication, activities, health
• Infrastructure: condition, operations
Instrumenting a Metropolitan Area
• Leverage ConnectArlington fiber optics
  – 60 miles of fiber optics that connect over 90 sites
• Goal: Improve the county’s quality of life and services while accelerating the county’s efficiency and resiliency
• Initial projects:
  – Emergency 911 calls, enhancing situational awareness
  – Relationship of network of non-profit providers during emergency events and social cohesion of neighborhoods
  – Develop a mapping of the “age” of the county
• Foundation for an urban data infrastructure platform
  – Useful for research, practice, and policy option development
Arlington County Virginia Tech Partnership
• Synthetic refers to information synthesis
• Ingests information
  – Data
  – Judgment
  – Models (procedures)
• Protect information
• Capture interaction patterns
• Capture and generate dynamics
• Result: layers of information projected onto a dynamically changing coordinate system
Build the Infrastructure – Synthetic Information Platform
• Questions should guide the development of metropolitan analytics if they are to inform decision making
• Analytics
  – Provide guidance for metropolitan strategies
  – To prioritize issues and create solutions
  – To track performance of system components, outcomes, and impact
  – To continuously assess its position relative to the competition and discern the factors that result in a competitive advantage for the city
  – To benchmark against previous performance, other cities, stated goals, other characteristics
• Infrastructure will make or break the success
Arlington is But One Example
“In attempting to arrive at the truth, I have applied everywhere for information, but scarcely an instance have I been able to obtain hospital records fit for any purpose of comparison. If they could be obtained, they would enable us to decide many other questions besides the ones alluded to. They would show subscribers how their money was spent, what amount of good was really being done with it, or whether their money was not doing mischief rather than good.”
Florence Nightingale (1864)
Thrust 3: Health Measurement Analytics
Health and the Environment
Goal: Identify links between air pollution and acute health events at community level
• Pathophysiological link between out-of-hospital cardiac arrest and ozone level
• Case-crossover, time-stratified design
  – Houston, 2004-2011
  – 11,754 cases
  – Predictor variable is aggregate ozone over a 3-hour window leading up to the event
  – A 20 ppb increase in ozone over the previous 1 to 3 hours was associated with a 4.4% increased risk
Ensor, et al., Circulation, Volume 127(11):1192-1199
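To make the study design above concrete, here is a hedged Python sketch of two pieces of a time-stratified case-crossover setup: choosing referent times (same weekday and clock hour within the event's calendar month, which is the usual time-stratified convention) and averaging hourly ozone over the window before a time point. The helper names, the use of a simple mean as the "aggregate", and the log-linear rescaling of the reported 4.4% per 20 ppb figure are all assumptions for illustration; they are not taken from the cited paper.

```python
import calendar
import math
from datetime import datetime, timedelta


def referent_times(event):
    """Referents for a time-stratified design: every other day in the same
    calendar month that falls on the same weekday, at the same clock time."""
    days_in_month = calendar.monthrange(event.year, event.month)[1]
    candidates = (event.replace(day=d) for d in range(1, days_in_month + 1))
    return [t for t in candidates
            if t.weekday() == event.weekday() and t != event]


def window_mean(hourly_ozone, t, hours=3):
    """Mean ozone (ppb) over the `hours` hourly readings preceding time t.
    `hourly_ozone` maps datetime (on the hour) -> ppb. Using the mean as the
    aggregate is an assumption."""
    readings = [hourly_ozone[t - timedelta(hours=h)]
                for h in range(1, hours + 1)]
    return sum(readings) / len(readings)


def relative_risk(delta_ppb, rr_per_20ppb=1.044):
    """Rescale the reported association to another increment, assuming a
    log-linear exposure-response relationship (an assumption)."""
    beta = math.log(rr_per_20ppb) / 20.0
    return math.exp(beta * delta_ppb)


# Hypothetical event time; real case data are not reproduced here.
event = datetime(2010, 7, 14, 15)
print([t.date().isoformat() for t in referent_times(event)])
print(round(relative_risk(10), 4))
```

Each case is compared only with its own referent periods, so characteristics of the person that do not change within the month drop out of the comparison.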
Growth in Health Care Spending
• Health care predicted to reach 20% of GDP by 2020
• Health care expenditures increased ~29% (2002-2006)
• Developing a satellite account on medical care spending
• Data include public and private sources
Source: Congressional Budget Office.
[Figure: Spending on Health Care as Percentage of GDP (1960-2035). X-axis: Year; y-axis: Percentage of GDP; legend includes “All Other Health Care”.]
Aizcorbe et al., Survey of Current Business, June 2012:34-47
Growth in Medical Care Spending, 2002-2006 (percent)
Endocrine                          70.2
Blood                              68.9
Complications of pregnancy         68.9
Residual codes and unclassified    42.5
Musculoskeletal system             38.6
Injury and poisoning               34.2
Genitourinary system               30.5
Digestive system                   28.2
Circulatory system                 25.6
Nervous system                     25.3
Neoplasms                          24.0
Mental illness                     16.7
Respiratory system                 14.8
Skin                                5.8
Symptoms and ill-defined            2.4
Congenital anomalies               -8.3
Infectious and parasitic           -8.7
Certain perinatal conditions      -28.1
Growth in spending varies by disease
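The growth figures above are cumulative over 2002-2006. As a rough aid to reading them, the sketch below converts a cumulative percentage into a compound annual rate. Treating 2002-2006 as four years of growth is an assumption, and `annualized_growth` is a hypothetical helper, not part of the cited analysis.

```python
def annualized_growth(cumulative_pct, years):
    """Convert cumulative percent growth over `years` into the equivalent
    compound annual growth rate, in percent."""
    return ((1 + cumulative_pct / 100.0) ** (1.0 / years) - 1) * 100.0


# A few categories from the table, plus the ~29% overall figure.
growth_2002_2006 = {
    "All health care (approx.)": 29.0,
    "Endocrine": 70.2,
    "Circulatory system": 25.6,
    "Certain perinatal conditions": -28.1,
}

for name, pct in growth_2002_2006.items():
    rate = annualized_growth(pct, years=4)  # assumes a four-year span
    print(f"{name}: {pct:+.1f}% cumulative, {rate:+.2f}% per year")
```

Under that assumption, the overall ~29% increase corresponds to roughly 6.6% per year.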
Thrust 4: Education and Workforce Analytics
Management of costs and expenditures
• Be less reliant on public funding
Achieve higher standards and greater outcomes
• Produce graduates with job-ready skills whose collective productivity can immediately impact the economy
Training for higher cognitive capabilities
• Utilize a greater diversity of learning ecosystems
• Across diverse backgrounds, abilities, and aspirations
State-of-the-Art: Learning Analytics Dashboards
Moving Past the State-of-the-Art
“This eclectic approach is both a strength and a weakness: it facilitates rapid development and the ability to build on established practice and findings, but it - to date - lacks a coherent articulated epistemology of its own.”
Clow, 2012
“What you have is a thermometer with no theory of action behind it. If I have a fever, nothing here is going to tell me how to deal with the fever. All it’s going to do is tell me I have a fever.”
Mark Schneider, AIR, 2013
Integration of theory into analytics • Education, pedagogy, learning, and development
Descriptive
• Past: What happened? (Historical Reporting)
• Present: What is happening now? (Assessment Reporting: Alerts)
• Future: What will happen? (Extrapolations: Alerts)
Associative Inference
• Past: What variables best explain what happened? (Relationships and Modeling)
• Present: What interventions seem reasonable? (Options)
• Future: What is the best/worst that can happen? (Optimization, Simulation)
Causative Inference
• Past: How and why did it happen? (Theory-based Relationships and Modeling)
• Present: What interventions are prescribed? (Recommendations)
• Future: What does theory suggest can happen? (Theory-based Predictions, Optimization, Simulation)
Key questions for education analytics
Project Talent
American Institutes for Research
Projects in 2014+:
• Data collection 50 years later
• Study of veterans, twins, and other groups
• Role of social media and technology on lives (VBI SDAL)
Child HANDS – Helping Analyze Needed Data Securely
• Tracking early childhood-related data across public agencies.
• Tracks individuals over time without tracking personal information
• Flexible; built to bring in additional data and connect with Virginia Longitudinal Data System
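The slides do not say how Child HANDS links records over time without tracking personal information. One common general technique, shown here purely as an illustration, is to replace identifying fields with a keyed hash (HMAC) so that agencies sharing a secret key derive the same opaque token for the same person. The function name `pseudonym`, the field choices, and the key handling are hypothetical and make no claim about the actual system.

```python
import hashlib
import hmac


def pseudonym(secret_key: bytes, *identifying_fields: str) -> str:
    """Derive a stable, opaque token from identifying fields.
    Fields are trimmed and lower-cased so trivial formatting differences
    between agencies do not break the match."""
    normalized = "|".join(f.strip().lower() for f in identifying_fields)
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical records from two agencies; the key would be managed securely
# and never stored alongside the data.
key = b"example-key-for-illustration-only"
agency_a = pseudonym(key, "Jane Doe", "2010-03-15")
agency_b = pseudonym(key, "  jane doe ", "2010-03-15")
print(agency_a == agency_b)
```

Only the token is stored with each record, so rows can be joined across datasets and years without the name or birth date appearing in the linked data. In practice, inconsistent spellings and key management need more care than this sketch shows.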
Thrust 5: Industrial Innovation Analytics
Partner with industry to move beyond business analytics and develop information integration technologies able to accelerate innovation through
  – Demand-driven and transparent supply chains
  – Continuous feedback & tracking
  – Dynamic risk analysis
  – Knowledge discovery through social media
  – Diffusion of technology
It is a Brave New World
• We need a 10-year strategy
  – The landscape is ever changing: tweets today, what tomorrow?
• What should government be responsible for?
• How can non-government data influence government statistics development and official reporting?
• What alliances need to be shaped and built?
  – Why should the private sector “openly” participate?
• Transparency and information sharing are critical
• Data quality and standards of use need to be formalized
• Needs are motivated by “common good questions”