Ethical use of Twitter for #DigDisDet

  • Caitlin Rivers

    Virginia Bioinformatics Institute

    Virginia Tech

    Ethical use of Twitter for #DigDisDet

  • Motivation

    • Willowbrook Hepatitis Study (1963-1966)

    • Brooklyn Jewish Chronic Disease Cancer Study (1963)

    • Stanford Prison Experiment (1971)

    • Tuskegee Syphilis Study (1932-1972)

  • Motivation

    • Laud Humphreys, sociologist (1960s)

    • “To avoid bias, Humphreys secretly followed some men and recorded the license number on their vehicles. A year later, Humphreys showed up at their private homes and claimed to be a health service interviewer. He asked them questions about their marital status, race, job, and other personal questions”

    -Historical Cases of Unethical Research

    Serena Marsden & Melissa Melander

    University of North Dakota

    http://www.und.edu/instruct/wstevens/PROPOSALCLASS/MARSDEN&MELANDER2.htm

  • ● ‘Microblogging’ social media service

    ● Connecting with people who share interests

    ● Default privacy is ‘open’

    ● ~500 million users

    ● ~340 million tweets sent daily around the world

    Twitter in a nutshell

  • Twitter API

    ● Application programming interface (API)

    ● Most convenient API of the social networks

    ● Streaming API provides ~1% of tweets

    ● Search term and author-specific APIs also available

    ● API accounts freely available

    Wikimedia Commons

  • Twitter API

    ● Data streamed include:

      ○ Tweet text

      ○ Username

      ○ Timestamp

      ○ Text location*

      ○ Geolocation*

      ○ Number of friends and followers

      ○ And more...

  • Twitter for research

    ● Population-level research for trends and patterns

    ● Syndromic surveillance (e.g. ILI), vaccine sentiments, disaster response, natural disaster surveillance etc.

    ● User-centric use case possible

      ○ Longitudinal study?

      ○ Contagion within social network?

      ○ Cascades

    www.connectedaction.net

  • Existing guidelines

    “Research involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects.” - Department of Health and Human Services Policy for Protection of Human Research Subjects (45 CFR 46.101(b)(4), a category of research exempt from the policy)

  • Existing guidelines

    “Our Services are primarily designed to help you share information with the world. Most of the information you provide us is information you are asking us to make public. [...] Our default is almost always to make the information you provide public for as long as you do not delete it from Twitter [...]. Your public information is broadly and instantly disseminated.” - Twitter Privacy Policy (Flesch-Kincaid Grade Level 12)

  • Existing Norms

    Wikimedia Commons

    Differing expectations in public vs private spaces

    Counting red shirts at the mall

    Ok

    Counting red shirts in homes

    No

    Following a red shirt around the mall to learn about purchasing behavior

    No

  • Issues

    What if…

    ● users don’t know or understand their data are available?

    ● identifiable data are used in a way that harms the user?

    ● natural privacy boundaries are violated?

    ● All would violate IRB standards, but what about online spaces?

    Chan Lowe, Sun Sentinel

  • Proposed DDD Norms

    www.scienceprogress.org

    Applied to DigDisDet

    Data collected and analyzed in aggregate

    Ok

    Data collected from a specific user

    No

    Data collected from specific users from multiple sources

    No
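    The "Ok" case above, data collected and analyzed in aggregate, might look something like the following minimal sketch. The record layout (`username`, `timestamp`, `text` keys) and the function name `daily_keyword_counts` are assumptions made for illustration and are not the actual Twitter API format; the sample tweets are invented.

```python
from collections import Counter
from datetime import datetime


def daily_keyword_counts(tweets, keyword):
    """Count tweets per calendar day whose text mentions `keyword`.

    Only the timestamp and text are read. Usernames are never touched,
    so the result carries no per-user information.
    """
    counts = Counter()
    needle = keyword.lower()
    for tweet in tweets:
        if needle in tweet["text"].lower():
            day = datetime.fromisoformat(tweet["timestamp"]).date().isoformat()
            counts[day] += 1
    return dict(counts)


# Invented sample records in a hypothetical layout (not real API output).
sample = [
    {"username": "user_a", "timestamp": "2013-01-05T09:30:00", "text": "Home sick with the flu"},
    {"username": "user_b", "timestamp": "2013-01-05T18:02:00", "text": "Flu shot today"},
    {"username": "user_c", "timestamp": "2013-01-06T07:45:00", "text": "Lovely morning"},
    {"username": "user_a", "timestamp": "2013-01-06T11:10:00", "text": "Still have the flu"},
]
daily_counts = daily_keyword_counts(sample, "flu")
```

    The output is a day-by-day tally only, which is the kind of population-level signal the norm permits; nothing in it points back to a specific user.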

  • Proposed DDD Norms

    1. Avoid publishing identifiable data.

    • Tweet text

    • Author handle

    2. Do not use data to procure more data from other sources.

    • “Snowball” sampling using identifying info

    • Surfing linked accounts
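    One possible way to follow norm 1 before sharing a dataset is sketched below: the tweet text is dropped and the author handle is replaced with a keyed hash. This is an illustration under stated assumptions, not a vetted anonymization scheme. The function names, the record layout, and the choice of HMAC-SHA256 are all assumptions, and the key must be kept private for the pseudonym to offer any protection.

```python
import hashlib
import hmac


def pseudonymize(handle, secret_key):
    """Return a short keyed hash standing in for the author handle."""
    normalized = handle.lstrip("@").lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()[:16]


def redact_for_release(tweet, secret_key):
    """Keep only fields meant for release; tweet text and handle are omitted.

    Verbatim text is left out because it can often be searched for
    and traced back to its author.
    """
    return {
        "author_id": pseudonymize(tweet["username"], secret_key),
        "day": tweet["timestamp"][:10],  # date only, assuming ISO 8601 timestamps
    }


# Hypothetical key and record, for illustration only.
key = b"replace-with-a-private-random-key"
raw = {
    "username": "@Example_User",
    "timestamp": "2013-01-05T09:30:00",
    "text": "Home sick with the flu",
}
released = redact_for_release(raw, key)
```

    Note that a stable pseudonym still lets records from the same author be linked to each other, so aggregate counts remain the safer thing to publish where they suffice.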

  • Ideas for DDD Norms

    3. Be especially careful with geographic data.

    • Protect coordinates as you would any other identifying data

    4. Seek IRB approval for individual-based study designs.

    • Likely requires consent

    • Following a user who identifies as depressed
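    For norm 3, one way to protect coordinates is to coarsen them before analysis or release and to suppress sparsely populated cells. The sketch below assumes rounding to one decimal degree (roughly 11 km of latitude) and a minimum cell count of 2; both thresholds, the function names, and the sample points are illustrative assumptions, not recommended values.

```python
from collections import Counter


def coarsen(lat, lon, decimals=1):
    """Round a coordinate pair so it names a grid cell, not a point."""
    return (round(lat, decimals), round(lon, decimals))


def cell_counts(points, decimals=1, min_count=2):
    """Count points per coarse cell, dropping cells below `min_count`.

    Suppressing small cells avoids releasing a cell that may
    correspond to a single person.
    """
    counts = Counter(coarsen(lat, lon, decimals) for lat, lon in points)
    return {cell: n for cell, n in counts.items() if n >= min_count}


# Invented coordinates, for illustration only.
points = [
    (37.2296, -80.4139),
    (37.2284, -80.4234),
    (38.8977, -77.0365),
]
released_cells = cell_counts(points)
```

    Here the two nearby points fall into one cell and are reported as a count, while the lone third point is withheld.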

  • A Cautionary example

    • Name

    • Cell phone number

    • Favorite music, TV, sports, hobbies

    • Doesn’t like to read

    • School he attends

    • Love life

    • Where he vacations

    • Bad habits

    • **His social network

  • Parting Motivation

    • Laud Humphreys, sociologist (1960s)

    • “To avoid bias, Humphreys secretly followed some men and recorded the license number on their vehicles. A year later, Humphreys showed up at their private homes and claimed to be a health service interviewer. He asked them questions about their marital status, race, job, and other personal questions”

    -Historical Cases of Unethical Research

    Serena Marsden & Melissa Melander

    University of North Dakota

    http://www.und.edu/instruct/wstevens/PROPOSALCLASS/MARSDEN&MELANDER2.htm

  • What else can we do?

    Caitlin Rivers, MPH Network Dynamics and Simulation Science Laboratory Virginia Bioinformatics Institute Virginia Tech

    cmrivers@vbi.vt.edu
