Introduction to ESDS Qualidata: Creating and delivering re-usable qualitative data Libby Bishop and Louise Corti ESDS Qualidata RC33 Amsterdam August 2004

Introduction to ESDS Qualidata:

Creating and delivering re-usable qualitative data

Libby Bishop and Louise Corti
ESDS Qualidata

RC33 Amsterdam
August 2004

ESDS Qualidata

Qualitative data collections

• data from Economic and Social Research Council (ESRC) individual and programme research grant awards

• data from ‘classic’ social science studies

• other funders/sources

• focus on DIGITAL Collections, but also facilitate paper-based archiving

Types of qualitative data

• diverse data types: in-depth interviews; semi-structured interviews; focus groups; oral histories; mixed methods data; open-ended survey questions; case notes/records of meetings; diaries/research diaries

• multimedia: audio, video, photos and text (most common is interview transcriptions)

• formats: digital, paper, analogue audio-visual

Classic datasets

• Peter Townsend – Poverty, old age and Katherine Buildings

• Paul Thompson – oral history and Edwardians, social mobility

• Mildred Blaxter – Mothers and Daughters

• National Social Policy and Social Change Archive

Diverse uses for existing data

• enrich context description

• how was it really done: documentation of methods
– team ‘discussions’ about coding
– what, exactly, is ‘semi-structured’?

• augment data you collect
– historical comparative case
– expand sample size

• datasets for teaching

Are data always re-usable?

• restrictions on secondary analysis

• accessible

• coherent

• format
– medium
– layout

• processing before delivery

Good archiving = good research

• thorough documentation

• well organised and labelled files

• major stages of research recorded

• consent, copyright and related issues clarified

Characteristics of a good archived research collection

• intellectual content

• extensive raw data created

• supporting documentation

• consent

• transcription

• identifiers removed

• data listing

Intellectual content

• builds on previous research

• addresses new issues

• innovative approach to discipline

• innovative approach to qualitative methodology

Extensive raw data

• types of research data assembled

– in-depth interviews
– focus groups
– field notes/participant observation
– case study notes

• images and sound recordings

• range of material – broad focus

Describing qualitative data

• Full catalogue record

• Data listing (ID, biog details, date of interviews, media, format, transcript details)

• Online PDF User Guide

• Use/processing notes

• Archival listing for large collections
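
As a rough illustration of the data listing described above, the following Python sketch writes one row per interview to CSV. The column names simply mirror the items on this slide (ID, biog details, date of interview, media, format, transcript details); the class name, the helper function and both sample rows are hypothetical and do not come from any ESDS Qualidata collection.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DataListEntry:
    """One row of a data listing; field names are illustrative assumptions."""
    interview_id: str
    biog_details: str
    interview_date: str
    media: str
    file_format: str
    transcript_details: str


def write_data_list(entries, out):
    """Write the entries as CSV with a header row to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(DataListEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# Hypothetical sample rows, invented purely for illustration.
entries = [
    DataListEntry("int01", "female, b. 1920, retired teacher", "1999-03-12",
                  "audio cassette", "RTF", "full transcript"),
    DataListEntry("int02", "male, b. 1934, police officer", "1999-04-02",
                  "audio cassette", "RTF", "summary only"),
]

buffer = io.StringIO()
write_data_list(entries, buffer)
data_list_csv = buffer.getvalue()
print(data_list_csv)
```

A flat table like this can serve as the point of entry to a collection, because a secondary user can sort or filter it before opening any transcript.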

• CAT RECORD

Supporting documentation

• examples
– funding application
– description of methodology
– communication with informants on confidentiality
– coding schemes/themes
– technical details of equipment
– interview schedules
– end of award report
– documentation from CAQDAS software packages, e.g. analytical memos
– bibliographies, resulting publications

• Anything that adds insight or aids understanding and re-use

• USER GUIDE HERE

User Read File

UK DATA ARCHIVE DOCUMENTATION

4594 - Policing, Cultural Change and 'Structures of Feeling' in Post-War England, 1945-1999

Access conditions
Until 1 May 2008, the depositor's permission must be sought for access - please contact Qualidata at UKDA for further details. Users should note that no access at all is permitted to the Metropolitan Police Commissioner's interview transcript (int54) until 31st January 2005.

Conversion of data and documentation formats
All 65 interview/focus group transcript files were converted to both MS WORD 97 and rich text formats. Both the MS WORD 97 and the rich text files are available to users. The hard copy documentation was scanned and is available as a one volume Acrobat PDF user guide.

Anonymisation
Some limited edits have been made to interview transcripts during processing to protect the identity of respondents. Care has been taken to ensure that this does not compromise the quality of information available.

Data and documentation problems
There are some spelling mistakes in the interview transcriptions (left in situ due to limited processing resources), and the format transfer to Word has produced odd characters within the files in a very few cases. These issues should not present problems for secondary users.

Notes from data delivery and post-order corrections

Transcribing research 1

• integrated into the ongoing research

• full transcriptions or summaries

• avoid stockpiling

• costs and benefits
– self transcription
– internal team transcription
– external transcription

Transcribing research 2

• budget

– estimated number of interviews × 4 hours of transcription per 60-minute tape × hourly salary

• examples of good and bad

• full transcriptions
– consistent layout
– speaker tags
– line breaks
– header with identifier and other details
– checked for errors
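
The budget rule of thumb on this slide can be worked through as a small calculation. The sketch below assumes the rule means roughly four hours of transcription for each 60-minute tape; the function name, its parameters and the example figures (30 interviews, an hourly salary of 12.00) are hypothetical.

```python
def transcription_budget(n_interviews, hourly_salary,
                         tape_minutes=60, hours_per_tape_hour=4.0):
    """Estimate transcription cost.

    Assumes each interview fills one tape of `tape_minutes` and that one
    hour of tape takes `hours_per_tape_hour` hours to transcribe.
    """
    tape_hours = n_interviews * tape_minutes / 60
    return tape_hours * hours_per_tape_hour * hourly_salary


# Hypothetical project: 30 one-hour interviews, transcriber paid 12.00 per hour.
cost = transcription_budget(n_interviews=30, hourly_salary=12.0)
print(f"Estimated transcription cost: {cost:.2f}")
```

Keeping the tape length and the transcription ratio as parameters makes it easy to re-run the estimate for longer interviews or for slower, more detailed transcription.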

Example of good transcription

LP: And how long have you lived in this house?

4G: This house? Four years past April.

LP: And you said you were in, was it Ferrier?

4G: Ferrier Gardens.

LP: For twenty years?

4G: Twenty-four years. Twenty-two doon the stair, and two years up the stair.

Identifiers removed

• confidentiality respected

• anonymisation?

• problems of anonymisation
– applied too weakly
– applied too strongly
– timing
– potential for distortion
– examples

• user undertakings

• appropriate and sympathetic

Listing research data

• contents

• key elements
– general
– specific to project

• template approach

• point of entry

• DATA LIST HERE – EDWARD?

Value of data properly prepared for re-use

• widely disseminated and accessible

• suitable formats for use and preservation

• coherent data and methodology

• appropriate for CAQDAS packages

Preparing qualitative data for sharing

• Sharing requires standards – XML mark-up

• Processing steps:
– Scan
– Optical character recognition (OCR)
– Proof
– Format
– XML mark-up

XML mark-up enables

• Access to content and structure
– Speaker tags
– Coded textual/audio data
– Links to contextual documentation: audio files; fieldnotes; photos; analytical annotations etc.
– Links to other sources via geo-referencing: micro data; aggregate statistics; maps; census data etc.

• Data providers to publish to online systems, such as ESDS Qualidata Online

• Meet needs of researchers requesting a standard they can follow

• Encourage more qualitative data analysis software (CAQDAS) companies to pursue XML-outputs based on this standard

How we get from tifs to…

…XML mark-up ready for online

<u n="31">
…
<s n="44">My father was, in the daytime he was a boilermaker on the old <name type="organisation">North <add place="supralinear">Staffordshire</add><del type="word change">Circular</del>Railway</name> and then every night he played in the theatre orchestra.</s>
…
<s n="46">And he <add place="supralinear">'d to go to</add><del>had got to be at</del> work at six the next morning! <note place="end of paragraph">Cornet player.</note></s>
</u>

Word doc created from OCR

Issues in scanning and OCR

• Scanning done at 300 dpi, grey scale

• OCR accuracy varies hugely with the quality of the original; special challenges include (but are not limited to):

– Character recognition

– Stray marks on page

– Missing words

– Interviewer’s notes

– “Creative” character interpretation: section breaks, font changes, footnotes, super- and sub-scripts, and so on.

• Partially automated with macros, but much judgement (clerical and research) still required

Final Word file (human and Excel readable)

Using Excel macros to create XML transcript

Current final product: basic XML mark-up

<u id="96" who="subject">I would rather nae ken if I had cancer. I told my man that, I says "If I have cancer, don't tell me". I mean you might hae an idea yourself, but I wouldnae like to be telt. I told him that.</u>

<u id="97" who="interviewer">And how has your own health been over the years?</u>

<u id="98" who="subject">Och, up an' doon, y'ken .</u>
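
To suggest how this kind of mark-up gives a program access to speaker tags, the sketch below reads the three utterances with Python's standard xml.etree.ElementTree module and groups them by the who attribute. The wrapping transcript root element is an assumption added so that the fragment parses as a single document, and utterance 96 is shortened here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The three utterances from the slide; the text of utterance 96 is shortened.
fragment = '''<u id="96" who="subject">I would rather nae ken if I had cancer.</u>
<u id="97" who="interviewer">And how has your own health been over the years?</u>
<u id="98" who="subject">Och, up an' doon, y'ken.</u>'''

# Assumption: wrap the fragment in a single root so it is well-formed XML.
root = ET.fromstring(f"<transcript>{fragment}</transcript>")

# Group (id, text) pairs by the speaker recorded in the "who" attribute.
by_speaker = defaultdict(list)
for u in root.iter("u"):
    text = "".join(u.itertext()).strip()
    by_speaker[u.get("who")].append((u.get("id"), text))

for speaker, utterances in by_speaker.items():
    print(speaker, [uid for uid, _ in utterances])
```

The same loop could be extended to pull out only one speaker's turns or to count utterances per speaker, which is hard to do reliably on an unstructured Word file.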

Need for publishing tools

• Once XML schema is more developed, next step is to develop publishing tools to automate as much of mark-up as possible

• Currently using simple scripts to find and mark <u> and <s>; much work still done manually

• Looking into options for automatic mark-up of some components (e.g. natural language processing and information extraction)

• Would like to work more closely with CAQDAS suppliers to ensure use of similar mark-up semantics
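
A simple script of the kind mentioned above might look like the following Python sketch. It is not the script used at ESDS Qualidata; it only illustrates finding speaker tags and sentence boundaries and wrapping them in u and s elements. The transcript root element, the regular expressions and the numbering scheme are assumptions, and the sample lines are taken from the good transcription example earlier in the talk.

```python
import re
import xml.etree.ElementTree as ET

# A speaker tag is assumed to be letters/digits followed by a colon, e.g. "LP:".
SPEAKER_LINE = re.compile(r"^(?P<who>[A-Za-z0-9]+):\s*(?P<text>.+)$")
# Naive sentence boundary: whitespace that follows ., ? or !
SENTENCE_END = re.compile(r"(?<=[.?!])\s+")


def mark_up(lines):
    """Wrap each speaker turn in <u> and each sentence in <s>."""
    root = ET.Element("transcript")
    u_count = 0
    s_count = 0
    for line in lines:
        match = SPEAKER_LINE.match(line.strip())
        if not match:
            continue  # lines without a speaker tag are skipped in this sketch
        u_count += 1
        u = ET.SubElement(root, "u", id=str(u_count), who=match["who"])
        for sentence in SENTENCE_END.split(match["text"]):
            s_count += 1
            s = ET.SubElement(u, "s", n=str(s_count))
            s.text = sentence
    return root


transcript = [
    "LP: And how long have you lived in this house?",
    "4G: This house? Four years past April.",
    "LP: And you said you were in, was it Ferrier?",
    "4G: Ferrier Gardens.",
]

root = mark_up(transcript)
print(ET.tostring(root, encoding="unicode"))
```

Punctuation-based sentence splitting breaks on abbreviations and unpunctuated speech, which is one reason much of the mark-up still needs manual checking.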

ESDS Qualidata

• Contact

[email protected][email protected]

– www.esds.ac.uk/qualidata