
Data entry – principles and practices

Module 2 Session 2


Overview

This session is concerned with the principles and practices of data entry so that participants can:

i. advise others on how to do effective data entry

ii. explain principles of good data entry through practice with a small set of data


Data management cycle (diagram)

Stages shown: Conception; Design survey; Design questionnaire; Enumerators collect data in the field; Data entered onto computer; Manual checking, editing etc.; Computer data management; Data analysis; Reporting of results.

Now we start looking at entering data.


Contents

Review different types of questions that can be found on questionnaires

Review different data types

Enter a small dataset onto the computer

Summarise steps in data entry, and principles of good data entry

The Epi Info software is used for data entry.


Learning Objectives

At the end of this session participants should be able to:

enter questionnaire data onto the computer
summarise the steps in the data entry process
produce a checklist of data entry principles
describe double data entry


Questions and data types

A preliminary review of the questions on a questionnaire gives the data entry person an idea of:

the types of data to be entered
the complexity of the data to be entered
the quality of the data on the questionnaires.

It is also essential for designing the computer data entry screens.

Here we look at some example questions.


Types of questions

What is (NAME'S) age in completed years?

For how long has (NAME) stayed in the household during the last 12 months? (In months)

These are examples of numeric data.

The first takes values of 1, 2, 3, etc. Units are in years. What is the maximum value?

The second can take values of 0, 1, 2, 3, etc. Units are in months. Duration cannot be more than 12 months.
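A range check of this kind can be sketched in a few lines of Python. This is only an illustration: the field names and the upper limit of 120 years for age are assumptions, not values taken from the questionnaire; the 12-month limit is the one stated above.

```python
def check_range(value, low, high):
    """Return True when value is a whole number between low and high inclusive."""
    return isinstance(value, int) and low <= value <= high

# Assumed limits: the 0-120 bound on age is illustrative only.
AGE_RANGE = (0, 120)
# Duration in the household during the last 12 months, in months.
MONTHS_RANGE = (0, 12)

# Hypothetical record with one impossible value (14 months out of 12).
record = {"age_years": 34, "months_in_household": 14}

errors = []
if not check_range(record["age_years"], *AGE_RANGE):
    errors.append("age_years out of range")
if not check_range(record["months_in_household"], *MONTHS_RANGE):
    errors.append("months_in_household out of range")

print(errors)
```

Data entry software usually lets you set such limits on the entry screen itself, so that impossible values are rejected as they are typed.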


Types of questions

This is an example of categorical data.

It has two possible values, male and female, coded as 1 and 2. The codes are entered onto the computer.

Sex

Male.................... 1

Female................ 2

Other similar examples are Yes/No types of response. Coding is often Yes = 1, No = 0; or Yes = 1, No = 2.

[Coding should be consistent throughout.]
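As a rough sketch of how codes and labels relate, the snippet below keeps a codebook for the Sex question and rejects any code that is not in it. The names `SEX_CODES` and `decode` are made up for this illustration.

```python
# Codebook for the Sex question, as coded on the slide.
SEX_CODES = {1: "Male", 2: "Female"}

def decode(code, codebook):
    """Return the label for a code; raise ValueError for a code not in the codebook."""
    if code not in codebook:
        raise ValueError(f"invalid code: {code!r}")
    return codebook[code]

# Hypothetical codes as keyed in by the data entry person.
entered = [1, 2, 2, 1]
labels = [decode(c, SEX_CODES) for c in entered]
print(labels)
```

Only the codes are stored in the data file; the labels are looked up when needed, which keeps entry fast and the coding consistent.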


Types of questions

This is also categorical data.

There are 12 possible values, coded 1 to 12. The computer system needs sufficient space to enter numbers of up to 2 digits.

What is the relationship of (NAME) to the head of household?

Head ................................. 1
Spouse ............................... 2
Son/daughter ......................... 3
Grand child .......................... 4
Step child ........................... 5
Parent of head or spouse ............. 6
Sister/Brother of head or spouse ..... 7
Nephew/Niece ......................... 8
Other relatives ...................... 9
Servant .............................. 10
Non Relative ......................... 11
Others ............................... 12


Types of questions

This is also a categorical variable.

Are the categories in any particular order?

Are the categories mutually exclusive?

What is [NAME'S] current schooling status?

Never attended ....................... 01
Left school .......................... 02
Currently attending:
  Nursery ............................ 03
  Primary ............................ 04
  Post primary ....................... 05
  Secondary .......................... 06
  Post secondary** ................... 07
  A diploma course ................... 08
  University ......................... 09
  Apprenticeship ..................... 10


Multiple response questions

Multiple response questions can be in the form of:

Multiple dichotomy
Responses listed but not ordered
Ranked, e.g. list 1st, 2nd, 3rd.

How should these be entered?


Example: Multiple dichotomy

Question from UNHS, S5b10.

Does this household own any of the following? (Yes = 1, No = 2)

Motor vehicle ..... 1
Motor cycle ....... 2
Bicycle ........... 1
Boat/canoe ........ 2
Donkey ............ 2


Example: Listed but not ordered multiple responses

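UNHS S3a3.

What sort of sickness/injury did [X] suffer? (column 3)

Malaria ...................... 01
Respiratory .................. 02
Measles ...................... 03
Diarrhoea .................... 04
AIDS ......................... 05
Pregnancy related problems ... 06
Dental ....................... 07
Accident ..................... 08
Intestinal worms ............. 09
Skin infections .............. 10
Others ....................... 11

If code 01 (malaria) in column (3):

5. What type of drug did [X] take?

None ......................... 1
Chloroquine .................. 2
Fansidar ..................... 3
Camaquine .................... 4
Quinine ...................... 5
Panadol ...................... 6
Aspirin ...................... 7
Herbs ........................ 8
Others ....................... 9

Example responses: (5a) 2, (5b) 5, (5c) 3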


Example: Ranked multiple responses

UNHS S3bq3: What are the main channels of communication from which you receive AIDS/HIV information and education? (Note that the three most important channels should be ranked in order.)

(Use codes at the bottom of page.)

1st (3): 08     2nd (4): 01     3rd (5): 07

Channels of communication (codes for col. (3), (4) and (5)):

Radio ... 01     Posters ...... 05     Teachers ............ 09
TV ...... 02     Billboards ... 06     Political leaders ... 10
Film .... 03     Family ....... 07     Trad. leaders ....... 11
Drama ... 04     Friends ...... 08     Religious leaders ... 12


Computerisation

Dichotomous multiple response questions require one column for each Yes/No (or 1/0) response, each one indicating whether the respondent ticked or did not tick that item in the list.

Ordered or ranked multiple responses can have as many columns as there are alternatives in the question, with the first column recording the most important, the second the next most important, etc.
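The column layout described above can be sketched as follows, using the responses shown in the earlier UNHS examples. The column names and the household ID 101 are hypothetical, chosen only for this illustration.

```python
# Multiple dichotomy: one column per item, each holding Yes = 1 or No = 2.
ASSETS = ["motor_vehicle", "motor_cycle", "bicycle", "boat_canoe", "donkey"]
asset_responses = [1, 2, 1, 2, 2]
asset_columns = dict(zip(ASSETS, asset_responses))

# Ranked responses: one column per rank; the first holds the most important channel.
ranked_channels = [8, 1, 7]   # Friends, Radio, Family
rank_columns = {
    f"channel_rank{i}": code
    for i, code in enumerate(ranked_channels, start=1)
}

# A channel should not be ranked twice for the same respondent.
ranks_are_distinct = len(set(ranked_channels)) == len(ranked_channels)

# One row of the data table for this household (hh_id is a made-up identifier).
row = {"hh_id": 101, **asset_columns, **rank_columns}
print(row)
print(ranks_are_distinct)
```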


More complex questions

Did [NAME] fall sick or get injured during the last 30 days?

Yes ................... 1
No .................... 2
Don't know ............ 3
(If no or don't know, skip to col. (11).)

If code 1 in col [2]:

What sort of sickness/injury did [NAME] suffer?

Malaria ...................... 01
Respiratory .................. 02
Measles ...................... 03
Diarrhoea .................... 04
AIDS ......................... 05
Pregnancy related problems ... 06
Dental ....................... 07
Accident ..................... 08
Intestinal infections ........ 09
Skin infections .............. 10
Hypertension ................. 11
Ulcers ....................... 12
Mental illness ............... 13
Other fever .................. 14
Others ....................... 15

How many days were lost (suffered) by [NAME] due to the illness/injury?

If code 01 (Malaria) in col (3): What type of drugs did [NAME] take?

None ......................... 1
Chloroquine .................. 2
Fansidar ..................... 3
Camaquine .................... 4
Quinine ...................... 5
Panadol ...................... 6
Aspirin ...................... 7
Herbs ........................ 8
Other (specify) .............. 9

How should these data be entered?
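One point to watch when entering questions like these is the skip pattern: the follow-up columns should be filled only when the earlier answers call for them. The sketch below shows one possible consistency check; the field names are hypothetical and skipped questions are assumed to be stored as `None` (or an empty list for the drug columns).

```python
def check_skips(rec):
    """Return a list of skip-pattern problems found in one person's record."""
    problems = []
    sick = rec.get("sick_last_30_days")   # 1 = Yes, 2 = No, 3 = Don't know
    illness = rec.get("illness_code")     # 1 to 15, None if skipped
    days = rec.get("days_lost")           # None if skipped
    drugs = rec.get("drug_codes", [])     # empty list if skipped

    if sick in (2, 3):
        # No or don't know: all follow-up questions should have been skipped.
        if illness is not None or days is not None or drugs:
            problems.append("follow-up answered although person not reported sick")
    elif sick == 1:
        if illness is None:
            problems.append("illness code missing")
        elif illness != 1 and drugs:
            # The drug question applies only to code 01 (Malaria).
            problems.append("drug question answered but illness is not malaria")
    else:
        problems.append("invalid code for sick_last_30_days")
    return problems

good = {"sick_last_30_days": 1, "illness_code": 1, "days_lost": 3, "drug_codes": [2, 5, 3]}
bad = {"sick_last_30_days": 2, "illness_code": 4, "days_lost": None, "drug_codes": []}
print(check_skips(good))
print(check_skips(bad))
```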


Missing values

Surveys will always have missing data.

Data can be missing for a variety of reasons:

respondent did not know the answer
respondent refused to answer
question was not applicable
question was missed by the fieldworker
response was not recorded clearly
etc.


Coding missing data

Assigning codes to missing data avoids blanks in the data.

The code must not be a possible value.

For numeric data (e.g. age) a negative value is often used (e.g. -99).

For categorical data use a code higher than any valid code for the question (e.g. 99).


Missing value codes

Different codes could be used for different types of missing data:

99 or -99 = question missed by fieldworker
88 or -88 = question not applicable
77 or -77 = don't know or refused to answer

Codes should be consistent throughout.
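Missing value codes must be set aside before any calculation, otherwise a value such as -99 would be treated as a real age. The following sketch shows one way of doing this; the function name and the example ages are invented for illustration.

```python
# Negative codes for numeric questions, positive codes for categorical ones.
NUMERIC_MISSING = {
    -99: "question missed by fieldworker",
    -88: "question not applicable",
    -77: "don't know or refused to answer",
}
CATEGORICAL_MISSING = {
    99: "question missed by fieldworker",
    88: "question not applicable",
    77: "don't know or refused to answer",
}

def split_missing(value, codes):
    """Return (value, reason); value is None when it is one of the missing codes."""
    if value in codes:
        return None, codes[value]
    return value, None

# Hypothetical ages as entered, including two missing value codes.
ages = [34, -99, 5, -77]
valid_ages = [
    v for v, _reason in (split_missing(a, NUMERIC_MISSING) for a in ages)
    if v is not None
]
mean_age = sum(valid_ages) / len(valid_ages)
print(valid_ages, mean_age)
```

Keeping separate code sets matters here: 77 is a possible age, so it must not be treated as missing for a numeric question.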


Unique Identifier

Each set of data should have a unique identifier, often referred to as a primary key.

In household surveys, for example, you often have a Household ID. This would be unique for each household and enables you to easily find the data for the household.
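A simple check that the identifier really is unique might look like the sketch below. The field name `hh_id` and the records are hypothetical.

```python
from collections import Counter

def duplicate_ids(records, key="hh_id"):
    """Return the identifiers that occur more than once, in sorted order."""
    counts = Counter(rec[key] for rec in records)
    return sorted(k for k, n in counts.items() if n > 1)

# Hypothetical household records; ID 101 has been entered twice.
records = [
    {"hh_id": 101, "hh_size": 4},
    {"hh_id": 102, "hh_size": 6},
    {"hh_id": 101, "hh_size": 3},
]
print(duplicate_ids(records))
```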


Activity 2

In pairs, look at questionnaires.

Identify types of questions, and types of data.

Class discussion.


Brief introduction to Epi Info…

Epi Info is a series of freely distributable programs for Microsoft Windows for managing databases (especially public health ones).

It can be used to customise the data entry process (layout similar to the questionnaire), and to enter and analyse data.


Brief introduction to Epi Info…

Epi Info contains:

Projects (file, .mdb), which have:

Views: information about the screen appearance, or how the survey looks, and how data are entered into the data table. A view has fields (variables), which are created to hold data.

Data tables: these store the data.


Data entry in Epi Info

Points to note:

A View can span several “pages”
Space is assigned for “Other, specify” text
Questions can be skipped if not relevant

Demonstrate data entry using the Household Survey data


Activity 4 & 5

Entering a small dataset into Epi Info.

Record some principles of good data entry.

Record the steps in the data entry process.


Double data entry

Data entry needs to be checked.

If the dataset is small, it can be printed out and checked manually.

If the dataset is large, this can be resource-intensive and time-consuming. How many records do you need to check?

Double data entry: the dataset is entered twice (by different people) and the two datasets are compared.

Discrepancies are checked and corrected.
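The comparison step can be illustrated with a short Python sketch. This is not how any particular software implements it; it simply matches records from two entries on a unique identifier (assumed here to be called `hh_id`) and lists the fields that differ.

```python
def compare_entries(first, second, key="hh_id"):
    """Return (id, field, value_in_first, value_in_second) for every discrepancy."""
    a = {rec[key]: rec for rec in first}
    b = {rec[key]: rec for rec in second}
    diffs = []
    for rec_id in sorted(a.keys() | b.keys()):
        if rec_id not in a or rec_id not in b:
            # The record was entered by only one of the two people.
            diffs.append((rec_id, "<record>", rec_id in a, rec_id in b))
            continue
        for field in sorted(a[rec_id].keys() | b[rec_id].keys()):
            va, vb = a[rec_id].get(field), b[rec_id].get(field)
            if va != vb:
                diffs.append((rec_id, field, va, vb))
    return diffs

# Hypothetical data: the second person transposed the digits of one age.
entry1 = [{"hh_id": 101, "age": 34, "sex": 1}, {"hh_id": 102, "age": 27, "sex": 2}]
entry2 = [{"hh_id": 101, "age": 43, "sex": 1}, {"hh_id": 102, "age": 27, "sex": 2}]
diffs = compare_entries(entry1, entry2)
print(diffs)
```

Each discrepancy is then checked against the paper questionnaire to decide which entry is correct.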


Data Compare Utility

Utilities -> Data Compare

File -> New Script

Step 1: Epi Info View – select the files to compare
Step 2: Checks that the structure of the files is the same
Step 3: Select the unique identifier
Step 4: Select the fields to compare (all)

View -> Read-Only

Demonstration of Data Compare using data1 and data2


Activity 7

Use the Data Compare utility to compare data entered in Activity 4 with data entered by another group
