CAPI Data Quality in Namibia and Lesotho

Gregory Martin

Methodology and Software Development Branch, International Programs, Population Division, U.S. Census Bureau

2016 aapor gregory martin



U.S. Census Bureau

International Programs

Promote international development and capacity building on a reimbursable basis through:

Technical assistance

Training

Software and methodological tools


What is CSPro?

Census and Survey Processing System

First release in 2000

Initially designed for key-from-paper

Product suite: data entry, edits and imputations, tabulation

Public domain


Who Uses CSPro?

Used in over 160 countries by:

National statistical offices (NSOs)

Other ministries (health, education)

Non-governmental organizations (NGOs)

Universities

Hospitals

Private sector


CSPro Applications

Censuses (population and housing, agriculture, and economic)

Demographic and labor force surveys

Health and nutrition surveys

Household income and expenditure surveys

Major international projects:

Demographic and Health Survey (DHS), ICF International

Multiple Indicator Cluster Survey (MICS), UNICEF


CAPI History

Windows Mobile (2006)

Unicode support (2012)

CSEntry Android application (2014)


CSPro Data Entry Features

Develop and test on Windows

Target multiple platforms

Multiple languages

Tightly controlled path

Powerful programming language

Data synchronization


USAID-Funded Survey Project

Namibia Statistics Agency

Namibia Household Income and Expenditure Survey (NHIES): 2015/2016

Rolling sample survey

Demographic, economic, and consumption modules

Multiple visits to a household

Roughly 10,000 households

10” Samsung tablets


UNFPA-Funded Census Project

Lesotho Bureau of Statistics

Lesotho Population and Housing Census: 2016

Single enumeration period

Around two million people

7” and 8” tablets from Lenovo, xTouch, and Click


CAPI Considerations

Designing applications as a distributed team

GPS and mapping

Case management

In-field automated sampling

Monitoring the quality of data collection

Data synchronization and backups

IT support


Distributed Application Design

CSPro application files are all text files and work well in revision control systems

Teams used Git with the SourceTree UI

Namibia: Privately hosted using Bonobo Git Server

Lesotho: Publicly hosted using GitHub:

https://github.com/lso-statistics/Lesotho-census-2016.git


Distributed Application Design

Advantages:

Revision history

Changes to programs tracked, documented, and reviewed

Sense of ownership of and responsibility over modifications

Ability to simultaneously work on programs

Undo ability

Disadvantages:

Took some time for teams to adjust to paradigm shift

Occasional difficulties merging changes


Mapping

Used Google Earth to display enumeration maps

GPS points collected during listing were overlaid on base maps to determine whether households were within the boundary

Not all tablets had data plans, so Google Earth maps had to be cached when Wi-Fi was available
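
The boundary check described above can be sketched as a point-in-polygon test. This is only an illustration, not the projects' code: the function name, the coordinates, and the rectangular enumeration area are hypothetical, and longitude/latitude are treated as planar coordinates, which is a reasonable approximation only over small areas.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices in order; the last vertex
    is implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical enumeration-area boundary and listing points.
ea_boundary = [(17.00, -22.60), (17.10, -22.60), (17.10, -22.50), (17.00, -22.50)]
listing_points = {"HH-001": (17.05, -22.55), "HH-002": (17.20, -22.55)}

# Flag each listed household as inside or outside the boundary.
flags = {hh: point_in_polygon(lon, lat, ea_boundary)
         for hh, (lon, lat) in listing_points.items()}
```

Points that fall exactly on a boundary line are ambiguous under this test, so a real check would probably add a tolerance that reflects the GPS accuracy of the tablets.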


Mapping

“The PSU boundaries are difficult to distinguish since only a small footpath or a dry creek separates one PSU from another. Having a GPS unit is the only way to tell where one PSU ends and another begins.”


Case Management

In Namibia:

Teams consisted of one supervisor and two interviewers, hired for the duration of the survey

Interviews occurred in an area over two weeks

Headquarters staff assigned enumeration areas to supervisors

Interviewers listed households and then transmitted the household listing to the supervisor

The supervisor merged the listing files, and then a 12-household sample was selected by algorithm

Supervisors made household assignments; each household was assigned to a single interviewer, not split between interviewers


Case Management

Assignments could be reassigned, and any data previously collected was maintained

Data collection for a household was split between interviewers and supervisors

The household roster was transmitted from the interviewer to the supervisor to ensure the data quality of the supervisor’s data collection

Interviewers had to validate all household data before transmitting it to a supervisor

Supervisors had to validate all enumeration area data before transmitting it to the headquarters server

If data did not pass validation tests, staff had to explain why it was submitted incomplete or with consistency problems
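
The validate-or-explain rule above can be sketched as a gate in front of transmission. This is a minimal illustration under stated assumptions: the `Household` class, the three checks, and the function names are hypothetical, and the actual applications were written in CSPro, not Python.

```python
from dataclasses import dataclass, field


@dataclass
class Household:
    household_id: str
    members: list = field(default_factory=list)  # each member is a dict
    complete: bool = False


def validate(household):
    """Return a list of human-readable validation failures (hypothetical checks)."""
    problems = []
    if not household.complete:
        problems.append("interview not marked complete")
    if not household.members:
        problems.append("household roster is empty")
    heads = [m for m in household.members if m.get("relationship") == "head"]
    if household.members and len(heads) != 1:
        problems.append("roster must contain exactly one household head")
    return problems


def prepare_transmission(household, explanation=None):
    """Allow transmission only if the data is valid or the failure is explained."""
    problems = validate(household)
    if problems and not (explanation and explanation.strip()):
        raise ValueError("explanation required: " + "; ".join(problems))
    return {
        "household_id": household.household_id,
        "problems": problems,
        # The explanation travels with the data so reviewers can see it later.
        "explanation": explanation if problems else None,
    }
```

Keeping the explanation attached to the transmitted record lets the next level up see why incomplete or inconsistent data was accepted.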


Case Management

Case management and data flow were tightly controlled, but the system allowed for some flexibility

Headquarters staff could provide codes to unlock certain actions:

Working in an area meant for a different supervisor

Resampling an enumeration area

Submitting incomplete data for an enumeration area
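
The slides do not say how the unlock codes were generated. One possible offline scheme, shown here purely as an assumption, derives a short code from a secret shared between headquarters and the tablet, the action, and the enumeration area, so that a code read over the phone unlocks only that one action in that one area. The action names, the secret, and the six-digit length are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical identifiers for the three unlockable actions.
ACTIONS = {"work_other_area", "resample_ea", "submit_incomplete"}


def unlock_code(secret, action, ea_code, digits=6):
    """Derive a short numeric code tied to one action in one enumeration area."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    message = f"{action}:{ea_code}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    # Reduce the first 8 bytes of the digest to a fixed number of digits.
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


def is_valid_code(secret, action, ea_code, entered):
    """Check a code typed on the tablet against the expected one."""
    expected = unlock_code(secret, action, ea_code)
    return hmac.compare_digest(expected.encode("utf-8"), entered.encode("utf-8"))
```

Headquarters would call `unlock_code` and read the result to the supervisor; the tablet would call `is_valid_code` on what the supervisor types in, with no network connection needed.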


Automated Sampling

Systematic sampling routine built into the program

Reduced errors from supervisors not properly following the sampling algorithm or introducing bias in household selection

Ensured that replacement households came from the same stratum as the replaced household
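
A systematic sampling routine of the kind described above might look like the following. This is a sketch under assumptions, not the CSPro routine used in the field: it takes a random start and a fixed, possibly fractional, interval over an ordered household listing, and it does not model strata or replacement households. The sample size of 12 matches the Namibia design; the listing itself is made up.

```python
import random


def systematic_sample(listing, sample_size, rng=None):
    """Select sample_size items from an ordered listing by systematic sampling.

    A random start is drawn in [0, interval) and one unit is taken per
    interval, where interval = len(listing) / sample_size.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    rng = rng or random.Random()
    n = len(listing)
    if n <= sample_size:
        return list(listing)  # too few households listed: take them all
    interval = n / sample_size
    start = rng.random() * interval
    # interval > 1 here, so the truncated positions are always distinct.
    indexes = [min(int(start + k * interval), n - 1) for k in range(sample_size)]
    return [listing[i] for i in indexes]


# Hypothetical merged listing of 100 households in one enumeration area.
listing = [f"HH-{i:03d}" for i in range(1, 101)]
sample = systematic_sample(listing, 12, rng=random.Random(2016))
```

Because the program, not the supervisor, draws the random start, the selection cannot be steered toward convenient households.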


Monitoring Data Quality

In Namibia:

The programs had many consistency checks, and supervisors were not required to review collected data

Regional supervisors could look at data on a case-by-case basis after the data had been transmitted to headquarters


Monitoring Data Quality

In Lesotho:

Looser consistency checks in the programs

Supervisors reinterviewed roughly five households per enumeration area

Reinterview program compared the supervisor-entered data with the data originally entered
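
The comparison step can be sketched as a field-by-field diff of the two records for the same household. The field names, the sample records, and the discrepancy-rate summary are hypothetical; the slides do not describe which items the reinterview covered or how differences were reported.

```python
def compare_interviews(original, reinterview, fields):
    """Return {field: (original_value, reinterview_value)} for mismatches."""
    differences = {}
    for name in fields:
        first = original.get(name)
        second = reinterview.get(name)
        if first != second:
            differences[name] = (first, second)
    return differences


def discrepancy_rate(differences, fields):
    """Share of compared fields on which the two interviews disagree."""
    return len(differences) / len(fields) if fields else 0.0


# Hypothetical records for one household.
fields = ["household_size", "head_sex", "head_age", "dwelling_type"]
original = {"household_size": 5, "head_sex": "F", "head_age": 47, "dwelling_type": 2}
reinterview = {"household_size": 6, "head_sex": "F", "head_age": 47, "dwelling_type": 2}

differences = compare_interviews(original, reinterview, fields)
rate = discrepancy_rate(differences, fields)
```

A supervisor could then follow up only on the households, or the enumerators, whose discrepancy rate stands out.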


Monitoring Data Quality

In Lesotho:

Supervisors viewed reports showing the status of assignments in an enumeration area

Supervisors could also view basic demographic reports

Reports generated using D3.js and displayed on the tablet’s web browser


Data Synchronization

In Namibia:

Supervisors had SIM cards and controlled all transmissions to headquarters

MD5 hashes stored so only modified data was transferred from headquarters

Only newly collected data sent to headquarters

Interviewers did not have SIM cards and updated their programs and files via the supervisor

Transmissions between supervisors and interviewers done via a locally created Wi-Fi hotspot and later Bluetooth
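
The hash-based update can be sketched as follows: each file's MD5 digest is compared with the digest recorded at the previous synchronization, and only files whose digest changed are queued for transfer. The directory layout and function names are assumptions, and the real system ran inside CSPro rather than Python. MD5 is used here only to detect changes, not for security.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_transfer(directory, known_hashes):
    """Return (changed, current): the relative paths whose hash differs from
    known_hashes, and the full mapping of current hashes to store for next time."""
    changed = []
    current = {}
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            key = path.relative_to(directory).as_posix()
            current[key] = md5_of_file(path)
            if known_hashes.get(key) != current[key]:
                changed.append(key)
    return changed, current
```

On the first synchronization `known_hashes` is empty, so every file is sent; afterwards only new or modified files are, which matters when supervisors are transmitting over mobile data.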


Data Synchronization

Lesotho was similar to the Namibia model

Enumerators sent data via Bluetooth to constituency supervisors

Supervisors sent data via FTP to headquarters

Files were updated from headquarters to supervisors to enumerators


IT Support

In Namibia:

An IT coordinator was in charge of two provinces, or 4-6 teams

Headquarters staff set up the 100+ tablets in a single day


IT Support

In Lesotho:

50 IT coordinators distributed throughout the country, in charge of around 6,000 tablets

IT staff took two weeks to set up all of the tablets

Considered using a mobile device management (MDM) system for future deployments

Enumerators could use WhatsApp to ask IT staff questions

WhatsApp also used to send announcements to staff


General Comments

Interviewers generally liked using tablets to conduct a CAPI survey

Data quality was improved due to consistency checks in the programs

In-field coding using lookups eliminated the need for a time-consuming coding process at headquarters

Lesotho used three different brands of tablets, which led to some challenges, including subpar GPS readings on cheaper tablets; Namibia, using only one brand, had an easier time managing the equipment

CAPI worked well for Namibia’s rolling survey, allowing easy updating of the programs and the ability to add additional questions in later survey rounds

CAPI required greater programming skills than key-from-paper surveys, as well as enhanced IT knowledge

CAPI development using CSPro will become easier when there are census/survey templates that organizations can utilize


Thank You
