kannan-anjurtupil
1
QUALITY OF DATA
2
LEARNING OBJECTIVES
Realise the importance of correct data for program management
Realise the distinction between random data errors and falsified data
Understand the causes of poor data quality
Be able to check data quality through supervision and review of reports
Learn when and how to correct erroneous data
3
Occurrence & importance of errors
In business context:
- Error rates of 1-5 % are not exceptional
- Estimated cost ≈ 10 % of revenue
- Problems with data quality when data originate from multiple sources
- After initial enthusiasm to improve data quality, focus on data quality generally slowly fades
In disease control context:
- Error occurrence?
- Impact on program performance?
- Checking of errors: limited effort
4
Errors in RNTCP?
Based on the pre-test carried out in all countries:
Real possibility of errors in subdistrict reports
Minor possibility in district reports
Little attention to checking for errors
5
Errors identified in 1039 TB patients
Cohort review method in NY city: Munsiff et al. IJTLD, 2006, 10 : 1133-9
• 41% of cases presented errors
• multiple errors per patient: 596 / 424 = 1.4
• What kind of errors?
- program info errors 55 %
- patient related errors 45 %
NB. Error rates in HMIS > 50 % Gillies A. Methods Inf Med 2000, 39 : 208-12
6
Data quality: definition
The state of validity, reliability, consistency, timeliness and completeness making data appropriate for a specific use
Problems with data quality do not only arise from incorrect data
Inconsistent data is a problem as well
7
Data Quality ~ Management
Quality Assurance
Activities to ensure quality before data collection
Quality Control
Monitoring and maintaining quality of data during RNTCP implementation
Data management
Handling and analysis of data throughout the RNTCP surveillance
8
Quality assurance & control
Quality assurance:
- anticipates problems before they occur
- uses all available information to generate improvements
- is not tied to a specific quality standard
- is applicable mostly at the planning stage
- is all-encompassing in its activities
Quality control:
- responds to observed problems
- uses ongoing measurements to make decisions on the processes or products
- requires a pre-specified quality standard for comparability
- is applicable mostly at the processing stage
- is a set procedure that is a subset of quality assurance
9
Quality control
Quality control is a regulatory procedure through which we:
- measure quality
- compare quality with pre-set standards
- act on the differences
The objective of quality control is to achieve a given quality level with minimum cost (ex. EQA sampling)
10
Dimensions of data quality
1. Intrinsic data quality: accuracy (validity and reliability)
2. Contextual data quality: relevant, timely, complete
3. Representational data quality: interpretability, easy to understand
4. Accessibility data quality: accessibility, security
11
Intrinsic data quality: ACCURACY
Exact conformity to the true value
WHY IMPORTANT?
Accurate data = precondition for accurate decisions!!
Two concepts: validity and reliability
QUESTION: is this guaranteed?
12
Validity = the degree to which a measurement reflects the truth
There should be no systematic error or bias
What is a valid sputum result for an open TB case?
A result is valid if it corresponds to the true value!
Open TB case = sputum positive!!
13
Reliability
The degree to which a measurement gives the same result each time it is used under the same conditions with the same subject
A necessary but not sufficient condition for validity, because one can make the same errors twice
Reliability = repeatability of measurements
Reliability is inversely related to random error
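The distinction between validity (no systematic error) and reliability (little random error) can be sketched with repeated measurements of one known quantity. This is only an illustration: the function name `bias_and_spread` and the weights below are hypothetical and do not come from RNTCP data.

```python
from statistics import mean, stdev

def bias_and_spread(measurements, true_value):
    """Return (systematic error, random error) for repeated measurements."""
    bias = mean(measurements) - true_value  # validity: distance from the truth
    spread = stdev(measurements)            # reliability: variation between repeats
    return bias, spread

TRUE_WEIGHT_KG = 50.0  # hypothetical true value

# A scale that always reads about 2 kg too high: reliable, but not valid.
reliable_not_valid = [52.0, 52.1, 51.9, 52.0]
# A scale that is right on average but scatters widely: valid, but not reliable.
valid_not_reliable = [47.0, 53.0, 49.0, 51.0]

bias_1, spread_1 = bias_and_spread(reliable_not_valid, TRUE_WEIGHT_KG)
bias_2, spread_2 = bias_and_spread(valid_not_reliable, TRUE_WEIGHT_KG)
```

The first series repeats the same error every time, which is why reliability alone does not guarantee validity.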
14
Dimensions of data quality
1. Intrinsic data quality: accuracy
2. Contextual data quality: relevant, timely, complete
15
RELEVANCE (usefulness)
Reflects the degree to which information meets the real needs of clients.
Is concerned with whether the available information sheds light on the issues that are important to users.
16
RELEVANCE
A good information source should include all relevant content and exclude all irrelevant content.
Relevant for what?
Decision making for RNTCP management.
Assessing relevance is subjective and depends upon the varying needs of users!
17
TIMELINESS
Refers to the moment data are compiled, reported and analysed
Given RNTCP’s standardization of the data reporting system, timeliness is not a major issue in India.
But it could be an issue in remote areas and in PPM
18
COMPLETENESS
No missing data (records, items)
All data fields that have to be filled in should indeed contain data.
QUESTION: does this presently happen??
19
Missing records
• Annual report 2001 NTP Bangladesh
Reports        DOTS areas    non-DOTS areas
-------------------------------------------
Received             2230               180
Missing                59                 4
% missing              3%                2%
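The percentages in the table can be reproduced with a short calculation. The slide does not state the denominator; the sketch below assumes it is the number of expected reports (received + missing). Dividing by received reports alone gives the same rounded figures. The function name `percent_missing` is hypothetical.

```python
def percent_missing(received, missing):
    """Share of expected reports that never arrived, in percent."""
    expected = received + missing  # assumption: expected = received + missing
    return 100.0 * missing / expected

dots_pct = percent_missing(received=2230, missing=59)
non_dots_pct = percent_missing(received=180, missing=4)

print(round(dots_pct), round(non_dots_pct))  # prints: 3 2
```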
20
Dimensions of data quality
1. Intrinsic data quality: accuracy
2. Contextual data quality: relevant, timely, complete
3. Representational data quality: interpretability, easy to understand
21
Representational data quality
Interpretability
Data must be in appropriate language and units, and the data definitions must be clear to all (language, jargon, concepts)
Ease of understanding
Data must be clear, without ambiguity, and easily comprehended.
22
Dimensions of data quality
1. Intrinsic data quality: accuracy
2. Contextual data quality: relevant, timely, complete
3. Representational data quality: interpretability, easy to understand
4. Accessibility data quality: accessibility, security
23
ACCESSIBILITY
Essential element of any data quality assessment.
If data is not accessible, then it has little or no value.
Accessibility = precondition for use, but no guarantee for use!
Data items should be easily obtainable and legal to collect.
In the computer era, guidelines have to be established for who may access which data.
24
SECURITY
The protection of data from:
☞ unauthorized modification (accidental or intentional)
☞ equipment malfunction (computer crash)
☞ natural disasters (fire, tsunami..) and crime
Be aware! Security threats are more serious when HMIS is computerized:
- unauthorized access to data
- damage to files (viruses…)
25
Data management covers the whole process, starting from data recording to transcription, compilation, analysis & interpretation, reporting, feedback and use.
TB CENTRE (OPD or lab)
RECORDING → TRANSCRIPTION → COMPILATION → ANALYSIS & INTERPRETATION → REPORTING → FEEDBACK & USE
26
Where can errors occur?
At each step, especially during:
Data recording
Manual data transcription
Data compilation
Data entry in computer
Analysis
Interpretation
27
Step in data flow                            Source of error
------------------------------------------------------------------------------------------------
Data recording                               Information not registered
                                             Wrong information (wrong address, etc.)
                                             Right information wrongly entered (in the wrong place)
                                             Missing records
Data compilation                             Wrong counts
                                             Missed reports
                                             Duplicate counting
Computerised data entry                      Wrong entry
                                             Partial entry
                                             Partial entry of records
Template-based computerised data analysis    nil
28
Prevention of data errors
clarity of the instructions
training and motivation of the staff
honesty of the staff
user-friendliness of the data supports, such as data forms and templates
supervision
29
Prevention of data errors: computerized data handling
improves the accuracy of the data
prevents processing and analysis errors
makes fudging less easy, once the data have been entered in the computer
use of independent double entry techniques (and checking of inconsistencies between the 2 entries)
data entry formatted to acceptable ranges and modalities only
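Independent double entry can be sketched as a field-by-field comparison of two separately keyed copies of the same records. This is a minimal illustration, not an RNTCP tool: the function name `compare_double_entry`, the record identifiers and the field names are all hypothetical.

```python
def compare_double_entry(first_entry, second_entry):
    """Return (record_id, field, value_1, value_2) for every disagreement."""
    discrepancies = []
    for record_id in sorted(set(first_entry) | set(second_entry)):
        rec1 = first_entry.get(record_id)
        rec2 = second_entry.get(record_id)
        if rec1 is None or rec2 is None:
            # The record was keyed in by one operator only.
            discrepancies.append((record_id, "<record>", rec1, rec2))
            continue
        for field in sorted(set(rec1) | set(rec2)):
            if rec1.get(field) != rec2.get(field):
                discrepancies.append((record_id, field, rec1.get(field), rec2.get(field)))
    return discrepancies

# Two operators key in the same two (hypothetical) patient records.
entry_a = {"TB001": {"age": 34, "sex": "M"}, "TB002": {"age": 51, "sex": "F"}}
entry_b = {"TB001": {"age": 43, "sex": "M"}, "TB002": {"age": 51, "sex": "F"}}

diffs = compare_double_entry(entry_a, entry_b)
print(diffs)  # prints: [('TB001', 'age', 34, 43)]
```

Each discrepancy only shows that the two entries disagree; which value is right still has to be checked against the paper record.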
30
How to proceed with the data verification?
1. Be alert
2. Routine checking of data
3. Quarterly report checking
31
BE ALERT!
Registers that look meticulously clean
All data entered with the same pen
Lack of variation / identical results every quarter
A performance that looks too good:
- absence of initial defaulters
- too low death rates
- too high cure rates
- absence of defaulting in IP
- …
Be alert to the likelihood of intentional falsification of data!!!
Do not accept data without checking their veracity!!!
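One of the warning signs above, identical results every quarter, is easy to screen for once quarterly figures are in a computer. The sketch below is an assumption-laden illustration: the function name `flag_identical_quarters`, the indicator names and the figures are hypothetical, and a flag is only a reason to check the registers during supervision, not proof of falsification.

```python
def flag_identical_quarters(quarterly_results):
    """Return indicator names whose reported value never changes across quarters."""
    flagged = []
    for indicator, values in quarterly_results.items():
        # At least two quarters are needed before "no variation" means anything.
        if len(values) > 1 and len(set(values)) == 1:
            flagged.append(indicator)
    return sorted(flagged)

# Hypothetical figures from four consecutive quarterly reports of one unit.
reports = {
    "cure_rate_pct": [85, 85, 85, 85],
    "death_rate_pct": [4, 6, 3, 5],
}

suspicious = flag_identical_quarters(reports)
print(suspicious)  # prints: ['cure_rate_pct']
```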
32
How to proceed with the data verification?
Routine checking of the data through supervision
- Completeness checking
- Consistency checking
Quarterly report checking
- Range checking
- Modality checking
Completeness checking
Completeness of report = all data have been reported!
A minimal completeness check verifies if all variables contain data.
Example:
200 NSP cases and age information only for 187 cases
Information is incomplete!
How to solve?
Verify via the original reports.
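A minimal completeness check can be sketched as a count of empty values per required field. The example mirrors the slide's case of 200 NSP cases with age recorded for only 187; the function name `completeness_report`, the field names and the record layout are hypothetical.

```python
def completeness_report(records, required_fields):
    """Count, for each required field, how many records have no value."""
    missing = {field: 0 for field in required_fields}
    for record in records:
        for field in required_fields:
            value = record.get(field)
            if value is None or value == "":
                missing[field] += 1
    return missing

# 200 NSP cases: 187 with age recorded, 13 without (illustrative records).
nsp_cases = (
    [{"age": 30, "sex": "Male"} for _ in range(187)]
    + [{"age": None, "sex": "Female"} for _ in range(13)]
)

report = completeness_report(nsp_cases, ["age", "sex"])
print(report)  # prints: {'age': 13, 'sex': 0}
```

The 13 records without age are the ones to look up in the original reports.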
34
Consistency checking
Checks whether the values of data items are concordant
Example: CAT III and Sputum+
How to check for inconsistencies?
By cross tabulation
CAT          Sputum result
             SP+       SP-
CAT I       1162       114
CAT II       300       148
CAT III       16      1030
Contradiction: 16 CAT III patients recorded as SP+
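Cross tabulation for consistency checking can be sketched by counting each combination of two fields and then looking up the combinations that should not occur. The records below are small illustrative data, not the figures from the slide; the function names and field names are hypothetical.

```python
from collections import Counter

def cross_tabulate(records, row_field, col_field):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((r[row_field], r[col_field]) for r in records)

def find_contradictions(table, impossible_pairs):
    """Return the impossible pairs that nevertheless occur, with their counts."""
    return {pair: table[pair] for pair in impossible_pairs if table[pair] > 0}

# Illustrative patient records.
patients = (
    [{"cat": "CAT I", "sputum": "SP+"}] * 3
    + [{"cat": "CAT III", "sputum": "SP-"}] * 4
    + [{"cat": "CAT III", "sputum": "SP+"}] * 2
)

table = cross_tabulate(patients, "cat", "sputum")
# Following the slide, CAT III combined with a positive sputum result is a contradiction.
contradictions = find_contradictions(table, [("CAT III", "SP+")])
print(contradictions)  # prints: {('CAT III', 'SP+'): 2}
```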
35
Range checking
Any method of detecting whether a quantitative variable is within an acceptable range
Example 1: Height of an adult patient
Acceptable range = 1.00 m to 2.00 m
3.00 m is impossible
0.98 m is possible, but needs verification
Example 2: Age of an adult patient
Acceptable range =15 to 100 years
150 years is impossible
Any “impossible” or “out of range” value should be verified via the original record or the patient.
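A range check can be sketched as a comparison against two intervals: the acceptable range given above, and a wider limit beyond which a value is treated as impossible. The wider limits (`HEIGHT_PLAUSIBLE_M`, `AGE_PLAUSIBLE_YEARS`) are assumptions made for this illustration, since the slide gives only the acceptable ranges; the function name `check_range` is hypothetical as well.

```python
def check_range(value, acceptable, plausible):
    """Classify a value as 'ok', 'out_of_range' or 'impossible'."""
    low_ok, high_ok = acceptable
    low_pl, high_pl = plausible
    if low_ok <= value <= high_ok:
        return "ok"
    if low_pl <= value <= high_pl:
        return "out_of_range"  # possible, but needs verification
    return "impossible"        # must be verified via the original record

HEIGHT_ACCEPTABLE_M = (1.00, 2.00)  # from the slide
HEIGHT_PLAUSIBLE_M = (0.50, 2.50)   # assumption, not from the slide
AGE_ACCEPTABLE_YEARS = (15, 100)    # from the slide
AGE_PLAUSIBLE_YEARS = (0, 120)      # assumption, not from the slide

height_results = [
    check_range(h, HEIGHT_ACCEPTABLE_M, HEIGHT_PLAUSIBLE_M) for h in (1.70, 0.98, 3.00)
]
age_result = check_range(150, AGE_ACCEPTABLE_YEARS, AGE_PLAUSIBLE_YEARS)
```

Both "out_of_range" and "impossible" values go back to the original record or the patient; the two labels only indicate how likely a recording error is.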
36
Modality checking
The data of a qualitative variable are classified in groups or modalities.
Each data item should belong to one modality only
Example : Sex
Two modalities: Male or Female
Other values are impossible!
“Not known” is sometimes entered but is not a valid modality and should be verified and corrected!
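A modality check can be sketched as a lookup of every entered value in the set of allowed modalities. The function name `check_modality` and the sample column are hypothetical; the two allowed modalities are those given in the example above.

```python
def check_modality(values, allowed):
    """Return (position, value) for every entry that is not an allowed modality."""
    return [(i, v) for i, v in enumerate(values) if v not in allowed]

SEX_MODALITIES = {"Male", "Female"}

# Illustrative column as it might appear after data entry.
sex_column = ["Male", "Female", "Not known", "Female", "M"]

invalid = check_modality(sex_column, SEX_MODALITIES)
print(invalid)  # prints: [(2, 'Not known'), (4, 'M')]
```

The check is deliberately strict: an abbreviation such as "M" is flagged too, so that it is corrected at the source instead of being guessed.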
37
Correction of errors
ERRORS??
Go back to the original data source.
But what if the original data source is erroneous?
The best method is to go back to a previous step in the data flow, and verify patient records, lab records, etc.
If correct data found, then modify the erroneous data
If correct data not found, then report as “missing”.
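The correction rule above can be sketched as a walk back through earlier steps in the data flow until a verified value is found. This is a minimal sketch under one assumption: the person checking passes in only values they have verified, with `None` where a source has no usable value. The function name `correct_value` and the source names are hypothetical.

```python
MISSING = "missing"

def correct_value(verified_sources):
    """Return (value, source_name) from the first source holding a verified value.

    verified_sources: list of (source_name, value) ordered from the nearest
    source to the earliest step in the data flow; value is None when the
    source has no verified value.
    """
    for source_name, value in verified_sources:
        if value is not None:
            return value, source_name
    # No source could supply the correct data: report the item as missing.
    return MISSING, None

# The age in the register is unusable, but the patient card has it.
found = correct_value([("tb_register", None), ("patient_card", 34)])
# Neither source has the value.
not_found = correct_value([("tb_register", None), ("patient_card", None)])
```

Reporting "missing" instead of guessing keeps the error visible to whoever uses the report.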
38
Errors in data
Risk for wrong decisions
Information has to be of good quality
• correct data
• correct data processing
Valid
Reliable
Complete
Consistent
Timely
39
Erroneous data
Bad information
Wrong decisions
Appropriate actions??
40
Don’t forget : there is more room for error than shown in this picture