Jeroen Pannekoek, Mark van der Loo and Bart van den Broek

Implementation and Evaluation of Automatic Editing

Introduction

Automatic data editing can involve many different kinds of actions that each perform a specific task in the editing process.

Current work at Statistics Netherlands (SN) is targeted at supporting the implementation of these editing tasks with standardised, re-usable methods and software tools.

But the effectiveness of such implementations depends very much on the parameterisation of methods and especially specification of edit-rules and other rules that drive the automatic editing functions.

This calls for monitoring the effects of editing on the data, and also for feedback on the sets of (edit) rules used by the different tasks.

This presentation

• The types of rules that are input to the automatic editing

• The automatic editing task or process steps

• Main point: ways of generating feedback from the automatic editing process that can help to improve the configuration of the different process steps.

Input Rule Sets: Verification and Modification

Verification of data values (checking- or edit-rules)
• Profit = Revenues – Costs
• Employees in FTE < Employees
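As a minimal sketch of how such edit-rules could be verified, the Python fragment below evaluates the two example rules on a single record and reports one of the three edit statuses used later in this presentation (satisfied, violated, not verifiable). The field names, the `EDITS` list and the function `edit_status` are illustrative assumptions, not the tooling used at SN.

```python
# Hypothetical representation: each edit is (name, variables used, predicate).
EDITS = [
    ("profit_balance", ("Profit", "Revenues", "Costs"),
     lambda profit, revenues, costs: profit == revenues - costs),
    ("fte_below_headcount", ("EmployeesFTE", "Employees"),
     lambda fte, employees: fte < employees),
]


def edit_status(record, edits=EDITS):
    """Return the verification status of every edit for one record.

    A missing value is represented by None; an edit that involves a
    missing value cannot be checked and is reported as 'not verifiable'.
    """
    status = {}
    for name, variables, rule in edits:
        values = [record.get(v) for v in variables]
        if any(v is None for v in values):
            status[name] = "not verifiable"
        else:
            status[name] = "satisfied" if rule(*values) else "violated"
    return status


record = {"Profit": 20, "Revenues": 100, "Costs": 80,
          "EmployeesFTE": 12, "Employees": 10}
print(edit_status(record))
```

Keeping the variables of each edit explicit makes it possible to decide "not verifiable" without evaluating the rule itself.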

Modification of data values (direct “if-then” type of rules)
• Correction: value -> value
  If Wages > 10 000 * Employees Then Wages <- Wages / 1000
• Error localisation: value -> missing
  If (Employees > 0 & Wages = 0) Then Wages <- NA
• Imputation: missing -> value
  If (Employees = 0 & Wages = NA) Then Wages <- 0
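The three example rules can be sketched in Python as follows. This is only an illustration under stated assumptions: `None` stands in for NA, the rules are applied in the order listed, and the function name, rule labels and log layout are hypothetical. Each change is logged so that its effect can be reviewed afterwards.

```python
NA = None  # stand-in for a missing value


def apply_modification_rules(record):
    """Apply the three example if-then rules to a copy of one record.

    Returns the modified record and a log of the changes made.
    """
    rec = dict(record)
    log = []

    def change(variable, new_value, rule):
        log.append({"rule": rule, "variable": variable,
                    "old": rec[variable], "new": new_value})
        rec[variable] = new_value

    emp = rec["Employees"]

    # Correction (value -> value): wages reported in units instead of thousands.
    if emp is not NA and rec["Wages"] is not NA and rec["Wages"] > 10_000 * emp:
        change("Wages", rec["Wages"] / 1000, "correction")

    # Error localisation (value -> missing): employees but zero wages.
    if emp is not NA and rec["Wages"] is not NA and emp > 0 and rec["Wages"] == 0:
        change("Wages", NA, "error localisation")

    # Imputation (missing -> value): no employees, so wages must be zero.
    if emp is not NA and emp == 0 and rec["Wages"] is NA:
        change("Wages", 0, "imputation")

    return rec, log


fixed, log = apply_modification_rules({"Employees": 5, "Wages": 200_000})
print(fixed, log)
```

Unlike in R, comparisons with a missing value have to be guarded explicitly here, so a rule simply does not fire when one of its inputs is missing.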

Editing process steps

Raw data
• Correction of thousand errors
• Corrections with other rules
• Correction of typos
• Correction of rounding errors
• Error localisation with rules
• Error localisation Fellegi-Holt
• Deductive imputation
• Regression (NN) imputation
• Adjustment of imputed values
Corrected data

Direct modification rules

Edit rules

Log file

Effects of editing: data related and edit related views

Data related views
• Status of data cells (observed, missing, imputed etc.)
• Values of data (e.g. estimates of means, totals, variances)

Edit related views
• Status of edits (violated, satisfied, not verifiable)
• Values of edits (tolerances, scores)

Across process steps:

Status of data cells

At each step we have available and missing data values. These can be subdivided according to the way they have changed with respect to a previous step or the raw data.

All cells
• Available
  – unaltered
  – modified
  – made available (imputed)
• Missing
  – unaltered (still missing)
  – made missing (cancelled)
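The subdivision above can be computed by comparing each cell after a process step with the same cell in a reference data set (a previous step or the raw data). The sketch below assumes records are dictionaries with `None` for missing values; the function names and status labels are illustrative choices.

```python
from collections import Counter


def cell_status(before, after):
    """Classify one cell by comparing its reference value with its current value."""
    if before is None:
        return "still missing" if after is None else "imputed"
    if after is None:
        return "cancelled"
    return "unaltered" if after == before else "modified"


def status_counts(reference, current):
    """Count cell statuses over two equally shaped lists of records."""
    counts = Counter()
    for ref_rec, cur_rec in zip(reference, current):
        for variable, ref_value in ref_rec.items():
            counts[cell_status(ref_value, cur_rec[variable])] += 1
    return counts


raw = [{"Employees": 5, "Wages": 200_000}, {"Employees": 0, "Wages": None}]
edited = [{"Employees": 5, "Wages": 200.0}, {"Employees": 0, "Wages": 0}]
print(status_counts(raw, edited))
```

Running `status_counts` after every process step gives the counts needed for a plot of data cell status by step.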

Data cell status

Left: Childcare institutions

Right: SBS Wholesale

Data values

Means and estimated CI by process step. Childcare institutions: Turnover, Revenues
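A mean with an approximate confidence interval can be recomputed after each process step from the values available at that step. The sketch below assumes a simple normal approximation (mean ± 1.96 standard errors) and ignores any sampling design or weights; the estimator behind the plotted intervals is not specified here, so this is an illustration only, and the step names in the example are made up.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci(values, z=1.96):
    """Mean and approximate confidence interval of the non-missing values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    half_width = z * stdev(observed) / sqrt(len(observed))
    return m, m - half_width, m + half_width


# Hypothetical values of one variable after two process steps.
by_step = {
    "raw": [10, 12, 14000, None],
    "after thousand-error correction": [10, 12, 14, None],
}
for step, values in by_step.items():
    print(step, mean_ci(values))
```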

Edit verification status

Edit tolerance or score

By how much is an edit violated? (an edit-related score function)
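One simple way to quantify a violation, given here as an assumption rather than as the definition used for the plots, is to write a linear edit as a₁x₁ + … + aₖxₖ ≤ b (or = b) and to take the amount by which the left-hand side misses the bound. The function and argument names below are hypothetical.

```python
def violation(coefficients, record, bound=0.0, equality=False):
    """Amount by which a linear edit is violated for one record.

    coefficients maps variable names to their coefficients a_j.
    Returns None when a variable is missing (edit cannot be evaluated),
    0 when the edit is satisfied, and a positive number otherwise.
    """
    values = [record.get(v) for v in coefficients]
    if any(v is None for v in values):
        return None
    excess = sum(a * record[v] for v, a in coefficients.items()) - bound
    return abs(excess) if equality else max(0.0, excess)


# Profit = Revenues - Costs, written as Profit - Revenues + Costs = 0.
balance = {"Profit": 1, "Revenues": -1, "Costs": 1}
print(violation(balance, {"Profit": 25, "Revenues": 100, "Costs": 80},
                equality=True))
```

Returning `None` for edits that cannot be evaluated keeps them separate from satisfied edits, which have a violation of zero.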

Edit tolerances for Wholesale

Plots of tolerances

Height of box proportional to sqrt(# positive tolerances)

Left side: numbers of not evaluated tolerances.

HB scores for Childcare

Hidiroglou-Berthelot scores for two ratios

Left: Wages/Employees

Right: Revenues/Costs

Hard edit-rule: 0.5 × Costs < Revenues < 2 × Costs
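For reference, a common textbook formulation of the Hidiroglou-Berthelot score can be sketched as follows: each ratio is transformed symmetrically around the median ratio and then scaled by the size of the unit. The size exponent `u` and the function name are assumptions of this sketch, and the variant used for the plots may differ in detail. All values are assumed to be strictly positive.

```python
from statistics import median


def hb_scores(numerators, denominators, u=0.5):
    """Hidiroglou-Berthelot scores for the ratios numerator/denominator.

    s_i = 1 - r_med / r_i   if r_i < r_med
    s_i = r_i / r_med - 1   otherwise
    E_i = s_i * max(numerator_i, denominator_i) ** u
    """
    ratios = [y / x for y, x in zip(numerators, denominators)]
    r_med = median(ratios)
    scores = []
    for y, x, r in zip(numerators, denominators, ratios):
        s = 1 - r_med / r if r < r_med else r / r_med - 1
        scores.append(s * max(x, y) ** u)
    return scores


wages = [100, 200, 4000]
employees = [10, 20, 40]
print(hb_scores(wages, employees))
```

With `u=0` the size of the unit is ignored and the score reduces to the symmetric ratio transformation, so that a ratio ten times the median and a ratio one tenth of the median get scores of equal magnitude and opposite sign.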

Concluding remarks

– Step-by-step evaluation of indicators can lead to:
  • improvements in edit-rules (1000-errors, minus signs, relaxation of bounds)
  • improvements in configuration of methods (imputation)
  • efficient selective editing (review specific corrections)

– Other benefits of indicators by process step: they make automatic editing more transparent and more easily accepted by editing staff.

Thank you for your attention!
