Eurostat
Training Course on EDIT
For Users
Outline of the module
A. Introduction
B. Using EDIT - integration with other tools
C. Objects in EDIT for Users
D. EDIT Graphical User Interface
E. Future developments
A - Introduction
EDIT is a tool for data validation - data edit/imputation
• What is data validation? - An activity aimed at verifying whether the value of a data item comes from the given set of acceptable values.
• What is data editing? - The activity aimed at identifying erroneous entries and correcting them if necessary.
Example: the response is missing or incorrect.
How does EDIT work, in short?
1. Define a format (a format contains a description of the data in a dataset)
2. Upload dataset(s) from external files (a dataset is a set of data according to a specific format)
3. Define a program containing rules and file operations to be executed on the dataset(s)
4. Execute the job
5. Get the report containing errors (if any)
For users
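The steps above can be sketched in a few lines of Python. This is only an illustration of the concepts (format, dataset, program, job, report); the names `Format`, `Rule`, `load_dataset` and `run_job`, as well as the sample fields and the rule, are assumptions made for this example and are not part of EDIT.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout: this models the concepts, not EDIT's API.


@dataclass
class Format:
    """Describes the data in a dataset: here, just the field names."""
    fields: list


@dataclass
class Rule:
    """One validation rule of a program: a name, a check and a message."""
    name: str
    check: Callable[[dict], bool]  # returns True when the record is valid
    message: str


def load_dataset(fmt: Format, rows: list) -> list:
    """A dataset is a set of data according to a specific format."""
    return [dict(zip(fmt.fields, row)) for row in rows]


def run_job(program: list, dataset: list) -> list:
    """Execute the program on the dataset and return the error report."""
    report = []
    for row_number, record in enumerate(dataset, start=1):
        for rule in program:
            if not rule.check(record):
                report.append({"rule": rule.name, "row": row_number,
                               "message": rule.message})
    return report


fmt = Format(fields=["COUNTRY", "YEAR", "VALUE"])
dataset = load_dataset(fmt, [["LT", "2008", "16236"], ["LT", "2008", ""]])
program = [Rule("R01", lambda r: r["VALUE"] != "", "VALUE is missing")]
report = run_job(program, dataset)
print(report)
```

The program refers only to field names from the format, so the same program can be run on any dataset that follows that format.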
EDIT User types
• 'User' - Executes programs on datasets and accesses the reports.
• 'Programmer' - Manages the metadata needed by the user to execute programs:
  • Implements 'formats';
  • Implements 'validation rules' by means of 'programs';
  • Defines other operations on files by means of 'programs';
  • Sets up the unattended mode configuration.
• 'Administrator' - Manages users and permissions.
'User' type functionalities
• 'Change Password'
  • Allows users to change their password;
• 'Dataset Import/Export'
  • Allows users to import and export data to and from EDIT, as well as monitor any ongoing import/export processes;
• 'Job Execution'
  • Allows users to execute programs on imported datasets and view/export the results of the execution.
The 'User' Workflow
Data Import -> Job Execution -> Job Results -> Data Export
The link between 'User workflow' and 'User interface'
What can we do by means of a ‘program’?
• Run programs containing mainly validation rules / computations:
  A1 - Single column - only one column is involved;
  A2 - Multiple columns - two or more columns within a single record are involved;
  B - Vertical - multiple records are involved;
  C - Hierarchical - multiple datasets are involved.
• Perform dataset operations: Copy, Merge, Alter, Aggregate, etc.
• Use specialised functions like outlier detection: Terror, Hidiroglou-Berthelot, σ-Gap;
• Accepted formats: SDMX-ML, GESMES, CSV, FLR.
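The four rule types can be illustrated with a small Python sketch. It does not use EDIT's rule language; the two datasets, the field `HH`, the age range and the uniqueness check are hypothetical, and only the MAINSTAT/JOBISCO condition is modelled on a rule message shown later in this course.

```python
# Hypothetical datasets and rules, for illustration only.
persons = [
    {"ID": 1, "HH": 10, "AGE": 34, "MAINSTAT": 20, "JOBISCO": -2},
    {"ID": 2, "HH": 10, "AGE": 130, "MAINSTAT": 20, "JOBISCO": 2},
    {"ID": 2, "HH": 99, "AGE": 51, "MAINSTAT": 10, "JOBISCO": 5},
]
households = [{"HH": 10}, {"HH": 11}]
rows = list(enumerate(persons, start=1))  # (row number, record) pairs

# A1 - single column: only AGE is involved.
single_column = [n for n, r in rows if not 0 <= r["AGE"] <= 120]

# A2 - multiple columns: two columns of the same record are compared.
mainstat_codes = {20, 31, 32, 33, 34, 35, 36, -1}
multiple_columns = [n for n, r in rows
                    if r["MAINSTAT"] in mainstat_codes and r["JOBISCO"] != -2]

# B - vertical: several records are involved (here, ID must be unique).
seen_ids = set()
vertical = []
for n, r in rows:
    if r["ID"] in seen_ids:
        vertical.append(n)
    seen_ids.add(r["ID"])

# C - hierarchical: several datasets are involved
# (here, each person must belong to a known household).
known_households = {h["HH"] for h in households}
hierarchical = [n for n, r in rows if r["HH"] not in known_households]

print(single_column, multiple_columns, vertical, hierarchical)
```

Each list holds the row numbers that fail the corresponding check.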
Accepted data formats
GESMES (BOP ITS, BOP FDI)
UNA:+.? '
UNB+UNOC:3+FR2+4D0+100929:1637+IREF000243++GESMES/TS'
UNH+MREF000001+GESMES:2:1:E6'
BGM+74'
NAD+Z02+ECB'
NAD+MR+4D0'
NAD+MS+FR2'
IDE+10+EUROSTAT_BOP_01 reporting'
DSI+BOP_FDI_A'
STS+3+7'
DTM+242:201009291637:203'
DTM+Z02:20072009:702'
IDE+5+EUROSTAT_BOP_01'
GIS+AR3'
GIS+1:::-'
ARR++A:FR:N:2:330:N:4A:E:9999:9999:20072009:702:0:A:F+0:A:F+0:A:F'
ARR++A:FR:N:2:330:N:4F:E:9999:9999:20072009:702:0:A:F+0:A:F+0:A:F'
ARR++A:FR:N:2:330:N:7Z:E:9999:9999:20072009:702:0:A:F+0:A:F+0:A:F'
ARR++A:FR:N:2:330:N:A1:E:1100:9999:20072009:702:5824:A:F+5930:A:F+4204:A:F'
ARR++A:FR:N:2:330:N:A1:E:1495:9999:20072009:702:5828:A:F+5932:A:F+4206:A:F'
(multi-year 2007, 2008, 2009 observations)
CSV (with or without header) (SBS, CVTS, TOURISM)
9H; 2008; LT; 2; B-N_X_K642; 11930; 16236; ; ; ; ; UNIT; ; ; ; ; ; TT0; ; ; ; ; D08
9H; 2008; LT; 3; B-N_X_K642; 11930; 1001; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 4; B-N_X_K642; 11930; 529; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 30; B-N_X_K642; 11930; 17766; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 2; B-E; 11930; 1138; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 3; B-E; 11930; 104; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 4; B-E; 11930; 61; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
FLR example 1
001E20100121814 00 804.822
001E20100121816 93 5295.54
001E20100121814 99 6166.24
001E20100125290334 581.371
FLR example 2
2010010011 010252000405595911005909580E 01ZZZZZ 2691.966 2734482.0 0.0
2010010011 010252000405595911004009600E 01ZZZZZ 237.543 341202.0 0.0
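As an illustration of what a semicolon-separated file like the CSV sample looks like once it is split into fields, here is a minimal Python sketch using the standard `csv` module. It only reads two of the sample records; it says nothing about how EDIT itself imports files, and the meaning of the individual columns is not given in the slides, so the fields are addressed by position only.

```python
import csv
import io

# Two records copied from the CSV sample (no header line).
sample = (
    "9H; 2008; LT; 2; B-N_X_K642; 11930; 16236; ; ; ; ; UNIT; ; ; ; ; ; TT0; ; ; ; ; D08\n"
    "9H; 2008; LT; 3; B-N_X_K642; 11930; 1001; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08\n"
)

# The delimiter is ";"; spaces around the values are stripped.
reader = csv.reader(io.StringIO(sample), delimiter=";", skipinitialspace=True)
records = [[field.strip() for field in row] for row in reader]

for record in records:
    print(len(record), record[:7])
```

Empty fields are kept as empty strings, so every record has the same number of fields and can be matched against the columns of a format.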
B - Using EDIT - integration with other tools
Ways of using EDIT
• As a web-based application – called by other applications;
• Standalone - running on a PC;
• Client - server - running in a Data Centre.
EDIT as Web-based application
• Web-based Interface
• Unified interface for both the standalone version and the server deployment;
• EUROSTAT Look & Feel;
• Light interface, simplified workflows.
• ECAS account is needed.
EDIT running standalone
• Downloadable package;
• Standalone installation supported on Windows XP and Windows 7;
• Simple installation wizard;
• Full functionality;
• Standard authentication is required.
Client - server mode for EDIT
• EDIT runs on a UNIX machine;
• In the current setup, EDIT is installed at Eurostat & other DGs;
• Contains all registered domains (= user-specific workspaces), embedded by default;
• ECAS credentials are needed for external users.
EDAMIS integration
• EDAMIS allows transmitting data files through a single entry point;
• EDAMIS can send data to EDIT by placing the files in a configurable location;
• EDIT detects metadata based on the EDAMIS naming convention;
• EDIT performs the processing in unattended mode.
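The idea of detecting metadata from a file name can be sketched as follows. The pattern used here (DOMAIN_TABLE_COUNTRY_YEAR.ext) is a made-up example and not the actual EDAMIS naming convention, which is not described in this course; `detect_metadata` and `scan_inbox` are hypothetical helper names.

```python
import re
from pathlib import Path
from typing import Iterator, Optional

# Assumed pattern, for illustration only: DOMAIN_TABLE_COUNTRY_YEAR.ext
NAME_PATTERN = re.compile(
    r"^(?P<domain>[A-Z0-9]+)_(?P<table>[A-Z0-9]+)_"
    r"(?P<country>[A-Z]{2})_(?P<year>\d{4})\.(?P<ext>[a-z]+)$"
)


def detect_metadata(filename: str) -> Optional[dict]:
    """Return the metadata encoded in the file name, or None if it does not match."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


def scan_inbox(inbox: Path) -> Iterator[tuple]:
    """Yield (path, metadata) for each recognised file in the drop location."""
    for path in sorted(inbox.iterdir()):
        metadata = detect_metadata(path.name)
        if metadata is not None:
            yield path, metadata


print(detect_metadata("AES_T01_LT_2008.csv"))
print(detect_metadata("notes.txt"))
```

In an unattended setup, a function like `scan_inbox` would be called periodically on the configured location, and each recognised file would be handed over to the matching program.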
SDMX integration
• The Statistical Data and Metadata Exchange (SDMX) initiative is sponsored by seven institutions (the BIS, the ECB, Eurostat, the IMF, the OECD, the UN and the World Bank);
• SDMX standardises the way statistical data and metadata are exchanged;
• EDIT can import SDMX-ML datasets.
C - Objects in EDIT for Users
1. Dataset instantiations - lookups;
2. Programs, jobs
1 - Dataset instantiations
• Dataset Instance (Dataset) - a collection of data rows according to the structure of a format;
• A two-dimensional table composed of rows and columns:
  • Columns correspond to the fields defined in the format;
  • Records - no limit on size or number.
Dataset example – Table AES (Adult Education Survey)
The description of the table AES
Example: 'Format' – 'Dataset instantiation'
Format
Dataset instantiation
The same format – different datasets
Lookup tables - code lists
• Lookup - An auxiliary dataset containing a list of values to be used for validating codes;
• Code lists - usually lookup tables refer to code lists;
• One can use several code lists inside the same program - as many as needed for the given datasets - 'Country', NACE, NUTS;
• Several versions of the same code list can be used from within the same program, if needed.
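A lookup check can be pictured as a membership test against a code list. The Python sketch below is an illustration only: the code lists, their version labels and the helper `invalid_codes` are hypothetical, and real code lists such as NACE or NUTS are much longer.

```python
# Hypothetical code lists, keyed by (name, version).
code_lists = {
    ("COUNTRY", "v1"): {"LT", "FR", "DE"},
    ("COUNTRY", "v2"): {"LT", "FR", "DE", "HR"},
}


def invalid_codes(records, field, code_list):
    """Return (row number, value) pairs whose value is not in the code list."""
    return [(n, r[field]) for n, r in enumerate(records, start=1)
            if r[field] not in code_list]


records = [{"REGION": "LT"}, {"REGION": "HR"}, {"REGION": "JP"}]

# The same data checked against two versions of the same code list.
errors_v1 = invalid_codes(records, "REGION", code_lists[("COUNTRY", "v1")])
errors_v2 = invalid_codes(records, "REGION", code_lists[("COUNTRY", "v2")])
print(errors_v1)
print(errors_v2)
```

Keeping the version in the key is one simple way to let a single program refer to several versions of the same code list.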
2 - Programs, jobs
• Program – a set of operations to be performed on a specified dataset definition (format);
• No specific dataset is associated with a program, only formats (dataset definitions) should be specified;
• Job – the association between a 'Program' and concrete 'Dataset Instances';
• Possible types of rules/checks: single column, multiple columns, vertical and hierarchical.
Validation report
• It contains:
  • Job results - information about the job;
  • Error statistics - summary of the errors;
  • Error report - detailed list of errors.
Error statistics
• The error statistics are displayed as a table with the following columns:
  • Rule name: the name of the program rule that failed;
  • No of Failures: the number of individual rows in which the error appeared during job execution;
  • Rule Message: the rule's error message as defined in the program.
Error statistics
Rule Name | No of Failures | Rule Message
RC07 | 10 | Error : This region’s code is not valid
RC185 | 1 | Error : IntPrv does not contain the expected values
SC04 | 8 | Error : Invalid value (if MAINSTAT in(20, 31, 32, 33, 34, 35, 36, -1) then JOBISCO should be -2)
SC05 | 8 | Error : Invalid value (if MAINSTAT in(20, 31, 32, 33, 34, 35, 36, -1) then LOCNACE should be -2)
SC32 | 1 | Error : Invalid value (if SpkPrv04 <> 1 AND pskPrv05 <> 1 then SpkEquip should be -2)
SC33 | 1 | Error : Invalid value (if SpkPrv04 <> 1 AND pskPrv05 <> 1 then SpkPHelp should be -2)
CC04 | 1 | Error : Invalid value ()
Detailed error report
No | MESSAGE | SEVERITY | EXP NAME | PARTITION | AUXILIARY DATA
1 | This region code is not valid | Error | RC07 | ROW_NUMBER=1 | REGION="JP"
2 | Invalid value (if MAINSTAT in(20,31,32,33,34,35,36,-1) then JOBISCO should be -2) | Warning | SC04 | ROW_NUMBER=3 | MAINSTAT=20 JOBISCO=2
3 | Invalid value (if MAINSTAT in(20,31,32,33,34,35,36,-1) then LOCNACE should be -2) | Error | SC05 | ROW_NUMBER=3 | MAINSTAT=20 LOCNACE=7
4 | This region code is not valid | Error | RC07 | ROW_NUMBER=4 | REGION=EG
5 | Invalid value (if MAINSTAT in(20,31,32,33,34,35,36,-1) then LOCNACE should be -2) | Error | SC05 | ROW_NUMBER=4 | MAINSTAT=31 LOCNACE=6
6 | This region code is not valid | Error | RC07 | ROW_NUMBER=6 | REGION=EG
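The error statistics are a summary of the detailed error report: one line per rule, with the number of rows that failed it. A minimal Python sketch of that relationship, using only the six detailed entries shown in this course (so the counts differ from the full statistics table, which covers the whole job); the dictionary keys are names chosen for this example:

```python
from collections import Counter

# The six entries of the detailed error report excerpt.
error_report = [
    {"no": 1, "rule": "RC07", "severity": "Error", "row": 1},
    {"no": 2, "rule": "SC04", "severity": "Warning", "row": 3},
    {"no": 3, "rule": "SC05", "severity": "Error", "row": 3},
    {"no": 4, "rule": "RC07", "severity": "Error", "row": 4},
    {"no": 5, "rule": "SC05", "severity": "Error", "row": 4},
    {"no": 6, "rule": "RC07", "severity": "Error", "row": 6},
]

# Number of failures per rule, as in the 'No of Failures' column.
failures_per_rule = Counter(entry["rule"] for entry in error_report)

for rule, failures in sorted(failures_per_rule.items()):
    print(rule, failures)
```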
D - EDIT GRAPHICAL USER INTERFACE
EDIT - Log in
EDIT Home page
Menu options
User profile information
The password can be changed here
Defining dataset: import dataset
Go to Dataset > Import dataset
Select a file on your hard drive
Starting line
Select a file type (CSV / GESMES / FLR / SDMX)
Reuse saved parameters
Save properties for further use
Screen part I
Defining dataset: import dataset
Select a format
Select columns to import
Use the arrows to add or remove fields
Reuse saved configuration
Save configuration for further use
Screen part II
Provide a name for the new dataset
Click to import
Defining dataset: import dataset
Status is FAILED
Unsuccessful import
Click to download the import report in text format
Defining dataset: import dataset
Status is COMPLETED
Successful import with warnings
Click to download the import report in text format
In the report, two records were skipped (lines 2 and 5)
Defining dataset: import dataset
Status is COMPLETED
Successful import
Click to view the imported content
Delete a selected dataset
After importing, EDIT redirects you to the search dataset screen
Defining dataset: import dataset
Select fields to be hidden in the display
Hidden fields
Click to hide the fields
EDIT hides the selected fields
Defining dataset: import dataset
Unfold the Basic filtering options
Select a field in the dataset (e.g. WEIGHT)
Select a logical operator
Enter a value
The corresponding records are filtered
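Conceptually, a basic filter is a triple of field, operator and value applied to every record. The following Python sketch shows the idea; the operator symbols, the sample records and the helper `basic_filter` are assumptions for illustration and do not describe the operators actually offered by the EDIT interface.

```python
import operator

# Assumed operator symbols; the list offered by EDIT may differ.
OPERATORS = {
    "=": operator.eq,
    "<>": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def basic_filter(records, field, op, value):
    """Keep the records for which `field op value` holds."""
    compare = OPERATORS[op]
    return [r for r in records if compare(r[field], value)]


records = [
    {"ID": 1, "WEIGHT": 0.8},
    {"ID": 2, "WEIGHT": 1.5},
    {"ID": 3, "WEIGHT": 2.1},
]
filtered = basic_filter(records, "WEIGHT", ">", 1.0)
print([r["ID"] for r in filtered])
```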
Defining dataset: import dataset
Unfold the Advanced filtering options
Create an expression aided by the lists of fields, operators and functions
Click to apply the search criteria
The corresponding records are filtered
Defining dataset: import dataset
Export in CSV format
Customize your view
Defining dataset: search dataset
Search criteria
List of already imported datasets
View details of the dataset with filtering options
Export the dataset in CSV format
Delete the dataset
Archive the dataset
Restore an archived dataset
Defining dataset: Import/Export dataset
Delete the dataset
Import/export history list
Import/Export history search
Search criteria
View details of the dataset with filtering options
Defining jobs: Create a job
Search criteria
List of existing programs to be executed
Click to create a job for this program
Menu option
Defining jobs: Create a job
Enter a name and a description
Choose the dataset to validate (if several)
Execute the job
Defining jobs: Create a job
During the validation process, only cancellation is possible
When the validation is finished the date is displayed
Validation is RUNNING
Defining jobs: Create a job
When the validation is finished the date is displayed
Validation is COMPLETED
Delete the job
Copy the job
Click to view the results
Defining jobs: Create a job
Click to view the Error table
VIEW RESULTS OF A JOB
Defining jobs: Create a job
Error message number
Unfold Basic filtering
Filtering by Error fields
Unfold Advanced filtering
Export the error table (CSV)
VIEW ERROR TABLE OF A JOB
Defining jobs: Create a job
Click to view details of the error
Name of the rule in the program
Message contained in the program
Row number where the error occurred in the dataset
Variable values defined in the program
Severity used in the program
Defining jobs: Create a job
Select the dataset fields to display
Error information
Dataset record (fields selected)
DETAILED VIEW OF ERROR
Defining jobs: Create a job
Click to Export the error table in CSV format
EXPORT ERROR REPORT OF A JOB
Defining jobs: Create a job
Choose CSV or FLR format
EXPORT ERROR REPORT OF A JOB
CSV parameters
Error fields selected
Optionally, select Ascending or Descending order for any error field
Export table
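The export options (selected error fields, optional sort order, CSV parameters) correspond to a few familiar operations. A minimal Python sketch under stated assumptions: the function `export_errors`, its parameters and the field names are hypothetical and only mimic the choices offered on the export screen.

```python
import csv
import io


def export_errors(errors, fields, sort_by=None, descending=False, delimiter=";"):
    """Return the selected error fields as CSV text, optionally sorted."""
    rows = list(errors)
    if sort_by is not None:
        rows.sort(key=lambda e: e[sort_by], reverse=descending)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, delimiter=delimiter,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()      # header line with the selected field names
    writer.writerows(rows)    # fields not selected are left out
    return out.getvalue()


errors = [
    {"rule": "SC05", "row": 3, "severity": "Error", "message": "Invalid value"},
    {"rule": "RC07", "row": 1, "severity": "Error", "message": "This region code is not valid"},
    {"rule": "SC04", "row": 3, "severity": "Warning", "message": "Invalid value"},
]
text = export_errors(errors, ["rule", "row", "severity"], sort_by="rule")
print(text)
```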
Defining jobs: Create a job
VIEW PROGRAM DETAILS
Content of the program
Defining jobs: job statistics
Job statistics
Menu option
Defining jobs: search job
Enter the search criteria
The corresponding jobs are displayed (all jobs if no selected criteria)
Delete the job
Copy the job
Click to view the results
E - Future developments
• Internationalisation - to offer the translation of the menus into other languages;
• GESMES full integration (registry);
• SDMX 2.1 formats.
Useful links
• To EDIT page: http://ec.europa.eu/eurostat/edit
• To VIPv page: CIRCAbc -> Eurostat -> VIP Validation Project
• Generic data validation and editing service: mailto:ESTAT-VALIDATION@ec.europa.eu
• EDIT as web client - https://webgate.ec.europa.eu/eurostat/edit
• CIRCAbc for:
  • EHSIS: https://circabc.europa.eu/w/browse/0b5ab24d-68a0-419f-a6bd-e41eb84f33fb
  • BoP: https://circabc.europa.eu/w/browse/01940df9-91ec-407b-9ba4-0f5c47086e0c
  • BoP: https://circabc.europa.eu/w/browse/ef8b542b-35a8-401c-9dd4-37f61e49f34d
Questions?
Thank you for your attention!