17
Database Management 2nd EUKLEMS Consortium Meeting, 9-11 June 2005, Helsinki This project is funded by the European Commission, Research Directorate General as part of the 6th Framework Programme, Priority 8, "Policy Support and Anticipating Scientific and Technological Needs".

Database Management

  • Upload
    nhung

  • View
    16

  • Download
    0

Embed Size (px)

DESCRIPTION

Database Management. 2nd EUKLEMS Consortium Meeting, 9-11 June 2005, Helsinki This project is funded by the European Commission, Research Directorate General as part of the 6th Framework Programme, Priority 8, "Policy Support and Anticipating Scientific and Technological Needs". - PowerPoint PPT Presentation

Citation preview

Page 1: Database Management

Database Management

2nd EUKLEMS Consortium Meeting, 9-11 June 2005, HelsinkiThis project is funded by the European Commission, Research Directorate General as part of the 6th Framework Programme, Priority 8, "Policy Support and Anticipating Scientific and Technological Needs".

Page 2: Database Management

EUKLEMS Database requirements

• Interface with Consortium inputs

• Produce ‘consistent’ KLEMS data from disparate data

• Store EUKLEMS database

• Generate Productivity estimates

• Store and document ‘Assumptions’ and Methods

• Interface with EUKLEMS public output

Page 3: Database Management

Examples of Practical Problems (1)• Aggregation:

• Chain aggregate 2-digit output by industry to a 1-digit level

• Disaggregation:• Estimate computer mfg. and other machinery output given output of total

machinery mfg.

• Ensure that the results aggregate to the observed total.

• Employ available indicators (“pattern data”) of how to split the two.

• Balancing• Estimate investment by industry and by asset type.

• Observe detail in only 1 dimension at a time.• Estimates aggregated by type to equal the observed totals by industry.• Estimates aggregated by industry to equal the observed totals by type.• Employ available indicators (e.g. from other countries) to estimate the

detailed breakdown.

Page 4: Database Management

Examples of Practical Problems (2)

• Concording• Converting data for timeseries breaks in national industry codes

• Converting data from national industry code to EUKLEMS definitions

• Cross-classification (technically the same as concording) • Estimating labor by ‘quality-level’ using data by occupation.

Page 5: Database Management

Productivity Research Data System

• Programmed as application layer on top of SAS, designed to overcome practical data hurdles

• Data to be manipulated are coded in relational databases

• Identities coded in meta data

• SAS macros to manipulate the data make use of the meta data• Aggregation, disaggregation, balancing, and concording

• Specialized tools to estimate Capital stocks, labour services, capital services, TFP, Domar weighting, alternative deflation

• In addition:• Documentation

• reference manual (also on-line)• users guide

• User community (in statu nascendi)

Page 6: Database Management

What is a relational database?

• Seperate data and meta-data

• Meta-data has number of dimensions, e.g.• Year: 1970, 1971,…..

• Country: A, B,C,..

• Industry: 1, 2, 3, ….

• Unit: nominal, real, deflator

• Type: investment in building, machinery,…,

• 1970_B_3_nom_mach: 5,870

Page 7: Database Management

General lay-out of Analytical module

• 1. Collection of basic data• A. Basic data by country

• 2. Making meta-data (hierarchies, concordances)• B. Data in Prodsys readable format

• 3. Combining datasources: (dis)agg, concor, balancing• C. Input data KLEMS: full time-series and industries

• 4. Applying productivity tools: harmonization and experiment• D. Output data KLEMS

• 5. Selection of series for analytical database• E. Public analytical database

• Statistical module is same as analytical, except only data and procedures which are backed by NSO

NSO 1 2 3 4 5

Partner

E. Public output

A. Basic data

B. Readable

format

C. Input KLEMS

D. Output KLEMS

Page 8: Database Management

Partner Involvement

• Use SAS/Prodsys or Do not use SAS/Prodsys? Advantages of using SAS/Prodsys is full flexibility to experiment, but steep learning curve and expensive

• Importantly: SAS/Prodsys is not needed for data delivery, only Excel!

Page 9: Database Management

Data delivery standard

• Each table neeeds the following information• Description of source (publisher, NSO) and version (release

date)

• Classification Meta-data associated with each dimension of national database

• Concordance with EUK dimension lists

• Document with guidelines will be prepared

Page 10: Database Management

Tree structure of data hierarchy: A as Grand Total with nodes (‘children’) B and C

each subdivided further down

A

B C

D E F G H

Page 11: Database Management

Highest node on top, children indented,each node always immediately followed by itsown children:

A

.B

..D

..E

..F

.C

..G

..H

Page 12: Database Management

EU KLEMS row tree structure

Page 13: Database Management

Sequence of processing raw data

• Prepare metadata (hierarchy) and concordance

• Metadata and concordance are read into ProdSys

• Then data is read into ProdSys

• Metadata determine how original raw data gets ‘translated’ to SAS (table) version of original raw data

• Metadata + already present EU KLEMS metadata + concordance determine how SAS version of original raw data gets transformed into EU KLEMS data

• You don’t have to rearrange the actual data into this hierarchical structure; ProdSys assigns the data to the correct category in SAS once the category is part of the metadata.

Page 14: Database Management

Concordances

• A raw data category may have to be split into several EU KLEMS categories

• We need to be able to determine how the split is made: weights

• Use pattern data to calculate these weights

Page 15: Database Management

NACE1 (4-digit)-to-EUK concordance

nace1-4-digit-to-euk72.xls

Page 16: Database Management

Part of Dutch IO-to-EUK concordance

Page 17: Database Management

Next steps

• Document will be prepared on data delivery

• User guide for Prod Sys

• Prod Sys is still in development