ISB Publication Doc Ref: ISB-000289 Side Note - Managing and using identifiers Version: 1.0 Issue Date: 20/02/2015




Technical Support Service Managing And Using Identifiers

Page 2 of 44 Version: 1.0 Status: Final

File: Side-Note-Managing-and-using-identifiers-v1-0

Issue Date: 20/02/2015

Document Version History

Version | Status | Date | Modified by | Description
1.0 | Final | 20/02/2015 | TSS | For publication


Contents

1 Introduction
  1.1 Document Purpose
  1.2 Summary
2 Identifier Principles
3 Reference Model
4 Identifier Rules
  4.1 Generating the Party_ID
    4.1.1 Legacy-format Party Data
    4.1.2 ISB-conformant Party Data
    4.1.3 Priority order for matching
    4.1.4 Party_ID Creation
    4.1.5 Example
  4.2 Generating the Locator_ID
    4.2.1 Postal Address
    4.2.2 Geographical Location
    4.2.3 Telephone Number
    4.2.4 Other Locators
    4.2.5 Locator uniqueness
  4.3 QE Rules
5 Assumptions
6 Data Sources & Identifiers
7 References


1 Introduction

1.1 Document Purpose

This document lists the set of data entities within the BDA which cannot be identified and distinguished from each other using data naturally occurring within the entity (ie using “natural keys”) and where therefore an identifier must be constructed and managed (ie a “surrogate key”). The purpose of this document is to recommend how each of these identifiers can be created and managed for use between ESCS systems when they exchange data.

1.2 Summary

The table below provides a summary of the data entities that will be covered in this document and the method that will be used to construct the primary keys.

Entity | Sub-Type | Surrogate Key | Method
Party | Person | Party_ID | Derive primary key
Party | Organisation | Party_ID | Derive primary key
Locator | Postal Address | Locator_ID | Hybrid solution for primary keys
Locator | Geographical Location | Locator_ID | Self-reference primary key
Locator | Telephone Number | Locator_ID | Self-reference primary key
Locator | E-mail Address | Locator_ID | Self-reference primary key
Locator | URL | Locator_ID | Self-reference primary key
QE | Qualification Element | AO_Qualification_Element_ID | Two step process


2 Identifier Principles

Terminology

Identifier:

An Identifier is an attribute whose value denotes (is a name for) and distinguishes (successfully discriminates between others of the same type) an object or entity. Identifiers may identify real-world objects (eg people) or data entities (eg the data record of a pupil’s enrolment in a school). Some data entities hold data about real-world objects, but there is an important difference between identifying a data record about a person and identifying the person themselves. This document is concerned with identifying data records.

All identifiers must be managed and the management processes and quality of management processes yield identifiers that vary in their uniqueness (ie their power to accurately distinguish between identified objects or entities).

UID:

Unique Identifiers (UIDs) are identifiers that have the following characteristics:

• They are used and recognised by multiple systems
• Their uniqueness properties meet the business needs of the using systems

A UID provides a means of identifying a data entity that is “shared” between the using systems (as opposed to identifying a data entity in one system, with a different identifier identifying the data entity about the same object in a different system). It comes close to being an object identifier for the group of using systems. Examples include the Unique Pupil Number (UPN) used in schools and the National Insurance Number (NINO).

All UIDs have a (finite and defined) scope of use. So, for example, the UPN is used to identify pupils for the purposes of state education. The NINO is used to identify people in their financial dealings (tax and benefits) with government. The UPN is not recognised by HMRC or DWP, because neither organisation is connected with state education.

Primary Key:

A primary key is the set of data attributes which uniquely identify one instance of one data entity. A primary key can be constructed from a Natural Key or a Surrogate Key, see below.

Natural Key:

A natural key is a primary key constructed from naturally-occurring data about the entity. Eg a date, a time, a location and a list of participants could uniquely define an event. Natural keys have two big advantages:


1. They use only data that will naturally be held (eg any definition of an event would hold the data items listed above) so incur no data overhead

2. When entities are self-identifying via natural keys, they incur no management overhead, as is required for surrogate keys (see below)

Surrogate key:

A surrogate key is a primary key for a data entity where no natural key exists. Eg a date of birth and person name would not uniquely identify a pupil, therefore a ‘surrogate’ key in the form of an identifier must be used. There is an overhead, often substantial, for associating one identifier with exactly one object or data entity and the processes involved are error-prone.

3 Reference Model

The proposals in this document will enable the following types of systems to operate successfully; however, it is not necessary for all of these types of systems to be implemented:

Operational system: a system such as a school, college or university Management Information System (MIS) or a Children’s Service Case Management System (CMS) which collects and manages data in order to conduct business operations.

ODS: an Operational Data Store is a system that is updated with changes to enterprise data whenever it is created, modified or deleted by an operational system. In this way the ODS contains a nearly current picture of the enterprise data across all the operational systems. The ODS does not hold historic enterprise data, only current data.

[Figure: Identifiers “reference model”, with boxes for three operational systems (eg MIS), the ODS and the Data Warehouse]


Data Warehouse: a system that contains aggregated historic data so that longitudinal statistical analyses can be undertaken. The Data Warehouse may receive data directly from operational systems or from an ODS.

The considerations that have led to the recommendations in this document are:

1. There may be instances of identifiers within received data but, in the absence of a single master identifier source for the whole of ESCS, groups of ESCS systems may share sets of identifiers that are disjoint with the sets of identifiers used by other groups.

2. The solution offered should maximise the linking of data received across multiple messages by intelligent use of identifiers but will not attempt to make connections between data records that have not been made reliably by source operational systems. In particular, the ETL operations that get data into the ODS will not attempt to use fuzzy matching across received data records, but will recognise links already made using identifiers embedded in messages.

3. In the absence of a single master identifier source for the whole of ESCS, there will be some instances where two independently-created records exist that are not recognised as being about the same real-world object. This is likely to occur where a party has two distinct roles (eg one person who is both a Parent and Learner) that have limited touch points to make the linkage.

4. The need to allow groups of operational systems to exchange data with each other in ISB format independently of other groups of systems.

5. Support the operation of an Operational Data Store (ODS) which is updated in near real-time with enterprise transactions. In an ODS, or in any data warehouse, there will be a need for data about a real-world object to continue to be sent to the ODS and successfully linked to historic data after a source has migrated from legacy format messages to ISB format messages.

6. Support the need for a single master list of identifiers (MLI) within any data warehouse (eg in DfE, EFA or a LA) system that is used for linking newly received data to already stored data, whether the new data is in legacy format or in ISB format.

7. The need for the single MLI to operate efficiently.

8. The desire for the single MLI to be within the ODS structure.

9. The long-term desire that the single MLI eventually become a MLI for external systems in ESCS (and therefore to play a part in real-time operational transactions).


10. The need to be able to provide feedback from the ODS to an operational system, for example about errors, using the data source’s own surrogate keys.

11. Within the constraints above, it is acceptable for the surrogate keys used within the ODS to be different from those used by systems or groups of systems outside the ODS.

A complete solution for surrogate keys therefore will look at what can be used between various groups of ESCS systems and separately what can be used within the reference model.

Note: The format and length of the code (surrogate key) is defined within the following business data standards:

• ISB Business Data Standard Party
• ISB Business Data Standard Locator
• ISB Business Data Standard QE Outcome


4 Identifier Rules

4.1 Generating the Party_ID

Currently groups of ESCS systems share access to well-managed and good quality UIDs about Parties (for example the UPN identifies pupils in schools, the ULN identifies learners from 13+ in colleges and universities). We propose to use these as the basis for the Party Surrogate Key. To reduce the chance of duplicates being created in the ODS (instances of two data entities stored in the ODS with two different surrogate key values relating to the same real-world object) a matching and surrogate key algorithm will be applied as data is received. This will need to consider two cases:

• Data received in Legacy-format
• Data received in ISB format with an existing Party_ID surrogate key

The generic approach proposed is to take a high quality UID and prefix it with the identifier type to make a new identifier. This avoids clashes between identifiers created from different UID schemes. We would then hash the result to anonymise it into a numeric value with no evident structure.

Eg a Party entering with a UPN of 123456789 undergoes the following steps:

1. UPN + 123456789 = UPN123456789
2. UPN123456789 is hashed to create, eg, 9862357694525587
3. Party is assigned a Party_ID of 9862357694525587
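The three steps above can be sketched in Python as follows. This is an illustration only: the function name `derive_party_id`, the choice of SHA-1 from the standard `hashlib` module, and the reduction of the hash to 16 decimal digits are assumptions made for the sketch, and it will not reproduce the illustrative value 9862357694525587 quoted in the text.

```python
import hashlib


def derive_party_id(uid_type: str, uid_value: str, digits: int = 16) -> str:
    """Derive a numeric Party_ID from a UID, following steps 1 to 3 above."""
    # Step 1: prefix the UID value with its identifier type, eg "UPN123456789".
    prefixed = f"{uid_type}{uid_value}"
    # Step 2: hash the prefixed value (SHA-1 is assumed here).
    digest = hashlib.sha1(prefixed.encode("utf-8")).hexdigest()
    # Assumption: keep only the last `digits` decimal digits of the hash so the
    # result has a fixed width. The real format and length are set by the
    # ISB Business Data Standard Party, not by this sketch.
    number = int(digest, 16) % (10 ** digits)
    # Step 3: this value would be assigned as the Party_ID.
    return str(number).zfill(digits)


party_id = derive_party_id("UPN", "123456789")
```

The same input always yields the same output, which is what allows independent systems to arrive at the same Party_ID. Shortening the hash, as done here, raises the chance of two different inputs colliding, so the retained length is a trade-off rather than a free choice.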

There are two options for step 2:

a) use a good standard hash function such as SHA-1 to create the final number

b) create a namespace in accordance with IETF RFC 4122 (A Universally Unique IDentifier (UUID) URN Namespace), see http://tools.ietf.org/html/rfc4122.

RFC 4122 defines “version 5” UUIDs for creating an identifier space from names within an existing namespace. A version 5 UUID is produced by applying the SHA-1 cryptographic hash function (designed by the United States National Security Agency and published by the United States NIST as a U.S. Federal Information Processing Standard) to the namespace identifier followed by the name. The hash is truncated to 128 bits and six of those bits are overwritten to record the UUID version and variant. Because the ESCS namespace identifier forms part of the hashed input, the result is tied to the ESCS namespace and is universally unique (not just unique within ESCS). This would create an RFC 4122 “universally unique ID” (UUID).

However, if the namespace and UUID formatting needed for this universal uniqueness are felt to be unnecessary, it would still be possible to use option (a): a simple SHA-1 hash of the value created at step 1.

Whichever of the two options is chosen, the advantages of this approach are:

1. Provides a common numeric identifier


2. Would give numbers that betray no information about how or when the identifier was created. For example, it removes any information about school enrolment from the UPN.

3. Overcomes any data protection or other restrictions imposed on the use of ‘native’ UIDs, such as those that apply to the UPN or NINO.
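Option (b) could be sketched with the `uuid` module from the Python standard library, which implements RFC 4122 name-based UUIDs. The namespace below is hypothetical: this document does not define an ESCS namespace UUID, so one is derived here from a placeholder domain name purely for illustration.

```python
import uuid

# Hypothetical ESCS namespace, derived from a placeholder domain name.
# A real deployment would agree and publish a single namespace UUID.
ESCS_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "escs.example")


def derive_party_uuid(uid_type: str, uid_value: str) -> uuid.UUID:
    """Return a version 5 (SHA-1 based) UUID for a type-prefixed UID."""
    return uuid.uuid5(ESCS_NAMESPACE, f"{uid_type}{uid_value}")


party_uuid = derive_party_uuid("UPN", "123456789")
numeric_form = party_uuid.int  # the same 128-bit value expressed as an integer
```

As with a plain hash, the result is repeatable for a given input, and the integer form gives a purely numeric identifier if one is required.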

4.1.1 Legacy-format Party Data

When data arrives in Legacy-format we will use identifiers embedded in legacy data records to find if data about this object has been received into a data warehouse system previously or if reference data about the object exists. A search using each embedded identifier in turn should be made, stopping when a match is found. The embedded identifiers will be searched for in order of decreasing reliability of the identifier, based on its accuracy and uniqueness.

If a match is found: pick up the entity surrogate key from the existing record, use it as the entity surrogate key when constructing a BDA entity for the new data, and move the entity into the ODS.

Note that the ODS can be used as the master store of identifiers. For example, for Party entities, all identifiers known in relation to every Party in the ODS will be held in the Party Relationship Role (PRR) table in the ODS. Once an identifier value has been found in the PRR in the ODS, the Party surrogate key can be abstracted from the PRR and used as the surrogate key for the Party_ID constructed from the newly received data.

If the received data is new (no prior record can be found holding any of the identifiers received in the new data), then create a new surrogate key for use within the ODS according to the method described in section 4.1 above.

[Figure: flowchart for legacy data. Labels: Legacy data (UPN, ULN); Match ID in preference order; Match found? (Y/N); Create Derived ID (DID); Link data to existing record; Create new record.]

NOTE: eg the matched ID is the UPN (123456); prefix the matched ID with the ID name (ie UPN123456); hash the data; eg the Party ID is now XYZ599918.


4.1.2 ISB-conformant Party Data

ISB conformant data will arrive with entity surrogate keys in place and we have assumed that ISB conformant systems will either be able to create their own Party_ID using the ISB Derived ID Algorithm or will have received an input of data with this key.

Data received in ISB format will have related identifiers already converted to Party Relationship Role (PRR) entities. Each PRR can be used to determine whether the received data relates to a Party already known to the ODS. Therefore, before inserting new data into the ODS, use the identifiers in each related PRR, in the same priority order as above, to search for prior knowledge of the received entity.

If no match on PRR identifiers can be found, use the entity surrogate key received from source. However, if a match on a PRR identifier is found:

1. Create a PRR to hold the surrogate key as supplied by the source system. This is necessary so that subsequent communications with the source can use the surrogate key as known to the source

2. Replace the surrogate key received from the source with that from the entity previously known. This means that the ODS will now be using a different surrogate key internally to that used by the sending source system

3. All entities received from the source system that contain the replaced surrogate key, either as a PK or as a FK, should be changed to use the ODS internal surrogate key

4. When communicating data back to the source, make the reverse entity surrogate key replacements so that data going back to source carries the surrogate key values used by the source.

[Figure: flowchart for ISB conformant data. Labels: ISB conformant data (DID, UPN, ULN); Match on DID found? (Y/N); Append data to existing record; Match on other IDs? (Y/N); Store source data DID; Append data to existing record; Create new DID and record.]

NOTE: The source data DID will be stored for returning data back to the source.
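The key substitution described in steps 1, 2 and 4 above can be sketched as follows. The in-memory dictionaries stand in for ODS tables, and all names (`receive`, `to_source_key`, `MATCH_PRIORITY`, the source label "MIS-A" and the sample key values) are hypothetical. Step 3, rewriting the key wherever it appears as a PK or FK in related entities, is not shown.

```python
from typing import Dict, Optional, Set, Tuple

# Assumption: learner identifiers in the matching order of section 4.1.3.
MATCH_PRIORITY = ("UPN", "Cand_ID", "ULN")

# PRR index: (identifier type, value) -> Party_ID used inside the ODS.
prr: Dict[Tuple[str, Optional[str]], str] = {}
# Party_IDs already known to the ODS.
known_party_ids: Set[str] = set()
# Source keys kept for return traffic: (source, ODS Party_ID) -> source Party_ID.
source_keys: Dict[Tuple[str, str], str] = {}


def receive(source: str, source_party_id: str, identifiers: Dict[str, str]) -> str:
    """Return the Party_ID the ODS uses internally for a received ISB record."""
    ods_id: Optional[str] = None
    if source_party_id in known_party_ids:
        ods_id = source_party_id            # match on the received DID itself
    else:
        for id_type in MATCH_PRIORITY:      # otherwise match on other identifiers
            key = (id_type, identifiers.get(id_type))
            if key in prr:
                ods_id = prr[key]
                break
    if ods_id is None:
        ods_id = source_party_id            # no match: keep the key from source
    elif ods_id != source_party_id:
        # Steps 1 and 2: remember the source key, use the ODS key internally.
        source_keys[(source, ods_id)] = source_party_id
    known_party_ids.add(ods_id)
    for id_type, value in identifiers.items():
        prr.setdefault((id_type, value), ods_id)
    return ods_id


def to_source_key(source: str, ods_party_id: str) -> str:
    """Step 4: map an ODS key back to the key the source system knows."""
    return source_keys.get((source, ods_party_id), ods_party_id)


# A Party already known to the ODS, then a record from a source using its own key.
prr[("UPN", "12345678")] = "34500923"
known_party_ids.add("34500923")
ods_id = receive("MIS-A", "700000001", {"UPN": "12345678"})
```

Keeping the reverse mapping per source is a design choice in this sketch: it lets two sources use different keys for the same Party while the ODS holds only one.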


4.1.3 Priority order for matching

This priority order is used irrespective of the format of the data, but legacy data would skip priority 1 (DID), which is available only in ISB conformant records:

Data Source | Sub Type | Sub Type Grouping | Priority | Identifier
ISB Conformant records (where available) | Person | Learner | 1 | DID
Alternative Provision Census; Children Looked After; Early Years Foundation Stage Profile; Key Stage 1; Key Stage 2 (tests); National Pupil Database; PRU Census; School Census | Person | Learner | 2 | UPN
Awarding Body Data | Person | Learner | 3 | Cand_ID
Awarding Body Data; PRU Census; School Census | Person | Learner | 4 | ULN
Early Years Census | Person | Learner | 5 | Ofsted UPN
ISB Conformant records (where available) | Organisation | School / LOP | 1 | DID
Alternative Provision Census; Consistent Financial Reporting; Early Years Census; Early Years Foundation Stage Profile; Edubase; General Hospital Schools / SLASC; Independent Schools / SLASC; Key Stage 1; Key Stage 2 (tests); National Pupil Database; Ofsted Inspections; PRU Census; School Census; School Workforce | Organisation | School / LOP | 2 | LA Number + DfE Establishment Number
Edubase; Key Stage 2 (tests); National Pupil Database | Organisation | School / LOP | 3 | URN
Edubase | Organisation | School / LOP | 4 | UKPRN
ISB Conformant records (where available) | Organisation | Awarding Organisation | 1 | DID
Awarding Body Data | Organisation | Awarding Organisation | 2 | Centr_ID
Awarding Body Data | Organisation | Awarding Organisation | 3 | NCN
QAN WebService Look Up tables | Organisation | Awarding Organisation | 2 | Awarding Body Code (AB)
QAN WebService Look Up tables | Organisation | Awarding Organisation | 3 | Awarding Body Identifier (AB_ID)
QAN WebService Look Up tables | Organisation | Awarding Organisation | 4 | Awarding Body Acronym (AB_Acronym)
Qualifications Reference Database | Organisation | Awarding Organisation | 2 | Awarding Body Number (AB_Number)
Qualifications Reference Database | Organisation | Awarding Organisation | 3 | Awarding Body Code Admin (AB_Code_Admin)
Qualifications Reference Database | Organisation | Awarding Organisation | 4 | Awarding Body Code (AB_Code)

4.1.4 Party_ID Creation

When creating a Party ID, the derivation algorithm should give repeatable results based on the data input to the algorithm. The input should be an existing identifier, prefixed with the name of the identifier to ensure there are no accidental clashes (eg to ensure that a person with ULN value 1 is not mistaken for a person with UPN value 1). If several identifiers are available, pick the identifier with the widest community of use. Note that this can lead to a different identifier priority order from that used for matching. We propose that the selection of the identifier used to derive a Party ID be as follows:

UID used to create Party_ID

Sub Type | Sub Type Grouping | Priority | UID | Rationale for Priority
Person | Learner | 1 | UPN | Over 7 million learners have a UPN.
Person | Learner | 2 | ULN | ULN is used in HE, FE, schools and Awarding Organisations.
Person | Learner | 3 | Candidate No. / UCI | Candidate Number / Unique Candidate ID is allocated to learners who are undertaking an exam. Should be used as the third identifier as the UCI is only allocated by JCQ registered awarding organisations. Candidate No is used primarily in the Awarding Body data source.
Person | Teacher | 1 | TRN | Teacher Reference Number. All teachers should have this number allocated.
Person | Teacher | 2 | URN + Staff No. | URN (School Number) + Staff Number should be used if the TRN has not been allocated or applied for yet. As details of who the employer is contractually (LA, school, agency etc) are not known, the school URN will be used as the teacher teaches at a particular school.
Person | Non-Teaching Staff | 1 | URN + Staff No. | URN (School Number) + Staff Number should be used for non-teaching staff. As details of who the employer is contractually (LA, school, agency etc) are not known, the school URN will be used as the staff member works at a particular school.
Person | Parent | 1 | Child Party_ID + Parent Name (suffixed with Parent1 or Parent2) + Contact Number | As there is not a recognised universal identifier for a parent, this combination of the Child Party_ID, Name and Contact Number will be used to identify the parent.
Organisation | School / Learning Opportunity Provider (LOP) | 1 | URN | Unique Reference Number allocated by Edubase.
Organisation | School / Learning Opportunity Provider (LOP) | 2 | LA No. + DfE Establishment No. | The LA Number and Establishment Number together can uniquely identify a LOP.
Organisation | School / Learning Opportunity Provider (LOP) | 3 | UKPRN | UKPRN is issued by the UK Register of Learning Providers (UKRLP). This should be used as the third identifier as organisations can choose to register with the UKRLP. It is not mandatory.
Organisation | School / Learning Opportunity Provider (LOP) | 4 | UPIN | Unique Provider Identification Number is only issued (in the main) for schools with a sixth form and all academies, if funded by either SFA (Skills Funding Agency) or EFA (Education Funding Agency). This identifier may not be applicable to all LOPs.
Organisation | School / Learning Opportunity Provider (LOP) | 5 | Admission No. | Each school will have an admission number to identify the school.
Organisation | Awarding Organisation | 1 | AB (Awarding Body Code) | Applicable to QWS. Awarding Body Code, which is used to determine the awarding body. (1-3 digit code eg 150)
Organisation | Awarding Organisation | 2 | AB_ID (Awarding Body Identifier) | Applicable to QWS. Awarding Body Identifier, which is used to determine the awarding body. (8 digit code eg 20000109)
Organisation | Awarding Organisation | 3 | AB_Acronym (Awarding Body Acronym) | Applicable to QWS. Awarding Body Acronym, which is used to determine the awarding body. (Text acronym eg EDEXCEL)
Organisation | Awarding Organisation | 4 | A_body | The values are the same as the Awarding Body Acronym, which is used to determine the awarding body. (Text acronym eg EDEXCEL)
Organisation | Awarding Organisation | 5 | AB_Number | Applicable to QRD. Awarding Body Number, which is used to determine the awarding body. (2-3 digit code eg 266)
Organisation | Awarding Organisation | 6 | AB_Code_Admin | Applicable to QRD. Awarding Body Code Admin, which is used to determine the awarding body. (generally a text acronym eg HCIMA)
Organisation | Awarding Organisation | 7 | AB_Code | Applicable to QRD. Awarding Body Code, which is used to determine the awarding body. (Text or acronym eg Institute of Hospitality)
Organisation | Awarding Organisation | 8 | Recognition No. | Recognition Number is allocated by Ofqual to awarding organisations such as AQA.
Organisation | Company | 1 | Company Registration No. | Company Registration Number is allocated by Companies House on registration of a company. A CRN can be used as an identifier of awarding organisations or potentially academies depending on their legal structure.
Organisation | Charity | 1 | Registered Charity No. | Registered Charity Number is used if the organisation is classed as a charity. A CRN is not issued.
Organisation | Government Body | 1 | Name in full (eg Department for Education) | Government bodies are not allocated a unique number to identify them in the same way as companies and charities but have official titles.
Organisation | Government Body | 2 | Recognised abbreviation (eg DfE or OFSTED) | Many government bodies are either abbreviated or known by their abbreviation.
Organisation | Government Body | 3 | LA Number | Issued by ONS to all LAs but note the format changed.
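Selecting the input for derivation from the table above can be sketched as a simple priority lookup. Only part of the table is encoded, and the field names (`StaffNo`, `LANumber`, `DfEEstablishmentNumber`, `UCI`) and the way composite identifiers are joined with `+` are assumptions made for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# Each entry lists the fields that must all be present, in priority order.
# Assumption: a subset of the table above, with hypothetical field names.
DERIVATION_PRIORITY: Dict[str, List[Tuple[str, ...]]] = {
    "Learner": [("UPN",), ("ULN",), ("UCI",)],
    "Teacher": [("TRN",), ("URN", "StaffNo")],
    "Non-Teaching Staff": [("URN", "StaffNo")],
    "School / LOP": [("URN",), ("LANumber", "DfEEstablishmentNumber"),
                     ("UKPRN",), ("UPIN",)],
}


def derivation_input(grouping: str, identifiers: Dict[str, str]) -> Optional[str]:
    """Return the type-prefixed string to hash, or None if nothing is usable."""
    for components in DERIVATION_PRIORITY[grouping]:
        if all(identifiers.get(name) for name in components):
            prefix = "+".join(components)
            value = "+".join(identifiers[name] for name in components)
            return f"{prefix}{value}"
    return None


learner_input = derivation_input("Learner", {"ULN": "98765432", "UPN": "12345678"})
teacher_input = derivation_input("Teacher", {"URN": "100001", "StaffNo": "42"})
```

The returned string would then be passed to whichever hash option is adopted. Because this order differs from the matching order in section 4.1.3, the two lists are best kept as separate configuration.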

4.1.5 Example

Party data for learners is received from a legacy data source and there is a list of potential identifiers that can be used to search and identify if the data about these learners exists. From the identifiers list, it is known that UPN, ULN and Candidate No. / UCI are the potential identifiers which could be recorded. A learner can have one or more of the possible identifiers listed against their record. The search on the ODS goes through the list of possible identifiers in priority order to establish a match. The table below shows the incoming records from a data source.

Name | DOB | School | UPN | ULN
Joe Blogs | 02/12/95 | St Cuthbert’s | 12345678 | Not known
Jane Smith | 07/09/96 | St Cuthbert’s | Not known | 98765432
Jack Jones | 05/08/96 | St Cuthbert’s | 652664574 | Not known

The table below shows the existing records held in Party Relationship Role in the ODS.

Party ID | Party Type | UPN | ULN
34500923 | Person | 12345678 |
54301235 | Person | | 98765432

The ODS has identified in the Party Relationship Role table that the UPN 12345678 for Joe Blogs on the incoming record matches the UPN associated with the Party ID 34500923. This can then be used to construct a BDA entity for the new data and then move it into the ODS.

For Jane Smith there is no UPN recorded, hence the next available identifier is the ULN 98765432. This has been found in the Party Relationship Role table and is associated with the Party ID 54301235.

Party ID 34500923 and Party ID 54301235 can now be used as the surrogate keys for the Party entity constructed from the newly received data.

Jack Jones’s UPN 652664574 cannot be found in the Party Relationship Role table in the ODS, therefore a new derived ID (DID) will need to be created based on the algorithm, which will transform the original UPN 652664574 to a new DID of 997123065 which will be assigned to the record and used as the Party_ID.
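The walkthrough above can be reproduced with a small Python sketch using the same three incoming records and two existing Party Relationship Role rows. The `derive_id` function is a placeholder (SHA-1 reduced to nine digits, an assumption), so the DID it produces for Jack Jones will not equal the illustrative value 997123065; the dictionary `prr` stands in for the PRR table.

```python
import hashlib

MATCH_PRIORITY = ("UPN", "ULN")  # order in which the example searches

# Existing Party Relationship Role rows: (identifier type, value) -> Party ID.
prr = {("UPN", "12345678"): "34500923", ("ULN", "98765432"): "54301235"}

incoming = [
    {"name": "Joe Blogs", "UPN": "12345678", "ULN": None},
    {"name": "Jane Smith", "UPN": None, "ULN": "98765432"},
    {"name": "Jack Jones", "UPN": "652664574", "ULN": None},
]


def derive_id(id_type, value):
    # Placeholder derivation: prefix with the identifier type, hash, shorten.
    digest = hashlib.sha1(f"{id_type}{value}".encode("utf-8")).hexdigest()
    return str(int(digest, 16) % 10**9).zfill(9)


def resolve(record):
    """Return (Party ID, outcome) for one legacy record."""
    available = [(t, record[t]) for t in MATCH_PRIORITY if record.get(t)]
    for key in available:                 # search in priority order
        if key in prr:
            return prr[key], "matched"    # stop at the first match
    if not available:
        raise ValueError("record carries no usable identifier")
    party_id = derive_id(*available[0])   # no match: derive a new DID
    for key in available:                 # register identifiers for future matching
        prr[key] = party_id
    return party_id, "created"


results = {record["name"]: resolve(record) for record in incoming}
```

Joe Blogs and Jane Smith resolve to the existing Party IDs, while Jack Jones receives a newly derived ID that is then registered, so a later record carrying the same UPN would match it.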

4.2 Generating the Locator_ID

4.2.1 Postal Address

For addresses we will use a process similar to the one used for Parties, both for linking legacy-format data and for deriving an identifier. There is no intention to manage the uniqueness of known addresses, so we do not propose that source surrogate keys be substituted, but there is value in linking legacy data to data previously received. The range of identifiers available for matching and for creating the primary key is also much reduced, so it is possible to use the same priority order for matching and for creating an ID. The priorities are shown in the table below.

| Sub Type | Priority | UID | Rationale for Priority |
|---|---|---|---|
| Postal Address | 1 | DID | ISB-conformant records (where available) |
| | 2 | UPRN | Unique Property Reference Number. All addresses stored in AddressBase have a unique number as the identifier. |
| | 3 | PAON + SAON + Postcode hashed | The combination of these 3 fields can be used to uniquely identify an address if the address is received in the British Standard BS7666 format. |
| | 4 | AddressLines 1-5 hashed | Hashing algorithm used to convert the 5-line address into an integer and store that number as the identifier. |
| | 5 | All address items provided hashed | Hashing algorithm used to convert all address items into an integer and store that number as the identifier. |
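The priority order in the table could be applied along the following lines. This is a minimal sketch under stated assumptions: the hashing algorithm is not specified in this document, so SHA-256 truncated to a 63-bit integer is used purely for illustration; the field names (`DID`, `UPRN`, `PAON`, `SAON`, `Postcode`, `AddressLine1` to `AddressLine5`), the whitespace and case normalisation, and the treatment of SAON as optional are likewise hypothetical.

```python
import hashlib
from typing import Optional


def hash_to_int(*parts: str) -> int:
    """Hash normalised address parts to a non-negative 63-bit integer.

    SHA-256 is an assumption; the document does not name the algorithm.
    """
    normalised = "|".join(" ".join(p.upper().split()) for p in parts)
    digest = hashlib.sha256(normalised.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> 1


def locator_id(addr: dict) -> Optional[int]:
    """Pick the highest-priority identifier available for a postal address."""
    if addr.get("DID") is not None:  # priority 1: ISB-conformant record
        return int(addr["DID"])
    if addr.get("UPRN") is not None:  # priority 2: UPRN
        return int(addr["UPRN"])
    if addr.get("PAON") and addr.get("Postcode"):  # priority 3: BS7666 fields
        return hash_to_int(addr["PAON"], addr.get("SAON") or "", addr["Postcode"])
    lines = [addr.get(f"AddressLine{i}") or "" for i in range(1, 6)]
    if any(lines):  # priority 4: the five address lines
        return hash_to_int(*lines)
    # priority 5: every item supplied, in key order so the result is repeatable
    items = [str(v) for _, v in sorted(addr.items()) if v]
    return hash_to_int(*items) if items else None
```

Normalising before hashing means that differences in spacing or letter case alone do not produce different identifiers, which supports linking legacy data to data previously received.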

4.2.2 Geographical Location

For the geographical location the data is self-referencing, so we can use it as the Locator_ID key.

| Sub Type | Priority | UID | Rationale for Priority |
|---|---|---|---|
| Geographical Location | 1 | Property Northing + Property Easting | Property Northing and Easting together are unique; however, they may not be commonly held. They are used in Edubase for schools under location and can be used as an alternative search. |

4.2.3 Telephone Number

For the telephone number the data is self-referencing, so we can use it as the Locator_ID key.

| Sub Type | Priority | UID | Rationale for Priority |
|---|---|---|---|
| Telephone Number | 1 | Telephone Number | The telephone number will be used as the identifier. Before a telephone number is used as its own identifier, the non-numeric characters such as brackets and spaces must be removed, e.g. (0207) 545 1234 is a unique telephone number so can be used as the Locator ID and would appear as 02075451234. Note: telephone numbers shorter than 11 digits once non-numeric characters are removed will not be accepted. |
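The telephone number rule can be sketched as below. The function name `telephone_locator_id` and the choice to raise `ValueError` for rejected numbers are assumptions made for illustration; the document only states that such numbers will not be accepted.

```python
import re

MIN_DIGITS = 11  # numbers shorter than this after cleaning are not accepted


def telephone_locator_id(raw: str) -> str:
    """Strip non-numeric characters and return the number as its own Locator_ID."""
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) < MIN_DIGITS:
        raise ValueError(f"telephone number too short: {raw!r}")
    return digits


example = telephone_locator_id("(0207) 545 1234")
```

The identifier is kept as a string so that the leading zero is preserved.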

4.2.4 Other Locators

For the Locators e-mail address and URL, the unique nature of the field means the data is self-referencing and so can be used as the Locator_ID.

4.2.5 Locator uniqueness

In the case of Locators, the reference model is not trying to construct a master reference file of unique addresses and telephone numbers. The prime requirement is that each address, telephone number etc. received from a data source should be stored so that it can be linked by a Party Contact to a Party. For example, if one party uses a particular address format that prevents a match to another party at the same address, we do not need to resolve the two into a single address entity; all we need to do is assign the correct address to the correct party.

4.3 QE Rules

Qualifications can be managed differently from Parties and Locators, as the initial step is to create a reference set of qualification data against which to assign qualification outcomes. In creating this reference set DfE will have received managed data sets from different organisations, primarily Ofqual and, eventually, Awarding Organisations that follow the JCQ A2C specifications. These data sets will contain their own IDs for the qualification elements, and these can be used to create the primary key AO_Qualification_Element_ID. There may be instances where different bodies have used the same identifier for their qualifications; here the Party_ID of the Awarding Body can be used to create uniqueness across data sets. The creation of this reference set must be a managed process, so some degree of manual intervention may be possible.

It is also expected that the data received from conformant systems is dependent on the awarding organisation providing a product catalogue that the results are based upon. This is primarily JCQ and any other awarding organisations that elect to adopt the ISB/A2C data standards.

Having created the reference set, it is necessary to match QE Outcomes against this data in those cases where the primary key AO_Qualification_Element_ID is not present. Here a check will need to be made to identify the incoming data using the appropriate identifiers. It is anticipated that approximately 95% of the records currently being collected can be matched using the QAN, and that this can be improved by including the DiscCode where it is available. Further work in examining the data sets may reveal further mechanisms to optimise this process.
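One way to picture the matching described above is the sketch below. It is illustrative only: the reference rows, QAN and DiscCode values are invented, the field names (`party_id`, `ao_id`, `qan`, `disc_code`) are hypothetical, and returning `None` stands in for whatever managed or manual process handles unresolved records.

```python
from typing import Optional, Tuple

# Invented reference set. The awarding body's Party_ID is paired with the
# body's own element ID so that "Q1" from two bodies stays distinct.
REFERENCE = [
    {"party_id": 1001, "ao_id": "Q1", "qan": "50012345", "disc_code": "RB1"},
    {"party_id": 1002, "ao_id": "Q1", "qan": "50067890", "disc_code": "FK2"},
    {"party_id": 1002, "ao_id": "Q2", "qan": "50067890", "disc_code": "FK2B"},
]


def match_outcome(outcome: dict, reference: list = REFERENCE) -> Optional[Tuple[int, str]]:
    """Return (Party_ID, element ID) for a QE Outcome, or None if unresolved."""
    # Primary key present: no matching is needed.
    if outcome.get("ao_id") and outcome.get("party_id"):
        return (outcome["party_id"], outcome["ao_id"])
    # Otherwise match on QAN first.
    candidates = [r for r in reference if r["qan"] == outcome.get("qan")]
    # Use the DiscCode, where available, to narrow an ambiguous QAN match.
    if len(candidates) > 1 and outcome.get("disc_code"):
        candidates = [r for r in candidates if r["disc_code"] == outcome["disc_code"]]
    if len(candidates) == 1:
        return (candidates[0]["party_id"], candidates[0]["ao_id"])
    return None  # unmatched or still ambiguous
```

A record that matches several reference rows on QAN alone is left unresolved unless the DiscCode narrows it to one, which mirrors the statement that including the DiscCode improves the match rate.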

5 Assumptions

The following assumptions have been made with regard to the data entities and the party roles:

Parents

Parents are not allocated a unique identifier to state that they are a parent; however, a parent could also be a teacher, a member of staff or a learner themselves. This means that a data warehouse may contain duplicate records for one individual, based on the roles they are recorded against. The purpose of holding parent data is to find contact details such as a telephone number, which is unique; however, it is likely that two parents may use the same number.

Staff Number

Some teachers and non-teaching staff do not hold a unique identifier such as a TRN (Teacher Reference Number), apart from their staff number; however, the staff number is only unique to the issuing authority/organisation that allocates it and pays the salary/wages to that individual. As it is not known who employs the teacher or non-teaching staff contractually (i.e. school, local authority or agency), the school number (URN) along with the staff number will be used to create a unique identifier.

UPN Scenarios

i. Learner has a UPN throughout their education and it is used on their qualifications as the identifier

ii. Learner has a UPN throughout their education and it is NOT used on their qualifications as the identifier

iii. Learner is taught at home and is therefore a home learner, so does NOT have a UPN

iv. Learner has both a UPN and a ULN, and the ULN is used on their qualifications as the identifier
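As an illustration of the Staff Number assumption, the school number (URN) and staff number might be combined as sketched below. The function name and the hyphen-delimited format are assumptions; the document only states that the two values are used together to create a unique identifier.

```python
def staff_identifier(urn: str, staff_number: str) -> str:
    """Combine the school URN with a locally issued staff number."""
    urn = urn.strip()
    staff_number = staff_number.strip()
    if not urn or not staff_number:
        raise ValueError("both URN and staff number are required")
    # The delimiter keeps ("12", "3456") distinct from ("123", "456").
    return f"{urn}-{staff_number}"


same_number_two_schools = (
    staff_identifier("123456", "0042"),
    staff_identifier("654321", "0042"),
)
```

The same staff number issued by two different schools then yields two different identifiers.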

6 Data Sources & Identifiers

The following table identifies the operational data sources and relevant identifiers for use in a data warehouse:

Data Source Identifiers in data source Entity – Sub Type Key Identifiers used Relevant characteristics of data source

AddressBase

Based on AddressBase Premium identifiers:

RECORD_IDENTIFIER - int 2

LOCAL_CUSTODIAN_CODE - int 4

UPRN - int 12

X_COORDINATE - float 6 (2)

Y_COORDINATE - float 7 (2)

RM_UDPRN - int 16

LPI_KEY - char 14

USRN - int 8

ORG_KEY - char 14

XREF_KEY - char 14

CROSS_REFERENCE - char 50

SUCC_KEY - char 14

Locator – Postal Address

UPRN

Address data – inc Post Code

Alternative Provision Census

UPN - A(13)

SAON - A(100)

PAON - A(100)

Street – A(100)

Party – Person

Party – Organisation

Locator – Postal Address

Collections from non-maintained schools unlikely to have UPN

Locality – A(35)

Town – A(30)

Administrative Area – A(30)

Post town – A(8)

Postcode – A(8)

Address Line 1 – A(40)

Address Line 2 – A(40)

Address Line 3 – A(40)

Address Line 4 – A(40)

Address Line 5 – A(40)

LA Number - 9(3)

DfES Establishment Number - 9(4)

Assessment Components (A_COMP) reference data

Indicator Identifier - A(12)

Assessment Identifier - A(3)

Component Parameters – A(22)

QE – Assessable

QE Availability

Look up data about assessment components types

Awarding Body data (KS4, NISVQ)

cand_id - A(13)

Qualcode - A(12)

centr_id - A(12)

ULN – A(10)

Sylcode – A(12)

Party - Organisation

Party - Person

Party Role

PRR

QE Award

Party_Id

Sylcode?

Qualcode?

Locator_id

Qual results – candidate no

Note: Candidate No. is often the UCI, but it is not known whether it is always the UCI; not all Awarding Organisations may support it, as it is a JCQ concept (and not all AOs are JCQ members).

Sch_code - A(12)

A_body - A(10)

c_post – A(8)

c_add1 – A(70)

c_add2 – A(70)

c_add3 – A(70)

c_add4 – A(30)

NCN - A(10)

DCSF_no - 9999999

QE Outcome

PRR Party Name

PRR Party Contact

Locator – Postal Address

Party Contact

QE Booking

QE Learning Booking

Party Name

Children Looked After

Child identifier (i.e. CHILD_LA_CODE) - A(10)

Unique Pupil Number (UPN) - A(13) or A(3)

Party - Person UPN Collection with children’s services Unique Identifier

Consistent Financial Reporting

LA Number - 9(3)

DfES Establishment Number - 9(4)

School Email Address - A(254)

LEA

Estab

Party – Organisation

Locator – E-mail Address

Establishment No.

E-mail Address

Collection about school – URN will be used

Early Years Census

LA Number - 9(3)

Establishment Unique Reference Number - 9(6)

Ofsted EY URN - Either AA999999 or 999999

SAON – A(100)

PAON – A(100)

Street – A(100)

Locality – A(35)

Town – A(30)

Administrative Area – A(30)

Post town – A(30)

Postcode - A(8)

Address Line 1 – A(40)

Address Line 2 – A(40)

Address Line 3 – A(40)

Address Line 4 – A(40)

Address Line 5 – A(40)

School Email Address – A(254)

Telephone Number – A(35)

Party – Organisation

Locator – Postal Address

Locator – Telephone Number

Locator – Email Address

Establishment No.

Ofsted EY URN

Collection of 2-5 year olds; those not in a maintained school may not contain a UPN

Early Years Foundation Stage Profile

Unique Pupil Number (UPN) - A(13)

LA Number - 9(3)

DfES Establishment Number - 9(4)

Establishment Unique Reference Number - 9(6)

Assessment Identifier - A(3)

Postcode - A(8)

Party – Person

Party – Organisation

Locator – Postal Address

QE Outcome

UPN

Establishment No.

Postcode

Assessment at EY see above

Edubase

Property Easting - 999999.9

Property Northing - 999999.9

LA Number - 9(3)

DfES Establishment Number - 9(4)

Establishment Unique Reference Number - 9(6)

Unique Property Reference Number (UPRN) - BS7666 UPRN 999999999999

Parliamentary Constituency Code - Defined in Pupil CBDS (Pupil Address Section)

Learning & Skills Council Code - Defined in Pupil CBDS (Pupil Address Section)

Telephone Number - A(35)

Street – A(100)

Locality – A(35)

Party – Organisation

Locator – Postal Address

Locator – Geographical Location

Locator – URL

Locator – Telephone Number

Establishment No.

URN

UKPRN

UPRN

Postcode

Property Easting

Property Northing

URL

Telephone Number

School ref data – URN, establishment no. plus others

Town – A(30)

Postcode - A(8)

Address Line 3 – A(40)

Establishment Number - integer (4)

FE/HE Identifier - integer (6)

Local Authority – text(3)

Previous Establishment Number - Integer (4)

Telephone Number

UK Provider Reference Number (UKPRN) - Integer (8)

Unique Reference Number (URN) - Integer (6)

Website Address

General Hospital Schools / SLASC

LA - N(3)

Estab - N(4)

TelephoneSTD - N(5)

TelephoneNumber - N(7)

School Email – A(254)

Email – A(254)

Postcode – A,N(8)

Party – Organisation

Locator – Postal Address

Locator – Telephone Number

Locator – Email Address

School Level Annual School Census

Note: no pupil level data is collected in this census.

Source: http://media.education.gov.uk/assets/files/pdf/s/2013%20slasc%20business%20and%20technical%20specification%20v1-2.pdf

SAON - A,N(100)

PAON - A,N(100)

Street - A(100)

Locality - A(35)

Town - A(30)

Administrative Area - A(30)

Post Town - A(30)

Address Line 1 (200101) – A(40)

Address Line 2 (200102) – A(40)

Address Line 3 (200103) – A(40)

Address Line 4 (200104) – A(40)

Address Line 5 (200105) – A(40)

Independent Schools / SLASC

LA - N(3)

Estab - N(4)

TelephoneSTD - N(5)

TelephoneNumber - N(7)

School Email – A(254)

Email – A(254)

Postcode – A,N(8)

SAON - A,N(100)

Party – Organisation

Locator – Postal Address

Locator – Telephone Number

Locator – Email Address

School Level Annual School Census

Note: no pupil level data is collected in this census.

Source: http://media.education.gov.uk/assets/files/pdf/s/2013%20slasc%20business%20and%20technical%20specification%20v1-2.pdf

PAON - A,N(100)

Street - A(100)

Locality - A(35)

Town - A(30)

Administrative Area - A(30)

Post Town - A(30)

Address Line 1 (200101) – A(40)

Address Line 2 (200102) – A(40)

Address Line 3 (200103) – A(40)

Address Line 4 (200104) – A(40)

Address Line 5 (200105) – A(40)

Key Stage 1

Unique Pupil Number (UPN) - A(13)

LA Number - 9(3)

DfES Establishment Number - 9(4)

Party - Person

Party - Organisation

QE – Assessable

UPN

Establishment No.

UPN used

Key Stage 1 Phonics

Not known UPN used

Key Stage 2 (tests)

CurrLA

CurrEstab

CurrDfENo. (LA + Estab No)

PrevLA

Party – Person

Party – Organisation

UPN

CurrDfENo.

PrevDfENo.

UPN used

Note: this list is based on the KS2 data source Jonathan has received for testing.

PrevDfENo. (LA + Estab No)

URN

UPN

National Pupil Database

Unique Pupil Number (UPN) - A(13)

LA Number - 9(3)

DfES Establishment Number - 9(4)

Establishment Unique Reference Number - 9(6)

Sub-dwelling – A(100)

Dwelling – A(100)

Street – A(100)

Locality – A(35)

Town – A(30)

Administrative Area – A(30)

Post town – A(30)

Postcode - A(8)

Address Line 1 – A(40)

Address Line 2 – A(40)

Address Line 3 – A(40)

Address Line 4 – A(40)

Address Line 5 – A(40)

Party – Person

Party – Organisation

Locator – Postal Address

Locator – Telephone Number

UPN

URN

Establishment No.

Postcode

Telephone Number

UPN used and URN

Telephone Number - A(35)

Ofsted Inspections

LA Number - 9(3)

DfES Establishment Number - 9(4)

Party - Organisation Establishment No.

URN and own ‘URN’ for other schools they inspect (e.g. nurseries)

N.B. the same URN can be issued by DfE and Ofsted but refer to different schools

ONS geographical data

Not known LA codes as reference set

Optional Tests Not known Needs further analysis

Ordnance Survey Reference Data

Not known UPRN, eastings and Northings

PRU Census

Unique Pupil Number (UPN) - A(13)

Pupil's Former UPN - A(13)

ULN – 9999999999

SAON – A(100)

PAON – A(100)

Sub-dwelling – A(100)

Party – Organisation

Party – Person

Locator – Postal Address

Locator - Email Address

Locator – Telephone Number

Establishment No.

UPN

ULN

Postcode

E-mail Address

Telephone Number

Collection should have UPN and URN

Dwelling – A(100)

Street – A(100)

Locality – A(35)

Town – A(30)

Administrative Area – A(30)

Post town – A(30)

Postcode - A(8)

Address Line 1 – A(40)

Address Line 2 – A(40)

Address Line 3 – A(40)

Address Line 4 – A(40)

Address Line 5 – A(40)

LA Number - 9(3)

DfES Establishment Number - 9(4)

School Email Address - A(254)

Telephone Number - A(35)

QAN WebService Look Up tables

Qualification Identifier – 99999999

Qualification Accreditation Number (QAN) - A(8)

AB (Awarding Body Code) - A(3)

AB_ID (Awarding Body Identifier) – 99999999

QE-Award, Scheme

QE Classification

QE Sector Subject Area

QE Sector Subject

QAN

Discount Code

Qual Accreditation No look up

AB_Acronym (Awarding Body Acronym) - A(20)

Qualification Type Code - A(3)

Discount Code Map Code - A(4)

Sector Subject Framework Tier 2 Code - A(4)

Sector Subject Framework Tier 1 Code - A(2)

Awarding Body Identifier – 99999999

Qualification Type Identifier – 99999999

Qualification Type Code - A(3)

Qualification Code - A(3)

Discount Code Map Identifier – 99999999

Discount Code Identifier – 99999999

Discount Code - A(4)

Sector Subject Framework Tier 2 Identifier – 99999999

Hierarchy

Qualification Element Funding

QE Grade Range Grade

Qualification Element Grade Range

QE Grade Performance Points

Qualification Element Framework

Qualification Framework

Party

Party Role

Qualifications Reference Database

QUID (QAN) QE QUID (QAN) Qual Accreditation No look up

AB_Number

AB_Code_Admin

AB_Code

AB_Code_NDAQ

PointsID

Disc_Code

SSA1Code

SSA2Code

Syllabus_Ref

NISVQ_Type

Qual_Number

QE

Party - Organisation

Disc_Code

AB_Number

School Census

Unique Pupil Number (UPN) - A(13)

Pupil's Former UPN - A(13)

ULN – 9999999999

LEA Number

Estab

LAEstab

Sub-dwelling – A(100)

Dwelling – A(100)

SAON – A(100)

Party – Person

Locator – Postal Address

Locator – Email Address

Locator – Telephone Number

UPN

Former UPN

ULN

Original UPN

Postcode

PAON

SAON

Email Address

Telephone Number

Collection should have UPN and URN

PAON – A(100)

Street – A(100)

Locality – A(35)

Town – A(30)

Administrative Area – A(30)

Post Town – A(30)

Postcode – A(8)

AddressLine1 – A(40)

AddressLine2 – A(40)

AddressLine3 – A(40)

AddressLine4 – A(40)

AddressLine5 – A(40)

Email Address

Contact Telephone Number – A(35)

School Workforce

LA Number - 9(3)

DfES Establishment Number - 9(4)

Teacher Number – 9999999

National Insurance Number (NINO) – AannnnnnA

Party – Organisation

Party – Person

Establishment No.

Teacher Number

NINO

Collection about staff in school should have URN – need to identify what staff UIDs used

School Level Database

Not known School level URN used

YPLA Finance Data

Not known

7 References

Website Topic URL

Education.gov.uk UPN http://www.education.gov.uk/researchandstatistics/datatdatam/upn/a0064543/what-is-a-unique-pupil-number

Education.gov.uk ULN http://www.education.gov.uk/a00220050/uln

Loughborough University

CIN Census http://www.lboro.ac.uk/research/ccfr/Publications/Developing%20definitions%20of%20local%20authority%20services.pdf

WJEC UCI http://www.wjec.co.uk/index.php?nav=140

HESA TRN http://www.hesa.ac.uk/component/option,com_studrec/task,show_file/Itemid,233/mnl,08053/href,e%5E_%5ETREFNO.html

HMRC NINO http://www.hmrc.gov.uk/ni/intro/basics.htm

Education.gov.uk Edubase FAQ’s http://www.education.gov.uk/edubase/faq.xhtml

Education.gov.uk Edubase Glossary

http://www.education.gov.uk/edubase/glossary.xhtml?letter=U

UKRLP UKPRN http://www.ukrlp.co.uk/html/ukrlp/ukprnhelp.html

UKRLP UKRLP About http://www.ukrlp.co.uk/

Skills Funding Agency

SFA About http://skillsfundingagency.bis.gov.uk/aboutus/

The Data Service Schools data http://www.thedataservice.org.uk/datadictionary/datasets/0910/schools_data_-_admin_0910.htm

Education.gov.uk LA Codes http://www.education.gov.uk/rsgateway/DB/STA/t000990/index.shtml

HESA Awarding Body https://www.hesa.ac.uk/index.php?option=com_collns&task=show_manuals&Itemid=233&r=05011&f=047

Examination Officers’ Association

Centre Number http://www.examofficers.org.uk/jargon-buster

Woodland Trust Registered Charity Number

http://www.woodlandtrust.org.uk/en/support-us/legacies/faq/Pages/registered-charity-number.aspx

Ordnance Survey UPRN http://www.ordnancesurvey.co.uk/oswebsite/products/technical-faq.html

Ordnance Survey OS Master Map Address Layer

http://www.ordnancesurvey.co.uk/oswebsite/docs/technical-specifications/os-master-map-address-layer-2-technical-specification.pdf

Collins Dictionary Telephone Number

http://www.collinsdictionary.com/dictionary/english/telephone-number

Dictionary.com Email Address http://dictionary.reference.com/browse/electronic%20mail%20address

Collins Dictionary URL/ Website http://www.collinsdictionary.com/dictionary/english/website?showCookiePolicy=true

Education.gov.uk CBDS Database http://www.education.gov.uk/schools/adminandfinance/schooladmin/a0058744/common-basic-data-set-cbds-database

Education.gov.uk Qualifications http://www.education.gov.uk/section96/search/search.cfm

Edexcel QAN http://www.edexcel.com/iwantto/Pages/gq-qan.aspx

Education Data A Comp http://www.educationdata.org.uk/a_comp/

Ofqual Qualifications http://register.ofqual.gov.uk/Qualification

DfE QAN Website QAN https://collectdata.education.gov.uk/qwsweb/(S(2x100p55orv1zi55aiypkz32))/Main.aspx

AQA Awarding Body http://web.aqa.org.uk/qual/newgcses/art_dan_dra_mus/new/art_overview2.php

Education.gov.uk QTS (Teachers) http://www.education.gov.uk/b00204081/award-of-qts

Land Registry Blog PAON, SAON http://productblog.landregistry.gov.uk/faqs/what-does-paon-saon-mean/#

National Pupil Database Wikispace

Identifiers https://nationalpupildatabase.wikispaces.com/IDs

Education.gov.uk SLASC http://media.education.gov.uk/assets/files/pdf/s/2013%20slasc%20business%20and%20technical%20specification%20v1-2.pdf

Education.gov.uk SLASC http://www.education.gov.uk/researchandstatistics/stats/slasc/a00217199/slasc-2013-technical-specification

Dictionary.com Derive http://dictionary.reference.com/browse/derive

© Crown copyright 2014 The Information Standards Board (ISB) is an advisory body to the Department for Education (DfE) and the Department for Business, Innovation and Skills (BIS). The information it produces is subject to Crown copyright, which is administered by the National Archives. The Crown copyright protected information in this document (other than ISB or Departmental logos) may be reproduced free of charge in any format or medium under the terms of the Open Government Licence, available from the National Archives website. Any reuse is subject to the material being reproduced accurately and not used in a misleading context. It must be acknowledged as being protected by Crown copyright and the title of the source material must be supplied with the ISB named as the corporate author. Authorisation to reproduce any information in this standard which is identified as being the copyright of a third party must be obtained from the copyright holders concerned.