

PharmaSUG 2020 - Paper SS-151

Supplementary Steps to Create a More Precise ADaM define.xml in Pinnacle 21 Enterprise

Majdoub Haloui, Hong Qi, Merck & Co., Inc, North Wales, PA, USA

ABSTRACT

The analysis data definition document, ADaM define.xml, is a required document in a regulatory submission package. It provides the information needed to describe the submitted ADaM datasets and their variables. A high-quality define.xml is important for a smooth review process. Pinnacle 21 Enterprise enables automation and standardization in generating a high-quality define.xml. However, due to certain limitations of the current version of the Pinnacle 21 Enterprise software, extra steps are needed to create a more precise define.xml after the ADaM dataset specification is imported. These steps generate an ADaM define.xml with better descriptions of attributes, controlled terms, and the source for certain variables. This paper introduces the detailed steps leading to a more accurate ADaM define.xml file.

INTRODUCTION

In the past, creating a high-quality define.xml required solid knowledge of the standards and mastery of XML. Pinnacle 21 Enterprise eliminates the need for the latter and eases the challenge of learning and becoming proficient with the standards. Its define.xml generator is based on Excel, which allows users to focus on the metadata content instead of the complex XML syntax.

Pinnacle 21 Enterprise is a leading web-based industry application, used by sponsors and CROs to validate SDTM/ADaM datasets and define.xml against CDISC standards. FDA and PMDA use Pinnacle 21 Enterprise to review submission data from sponsors. Pinnacle 21 Enterprise has many useful features, including validating SDTM, ADaM and define.xml, as well as generating define.xml version 2.0. When creating a define.xml, care must be taken so that all the information within it is clear, concise and compliant with CDISC and regulatory agency requirements. This paper introduces the extra steps needed to create a more precise define.xml after the ADaM dataset specification is imported. These steps generate an ADaM define.xml with better descriptions of the attributes, controlled terms, and the source for certain variables.

ADAM DEFINE.XML GENERATION PROCESS FLOW

In the past, creating a submission-ready define.xml demanded strong knowledge of the CDISC standards and mastery of XML. The absence of such knowledge was a major setback for many statistical programmers working on a regulatory submission. Pinnacle 21 Enterprise has a module that creates define.xml; however, this module has some limitations when creating Value Level Metadata for variables other than AVAL and AVALC (Display 1.1).

Display 1.1 Pinnacle 21 Enterprise Module Creating Value Level Metadata


We collaborated with the Pinnacle 21 team to create a customized define.xml generator to help generate the define.xml based on our ADaM datasets specification in Excel format. This customized software allows statistical programmers to focus more on the metadata content of the study rather than the complex define.xml syntax (Display 1.2).

Display 1.2 Metadata Content in define.xml Generated by Customized Pinnacle 21 Enterprise Module

An overview of the ADaM define.xml generation process flow is shown in Figure 1.

Figure 1 ADaM define.xml generation Process Flow


EXTRA STEPS

This section describes the additional steps needed to create a more precise define.xml after the ADaM dataset specification is imported into the Pinnacle 21 Enterprise tool. These steps work around some limitations in the automation feature of the current Pinnacle 21 Enterprise software and generate an ADaM define.xml with better descriptions of the attributes, controlled terms, and the source for certain variables. A few of the steps also lead to an improved ADaM validation score.

1. PREDECESSOR ORIGIN TYPE FOR DATE VARIABLE IN CHARACTER FORMAT

The Define-XML standard defines the following origin types: “CRF”, “Derived”, “Assigned”, “Protocol”, “eDT” and “Predecessor”. The “CRF”, “Protocol” and “eDT” origin types are generally used by SDTM variables, while the “Derived”, “Assigned” and “Predecessor” origin types are used by ADaM variables and parameters. The Define-XML Origin element is used to provide metadata traceability for SDTM, ADaM and SEND data. If a variable is carried over from another dataset into an ADaM dataset “as-is” (same value, same data type and same label), then the origin should be “Predecessor”.

| Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Origin / Source / Method / Comment |
|---|---|---|---|---|---|
| VARIABLE | Variable Label | text | 12 | | Predecessor: DATASET.VARIABLE |

Display 2.1 Example of a Variable with Predecessor Origin Type

In Display 2.1, “DATASET.VARIABLE” is the predecessor value, such as “DM.USUBJID”. The DATASET referenced in the predecessor value must exist in the submission folder, and the VARIABLE must exist in that DATASET. Many date/time variables in character format (DTC) are of “Predecessor” origin. These include, but are not limited to, the variables listed in Table 1.1.

| Dataset | Variable | Label | Type |
|---|---|---|---|
| ADAE | AESTDTC | Start Date/Time of Adverse Event | date |
| ADAE | AEENDTC | End Date/Time of Adverse Event | date |
| ADCM | CMSTDTC | Start Date/Time of Medication | date |
| ADCM | CMENDTC | End Date/Time of Medication | date |
| ADCM | PRGRSDTC | Progression Date After Treatment Phase | date |
| ADEX | EXSTDTC | Start Date/Time of Treatment | datetime |
| ADEX | EXENDTC | End Date/Time of Treatment | datetime |
| ADLBGRD | LBDTC | Date/Time of Specimen Collection | date |
| ADMH | MHSTDTC | Start Date/Time of Medical History Event | date |
| ADMH | MHDTC | Date/Time of History Collection | date |
| ADSL | RFSTDTC | Subject Reference Start Date/Time | datetime |
| ADSL | RFENDTC | Subject Reference End Date/Time | datetime |
| ADTL | TRDTC | Date/Time of Tumor Measurement | date |

Table 1.1 DTC of Predecessor Origin
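The rule that a predecessor's DATASET and VARIABLE must both exist can be checked before the specification is imported. The following is a minimal sketch in Python, not a Pinnacle 21 feature: the function name `check_predecessors` and both input dictionaries are hypothetical, and the contents of the submission folder are assumed to have been read into `available` beforehand.

```python
def check_predecessors(predecessors, available):
    """Return (ADaM variable, predecessor) pairs whose predecessor cannot be resolved.

    predecessors: maps "ADAMDATASET.VARIABLE" to its "DATASET.VARIABLE" predecessor.
    available:    maps each dataset in the submission folder to its set of variables.
    """
    unresolved = []
    for adam_var, pred in predecessors.items():
        dataset, _, variable = pred.partition(".")
        # Unresolved if the dataset is absent or lacks the variable
        if variable not in available.get(dataset, set()):
            unresolved.append((adam_var, pred))
    return unresolved


# Hypothetical inputs for illustration only
available = {
    "DM": {"USUBJID", "RFSTDTC", "RFENDTC"},
    "AE": {"USUBJID", "AESTDTC", "AEENDTC"},
}
predecessors = {
    "ADSL.RFSTDTC": "DM.RFSTDTC",
    "ADAE.AESTDTC": "AE.AESTDTC",
    "ADCM.CMSTDTC": "CM.CMSTDTC",  # CM is not in the assumed folder
}
unresolved = check_predecessors(predecessors, available)
```

In this example only ADCM.CMSTDTC is reported, because the assumed folder contains no CM dataset.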


In the ADaM datasets specification, the type of these DTCs is documented as “Char” and the length is specified to define the characteristics of SAS variables. Taking ADSL.RFSTDTC as an example, Display 2.2 shows how RFSTDTC from the DM domain is defined in the ADSL tab.

| Variable Name | Variable Label | Type | Length | Sig Digits | Format | Codelist / Controlled Terms | Origin | Define Derivation |
|---|---|---|---|---|---|---|---|---|
| RFSTDTC | Subject Reference Start Date/Time | Char | 19 | | | | Predecessor | DM.RFSTDTC |

Display 2.2 ADSL.RFSTDTC defined in ADaM datasets specification

The define files generated by simply importing an ADaM datasets specification to Pinnacle 21 Enterprise describe ADSL.RFSTDTC as shown in Display 2.3. The attributes of ADSL.RFSTDTC deviate from its “Predecessor”, DM.RFSTDTC, in terms of Type, Length and Controlled Terms or Format (Display 2.4).

| Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Origin / Source / Method / Comment |
|---|---|---|---|---|---|
| RFSTDTC | Subject Reference Start Date/Time | text | 19 | | Predecessor: DM.RFSTDTC |

Display 2.3 ADSL.RFSTDTC described in the ADaM define.xml

| Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Origin / Source / Method / Comment |
|---|---|---|---|---|---|
| RFSTDTC | Subject Reference Start Date/Time | datetime | | ISO8601 | Derived: First dose of study medication |

Display 2.4 DM.RFSTDTC described in the SDTM define.xml

In order to describe ADSL.RFSTDTC the same as its “Predecessor” DM.RFSTDTC, manual update of the Define module in Pinnacle 21 Enterprise is needed. This process is illustrated in Figure 2.

Figure 2 Update Attributes of DTC in Pinnacle 21 Enterprise to Achieve Consistent Attributes as Predecessor’s

Define in Pinnacle 21 Enterprise, before the update:

| Dataset | Variable | Label | Type | Length |
|---|---|---|---|---|
| ADSL | RFSTDTC | Subject Reference Start Date/Time | text | 19 |

Replace the Type with “datetime” and delete the Length:

| Dataset | Variable | Label | Type | Length |
|---|---|---|---|---|
| ADSL | RFSTDTC | Subject Reference Start Date/Time | datetime | |

Export ADaM define.xml. ADSL.RFSTDTC with consistent attributes as its predecessor:

| Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Source/Derivation/Comment |
|---|---|---|---|---|---|
| RFSTDTC | Subject Reference Start Date/Time | datetime | | ISO8601 | Predecessor: DM.RFSTDTC |
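The manual update in Figure 2 amounts to copying the predecessor's Type and format and dropping the Length. The sketch below expresses that logic in Python on plain dictionaries. It does not call any Pinnacle 21 interface; the function name `align_dtc_with_predecessor` and the dictionary keys are assumptions made for illustration.

```python
def align_dtc_with_predecessor(adam_var, sdtm_var):
    """Return a copy of the ADaM variable metadata aligned with its predecessor.

    Only date/time predecessors are handled: the Type and format are copied
    and the Length is removed, as in the manual update of Figure 2.
    """
    updated = dict(adam_var)  # leave the input untouched
    if sdtm_var.get("type") in {"date", "datetime"}:
        updated["type"] = sdtm_var["type"]
        updated.pop("length", None)            # "Delete" step
        updated["format"] = sdtm_var.get("format")
    return updated


# Metadata as shown in Display 2.3 (ADaM) and Display 2.4 (SDTM)
adsl_rfstdtc = {
    "name": "RFSTDTC",
    "type": "text",
    "length": 19,
    "origin": "Predecessor: DM.RFSTDTC",
}
dm_rfstdtc = {
    "name": "RFSTDTC",
    "type": "datetime",
    "format": "ISO8601",
    "origin": "Derived: First dose of study medication",
}
aligned = align_dtc_with_predecessor(adsl_rfstdtc, dm_rfstdtc)
```

The origin is deliberately left unchanged: the ADaM variable remains a “Predecessor” even though DM.RFSTDTC itself is “Derived”.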


2. VARIABLE LENGTH

The variable length specified in define.xml should match that of the variable values in the dataset. Variable length describes the maximum expected length of the variable. It should be present only for a data type of "text", "integer", or "float". In the case of Type="integer", the length refers to the maximum length of the numeric value expressed in characters. Because an integer can only be given a length of 8 in the data specification for SAS variable(s), the actual length needs to be updated in both the Variables and Value Level tabs of the Pinnacle 21 Enterprise tool (Display 3.2 and Display 3.3) when the value is longer than 8 characters. Otherwise, the ADaM validation report shows “Error” for Rule ID SD1231, as shown in Display 3.1 below. This usually happens for the variable SRCSEQ, in multiple datasets, when its values are longer than 8 characters.

Issue Summary

| Dataset | Rule ID | Publisher ID | Message | FDA | PMDA | Found |
|---|---|---|---|---|---|---|
| ADRS | SD1231 | | SRCSEQ value is longer than defined max length 8 when PARAMCD == 'BORCFIRC' | Error | Warning | 973 |
| ADRS | SD1231 | | SRCSEQ value is longer than defined max length 8 when PARAMCD == 'ORINV' | Error | Warning | 3090 |
| ADRS | SD1231 | | SRCSEQ value is longer than defined max length 8 when PARAMCD == 'ORIRC' | Error | Warning | 3868 |
| ADRS | SD1324 | | Define.xml/dataset variable label mismatch | Error | Error | 1 |

Display 3.1 Pinnacle 21 Enterprise Validation Report

Display 3.2 Length Entered for ADRS.SRCSEQ in Variables Tab of Pinnacle 21

Display 3.3 Length Entered for ADRS.SRCSEQ in Value Level Tab of Pinnacle 21


After the steps in Display 3.2 and Display 3.3, the generated define.xml shows SRCSEQ with “15” under Length or Display Format, as in Display 3.4.

Analysis Dataset of Response (ADRS) [Location: adrs.xpt]

| Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Source/Derivation/Comment |
|---|---|---|---|---|---|
| SRCSEQ | Source Sequence Number | integer | 15 | | Refer to Parameter Value Level Metadata |

Display 3.4 ADRS.SRCSEQ in define.xml
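The length to enter for an integer variable can be determined from the data before validation. The sketch below is an illustrative Python helper, not part of Pinnacle 21: it measures each value as a character string, as described above, and compares the longest one with the defined length. The function names and the sample SRCSEQ values are hypothetical.

```python
def max_char_length(values):
    """Longest value when each is written as a character string (missing values skipped)."""
    return max((len(str(v)) for v in values if v is not None), default=0)


def length_finding(values, defined_length):
    """Compare the defined length with the longest value actually present."""
    actual = max_char_length(values)
    return {
        "defined": defined_length,
        "actual": actual,
        "exceeds": actual > defined_length,  # the condition behind an SD1231-style finding
    }


# Hypothetical SRCSEQ values; 10**14 + 1 is written with 15 digits
srcseq_values = [12, 345, None, 10**14 + 1]
finding = length_finding(srcseq_values, defined_length=8)
```

Here the longest value has 15 characters, so the defined length of 8 would need to be raised to 15 in both tabs. Note that `str` includes the sign of a negative number in the count.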

3. CONTROLLED TERMINOLOGY AND ADAM IG

The ADaM Implementation Guide (ADaM IG v1.1) lists many variables that are subject to controlled terminology (CT). As displayed in the table below, the variables AGEU, SEX, and RACE should have codelists in the define.xml.

Example:

However, the ADaM IG does not offer much guidance on providing CT for other variables. This does not mean that CT need not be defined for them. Section 2.6.3, General Considerations for codelists, of the Define-XML Version 2.0 Completion Guidelines document states: “In addition to variables subject to controlled terminology as per CDISC IGs and sponsor-specific controlled terminology, codelist should also be provided for all other variables and value-level definitions which have a predefined and finite set of categorical allowable values.” Some variables that should have CT are PARAM/PARAMCD/PARAMN and AVISIT/AVISITN. For more situations where codelists are expected, please refer to Table 2.6.3.2, Situations where codelists are expected, in the Define-XML Version 2.0 Completion Guidelines document.

Some National Cancer Institute (NCI) codes for codelists, and external codelists, cannot be imported directly into the Pinnacle 21 Enterprise tool from an ADaM datasets specification and therefore need to be entered manually, as described below.
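For variables such as PARAMCD/PARAM, the finite set of allowable values can be collected from the data to draft a codelist. The following Python sketch illustrates one way to do this under stated assumptions: records are plain dictionaries, the function name `build_codelist` is hypothetical, and the PARAM decodes shown are invented for illustration (only the PARAMCD values appear in this paper).

```python
def build_codelist(records, code_var, decode_var):
    """Collect distinct (code, decode) pairs and reject inconsistent decodes."""
    pairs = {}
    for rec in records:
        code, decode = rec[code_var], rec[decode_var]
        # setdefault stores the first decode seen; a different one later is an error
        if pairs.setdefault(code, decode) != decode:
            raise ValueError(f"{code_var} {code!r} has more than one {decode_var}")
    return sorted(pairs.items())


# Hypothetical ADRS records; the decodes are illustrative only
adrs_records = [
    {"PARAMCD": "ORIRC", "PARAM": "Overall Response by IRC"},
    {"PARAMCD": "ORINV", "PARAM": "Overall Response by Investigator"},
    {"PARAMCD": "ORIRC", "PARAM": "Overall Response by IRC"},
]
param_codelist = build_codelist(adrs_records, "PARAMCD", "PARAM")
```

The resulting pairs are a draft only. The terms still need to be reviewed against the specification before they are entered as a codelist.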

3.1 NCI CODE IN CODELIST OF CONTROLLED TERMS

When an NCI code is missing in the Codelists or Terms tab for codelist value(s), Pinnacle 21 Enterprise define validation reports “Error” for Rule ID DD0031 or DD0032 (Display 4.1). This happens for the codelists AEACN (Action Taken with Study Treatment), ADTL0DTYPE (Derivation Type of Target Lesion) and ADTL0PARAMTYP (Parameter Type of Target Lesions). These NCI codes need to be typed into the Codelists (Display 4.2) or Terms (Display 4.3) tab of the Pinnacle 21 Enterprise define.

Issue Summary

| Dataset | Rule ID | Publisher ID | Message | FDA | PMDA | Found |
|---|---|---|---|---|---|---|
| | DD0032 | | Missing NCI Code for Term in Codelist 'ADTL0PARAMTYP' | Error | Error | 1 |
| | DD0031 | | Missing NCI Code for Codelist 'AEACN' | Error | Error | 1 |
| | DD0032 | | Missing NCI Code for Term in Codelist 'AEACN' | Error | Error | 4 |
| | DD0031 | | Missing NCI Code for Codelist 'ADTL0DTYPE' | Error | Error | 1 |
| | DD0031 | | Missing NCI Code for Codelist 'ADTL0PARAMTYP' | Error | Error | 1 |

Display 4.1 Pinnacle 21 Enterprise Tool Validation Report Displaying Errors for Missing NCI Code

Display 4.2 NCI Code Entered for AEACN in Codelists Tab of Pinnacle 21

Display 4.3 NCI Codes Entered for AEACN in Terms Tab of Pinnacle 21 Tool
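Missing NCI codes can also be spotted in an exported define.xml before validation is rerun. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes the usual Define-XML 2.0 layout, in which an NCI code is carried by an `Alias` element with `Context="nci:ExtCodeID"` under a `CodeList` or one of its items. The embedded fragment is heavily simplified, the code `C00001` is a placeholder rather than a real NCI code, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
NS = {"odm": ODM_NS}

# Simplified fragment for illustration; "C00001" is a placeholder, not a real NCI code
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <CodeList OID="CL.AEACN" Name="AEACN" DataType="text">
    <EnumeratedItem CodedValue="DOSE NOT CHANGED">
      <Alias Context="nci:ExtCodeID" Name="C00001"/>
    </EnumeratedItem>
    <EnumeratedItem CodedValue="DRUG WITHDRAWN"/>
  </CodeList>
</ODM>"""


def has_nci_alias(element):
    """True if the element has a direct Alias child carrying an NCI code."""
    return any(
        alias.get("Context") == "nci:ExtCodeID" and bool(alias.get("Name"))
        for alias in element.findall("odm:Alias", NS)
    )


def missing_nci_codes(xml_text):
    """List codelists and terms that carry no NCI code."""
    root = ET.fromstring(xml_text)
    findings = []
    for codelist in root.iter(f"{{{ODM_NS}}}CodeList"):
        name = codelist.get("Name")
        if not has_nci_alias(codelist):
            findings.append(("codelist", name, None))
        for tag in ("EnumeratedItem", "CodeListItem"):
            for item in codelist.findall(f"odm:{tag}", NS):
                if not has_nci_alias(item):
                    findings.append(("term", name, item.get("CodedValue")))
    return findings


findings = missing_nci_codes(SAMPLE)
```

For the fragment above, the sketch reports the AEACN codelist itself and the term DRUG WITHDRAWN. It flags every codelist without a code, so the output has to be filtered to the codelists for which an NCI code is actually expected.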

3.2 EXTERNAL DICTIONARIES/CODELIST

Codelists such as coding dictionaries provided by third-party vendors are referred to as “External Codelist” in the Define-XML document. They require different information in the document than other CDISC or sponsor-defined controlled terminologies. Third-party dictionaries such as MedDRA/WHODD have regulated terms (i.e., the same coding result applies to the same AE under the same dictionary version, regardless of sponsor or study). In a Define-XML document, such a dictionary needs to be listed under External Dictionaries with the dictionary name and version specified (Display 5.1).


Display 5.1 MedDRA Display in the External Dictionaries

With the current version of the custom Pinnacle 21 Enterprise tool, if the user enters ‘MedDRA’ in the CT column of the ADaM spec, the tool creates a new ‘MedDRA’ codelist instead of linking to the MedDRA dictionary already present in the tool (Display 5.2 a). The user needs to manually select ‘MedDRA’ from the codelist dropdown in the Pinnacle 21 tool (Display 5.3) for variables that should have the ‘MedDRA’ codelist.

Display 5.2 Property tab in the Define of Pinnacle 21 Enterprise (panels a and b)

Display 5.3 Codelist Term “MedDRA” Manually Entered in Define Pinnacle 21 Enterprise


After this manual change, the corresponding define.xml generated shows and links the codelist as “Medical Dictionary for Regulatory Activities” (Display 5.4).

Display 5.4 Controlled Terms or Format for MedDRA in define.xml
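To show how an external dictionary differs from an ordinary codelist, the sketch below builds the corresponding element with Python's standard `xml.etree.ElementTree`. It assumes the Define-XML 2.0 convention that an external dictionary is a `CodeList` holding a single `ExternalCodeList` child with `Dictionary` and `Version` attributes and no enumerated terms. Namespaces are omitted for brevity, and the OID and the version "22.1" are hypothetical values.

```python
import xml.etree.ElementTree as ET


def external_codelist(oid, name, dictionary, version):
    """Build a CodeList element that points to an external dictionary."""
    codelist = ET.Element("CodeList", {"OID": oid, "Name": name, "DataType": "text"})
    # No terms are listed; only the dictionary name and version are given
    ET.SubElement(
        codelist, "ExternalCodeList", {"Dictionary": dictionary, "Version": version}
    )
    return codelist


# The OID and the version are placeholders for illustration
meddra = external_codelist(
    oid="CL.MEDDRA",
    name="Medical Dictionary for Regulatory Activities",
    dictionary="MedDRA",
    version="22.1",
)
meddra_xml = ET.tostring(meddra, encoding="unicode")
```

A codelist created by typing ‘MedDRA’ into the CT column would instead be treated as an ordinary list of terms, which is why the manual selection described above is needed.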

4. ADRG AND ARM INFORMATION

The submitted define.xml often displays links to the ADRG and ARM (Display 6.1). However, the IDs, titles and exact file names (Href) of the ADRG and ARM need to be entered manually under Documents in the Property tab of the Pinnacle 21 Enterprise tool, as shown in Display 5.2 b above.

Display 6.1 define.xml with Links to ADRG and ARM
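Because the document entries are typed by hand, a quick consistency check can catch a missing field or a misspelled file name before export. The Python sketch below is illustrative only: the function name `document_findings`, the dictionary keys, and the IDs and file names in the example are all hypothetical, and the list of files in the submission folder is assumed to have been collected beforehand.

```python
REQUIRED_FIELDS = ("id", "title", "href")


def document_findings(documents, present_files):
    """Report document entries with a missing field or an Href that matches no file."""
    findings = []
    for doc in documents:
        missing = [field for field in REQUIRED_FIELDS if not doc.get(field)]
        if missing:
            findings.append((doc.get("id"), "missing " + ", ".join(missing)))
        elif doc["href"] not in present_files:
            # Href must match the exact file name
            findings.append((doc["id"], "file not found: " + doc["href"]))
    return findings


# Hypothetical entries and folder contents for illustration
documents = [
    {"id": "LF.ADRG", "title": "Analysis Data Reviewer's Guide", "href": "adrg.pdf"},
    {"id": "LF.ARM", "title": "Analysis Results Metadata", "href": "ARM.pdf"},
    {"id": "LF.OTHER", "title": "", "href": "other.pdf"},
]
present_files = {"adrg.pdf", "arm.pdf", "define.xml"}
doc_findings = document_findings(documents, present_files)
```

In this example the second entry is flagged because the comparison is case-sensitive ("ARM.pdf" versus "arm.pdf"), and the third because its title is empty.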


CONCLUSION

Because it offers proven benefits and an easy way to create a compliant, complex ADaM define.xml, the Pinnacle 21 Enterprise tool is broadly used, although not required, to generate ADaM define.xml. The extra steps we implemented in the current version supplement the direct automation: they are performed after importing an ADaM dataset specification and before exporting the define.xml. These steps are highly recommended to enhance the precision of ADaM dataset and variable descriptions in the define.xml for a regulatory submission.

ACKNOWLEDGMENTS The authors would like to thank Ms. Ellen Asam, Ms. Mary N. Varughese, and Ms. Amy Gillespie for reviewing the paper and their great suggestions.

RECOMMENDED READING

1. PhUSE. 2019. “Define-XML Version 2.0 Completion Guidelines”.

2. ADaM Implementation Guide (ADaM IG v1.1). https://www.cdisc.org/system/files/members/standard/foundational/adam/ADaMIG_v1.1.pdf

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

Majdoub Haloui
Principal Scientist, Statistical Programming
Merck & Co., Inc.
[email protected]

Hong Qi
Principal Scientist, Statistical Programming
Merck & Co., Inc.
[email protected]

Any brand and product names are trademarks of their respective companies.