TranSMART Pro 17.1 project Technical Overview
27th of October 2016, Piotr Zakrzewski – The Hyve

tranSMART 17.1 technical overview



What does 17.1 mean for future development?

Improved ease of development
● Clean up of repositories (single repo)
● One step build
● Dependencies update
● REST API improvements
● Consolidation and extension of the star schema to better fit tranSMART and new data types
● Documentation


Repository Structure

Before you can deploy it here ... you need all of these ... and these ...

[Figure: repositories needed for a 16.2 deployment; legible labels: core-api, core-db, rest-api, R modules, transmart data, legacy db]

Repository Structure

16.2:
- TranSMART 16.2 spans 10 core repositories
- Building & testing tranSMART requires a special setup (that resides in yet another repository)

17.1:
- Single repository with all core components necessary for building a working tranSMART WAR file


Versioning of Artifacts

16.2:
- Most components are versioned as SNAPSHOTs
- core-api, core-db, rest-api, transmartApp and all other core components need to match strictly in revision in order to work

17.1:
- Single repository: all changes to different components come in a single PR

Build Process

16.2:
- tranSMART 16.2 (Grails 2) uses Gant scripts for building
- git-repo used for fetching all repositories
- Custom Groovy script (dependency manager) needed for dev setup

17.1:
- Gradle build system (comes with Grails 3)
- One step build (also with database setup)
- Just git clone && ./gradlew build

Test Setup

16.2:
- Custom script matching branches during Travis run
- Different way to run tests locally and on Travis
- No reliable way to run tests for all components
- Tested on H2 in-memory database

17.1:
- ./gradlew test both locally and on Travis
- Tested against Oracle and Postgres
- BDD Spock framework for testing

Gradle

- Default option for Grails 3.X
- Very versatile build system
- Also very popular (gained momentum due to adoption by Android)
- Especially suitable for multi-project, multi-language builds like tranSMART


Java 7 to Java 8

tranSMART is still running on Java 7, which has reached its end of life: it has not been supported, even with security updates, since April 2015.

Groovy 2.4 and Grails 3

- Java 8 supports invokedynamic, which should increase the performance of many Groovy dynamic calls

- Many workarounds for bugs in old Grails and Hibernate versions are no longer necessary

- The upgrade allowed us to adopt a better build system: Gradle


REST-API versioning

● TranSMART REST API is used in production
● Several clients and third-party apps
● But development needs to continue …

- In 17.1 REST API versioning is introduced
- Versioning is done on the URL level
- GET /studies becomes GET /v1/studies
- Only minor influence on existing clients (change of base URL configuration to include version)
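
The client-side effect of URL-level versioning can be sketched as follows. This is a minimal illustration, not code from tranSMART or any of its clients: the helper `endpoint_url`, its parameters, and the host name `transmart.example.org` are assumptions made for the example. It only shows that a client which builds its URLs from a configured base needs a single configuration change to move from /studies to /v1/studies.

```python
from urllib.parse import urljoin


def endpoint_url(base_url: str, resource: str, api_version: str = "") -> str:
    """Build the URL for a REST resource, optionally under a version prefix."""
    # Normalise so that urljoin treats the base URL as a directory.
    root = base_url.rstrip("/") + "/"
    if api_version:
        root = urljoin(root, api_version.strip("/") + "/")
    return urljoin(root, resource.lstrip("/"))


# Hypothetical server address, used only for illustration.
BASE = "https://transmart.example.org"

print(endpoint_url(BASE, "studies"))        # unversioned style: /studies
print(endpoint_url(BASE, "studies", "v1"))  # versioned style: /v1/studies
```

Keeping the version in one place (here the `api_version` argument) is what makes the impact on existing clients small.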

Current REST-API documentation

Open API (previously Swagger)


Db schema as of now (16.2)

Some facts about the current schema:
- Study exists only as string ids sprinkled around the star schema (no table for study)
- Concepts and patients belong to a study (cannot be shared)
- Combination of patient-concept yields a single observation

Db schema of 17.1

Most important consequences of 17.1 changes:
- Concepts and patients can be shared between studies
- More straightforward cross-trial comparison (trial-visit dimension) and longitudinal data (start date) support
- Much redundancy and many inconsistencies removed
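
One rough way to picture why a start date matters for longitudinal data is to look at what identifies an observation. The sketch below is an analogy only: the record layout and names are hypothetical and are not the actual tranSMART tables or columns. It contrasts a key made of patient and concept alone with a key that also includes the start date.

```python
from datetime import date

# Hypothetical, simplified records: (patient, concept, start_date, value).
# These names are illustrative and are not the actual column names.
records = [
    ("patient-1", "heart rate", date(2016, 1, 10), 72),
    ("patient-1", "heart rate", date(2016, 6, 10), 80),
]

# Key of patient and concept only: in this sketch the second
# measurement replaces the first, leaving a single observation.
by_patient_concept = {}
for patient, concept, start, value in records:
    by_patient_concept[(patient, concept)] = value

# Key that also includes the start date: both measurements are kept.
by_patient_concept_date = {}
for patient, concept, start, value in records:
    by_patient_concept_date[(patient, concept, start)] = value

print(len(by_patient_concept))       # 1
print(len(by_patient_concept_date))  # 2
```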


Hypercube

- Introduction of longitudinal data requires a whole different approach
- Modifiers used to store time point. Both relative and absolute allowed
- Each observation has effectively an additional dimension (hence the Hypercube)

How to query a Hypercube?
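
The original slide answered this question with a figure that did not survive. As a stand-in, here is a minimal sketch of the general idea of querying a hypercube by fixing values along some of its dimensions. It is not the tranSMART query API: the function `select`, the dimension names, and the sample observations are all assumptions made for illustration.

```python
from typing import Any, Dict, List

Observation = Dict[str, Any]

# Hypothetical observations; the dimension names are illustrative only.
observations: List[Observation] = [
    {"study": "A", "patient": 1, "concept": "weight", "trial_visit": "baseline", "value": 80.0},
    {"study": "A", "patient": 1, "concept": "weight", "trial_visit": "week 4", "value": 78.5},
    {"study": "B", "patient": 1, "concept": "weight", "trial_visit": "baseline", "value": 81.0},
    {"study": "B", "patient": 2, "concept": "weight", "trial_visit": "baseline", "value": 65.0},
]


def select(cube: List[Observation], **constraints: Any) -> List[Observation]:
    """Return the observations whose dimensions match every constraint."""
    return [
        obs for obs in cube
        if all(obs.get(dim) == wanted for dim, wanted in constraints.items())
    ]


# Cross-trial slice: baseline weight over all studies.
baseline = select(observations, concept="weight", trial_visit="baseline")
print([obs["value"] for obs in baseline])

# Longitudinal slice: one patient's weight in study A over visits.
series = select(observations, study="A", patient=1, concept="weight")
print([(obs["trial_visit"], obs["value"]) for obs in series])
```

Leaving a dimension unconstrained returns every value along it, which is how the same structure serves both cross-trial and longitudinal questions in this sketch.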


Impact on backwards compatibility

- Old UI will work only with old data; new data (especially longitudinal) will not be supported
- Old UI will not make use of new cross-trial functionality
- Migration path will be provided between 16.2 and 17.1

The new UI, however, will support longitudinal data and other features


Documentation

- One of the project deliverables is documentation on the database schema
- REST API documented with OpenAPI
- Documentation as part of the git repository

Conclusion

Aside from many new features, 17.1 is also a major clean-up that will make future development easier

Backup slides


Arvados Keep


Performance Benchmarks

- Goal: safeguarding performance of the REST API
- Implemented as a Gradle task (single command)
- Should help developers spot falls in performance after new changes
- Reference setup on Amazon will be available to make benchmarks comparable
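
The idea of spotting a fall in performance can be sketched as timing a call several times and comparing the result with a stored baseline. The slides describe the real benchmarks as a Gradle task, so the Python below is not that implementation: the functions `measure` and `is_regression`, the 20% tolerance, and the stand-in workload are all assumptions for illustration.

```python
import time
from statistics import median
from typing import Callable


def measure(call: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time of `call` in seconds."""
    timings = []
    for _ in range(repeats):
        started = time.perf_counter()
        call()
        timings.append(time.perf_counter() - started)
    return median(timings)


def is_regression(current: float, baseline: float, tolerance: float = 0.2) -> bool:
    """Flag a run that is more than `tolerance` (20% by default) slower than the baseline."""
    return current > baseline * (1.0 + tolerance)


# Stand-in for a REST call; a real benchmark would issue an HTTP request.
elapsed = measure(lambda: sum(range(100_000)))
print(f"median: {elapsed:.6f}s")
print(is_regression(current=1.3, baseline=1.0))  # True
print(is_regression(current=1.1, baseline=1.0))  # False
```

Timings only compare meaningfully when taken on the same hardware, which is the role the reference setup plays.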


Other changes

- Multiple observations per concept-patient support
- Categorical variables no longer loaded per value (e.g. variable Treated being two variables: yes and no)
- Several new tables to accommodate new HDD data type (RNAseq measurement per transcript) and table to store generic links to external resources (files)
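
The change to categorical variables can be pictured with the Treated example from the list above. The sketch below is an assumption-laden illustration, not the actual loading code or schema: the patient identifiers, the "Treated/yes" naming, and the helper `collapse` are hypothetical. It contrasts one variable per value with a single variable whose observations carry the value.

```python
# Hypothetical patient identifiers and variable names, for illustration only.

# Per-value style: one variable per value; membership means "has this value".
per_value = {
    "Treated/yes": {"patient-1", "patient-3"},
    "Treated/no": {"patient-2"},
}

# Single-variable style: one variable whose observations carry the value.
single_variable = {
    "Treated": {"patient-1": "yes", "patient-2": "no", "patient-3": "yes"},
}


def collapse(per_value_vars: dict) -> dict:
    """Convert the per-value form into the single-variable form."""
    collapsed: dict = {}
    for name, patients in per_value_vars.items():
        variable, value = name.rsplit("/", 1)
        for patient in patients:
            collapsed.setdefault(variable, {})[patient] = value
    return collapsed


print(collapse(per_value) == single_variable)  # True
```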