National Research Data Archive MIDAS: development decisions and usage peculiarities

Saulius Maskeliūnas
Vilnius University Institute of Mathematics and Informatics
Akademijos str. 4, Vilnius LT-08663, Lithuania

Presentation at "Theory Days – 2014" in Ratnieki, Latvia, 2-5.10.2014 http://home.lu.lv/~df/tdays-ratnieki/


Content

1. Introductory facts about the National Research Data Archive (MIDAS) project

2. Implementation aims and principles of MIDAS

3. Planned MIDAS outcomes and peculiarities

4. MIDAS data mining tool (DAMIS)

5. Conclusions

6. Demonstration of MIDAS

7. Demonstration of DAMIS


1. Introductory facts about MIDAS project (1)

• Project Title: National Open Access Research Data Archive (LT: Nacionalinis atviros prieigos Mokslo Informacijos Duomenų Archyvas, MIDAS)

• Lead institution: Vilnius University www.vu.lt

• Project partner: Vilnius University Hospital Santariškių Klinikos (Santariškės Clinics) santa.lt

• Project participants: 13 institutions of science and studies, and medical institutions


1. Introductory facts about MIDAS project (2)

• Funded by: EU Structural Funds and national budget

• Project budget: ~ € 4.34M (i.e., almost 15M LTL)

• Duration: 40 months (start date: January 1, 2012; end date: April 30, 2015, revised from June 30, 2014)

• Current status:
  – technical infrastructure: not installed yet;
  – software development: beginning of the 2nd iteration.


2. Implementation aims and principles of MIDAS

MIDAS implementation purpose

• To establish an infrastructure that enables the collection, organization and storage of empirical and research data (with corresponding metadata), ensuring free, convenient, interactive search, access and analysis of the data.

Prospective MIDAS users

• Researchers, lecturers, professors, students;
• Science and studies institutions [and/or their representatives];
• Institutions which present research data (e.g., hospitals);
• Research and development (R&D) enterprises;
• Public administration institutions which use R&D statistical data;
• Other interested natural and legal persons.


Development principles

• privacy and security (i.e., information confidentiality, integrity and non-repudiation)

• usability

• accessibility (functioning 24 hours per day, 7 days per week)

• extensibility (i.e., the software architecture scales when additional hardware is added)


MIDAS compatibility

• The MIDAS archive will be based on open-source software, the XML format and other open metadata, bibliographic and information retrieval standards (CERIF, CERIF for Datasets, CIF, DICOM, Dublin Core, MARC21, ISO/IEC 11179-1:2004, OAI-PMH, etc.).

• This will ensure compatibility with other information systems, data archives and registries in Lithuania and internationally (e.g., Data Citation Index of Thomson Reuters http://thomsonreuters.com/data-citation-index/ ).
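To illustrate two of the standards named on this slide, here is a minimal Python sketch that builds a Dublin Core record in the OAI-PMH `oai_dc` container and composes an OAI-PMH `ListRecords` request URL. It is not taken from the MIDAS code base. The base URL `https://archive.example.org/oai`, the helper names and the field values are hypothetical placeholders; only the two namespace URIs and the `verb`, `metadataPrefix` and `set` request parameters come from the Dublin Core and OAI-PMH specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace URIs defined by the Dublin Core and OAI-PMH specifications.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def dublin_core_record(fields):
    """Serialize (element, value) pairs as an oai_dc Dublin Core record."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Compose an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting
    return f"{base_url}?{urlencode(params)}"


# Hypothetical field values, for illustration only.
record_xml = dublin_core_record([
    ("title", "Example dataset"),
    ("creator", "Example Researcher"),
    ("type", "Dataset"),
    ("date", "2014-10-02"),
])

# Hypothetical endpoint; a real archive publishes its own base URL.
request_url = list_records_url("https://archive.example.org/oai")

if __name__ == "__main__":
    print(record_xml)
    print(request_url)
```

A harvester such as a data citation index would issue requests of this shape and parse the returned records, which is one way a shared metadata standard can provide the compatibility described above.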


Integration with other data archives and registers

• Lithuanian Academic E-Library eLABa www.elaba.lt
• Lithuanian Data Archive for Social Sciences and Humanities LiDA www.lidata.eu/en
• Lithuanian Networked Digital Library of Theses and Dissertations Lit-ETD etd.elaba.lt
• National Medical Picture Archiving and Information Exchange System MedVAIS http://www.epractice.eu/en/news/5364871
• etc.


3. Planned MIDAS outcomes and peculiarities

MIDAS outcomes (1)

• An infrastructure that enables the collection, organization and storage of empirical and research data (with corresponding metadata), ensuring free, convenient, interactive search, access and analysis of the data;

MIDAS outcomes (2)

• A unified national research data archive with analytical software tools;

• An infrastructure for collecting and transferring biomedical research data, consisting of DICOM (for collecting data from medical equipment), ECG (for collecting electrocardiogram data from medical devices), content management, data depersonalisation, and data archiving modules;

• A public interactive e-service “Search, Delivery and Analysis of Research Data”.


MIDAS implementation advantages

• Guaranteed safety and effective sharing of research data

• Increased quality of research outputs

• Preventing duplication of effort in research data collection

• Increased variety of research outputs


4. Data mining tool DAMIS (slides by Olga Kurasova <......................................>)

[Diagram labels: Graphical user interface (GUI); Data mining algorithm; web services]


Functionalities of DAMIS

• DAMIS is a tool for analysis of the MIDAS data;
• The following data mining methods are implemented:
  – preprocessing (cleaning, filtering, splitting, transposing, normalization, feature selection);
  – statistical primitives (min, max, mean, standard deviation, median);
  – dimensionality reduction (multidimensional data visualization);
  – classification and clustering.


Functionalities of DAMIS

• DAMIS is a web-based system: http://dev.damis.lt (user name/password: demo/demo, 1234/1234);

• The web interface does not require any software installation; a web browser is sufficient;

• High-performance computing resources can be selected (VU MII cluster – VU MIF supercomputer);

• Usage is based on creating scientific workflows;

• The results obtained can be saved in MIDAS and on the user's computer.


A sample of multidimensional data (breast cancer data)

Each row lists nine numeric feature values followed by the class label C (b or m).

  5   1   1   1   2   1   3   1   1   b
  5   4   4   5   7  10   3   2   1   b
  3   1   1   1   2   2   3   1   1   b
  6   8   8   1   3   4   3   7   1   b
  4   1   1   3   2   1   3   1   1   b
  1   1   1   1   2  10   3   1   1   b
  2   1   2   1   2   1   3   1   1   b
  2   1   1   1   2   1   1   1   5   b
  4   2   1   1   2   1   2   1   1   b
...
  8  10  10   8   7  10   9   7   1   m
  5   3   3   3   2   3   4   4   1   m
  8   7   5  10   7   9   5   5   4   m
  7   4   6   4   6   1   4   3   1   m
 10   7   7   6   4  10   4   1   2   m
  7   3   2  10   5  10   5   4   4   m
 10   5   5   3   6   7   7  10   1   m
...
  4   8   8   5   4   5  10   4   1   m
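The statistical primitives and the norming step listed among the DAMIS functionalities can be tried on the rows of this sample. The following Python sketch uses only the standard library and is not DAMIS code. The helper names are hypothetical, and two choices are assumptions, since the slides do not say how DAMIS defines them: the standard deviation is the sample (n − 1) form, and norming is done by min-max scaling to [0, 1].

```python
import statistics

# The 17 rows visible on the slide: nine features, then the class label.
ROWS = [
    (5, 1, 1, 1, 2, 1, 3, 1, 1, "b"),
    (5, 4, 4, 5, 7, 10, 3, 2, 1, "b"),
    (3, 1, 1, 1, 2, 2, 3, 1, 1, "b"),
    (6, 8, 8, 1, 3, 4, 3, 7, 1, "b"),
    (4, 1, 1, 3, 2, 1, 3, 1, 1, "b"),
    (1, 1, 1, 1, 2, 10, 3, 1, 1, "b"),
    (2, 1, 2, 1, 2, 1, 3, 1, 1, "b"),
    (2, 1, 1, 1, 2, 1, 1, 1, 5, "b"),
    (4, 2, 1, 1, 2, 1, 2, 1, 1, "b"),
    (8, 10, 10, 8, 7, 10, 9, 7, 1, "m"),
    (5, 3, 3, 3, 2, 3, 4, 4, 1, "m"),
    (8, 7, 5, 10, 7, 9, 5, 5, 4, "m"),
    (7, 4, 6, 4, 6, 1, 4, 3, 1, "m"),
    (10, 7, 7, 6, 4, 10, 4, 1, 2, "m"),
    (7, 3, 2, 10, 5, 10, 5, 4, 4, "m"),
    (10, 5, 5, 3, 6, 7, 7, 10, 1, "m"),
    (4, 8, 8, 5, 4, 5, 10, 4, 1, "m"),
]


def feature_columns(rows):
    """Transpose rows into per-feature columns, dropping the class label."""
    return [list(col) for col in zip(*(row[:-1] for row in rows))]


def primitives(values):
    """Return the five statistical primitives for one feature column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # assumption: sample (n - 1) form
        "median": statistics.median(values),
    }


def min_max_norm(values):
    """Scale values linearly to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


columns = feature_columns(ROWS)
stats = [primitives(col) for col in columns]
normed = [min_max_norm(col) for col in columns]

if __name__ == "__main__":
    for index, summary in enumerate(stats, start=1):
        print(f"feature {index}: {summary}")
```

Because the slide shows only part of the data set, these figures describe the visible rows and not the full data.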

 

DAMIS walkthrough slides (titles only):

• DAMIS GUI
• Data upload
• Data preprocessing
• Experiments
• Statistical primitives
• Dimensionality reduction
• Data classification and clustering
• Matrix view of Iris after dimensionality reduction by PCA
• Iris graphical representation
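Dimensionality reduction by PCA, which the slide titles name for the Iris data, can be sketched in pure Python. This is an illustration under stated assumptions, not the DAMIS implementation. The six rows are well-known entries of the classic Iris data set (two per species), the function names are hypothetical, and power iteration with deflation is a simple stand-in chosen for readability; the slides do not say which PCA algorithm DAMIS uses.

```python
import math

# Six rows of the classic Iris data (sepal length, sepal width,
# petal length, petal width): two setosa, two versicolor, two virginica.
SAMPLES = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(rows):
    """Subtract the column means so every feature has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[dot(ci, cj) / (n - 1) for cj in cols] for ci in cols]


def leading_eigenpair(matrix, iterations=500):
    """Approximate the dominant eigenpair by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return eigenvalue, v


def pca(rows, components=2):
    """Project rows onto the first principal axes; return (scores, axes)."""
    centered = center(rows)
    cov = covariance(centered)
    size = len(cov)
    axes = []
    for _ in range(components):
        eigenvalue, axis = leading_eigenpair(cov)
        axes.append(axis)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - eigenvalue * axis[i] * axis[j]
                for j in range(size)] for i in range(size)]
    scores = [[dot(row, axis) for axis in axes] for row in centered]
    return scores, axes


scores, axes = pca(SAMPLES)

if __name__ == "__main__":
    for row in scores:
        print([round(value, 3) for value in row])
```

The two-column `scores` list corresponds to the kind of matrix view named in the slide title, and plotting its columns against each other gives a two-dimensional picture of the four-dimensional data.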


5. Conclusions (1)

• MIDAS will provide virtual services for researchers and other participants in research and education that can lead to more efficient, effective and higher quality research;

• Users will be able to:
  – register, find and cite research data,
  – search for and use other infrastructures and tools (which provide data archiving services),
  – share data and tools with, or integrate them into, other science and studies infrastructures;


5. Conclusions (2)

• National Research Data Archive MIDAS will increase opportunities for research cooperation through simpler, more convenient, unified and advanced means of research data collection, analysis, application and sharing.


6. Demonstration of MIDAS

http://midas.insoft.lt:8888/web/

User name / password: 101/101


7. Demonstration of DAMIS

http://dev.damis.lt

User name / password: demo/demo


Thank you for your attention!

Questions?