  • RESEARCH PROPOSAL

    CHALLENGES AND OPPORTUNITIES OF ANALYSING COMPLEX DATA USING DEEP LEARNING

    NICLAS STÅHL Informatics

  • ABSTRACT

    The era of big data and data analysis is here. Unlike data analysis just a few decades ago, analysis today does not only comprise data stored in well-organized tables. Instead, the data are much more diverse and may, for example, consist of images or text. Data of this type often go under the term complex data. However, there is no rigorous definition of complex data, and the term is often used merely to highlight that an analysis of the data is non-trivial. Therefore, this research aims to find such a definition and to specify a set of properties that data must have in order to be considered complex.

    A sub-field of machine learning that has shown promising results in analysing complex data in recent years is deep learning. Therefore, the properties of complex data will be analysed from a deep learning perspective. Even though deep learning has been successful in many fields, there are still several open problems that need to be solved. There are also fields in which deep learning has not yet made any major breakthroughs. This research aims to find case studies in such fields, to explain why deep learning has not yet succeeded there, and to connect this to the properties of the data from those fields. With knowledge about the limitations these properties pose to current deep learning methods, the aim is to refine these methods and develop new ones.

    keywords: Complex data, Deep learning


  • CONTENTS

    1 Background

    1.1 Machine learning and computer aided analysis

    1.1.1 Supervised learning

    1.1.2 Unsupervised learning

    1.1.3 Feature learning

    1.1.4 Multimodal learning

    1.2 Different types of data

    1.3 Feature engineering – Creating structured data from unstructured data

    1.4 Deep learning

    1.4.1 Artificial neural networks

    1.4.2 Feed forward neural networks

    1.4.3 Convolutional neural networks

    1.4.4 Recurrent neural networks

    1.4.5 Generative adversarial networks

    1.5 Challenges and open problems in deep learning

    2 Problem specification

    2.1 Related work

    2.2 Delimitations

    2.3 Time plan

    3 Method

    4 Preliminary results

    4.1 Case 1 – Molecule property prediction

    4.2 Case 2 – Steel rolling

    Bibliography


  • INTRODUCTION

    The era of big data and data analysis is here. Modern society now generates data at such a speed that it takes less than two days to generate more data than humans produced in total before 2003 [1]. This increase in data generation has not stagnated but has kept growing exponentially, and the world has started to go through a big data revolution [1–6]. Before the big data revolution, a lot of effort was put into designing data collection schemes and surveys [5,6]. The analysis process was often defined before the data were collected and served a given purpose. Today, however, this has changed. Data are now abundant in many fields and are no longer hard and expensive to gather. Hence, data are often no longer collected for a specific study but are instead collected as a by-product of a given process, so-called secondary data collection [7]. This creates several opportunities and challenges for the field of data analysis [1,2,4,8]. One of the main challenges with this type of data, identified by both Fan et al. [8] and Chen & Zhang [9], is that the data are often very heterogeneous and do not follow a predefined structure. This is something most conventional methods for data analysis cannot handle [10]. The data can also be stored in different formats, have different quality and granularity, and come from many different sources [2,8,9], all of which make the data more difficult to analyse. These factors, as well as the studied process itself, may also change over time, making the analysis even more difficult.

    A research field that has shown promising results in solving problems arising from this type of unstructured data from multiple sources is deep learning [11]. Deep learning is a new sub-field of machine learning that has revolutionized several fields, such as image processing [12,13], speech recognition [14] and natural language processing [15,16]. Due to its success in these areas, deep learning has gained a lot of attention from the scientific community and popular media. This has led researchers from many different fields, e.g. chemistry [17,18], finance [19] and biology [6], to use deep learning to solve problems in their respective fields. Even though deep learning algorithms are sometimes presented as something that will work straight out of the box, this is seldom the case in reality. How to initialize and train a deep learning model is often a non-trivial problem that requires expert knowledge [20]. Because of this, there are still many open problems that could potentially be solved with deep learning. There are also several open problems within the field of deep learning itself. One of the main problems is that there is still no complete understanding of why deep learning works as well as it does [21]. The understanding of deep learning is, so far, mostly based on empirical studies and heuristics. How to select, configure and train deep learning models is still seen by many as a “black art”.


  • CHAPTER 1

    BACKGROUND

    The first part of this section gives a short and simplified definition of machine learning (ML). Machine learning is a sub-field of artificial intelligence (AI) that is focused on how machines may learn and draw conclusions from data. This research will focus on how such algorithms are affected by properties of the data. Therefore, the second part of this section is dedicated to an overview of different types of data and what characterizes each type. In this research there is a special focus on one of these types, namely complex data. This type of data is often hard to analyse with conventional ML methods. To solve this problem, the data are often first manually crafted into a new dataset with new features. This is called feature engineering, and the process is described in the third part of this section. In the last few years, a new sub-field of machine learning that can handle complex data without feature engineering has emerged. This sub-field is called deep learning (DL), and an introduction to this field and the models it contains is given in the fourth part of this section. In the final part, open problems and challenges with deep learning are described.

    1.1 MACHINE LEARNING AND COMPUTER AIDED ANALYSIS

    A very general definition of machine learning is that it is a sub-field of computer science which aims to make computers learn [22,23]. Even though this is a somewhat simplistic definition, it still becomes problematic, since the meaning of learning must be defined. Since the start of the study of artificial intelligence, researchers have tried to create machines that are able to learn in the same manner as humans. Human learning is a very complex process which is far from fully understood, and so far no general computerized learning algorithm is able to mimic it. However, several methods have been developed that allow computers to make statistical inferences from presented examples, which in some sense could be called learning. Thus, by being presented with a large number of training samples, a machine would be able to extrapolate knowledge from the observed data. The machine would then be able to use the gained knowledge to draw conclusions about new examples that are similar, though not identical, to those presented during training. This has proven useful in many problem domains, both for automatically drawing con