Deep Learning: Trends and Challenges DAVIDE BACCIU DIPARTIMENTO DI INFORMATICA UNIVERSITÀ DI PISA




[Outline diagram: Trends, Applications, Challenges; topics: Structured*, BioMedical, IoT, HPC, Knowledge Transfer*]


[Diagram: routes from Input to Prediction]
• AI: hard-coded expert reasoning
• ML: expert-designed features feeding a trainable predictor
• Deep Learning: learned features organized in a learned feature hierarchy


Neural Net Machinery in 1 Slide

[Diagram: a neuron with inputs weighted by w1 … wn feeding a function f, placed between the network input and the network prediction]

Synaptic weights: free parameters of the model

Neuron activation: weighted input summation + thresholding function (often differentiable and nonlinear)

Learning: ground-truth predictions in training data can be used to adapt the synaptic weights of all neurons
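The neuron described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: tanh stands in for the thresholding function, the loss is a squared error, and the names `neuron` and `sgd_step` and all numeric values are hypothetical rather than taken from the talk.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Weighted input summation followed by a differentiable nonlinearity (tanh, assumed)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(s)


def sgd_step(inputs, target, weights, bias, lr=0.1):
    """One gradient step on 0.5 * (output - target)**2 for a single neuron."""
    out = neuron(inputs, weights, bias)
    # Chain rule: d(loss)/d(s) = (out - target) * tanh'(s), with tanh'(s) = 1 - out**2
    delta = (out - target) * (1.0 - out * out)
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias


# Illustrative values only
x, w, b, target = [1.0, -2.0], [0.5, 0.25], 0.0, 0.5
before = neuron(x, w, b)
new_w, new_b = sgd_step(x, target, w, b)
after = neuron(x, new_w, new_b)
```

After the step the output moves toward the ground-truth target, which is the adaptation of synaptic weights the slide refers to; in a full network the same rule is applied to every neuron through backpropagation.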


Structured Data

Compound information whose atomic components provide informative content when considered in their surrounding context

Sequences

Trees

Graphs


Learning with Structured Data

Vectorial dataset: learning from a population where each individual is a fixed-size vector

Structured dataset: learning from a population where each individual is a variable-size graph (vectorial information as node labels)

ML@UNIPI (since 1993)


Recursive Neural Networks

A neural model that can unfold on the structure of the sample

[Figure: a tree with node labels a, b, c, d, and the network unfolded over the same tree]

Prediction for the whole structure

Neural encoding of the nodes
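A minimal sketch of how a recursive model can unfold over a tree, assuming a bottom-up encoder in which each node's encoding depends on its label and on the sum of its children's encodings. The weight shapes, the tanh nonlinearity, the summation over children and the example tree shape are all assumptions made for illustration; the weights are random and untrained.

```python
import math
import random

random.seed(0)
LABELS = "abcd"   # node label alphabet, one-hot encoded
HIDDEN = 3        # size of each node encoding (arbitrary)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W_LABEL = rand_matrix(HIDDEN, len(LABELS))  # label -> hidden
W_CHILD = rand_matrix(HIDDEN, HIDDEN)       # summed child encodings -> hidden
W_OUT = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]


def encode(node):
    """Neural encoding of a node given as (label, [children]); the same weights are reused at every node."""
    label, children = node
    child_sum = [0.0] * HIDDEN
    for child in children:
        h_child = encode(child)  # recursion follows the structure of the sample
        child_sum = [s + h for s, h in zip(child_sum, h_child)]
    idx = LABELS.index(label)
    return [
        math.tanh(W_LABEL[i][idx] + sum(W_CHILD[i][j] * child_sum[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]


def predict(tree):
    """Prediction for the whole structure, read out from the root encoding."""
    return sum(w * h for w, h in zip(W_OUT, encode(tree)))


# Assumed shape: root a with children b and c; b has children d and c
tree = ("a", [("b", [("d", []), ("c", [])]), ("c", [])])
root_encoding = encode(tree)
score = predict(tree)
```

Because the weights are shared across nodes, the same model applies to trees of any size or shape, for example `encode(("c", []))` for a single leaf.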


From Image to Graph Convolutions

Image: learn hidden neurons responsive to visual patterns

Graph: learn hidden neurons responsive to structural patterns
• Node labels
• Connectivity
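As a rough illustration of a graph convolution, the sketch below applies one propagation step in which each node mixes its own label with those of its neighbours, using the symmetric degree normalization with self-loops described by Kipf & Welling, followed by a ReLU. The function name `gcn_layer`, the toy graph and the weight values are hypothetical.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step.

    adj:    dict node -> list of neighbours (undirected graph)
    feats:  dict node -> list of input features (node labels)
    weight: matrix with len(feats[v]) rows and one column per output unit
    """
    deg = {v: len(adj[v]) + 1 for v in adj}  # +1 accounts for the added self-loop
    out = {}
    for v in adj:
        agg = [0.0] * len(feats[v])
        for u in list(adj[v]) + [v]:  # neighbours plus the node itself
            coef = 1.0 / math.sqrt(deg[v] * deg[u])
            agg = [a + coef * x for a, x in zip(agg, feats[u])]
        out[v] = [
            max(0.0, sum(agg[i] * weight[i][j] for i in range(len(agg))))  # ReLU
            for j in range(len(weight[0]))
        ]
    return out


# Toy path graph 0 - 1 - 2 with a single feature per node
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {0: [1.0], 1: [1.0], 2: [1.0]}
hidden = gcn_layer(adj, feats, [[1.0]])
```

The output of a node depends on both its label and its connectivity: the middle node, with two neighbours, gets a different activation from the two end nodes even though all labels are identical.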


Community Detection

Community detection in social graphs

Kipf & Welling, Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017


The Rise of Deep Learning...

…and biomedical applications slowly starting to catch up

Source: query on Scopus abstracts on Sept. 2017

some 50 review papers

[Chart: yearly counts from the Scopus query, 2005 to 2017, y-axis 0 to 3500; series: Deep Learning, Deep Learning + Life Sciences]


CNN for DNA/RNA Sequences (DeepBind)

Input sequence: T A G A C A T C T

927 CNN models predicting a binding score for transcription factors and RNA-binding proteins

1D convolutions on the input sequence are trained to respond to task-specific motifs

Alipanahi, Babak, et al. "Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning." Nature Biotechnology 33.8 (2015): 831-838.

http://tools.genes.toronto.edu/deepbind/
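To make the idea of a 1D convolution responding to a motif concrete, here is a small sketch loosely in the spirit of a convolve, rectify, max-pool pipeline. It is not DeepBind itself: the kernel is hand-set to the motif GAC instead of being trained, and the helper names are hypothetical. The input is the sequence shown on the slide.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode each base as a 4-dimensional indicator vector."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]


def conv1d(encoded, kernel):
    """Slide the kernel along the sequence and return one activation per position."""
    k = len(kernel)
    return [
        sum(kernel[i][c] * encoded[pos + i][c] for i in range(k) for c in range(len(BASES)))
        for pos in range(len(encoded) - k + 1)
    ]


def motif_score(seq, kernel):
    """Rectify the activations and max-pool over positions to get one score per sequence."""
    activations = [max(0.0, a) for a in conv1d(one_hot(seq), kernel)]
    return max(activations)


kernel = one_hot("GAC")  # hand-set filter; a trained model would learn these weights
scores = conv1d(one_hot("TAGACATCT"), kernel)
best = motif_score("TAGACATCT", kernel)
```

With this hand-set kernel each activation simply counts matching bases in the window, so the filter peaks where GAC occurs in the sequence; a trained filter would instead weight bases according to the binding data.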


CNN for DNA Sequences

DIGITS: deep learning visual training system designed for machine vision applications

GPU-accelerated CNN training

ML@UNIPI


cag gcc taa cac atg caa gtc gaa cgg taa nag

att gat agc ttg cta tca atg ctg acg anc ggc

gga cgg gtg agt aat gcc tgg gaa tat acc ctg

atg tgg gg gat aac tat tgg aaa cga tag cta

ata…

Triplet vocabulary

Triplet  ID
aaa      1
aac      2
…        …
taa      59
…        …
ttt      64

Use ID as graylevel of the corresponding pixel

500K DNA sequences from 18 bacteria species transformed into images

Convolutions have to be 1D even if it is an image!

ML@UNIPI
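The sequence-to-image transformation can be sketched as follows. The ordering of the vocabulary is an assumption: triplets over a, c, g, t are numbered lexicographically from 1 to 64, which matches the aaa, aac and ttt entries of the table but not the listed ID for taa, so the ordering actually used is not known. Mapping triplets with other symbols (such as n) to 0 and the image width are also illustrative choices.

```python
from itertools import product

ALPHABET = "acgt"
# Assumed lexicographic vocabulary: aaa -> 1, aac -> 2, ..., ttt -> 64
VOCAB = {"".join(t): i + 1 for i, t in enumerate(product(ALPHABET, repeat=3))}


def to_triplets(seq):
    """Split a sequence into non-overlapping triplets, dropping any incomplete tail."""
    seq = seq.replace(" ", "")
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def to_pixels(seq, unknown=0):
    """Use each triplet ID as the gray level of one pixel (unknown triplets -> 0, assumed)."""
    return [VOCAB.get(t, unknown) for t in to_triplets(seq)]


def to_image(pixels, width):
    """Wrap the pixel row into lines of a fixed width."""
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


pixels = to_pixels("cag gcc taa nag")
image = to_image(pixels, width=2)
```

The wrapping into rows is only a storage layout: neighbouring pixels in a column are not neighbours in the sequence, which is why the convolutions have to stay 1D along the triplet order.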


Testing Deep Learning Acceleration

[Chart: CNN training time on Dna100K and Dna500K, P100 vs M40; y-axis 0 to 5000; annotated values: 3h 30m and 3d 3h]

Dell PowerEdge C4130
• 4xM40 12GB
• 2 Xeon E5-2670v3
• 128GB RAM

Dell PowerEdge C4130
• 4xP100 16GB PCIE
• 2 Xeon E5-2690v4
• 256GB RAM

ML@UNIPI


Exploiting Clonal Diversity for Personalized Cancer Treatment

primary tumor

Metastasis 1

Metastasis 2

Predicting the effect of chemotherapeutic drugs from patients' clonal trees

Non-isomorphic tree transduction

ML@UNIPI

Allele frequency information


Internet of Streams

Enormous amounts of heterogeneous sequential data

+ Adding actuation calls for increased adaptivity


Cloud Intelligence

Deep learning for sequences (LSTM, GRU, …)

Do we really need:
• To transfer all our data to the cloud for analytics?
• Complex DL models for all our tasks?


Edge Intelligence

• Learning models that scale from tiny (8KB) to large (or deep)

• Reservoir computing and randomized networks

ML@UNIPI
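A minimal reservoir-computing sketch in the style of an echo state network, to show why such models can stay small: the recurrent weights are random and fixed, and only a linear readout is trained. Everything here is an assumption made for illustration: the reservoir size, the row-sum scaling (a simple sufficient condition for a contractive reservoir, used instead of tuning the spectral radius), the gradient-descent readout (ridge regression is the more common choice) and the toy sine-prediction task.

```python
import math
import random

random.seed(0)
N = 10  # number of reservoir units (arbitrary, kept tiny on purpose)

W_IN = [random.uniform(-1, 1) for _ in range(N)]
W_RES = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
for row in W_RES:
    # Scale each row so its absolute sum is 0.9, keeping the state update contractive
    total = sum(abs(w) for w in row)
    for j in range(N):
        row[j] *= 0.9 / total


def run_reservoir(inputs):
    """Drive the fixed random reservoir with a scalar input sequence; nothing is trained here."""
    state = [0.0] * N
    states = []
    for u in inputs:
        state = [
            math.tanh(W_IN[i] * u + sum(W_RES[i][j] * state[j] for j in range(N)))
            for i in range(N)
        ]
        states.append(state)
    return states


def mse(w_out, states, targets):
    return sum(
        (sum(w * x for w, x in zip(w_out, s)) - y) ** 2 for s, y in zip(states, targets)
    ) / len(targets)


def train_readout(states, targets, lr=0.05, epochs=200):
    """Fit only the linear readout by full-batch gradient descent."""
    w_out = [0.0] * N
    for _ in range(epochs):
        grad = [0.0] * N
        for s, y in zip(states, targets):
            err = sum(w * x for w, x in zip(w_out, s)) - y
            for i in range(N):
                grad[i] += err * s[i] / len(targets)
        w_out = [w - lr * g for w, g in zip(w_out, grad)]
    return w_out


# Toy task: predict the next value of a sine wave
signal = [math.sin(0.3 * t) for t in range(200)]
states = run_reservoir(signal[:-1])
targets = signal[1:]
error_before = mse([0.0] * N, states, targets)
w_out = train_readout(states, targets)
error_after = mse(w_out, states, targets)
```

The only trained parameters are the N readout weights, so the memory footprint grows with the reservoir size rather than with the depth of a trained stack.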


Distributed Intelligence as an IoT Service

ML@UNIPI

Multiple learning primitives within the same neural machinery

• Supervised, anomaly detection & feature selection

Embedded learning, management and over-the-air deployment

Tuning to normality

Identifying anomalies/novelties

Automating medical screening (from 30 min to 10 s)


Are We Really Building Adaptive Applications?

Probably yes, if we consider agents and reinforcement learning

Otherwise we use pre-programmed adaptation

Predictor created at development time


The Adaptivity Challenge

Learning Automation

Standardization & Protocols

Learning as a primitive


Different Forms of Parallelism?

Current deep learning acceleration is based on stream/data parallelism

Structures are irregular and require synchronization

Branch&bound?
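One way to picture the synchronization issue, assuming a bottom-up recursive encoder over trees: nodes at the same height are independent and could be processed together, but each level has to wait for the level below it, and the schedule changes with the shape of every sample. The sketch below only computes such a schedule; the function names and the example tree are hypothetical.

```python
from collections import defaultdict


def height(node, levels):
    """Record each node label under its height (leaves are 0) and return that height."""
    label, children = node
    h = 0 if not children else 1 + max(height(c, levels) for c in children)
    levels[h].append(label)
    return h


def schedule(tree):
    """Group nodes into levels; each level is one synchronization step."""
    levels = defaultdict(list)
    height(tree, levels)
    return [levels[h] for h in sorted(levels)]


# Root a with children b and c; b has children d and c (illustrative shape)
tree = ("a", [("b", [("d", []), ("c", [])]), ("c", [])])
steps = schedule(tree)
```

Here three leaves can be processed in parallel, then a single node, then the root, so the available parallelism shrinks toward the root and differs from tree to tree, unlike the regular batches that stream/data parallelism relies on.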

Page 28: Deep Learning: Trends and Challenges · 2005 2007 2009 2011 2013 2015 2017 Deep Learning Deep Learning + Life Sciences. ... • 128GB RAM Dell PowerEdge C4130 • 4xP100 16Gb PCIE

BioMedical

IoT Challenges

Applications

Trends

Structured*HPC

Knowledge Transfer*

Page 29: Deep Learning: Trends and Challenges · 2005 2007 2009 2011 2013 2015 2017 Deep Learning Deep Learning + Life Sciences. ... • 128GB RAM Dell PowerEdge C4130 • 4xP100 16Gb PCIE

Sharing Learned Knowledge

A scalable approach for IoT applications

Also impacting biomedical applications

Reusing trained models

Hidden neural representation as a unifying language?


Deep Learning…

• …or learning representations from data

• Effective for the machine to perform predictions

• Not necessarily helping humans understand the underlying biological process

• Structured information as a means to supply relational knowledge

Upcoming life-science and IoT applications

Success will depend on how the key challenges are addressed