Network Embedding

Introductory Talk by

Akash Anil

Research Scholar, OSINT Lab

Dept. of Computer Science & Engineering, Indian Institute of Technology Guwahati

September 11, 2019


Table of contents

1 Network Embedding

2 Word Embedding

3 Skip-Gram Based Neural Node Embedding

4 DeepWalk

5 Node2Vec

6 Limitation

7 VERSE

8 References

9 Thank You


Network Embedding

Suppose G(V, E) represents a network. Network embedding then refers to generating low-dimensional network features corresponding to nodes, edges, substructures, or the whole graph [1].

Figure: Different taxonomies of network embedding. Picture source: Cai et al., https://arxiv.org/pdf/1709.07604.pdf [1].


Network Embedding Applications

1 Automatic feature vector generation helps in solving traditional problems on graphs, e.g. node classification, relation prediction, clustering, etc.

2 Recent Uses


Why Neural Network based Network Embedding?

Traditional approaches based on matrix factorization (e.g. SVD) are not scalable to networks with a large number of nodes (see the baseline sketch below).

Recent advances in unsupervised word embedding use a single-layer neural network (e.g. Word2Vec [3]).
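For intuition, here is a minimal sketch of the classical matrix-factorization baseline the slide alludes to: embedding nodes via an SVD of the adjacency matrix. This is an illustration, not the talk's code; the graph (networkx's karate club) and the dimension d = 16 are assumptions chosen for brevity.

```python
# Classical baseline: embed nodes via SVD of the adjacency matrix.
# Dense SVD costs roughly O(|V|^3), which is why it does not scale to large graphs.
import numpy as np
import networkx as nx

G = nx.karate_club_graph()              # small illustrative graph (34 nodes)
A = nx.to_numpy_array(G)                # |V| x |V| adjacency matrix
U, S, Vt = np.linalg.svd(A)             # full SVD: the scalability bottleneck

d = 16                                  # embedding dimension (illustrative)
embeddings = U[:, :d] * np.sqrt(S[:d])  # rank-d node embeddings
print(embeddings.shape)                 # (34, 16)
```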


Word Embedding (Word2Vec)
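This slide illustrates the skip-gram variant of Word2Vec. As a hedged sketch of the idea (the toy corpus and all parameters below are illustrative, not from the talk), the gensim library can train skip-gram word vectors in a few lines:

```python
# Skip-gram Word2Vec on a toy corpus using gensim (sg=1 selects skip-gram).
from gensim.models import Word2Vec

sentences = [
    ["network", "embedding", "maps", "nodes", "to", "vectors"],
    ["word", "embedding", "maps", "words", "to", "vectors"],
]
model = Word2Vec(sentences, vector_size=16, window=2, sg=1, min_count=1, epochs=50)
print(model.wv["embedding"][:4])       # first entries of a learned 16-dim vector
print(model.wv.most_similar("nodes"))  # nearest neighbors in the embedding space
```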


Skip-Gram Based Neural Node Embedding

A generalized framework used for Node Embedding

Figure: A toy graph G(V, E) with nodes 1–9 is converted to node sequences and fed to the skip-gram model: a V-dimensional one-hot input x_j, an N-dimensional hidden layer reached through the weight matrix W (V x N), and C context outputs y_1j, ..., y_Cj, each produced through the shared output matrix W' (N x V).
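To make the architecture concrete, here is a minimal numpy sketch of one skip-gram forward pass for a node j, matching the figure's shapes (V = 9 nodes, with N = 4 and C = 2 as illustrative assumptions): one-hot input, linear hidden layer, and a softmax output shared across the C context positions.

```python
# One skip-gram forward pass: one-hot node -> hidden layer -> softmax over nodes.
import numpy as np

V, N, C = 9, 4, 2                           # 9 nodes, 4-dim embedding, 2 context slots
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, N))      # input->hidden weights (the embeddings)
W_out = rng.normal(scale=0.1, size=(N, V))  # hidden->output weights, shared across C

j = 3                                       # index of the center node
h = W[j]                                    # hidden layer = row j of W (one-hot lookup)
scores = h @ W_out                          # V unnormalized scores
probs = np.exp(scores) / np.exp(scores).sum()  # softmax over all nodes

# The same distribution serves each of the C context positions; training
# maximizes probs[context_node] for every observed neighbor of node j.
print(probs.round(3))
```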


DeepWalk [4]

DeepWalk is the first network embedding model exploiting neural networks. It is scalable to large real-world networks.


Designing DeepWalk Model

1 Generate node sequences (input corpus) using truncated random walks.

2 Iterate random walks from the same source node 80 times for convergence.

3 Supply the node sequences as input to the skip-gram model.

4 Maximize the probability of the neighborhood of each given node (see the sketch below).
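The following is a minimal sketch of this pipeline, not the authors' code: uniform truncated random walks over a networkx graph feeding gensim's skip-gram Word2Vec. The example graph, walk length of 40, and dimensions are illustrative assumptions.

```python
# DeepWalk-style pipeline: truncated random walks -> skip-gram embeddings.
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walk(G, start, length=40):
    """Uniform truncated random walk starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = list(G.neighbors(walk[-1]))
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return [str(v) for v in walk]           # gensim expects string tokens

G = nx.karate_club_graph()
walks = [random_walk(G, v) for _ in range(80) for v in G.nodes()]  # 80 walks per node

# Skip-gram (sg=1) maximizes the probability of each node's walk neighborhood.
model = Word2Vec(walks, vector_size=64, window=5, sg=1, min_count=1, epochs=5)
vec = model.wv["0"]                          # learned embedding of node 0
```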


DeepWalk Results

Multi-label Classification for Blog-Catalog Data
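The reported numbers come from the standard evaluation protocol: train a one-vs-rest logistic regression on the learned embeddings and score micro/macro-F1. Below is a hedged sketch of that protocol with placeholder data; BlogCatalog has 39 labels, but the random matrices here are stand-ins, not the dataset.

```python
# Multi-label evaluation protocol: one-vs-rest logistic regression on embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))            # placeholder node embeddings
Y = rng.integers(0, 2, size=(1000, 39))    # placeholder multi-label matrix (39 labels)

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.5, random_state=0)
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_tr, Y_tr)
print("micro-F1:", f1_score(Y_te, clf.predict(X_te), average="micro"))
```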


Multi-label Classification for Flickr Data


Limitations of DeepWalk

Relies on a rigid notion of network neighborhood, i.e. local characteristics.

Fails to capture proximities of different semantics.


Node2Vec [2]

Uses a 2nd-order random walk to generate the corpus.

Presents a semi-supervised model which balances the trade-off between capturing local and global network characteristics.

Scalable, and applicable to any type of graph, e.g. (un)directed, (un)weighted, etc.


Designing Node2Vec (I)

Suppose a random walker has just traversed edge (t, v) ∈ E and is now resting at node v. To decide which node x to visit next from v, Node2Vec biases the edge weight w_vx with the transition probability

π_vx = α_pq(t, v) · w_vx, where

α_pq(t, v) = 1/p if d_tx = 0
             1   if d_tx = 1
             1/q if d_tx = 2

and d_tx is the shortest-path distance from t to x (so d_tx ∈ {0, 1, 2}).
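A minimal sketch of this biased step follows; it is an illustration, not the authors' implementation, and assumes an unweighted networkx graph (so w_vx = 1) with illustrative values of p and q.

```python
# One step of a Node2Vec 2nd-order biased walk on an unweighted graph.
import random
import networkx as nx

def biased_step(G, t, v, p=1.0, q=2.0):
    """Choose the next node x after traversing edge (t, v)."""
    neighbors = list(G.neighbors(v))
    weights = []
    for x in neighbors:
        if x == t:                  # d_tx = 0: return to the previous node
            weights.append(1.0 / p)
        elif G.has_edge(t, x):      # d_tx = 1: stay close to the previous node
            weights.append(1.0)
        else:                       # d_tx = 2: move outward from t
            weights.append(1.0 / q)
    return random.choices(neighbors, weights=weights, k=1)[0]

G = nx.karate_club_graph()
print(biased_step(G, t=0, v=1))     # next node after walking 0 -> 1
```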


Designing Node2Vec (II)

p is treated as the return parameter.

q is treated as the in-out parameter.

A high q gives BFS-like behaviour; a low q gives DFS-like behaviour.

BFS is helpful in capturing local proximities between nodes.

DFS is helpful in capturing global proximities between nodes.

Node sequences are generated using truncated random walks of length 80.

From each node, the random walk is iterated 10 times.

Node2Vec sets the hyper-parameters p and q using a 10% sample of the dataset.

The node sequences are supplied as input to the skip-gram model, maximizing the neighborhood probability.


Node2Vec Results

Efficiency of Node2Vec on Multi-label Classification


Limitations of DeepWalk and Node2Vec

Both fail to capture the different types of similarity naturally observed in real-world networks.


Versatile Graph Embeddings (VERSE) [5]

Proposes a model capable of capturing different types of similarity distributions.

Uses state-of-the-art similarity measures to instantiate the model.


Designing VERSE

Select a similarity measure, such as Personalized PageRank, SimRank, etc., and generate the similarity distribution matrix SimG.

Initialize the embedding space SimE with random weights.

Minimize the KL-divergence between the distributions SimG and SimE (sketched below):

∑_{v ∈ V} KL( SimG(v, ·) || SimE(v, ·) )
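As a rough sketch of this objective (with illustrative assumptions: Personalized PageRank as the similarity measure, a full softmax over embedding dot products as SimE, and a plain evaluation of the loss rather than the paper's noise-contrastive training):

```python
# VERSE-style objective: KL between PPR similarities and softmax of embedding dots.
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
nodes = list(G.nodes())
V, d = len(nodes), 16

# SimG: each row is the Personalized PageRank distribution of one node.
# (networkx 2.x+ allows a partial personalization dict; missing nodes get 0.)
sim_G = np.zeros((V, V))
for i, v in enumerate(nodes):
    ppr = nx.pagerank(G, alpha=0.85, personalization={v: 1.0})
    sim_G[i] = [ppr[u] for u in nodes]

# SimE: softmax over dot products of (randomly initialized) embeddings.
Z = np.random.default_rng(0).normal(scale=0.1, size=(V, d))
logits = Z @ Z.T
sim_E = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# The loss: sum_v KL(SimG(v,.) || SimE(v,.)); training would minimize this over Z.
kl = np.sum(sim_G * (np.log(sim_G + 1e-12) - np.log(sim_E)))
print("KL loss:", kl)
```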


VERSE Results

Efficiency of VERSE on Multi-class Classification for Co-cit Data

References

[1] Hongyun Cai, Vincent W. Zheng, and Kevin Chang. A comprehensive survey of graph embedding: Problems, techniques and applications. TKDE, 2018.

[2] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), pages 855–864, New York, NY, USA, 2016. ACM.


[3] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119, 2013.

[4] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 701–710. ACM, 2014.


[5] Anton Tsitsulin, Davide Mottin, Panagiotis Karras, and Emmanuel Müller. VERSE: Versatile graph embeddings from similarity measures. In Proceedings of the 2018 World Wide Web Conference (WWW '18), pages 539–548, Republic and Canton of Geneva, Switzerland, 2018. International World Wide Web Conferences Steering Committee.


Thank You
