Introduction Problem statement Node representation learning Subgraph representation learning
Graph embedding learning
Norbert TSOPZE
University of Yaounde I
tsopze.norbert@gmail.com
July 5, 2019
Overview
1 Introduction
2 Problem statement
3 Node representation learning
4 Subgraph representation learning
Overview
1 Introduction: Definitions
2 Problem statement
3 Node representation learning
4 Subgraph representation learning
Definitions
Graph
Couple G = (V, E), with V ≠ ∅ a set of vertices and E ⊆ V × V a set of edges.
G is undirected if ∀(x, y) ∈ E, (y, x) ∈ E, and directed otherwise.
An attributed graph G′ = (V, E, X) is a graph with X a matrix of values describing each node.
degree of v ∈ V
number of u ∈ V connected to v
1 Out-degree
2 In-degree
Definitions
Path
Let u, v be two nodes of V. A path from u to v is a sequence of edges e1, e2, ..., ek ∈ E such that ei = (v_{i−1}, v_i), with v0 = u and vk = v.
Connected component
subset V1 ⊂ V such that ∀u, v ∈ V1 there exists a path from u to v
Representation
Approaches:
1 Incidence list, i.e. Succ(u) = {v : (u, v) ∈ E}
2 Adjacency matrix: M_{i,j} = 1 if (i, j) ∈ E, 0 if (i, j) ∉ E
3 Incidence Matrix
Types of graph:
Attributed graph : G = (V ,E ,X )
Attributed and labelled graph : G = (V ,E ,X ,Y )
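As a concrete illustration, the two representations above can be sketched in plain Python; the triangle graph, node indices, and function names here are illustrative, not from the slides.

```python
# Sketch of the adjacency-matrix and incidence-list representations for a
# small undirected graph; the example graph is illustrative.
def adjacency_matrix(n, edges, directed=False):
    """M[i][j] = 1 if (i, j) is an edge, 0 otherwise."""
    M = [[0] * n for _ in range(n)]
    for (i, j) in edges:
        M[i][j] = 1
        if not directed:
            M[j][i] = 1
    return M

def successors(n, edges):
    """Incidence-list view: Succ(u) = {v : (u, v) in E}."""
    succ = {u: set() for u in range(n)}
    for (u, v) in edges:
        succ[u].add(v)
    return succ

edges = [(0, 1), (1, 2), (2, 0)]          # a triangle
M = adjacency_matrix(3, edges)
deg = [sum(row) for row in M]             # degree of each node
```

Summing a row of the adjacency matrix recovers the degree definition from the previous slide.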
Graph importance
Capture interactions between the actors represented by its nodes
Important role in ML for making predictions or discovering new patterns: predict a new role for an actor, recommend a new friend...
Applications:
1 Recommender systems
2 Social networks
3 Bioinformatics
Overview
1 Introduction
2 Problem statement: Problem, Traditional approaches
3 Node representation learning
4 Subgraph representation learning
Problem
Question
How to incorporate information about graph structure into ML models?
Idea
finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models
Applications:
Link prediction
Node classification
Graph clustering
Goal
Figure: Example of graph embedding
Ref: [Perozzi et al (2017)]
Traditional approaches
Main idea
in most cases, user-defined heuristics are used to extract features:
degree statistics
kernel function
Random walk
measures of local neighborhood structures
Limits:
inflexible: difficult to adapt during the learning process
time-consuming
dependent on user knowledge
Representation learning approach
Main idea
learn a mapping that embeds nodes, or subgraphs, as points in a low-dimensional vector space R^d
use as input the adjacency matrix A and the node attribute matrix X to map each node, or a subgraph, to a vector z ∈ R^d, where d ≪ |V|
optimize this mapping so that geometric relationships in the embedding space reflect the structure of the original graph
learned embeddings can be used as feature inputs for downstream machine learning tasks
integrated as a step in the model construction rather than as a pre-processing step
Overview
1 Introduction
2 Problem statement
3 Node representation learning
Shallow embedding: Matrix factorization, Random walk
Deep embedding: Neighborhood autoencoder, Neighborhood aggregation
4 Subgraph representation learning
Goal
Figure: Main objective
Goal
encode nodes in a new space (feature vectors) by approximating similarity in the original network
encoder
Enc : V → R^d; maps nodes to vector embeddings zi ∈ R^d
decoder
Dec : R^d × R^d → R^+; Dec(Enc(vi), Enc(vj)) = Dec(zi, zj) ≈ Sim(vi, vj); accepts a set of node embeddings and decodes user-specified graph statistics from these embeddings
Loss function
L = Σ_{(vi,vj) ∈ D} l(Dec(zi, zj), Sim(vi, vj))
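The encoder-decoder pipeline and the loss L can be sketched with a lookup encoder and an inner-product decoder; the embedding table Z and the squared-error choice of l() are illustrative, not the only options the framework allows.

```python
import numpy as np

# Sketch of the encoder-decoder view: Enc is a lookup into a |V| x d table,
# Dec is an inner product, and loss() sums l(Dec(zi, zj), Sim(vi, vj))
# over training pairs. All names here are illustrative.
def enc(Z, i):
    return Z[i]                        # Enc : V -> R^d (embedding lookup)

def dec(zi, zj):
    return float(np.dot(zi, zj))       # Dec : R^d x R^d -> R

def loss(Z, pairs, sim):
    # L = sum over (vi, vj) in D of l(Dec(zi, zj), Sim(vi, vj)),
    # with l taken to be the squared error (an illustrative choice)
    return sum((dec(enc(Z, i), enc(Z, j)) - sim[(i, j)]) ** 2
               for (i, j) in pairs)
```

In practice Z is trained by minimizing this loss with gradient descent; the shallow methods on the following slides differ mainly in their choice of Dec, Sim, and l.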
Shallow embedding
Matrix factorization
Laplacian eigenmaps
Dec(Zi, Zj) = ‖Zi − Zj‖₂²
l() = Dec(Zi, Zj) · Sim(vi, vj)
Inner Product
Dec(Zi, Zj) = Zi^T Zj
l() = ‖Dec(Zi, Zj) − Sim(vi, vj)‖₂²
Similarity functions:
1 HOPE : Jaccard
2 GRAREP: Sim(vi , vj) ∼= A2ij
3 GF : Sim(vi , vj) ∼= Aij
Deepwalk
A random walk rooted at vertex vi is denoted W_vi: a stochastic process with random variables W^1_vi, W^2_vi, ..., W^k_vi such that W^{k+1}_vi is a vertex chosen at random from the neighbors of the vertex reached at step k.
Language modeling goal
Given a sequence of words W_1^n = (w0, w1, ..., wn), maximize Pr(wn | w0, w1, ..., wn−1) over the training corpus.
By analogy, the random-walk goal is to estimate the likelihood Pr(vi | v1, v2, ..., vi−1).
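The truncated random walk described above can be sketched as follows; the adjacency structure, walk length, and function names are illustrative.

```python
import random

# Sketch of a truncated random walk: at each step the next vertex is chosen
# uniformly from the neighbors of the current one. `adj` maps each node to
# its neighbor list (an assumption of this sketch).
def random_walk(adj, start, length, rng=random):
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:          # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
walk = random_walk(adj, 0, 5)      # one walk rooted at vertex 0
```

In DeepWalk, many such walks per node form the "corpus" of vertex sequences that plays the role of sentences for the language model.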
Deepwalk
Given a mapping function Φ : V → R^{|V|×d}, the problem is to estimate Pr(vi | Φ(v1), Φ(v2), ..., Φ(vi−1))
Transformed problem
Inspired by language modeling, this yields the optimization problem: minimize over Φ the quantity −log Pr({vi−w, ..., vi+w} \ vi | Φ(vi))
approximated by SkipGram as: ∏_{j=i−w, j≠i}^{i+w} Pr(vj | Φ(vi))
Node2vec
Figure: Node2vec strategy
Node2vec - Walks strategies
Breadth-First Sampling (BFS): the neighborhood Ns is restricted to nodes which are immediate neighbors of the source u
Depth-First Sampling (DFS): the neighborhood consists of a sequence of nodes sampled at increasing distance from the source u
Generation of ci (the i-th node in the walk), with c0 = u:
P(ci = x | ci−1 = v) = π_vx / Z if (v, x) ∈ E, and 0 otherwise
π_vx: unnormalized transition probability between v and x; Z: normalizing constant
Node2vec
α_pq(t, x) = 1/p if d_tx = 0; 1 if d_tx = 1; 1/q if d_tx = 2; with d_tx ∈ {0, 1, 2}
Consider a random walk that has just traversed the edge (t, v) and now resides at node v. The transition probabilities π_vx are evaluated on the edges (v, x) leading from v:
π_vx = α_pq(t, x) · w_vx
π_vx is submitted to SkipGram
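A minimal sketch of this second-order bias, assuming unweighted edges by default; the function and variable names are illustrative.

```python
# Sketch of node2vec's biased walk: having just traversed (t, v), the
# unnormalized probability of stepping from v to x is
# pi_vx = alpha_pq(t, x) * w_vx. `adj` maps nodes to neighbor sets; edge
# weights default to 1 when no weight dict is given (an assumption).
def alpha_pq(p, q, t, x, adj):
    if x == t:                  # d_tx = 0: step back to the previous node
        return 1.0 / p
    if x in adj[t]:             # d_tx = 1: stay close to t (BFS-like)
        return 1.0
    return 1.0 / q              # d_tx = 2: move outward (DFS-like)

def transition_probs(p, q, t, v, adj, w=None):
    """Normalized transition probabilities over the neighbors x of v."""
    pi = {x: alpha_pq(p, q, t, x, adj) * (w or {}).get((v, x), 1.0)
          for x in adj[v]}
    Z = sum(pi.values())        # normalizing constant
    return {x: val / Z for x, val in pi.items()}
```

With p = q = 1 the walk reduces to DeepWalk's uniform walk; a small q pushes the walk outward (DFS-like), a small p keeps it local (BFS-like).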
Node2vec - edge features
Given two nodes u and v, the edge (u, v) is embedded as g(u, v) = f(u) ◦ f(v).
Average: (f(u) + f(v)) / 2
Hadamard: f(u) ∗ f(v)
Weighted-L1: |f(u) − f(v)|
Weighted-L2: |f(u) − f(v)|²
Material: [Nasrullah et al (2019)]
Code: http://snap.stanford.edu/node2vec.
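The four edge operators can be sketched as a single helper; f(u) and f(v) are the learned node embeddings, passed in here as plain vectors, and the `op` names are illustrative.

```python
import numpy as np

# Sketch of the binary operators that turn two node embeddings into an
# edge embedding g(u, v) = f(u) o f(v).
def edge_embedding(fu, fv, op="hadamard"):
    fu, fv = np.asarray(fu, float), np.asarray(fv, float)
    if op == "average":
        return (fu + fv) / 2          # element-wise mean
    if op == "hadamard":
        return fu * fv                # element-wise product
    if op == "l1":
        return np.abs(fu - fv)        # Weighted-L1
    if op == "l2":
        return (fu - fv) ** 2         # Weighted-L2
    raise ValueError(op)
```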
GAT2VEC
1 Network generation
2 Random walks
3 Representation learning
Figure: Networks generation
GAT2VEC
1 Graph generation:
structural graph Gs = (Vs, E), Vs ⊆ V
bipartite graph Ga = (Va, A, Ea), Va = {v : A(v) ≠ ∅}; Ea = {(v, a) : a ∈ A(v)}
2 Perform on Gs and Ga:
γs random walks of length λs to build a corpus R
γa random walks of length λa to build a corpus W
3 Use SkipGram to jointly learn an embedding based on these two contexts R and W
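The graph-generation step above can be sketched as follows; here `A` is assumed to map each node to its attribute set, and the helper name is illustrative.

```python
# Sketch of GAT2VEC's graph generation: from the edge list of an attributed
# graph, derive the structural graph Gs and the bipartite attribute graph
# Ga = (Va, Ea), where Va are the nodes with at least one attribute and Ea
# connects each such node to its attributes.
def gat2vec_graphs(edges, A):
    Gs = list(edges)                                   # structural graph
    Va = [v for v in A if A[v]]                        # nodes with A(v) != {}
    Ea = [(v, a) for v in Va for a in sorted(A[v])]    # node-attribute edges
    return Gs, (Va, Ea)
```

Random walks on Gs then capture structural context while walks on the bipartite graph capture attribute context, the two corpora fed jointly to SkipGram.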
GAT2VEC
Figure: GAT2VEC Architecture
Materials: https://github.com/snash4/GAT2VEC.
Deep embedding
Neighborhood autoencoder
Figure: Neighborhood autoencoder Architecture
Neighborhood autoencoder
Goal
compress the node’s neighborhood information into a low-dimensional vector.
1 Incorporate graph structure into the encoder algorithm
2 use autoencoders in order to compress information about a node’s local neighborhood
Neighborhood autoencoder
For each node vi, its neighborhood vector si ∈ R^{|V|} corresponds to vi’s row in the similarity matrix S (Sij = sG(vi, vj))
Objective
Embed nodes using the si vectors such that each si vector can be reconstructed:
Dec(Enc(si)) = Dec(Zi) ≈ si
Training autoencoder: minimize the loss function
‖Dec(Zi) − si‖₂²
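The reconstruction objective can be sketched with a linear encoder/decoder pair; the weight matrices here are illustrative stand-ins, not the architecture of any particular published method.

```python
import numpy as np

# Sketch of the autoencoder objective: encode the neighborhood vector s_i,
# decode it back, and score the reconstruction ||Dec(z_i) - s_i||_2^2.
def reconstruction_loss(s, W_enc, W_dec):
    z = s @ W_enc            # Enc(s_i): compress |V| dims down to d
    s_hat = z @ W_dec        # Dec(z_i): map back to |V| dims
    return float(np.sum((s_hat - s) ** 2))
```

Training drives this loss down over all nodes, so z must retain enough of s to rebuild it.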
Limits:
the input dimension of the autoencoder is fixed at |V|, which can be extremely costly and even intractable for graphs with millions of nodes
the structure and size of the autoencoder are fixed, so it cannot cope with evolving graphs or generalize across graphs
Neighborhood aggregation
Figure: Neighborhood aggregation Architecture
Neighborhood aggregation
Idea
generate embeddings for a node by aggregating information fromits local neighborhood
rely on node features or attributes
use simple graph statistics as attributes in cases where node features are not available
Neighborhood aggregation
Figure: Neighborhood aggregation training
Ref: http://snap.stanford.edu/proj/embeddings-www/
Neighborhood aggregation
1 Node embedding initialization: using the node attributes
2 For each iteration:
1 nodes aggregate the embeddings of their neighbors, using an aggregation function
2 assign a new embedding to every node, equal to its aggregated neighborhood vector combined with its previous embedding from the last iteration
3 feed the combined (average) embedding through a dense neural network layer
3 Output the final embedding vectors
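One iteration of the procedure above can be sketched with a GraphSAGE-style mean aggregator; the concatenation combine step, the weight matrix W, and the ReLU are illustrative choices.

```python
import numpy as np

# Sketch of one aggregation round: average the neighbors' embeddings
# (step 2.1), combine with the node's own embedding by concatenation
# (step 2.2), then apply a dense layer with ReLU (step 2.3).
def aggregate_round(H, adj, W):
    out = np.zeros((H.shape[0], W.shape[1]))
    for v, neighbors in adj.items():
        agg = np.mean([H[u] for u in neighbors], axis=0)   # aggregate
        combined = np.concatenate([H[v], agg])             # combine
        out[v] = np.maximum(0.0, combined @ W)             # dense + ReLU
    return out
```

Stacking K such rounds lets each node's embedding incorporate information from its K-hop neighborhood, with one weight matrix W learned per round.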
Overview
1 Introduction
2 Problem statement
3 Node representation learning
4 Subgraph representation learning
Sets of node embeddings
Graph neural networks
Figure: Subgraph embedding
Sets of node embeddings
Goal
encode a set of nodes and edges into a low-dimensional vectorembedding
use the convolutional neighborhood aggregation idea to generate embeddings for nodes, and then use additional modules to aggregate the sets of node embeddings corresponding to subgraphs
Sum-based approaches
Main idea
represent subgraphs by summing all the individual node embeddings in the subgraph:
ZS = Σ_{vi ∈ S} zi
where vi ∈ S and its embedding zi is generated using one of the previous algorithms
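A minimal sketch of the sum-based embedding; the embedding table Z and the subgraph node set are illustrative.

```python
import numpy as np

# Sketch of the sum-based subgraph representation: Z_S is the sum of the
# node embeddings z_i over the nodes vi in S.
def subgraph_embedding(Z, S):
    return np.sum([Z[i] for i in S], axis=0)

Z = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
zS = subgraph_embedding(Z, {0, 1})
```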
Graph neural networks
Figure: Subgraph embedding
Ref: [Scarselli et al (2009)]
GNN
1 xn ∈ R^s: state attached to node n
2 local transition function fw that expresses the dependence of a node on its neighborhood:
xn = fw(ln, l_co[n], x_ne[n], l_ne[n])
3 local output function gw that produces the output:
on = gw(xn, ln)
ln: label of n; l_co[n]: labels of n’s edges; x_ne[n], l_ne[n]: states and labels of the nodes in the neighborhood of n
GNN
1 f_w^{kn}: function fw defined on the group of nodes kn
2 g_w^{kn}: function gw defined on the group of nodes kn
3 For the overall graph: x = Fw(x, l) and o = Gw(x, lN)
4 Learning task: L = {(Gi, nij, tij) | Gi = (Ni, Ei) ∈ G; nij ∈ Ni; tij ∈ R^m}
5 Cost function: ew = Σ_{i=1}^{p} Σ_{j=1}^{qi} (tij − ϕ(Gi, nij))²
References
William L. Hamilton, Zhitao Ying, Jure Leskovec. Representation Learning on Graphs: Methods and Applications; in IEEE Data Eng. Bull. 40, pp. 52-74, 2017.
Bryan Perozzi, Rami Al-Rfou', Steven Skiena. DeepWalk: online learning of social representations; in Proc. of KDD'14, pp. 701-710, 2014.
Nasrullah Sheikh, Zekarias T. Kefato, Alberto Montresor. gat2vec: representation learning for attributed graphs; in Computing, Vol. 101, pp. 187-209, 2019.
Aditya Grover, Jure Leskovec. node2vec: Scalable Feature Learning for Networks; in Proc. of KDD'16, 2016.
F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, G. Monfardini. The graph neural network model; in IEEE Transactions on Neural Networks, 20(1):61-80, 2009.
Thank you