Tensor Basic Facts II

Overview
- Hierarchical Tucker
- Tensor Train
- Generalizing matrix properties
- Software
Tensor Networks
Original tensor $A_{i_1,\ldots,i_d}$: memory $n^d$. CP: $dnR$, good compression but bad approximation. Tucker: $dnr + r^d$, good approximation but bad compression.
Find better tensor networks as a compromise of both: a recursive binary tree (H-Tucker) or a recursive chain (tensor train).
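To make the trade-off concrete, here is a quick back-of-the-envelope parameter count in Python; the sizes n = 25, d = 5, r = 22 are borrowed from the example later in these slides, and the H-Tucker and TT counts use the storage bounds derived below.

```python
# Rough parameter counts for n = 25, d = 5, rank r = 22; illustrates the
# compression/approximation trade-off of the different formats.
n, d, r = 25, 5, 22

full = n**d                            # original tensor
cp = d * n * r                         # CP: d factor matrices of size n x r
tucker = d * n * r + r**d              # Tucker: factor matrices + core
htucker = d * n * r + (d - 1) * r**3   # H-Tucker: leaf frames + transfer tensors
tt = 2 * n * r + (d - 2) * n * r**2    # TT: first/last cores + interior cores

for name, cost in [("full", full), ("CP", cp), ("Tucker", tucker),
                   ("H-Tucker", htucker), ("TT", tt)]:
    print(f"{name:9s} {cost:>12,d}")
```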
Hierarchical Decompositions
CP (tensor rank):
$A = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} a_{\mu,j}, \qquad A \in \mathbb{R}^{I_1\times\cdots\times I_d}$
Tucker (multilinear rank):
$A = \sum_{j_1=1}^{r_1}\cdots\sum_{j_d=1}^{r_d} C_{j_1,\ldots,j_d}\,\bigotimes_{\mu=1}^{d} a_{\mu,j_\mu}, \qquad C \in \mathbb{R}^{r_1\times\cdots\times r_d}$
H-Tucker (hierarchical rank):
$A = \sum_{j_1=1}^{r_1}\cdots\sum_{j_q=1}^{r_q}\ \prod_{\nu=1}^{p} (B_\nu)\ \bigotimes_{\mu=1}^{d} a_{\mu,j_\mu}, \qquad J_\nu \subset \{1,\ldots,d\},$
where the $B_\nu$ are transfer tensors on mode clusters $J_\nu$ (made precise below).
Tucker revisited
A special case of Tucker is orthogonal Tucker: all matrices $U^{(\mu)}$, the so-called mode frames, are orthogonal (HOSVD):
$A = G \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)} = [[G;\, U^{(1)}, U^{(2)}, \ldots, U^{(N)}]],$
$A_{i_1 i_2 \ldots i_N} = \sum_{k_1=1}^{R_1}\sum_{k_2=1}^{R_2}\cdots\sum_{k_N=1}^{R_N} G_{k_1 k_2 \ldots k_N}\, u^{(1)}_{i_1 k_1}\, u^{(2)}_{i_2 k_2}\cdots u^{(N)}_{i_N k_N}, \qquad i_n = 1,\ldots,I_n.$
Best Approximation
For given orthogonal $U^{(\mu)}$, $\mu = 1,\ldots,N$, it holds
$\min_G \big\|A - G \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)}\big\| \iff G = A \times_1 \big(U^{(1)}\big)^T \times_2 \big(U^{(2)}\big)^T \cdots \times_N \big(U^{(N)}\big)^T.$
Truncation of tensor A to Tucker rank $(k_1,\ldots,k_N)$ via SVD:
$A^{(\mu)} = U_\mu \Sigma_\mu V_\mu^T, \qquad \Sigma_\mu = \mathrm{diag}(\sigma_{\mu,1},\ldots,\sigma_{\mu,n_\mu}), \quad \sigma_{\mu,1} \ge \cdots \ge \sigma_{\mu,n_\mu}.$
Define $\tilde U_\mu := U_\mu(:,1\!:\!k_\mu)$ and the Tucker truncation of A by
$T_{(k_1,\ldots,k_N)}(A) := A \times_1 \tilde U_1\tilde U_1^T \times_2 \tilde U_2\tilde U_2^T \cdots \times_N \tilde U_N\tilde U_N^T.$
Tucker Truncation Error
It holds
$\big\|A - T_{(k_1,\ldots,k_N)}(A)\big\|^2 \le \sum_{\mu=1}^{N}\sum_{i=k_\mu+1}^{n_\mu} \sigma_{\mu,i}^2 \le N\,\|A - A_{\mathrm{best}}\|^2,$
where $A_{\mathrm{best}}$ is the best possible approximation in Tucker$(k_1,\ldots,k_N)$.
Beware! Change of notation from here on: unfolding $A_{(n)} \to A^{(\mu)}$, and $N \to d$.
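A minimal NumPy sketch of this truncation (the helper names `unfold` and `tucker_truncate` are ours, not a library API); it projects every mode onto the leading left singular vectors and checks the error bound above numerically:

```python
import numpy as np

def unfold(A, mu):
    """Mode-mu unfolding A^(mu): rows indexed by i_mu, columns by the rest."""
    return np.moveaxis(A, mu, 0).reshape(A.shape[mu], -1)

def tucker_truncate(A, ranks):
    """T_(k1,...,kN)(A): project each mode onto its k leading singular vectors."""
    Us = []
    for mu, k in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(A, mu), full_matrices=False)
        Us.append(U[:, :k])
    T = A
    for mu, U in enumerate(Us):
        # multiply mode mu by the projector U U^T
        T = np.moveaxis(np.tensordot(U @ U.T, T, axes=([1], [mu])), 0, mu)
    return T

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 9, 10))
ranks = (3, 3, 3)
T = tucker_truncate(A, ranks)
# bound: ||A - T||^2 <= sum over modes of the discarded singular values squared
bound = sum((np.linalg.svd(unfold(A, mu), compute_uv=False)[k:] ** 2).sum()
            for mu, k in enumerate(ranks))
print(np.linalg.norm(A - T) ** 2, "<=", bound)
```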
Hierarchical Rank Model
$A \in \mathbb{R}^{I}, \qquad I = I_1 \times \cdots \times I_d.$
Matricization: reduce to a matrix with two (grouped) indices,
$A \in \mathbb{R}^{(I_1\times\cdots\times I_q)\times(I_{q+1}\times\cdots\times I_d)}.$
Compute the SVD of A:
$A = \sum_i \sigma_i\, U_i V_i^T, \qquad U_i \in \mathbb{R}^{I_1\times\cdots\times I_q},\quad V_i \in \mathbb{R}^{I_{q+1}\times\cdots\times I_d}.$
$U_i, V_i$, seen as vectors/tensors, are lower-dimensional. Repeat for the tensor matricization $U_l$:
$U_l = \sum_{i=1}^{r} \hat\sigma_i\, W_i^{(l)}\big(X_i^{(l)}\big)^T, \qquad W_j^{(l)} = \sum_{i} \tilde\sigma_i\, Y_i^{(l,j)}\big(Z_i^{(l,j)}\big)^T, \quad \text{etc.}$
9
Memory Costs A U1 … Uk V1 … Vk U1 … Vk … U1 … Vk U1 … Vk … U1 … Vk …………………………………………………………………….. U1…Vk ……………………………………………………U1…Vk
22d
nk ⋅
( ) 422d
nk ⋅
( ) ( ))dlog(dd)dlog( dnkOnkO =
⋅2Data complexity of the last row:
Hierarchical uniform subspaces
$A \in \mathbb{R}^{(I_1\times\cdots\times I_q)\times(I_{q+1}\times\cdots\times I_d)}, \qquad A = \sum_i \sigma_i\, U_i V_i^T, \quad U_i \in \mathbb{R}^{I_1\times\cdots\times I_q},\ V_i \in \mathbb{R}^{I_{q+1}\times\cdots\times I_d}.$
$U_i, V_i$, seen as vectors/tensors, are lower-dimensional.
Better and cheaper recursion in a binary tree: repeat for the tensor $U_l$, but with uniform (shared) subspaces,
$U_l = \sum_{i=1}^{r}\sum_{j=1}^{r} (B_l)_{i,j}\, W_i X_j^T, \qquad W_l = \sum_{i=1}^{r}\sum_{j=1}^{r} (\tilde B_l)_{i,j}\, Y_i Z_j^T, \quad \text{etc.}$
Leads to the smaller data complexity $O(dnk + dk^3)$.
Dimension Tree
[Figure: dimension tree for d = 5. Root {1,2,3,4,5} on level 0; interior nodes {1,2} and {3,4,5} on level 1; {1}, {2}, {3} and {4,5} on level 2; leaves {4}, {5} on level 3. Levels 0 to 3; root, leaves, and interior nodes marked.]
Dimension Tree
Definition: A dimension tree $T_I$ for dimension $d \in \mathbb{N}$ is a tree with root $\{1,\ldots,d\}$ and depth $p = \lceil\log_2(d)\rceil$ such that each node $t \in T_I$ is either
1. a leaf and singleton $t = \{\mu\}$ on level $l \in \{p-1, p\}$, or
2. the union of two disjoint successors $S(t) = \{s_1, s_2\}$: $t = s_1 \cup s_2$.
The level l of the tree is defined as the set of all nodes having a distance of exactly l to the root:
$T_I^l := \{t \in T_I \mid \mathrm{level}(t) = l\}.$
Set of leaves: $L(T_I)$. Set of interior nodes: $I(T_I)$. A node of the tree is a mode cluster = a union of modes.
Properties
Up to the last level: a complete binary tree. For a canonical dimension tree each interior node $t = \{\mu_1,\ldots,\mu_q\}$, $q > 1$, has the two successors
$t_1 := \{\mu_1,\ldots,\mu_r\}, \quad r := \lceil q/2 \rceil, \qquad t_2 := \{\mu_{r+1},\ldots,\mu_q\}.$
- Total number of nodes: $2d-1$
- Number of leaves: $d$
- Number of interior nodes: $d-1$
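These properties are easy to reproduce with a small recursive construction; a sketch (our own helper, following the canonical splitting rule above):

```python
def canonical_tree(modes):
    """Canonical dimension tree: split each cluster {mu_1,...,mu_q} into its
    first ceil(q/2) modes and the rest, down to singletons."""
    t = tuple(modes)
    if len(t) == 1:
        return {"cluster": t, "sons": []}
    r = (len(t) + 1) // 2              # r = ceil(q/2)
    return {"cluster": t,
            "sons": [canonical_tree(t[:r]), canonical_tree(t[r:])]}

def count_nodes(node):
    return 1 + sum(count_nodes(s) for s in node["sons"])

tree = canonical_tree(range(1, 6))     # root {1,...,5}
print(count_nodes(tree))               # 2*5 - 1 = 9 nodes
```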
Hierarchical Rank
For a mode cluster t in a dimension tree $T_I$ we define the complementary cluster $t' := \{1,\ldots,d\} \setminus t$, the index sets
$I_t := \times_{\mu\in t} I_\mu, \qquad I_{t'} := \times_{\mu\in t'} I_\mu,$
and the related t-matricization
$\mathcal{M}_t: \mathbb{R}^{I} \to \mathbb{R}^{I_t \times I_{t'}}, \qquad \big(\mathcal{M}_t(A)\big)_{(i_\mu)_{\mu\in t},\,(i_\mu)_{\mu\in t'}} := A_{i_1,\ldots,i_d}.$
Notation: $A^{(t)} := \mathcal{M}_t(A)$.
[Figure: the d = 5 dimension tree again; e.g. for the leaf $t = \{3\}$ one obtains $A^{(\{3\})} = A^{\{3\},\{1,2,4,5\}}$.]
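In NumPy the t-matricization is a transpose followed by a reshape; a sketch with 0-based modes (the slides count modes from 1):

```python
import numpy as np

def matricize(A, t):
    """t-matricization A^(t): rows indexed by the modes in t, columns by t'."""
    t_comp = [mu for mu in range(A.ndim) if mu not in t]  # complementary cluster
    perm = list(t) + t_comp
    rows = int(np.prod([A.shape[mu] for mu in t]))
    return np.transpose(A, perm).reshape(rows, -1)

A = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
print(matricize(A, [0, 1]).shape)                        # (6, 20)
# A^(t') = (A^(t))^T, as stated on the next slide
print(np.array_equal(matricize(A, [2, 3]), matricize(A, [0, 1]).T))
```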
Example
$A = a \otimes b \otimes c \otimes d \in \mathbb{R}^{I_1\times I_2\times I_3\times I_4}:$
$A^{(\{1,2\})} = (a\otimes b)(c\otimes d)^T \in \mathbb{R}^{(I_1\times I_2)\times(I_3\times I_4)}$
$A^{(\{3,4\})} = (c\otimes d)(a\otimes b)^T \in \mathbb{R}^{(I_3\times I_4)\times(I_1\times I_2)}$
$A^{(\{2,3\})} = (b\otimes c)(a\otimes d)^T \in \mathbb{R}^{(I_2\times I_3)\times(I_1\times I_4)}$
$A^{(\{1\})} = a\,(b\otimes c\otimes d)^T \in \mathbb{R}^{I_1\times(I_2\times I_3\times I_4)}$
For a tensor A, dimension tree T, and node t with complementary cluster t' it holds $A^{(t')} = \big(A^{(t)}\big)^T$.
Hierarchical Rank
For a dimension tree $T_I$ the hierarchical rank $(k_t)_{t\in T_I}$ of a tensor $A \in \mathbb{R}^{I}$ is defined by
$\forall t \in T_I:\quad k_t := \mathrm{rank}\big(A^{(t)}\big).$
The set of tensors of hierarchical rank at most (node-wise) $(k_t)_{t\in T_I}$ is denoted by
$\mathcal{H}\text{-Tucker}\big((k_t)_{t\in T_I}\big) := \big\{A \in \mathbb{R}^{I} \mid \forall t \in T_I:\ \mathrm{rank}(A^{(t)}) \le k_t\big\}.$
SVD of $A^{(t)}$, d = 5, $n_i$ = 25
Matrix sizes of $A^{(t)}$ and number of large singular values:
- full tensor: $25^5 = 9765625$ entries
- $k_{t_1}$: $25^3\times25^2 = 15625\times625$, 22 large singular values
- $k_{t_2}$, $k_{t_3}$: $25^2\times25^3 = 625\times15625$, 22 each
- $k_{t_4},\ldots,k_{t_8}$: $25\times25^4 = 25\times390625$, 22 each
Nestedness of Matricizations
Let T be a dimension tree and $A \in \mathbb{R}^{I}$ a tensor with hierarchical rank $(k_t)_{t\in T}$. Let $t \in T$ be a node with sons $s_1$, $s_2$. Let
$(U_t)_i,\ i = 1,\ldots,k_t$, be a basis of image$\big(A^{(t)}\big)$,
$(U_{s_1})_j,\ j = 1,\ldots,k_{s_1}$, be a basis of image$\big(A^{(s_1)}\big)$,
$(U_{s_2})_l,\ l = 1,\ldots,k_{s_2}$, be a basis of image$\big(A^{(s_2)}\big)$.
Then there exist coefficients $(B_t)_{i,j,l}$ such that
$(U_t)_i = \sum_{j=1}^{k_{s_1}}\sum_{l=1}^{k_{s_2}} (B_t)_{i,j,l}\,(U_{s_1})_j \otimes (U_{s_2})_l.$
If the bases $U_{s_1}, U_{s_2}$ are orthogonal then
$(B_t)_{i,j,l} = \big\langle (U_t)_i,\ (U_{s_1})_j \otimes (U_{s_2})_l \big\rangle.$
Proof:
Consider one column of the matricization $A^{(t)}$. For a fixed index $(j_\mu)_{\mu\in t'}$ this column defines a matrix
$Y_{(j_\mu)_{\mu\in s_1},\,(j_\mu)_{\mu\in s_2}} := \big(A^{(t)}\big)_{:,\,(j_\mu)_{\mu\in t'}}.$
By assumption the rows and columns of Y are in the span of $U_{s_1}$, $U_{s_2}$:
$Y = \sum_{j=1}^{k_{s_1}}\sum_{l=1}^{k_{s_2}} c_{j,l}\,(U_{s_1})_j\,(U_{s_2})_l^T$
for some coefficients $c_{j,l}$. Therefore every column of $A^{(t)}$ is a linear combination of the $(U_{s_1})_j \otimes (U_{s_2})_l$.
t-frame, frame tree
Let $t \in T$ be a mode cluster and $(k_t)_{t\in T}$ a family of non-negative integers. We call a matrix $U_t \in \mathbb{R}^{I_t \times k_t}$ a t-frame and a tuple of frames $(U_s)_{s\in T_I}$ a frame tree.
A frame is called orthogonal if its columns are orthogonal. A frame tree is called orthogonal if each frame (except the root) is orthogonal.
21
transfer tensor
A frame tree is nested if for each interior mode cluster t with successors S(t) = {t1,t2} the following relation holds:
( ){ } ( ) ( ){ }.kj,ki|UUspanki|Uspan ttjtittit 2121111 ≤≤≤≤⊗⊂≤≤
The corresponding tensor relative to the represen- tation of by is called the transfer tensor :
21 ttt kkkt IRB ××∈
( )itU21 tt U,U
( ) ( ) ( ) ( ) .1 2
1 1,, 21∑∑
= =
⊗=t tk
j
k
lltjtljitit UUBU
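A small numerical sketch of the nestedness relation and the transfer tensor formula for orthogonal son frames (all names are ours; the t-frame is built to be nested by construction):

```python
import numpy as np

# orthogonal son frames U1 (I_t1 x k1) and U2 (I_t2 x k2)
rng = np.random.default_rng(1)
U1, _ = np.linalg.qr(rng.standard_normal((6, 2)))
U2, _ = np.linalg.qr(rng.standard_normal((5, 3)))
kron = np.einsum("aj,bl->abjl", U1, U2).reshape(30, 6)  # columns (U1)_j kron (U2)_l

C = rng.standard_normal((6, 4))
Ut = kron @ C                 # a nested t-frame with k_t = 4 by construction

# (Bt)_{i,j,l} = <(Ut)_i, (U1)_j kron (U2)_l>, valid since the columns of
# kron are orthonormal
Bt = (kron.T @ Ut).T.reshape(4, 2, 3)
Ut_rec = kron @ Bt.reshape(4, -1).T    # reconstruct Ut from the transfer tensor
print(np.allclose(Ut, Ut_rec))         # True
```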
Hierarchical Tucker Format
T is a dimension tree, $(k_t)_{t\in T}$ a family of non-negative integers, $A \in \mathcal{H}\text{-Tucker}\big((k_t)_{t\in T}\big)$. Let $(U_t)_{t\in T}$ be a nested frame tree with transfer tensors $(B_t)_{t\in I(T)}$ and
$\forall t \in T_I:\quad \mathrm{image}\big(A^{(t)}\big) = \mathrm{image}(U_t), \qquad A = U_{\{1,\ldots,d\}}.$
Then the representation $\big((B_t)_{t\in I(T)},\ (U_t)_{t\in L(T)}\big)$ is a hierarchical Tucker representation. The family $(k_t)_{t\in T}$ is the hierarchical representation rank.
Note that the columns of $U_t$ need not be linearly independent! The representation with an orthogonal frame tree is unique up to orthogonal transformations of the t-frames.
Storage complexity
Again T is a dimension tree with given $A \in \mathcal{H}\text{-Tucker}\big((k_t)_{t\in T}\big)$ in hierarchical Tucker representation $\big((B_t)_{t\in I(T)},\ (U_t)_{t\in L(T)}\big)$, and for $S(t) = \{t_1,t_2\}$ the transfer tensors $B_t \in \mathbb{R}^{k_t\times k_{t_1}\times k_{t_2}}$ are of minimal size. Then the total storage for all transfer tensors and leaf-frames in terms of number of entries is bounded by
$\mathrm{Storage}\big((B_t)_{t\in I(T)},\,(U_t)_{t\in L(T)}\big) \le (d-1)\,k^3 + \sum_{\mu=1}^{d} k\, n_\mu, \qquad k := \max_{t\in T} k_t,$
which is linearly bounded in the dimension d (provided the representation parameter k is uniformly bounded).
Proof: For each leaf $t = \{\mu\}$ of the dimension tree we have to store the t-frame $U_t \in \mathbb{R}^{n_\mu \times k_t}$, which yields the second term $\sum_{\mu=1}^{d} k\, n_\mu$. For all $d-1$ interior mode clusters we have to store the transfer tensors $B_t \in \mathbb{R}^{k_t\times k_{t_1}\times k_{t_2}}$; each has at most $k^3$ entries.
Hierarchical Truncation Error
Def.: T a dimension tree, $t \in T$ and $U_t$ an orthogonal t-frame. The orthogonal frame projection $\pi_t: \mathbb{R}^{I} \to \mathbb{R}^{I}$ is defined as
$(\pi_t A)^{(t)} := U_t U_t^T A^{(t)} \text{ for } t \ne \{1,\ldots,d\}, \qquad \pi_{\{1,\ldots,d\}} A := A.$
Theorem (hierarchical truncation error): Dimension tree T, $A \in \mathbb{R}^{I}$. Let $A_{\mathrm{best}}$ be the best approximation of A in $\mathcal{H}\text{-Tucker}\big((k_t)_{t\in T}\big)$, and $\pi_t$ the orthogonal frame projection for the t-frame $U_t$ that consists of the left singular vectors of $A^{(t)}$ corresponding to the $k_t$ largest singular values $\sigma_{t,i}$ of $A^{(t)}$. Then it holds
$\Big\|A - \prod_{t\in T}\pi_t A\Big\|^2 \le \sum_{t\in T}\sum_{i>k_t}\sigma_{t,i}^2 \le 2d\,\|A - A_{\mathrm{best}}\|^2.$
Proof: Lemma: it holds
$\Big\|A - \prod_{t\in T}\pi_t A\Big\|^2 \le \sum_{t\in T}\|A - \pi_t A\|^2.$
Proof of Lemma:
$\|A - \pi_s\pi_t A\|^2 = \|A - \pi_t A\|^2 + \|\pi_t A - \pi_s\pi_t A\|^2 = \|A - \pi_t A\|^2 + \|\pi_t(A - \pi_s A)\|^2 \le \|A - \pi_t A\|^2 + \|A - \pi_s A\|^2.$
Proof of Theorem: it holds
$\|A - \pi_t A\|^2 \le \sum_{i>k_t}\sigma_{t,i}^2 \le \|A - A_{\mathrm{best}}\|^2.$
Applying the above Lemma and the result on the number of nodes of a dimension tree yields
$\Big\|A - \prod_{t\in T}\pi_t A\Big\|^2 \le \sum_{t\in T}\sum_{i>k_t}\sigma_{t,i}^2 \le 2d\,\|A - A_{\mathrm{best}}\|^2.$
The constant can be improved to $(2d-3)$.
Properties of the H-Tucker Format
The set $\mathcal{H}\text{-Tucker}\big((k_t)_{t\in T}\big)$ is
- a closed set in $\mathbb{R}^{I}$, but
- not a linear space (the rank increases under linear combinations).
Storage complexity: with $n := \max_{\mu=1,\ldots,d} n_\mu$ and $k := \max_{t\in T} k_t$,
$\mathrm{Storage}(A) \le \sum_{\mu=1}^{d} k_\mu n_\mu + \sum_{t\in I(T),\ \mathrm{sons}(t)=\{t_1,t_2\}} k_t k_{t_1} k_{t_2} \le dnk + dk^3.$
All tensors of canonical rank k are contained in $\mathcal{H}\text{-Tucker}\big((k_t)_{t\in T}\big)$ (also all tensors of border rank k). The set $\mathcal{H}\text{-Tucker}\big((k_t)_{t\in T}\big)$ is much thinner than the Tucker format because we impose additional rank conditions.
Example
Consider $A \in \mathbb{R}^{3\times3\times3}$, defined through its $\{1,2\}$-matricization (a $9\times3$ matrix) with columns of the form $u_i \otimes q_j$:
$\big(A^{(\{1,2\})}\big)_{(i,j),l} = \begin{cases}(u_1\otimes q_1)_{i,j} & \text{if } l = 1,\\ (u_2\otimes q_2)_{i,j} & \text{if } l = 2,\\ (u_1\otimes q_2)_{i,j} & \text{if } l = 3,\end{cases}$
with $u_1 = (1,0,0)^T$, $u_2 = (0,1,0)^T$ and unit vectors $q_1, q_2$ chosen such that $u_1, u_2, q_1$ are linearly independent.
Dimension tree: root $\{1,2,3\}$ with sons $t = \{1,2\}$ and $\{3\}$; t has the sons $t_1 = \{1\}$ and $t_2 = \{2\}$.
Consider orthogonal mode frames:
$U_t := \big[\,u_1\otimes q_1 \mid u_2\otimes q_2\,\big], \qquad U_{t_1} := [\,u_1 \mid u_2\,], \qquad U_{t_2} := [\,q_1 \mid q_2\,].$
Written out entrywise, $A^{(\{1,2\})} = \big[\,u_1\otimes q_1 \mid u_2\otimes q_2 \mid u_1\otimes q_2\,\big]$ is the $9\times3$ matrix whose only non-zero rows correspond to the supports of $u_1$ and $u_2$.
The leaf projection $\pi_{t_1}$ acts on the first mode:
$\big(\pi_{t_1}A\big)^{(1)} = U_{t_1}U_{t_1}^T A^{(1)} = \big(u_1u_1^T + u_2u_2^T\big)A^{(1)} = \mathrm{diag}(1,1,0)\,A^{(1)},$
and writing out the rank-one terms $u_1\otimes q_1$, $u_2\otimes q_2$, $u_1\otimes q_2$ entrywise gives
$\mathrm{rank}\big((\pi_{t_1}A)^{(1)}\big) = 2.$
The leaf projections combine to
$\pi_{t_1}\pi_{t_2} \longleftrightarrow Q := U_{t_1}U_{t_1}^T \otimes U_{t_2}U_{t_2}^T = \big(u_1u_1^T + u_2u_2^T\big)\otimes\big(q_1q_1^T + q_2q_2^T\big),$
and $QU_t = U_t$, since the columns of $U_t$ are of the form $u\otimes q$ with $u \in \mathrm{span}\{u_1,u_2\}$ and $q \in \mathrm{span}\{q_1,q_2\}$. Hence
$\big(\pi_t\pi_{t_1}\pi_{t_2}A\big)^{(\{1,2\})} = U_tU_t^T\,Q\,A^{(\{1,2\})} = U_tU_t^T\,A^{(\{1,2\})}.$
$\mathrm{rank}\big((\pi_t\pi_{t_1}\pi_{t_2}A)^{(1)}\big) = 3$ (a $3\times9$ matrix), because $u_1, u_2, q_1$ are linearly independent.
The first projection $\pi_{t_1}\pi_{t_2}$ maps A into Tucker(2,2,3), but after the coarser projection $\pi_t$ the 1-mode rank is 3, and thus $\pi_t\pi_{t_1}\pi_{t_2}A \notin \mathrm{Tucker}(2,2,3)$. This is because $\pi_t$ mixes the $t_1$-frame and the $t_2$-frame. Here Tucker(2,2,3) means ranks in the standard Tucker format.
Root-to-Leaves Truncation
Input: tensor A, dimension tree $T_I$, target rank $(k_t)_{t\in T}$.
For each singleton $t \in L(T_I)$ do: compute the SVD of $A^{(t)}$, store the dominant $k_t$ left singular vectors in the columns of the t-frame $U_t$.
For $l = p-1,\ldots,0$ do: for each interior mode cluster t on level l do: compute the SVD of $A^{(t)}$, store the dominant $k_t$ left singular vectors in the columns of the t-frame $U_t$. Let $U_{t_1}$ and $U_{t_2}$ denote the frames for the successors of t on level $l+1$. Compute the entries of the transfer tensor
$(B_t)_{i,j,\nu} = \big\langle (U_t)_i,\ (U_{t_1})_j \otimes (U_{t_2})_\nu \big\rangle.$
Compute the entries of the root (with sons $t_1, t_2$) transfer tensor:
$\big(B_{\{1,\ldots,d\}}\big)_{1,j,\nu} = \big\langle A,\ (U_{t_1})_j \otimes (U_{t_2})_\nu \big\rangle.$
Return the H-Tucker representation $\big((U_t)_{t\in L(T_I)},\ (B_t)_{t\in I(T_I)}\big)$ for $A_H \in \mathcal{H}\text{-Tucker}\big((k_t)_{t\in T_I}\big)$.
Complexity: $O\big((n_1\cdots n_d)^{3/2}\big).$
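A compact d = 4 sketch of this truncation for the balanced tree (root {1,2,3,4} with sons {1,2} and {3,4}; all helper names are ours, and modes are 0-based). The frames come from SVDs of the matricizations of A, the transfer tensors from the inner product formula above, and the final approximation is assembled from the representation, which composes the same frame projections the algorithm applies level by level:

```python
import numpy as np

def matricize(A, t):
    t_comp = [m for m in range(A.ndim) if m not in t]
    rows = int(np.prod([A.shape[m] for m in t]))
    return np.transpose(A, list(t) + t_comp).reshape(rows, -1)

def frame(A, t, k):
    """Dominant k left singular vectors of A^(t)."""
    U, _, _ = np.linalg.svd(matricize(A, t), full_matrices=False)
    return U[:, :k]

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 5, 6, 7))
k = 3

# t-frames for all non-root clusters
U = {t: frame(A, t, k) for t in [(0,), (1,), (2,), (3,), (0, 1), (2, 3)]}

def transfer(t, t1, t2):
    """(B_t)_{i,j,l} = <(U_t)_i, (U_t1)_j kron (U_t2)_l>."""
    return (np.kron(U[t1], U[t2]).T @ U[t]).T.reshape(k, k, k)

B01, B23 = transfer((0, 1), (0,), (1,)), transfer((2, 3), (2,), (3,))
# root transfer tensor: coefficients of A^((0,1)) in the interior son frames
Broot = (np.kron(U[(0, 1)], U[(2, 3)]).T
         @ matricize(A, (0, 1)).reshape(-1)).reshape(k, k)

# re-expand the nested frames and assemble the truncated tensor
U01 = np.kron(U[(0,)], U[(1,)]) @ B01.reshape(k, -1).T
U23 = np.kron(U[(2,)], U[(3,)]) @ B23.reshape(k, -1).T
A_H = (U01 @ Broot @ U23.T).reshape(A.shape)
print(np.linalg.norm(A - A_H) / np.linalg.norm(A))   # relative truncation error
```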
Brothers and related matricizations
For a dimension tree $T_I$ and a non-root mode cluster t with father f we define the unique mode cluster $\bar t$ as the brother of t such that $f = t \cup \bar t$.
Let $T_I$ be a dimension tree with interior node $t = t_1 \cup t_2$. Assume the matricization and the representation
$A^{(t)} = \sum_{\nu=1}^{k} u_\nu v_\nu^T, \qquad u_\nu = \sum_{j=1}^{k_1}\sum_{l=1}^{k_2} c_{\nu,j,l}\, x_j \otimes y_l, \quad x_j \in \mathbb{R}^{I_{t_1}},\ y_l \in \mathbb{R}^{I_{t_2}},\ \nu = 1,\ldots,k.$
This gives the matricization
$A^{(t_1)} = \sum_{j=1}^{k_1} x_j \Big(\sum_{\nu=1}^{k}\sum_{l=1}^{k_2} c_{\nu,j,l}\, y_l \otimes v_\nu\Big)^T.$
Proof:
$\big(A^{(t_1)}\big)_{(i_\mu)_{\mu\in t_1},\,(i_\mu)_{\mu\in t_1'}} = A_{(i_1,\ldots,i_d)} = \sum_{\nu=1}^{k}\sum_{j=1}^{k_1}\sum_{l=1}^{k_2} c_{\nu,j,l}\,(x_j)_{(i_\mu)_{\mu\in t_1}}\,(y_l)_{(i_\mu)_{\mu\in t_2}}\,(v_\nu)_{(i_\mu)_{\mu\in t'}}$
$= \sum_{j=1}^{k_1} (x_j)_{(i_\mu)_{\mu\in t_1}}\,\Big(\sum_{\nu=1}^{k}\sum_{l=1}^{k_2} c_{\nu,j,l}\, y_l \otimes v_\nu\Big)_{(i_\mu)_{\mu\in t_1'}},$
i.e. $A^{(t_1)} = \sum_{j=1}^{k_1} x_j \big(\sum_{\nu,l} c_{\nu,j,l}\, y_l \otimes v_\nu\big)^T$.
Matricization in H-Tucker format
$T_I$ dimension tree, $A \in \mathcal{H}\text{-Tucker}\big((k_t)_{t\in T_I}\big)$ with nested orthogonal frame tree $(U_t)_{t\in T_I}$ and transfer tensors $(B_t)_{t\in T_I}$.
For $p > 0$ let $\mathrm{Root}(T_I) = t_0, t_1, \ldots, t_{p-1}, t_p = t$ be a path of length p. Let $\bar U_1,\ldots,\bar U_p$ denote the frames of the corresponding brothers, $B_0,\ldots,B_{p-1}$ the corresponding transfer tensors, and $k_0,\ldots,k_p$ the corresponding representation ranks. We always assume that the brother comes first:
$(U_{t_l})_\nu = \sum_{i}\sum_{j} (B_l)_{\nu,i,j}\,(\bar U_{l+1})_i \otimes (U_{t_{l+1}})_j.$
Then it holds with the complementary frame $V_t$
$A^{(t)} = \sum_{\nu=1}^{k_t} (U_t)_\nu (V_t)_\nu^T = U_t V_t^T,$
$(V_t)_{j_p} = \sum_{i_1,\ldots,i_p}\sum_{j_1,\ldots,j_{p-1}} (B_0)_{1,i_1,j_1}\cdots(B_{p-1})_{j_{p-1},i_p,j_p}\,(\bar U_1)_{i_1}\otimes\cdots\otimes(\bar U_p)_{i_p}.$
Accumulated Transfer Tensors
$(\hat B_1)_{j_1,\,i_1} := (B_0)_{1,i_1,j_1},$
$(\hat B_l)_{j_l,\,(i_1,\ldots,i_l)} := \sum_{j_{l-1}=1}^{k_{l-1}} (\hat B_{l-1})_{j_{l-1},\,(i_1,\ldots,i_{l-1})}\,(B_{l-1})_{j_{l-1},i_l,j_l}, \qquad l = 2,\ldots,p,$
$\hat B_t := \hat B_p.$
The accumulated transfer tensors are useful for computing all matricizations out of the transfer tensors.
H-Tucker
$(U_t)_i = \sum_{j=1}^{k_{t_1}}\sum_{l=1}^{k_{t_2}} (B_t)_{i,j,l}\,(U_{t_1})_j \otimes (U_{t_2})_l$
[Figure: the recursive structure of the format. The root satisfies $U_{t_0} = A$; the transfer tensors $B_{t_0}, B_{t_1}, \ldots$ connect each cluster to its sons, and the leaves carry the frames $U_{t_1}, U_{t_2}, \ldots$ with ranks $k_{t_1}, k_{t_2}, \ldots$]
(Non-orthogonal) H-Tucker for CP
CP tensor: $A = \sum_{i=1}^{k} \bigotimes_{\mu=1}^{d} a_{\mu,i}, \qquad a_{\mu,i} \in \mathbb{R}^{I_\mu}.$
H-Tucker representation:
Leaves: $\forall t = \{\mu\} \in L(T_I):\quad U_t := [\,a_{\mu,1},\ldots,a_{\mu,k}\,], \quad k_t := k.$
Interior nodes, transfer tensors: $\forall t \in I(T_I)\setminus\mathrm{Root}(T_I)$:
$(B_t)_{i,j,l} := \begin{cases} 1 & \text{if } i = j = l,\\ 0 & \text{otherwise,}\end{cases} \qquad B_t \in \mathbb{R}^{k\times k\times k},\quad k_t := k.$
Root transfer tensor:
$\big(B_{\{1,\ldots,d\}}\big)_{1,j,l} := \begin{cases} 1 & \text{if } j = l,\\ 0 & \text{otherwise,}\end{cases} \qquad B_{\{1,\ldots,d\}} \in \mathbb{R}^{1\times k\times k},\quad k_{\{1,\ldots,d\}} := 1.$
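A sketch of this construction in NumPy (helper names ours); the d = 2 check below verifies the root rule, where $A^{(\{1\})} = U_1\,B_{\mathrm{root}}\,U_2^T$ reduces to $\sum_i a_i b_i^T$:

```python
import numpy as np

def cp_interior_transfer(k):
    """(B_t)_{i,j,l} = 1 iff i == j == l (a 'diagonal' k x k x k tensor)."""
    B = np.zeros((k, k, k))
    idx = np.arange(k)
    B[idx, idx, idx] = 1.0
    return B

def cp_root_transfer(k):
    """(B_root)_{1,j,l} = 1 iff j == l, of size 1 x k x k."""
    return np.eye(k)[None, :, :]

k = 4
rng = np.random.default_rng(3)
a, b = rng.standard_normal((5, k)), rng.standard_normal((6, k))
# leaf frames are the CP factor matrices U_mu = [a_{mu,1} ... a_{mu,k}]
A_mat = sum(np.outer(a[:, i], b[:, i]) for i in range(k))   # d = 2 CP tensor
print(np.allclose(a @ cp_root_transfer(k)[0] @ b.T, A_mat))  # True
```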
Examples
Consider the tensor
$A_{(i_1,\ldots,i_d)} := \Big(\sum_{\mu=1}^{d} i_\mu^2\Big)^{-1/2}, \qquad d = 5,\ n = 25,$
as a discretization of the function $1/\|x\|$ on $[1,25]^d$.
Approximation by H-Tucker: the matricizations $25^3\times25^2 = 15625\times625$, $25^2\times25^3 = 625\times15625$ and $25\times25^4 = 25\times390625$ each show about 22 large singular values (the table from the earlier "SVD of $A^{(t)}$" slide).
Tensor Train
Instead of a complete binary tree we can also consider a linear list:
- Partitioning the index set in half: $(i_1,\ldots,i_{2d}) \to (i_1,\ldots,i_d),\ (i_{d+1},\ldots,i_{2d})$
- Partitioning the index set into one and the rest: $(i_1,\ldots,i_{d+1}) \to (i_1,\ldots,i_d),\ (i_{d+1})$
Tensor train by recursive splitting:
TT: $A_{(i_1,\ldots,i_d)} = \sum_{j_1,\ldots,j_{d-1}=1}^{D_1,\ldots,D_{d-1}} g^{1}_{i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\cdots g^{d-1}_{j_{d-2};i_{d-1};j_{d-1}}\, g^{d}_{j_{d-1};i_d}$
CP: $A_{(i_1,\ldots,i_d)} = \sum_{j=1}^{D} u^{1}_{i_1;j}\, u^{2}_{i_2;j}\cdots u^{d}_{i_d;j}$
Recursive splitting:
$A_{(i_1,\ldots,i_d)} = \sum_{j_1=1}^{D_1} g^{1}_{i_1;j_1}\, A^{1}_{j_1;(i_2,\ldots,i_d)} = \sum_{j_1=1}^{D_1}\sum_{j_2=1}^{D_2} g^{1}_{i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\, A^{2}_{j_2;(i_3,\ldots,i_d)} = \cdots = \sum_{j_1,\ldots,j_{d-1}} g^{1}_{i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\, g^{3}_{j_2;i_3;j_3}\cdots g^{d}_{j_{d-1};i_d}$
Tensor train by recursive splitting (d = 4):
$A(i_1,\ldots,i_4) = \sum_{j_1=1}^{D_1} g^{1}_{i_1;j_1}\, A^{1}_{j_1;(i_2,i_3,i_4)} = \sum_{j_1,j_2=1}^{D_1,D_2} g^{1}_{i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\, A^{2}_{j_2;(i_3,i_4)} = \sum_{j_1,j_2,j_3=1}^{D_1,D_2,D_3} g^{1}_{i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\, g^{3}_{j_2;i_3;j_3}\, g^{4}_{j_3;i_4}$
[Figure: the chain $i_1 - j_1 - i_2 - j_2 - i_3 - j_3 - i_4$ built up step by step.]
Tensor Train Network
[Figure: the TT network for A with physical indices $i_1, i_2, \ldots, i_{d-1}, i_d$ and cores carrying the index groups $(i_1 j_1),\ (j_1 i_2 j_2),\ (j_2 i_3 j_3),\ \ldots,\ (j_{d-2} i_{d-1} j_{d-1}),\ (j_{d-1} i_d)$.]
Matrix Formulation
$A_{(i_1,\ldots,i_d)} = \sum_{j_1,\ldots,j_{d-1}} g^{1}_{i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\cdots g^{d}_{j_{d-1};i_d} = G^{1}_{i_1}\, G^{2}_{i_2}\cdots G^{d-1}_{i_{d-1}}\, G^{d}_{i_d}$
with $D_{j-1}\times D_j$ matrices $G^{j}_{i_j}$ as core tensors,
$i_j = 1,\ldots,n_j, \qquad G^{j}_{i_j} \in \mathbb{R}^{D_{j-1}\times D_j} \text{ with } D_0 = D_d = 1.$
$G^{1}_{i_1}$ and $G^{d}_{i_d}$ are vectors (a row and a column vector, respectively). The j indices are related to the matrix product.
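A sketch of evaluating a single entry via this chain of matrix products (cores stored as (D_{k-1}, n_k, D_k) arrays; names ours):

```python
import numpy as np

def tt_entry(cores, idx):
    """A_(i1,...,id) = G^1_{i1} G^2_{i2} ... G^d_{id}: a product of matrices.
    Core k has shape (D_{k-1}, n_k, D_k) with D_0 = D_d = 1."""
    v = np.ones((1, 1))
    for G, i in zip(cores, idx):
        v = v @ G[:, i, :]          # pick slice i_k, extend the matrix chain
    return v[0, 0]

# random TT with d = 4, n = 5, ranks (1, 3, 3, 3, 1)
rng = np.random.default_rng(5)
D = [1, 3, 3, 3, 1]
cores = [rng.standard_normal((D[k], 5, D[k + 1])) for k in range(4)]
print(tt_entry(cores, (0, 1, 2, 3)))
```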
Visualization
$A_{i_1,\ldots,i_d} = G^{1}_{i_1}\, G^{2}_{i_2}\cdots G^{d-1}_{i_{d-1}}\, G^{d}_{i_d}$
[Figure: for each mode $\mu$ pick the $i_\mu$-th slice of the core $G^{\mu}$ (one of $n_\mu$ matrices $G^{\mu}_{1},\ldots,G^{\mu}_{n_\mu}$) and multiply the chain of $D_{\mu-1}\times D_\mu$ matrices.]
Periodic case:
$i_j = 1,\ldots,n_j; \qquad G^{j}_{i_j} \in \mathbb{R}^{D_{j-1}\times D_j} \text{ for } j = 1,\ldots,d, \quad D_0 = D_d.$
With an additional index $j_d$, and the summation over $j_d$ represented by the trace:
$A_{(i_1,\ldots,i_d)} = \sum_{j_1,\ldots,j_d} g^{1}_{j_d;i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\cdots g^{d}_{j_{d-1};i_d;j_d} = \mathrm{trace}\big(G^{1}_{i_1}\, G^{2}_{i_2}\cdots G^{d}_{i_d}\big)$
For d = 4:
$A_{(i_1,\ldots,i_4)} = \sum_{j_1,j_2,j_3,j_4=1}^{D_1,D_2,D_3,D_4} g^{1}_{j_4;i_1;j_1}\, g^{2}_{j_1;i_2;j_2}\, g^{3}_{j_2;i_3;j_3}\, g^{4}_{j_3;i_4;j_4}$
[Figure: the periodic TT (ring) network, with the extra bond index $j_4$ closing the chain.]
Rank - Unfolding
$A_p := A_{\{i_1,\ldots,i_p\},\{i_{p+1},\ldots,i_d\}}, \qquad r_p := \mathrm{rank}(A_p).$
The $D_p$ are called compression ranks (= sizes of the core matrices).
Theorem: If for each unfolding $\mathrm{rank}(A_p) = r_p = D_p$, then there exists a TT decomposition with compression ranks not higher than $D_p$.
Proof: Start with the first unfolding,
$A_1 = UV^T, \qquad A_{i_1,(i_2\ldots i_d)} = \sum_{j_1=1}^{n_1} U_{i_1,j_1}\, V^T_{j_1,(i_2,\ldots,i_d)}.$
Consider V also as a $(d-1)$-tensor with indices $\{j_1i_2\},\{i_3\},\ldots,\{i_d\}$, with the "long index" $j_1i_2$ varying from 1 to $n_1n_2$, and consider all unfoldings of V, resulting in $V_2,\ldots,V_d$. We show: $\mathrm{rank}(V_p) \le r_p$.
Proof: To prove $\mathrm{rank}(V_p) \le r_p$ we express V as
$A = UV^T \;\Rightarrow\; V^T = \big(U^TU\big)^{-1}U^TA = WA,$
componentwise
$V^T_{j_1,(i_2,\ldots,i_d)} = \sum_{i_1=1}^{n_1} W_{j_1,i_1}\, A_{i_1,i_2,\ldots,i_d}.$
Because the p-th mode has compression rank $r_p$ it holds
$A_{(i_1,\ldots,i_p),(i_{p+1},\ldots,i_d)} = \sum_{\beta=1}^{r_p} F_{(i_1,\ldots,i_p),\beta}\, G_{\beta,(i_{p+1},\ldots,i_d)} \;\Rightarrow$
$(V_p)_{(j_1,i_2,\ldots,i_p),(i_{p+1},\ldots,i_d)} = \sum_{i_1=1}^{n_1} W_{j_1,i_1}\sum_{\beta=1}^{r_p} F_{(i_1,\ldots,i_p),\beta}\, G_{\beta,(i_{p+1},\ldots,i_d)} = \sum_{\beta=1}^{r_p} H_{(j_1,i_2,\ldots,i_p),\beta}\, G_{\beta,(i_{p+1},\ldots,i_d)}$
with, componentwise,
$H_{(j_1,i_2,\ldots,i_p),\beta} = \sum_{i_1=1}^{n_1} W_{j_1,i_1}\, F_{(i_1,\ldots,i_p),\beta},$
resulting in $\mathrm{rank}(V_p) \le r_p$. Repeat what we started with A and $i_1$, now for V and $i_2$, and so on.
Complexity
Following the proof, the TT form of a general tensor A can be derived by successive SVDs of matrix unfoldings.
The number of parameters in the TT format is bounded by $(d-2)nD^2 + 2nD$, where $n = \max(n_1,\ldots,n_d)$ and $D = \max(r_1,\ldots,r_d)$. Proof: each core consists of at most n matrices, and the number of entries per matrix is bounded by $D^2$ for the interior cores, resp. D for the first/last (row/column) vectors.
In the periodic case the bound is given by $dnD^2$.
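A minimal TT-SVD sketch along the lines of the proof (successive SVDs of unfoldings, here with a plain maximum-rank cut; names ours). The test tensor is a discretization similar in spirit to the $1/\|x\|$ example earlier, shifted by 1 to avoid the singularity at the zero index:

```python
import numpy as np

def tt_svd(A, max_rank):
    """TT cores G^k of shape (D_{k-1}, n_k, D_k), D_0 = D_d = 1."""
    d, shape = A.ndim, A.shape
    cores, C, r = [], A, 1
    for k in range(d - 1):
        C = C.reshape(r * shape[k], -1)          # unfold: left indices vs rest
        Uk, s, Vt = np.linalg.svd(C, full_matrices=False)
        rk = min(max_rank, len(s))
        cores.append(Uk[:, :rk].reshape(r, shape[k], rk))
        C = s[:rk, None] * Vt[:rk]               # carry the remainder along
        r = rk
    cores.append(C.reshape(r, shape[-1], 1))
    return cores

A = np.fromfunction(lambda *i: 1.0 / np.sqrt(1.0 + sum(x**2 for x in i)),
                    (10, 10, 10, 10))
cores = tt_svd(A, max_rank=5)
B = cores[0].reshape(-1, cores[0].shape[-1])
for G in cores[1:]:                              # contract the train back
    B = (B @ G.reshape(G.shape[0], -1)).reshape(-1, G.shape[-1])
print(np.linalg.norm(A - B.reshape(A.shape)) / np.linalg.norm(A))
```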
Approximation
Suppose that the unfolding matrices are only approximated by low-rank terms:
$A_p = R_p + E_p, \qquad \mathrm{rank}(R_p) = r_p, \quad \|E_p\|_F = \varepsilon_p, \quad p = 1,\ldots,d-1.$
Theorem: With the algorithm we can compute for a given tensor A a TT tensor B with ranks $r_p$ and
$\|A - B\|_F^2 \le \sum_{p=1}^{d-1} \varepsilon_p^2, \qquad \|A - B\|_F \le \sqrt{d-1}\,\|A - A_{\mathrm{best}}\|_F.$
Recompression TT → TT
Let us assume that we are already given a tensor in the TT format
$A_{i_1\ldots i_d} = G^{1}_{i_1}\, G^{2}_{i_2}\cdots G^{d-1}_{i_{d-1}}\, G^{d}_{i_d}.$
We want to derive the minimal ranks, resp. compute approximations with smaller ranks. More tomorrow.
Basic Operations
Addition: $C_{i_1\ldots i_d} = A_{i_1\ldots i_d} + B_{i_1\ldots i_d}$ with cores
$C^{1}_{i_1} = \big[\,A^{1}_{i_1}\ \ B^{1}_{i_1}\,\big], \qquad C^{k}_{i_k} = \begin{pmatrix} A^{k}_{i_k} & 0\\ 0 & B^{k}_{i_k}\end{pmatrix}\ (1 < k < d), \qquad C^{d}_{i_d} = \begin{pmatrix} A^{d}_{i_d}\\ B^{d}_{i_d}\end{pmatrix}.$
Scalar multiplication:
$(\alpha A)_{i_1\ldots i_d} = \big(\alpha A^{1}_{i_1}\big)\,A^{2}_{i_2}\cdots A^{d}_{i_d}.$
Periodic:
$C_{i_1\ldots i_d} = \mathrm{trace}\Big(\begin{pmatrix} A^{1}_{i_1} & 0\\ 0 & B^{1}_{i_1}\end{pmatrix}\cdots\begin{pmatrix} A^{d}_{i_d} & 0\\ 0 & B^{d}_{i_d}\end{pmatrix}\Big) = A_{i_1\ldots i_d} + B_{i_1\ldots i_d}.$
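The block construction for addition, sketched in NumPy (cores as (D_{k-1}, n, D_k) arrays; names ours), together with an entrywise check:

```python
import numpy as np

def tt_add(A_cores, B_cores):
    """C = A + B in TT: first cores concatenated side by side, last cores
    stacked, interior cores put into block-diagonal form."""
    d, C_cores = len(A_cores), []
    for k, (GA, GB) in enumerate(zip(A_cores, B_cores)):
        ra0, n, ra1 = GA.shape
        rb0, _, rb1 = GB.shape
        if k == 0:
            G = np.concatenate([GA, GB], axis=2)        # (1, n, ra1+rb1)
        elif k == d - 1:
            G = np.concatenate([GA, GB], axis=0)        # (ra0+rb0, n, 1)
        else:
            G = np.zeros((ra0 + rb0, n, ra1 + rb1))
            G[:ra0, :, :ra1] = GA
            G[ra0:, :, ra1:] = GB
        C_cores.append(G)
    return C_cores

def entry(cores, idx):
    v = np.ones((1, 1))
    for G, i in zip(cores, idx):
        v = v @ G[:, i, :]
    return v[0, 0]

rng = np.random.default_rng(6)
A_cores = [rng.standard_normal(s) for s in [(1, 4, 2), (2, 4, 2), (2, 4, 1)]]
B_cores = [rng.standard_normal(s) for s in [(1, 4, 3), (3, 4, 3), (3, 4, 1)]]
C_cores = tt_add(A_cores, B_cores)
print(np.isclose(entry(C_cores, (1, 2, 3)),
                 entry(A_cores, (1, 2, 3)) + entry(B_cores, (1, 2, 3))))
```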
Hadamard product
$C_{i_1\ldots i_d} = A_{i_1\ldots i_d}\, B_{i_1\ldots i_d} = \big(A^{1}_{i_1}\otimes B^{1}_{i_1}\big)\big(A^{2}_{i_2}\otimes B^{2}_{i_2}\big)\cdots\big(A^{d}_{i_d}\otimes B^{d}_{i_d}\big).$
Inner product: take the Hadamard product to derive C in TT format, and then compute the contractions with vectors of all ones:
$\langle A, B\rangle = \sum_{i_1,\ldots,i_d} A_{i_1\ldots i_d}\, B_{i_1\ldots i_d} = \sum_{i_1,\ldots,i_d} C^{1}_{i_1}\, C^{2}_{i_2}\cdots C^{d}_{i_d}.$
Inner Product:
$\langle A, B\rangle = \sum_{i_1,\ldots,i_d} C^{1}_{i_1}\cdots C^{d}_{i_d} = \sum_{i_d}\Big(\cdots\Big(\sum_{i_2}\Big(\sum_{i_1} C^{1}_{i_1}\Big)C^{2}_{i_2}\Big)\cdots\Big)C^{d}_{i_d},$
i.e. the sums are swept from left to right; each step is a short sum of small matrix-vector products. Total data complexity: $O(dnD^2)$ in the ranks of C.
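A sketch of this left-to-right sweep (names ours); `v` holds the partial contraction over the pair of ranks, so the full Hadamard tensor C is never formed (the plain einsum below does not exploit further structure):

```python
import numpy as np

def tt_inner(A_cores, B_cores):
    """<A, B> for TT tensors, swept left to right."""
    v = np.ones((1, 1))
    for GA, GB in zip(A_cores, B_cores):
        # contract the previous ranks (a, b) and the physical index i
        v = np.einsum("ab,aij,bil->jl", v, GA, GB)
    return v[0, 0]

rng = np.random.default_rng(7)
A_cores = [rng.standard_normal(s) for s in [(1, 4, 2), (2, 4, 2), (2, 4, 1)]]
norm2 = tt_inner(A_cores, A_cores)      # <A, A> = ||A||_F^2
print(norm2 >= 0)
```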
Equivalent formulation for vectors
The TT vector
$x_{(i_1,\ldots,i_d)} = A^{1}_{i_1}\, A^{2}_{i_2}\cdots A^{d}_{i_d}$
can be rewritten as a sum of elementary tensor products,
$x = \sum_{j_1,\ldots,j_{d-1}} u^{1}_{j_1}\otimes u^{2}_{j_1,j_2}\otimes\cdots\otimes u^{d-1}_{j_{d-2},j_{d-1}}\otimes u^{d}_{j_{d-1}},$
with vectors $u^{k}_{j_{k-1},j_k}$ of length $n_k$ (the mode-k fibers of the k-th core). Compare CP.
Inner product for vectors:
$y^Tx = \sum_{j_1,\ldots,j_{d-1}}\sum_{j_1',\ldots,j_{d-1}'} \big((v^{1}_{j_1'})^T u^{1}_{j_1}\big)\big((v^{2}_{j_1',j_2'})^T u^{2}_{j_1,j_2}\big)\cdots\big((v^{d}_{j_{d-1}'})^T u^{d}_{j_{d-1}}\big) = \sum_{j,j'} w^{1}_{(j_1,j_1')}\, w^{2}_{(j_1,j_1'),(j_2,j_2')}\cdots w^{d}_{(j_{d-1},j_{d-1}')},$
with the scalars
$w^{k}_{(j_{k-1},j_{k-1}'),(j_k,j_k')} := \big(v^{k}_{j_{k-1}',j_k'}\big)^T u^{k}_{j_{k-1},j_k},$
i.e. the inner product of two TT vectors again has TT structure with the index pairs $(j_k, j_k')$ as bond indices.
Special Cases
Representing the unit vector $e_i$:
$e_i = e_{i_1}\otimes e_{i_2}\otimes\cdots\otimes e_{i_d} = a^{1}_{i_1}\, a^{2}_{i_2}\cdots a^{d}_{i_d}$
with scalar ($1\times1$) cores
$a^{k}_{j_k} = \begin{cases} 1 & \text{if } j_k = i_k,\\ 0 & \text{if } j_k \ne i_k,\end{cases} \quad = \delta_{i_k,j_k}.$
Examples
$x_{i_1\ldots i_d} = A^{1}_{i_1}\, A^{2}_{i_2}\cdots A^{d}_{i_d}$
Rank-2 examples with 0/1 cores, e.g. a sum of two unit vectors such as $e = (0,\ldots,0,1,1,0,\ldots,0)^T$: choosing
$A^{1}_{i_1} = \big(\delta_{i_1,k_1}\ \ \delta_{i_1,l_1}\big), \qquad A^{\mu}_{i_\mu} = \begin{pmatrix}\delta_{i_\mu,k_\mu} & 0\\ 0 & \delta_{i_\mu,l_\mu}\end{pmatrix}, \qquad A^{d}_{i_d} = \begin{pmatrix}\delta_{i_d,k_d}\\ \delta_{i_d,l_d}\end{pmatrix}$
represents $x = e_{(k_1,\ldots,k_d)} + e_{(l_1,\ldots,l_d)}$ with all compression ranks equal to 2.
Application to functions
Truncated ANOVA decomposition:
$f(x_1,\ldots,x_d) \approx g_0 + \sum_{j_1=1}^{d} g_1(x_{j_1}) + \sum_{j_1,j_2=1}^{d} g_2(x_{j_1},x_{j_2}) + \sum_{j_1,j_2,j_3=1}^{d} g_3(x_{j_1},x_{j_2},x_{j_3}) + \cdots$
Find functions $g_k$ that allow a good approximation of f.
Tensor train approximation with matrices $G_k$:
$f(x_1,\ldots,x_d) \approx g_1(x_1)\, G_2(x_2)\cdots G_{d-1}(x_{d-1})\, g_d(x_d).$
CP analogue:
$f(x_1,\ldots,x_d) \approx \sum_{j=1}^{n} g_{1,j}(x_1)\, g_{2,j}(x_2)\cdots g_{d-1,j}(x_{d-1})\, g_{d,j}(x_d).$
Similarly we can generalize CP, Tucker, and H-Tucker to functions.
Representing Matrices
$M_{(i_1,\ldots,i_d),(j_1,\ldots,j_d)} = A_{\{i_1j_1\},\ldots,\{i_dj_d\}} = G^{1}_{i_1,j_1}\, G^{2}_{i_2,j_2}\cdots G^{d}_{i_d,j_d},$
$M = \sum_{j_1,\ldots,j_{d-1}} U^{1}_{j_1}\otimes U^{2}_{j_1,j_2}\otimes\cdots\otimes U^{d}_{j_{d-1}}.$
Special case: the Laplacian.
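For the Laplacian special case, a sketch of the usual rank-2 operator-TT construction for a Kronecker sum (the block form below is the standard one, not spelled out on the slide); it is checked against the explicit Kronecker sum for small n and d = 3:

```python
import numpy as np

n = 4
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stencil
I = np.eye(n)

# operator-TT cores with blocks of size n x n:
# first: [L  I], interior: [[I  0], [L  I]], last: [I  L]^T
first = np.stack([L, I], axis=0)           # shape (2, n, n): block row
mid = np.zeros((2, 2, n, n))
mid[0, 0], mid[1, 0], mid[1, 1] = I, L, I
last = np.stack([I, L], axis=0)            # shape (2, n, n): block column

# assemble the full operator from the cores and compare with the Kronecker sum
M = np.einsum("aij,abkl,bmn->ikmjln", first, mid, last).reshape(n**3, n**3)
K = (np.kron(np.kron(L, I), I) + np.kron(np.kron(I, L), I)
     + np.kron(np.kron(I, I), L))
print(np.allclose(M, K))                   # True
```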
Generalizing Matrix Properties
Let A be a symmetric tensor of order m and dimension n, and $x^m$ a rank-one tensor for a vector $x \in \mathbb{R}^n$:
$x^m := x\otimes\cdots\otimes x \quad (m \text{ times}).$
Define the n-dimensional homogeneous polynomial of degree m
$f(x) := Ax^m := \sum_{i_1,\ldots,i_m=1}^{n} a_{i_1\ldots i_m}\, x_{i_1}\cdots x_{i_m} \in \mathbb{R}.$
A symmetric: invariant under any index permutations. A positive definite: $f(x) > 0$ for all $x \ne 0$.
Eigenvalue
Define the vector $Ax^{m-1} \in \mathbb{R}^n$ by
$\big(Ax^{m-1}\big)_i := \sum_{i_2,\ldots,i_m=1}^{n} a_{i,i_2,\ldots,i_m}\, x_{i_2}\cdots x_{i_m}.$
We call a number $\lambda \in \mathbb{C}$ an eigenvalue of A if $\lambda$ and $x \ne 0$ are solutions of the polynomial equation
$Ax^{m-1} = \lambda\, Ix^{m-1}, \qquad \big(Ix^{m-1}\big)_i = x_i^{m-1},$
or: the vector x is a fixed point of the operator A.
Here I is the identity tensor: $a_{i,i,\ldots,i} = 1$, and $a = 0$ otherwise.
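A small einsum sketch of these quantities for m = 3 (the symmetrization and names are ours):

```python
import numpy as np

# f(x) = A x^3 (a homogeneous cubic) and the vector A x^{m-1}; the eigenvalue
# equation reads (A x^2)_i = lambda * x_i^2 for the identity tensor above.
rng = np.random.default_rng(8)
n = 4
A = rng.standard_normal((n, n, n))
A = (A + A.transpose(0, 2, 1) + A.transpose(1, 0, 2) + A.transpose(1, 2, 0)
     + A.transpose(2, 0, 1) + A.transpose(2, 1, 0)) / 6   # symmetrize

x = rng.standard_normal(n)
f = np.einsum("ijk,i,j,k->", A, x, x, x)     # A x^m, a scalar
Ax2 = np.einsum("ijk,j,k->i", A, x, x)       # A x^{m-1}, a vector
print(f, Ax2)
```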
Matrix terms
A Hermitian:
- invariant subspace: $Ax = \lambda x$
- Rayleigh quotient: $x^TAx / x^Tx$
- Lagrange multipliers: $x^TAx - \lambda(\|x\|^2 - 1)$
- best rank-1 approximation: $\min_{\|x\|=1}\|A - \lambda\, xx^T\|$
A general:
- pseudospectrum: $\sigma_\varepsilon(A) = \{\lambda \mid \|(A - \lambda I)^{-1}\| > \varepsilon^{-1}\}$
- numerical range: $W(A) = \{x^*Ax \mid x^Tx = 1\}$
How to generalize to tensors?
Numerical Range
Rayleigh quotient: $A\cdot(x_1,\ldots,x_d) := \sum_{i_1,\ldots,i_d} A_{i_1,\ldots,i_d}\,(x_1)_{i_1}\cdots(x_d)_{i_d}$
Range: $W(A) := \{A\cdot(x,\ldots,x)\}$
CP or Tucker as generalizations of the SVD:
- CP: losing the orthogonality!
- Tucker: losing the diagonal form of the core tensor!
Eigenvalues
As critical points of the Rayleigh quotient
$\frac{A\cdot(x,\ldots,x)}{I\cdot(x,\ldots,x)},$
via the Lagrangian
$L(x,\lambda) := A\cdot(x,\ldots,x) - \lambda\big(\|x\|_d^d - 1\big).$
The characteristic polynomial $p(\lambda)$ is the resultant of the two polynomials $Ax^{m-1}$ and $\lambda x^{m-1}$ (searching for common zeros). Eigenvalues are the roots of p.
- Number of eigenvalues: $n(m-1)^{n-1}$
- Product of the eigenvalues = $\det(A)$ = resultant of $Ax^{m-1} = 0$
- Sum of all eigenvalues equals $(m-1)^{n-1}\,\mathrm{trace}(A)$
(Liqun Qi, Lek-Heng Lim)
Software
- Kolda: data structures, CP, Tucker
- Oseledets: TT
- Kressner, Tobler: H-Tucker
- ALPS: quantum simulation
http://www.sandia.gov/~tgkolda/TensorToolbox/index-2.5.html
http://spring.inm.ras.ru/osel/?page_id=24
http://www.sam.math.ethz.ch/NLAgroup/htucker_toolbox.html
https://www.rdb.ethz.ch/projects/project.php?proj_id=8486