37
Chunked Extendible Dense Arrays for Scientific Data Storage G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa Fifth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2) September 2012 G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Chunked Extendible Dense Arrays for Scientific Data Storage September 2012 1 / 25

Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

  • Upload
    others

  • View
    26

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunked Extendible Dense Arrays for Scientific DataStorage

G. Nimako, E.J. Otoo, D. Ohene-Kwofie

School of Computer ScienceThe University of the Witwatersrand

Johannesburg, South Africa

Fifth International Workshop on Parallel Programming Models and Systems Software forHigh-End Computing (P2S2)

September 2012

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 1 / 25

Page 2: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Outline

1 Introduction

2 Linear Mapping for a Dense Extendible Array

3 Chunking Extendible Dense Arrays

4 Axial-Vectors as Memory Resident O2-Tree

5 Experimental Results

6 Summary and Future Work

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 2 / 25

Page 3: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction

Multidimensional arrays has been proposed as the most appropriatemodel for representing scientific databases.

Scientific data analysis use multidimensional arrays as theirfundamental data structure. Examples of Array Files:

HDF/HDF5 and variantsNetCDF/pNetCDFFITSGlobal Array toolkit

SciDB is being organised around multidimensional array storage.

The problem is that such datasets gradually grow to massive sizes ofthe order of peta-bytes.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 3 / 25

Page 4: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction

Multidimensional arrays has been proposed as the most appropriatemodel for representing scientific databases.

Scientific data analysis use multidimensional arrays as theirfundamental data structure. Examples of Array Files:

HDF/HDF5 and variantsNetCDF/pNetCDFFITSGlobal Array toolkit

SciDB is being organised around multidimensional array storage.

The problem is that such datasets gradually grow to massive sizes ofthe order of peta-bytes.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 3 / 25

Page 5: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction

Multidimensional arrays has been proposed as the most appropriatemodel for representing scientific databases.

Scientific data analysis use multidimensional arrays as theirfundamental data structure. Examples of Array Files:

HDF/HDF5 and variantsNetCDF/pNetCDFFITSGlobal Array toolkit

SciDB is being organised around multidimensional array storage.

The problem is that such datasets gradually grow to massive sizes ofthe order of peta-bytes.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 3 / 25

Page 6: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction

Multidimensional arrays has been proposed as the most appropriatemodel for representing scientific databases.

Scientific data analysis use multidimensional arrays as theirfundamental data structure. Examples of Array Files:

HDF/HDF5 and variantsNetCDF/pNetCDFFITSGlobal Array toolkit

SciDB is being organised around multidimensional array storage.

The problem is that such datasets gradually grow to massive sizes ofthe order of peta-bytes.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 3 / 25

Page 7: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction - Problem Motivation

k-dimensional arrays represented in linear consecutive locationscannot extend without reallocation of already stored elements.

Definition

A realisation of the array A[U0][U1]...[Uk−1] in L[n] for n = ∏k−1j=0 Uj , is a

mapping function, F : Uk → L, of the elements of A, one-to-one, onto theaddress, {0, 1, ..., n} with F (0, 0, ..., 0) = 0.

Row major realisation

q = F (i0, i1, i2, ..., ik−1) = s0 + i0C0 + i1C1 + ... + ik−1Ck−1

Cj =k−1

∏r=j+1

Ur , 0 ≤ j ≤ k − 1,Ck−1 = 1

The limitation imposed by F () is that extensions of the array canonly be done on one dimension (i.e. that is dimension U0 since it wasnot used in the evaluation of F ()).

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 4 / 25

Page 8: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction - Problem Motivation

k-dimensional arrays represented in linear consecutive locationscannot extend without reallocation of already stored elements.

Definition

A realisation of the array A[U0][U1]...[Uk−1] in L[n] for n = ∏k−1j=0 Uj , is a

mapping function, F : Uk → L, of the elements of A, one-to-one, onto theaddress, {0, 1, ..., n} with F (0, 0, ..., 0) = 0.

Row major realisation

q = F (i0, i1, i2, ..., ik−1) = s0 + i0C0 + i1C1 + ... + ik−1Ck−1

Cj =k−1

∏r=j+1

Ur , 0 ≤ j ≤ k − 1,Ck−1 = 1

The limitation imposed by F () is that extensions of the array canonly be done on one dimension (i.e. that is dimension U0 since it wasnot used in the evaluation of F ()).

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 4 / 25

Page 9: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Introduction - Problem Motivation

This extendibility limitation degrades performance of various arrayoperations particularly in scientific and engineering applications thatsometimes undergo interleaved extensions.

For example, some data processing applications require incrementaltiling of adjacent scenes and progressive inclusion of selected bands.

Extendible arrays, on the other hand can handle dynamic growth inthe bounds of the dimensions.

These arrays can expand in any dimension without reorganisingalready allocated array element

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 5 / 25

Page 10: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Outline

1 Introduction

2 Linear Mapping for a Dense Extendible Array

3 Chunking Extendible Dense Arrays

4 Axial-Vectors as Memory Resident O2-Tree

5 Experimental Results

6 Summary and Future Work

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 6 / 25

Page 11: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Linear Mapping for a Dense Extendible Array

The mapping function for extendible array uses axial-vectors to storeinformation needed to compute the function.

A vector-list of axial-vectors is maintain for each dimension.

Let A[U∗0][U∗1][U

∗2] be an arbitrary 3-dimensional array, where U∗j

denotes the bound that has the ability to grow as opposed to a fixedbound Uj as in the conventional array.

Similarly we employ the notation:

F () when referring to conventional array mapping function.F ∗() when referring to a mapping function that allows extendibility inany dimension

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 7 / 25

Page 12: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Linear Mapping for a Dense Extendible Array

The mapping function for extendible array uses axial-vectors to storeinformation needed to compute the function.

A vector-list of axial-vectors is maintain for each dimension.

Let A[U∗0][U∗1][U

∗2] be an arbitrary 3-dimensional array, where U∗j

denotes the bound that has the ability to grow as opposed to a fixedbound Uj as in the conventional array.

Similarly we employ the notation:

F () when referring to conventional array mapping function.F ∗() when referring to a mapping function that allows extendibility inany dimension

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 7 / 25

Page 13: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Linear Mapping for a Dense Extendible Array - Illustration

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 8 / 25

Page 14: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Linear Mapping for a Dense Extendible Array - Illustration

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 8 / 25

Page 15: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Linear Mapping for a Dense Extendible Array

Suppose that in a k-dimensional extendible arrayA[U∗0][U

∗1][U

∗2]...[U

∗k−1], dimension l is extended by λl , then the

index range increases from U∗l to U∗l + λl .

Let the location A〈0, 0, ..., U∗l , ..., 0〉 (i.e. the starting location of anallocated hyperslab ) be denoted as `Z∗l

where Z∗l = ∏k−1r=0 U∗r .

The Mapping Function

q∗ = F ∗(〈i0, i1, i2, ..., ik−1〉)) = Z0U∗l

+ (il −U∗l )C∗l +

k−1

∑j=0j 6=l

ijC∗j

C ∗l =k−1

∏j=0j 6=l

U∗j

C ∗j =k−1

∏r=j+1r 6=l

U∗r

The limitation imposed by F () is that extensions of the array canonly be done on one dimension (i.e. that is dimension U0 since it wasnot used in the evaluation of F ().

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 9 / 25

Page 16: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Linear Mapping for a Dense Extendible Array

Suppose that in a k-dimensional extendible arrayA[U∗0][U

∗1][U

∗2]...[U

∗k−1], dimension l is extended by λl , then the

index range increases from U∗l to U∗l + λl .

Let the location A〈0, 0, ..., U∗l , ..., 0〉 (i.e. the starting location of anallocated hyperslab ) be denoted as `Z∗l

where Z∗l = ∏k−1r=0 U∗r .

The Mapping Function

q∗ = F ∗(〈i0, i1, i2, ..., ik−1〉)) = Z0U∗l

+ (il −U∗l )C∗l +

k−1

∑j=0j 6=l

ijC∗j

C ∗l =k−1

∏j=0j 6=l

U∗j

C ∗j =k−1

∏r=j+1r 6=l

U∗r

The limitation imposed by F () is that extensions of the array canonly be done on one dimension (i.e. that is dimension U0 since it wasnot used in the evaluation of F ().

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 9 / 25

Page 17: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Outline

1 Introduction

2 Linear Mapping for a Dense Extendible Array

3 Chunking Extendible Dense Arrays

4 Axial-Vectors as Memory Resident O2-Tree

5 Experimental Results

6 Summary and Future Work

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 10 / 25

Page 18: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

The use of the vector-list for axial-vectors can be expensive anddepends particularly on the interruptible expansions (cubicalextensions).

Such interruptible expansion causes the addition of a new entry in thevector-list.

Chunking the array gives some additional advantages:

It gives contiguous storage allocations for the elements of the chunks.When arrays are allocated onto secondary storage, I/O can be made inmultiples of the chunk size.

The allocation is done in chunks as opposed to the single elements.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 11 / 25

Page 19: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

The use of the vector-list for axial-vectors can be expensive anddepends particularly on the interruptible expansions (cubicalextensions).

Such interruptible expansion causes the addition of a new entry in thevector-list.

Chunking the array gives some additional advantages:

It gives contiguous storage allocations for the elements of the chunks.When arrays are allocated onto secondary storage, I/O can be made inmultiples of the chunk size.

The allocation is done in chunks as opposed to the single elements.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 11 / 25

Page 20: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

Given a chunked block Q [χ0][χ1][χ2]...[χk−1], the number of chunkindices, ρi for a given dimension i , is given by:

ρi =

⌈U∗iχi

⌉The allocation of chunks, denoted by Ac , becomesAc [ρ0][ρ1][ρ2]...[ρk−1].

An entry is made to the requisite axial-vector only if this condition ismet:

[U∗l + λl ] > [ρl × χl ]

The number of chunks ρl to be allocated is given by:

ρl =

⌈[U∗l + λl ]− [ρl × χl ]

χl

⌉G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 12 / 25

Page 21: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

Given a chunked block Q [χ0][χ1][χ2]...[χk−1], the number of chunkindices, ρi for a given dimension i , is given by:

ρi =

⌈U∗iχi

⌉The allocation of chunks, denoted by Ac , becomesAc [ρ0][ρ1][ρ2]...[ρk−1].

An entry is made to the requisite axial-vector only if this condition ismet:

[U∗l + λl ] > [ρl × χl ]

The number of chunks ρl to be allocated is given by:

ρl =

⌈[U∗l + λl ]− [ρl × χl ]

χl

⌉G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 12 / 25

Page 22: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 13 / 25

Page 23: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

To access an array element A〈i0, i1, i2, ..., ik−1〉, the input indices〈i0, i1, i2, ..., ik−1〉 is translated into chunk indices 〈j0, j1, j2, ..., jk−1〉where

ji =

⌊iiχi

⌋The starting address, q∗c of the chunk containing A〈i0, i1, i2, ..., ik−1〉can be found by:

The Mapping Function for Chunked Extendible Array

q∗c = F ∗(〈j0, j1, j2, ..., jk−1〉)) = Z0ρl+ (jl − ρl )C

∗l +

k−1

∑m=0m 6=l

jmC∗m

C ∗l =k−1

∏m=0m 6=l

ρm

C ∗m =k−1

∏r=m+1r 6=l

ρr

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 14 / 25

Page 24: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

To access an array element A〈i0, i1, i2, ..., ik−1〉, the input indices〈i0, i1, i2, ..., ik−1〉 is translated into chunk indices 〈j0, j1, j2, ..., jk−1〉where

ji =

⌊iiχi

⌋The starting address, q∗c of the chunk containing A〈i0, i1, i2, ..., ik−1〉can be found by:

The Mapping Function for Chunked Extendible Array

q∗c = F ∗(〈j0, j1, j2, ..., jk−1〉)) = Z0ρl+ (jl − ρl )C

∗l +

k−1

∑m=0m 6=l

jmC∗m

C ∗l =k−1

∏m=0m 6=l

ρm

C ∗m =k−1

∏r=m+1r 6=l

ρr

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 14 / 25

Page 25: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

To compute the address of A〈i0, i1, i2, ..., ik−1〉 within the local chunk,the input indices 〈i0, i1, i2, ..., ik−1〉 needs to be translated to localchunk indices 〈ic0, ic1, ic2, ..., ic(k−1)〉 by :

icm = (im mod χm)

The address of A〈i0, i1, i2, ..., ik−1〉 is only a displacement within thechunk.

This can be done by using a row-major sequence order orcolumn-major order.

If the chunk size is 2n where n ≥ 2, then the Z-order sequence orPeano-Hilbert space filling curve can be used.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 15 / 25

Page 26: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Chunking Extendible Dense Arrays

To compute the address of A〈i0, i1, i2, ..., ik−1〉 within the local chunk,the input indices 〈i0, i1, i2, ..., ik−1〉 needs to be translated to localchunk indices 〈ic0, ic1, ic2, ..., ic(k−1)〉 by :

icm = (im mod χm)

The address of A〈i0, i1, i2, ..., ik−1〉 is only a displacement within thechunk.

This can be done by using a row-major sequence order orcolumn-major order.

If the chunk size is 2n where n ≥ 2, then the Z-order sequence orPeano-Hilbert space filling curve can be used.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 15 / 25

Page 27: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Outline

1 Introduction

2 Linear Mapping for a Dense Extendible Array

3 Chunking Extendible Dense Arrays

4 Axial-Vectors as Memory Resident O2-Tree

5 Experimental Results

6 Summary and Future Work

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 16 / 25

Page 28: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Axial-Vectors as Memory Resident O2-Tree

A new approach to maintaining the these axial-vectors in memory iswith the use of O2-Tree.

An O2-Tree is an augmented Red-Black Tree with data records storedonly at the leaf nodes.

A metadata file Fm stores the records that correspond to the leafnodes of the O2-Tree.

These records in Fm is used to reconstruct the memory residentO2-Tree.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 17 / 25

Page 29: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Axial-Vectors as Memory Resident O2-Tree

A new approach to maintaining the these axial-vectors in memory iswith the use of O2-Tree.

An O2-Tree is an augmented Red-Black Tree with data records storedonly at the leaf nodes.

A metadata file Fm stores the records that correspond to the leafnodes of the O2-Tree.

These records in Fm is used to reconstruct the memory residentO2-Tree.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 17 / 25

Page 30: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Axial-Vectors as Memory Resident O2-Tree

General Structure of the O2-Tree :

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 18 / 25

Page 31: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Outline

1 Introduction

2 Linear Mapping for a Dense Extendible Array

3 Chunking Extendible Dense Arrays

4 Axial-Vectors as Memory Resident O2-Tree

5 Experimental Results

6 Summary and Future Work

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 19 / 25

Page 32: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Experimental Results

Average Access Cost without Extensions (in Memory)

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 20 / 25

Page 33: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Experimental Results

Total Access Cost for Interleaved Extensions in Memory

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 21 / 25

Page 34: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Experimental Results

Total Access Cost for Interleaved Extensions on Disk

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 22 / 25

Page 35: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Experimental Results

Storage Utilization for Chunked Extendible Array

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 23 / 25

Page 36: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Outline

1 Introduction

2 Linear Mapping for a Dense Extendible Array

3 Chunking Extendible Dense Arrays

4 Axial-Vectors as Memory Resident O2-Tree

5 Experimental Results

6 Summary and Future Work

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 24 / 25

Page 37: Chunked Extendible Dense Arrays for Scientific Data Storage · 3 Chunking Extendible Dense Arrays 4 Axial-Vectors as Memory Resident O 2-Tree 5 Experimental Results 6 Summary and

Summary and Future Work

In this paper, we have given an implementation of the chunkedextendible dense arrays.

By chunking the elements of the array, the chunked extendible arraycan be conveniently stored in files.

Array elements are then accessed into and out of memory in multiplesof chunks with the aid of a mapping function.

The organisation of extendible arrays using such a mapping functionis highly appropriate for most scientific datasets where the model ofthe data is perceived to be in the form of large array files.

Currently the appropriate APIs for integrating our scheme with theGlobal Array Toolkit are being developed.

G. Nimako, E.J. Otoo, D. Ohene-Kwofie School of Computer Science The University of the Witwatersrand Johannesburg, South Africa (WITS, Johannesburg)Chunked Extendible Dense Arrays for Scientific Data StorageSeptember 2012 25 / 25