accULL (HAC Leganés)


Heterogeneous Architectures

accULL: An Early OpenACC Implementation

Results

Conclusions and Future Work

accULL: An User-directed Approach to Heterogeneous Programming

Ruyman Reyes, Ivan Lopez-Rodríguez, Juan J. Fumero, Francisco de Sande

Dept. E.I.O. y Computacion, Univ. de La Laguna, 38271–La Laguna, Spain

International Workshop on Heterogeneous Architectures and Computing

Leganés, July 13, 2012


Outline

1 Heterogeneous Architectures

2 accULL: An Early OpenACC Implementation

3 Results

4 Conclusions and Future Work


Introduction

The rise of GPUs: impressive results


GPUs

Successfully used for general purpose computing (GPGPU)


Heterogeneous Architectures

But ...

It is not Easy!


Heterogeneous Architectures

A GPU is not a CPU

GPUs are inherently SIMD processors

CPUs and GPUs tackle the processing of tasks differently

CPUs excel at serial processing

GPUs are better at handling applications that require high floating-point throughput at lower power consumption


Parallel languages: MPI (distributed memory, DM) and OpenMP (shared memory, SM)

They are not suitable for programming GPUs

New programming models are required...


GPGPU Programming

Today's software stack:


CUDA from NVIDIA

Pros: performance, easier than OpenCL

Con: only for NVIDIA hardware

CUDA Code Example

    /* One thread per element of a; matrices are stored column-major. */
    __global__ void mmkernel(float *a, float *b, float *c, int n,
                             int m, int p) {
        int i = blockIdx.x * 32 + threadIdx.x;   /* row index */
        int j = blockIdx.y;                      /* column index */
        float sum = 0.0f;
        for (int k = 0; k < p; ++k) sum += b[i + n * k] * c[k + p * j];
        a[i + n * j] = sum;
    }
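
To make the thread-to-element mapping of the kernel above easier to follow, here is a minimal host-side sketch in plain C that replays the same index arithmetic with ordinary loops. The launch configuration is not shown on the slide, so a grid of (n/32, m) blocks with 32 threads each is an assumption, as is the function name `mm_emulated`.

```c
#include <assert.h>

/* Host-side emulation of the mmkernel launch (illustrative only).
 * Assumed launch: grid = (n / 32, m), block = 32 threads.
 * Matrices are column-major: a is n x m, b is n x p, c is p x m. */
void mm_emulated(float *a, const float *b, const float *c,
                 int n, int m, int p)
{
    assert(n % 32 == 0);                        /* whole blocks only */
    for (int by = 0; by < m; ++by)              /* plays blockIdx.y  */
        for (int bx = 0; bx < n / 32; ++bx)     /* plays blockIdx.x  */
            for (int tx = 0; tx < 32; ++tx) {   /* plays threadIdx.x */
                int i = bx * 32 + tx;
                int j = by;
                float sum = 0.0f;
                for (int k = 0; k < p; ++k)
                    sum += b[i + n * k] * c[k + p * j];
                a[i + n * j] = sum;
            }
}
```

Each iteration of the three outer loops corresponds to one CUDA thread, which is why the kernel itself needs no loops over i and j.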


GPGPU Programming

OpenCL: Open Computing Language

A framework developed by the Khronos Group

A standard

OpenCL programs execute across heterogeneous platforms: CPUs + GPUs + other processors

Pros: can be used with any device, it is a standard

Cons: more complex than CUDA, immature


GPGPU Programming

Common problems

1 The programmer needs to know low-level details of the architecture

2 Source codes need to be rewritten:

One version for CPU

A different version for GPU

3 Good performance requires a great effort in parameter tuning

4 CUDA and OpenCL are new and complex for non-experts


GPGPU Programming

Our claim: new models and tools are needed if we want GPUs to be widely used in HPC

Is there anything new on the horizon?

hiCUDA

PGI accelerator model

CAPS HMPP

OpenACC


GPGPU Programming

hiCUDA

Translates each directive into a CUDA call

It is able to use the GPU Shared Memory

Only works with NVIDIA devices

The programmer still needs to know hardware details

hiCUDA Code Example:

    ...
    #pragma hicuda global alloc c[*][*] copyin

    #pragma hicuda kernel mxm tblock(N/16,N/16) thread(16,16)
    #pragma hicuda loop_partition over_tblock over_thread
    for (i = 0; i < N; i++) {
    #pragma hicuda loop_partition over_tblock over_thread
        for (j = 0; j < N; j++) {
            double sum = 0.0;
            ...


GPGPU Programming

PGI accelerator model

It is a higher level (directive-based) approach

Fortran and C are supported

Precursor to OpenACC

PGI Accelerator Model Code Example:

    /* Array extents as given on the slide; they agree with the loop
       bounds when n == m == l. */
    #pragma acc data copyin(b[0:n*l], c[0:m*l]) copy(a[0:n*m])
    {
    #pragma acc region
        {
    #pragma acc loop independent
            for (j = 0; j < n; j++)
            {
    #pragma acc loop independent
                for (i = 0; i < l; i++) {
                    double sum = 0.0;
                    for (k = 0; k < m; k++) {
                        sum += b[i+k*l] * c[k+j*m];
                    }
                    a[i+j*l] = sum;
                }
            }
        }
    }


GPGPU Programming

OpenACC: introduced in November 2011 at SuperComputing 2011

A directive-based language

Aims to become a standard

Supported by: Cray, NVIDIA, PGI and CAPS

A single source code for CPU/GPU

Platform independent

Easier for beginners


GPGPU Programming

OpenACC Code Example:
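
As a minimal illustration of the directive style, here is a SAXPY loop annotated with OpenACC directives. It is a sketch chosen for brevity, not a listing from the talk; the function name `saxpy` and its parameters are assumptions made for the example.

```c
#include <assert.h>

/* y = alpha * x + y.
 * An OpenACC compiler may offload the loop to an accelerator; a compiler
 * without OpenACC support ignores the pragmas and runs it on the CPU. */
void saxpy(int n, float alpha, const float *x, float *y)
{
    assert(n >= 0);
#pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    {
#pragma acc loop independent
        for (int i = 0; i < n; ++i)
            y[i] = alpha * x[i] + y[i];
    }
}
```

The same source file serves both CPU and GPU builds: the data clauses describe what must move to and from the device, and the loop body stays ordinary C.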


accULL: Our OpenACC implementation

accULL is a framework developed to support OpenACC programs


accULL: Our OpenACC implementation

accULL = YaCF + Frangollo

It is a two-layer implementation: Compiler + RunTime Library


YaCF: the compiler

YaCF (Yet Another Compiler Framework) is the compiler framework we have developed

Some features:

It is a source-to-source (StS) compiler

Written in Python from scratch with an OO approach

Receives C99 as input

It is able to generate CUDA/OpenCL kernels from annotated code

A driver for compiling OpenACC directives has been added

YaCF translates the directives into Frangollo calls

A public-domain development


Frangollo: the RunTime

Frangollo

It is a runtime that supports execution on heterogeneous platforms

1 Encapsulates the hardware issues

2 Is able to run on NVIDIA devices using CUDA

3 Is able to manage a wider range of devices using OpenCL


Frangollo: the RunTime

Compilation flow


Frangollo: the RunTime

Its responsibilities

1 Manages the memory

2 Initializes the devices

3 Launches the kernels

Makes the programmer's life easier!


Frangollo: Memory Management

A program workflow


Frangollo: Structure

Interface layer: A door to Frangollo

Some functions in the C interface:

registerVar

launchKernel

getNumDevices


Frangollo: Structure

Abstract layer

Frangollo uses a class hierarchy

All classes in this layer are abstract


Frangollo: Structure

Device layer

Encapsulates all target-language-related functions

New platforms could be added in the future
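
The three-layer split (interface, abstract, device) can be pictured with a toy runtime in plain C. This is a hypothetical sketch, not Frangollo's code: every `toy_` name, every signature and the host-only backend are assumptions made for illustration, loosely echoing the registerVar, launchKernel and getNumDevices entry points named above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy runtime; none of these names or signatures are Frangollo's. */
typedef void (*toy_kernel_fn)(void **args, size_t n);

/* Abstract layer: the operations every backend must provide. */
typedef struct toy_device {
    const char *name;
    void *(*map)(void *host_ptr, size_t bytes);
    void (*run)(toy_kernel_fn k, void **args, size_t n);
} toy_device;

/* Device layer: a host-only backend standing in for CUDA or OpenCL. */
static void *host_map(void *host_ptr, size_t bytes)
{
    (void)bytes;               /* host memory is already accessible */
    return host_ptr;
}

static void host_run(toy_kernel_fn k, void **args, size_t n)
{
    k(args, n);                /* "launch" is a plain function call */
}

static toy_device toy_devices[] = { { "host", host_map, host_run } };

/* Interface layer: C entry points that generated code would call. */
int toy_get_num_devices(void)
{
    return (int)(sizeof toy_devices / sizeof toy_devices[0]);
}

void *toy_register_var(int dev, void *host_ptr, size_t bytes)
{
    assert(dev >= 0 && dev < toy_get_num_devices());
    return toy_devices[dev].map(host_ptr, bytes);
}

void toy_launch_kernel(int dev, toy_kernel_fn k, void **args, size_t n)
{
    assert(dev >= 0 && dev < toy_get_num_devices());
    toy_devices[dev].run(k, args, n);
}

/* Example kernel: doubles every element of the array in args[0]. */
void toy_scale2(void **args, size_t n)
{
    float *v = (float *)args[0];
    for (size_t i = 0; i < n; ++i)
        v[i] *= 2.0f;
}
```

Adding a platform in this sketch means adding one more entry to `toy_devices` with its own `map` and `run` functions; callers of the interface layer do not change.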


Platforms

M1: A Desktop computer

Intel Core i7 930 processor (2.80 GHz)

1MB of L2 cache, 8MB of L3 cache, shared by the four cores

4 GB RAM

2 GPU devices attached:

Tesla C1060 with 3 GB memory (M1a)

Tesla C2050 (Fermi) with 4 GB memory (M1b)

Accelerator platform is CUDA 4.0

M1a and M1b mimic the setup of an average OpenACC developer

She can purchase a GPU card and plug it into her desktop computer

It is a relatively cheap platform


Platforms

M2: A cluster node

2 quad-core Intel Xeon E5410 (2.25 GHz) processors

24 GB memory

Attached a Fermi C2050 card with 448 CUDA cores and 4 GB memory

Accelerator platform: CUDA 4.0

M2 is a node of a common multinode cluster

Today's clusters combine multicore processors and GPU devices, so we can take advantage of OpenACC

This kind of compute node has higher acquisition and maintenance costs than M1


Platforms

M3: A second cluster

M3 is a shared memory system

4 Intel Xeon E7 4850 CPUs

2.50 MB L2 cache and 24 MB L3 cache (for all its 10 cores)

6 GB of memory per core

Accelerator platform: Intel OpenCL SDK 1.5, running on the CPU

M3 showcases an alternative use of OpenCL

There are implementations of OpenCL targeting shared memory systems

Using CPU-targeted OpenCL platforms along with OpenACC represents an interesting alternative to OpenMP programming


Some of our Experiments

Blocked Matrix Multiplication (M×M)

Rodinia Benchmark

The Rodinia Benchmark suite comprises compute-heavy applications

It covers a wide range of applications

OpenMP, CUDA and OpenCL versions are available for most of the codes in the suite

From them, we have selected:

Needleman-Wunsch (NW)

HotSpot (HS)

Speckle Reducing Anisotropic Diffusion (SRAD)


Matrix Multiplication

Sketch of M×M in OpenACC

    /* Row-major storage: a is L x N, b is L x M, c is M x N. */
    #pragma acc kernels name("mxm") copy(a[L*N]) \
                        copyin(b[L*M], c[M*N] ...)
    {
    #pragma acc loop private(i, j) collapse(2)
        for (i = 0; i < L; i++)
            for (j = 0; j < N; j++)
                a[i * N + j] = 0.0;
        /* Iterate over blocks */
        for (ii = 0; ii < L; ii += tile_size)
            for (jj = 0; jj < N; jj += tile_size)
                for (kk = 0; kk < M; kk += tile_size) {
                    /* Iterate inside a block */
    #pragma acc loop collapse(2) private(i, j, k)
                    for (j = jj; j < min(N, jj + tile_size); j++)
                        for (i = ii; i < min(L, ii + tile_size); i++)
                            for (k = kk; k < min(M, kk + tile_size); k++)
                                a[i * N + j] += (b[i * M + k] * c[k * N + j]);
                }
    }
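
The blocking scheme in the sketch can be checked on the CPU with a plain C version. The following is a minimal sketch under stated assumptions: row-major storage with a of size L x N, b of size L x M and c of size M x N, and the helper names `mxm_tiled`, `mxm_naive` and `min_int` are introduced here for illustration only.

```c
#include <assert.h>

static int min_int(int x, int y) { return x < y ? x : y; }

/* a (L x N) = b (L x M) * c (M x N), row-major, computed tile by tile. */
void mxm_tiled(double *a, const double *b, const double *c,
               int L, int M, int N, int tile)
{
    assert(tile > 0);
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < N; ++j)
            a[i * N + j] = 0.0;

    /* Iterate over blocks */
    for (int ii = 0; ii < L; ii += tile)
        for (int jj = 0; jj < N; jj += tile)
            for (int kk = 0; kk < M; kk += tile)
                /* Iterate inside a block; min_int clips partial tiles */
                for (int i = ii; i < min_int(L, ii + tile); ++i)
                    for (int j = jj; j < min_int(N, jj + tile); ++j)
                        for (int k = kk; k < min_int(M, kk + tile); ++k)
                            a[i * N + j] += b[i * M + k] * c[k * N + j];
}

/* Straightforward triple loop, used as a reference result. */
void mxm_naive(double *a, const double *b, const double *c,
               int L, int M, int N)
{
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < N; ++j) {
            double sum = 0.0;
            for (int k = 0; k < M; ++k)
                sum += b[i * M + k] * c[k * N + j];
            a[i * N + j] = sum;
        }
}
```

Tiling only reorders the accumulation into each element of a, so the two functions should agree whenever the tile size is positive, including sizes that do not divide the matrix dimensions.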


Matrix Multiplication

Floating point performance for M×M in M2


Matrix Multiplication

Floating point performance comparison between OpenMP, accULL, PGI and hiCUDA in M1

accULL delivers the second-best performance


Matrix Multiplication

Comparison between the OpenMP-gcc implementation and Frangollo+OpenCL in M3 (SM system, 40 cores)


Needleman-Wunsch

Performance comparisons of NW in M1b

accULL performs worse than the native versions


Needleman-Wunsch

Performance comparisons of NW in M3 (SM, 40 cores)

The OpenMP versions outperform their OpenCL counterparts


HotSpot

Performance comparison of different implementations showing efficiency over native CUDA code in M1

In this case, accULL performs similarly to hiCUDA


HotSpot

Speed-up comparison with native CUDA code in M1b (Fermi)


HotSpot

Efficiency w.r.t. Intel-OpenMP in M3 (SM, 40 cores)


SRAD

Speedup over the OpenMP implementation in M1b


SRAD

Speedup over the OpenMP implementation in M3


Conclusions I

accULL

First OpenACC implementation with support for both CUDA and OpenCL

It supports most of the standard

We validate accULL using codes from widely available benchmarks using GPUs and CPUs

It meets the requirements of a non-expert developer


Conclusions II

accULL

YaCF can be used as a fast-prototyping tool to explore optimizations

Frangollo can be detached from YaCF and combined with a production-ready compiler

Some issues that can be tackled within Frangollo independently from the compiler

Memory allocation

Kernel scheduling

Data splitting

Overlapping of computation and communications

Parallel reduction implementation


Future work

There are plenty of opportunities to improve performance

To implement 2D arrays as cudaMatrix or OCLImages to improve non-contiguous memory access

To complete the implementation of the asynchronous calls for better performance

Multi-GPU support

To explore different possibilities of integration with MPI

Integration of Frangollo with a production-ready compiler

New backend for FPGAs


Thank you for your attention!

accULL: An User-directed Approach to Heterogeneous Programming

http://accull.wordpress.com/

This work has been partially supported by the EU (FEDER), the Spanish MEC (contracts TIN2008-06570-C04-03 and TIN2011-24598), HPC-EUROPA2 and the Canary Islands Government, ACIISI

F. de Sande
fsande@ull.es
