8/15/2019 703 Whats New in the Accelerate Framework
http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 1/162
What is Available?
Image processing (vImage)
Digital signal processing (vDSP)
Math functions (vForce, vMathLib, vBigNum)
Linear algebra (LAPACK, BLAS)
What is the Accelerate Framework?
High performance
• Fast
• Energy efficient
OS X and iOS
All generations of hardware
Session Goals
New features in vImage
Introduce
• LinearAlgebra
• <simd/simd.h>
vImage: Some things you can do
Getting Data into vImage: CGImageRef
// CGImageRef --> vImage_Buffer
vImage_Buffer buf;
vImage_CGImageFormat fmt = { .bitsPerComponent = 8, .bitsPerPixel = 32, … };
vImage_Error err = vImageBuffer_InitWithCGImage( &buf, &fmt, NULL,
                                                 cgImage, kvImageNoFlags );
// vImage_Buffer --> CGImageRef
cgImage = vImageCreateCGImageFromBuffer( &buf, &fmt, NULL, NULL,
                                         kvImageNoFlags, &err );
Conversion Support: vImageConvert_AnyToAny
// 1) Make converter: srcFormat --> destFormat
vImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };
vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … };
vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err );
// 2) Convert
vImage_Buffer srcBuf = {…}, destBuf = {…};
err = vImageConvert_AnyToAny( c, &srcBuf, &destBuf, NULL, flags );
What You Had to Say
“functions that convert vImage_Buffer objects to CGImageRef objects and back !!!!!!”
Twitter user
“vImageConvert_AnyToAny is magical. Threaded and vectorized conversion between nearly any two pixel formats.”
Twitter user
Video: RGB, Grayscale, and Y’CbCr
CVPixelBufferRef (a video frame)
// CVPixelBufferRef --> vImageBuffer
vImageBuffer_InitWithCVPixelBuffer( &buf, &desiredFormat, cvPixelBuffer, NULL, NULL, kvImageNoFlags );
// vImageBuffer --> CVPixelBufferRef
vImageBuffer_CopyToCVPixelBuffer( &buf, &bufFormat, cvPixelBuffer,
                                  NULL, NULL, kvImageNoFlags );
Getting video into vImage
Lower level interfaces
• 41 video conversions
• Manage chroma siting, transfer function, conversion matrix, etc.
• RGB colorspaces for video formats
vImageConvert_AnyToAny() for video
- vImageConverter_CreateForCGToCVImageFormat
- vImageConverter_CreateForCVToCGImageFormat
LinearAlgebra (LA)
Simple to use high performance
Solving System of Linear Equations: With LAPACK
Given the system A, and the right-hand-side B, how do you find X (AX = B)?
__CLPK_integer n = matrix_size;
__CLPK_integer nrhs = number_right_hand_sides;
__CLPK_integer lda = column_stride_A;
__CLPK_integer ldb = column_stride_B;
__CLPK_integer *ipiv = malloc(sizeof(__CLPK_integer)*n);
__CLPK_integer info;
sgesv_(&n, &nrhs, A, &lda, ipiv, B, &ldb, &info);
free(ipiv);
Solving System of Linear Equations: With LA
Given the system A, and the right-hand-side B, how do you find X (AX = B)?
la_object_t X = la_solve(A,B);
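For readers who want to see what a dense solve involves, here is a rough portable sketch of Gaussian elimination with partial pivoting, the technique behind sgesv_. It is an illustration only, not Accelerate's implementation; the name `solve_dense`, the row-major layout, and the single right-hand side are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Accelerate: solves A x = b in place.
   A is n x n, row major, and is overwritten; b is overwritten with x.
   Returns 0 on success, -1 if a zero pivot is found (singular A). */
int solve_dense(size_t n, double *A, double *b) {
    for (size_t k = 0; k < n; ++k) {
        /* Partial pivoting: pick the largest magnitude in column k. */
        size_t p = k;
        for (size_t i = k + 1; i < n; ++i) {
            double cand = A[i*n + k] < 0 ? -A[i*n + k] : A[i*n + k];
            double best = A[p*n + k] < 0 ? -A[p*n + k] : A[p*n + k];
            if (cand > best) p = i;
        }
        if (A[p*n + k] == 0.0) return -1;
        if (p != k) {
            for (size_t j = 0; j < n; ++j) {
                double t = A[k*n + j]; A[k*n + j] = A[p*n + j]; A[p*n + j] = t;
            }
            double t = b[k]; b[k] = b[p]; b[p] = t;
        }
        /* Eliminate column k from the rows below. */
        for (size_t i = k + 1; i < n; ++i) {
            double m = A[i*n + k] / A[k*n + k];
            for (size_t j = k; j < n; ++j) A[i*n + j] -= m * A[k*n + j];
            b[i] -= m * b[k];
        }
    }
    /* Back substitution on the upper-triangular system. */
    for (size_t i = n; i-- > 0; ) {
        double s = b[i];
        for (size_t j = i + 1; j < n; ++j) s -= A[i*n + j] * b[j];
        b[i] = s / A[i*n + i];
    }
    return 0;
}
```

The point of the comparison on the slides is that all of this bookkeeping, plus the pivot array and stride arguments, sits behind a single call.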
LinearAlgebra
New in iOS 8.0 and OS X Yosemite
Simple with good performance
Single and double precision
What’s Available?
Element-wise arithmetic
Matrix product
Transpose
Norms / normalization
Linear systems
Slice
Splat
LA Objects
Reference counted opaque objects
• Objective-C objects when appropriate
Memory Management
Buffer to LA Object: With copy
double *A = malloc(sizeof(double) * num_rows * row_stride);
// Fill A as row major matrix
// Data copied into object, user still responsible for A
la_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols,
                                                row_stride, LA_NO_HINT,
                                                LA_DEFAULT_ATTRIBUTES);
// User retains all rights to A. User must clean up A
free(A);
Hints
la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols,
                                             row_stride, LA_SHAPE_DIAGONAL,
                                             LA_DEFAULT_ATTRIBUTES);
Allow for better performance
Insight about the data buffer
• Diagonal
• Triangular
• Symmetric
• Positive Definite
Lazy Evaluation
la_object_t foo(la_object_t A, la_object_t x) {
    // At = A’
    la_object_t At = la_transpose(A);
    // sum odd elements of x to even elements of x
    la_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2),
                            la_vector_slice(x,1,2,la_vector_length(x)/2));
    // Atx2 = A’ * x2 * 3.2
    la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f);
    if (la_status(Atx2) < 0) {
        // error
    }
    return Atx2;
}
[Figure: expression graph built by foo(). A gives At; x gives x.odd and x.even, which sum to x2; the product of At and x2, scaled by 3.2, gives Atx2]
Lazy Evaluation: Details
No computation
No data buffer allocation
Triggered by
- la_matrix_to_float_buffer
- la_matrix_to_double_buffer
- la_vector_to_float_buffer
- la_vector_to_double_buffer
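One way to picture lazy evaluation is as an expression graph that is only walked when the result is written out. The sketch below is a toy version of that idea on float vectors. LinearAlgebra's objects are opaque, so the `expr` type and every function name here are hypothetical and say nothing about how the framework is actually implemented.

```c
#include <stddef.h>

/* Hypothetical expression node; LinearAlgebra's real objects are opaque. */
typedef enum { EXPR_LEAF, EXPR_SUM, EXPR_SCALE } expr_kind;

typedef struct expr {
    expr_kind kind;
    size_t length;              /* number of elements            */
    const float *data;          /* EXPR_LEAF: borrowed buffer    */
    const struct expr *lhs;     /* EXPR_SUM, EXPR_SCALE operand  */
    const struct expr *rhs;     /* EXPR_SUM second operand       */
    float factor;               /* EXPR_SCALE multiplier         */
} expr;

/* Constructors only record the operation; they do no arithmetic. */
expr expr_leaf(const float *data, size_t length) {
    expr e = { EXPR_LEAF, length, data, NULL, NULL, 0.0f };
    return e;
}
expr expr_sum(const expr *a, const expr *b) {
    expr e = { EXPR_SUM, a->length, NULL, a, b, 0.0f };
    return e;
}
expr expr_scale(const expr *a, float factor) {
    expr e = { EXPR_SCALE, a->length, NULL, a, NULL, factor };
    return e;
}

/* Value of element i, computed on demand by walking the graph. */
static float expr_element(const expr *e, size_t i) {
    switch (e->kind) {
    case EXPR_LEAF:  return e->data[i];
    case EXPR_SUM:   return expr_element(e->lhs, i) + expr_element(e->rhs, i);
    case EXPR_SCALE: return e->factor * expr_element(e->lhs, i);
    }
    return 0.0f;
}

/* The only place where work happens, in the spirit of the
   la_*_to_*_buffer calls listed above. */
void expr_to_float_buffer(float *out, const expr *e) {
    for (size_t i = 0; i < e->length; ++i) out[i] = expr_element(e, i);
}
```

Because nothing is computed until the final write, no temporary buffers are needed for the intermediate sum or the scaled result.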
Performance Comparison
Netlib BLAS
• Open source
[Chart: GFLOPS (higher is better) versus matrix size from 32 to 1024; series: LA, Accelerate, Netlib BLAS]
Error Handling
la_object_t AB = la_matrix_product( A, la_transpose(B) );
if (la_status(AB) < 0) { /* handle error */ }
la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) );
if (la_status(result) < 0) { /* handle error */ }
la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);
la_object_t AB = la_matrix_product( A, la_transpose(B) );
la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) );
la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);
if (status == LA_SUCCESS) {
    // No errors, buffer is filled with good data.
} else if (status > 0) {
    // No errors occurred, but result does not have full accuracy.
} else {
    // An error occurred.
    assert(0);
}
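Checking status only once at the end works if an error raised by an intermediate operation travels along with the result. The following is a minimal sketch of that pattern on plain scalars. The `checked` type and its functions are hypothetical stand-ins, not LinearAlgebra APIs; only the sign convention (zero for success, positive for reduced accuracy, negative for error) is taken from the code above.

```c
/* Hypothetical status-carrying value; stands in for an la_object_t. */
typedef struct {
    int status;    /* 0: success, > 0: reduced accuracy, < 0: error */
    double value;
} checked;

enum { CHECKED_SUCCESS = 0, CHECKED_DIVIDE_ERROR = -1 };

checked checked_make(double v) {
    checked c = { CHECKED_SUCCESS, v };
    return c;
}

/* Each operation forwards an operand's error instead of computing. */
checked checked_add(checked a, checked b) {
    if (a.status < 0) return a;
    if (b.status < 0) return b;
    checked c = { a.status > b.status ? a.status : b.status,
                  a.value + b.value };
    return c;
}

checked checked_div(checked a, checked b) {
    if (a.status < 0) return a;
    if (b.status < 0) return b;
    if (b.value == 0.0) {
        checked e = { CHECKED_DIVIDE_ERROR, 0.0 };
        return e;
    }
    checked c = { a.status > b.status ? a.status : b.status,
                  a.value / b.value };
    return c;
}
```

With this shape, a chain such as checked_add(x, checked_div(y, z)) needs a single status test on the final value, which mirrors the single test on the la_matrix_to_float_buffer result.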
Debugging
Enable logging
• LA_ATTRIBUTE_ENABLE_LOGGING
Error log
la_object_t la_sum(la_object_t, la_object_t):
LA_DIMENSION_MISMATCH_ERROR: Encountered a dimension mismatch
obj_left rows must be equal to obj_right rows; failed comparison: 8 == 9
Solve
la_object_t x = la_solve(A,b);
• If A is square and non-singular, compute the solution x to Ax = b
• If A is square and singular, produce an error
Slicing
What is slicing
Lightweight access to partial object
• No buffer allocations
• No buffer copies
Three pieces of information
• Offset
• Stride
• Dimension
la_vector_slice(vector,
                7,    // offset
                -2,   // stride
                3);   // dimension
[ 0 1 2 3 4 5 6 7 8 9 ]   input vector
[ 7 5 3 ]                 resulting slice
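The offset, stride, and dimension triple can be modeled as a small view over an existing buffer. This is a sketch under the assumption that element i of a slice maps to index offset + i * stride of the source; the `vector_view` type and function names are made up for illustration and are not part of LinearAlgebra.

```c
#include <stddef.h>

/* Hypothetical view type: a slice is just (base, offset, stride, dimension). */
typedef struct {
    const float *base;
    ptrdiff_t offset;
    ptrdiff_t stride;
    size_t dimension;
} vector_view;

/* Creating a view copies no data and allocates nothing. */
vector_view view_slice(const float *base, ptrdiff_t offset,
                       ptrdiff_t stride, size_t dimension) {
    vector_view v = { base, offset, stride, dimension };
    return v;
}

/* Element i of the view lives at base[offset + i * stride]. */
float view_get(vector_view v, size_t i) {
    return v.base[v.offset + (ptrdiff_t)i * v.stride];
}
```

Applied to the ten-element vector above with offset 7, stride -2, and dimension 3, the view reads indices 7, 5, and 3.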
Slice Example: Tiling
la_object_t A,B,C;
// A and B are matrices of dimension MxN
for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 2; ++j) {
        la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2);
        la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2);
        C = la_sum(Atile, Btile);
        // use of C tile
    }
}
[Figure: C = A + B, computed one tile at a time]
la_object_t A,B,C;
// A and B are matrices of dimension MxN
la_object_t sum = la_sum(A,B);
for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 2; ++j) {
        C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2);
        // use of C tile
    }
}
[Figure: sum = A + B; each C is a tile of sum]
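A tile can likewise be described without copying, as a window onto a larger row-major buffer. The sketch below assumes unit strides, as in the calls above, and uses hypothetical names (`tile_view`, `tile_of`, `tile_get`, `tile_total`) that are not LinearAlgebra APIs.

```c
#include <stddef.h>

/* Hypothetical tile view over a row-major buffer; no data is copied. */
typedef struct {
    const double *base;   /* start of the full matrix      */
    size_t ld;            /* elements between row starts   */
    size_t first_row, first_col;
    size_t rows, cols;
} tile_view;

tile_view tile_of(const double *base, size_t ld, size_t first_row,
                  size_t first_col, size_t rows, size_t cols) {
    tile_view t = { base, ld, first_row, first_col, rows, cols };
    return t;
}

/* Element (r, c) of the tile, addressed inside the full matrix. */
double tile_get(tile_view t, size_t r, size_t c) {
    return t.base[(t.first_row + r) * t.ld + (t.first_col + c)];
}

/* Sum of a tile's elements, standing in for "use of C tile". */
double tile_total(tile_view t) {
    double s = 0.0;
    for (size_t r = 0; r < t.rows; ++r)
        for (size_t c = 0; c < t.cols; ++c) s += tile_get(t, r, c);
    return s;
}
```

For a 4 x 4 matrix, tile_of(m, 4, 2, 0, 2, 2) corresponds to the i = 1, j = 0 iteration of the loops above.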
Splat: What is splat
la_splat_from_float(5.0f);
Add 2 to every element of a vector
• la_sum(vector, la_splat_from_double(2.0));
LinearAlgebra: Summary
Simple API
Modern language and run-time features
Good performance
LINPACK: The joy of benchmarking
Stephen Canon
Engineer, Vector and Numerics Group
How fast can you solve a system of linear equations?
LINPACK tests both hardware and software
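The slides do not say how the GFLOPS figures that follow were derived. The usual LINPACK convention credits an n x n solve with 2/3 n^3 + 2 n^2 floating-point operations and divides by wall-clock time. A small sketch of that arithmetic, with function names chosen only for this example:

```c
/* Conventional LINPACK operation count for solving an n x n system:
   2/3 n^3 + 2 n^2 floating-point operations. */
double linpack_flops(double n) {
    return (2.0 / 3.0) * n * n * n + 2.0 * n * n;
}

/* GFLOPS given the measured wall-clock time in seconds. */
double linpack_gflops(double n, double seconds) {
    return linpack_flops(n) / seconds / 1e9;
}
```

Under this convention, a 1000 x 1000 solve that takes one second rates at roughly 0.67 GFLOPS.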
Accelerate vs. “Brand A”
2013: LINPACK performance in GFLOPS (bigger is better)
[Chart: Accelerate on iPhone 5 versus “Brand A”; axis from 1 to 4 GFLOPS]
2014: LINPACK performance in GFLOPS (bigger is better)
[Chart: Accelerate on iPhone 5 versus “Brand A”; axis from 1 to 4 GFLOPS]
Let’s find some new competition
2014: LINPACK performance in GFLOPS (bigger is better)
[Chart: iPhone 5 versus 2010 MacBook Air; axis from 1 to 7 GFLOPS]
[Chart: iPhone 5s versus 2010 MacBook Air; axis from 1 to 11 GFLOPS; callout: 10.4 GFLOPS]
[Chart: iPad Air versus 2010 MacBook Air; axis from 2 to 14 GFLOPS; callout: 14.6 GFLOPS]
<simd/simd.h>: Short vector and matrix math
<simd/simd.h>
New (iOS 8 and OS X Yosemite) library with three purposes:
• 2D, 3D, and 4D vector math and geometry
• Features of Metal in C, Objective-C, and C++ on the CPU
• Abstraction over architecture-specific SIMD types and intrinsics
Vector Math and Geometry: Wish list
Inline implementations
Concise functions without extra parameters
float a = cblas_sdot(3, x, 1, y, 1);     // BLAS
float a = GLKVector3DotProduct(x, y);    // GLKit
float a = vector_dot(x, y);              // <simd/simd.h> in C and Objective-C
using namespace simd;
float a = dot(x, y);                     // <simd/simd.h> in C++
Arithmetic should use operators
z = GLKVector4MultiplyScalar(GLKVector4Add(x,y),0.5);
z = 0.5*(x + y);
Vector Math and Geometry: Types
In C and Objective-C, the primary type is vector_floatN, where N is 2, 3, or 4
In C++ you can use simd::floatN
Based on clang “extended vectors”
Vector Math and Geometry: Arithmetic
Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect(vector_float3 x, vector_float3 n) {
return x - 2*vector_dot(x,n)*n ;}
xn
vector_reflect(x,n)
8/15/2019 703 Whats New in the Accelerate Framework
http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 117/162
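The reflection formula x - 2*(x·n)*n can be tried outside of <simd/simd.h> as well. The following is a minimal sketch, assuming the GCC/clang vector_size extension as a stand-in for vector_float3; the names float4, dot4, reflect4, and reflect_demo are hypothetical and not part of the framework. It also assumes n has unit length, which the formula needs in order to give a true reflection.

```c
#include <assert.h>

/* Four float lanes via the GCC/clang vector_size extension. This is a
   stand-in for vector_float3, not the <simd/simd.h> type itself. */
typedef float float4 __attribute__((vector_size(16)));

/* Hypothetical helper: dot product over the four lanes. */
static float dot4(float4 a, float4 b) {
    float4 p = a * b;                 /* lane-wise multiply */
    return p[0] + p[1] + p[2] + p[3];
}

/* Reflect x about the plane whose unit normal is n: x - 2*(x.n)*n. */
static float4 reflect4(float4 x, float4 n) {
    float d = 2.0f * dot4(x, n);
    float4 scale = { d, d, d, d };    /* explicit splat of the scalar */
    return x - scale * n;             /* lane-wise operators */
}

/* Usage: a vector heading down and to the right bounces off the floor. */
static void reflect_demo(void) {
    float4 x = { 1.0f, -1.0f, 0.0f, 0.0f };
    float4 n = { 0.0f,  1.0f, 0.0f, 0.0f };   /* unit normal, pointing up */
    float4 r = reflect4(x, n);
    assert(r[0] == 1.0f && r[1] == 1.0f && r[2] == 0.0f && r[3] == 0.0f);
}
```

The scalar is splatted by hand here only to keep the sketch portable across compilers; the slide's version multiplies a vector by a scalar directly.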
Vector Math and Geometry: Elements and subvectors
Named subvectors
vector_float4 a = { 0, 1, 2, 3 };
vector_float2 b = a.lo; // b = { 0, 1 }
a.even = -b; // a = { 0, 1, -1, 3 }
Vector Math and Geometry
Some functions have two flavors: “precise” and “fast”
• “precise” is the default…
• … but if you compile with -ffast-math, you get the “fast” versions
Even with -ffast-math, you can call the precise functions by name:
float len = vector_precise_length(x);
You can call the fast versions by name too:
x = fast::normalize(x);
Matrices: C and Objective-C
matrix_floatNxM, where N and M are 2, 3, or 4
• N is the number of columns, M is the number of rows.
Create matrices
matrix_from_diagonal(vector)
matrix_from_columns(vector, vector, …)
matrix_from_rows(vector, vector, …)
Arithmetic
matrix_scale(matrix, scalar)
matrix_linear_combination(matrix, scalar, matrix, scalar)
matrix_transpose(matrix)
matrix_invert(matrix)
matrix_multiply(matrix/vector, matrix/vector)
Matrices: C++ (and Metal)
Create matrices
float4x4()
float2x2(diagonal)
float3x4(column0, column1, …)
Arithmetic
float3x4 A, B;
A -= 2.f * B;
float4x3 C = transpose(A);
float3 x;
float4 y = A*x;
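The columns-first naming is easy to get backwards, so here is a minimal plain-C sketch of what a 3x4 matrix (3 columns, 4 rows) times a 3-element vector means. The type mat3x4 and the functions mat3x4_mul_vec and matrix_demo are hypothetical illustrations, not the framework's types or its implementation.

```c
#include <assert.h>

/* Hypothetical plain-C stand-in for a 3x4 matrix: 3 columns, 4 rows each.
   Stored column by column, matching the "N columns, M rows" naming. */
typedef struct { float columns[3][4]; } mat3x4;

/* y = A * x: a 3-column matrix times a 3-element vector gives 4 elements. */
static void mat3x4_mul_vec(const mat3x4 *A, const float x[3], float y[4]) {
    for (int row = 0; row < 4; ++row) {
        y[row] = 0.0f;
        for (int col = 0; col < 3; ++col)
            y[row] += A->columns[col][row] * x[col];
    }
}

/* Usage: the result is a weighted sum of the three columns. */
static void matrix_demo(void) {
    mat3x4 A = {{ {1, 0, 0, 1},    /* column 0 */
                  {0, 1, 0, 1},    /* column 1 */
                  {0, 0, 1, 1} }}; /* column 2 */
    float x[3] = { 2.0f, 3.0f, 4.0f };
    float y[4];
    mat3x4_mul_vec(&A, x, y);
    assert(y[0] == 2.0f && y[1] == 3.0f && y[2] == 4.0f && y[3] == 9.0f);
}
```

This is why the transpose of a 3x4 matrix is a 4x3 matrix, and why the product with a 3-element vector has 4 elements.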
Abstract SIMD: Additional types
Doubles, signed and unsigned integers
Longer vectors (8, 16, and 32 elements)
Unaligned vectors
Abstract SIMD: Integer operators and conversions
Arithmetic operators: +, -, *, /, %
Bitwise operators: >>, <<, &, |, ^, ~
Conversions:
vector_float x;
vector_ushort y = vector_ushort(x);
vector_char z = vector_char_sat(x);
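The _sat suffix indicates a saturating conversion. As a rough scalar model of what that means for a single lane, the sketch below clamps out-of-range values to the ends of the destination range. The function float_to_char_sat is hypothetical; truncation toward zero for in-range values and the handling of NaN are assumptions of this sketch, not documented behavior of vector_char_sat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of a saturating float -> char
   conversion: out-of-range inputs clamp to the ends of the int8_t range. */
static int8_t float_to_char_sat(float v) {
    if (v != v)       return 0;         /* NaN: arbitrary choice for this sketch */
    if (v >= 127.0f)  return INT8_MAX;  /* clamp high */
    if (v <= -128.0f) return INT8_MIN;  /* clamp low */
    return (int8_t)v;                   /* in range: truncates toward zero */
}

/* Usage: values outside [-128, 127] stick to the nearest end of the range. */
static void saturate_demo(void) {
    assert(float_to_char_sat(300.0f) == 127);
    assert(float_to_char_sat(-300.0f) == -128);
    assert(float_to_char_sat(3.7f) == 3);
}
```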
Abstract SIMD: Comparisons
Vector comparisons: ==, !=, >, <, >=, <=
• Result is a vector of integers; each lane is -1 if comparison is true, 0 if false
[Figure: example vectors x and y, and the lane-wise result of x < y]
• Type of result usually isn’t important, because you’ll use one of the following:
if (vector_any(x < 0)) { /* executed if any lane of x is negative */ }
if (vector_all(y != 0)) { /* executed if every lane of y is non-zero */ }
z = vector_bitselect(x, y, x > y); /* minimum of x and y */
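The -1/0 mask convention is what makes the bit-select idiom work: a lane of all ones keeps every bit of one operand, and a lane of zeros keeps every bit of the other. A minimal sketch, assuming the GCC/clang vector_size extension rather than <simd/simd.h>; int4, any_lane, all_lanes, bit_select, and compare_demo are hypothetical stand-ins for the simd types and for vector_any, vector_all, and vector_bitselect.

```c
#include <assert.h>

/* Four int lanes via the GCC/clang vector_size extension
   (a stand-in for the simd integer vector types). */
typedef int int4 __attribute__((vector_size(16)));

/* Hypothetical helper: nonzero if any lane of the mask is set. */
static int any_lane(int4 mask) {
    return (mask[0] | mask[1] | mask[2] | mask[3]) != 0;
}

/* Hypothetical helper: nonzero if every lane of the mask is set. */
static int all_lanes(int4 mask) {
    return (mask[0] & mask[1] & mask[2] & mask[3]) != 0;
}

/* Hypothetical helper: bits of y where mask is set, bits of x elsewhere. */
static int4 bit_select(int4 x, int4 y, int4 mask) {
    return (x & ~mask) | (y & mask);
}

static void compare_demo(void) {
    int4 x = { 1, 5, -3, 7 };
    int4 y = { 2, 4, -3, 9 };
    int4 zero = { 0, 0, 0, 0 };

    int4 lt = x < y;                    /* lanes: -1, 0, 0, -1 */
    assert(lt[0] == -1 && lt[1] == 0 && lt[2] == 0 && lt[3] == -1);

    assert(any_lane(x < zero));         /* lane 2 of x is negative */
    assert(all_lanes(y != zero));       /* no lane of y is zero */

    int4 z = bit_select(x, y, x > y);   /* lane-wise minimum of x and y */
    assert(z[0] == 1 && z[1] == 4 && z[2] == -3 && z[3] == 7);
}
```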
String Copy: Scalar implementation
void string_copy(char *dst, const char *src) {
    while ((*dst++ = *src++));
}
String Copy: SSE intrinsic implementation
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>    /* uintptr_t */
void vector_string_copy(char *dst, const char *src) {
    /* Copy byte by byte until src is 16-byte aligned */
    while ((uintptr_t)src % 16)
        if ((*dst++ = *src++) == 0) return;
    while (1) {
        __m128i data = _mm_load_si128((const __m128i *)src);
        __m128i contains_zero = _mm_cmpeq_epi8(data, _mm_set1_epi8(0));
        if (_mm_movemask_epi8(contains_zero))
            break;
        _mm_storeu_si128((__m128i *)dst, data);
        src += 16;
        dst += 16;
    }
    /* Copy the final partial block, including the terminator */
    string_copy(dst, src);
}
String Copy: <simd/simd.h> implementation
#include <simd/simd.h>
#include <stdint.h>    /* uintptr_t */
void vector_string_copy(char *dst, const char *src) {
    /* Copy byte by byte until src is 16-byte aligned */
    while ((uintptr_t)src % 16)
        if ((*dst++ = *src++) == 0) return;
    const vector_char16 *vec_src = (const vector_char16 *)src;
    packed_char16 *vec_dst = (packed_char16 *)dst;
    /* Copy 16 bytes at a time until a block contains the terminator */
    while (!vector_any(*vec_src == 0))
        *vec_dst++ = *vec_src++;
    string_copy((char *)vec_dst, (const char *)vec_src);
}
Performance: Higher is better
[Chart: bytes copied per nanosecond (axis 0 to 6) against string length in bytes (1 to 1024), comparing Scalar, <simd/simd.h>, and Libc]
More Information
Paul Danbold
Core OS Technology Evangelist
George Warner
DTS Sr. Support [email protected]
Documentation
vImage Programming Guide
http://developer.apple.com/library/mac/#documentation/Performance/Conceptual/vImage/Introduction/Introduction.html
vDSP Programming Guide
http://developer.apple.com/library/mac/#documentation/Performance/Conceptual/vDSP_Programming_Guide/Introduction/Introduction.html
vImage Headers
/System/Library/Frameworks/Accelerate.framework/Frameworks/vImage.framework/Headers/vImage.h
vDSP Headers
/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vDSP.h
LinearAlgebra Headers
/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/LinearAlgebra/LinearAlgebra.h
<simd/simd.h>
/usr/include/simd/simd.h
Apple Developer Forums
http://devforums.apple.com
Bug Report
http://bugreport.apple.com
Related Sessions