26
® Intel® Integrated Intel® Integrated Performance Performance Primitives (IPP) Primitives (IPP)

® Intel® Integrated Performance Primitives (IPP)

Embed Size (px)

Citation preview

Page 1: ® Intel® Integrated Performance Primitives (IPP)

®®

Intel® Integrated Intel® Integrated Performance Performance Primitives (IPP)Primitives (IPP)

Page 2: ® Intel® Integrated Performance Primitives (IPP)

®®

AgendaAgenda

Overview of Intel® IPP 通过 Intel® IPP 获得更高的性能 编程约定 实验 将来的 IPP Summary

Page 3: ® Intel® Integrated Performance Primitives (IPP)

®®

The Intel® Integrated The Intel® Integrated Performance Primitives are…Performance Primitives are… 函数库:信号处理,图像处理,多媒体,

向量处理等 跨平台和 OS 的通用 API 高性能代码

Overview of Intel® IPP

Page 4: ® Intel® Integrated Performance Primitives (IPP)

®®

Overview of Intel® IPP

易于在多种平台上开发

IA32IA32IA32Intel Pentium®

and Xeon processors

Intel Itanium®Architecture

Intel® MathIntel® MathKernelKernelLibraryLibrary

ApplicationsApplicationsApplicationsApplications

Intel® PCA application processors

Intel® Integrated Performance Primitives (IPP)Intel® Integrated Performance Primitives (IPP)Primitives Interface

Processor-specific Functions

Sample codeSample code

Intel® PCA application processors

Page 5: ® Intel® Integrated Performance Primitives (IPP)

®®

Overview of Intel® IPP

• 快速编码快速编码– 自动选择处理器相关的 自动选择处理器相关的 DLLDLL– 和体系结构相关的指令集和体系结构相关的指令集

Itanium®Architecture

Intel PCA application processors based on XScale™

technology

Integrated Performance Primitives (IPP)Integrated Performance Primitives (IPP)

Pentium® II processor

Pentium® III processor

Pentium® 4 processor

Xeon™ processor

Full IA-32 Family

MMX™ technology

Streaming SIMD Extensions

Streaming SIMD Extensions-2

Page 6: ® Intel® Integrated Performance Primitives (IPP)

®®

Cross platform & OSCross platform & OSOverview of Intel® IPP

• 支持多种平台 支持多种平台 – MMX™, Streaming SIMD Extensions (SSE) and Streaming MMX™, Streaming SIMD Extensions (SSE) and Streaming

SIMD Extensions 2 (SSE-2) TechnologiesSIMD Extensions 2 (SSE-2) Technologies

– IA-32 (including Intel® Xeon™ processor)IA-32 (including Intel® Xeon™ processor)

– Itanium® architectureItanium® architecture

– Intel® XScale™ micro-architectureIntel® XScale™ micro-architecture

• 支持多种操作系统支持多种操作系统– Windows NT* 4.0 / Windows 2000* /Windows XP*Windows NT* 4.0 / Windows 2000* /Windows XP*

– Windows XP* 64-bit Windows XP* 64-bit

– Linux* & Linux-64Linux* & Linux-64

– Windows CE*, Linux in embedded deviceWindows CE*, Linux in embedded device

Page 7: ® Intel® Integrated Performance Primitives (IPP)

®®

Overview of Intel® IPP

不需要写汇编代码,获得优化的应用程序不需要写汇编代码,获得优化的应用程序不需要写汇编代码,获得优化的应用程序不需要写汇编代码,获得优化的应用程序

Page 8: ® Intel® Integrated Performance Primitives (IPP)

®®

AgendaAgenda

Overview of Intel® IPP 通过 Intel® IPP 获得更高的性能 编程约定 实验 将来的 IPP Summary

Page 9: ® Intel® Integrated Performance Primitives (IPP)

®®

对对 IPPIPP 性能的评论性能的评论

Gaining Performance with Intel® IPP

Leo Volfson, President and Chief Technology Officer, Inetcam, Inc. Leo Volfson, President and Chief Technology Officer, Inetcam, Inc.

““The Intel ® Integrated Performance Primitives (IPP) has enhanced the The Intel ® Integrated Performance Primitives (IPP) has enhanced the iVISTA* application to be more in line with customers' expectations. For iVISTA* application to be more in line with customers' expectations. For exampleexample, it enables us to dynamically rescale, in real-time, a video , it enables us to dynamically rescale, in real-time, a video stream without loss of performance. This capability would not be stream without loss of performance. This capability would not be possible without IPP.possible without IPP.””

““The Intel IPP provided The Intel IPP provided a 300% improvement in the number of users who a 300% improvement in the number of users who can simultaneously participate in a webcastcan simultaneously participate in a webcast. In addition, the . In addition, the migration migration from the Intel Pentium III to the Intel Pentium 4 took only a dayfrom the Intel Pentium III to the Intel Pentium 4 took only a day .”.”

Page 10: ® Intel® Integrated Performance Primitives (IPP)

®®

对对 IPPIPP 性能的评论性能的评论

Gaining Performance with Intel® IPP

Bryan Cook, Software Architect, AuSIM Inc, Los Altos, California Bryan Cook, Software Architect, AuSIM Inc, Los Altos, California October 2001October 2001

““AuSIM Inc. delivers the most advanced audio simulation technology for AuSIM Inc. delivers the most advanced audio simulation technology for mission-critical aural displays and simulations. mission-critical aural displays and simulations. With Intel’s Integrated With Intel’s Integrated Performance Primitives (IPP), AuSIM has leveraged 4X performance Performance Primitives (IPP), AuSIM has leveraged 4X performance gains within its AuSIM3D* audio simulation technology.gains within its AuSIM3D* audio simulation technology. …directly …directly enhances AuSIM’s ability to provide the ultimate audio solutions for enhances AuSIM’s ability to provide the ultimate audio solutions for simulations, team communications, audio production, tele-conferences, simulations, team communications, audio production, tele-conferences, and aural information displays.”and aural information displays.”

Page 11: ® Intel® Integrated Performance Primitives (IPP)

®®

Minor effort, major gainsMinor effort, major gains

Gaining Performance with Intel® IPP

通过 Intel® IPP ,很小的代码改变可以获得极大的性能

通过将运行时的函数调用替换为IPP ,应用程序模块的性能可以得到极大的改进

pFlTmp=flArgSin[thrd].algPtr;

for(t=0; t<4*iWvMeshSize; t++) {

pFlTmp[t]=(float)sin(pFlTmp[t]);

}

____________________

ippsSin_32f_A11(flArgSin[thrd].algPtr,

flArgSin[thrd].algPtr, 4*iWvMeshSize);

1x

4x

Demo

Page 12: ® Intel® Integrated Performance Primitives (IPP)

®®

AgendaAgenda

Overview of Intel® IPP 通过 Intel® IPP 获得更高的性能 编程约定 实验 将来的 IPP Summary

Page 13: ® Intel® Integrated Performance Primitives (IPP)

®®

An IPP Function NameAn IPP Function Name

Intel® IPP Programming Conventions

ippsAddC_8u_I();

Prefixipps, ippi, ippm

BasenameE.g. Add, DCT, etc. Data array type

Indicates bit depth & integer / floating pointE.g. 8u, 16s, 32f

DescriptorsIndicates data layout variants

Page 14: ® Intel® Integrated Performance Primitives (IPP)

®®

Data-types and layoutsData-types and layouts

Intel® IPP Programming Conventions

对数据类型的特殊优化 (8u, 16s, 32f) 对数据布局的特殊优化 (pixel, planer…) 对数据类型和布局的转换函数的优化

ippsConvert_8u_32f()

ippsInterleave_16s()

ippiRGBToYUV_8u_C3P3R()

Color Models:• YCbCr (4:4:4, 4:2:2)• YUV (4:2:2, 4:2:0)• YCC, HLS, RGB, RGBA …

Interleaving:• L/R/L/R/L/R… stereo• RGBRGBRGB… color image• RGBARGBA… color +alpha

Page 15: ® Intel® Integrated Performance Primitives (IPP)

®®

数据布局数据布局 pixelpixel 格式中,每个像素所有位都被顺序存储格式中,每个像素所有位都被顺序存储 在在 planerplaner 格式中,每个像素的第一位被存储,格式中,每个像素的第一位被存储,

接着每个像素的第二位被存储,等等接着每个像素的第二位被存储,等等

Page 16: ® Intel® Integrated Performance Primitives (IPP)

®®

AgendaAgenda

Overview of Intel® IPP 通过 Intel® IPP 获得更高的性能 编程约定 实验 将来的 IPP Summary

Page 17: ® Intel® Integrated Performance Primitives (IPP)

®®

IPP IPP 对信号处理的优化对信号处理的优化Intel® IPP Lab Exercise: Signal Processing

IPP 为信号处理应用提供了优化的性能• 信号的生成• 信号的转换• 时域频域的转换 (FFTs)• 滤波器 (FFT, FIR)

Page 18: ® Intel® Integrated Performance Primitives (IPP)

®®

IPP IPP 对信号处理的优化对信号处理的优化Intel® IPP Lab Exercise: Signal Processing

SupportSupport– conj, copy, imag, real, zero, setconj, copy, imag, real, zero, set

ConvertConvert– polar/cart, complex/realpolar/cart, complex/real

– integer/floatinteger/float

– up/down sampleup/down sample

WindowingWindowing– Bartlett, Blackman, Hamming, Hann, Bartlett, Blackman, Hamming, Hann,

KaiserKaiser

Signal GenerationSignal Generation– random, wave patternsrandom, wave patterns

FiltersFilters– FIR, IIRFIR, IIR

– medianmedian

TransformsTransforms– FFT, DFT, Goertzel, DCT, FFT, DFT, Goertzel, DCT,

waveletwavelet

StatisticsStatistics– norms, threshold, min / max / norms, threshold, min / max /

std.dev., mean, powerspectrstd.dev., mean, powerspectr

AudioAudio– A-law / mu-law, preemphasizeA-law / mu-law, preemphasize

Lab

Page 19: ® Intel® Integrated Performance Primitives (IPP)

®®

IPP IPP 对图像处理的优化对图像处理的优化Intel® IPP Lab Exercise: Image Processing

IPP 为图像处理应用提供了优化的性能

• 图像生成• 颜色转换• 变形和过滤• 算术和逻辑操作• 几何变形• 编解码支持 (MPEG-1, -2, -4, H.263)

Page 20: ® Intel® Integrated Performance Primitives (IPP)

®®

IPP OptimizationsIPP Optimizationsfor Image Processingfor Image Processing

Intel® IPP Lab Exercise: Image Processing

Arithmetic & LogicalArithmetic & Logical– Abs, Add, convolve, cross-Abs, Add, convolve, cross-

correlation, div, exp, ln, LShift, correlation, div, exp, ln, LShift, normalize, mul, RShift, sqr, sqrt, normalize, mul, RShift, sqr, sqrt, sub, thresholdsub, threshold

– And, not, orAnd, not, or– CompareCompare– Phase, magnitudePhase, magnitude

ConvertConvert– Pixel/planarPixel/planar– Color conversionsColor conversions

FiltersFilters– User-defined and built-inUser-defined and built-in

TransformsTransforms– FFT, DFT, DCT, waveletFFT, DFT, DCT, wavelet

StatisticsStatistics– norms, threshold, min / max / norms, threshold, min / max /

std.dev., mean, powerspectr, std.dev., mean, powerspectr, momentsmoments

GeometricGeometric– MirrorMirror, rotate, resize, remap, rotate, resize, remap

Alpha CompositeAlpha Composite Gamma correctionGamma correction Image GenerationImage Generation

Lab

Page 21: ® Intel® Integrated Performance Primitives (IPP)

®®

AgendaAgenda

Overview of Intel® IPP 通过 Intel® IPP 获得更高的性能 编程约定 实验 将来的 IPP Summary

Page 22: ® Intel® Integrated Performance Primitives (IPP)

®®

将来的 将来的 IPP:3.0 IPP:3.0 或以后版本或以后版本 迁移支持迁移支持 ::

–SPL (Signal Processing Library)SPL (Signal Processing Library)

– IPL (Image Processing Library)IPL (Image Processing Library)

– IJL (Intel® JPEG Library)IJL (Intel® JPEG Library)

–RPL (Recognition Primitives Library)RPL (Recognition Primitives Library)

The Future of Intel® IPP

…supporting migration to new platforms

Page 23: ® Intel® Integrated Performance Primitives (IPP)

®®

将来的 将来的 IPP:3.0 IPP:3.0 或以后版本或以后版本 编解码原语编解码原语 ::

–MPEG-1, MPEG-2, MPEG-4, JPEG2000MPEG-1, MPEG-2, MPEG-4, JPEG2000

–G.729, G.723, GSM AMRG.729, G.723, GSM AMR

编解码采样编解码采样 ::–MPEG-1, MPEG-2, MPEG-4MPEG-1, MPEG-2, MPEG-4

–G.729, G.723, GSM AMRG.729, G.723, GSM AMR

The Future of Intel® IPP

…supporting video, image and speech encoding/decoding

Page 24: ® Intel® Integrated Performance Primitives (IPP)

®®

将来的 将来的 IPP:3.0 IPP:3.0 或以后版本或以后版本 支持更多的颜色模式和转换支持更多的颜色模式和转换 ……and much, much more!and much, much more!

The Future of Intel® IPP

…tell us what you want!

Page 25: ® Intel® Integrated Performance Primitives (IPP)

®®

AgendaAgenda

Overview of Intel® IPP 通过 Intel® IPP 获得更高的性能 编程约定 实验 将来的 IPP Summary

Page 26: ® Intel® Integrated Performance Primitives (IPP)

®®

Summary: IPPSummary: IPP

高性能信号处理,图像处理,多媒体和向高性能信号处理,图像处理,多媒体和向量数学函数库量数学函数库

通过很小的代码改变获得很高的性能通过很小的代码改变获得很高的性能 跨平台和操作系统的通用跨平台和操作系统的通用 APIAPI

Intel® IPP Summary