Intel webinar on the NDK (Native Development Kit) and benchmark tests
Go Native? Benchmark tests on x86 devices: Java, NDK, IPP and TBB
Francesca Tosi, Alberto Mancini
Francesca
Web and mobile developer
software engineer and architect
with a passion for clean code and fine tuned details
Intel Software Innovator
[email protected] · @francescatosi · +FrancescaTosi · www.jooink.com
Alberto
Software Developer (Web & Mobile);
Linux Sysadmin
DevOps
Intel Software Innovator
[email protected] · +AlbertoMancini · www.jooink.com
Why NDK?
The NDK is a toolset that allows you to implement parts of your app using native-code languages such as C and C++.
You should understand that the NDK will not benefit most apps. As a developer, you need to balance its benefits against its drawbacks.
!!NDK is not always the right choice http://goo.gl/LrmM5G
Why NDK?
When developing computationally expensive apps
→ we must optimize
We decided to test the performance of NDK.
Browser based computation
● Augmented Reality
● Interest Point detection
● Computer Vision & Image Recognition
Performance
new mobile_app( ... )
→ system & tools evaluation
→ benchmark test: image RGB2GrayScale conversion
Performance
Our choice for benchmarks:
→ Pure Java
→ NDK (C/C++)
→ Intel Performance Primitives (IPP)
→ Threading Building Blocks (TBB)
Intel INDE
Intel Beacon Mountain: goo.gl/RWvxU6
Test devices
● Galaxy Tab 3
● DELL Venue 8
● Lenovo K900
Test devices
Galaxy Tab 3
● Intel Atom CPU Z2560
● 2 cores, 1.60 GHz
● Android 4.4.2
● i686 architecture
Test devices
DELL Venue 8
● Intel Atom CPU Z2580
● 2 cores, 2 GHz
● Android 4.2.2
● 2 GB RAM
Test devices
Lenovo K900
● Intel Atom CPU Z2580
● 2 cores, 2 GHz
● Android 4.2.2
● 2 GB RAM
Image RGB2Grayscale
RGB: jfloat[1024*1024*3]
gray: jfloat[1024*1024]
0.299*R+0.587*G+0.114*B → Y
Average over 10K runs
Image RGB2Grayscale
Java
void compute(float[] in, float[] out) {
    // Weighted sum per pixel; the double result must be cast back to float
    for (int i = 0, j = 0; i < out.length; i++, j += 3)
        out[i] = (float) (0.299 * in[j] + 0.587 * in[j+1] + 0.114 * in[j+2]);
}
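The "average over 10K runs" measurement can be sketched in plain Java as follows. This is only a minimal sketch, not the harness used for the webinar: the class name `GrayBenchmark`, the method `averageMillis`, and the warm-up and run counts are assumptions made for illustration.

```java
import java.util.Arrays;

class GrayBenchmark {
    // Kernel from the slide, with an explicit cast so the double result fits a float.
    static void compute(float[] in, float[] out) {
        for (int i = 0, j = 0; i < out.length; i++, j += 3) {
            out[i] = (float) (0.299 * in[j] + 0.587 * in[j + 1] + 0.114 * in[j + 2]);
        }
    }

    // Average wall-clock time per call, in milliseconds, over `runs` timed calls.
    // `warmup` untimed calls come first so the JIT can compile the loop.
    static double averageMillis(float[] in, float[] out, int warmup, int runs) {
        for (int r = 0; r < warmup; r++) {
            compute(in, out);
        }
        long start = System.nanoTime();
        for (int r = 0; r < runs; r++) {
            compute(in, out);
        }
        return (System.nanoTime() - start) / 1e6 / runs;
    }

    public static void main(String[] args) {
        int pixels = 1024 * 1024;
        float[] rgb = new float[pixels * 3];
        float[] gray = new float[pixels];
        Arrays.fill(rgb, 0.5f);
        // Hypothetical counts, chosen to keep the sketch quick; the slides average 10K runs.
        double ms = averageMillis(rgb, gray, 20, 200);
        System.out.printf("average: %.3f ms per conversion%n", ms);
    }
}
```

Timing the whole loop and dividing by the run count keeps the resolution of `System.nanoTime()` from dominating a single short call.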
Image RGB2Grayscale
NDK
void JNICALL … jfloatArray in, jfloatArray out) {
    jsize len_out = (*env)->GetArrayLength(env, out);
    ...
    jfloat *body_out = (*env)->GetFloatArrayElements(env, out, 0);
    for(i=0, j=0; i< len_out; i++, j+=3)
        body_out[i] = (jfloat)(0.299 * body_in[j] + … );
    (*env)->ReleaseFloatArrayElements(env, in, body_in, 0);
    ...
}
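The slide shows only the C side of the JNI call. A minimal sketch of what the Java side might look like is given here; the class name `NativeGray`, the library name `rgb2gray`, and the pure-Java fallback are assumptions for illustration, not the code used in the webinar.

```java
class NativeGray {
    // "rgb2gray" is a hypothetical library name (librgb2gray.so on Android).
    static final boolean NATIVE_AVAILABLE = tryLoad("rgb2gray");

    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library not packaged for this device or ABI
        }
    }

    // Implemented in C. JNI looks up a function named
    // Java_<package and class>_computeNative in the loaded library.
    static native void computeNative(float[] in, float[] out);

    // Pure-Java fallback with the same weights as the native loop.
    static void computeJava(float[] in, float[] out) {
        for (int i = 0, j = 0; i < out.length; i++, j += 3) {
            out[i] = (float) (0.299 * in[j] + 0.587 * in[j + 1] + 0.114 * in[j + 2]);
        }
    }

    static void compute(float[] in, float[] out) {
        if (NATIVE_AVAILABLE) {
            computeNative(in, out);
        } else {
            computeJava(in, out);
        }
    }

    public static void main(String[] args) {
        float[] in = {1f, 1f, 1f};
        float[] out = new float[1];
        compute(in, out);
        System.out.println("native available: " + NATIVE_AVAILABLE + ", Y = " + out[0]);
    }
}
```

Keeping a Java fallback behind the same `compute` entry point also makes it easy to run both variants through the same benchmark loop.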
Image RGB2Grayscale
IPP
IppiSize srcRoi = { 1024, 1024 };
Ipp32f* pSrc = body_in;
Ipp32f* pDst = body_out;
// Steps are row strides in bytes, not in pixels
ippiRGBToGray_32f_C3C1R(pSrc, 1024*3*sizeof(Ipp32f), pDst, 1024*sizeof(Ipp32f), srcRoi);
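IPP image functions take the distance between consecutive rows as a step in bytes, together with a region of interest. The plain-Java model below sketches that indexing; it is not IPP code. The class `StridedGray` and its method are hypothetical names, and the weights are assumed to be the same 0.299/0.587/0.114 used elsewhere in these slides.

```java
class StridedGray {
    // Row-strided RGB-to-gray conversion over a width x height region.
    // srcStepBytes and dstStepBytes are the distances in bytes between the
    // starts of consecutive rows, so rows may carry padding after the region.
    static void rgbToGray(float[] src, int srcStepBytes,
                          float[] dst, int dstStepBytes,
                          int width, int height) {
        int srcStep = srcStepBytes / Float.BYTES; // step in float elements
        int dstStep = dstStepBytes / Float.BYTES;
        for (int y = 0; y < height; y++) {
            int s = y * srcStep;
            int d = y * dstStep;
            for (int x = 0; x < width; x++, s += 3, d++) {
                dst[d] = (float) (0.299 * src[s] + 0.587 * src[s + 1] + 0.114 * src[s + 2]);
            }
        }
    }

    public static void main(String[] args) {
        int w = 1024, h = 1024;
        float[] rgb = new float[w * h * 3];
        float[] gray = new float[w * h];
        // Tightly packed image: a source row is w*3 floats, a destination row is w floats.
        rgbToGray(rgb, w * 3 * Float.BYTES, gray, w * Float.BYTES, w, h);
        System.out.println("converted " + gray.length + " pixels");
    }
}
```

For a tightly packed 1024x1024 float image, the source step is 1024*3*4 bytes and the destination step is 1024*4 bytes.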
Image RGB2Grayscale
TBB
tbb::parallel_invoke(
    [pSrc,pDst] {
        IppiSize srcRoi = { 1024, 512 };
        ippiRGBToGray_32f_C3C1R(..., srcRoi);
    },
    [pSrc,pDst] {
        IppiSize srcRoi = { 1024, 512 };
        Ipp32f* pSrcShifted = pSrc+3*(1024*512);
        Ipp32f* pDstShifted = pDst+(1024*512);
        ippiRGBToGray_32f_C3C1R(...);
    });
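The TBB version splits the image into a top and a bottom half and converts them concurrently; the second task starts 1024*512 pixels into the destination and three times that many floats into the source. The same split can be sketched in plain Java with two threads. This is an illustration of the idea only, not TBB or IPP; `ParallelGray` and its methods are hypothetical names.

```java
class ParallelGray {
    // Convert rowCount rows starting at firstRow of a tightly packed image.
    static void convertRows(float[] in, float[] out, int width, int firstRow, int rowCount) {
        int i = firstRow * width;       // first gray pixel of the band
        int end = i + rowCount * width; // one past the last gray pixel of the band
        for (int j = 3 * i; i < end; i++, j += 3) {
            out[i] = (float) (0.299 * in[j] + 0.587 * in[j + 1] + 0.114 * in[j + 2]);
        }
    }

    // Run the top and bottom halves on two threads and wait for both.
    static void convertInTwoHalves(float[] in, float[] out, int width, int height) {
        int half = height / 2;
        Thread top = new Thread(() -> convertRows(in, out, width, 0, half));
        Thread bottom = new Thread(() -> convertRows(in, out, width, half, height - half));
        top.start();
        bottom.start();
        try {
            top.join();
            bottom.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while converting", e);
        }
    }

    public static void main(String[] args) {
        int w = 1024, h = 1024;
        float[] rgb = new float[w * h * 3];
        float[] gray = new float[w * h];
        convertInTwoHalves(rgb, gray, w, h);
        System.out.println("converted " + gray.length + " pixels on two threads");
    }
}
```

The two bands write disjoint ranges of the output array, so no locking is needed; `join` makes the results visible to the calling thread.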
Intel INDE
Integrated Native Development Experience
INDE is a cross-platform suite that provides developers with tools, support, integration and updates to create high-performance C++/Java applications.
Intel INDE
Integrated Native Development Experience
→ download https://software.intel.com/en-us/intel-inde
→ integrates with:
- Android Studio
- Eclipse
- Microsoft Visual Studio (*)
Intel INDE
Integrated Native Development Experience
→ Intel HAXM
The Intel Hardware Accelerated Execution Manager is a hardware-assisted virtualization engine (hypervisor) that speeds up Android app emulation.
Intel INDE
Integrated Native Development Experience
→ Intel C++ compiler for Android
Intel IPP
Integrated Performance Primitives
Extensive library of highly optimized (Intel SSE, Intel AVX) software functions for:
- multimedia
- data processing
- communication
Intel IPP
Integrated Performance Primitives
Components:
- Signal Processing (filtering, transforms)
- Image Processing (color conversion, wavelet transforms, computer vision)
- Small Matrices and Rendering (matrix algebra, eigenvalue problems)
- Cryptography (RSA, DSA, prime & pseudorandom number generation)
Intel TBB
Threading Building Blocks
Open source project (www.threadingbuildingblocks.org)
Lets you easily write parallel C++ programs that take full advantage of multicore performance and are portable and composable
Alberto Mancini
Coding time
Let’s view some code!
Results
Handmade C: 50% slower than TBB+IPP
Results
Native/C with IPP & TBB (4 threads): 3x faster than pure Java
That’s all!
Francesca Tosi, R&D at Jooink, [email protected]
Alberto Mancini, Dev at Jooink, [email protected]