Containerizing Hardware Accelerated Applications
Chelsea Mafrica
Data Center Systems Engineer, Intel Corporation
Motivation
Evaluate the performance impact of containers on a media stack that uses hardware acceleration
Agenda
● Hardware accelerators, applications, and media
● Media stack
● When & how to use containers
● Experiment & results
● Portability
Hardware accelerators, applications, and media
A hardware accelerator is a processor or fixed-function unit specialized to perform specific tasks (excluding a general-purpose CPU)
Examples: GPUs, FPGAs, ASICs
Applications that typically benefit from hardware acceleration are ones that can be parallelized
Examples: AI, machine learning, HPC, media
Media refers to video processing
Examples: Video compression and decompression (encode and decode), filters
Media stack
[Diagram: SERVER with GPU; KERNEL with DRIVER; USER SPACE with LIBS and three APPs]
Transcode applications
Intel® Media Server Studio
Intel® Quick Sync Video
Intel® Iris® Pro Graphics
Media stack with Docker
[Diagram: SERVER with GPU; KERNEL with DRIVER; USER SPACE with CONTAINER ENGINE and one CONTAINER holding LIBS and one APP]
Transcode applications
Intel® Media Server Studio
Intel® Quick Sync Video
Intel® Iris® Pro Graphics
Docker
Software-only app
[Diagram: SERVER; KERNEL; USER SPACE with CONTAINER ENGINE and one CONTAINER holding LIBS and APP]
Applications
Libraries & dependencies
Docker
Media stack with Docker
[Diagram: SERVER with GPU; KERNEL with DRIVER; USER SPACE with CONTAINER ENGINE and one CONTAINER holding LIBS and three APPs]
Transcode applications
Intel® Media Server Studio
Intel® Quick Sync Video
Intel® Iris® Pro Graphics
Docker
Media stack with Docker
[Diagram: SERVER with GPU; KERNEL with DRIVER; USER SPACE with CONTAINER ENGINE and two CONTAINERs, each holding LIBS and one APP]
Transcode applications
Intel® Media Server Studio
Intel® Quick Sync Video
Intel® Iris® Pro Graphics
Docker
Host requirements
• Kernel module installation
• Custom kernel build
$ ls /dev/dri
card0 card1 controlD64 controlD65 renderD128
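As a rough illustration of the host check above, the following sketch looks for render nodes before any container is started. The function name `list_render_nodes` is hypothetical (it is not part of the original deck), and the check only tests that a matching path exists; a stricter version might also confirm that it is a character device.

```shell
# Hypothetical helper, not from the original slides.
# Print every render node found under a DRI directory (default: /dev/dri).
# Returns non-zero when no render node is found.
list_render_nodes() {
    dri_dir="${1:-/dev/dri}"
    found=1
    for node in "$dri_dir"/renderD*; do
        # An unmatched glob stays literal, so confirm the path exists.
        if [ -e "$node" ]; then
            printf '%s\n' "$node"
            found=0
        fi
    done
    return "$found"
}
```

A caller could run `list_render_nodes || echo "no render node found" >&2` and treat a failure as a hint that the kernel module or custom kernel is not in place.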
Dockerfile
FROM centos:7.2.1511
MAINTAINER Chelsea Mafrica <[email protected]>
COPY intel-linux-media_generic_16.5.1-59511_64bit.tar.gz sample_multi_transcode /root/
RUN yum -y -t install mesa-dri-drivers && \
    yum clean all && \
    useradd user && \
    usermod -a -G wheel user && \
    usermod -a -G video user && \
    find /usr -name "libdrm*" | xargs rm -rf && \
    find /usr -name "libva*" | xargs rm -rf && \
    cd root && \
    tar -xvf intel-linux-media_generic_16.5.1-59511_64bit.tar.gz && \
    cp -r etc/* /etc && \
    cp -r lib/* /lib && \
    cp -r opt/* /opt && \
    cp -r usr/* /usr && \
    cp sample_multi_transcode /home/user && \
    chown user:user /home/user/sample_multi_transcode && \
    rm -rf *
WORKDIR /home/user
Building and running the container
docker build -t mss:centos.transcode .
docker run --device=/dev/dri/renderD128 \
    --volume=/home/user/volume/mss_content:/home/user/content \
    -i -d mss:centos.transcode bash
docker exec CONTAINER_ID su - user -c \
    "./sample_multi_transcode -i::h264 content/video_input.264 \
    -o::h264 content/video_output.264"
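The run command above could be wrapped in a small helper so that the device node and content directory become parameters. This is only a sketch: the function name `run_transcode_container` and the `DOCKER` override variable are assumptions made for illustration, while the flags (`--device`, `--volume`, `-i`, `-d`) and the image tag are the ones used on the slide.

```shell
# Hypothetical wrapper, not from the original slides.
# DOCKER can be overridden (for example DOCKER=echo) to print the
# arguments instead of starting a container.
DOCKER="${DOCKER:-docker}"

# Usage: run_transcode_container DEVICE CONTENT_DIR [IMAGE]
# Starts a detached container with the given device node passed through
# and CONTENT_DIR mounted at /home/user/content.
run_transcode_container() {
    device="$1"
    content_dir="$2"
    image="${3:-mss:centos.transcode}"
    "$DOCKER" run --device="$device" \
        --volume="$content_dir":/home/user/content \
        -i -d "$image" bash
}
```

For example, `run_transcode_container /dev/dri/renderD128 /home/user/volume/mss_content` issues the same `docker run` invocation as the slide.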
Experiment
Test the number of transcodes that can run on a system before the average performance of a transcode drops below 30 frames per second
[Diagram: three configurations on a HOST. Baseline: APP1 to APPN run directly on the host. Single container case: APP1 to APPN run in one CONTAINER. Multiple container case: APP1 to APPN run in CONTAINER1 to CONTAINERN.]
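One way the capacity criterion above might be evaluated is sketched below. It assumes, purely for illustration, that each transcode run has been logged as a line of the form `<concurrent_transcodes> <fps>`; that log format and the function name `max_realtime_capacity` are hypothetical and do not come from the original test setup.

```shell
# Hypothetical helper, not from the original slides.
# Usage: max_realtime_capacity [THRESHOLD] < measurements
# Reads "<concurrent_transcodes> <fps>" pairs on stdin, averages the fps
# for each concurrency level, then walks the levels in ascending order and
# prints the last level reached before the average drops below THRESHOLD
# (default 30). Prints 0 if even the lowest level is below the threshold.
max_realtime_capacity() {
    threshold="${1:-30}"
    awk 'NF >= 2 { sum[$1] += $2; count[$1]++ }
         END { for (n in sum) print n, sum[n] / count[n] }' |
    sort -n |
    awk -v threshold="$threshold" '
        $2 < threshold { exit }
        { best = $1 }
        END { print best + 0 }'
}
```

The same function can be applied separately to measurements from the baseline, single container, and multiple container cases so that the three capacities can be compared.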
Transcode Performance
[Chart: frames per second versus number of apps, with a real-time threshold marked at 30 fps]
Legal Disclaimer: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. *Other names and brands may be claimed as the property of others. See backup for configuration details.
Observations
● Variability in container startup time as the system reaches capacity
● Running in detached mode, negligible change in performance
● Portability is limited due to driver and hardware requirements
Media stack with Docker
[Diagram: SERVER with GPU; KERNEL with DRIVER; USER SPACE with CONTAINER ENGINE and one CONTAINER holding LIBS and one APP]
Transcode applications
Intel® Media Server Studio
Intel® Quick Sync Video
Intel® Iris® Pro Graphics
Docker
Summary
● Running accelerated apps in containers uses existing Docker capabilities
● The use of containers resulted in negligible performance difference for transcode apps in capacity test
● Containers are helpful for reducing conflicts with the host, but this isn’t specific to hardware accelerators
● Dependency on hardware and custom kernels limits portability of container, but the app will have better performance because of the hardware
Links & current work
Intel® Media Server Studio: http://intel.ly/MediaServerStudio
Intel® Media SDK: http://github.com/Intel-Media-SDK
Intel® OTC: http://github.com/vmmqa/dockerGpuStack
twitter: mafrica_
chelsea.e.mafrica at intel dot com
Legal Information
Testing by Chelsea Mafrica, January 2017 – June 2017
System Configuration:
BASELINE: Intel® Xeon® CPU E3-1585L v5, 3.5GHz, 4 cores, turbo and HT on, BIOS AMI 1.0, 32GB total memory, 2 slots / 16GB / 2133MHz / DDR4 DIMM, 480GB total storage / 2 240GB SSDs (2.5”), Intel® I350 Gigabit Network Connection, CentOS Linux* 7.2.1511 kernel 3.10.0-327.13.1.x86_64, Media Server Studio 2017 R1
NEW: Baseline configuration, Docker* 1.12.3