Upload
tranlien
View
226
Download
7
Embed Size (px)
Citation preview
RE-ARCHITECTING THE DATACENTER: INTRODUCING THE INTEL® XEON® PROCESSOR SCALABLE FAMILY
S U M N E R L E M O N D I R E C T O R , GOVERNMENT & PUBLIC SECTOR SALES I N T E L C O R P O R A T I O N A S I A / P A C I F I C & J A P A N
CLOUD ECONOMICS INTELLIGENT DATA PRACTICES NETWORK TRANSFORMATION
INFRASTRUCTURE TRANSFORMATION
Performance Security
Agility
REQUIRES
INTEL® XEON® SCALABLE PLATFORM
THE INDUSTRY’S
IN A DECADEBIGGEST PLATFORM ADVANCEMENT
INTEL® XEON® SCALABLE PLATFORM DELIVERS
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. 1. Geomean based on Normalized Generational Performance going from Intel® Xeon® processor E5-26xx v4 to Intel® Xeon® Scalable processor (estimated based on Intel internal testing of OLTP Brokerage, SAP SD 2-Tier, HammerDB, Server-side Java, SPEC*int_rate_base2006, SPEC*fp_rate_base2006, Server Virtualization, STREAM* triad, LAMMPS, DPDK L3 Packet Forwarding, Black-Scholes, Intel Distribution for LINPACK. 2. 2X gains in Reed Solomon Erasure Code: Intel Xeon® Processor Scalable Family: Platinum 8180 Processor, 28C, 2.5 GHz, H0, Neon City CRB, 12x16 GB DDR4 2666 MT/s ECC RDIMM, BIOS PLYCRB1.86B.0128.R08.1703242666. Intel® Xeon® E5-2600v4 Series Processor, E5-2650 v4, 12C, 2.2 GHz, Aztec City CRB, 4x8 GB DDR4 2400 MT/s ECC RDIMM, BIOS GRRFCRB1.86B.0276.R02.1606020546. Operating System: Redhat Enterprise Linux 7.3, Kernel 4.2.3, ISA-L 2.18, BIOS Configuration, P-States: Disabled, Turbo: Disabled, Speed Step: Disabled, C-States: Disabled, ENERGY_PERF_BIAS_CFG: PERF. 3. Up to 4.28x more VMs based on server virtualization consolidation workload: Based on Intel® internal estimates 1-Node, 2 x Intel® Xeon® Processor E5-2690 on Romley-EP with 256 GB Total Memory on VMware ESXi* 6.0 GA using Guest OS RHEL6.4, glassfish3.1.2.2, postgresql9.2. Data Source: Request Number: 1718, Benchmark: server virtualization consolidation, Score: 377.6 @ 21 VMs vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Wolf Pass SKX with 768 GB Total Memory on VMware ESXi6.0 U3 GA using Guest OS RHEL 6 64bit. Data Source: Request Number: 2563, Benchmark: server virtualization consolidation, Score: 1580 @ 90 VMs. Higher is better 4. Up to 65% lower 4-year TCO estimate example based on equivalent rack performance using VMware ESXi* virtualized consolidation workload comparing 20 installed 2-socket servers with Intel Xeon processor E5-2690 (formerly “Sandy Bridge-EP”) running VMware ESXi* 6.0 GA using Guest OS RHEL6.4 compared at a total cost of $919,362 to 5 new Intel® Xeon® Platinum 8180 (Skylake) running VMware ESXi6.0 U3 GA using Guest OS RHEL 6 64bit at a total cost of $320,879 including basic acquisition. Server pricing assumptions based on current OEM retail published pricing for 2-socket server with Broadwell based Intel Xeon processor systems– subject to change based on actual pricing of systems
offered.
Performance Agility Security
65% LOWER TOTAL COST
OF OWNERSHIP VS 4-YEAR OLD SERVER4
4.2X GREATER
VM CAPACITY VS 4-YEAR-OLD SERVER3
UP TO
2X DATA PROTECTION
PERFORMANCE G E N O V E R G E N 2
UP TO
1.65X AVERAGE
GENERATIONAL GAINS1
PERFORMANCE
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. 1. Up to 2X flops per cycle: when compared with Intel® AVX2 available on previous E5/E7 v3, v4 families
Intel® AVX-512 N E W L E V E L O F
V E C TO R P E R F O R M A N C E UP TO
2X FLOPS PER
CLOCK CYCLE1
Intel® Mesh Architecture
OPTIMIZED CORES, CACHE, MEMORY, I/O
UP TO
100GB/S ENCRYPTION @1K PACKETS
UP TO
100GB/S COMPRESSION
@8K BLOCK
Intel® QuickAssist Technology
INTEL® MESH ARCHITECTURE
Video
Other names and brands may be claimed as the property of others. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks/datacenter. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. 1. XXX world records as of July 11, 2017. Full details available at: www.intel.com/xxxxx
58 ESTABLISHED TODAY
HPE HAS 16 OF THE WORLD RECORDS
& COUNTING…
Source: https://www.intel.com/content/www/us/en/benchmarks/server/xeon-scalable/xeon-platinum-world-record.html
Security Without
Compromise
SECURITY
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. 1. 2X gains in Reed Solomon Erasure Code: Intel Xeon® Processor Scalable Family: Platinum 8180 Processor, 28C, 2.5 GHz, H0, Neon City CRB, 12x16 GB DDR4 2666 MT/s ECC RDIMM, BIOS PLYCRB1.86B.0128.R08.1703242666. Intel® Xeon® E5-2600v4 Series Processor, E5-2650 v4, 12C, 2.2 GHz, Aztec City CRB, 4x8 GB DDR4 2400 MT/s ECC RDIMM, BIOS GRRFCRB1.86B.0276.R02.1606020546. Operating System: Redhat Enterprise Linux 7.3, Kernel 4.2.3, ISA-L 2.18, BIOS Configuration, P-States: Disabled, Turbo: Disabled, Speed Step: Disabled, C-States: Disabled, ENERGY_PERF_BIAS_CFG: PERF. 2. BigBench query Runtime/second. Testing done by Intel. BASELINE: Platform 8168, NODES 1 Mgmt + 6 Workers, Make Intel Corporation, Model S2600WFD, Form Factor 2U, Processor Intel(R) Xeon(R) Platinum 8168 processor, Base Clock 2.70 GHz, Cores per socket 24, Hyper-Threading Enabled, NUMA mode Enabled, RAM 384GB DDR4, RAM Type 12x 32GB DDR4, OS Drive Intel® SSD DC S3710 Series (800GB, 2.5in SATA 6Gb/s, 20nm, MLC), Data Drives 8x - Seagate Enterprise 2.5 HDD ST2000NX0403 2TB, Intel® SSD DC P3520 Series (2.0TB), Temp Drive DC 3520 2TB, NIC Intel X722 10GBE - Dual Port, Hadoop Cloudera 5.11, Benchmark BigBench, Operating System CentOS Linux release 7.3.1611 (Core); HDFS
encryption turned OFF. vs. NEW: Platform 8168, NODES 1 Mgmt + 6 Workers, Make Intel Corporation, Model S2600WFD, Form Factor 2U, Processor Intel(R) Xeon(R) Platinum 8168 processor, Base Clock 2.70 GHz, Cores per socket 24, Hyper-Threading Enabled, NUMA mode Enabled, RAM 384GB DDR4, RAM Type 12x 32GB DDR4, OS Drive Intel® SSD DC S3710 Series (800GB, 2.5in SATA 6Gb/s, 20nm, MLC), Data Drives 8x - Seagate Enterprise 2.5 HDD ST2000NX0403 2TB, Intel® SSD DC P3520 Series (2.0TB), Temp Drive DC 3520 2TB, NIC Intel X722 10GBE - Dual Port, Hadoop Cloudera 5.11, Benchmark BigBench, Operating System CentOS Linux release 7.3.1611 (Core); HDFS encryption turned ON
Protect the Data
INTEL® KEY PROTECTION TECHNOLOGY
PROTECT KEYS FROM SOFTWARE ATTACKS
Secure the Platform
HARDWARE ROOT OF TRUST
0.37% ENCRYPTION
PERFORMANCE OVERHEAD2
UP TO
2X DATA PROTECTION
PERFORMANCE G E N O V E R G E N 1
AGILITY
Advanced RAS Features
Artificial Intelligence
Intel® Volume Management
Device WITH INTEL® OPTANE™ SSDs
Enhanced Virtualization
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.
WITH MODE BASED EXECUTION
MOST AGILE, SCALABLE AI PLATFORM
Real-time workloads, like inferencing,
run on Xeon
Potent Performance
Production Ready
Built-in ROI
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. 1. Platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC).Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact‘, OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Compared with Platform: 2S Intel® Xeon® CPU E5-2699 v3 @ 2.30GHz (18 cores), HT enabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 256GB
DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.el7.x86_64. OS drive: Seagate* Enterprise ST2000NX0253 2 TB 2.5" Internal Hard Drive.Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact,1,0‘, OMP_NUM_THREADS=36, CPU Freq set with cpupower frequency-set -d 2.3G -u 2.3G -g performance. Intel Caffe: (http://github.com/intel/caffe/), revision b0ef3236528a2c7d2988f249d347d5fdae831236. Inference measured with “caffe time --forward_only” command, training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training. Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (GoogLeNet, AlexNet, and ResNet-50), GCC 4.8.5, MKLML version 2017.0.2.20170110. BVLC-Caffe: https://github.com/BVLC/caffe, Inference & Training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was st ored on local storage and cached in memory before training BVLC Caffe (http://github.com/BVLC/caffe), revision 91b09280f5233cafc62954c98ce8bc4c204e7475 (commit date 5/14/2017). BLAS: atlas ver. 3.10.1.
2. Platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC).Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact‘, OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Compared with Platform: 2S Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz (22 cores), HT enabled, turbo disabled, scaling governor set to “performance” via acpi-cpufreq driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3500 Series (480GB, 2.5in SATA 6Gb/s, 20nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact,1,0‘, OMP_NUM_THREADS=44, CPU Freq set with cpupower frequency-set -d 2.2G -u 2.2G -g performance. Neon: ZP/MKL_CHWN branch commit id:52bd02acb947a2adabb8a227166a7da5d9123b6d. Dummy data was used. The main.py script was used for benchmarking , in mkl mode. ICC version used : 17.0.3 20170404, Intel MKL small libraries version 2018.0.20170425; Inference and training throughput uses FP32 instructions
cut training time from days to hours
up to
113X performance with
optimized software V S I N T E L X E O N E 5 V 3 1
up to
2.2X deep learning performance
V S P R I O R G E N 2
GEN10 & XEON® SCALABLE FAMILY
IN SUMMARY…Video