Inspur OAI Product Introduction Jan 2021


Page 1: Inspur OAI Product Introduction

Inspur OAI Product Introduction

Jan 2021

Page 2: Inspur OAI Product Introduction

Architecture diagram layers:

E2E AI Solution: Smart City, Medicine, Education, Manufacture, Telecom, Finance, Media, Internet

AI Algorithm Toolkit Platform

• AutoML Suite: Cloud & On-premise Deployment; Automatic modeling; Automatic tuning; Automatic cropping

• Caffe-MPI: One of the first parallel versions of the Caffe framework

• TensorFlow-opt: Optimized TensorFlow framework with the fastest AI training speed on the public cloud; 512-GPU expansion efficiency of 90%

• LMS

• TF2

• Self-developed AI model computing framework, supporting large-scale GPU training

• Lossless model inaccuracy; Speed up FPGA development

AI Resource Platform

• AIStation (Training Platform)

• AIStation (Inference Platform)

• T-Eye: AI application and framework feature analyzer

• Model Development; Model Deployment; Application Development

• Compatible with multiple DL frameworks

• Support AI model online testing and evaluation

• Multi-model deployment and weighted calculation

AI Computing Platform

• Server

• Training, Inference, Edge

• Accelerator Card: F07V, F10A, F37X, N10X, F10S, N20X

• General Server and Open Compute

Page 3: Inspur OAI Product Introduction

Telecom, FinTech, Healthcare, Broadcast, Government, Manufacturing, Transportation, Internet

“Solution Partner”: SIs and ISVs able to deliver total solutions for industries

Efficient Innovation

AI Computing Platform

• Industry’s Most Comprehensive AI Server Portfolio

• General Server 2U/4U/6U

• Open Hardware Compute and OAI

• M5 AI Servers, FPGAs, ASIC Cards…

Agile Collaboration

AI Resource Platform

• AIStation: One-stop AI development platform; efficient and flexible computing resource scheduling; easy-to-deploy AI dev environment

• T-Eye: AI performance profiling and tuning tool that empowers AI application optimization

Time to Delivery

Algorithm Toolkit

• AutoML Suite: On-Premise & Cloud deployment; Parallel Acceleration; Effortless Model Generation

• Caffe-MPI: 1st Parallel Version of Caffe

• TensorFlow-Opt: Scale-out TensorFlow on public cloud, optimization on cloud RoCE

“Algorithm Partner”: AI companies able to develop core AI capabilities

Page 4: Inspur OAI Product Introduction

ODCC Rack

OCP Rack

Project Olympus Rack

Open19 Rack

InCloud Rack with Intel® RSD

Industry’s 1st 21” OAM Platform

ODCC Solution Provider

Intel® Rack Scale Design

One of the Key Members

Inspur is a Key Member in Open Platform Communities.

OCP Platinum Member, Solution Provider

Microsoft Project Olympus

Page 5: Inspur OAI Product Introduction

Data

Computing Resources Utilization: 40% → 80%

Training Time: 2 days → 4 hours

Telecom

Finance

Medical Science

Government

Manufacture

Transport

Internet

Low Model Development Efficiency

On-premise Deployment

Public Cloud

Private Cloud

Model Development and Training

Model Deployment and Inference

AI App

Efficient and flexible platform: obtain AI computing resources on demand to speed up model training

Easily deploy the AI development environment and development process, significantly improving development efficiency

Low Utilization of Computing Resource

Complicated deployment of the trained model into production

Seamless connection between model development and deployment, shortening the time to scale to production

Unified management of multiple models, centralized scheduling of computing resources

Dynamic allocation, elastic expansion

One-stop Model Deployment

Multi-application load balancing and resource elastic scaling

Data Model AI Service

2 days 5 min

PC

Mobile

Manufacture

Robot

IOT

Page 6: Inspur OAI Product Introduction

141mm

• SAS Switch for pooling HDDs, improving storage flexibility

• PCIe Switch for pooling GPUs; GPU acceleration ratio increases linearly

• GPU/FPGA over Fabric: remote expansion of heterogeneous acceleration

Server 1 Server 2 Server 3 Server 4

PCIe Switch

NVMe over Fabric / PCIe / Ethernet

GPU Pool

FPGA Pool 1 GPU Pool 2

Page 7: Inspur OAI Product Introduction

2018/11

World’s First 21” OAI Reference System

2020/2

54V OAM Power on

2019/3, 2019/5, 2019/8, 2019/9, 2019/11, 2020/5, 2020/8

OAI Reference System MX1

OCP Certified 2S Compute Node ON5263M5 (San Jose)

High Density Whisper Cable

Whisper Connector

Front I/O connector

QSFP-DD Connector for OAM Expansion

4 x HHHL PCIe Expansion

1570W without OAMs

141mm

35 ℃ Ambient Temperature Supported

Page 8: Inspur OAI Product Introduction

Product Model: MX1

Chassis: 21” 3OU rack mount

Dimensions: 537 (W) x 141 (H) x 803 (D) mm

Connection with compute node: Up to PCIe Gen4 x32

OAM support: Max 8 x 48~54V OAM (up to 450W each); max 8 x 12V OAM (up to 350W each)

Power without OAM: 1570W

PCIe switch: Supports PCIe Gen4 (100 lanes/chip)

PCIe re-timer: Supports PCIe Gen4 x16

PHY re-timer: 56Gbps PAM-4 or 10/28Gbps NRZ x16

Expansion slots: Up to 4 x PCIe Gen4 x16 low-profile standard cards

BMC: AST2520

I/O: Dongle connector for dedicated NIC and USB; UID/PWR button with LED; QSFP-DD x8 for OAM scale-out; micro USB x2 for OAM debug

Ambient working temperature: 5-35 ℃
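As a rough check on the host link listed in the specification ("Up to PCIe Gen4 x32"), the sketch below estimates raw one-direction link bandwidth. It relies only on the standard PCIe Gen4 figures (16 GT/s per lane, 128b/130b encoding); the function and constant names are illustrative and not part of any Inspur tool, and the result ignores packet and protocol overhead, so usable throughput will be lower.

```python
# Standard PCIe Gen4 link parameters (not specific to the MX1).
GEN4_GT_PER_S = 16.0            # raw transfer rate per lane, in GT/s
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie_gen4_bandwidth_gb_s(lanes: int) -> float:
    """Raw one-direction bandwidth in GB/s for a Gen4 link of the given width."""
    if lanes <= 0:
        raise ValueError("lanes must be a positive integer")
    # Divide by 8 to convert gigabits per second to gigabytes per second.
    return lanes * GEN4_GT_PER_S * ENCODING_EFFICIENCY / 8


if __name__ == "__main__":
    for width in (16, 32):
        print(f"Gen4 x{width}: {pcie_gen4_bandwidth_gb_s(width):.1f} GB/s per direction")
```

Under these assumptions, a x32 host connection has twice the raw bandwidth of a single x16 expansion slot.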

INSPUR CONFIDENTIAL


Page 9: Inspur OAI Product Introduction


Block diagram components:

• Compute Node

• 54V HSC x9

• 54V to 12V VR x6

• PCIe Re-timer PT4161L x8

• OAM0 to OAM7

• QSFP-DD x8

• IB Module x2

• PCIe Switch PM42100 x4

• PHY Re-timer CRT50216P x4

• BMC AST2520 (Management)

• CPLD x3

• Power Monitor

• Boards: UBB, HIB, PDB

• Bus labels: I2C, I2C/JTAG

• Signal types (legend): PCIe x16, Management, SerDes

Page 10: Inspur OAI Product Introduction

AIOps and Mgmt

Open RMC

Open BMC

Physical Infrastructure Manager

Open standard mgmt. interface

For

• Solutions for implementing rack Mgmt at the node level

• Southbound manages system resources; northbound presents information

• Meets the needs of Mgmt encryption and resource pooling

• Reliance on vendor maintenance for the traditional BMC code base

• Traditional BMC code is complex to modify for new HW

• Poor readability of IPMI tool binary code

For

To build a smarter DC

Automated Asset Mgmt

Intelligent Alarm and Fault Mgmt

One-click Upgrade Mgmt

Visual 3D Mgmt

100K-unit scale Mgmt capability

Page 11: Inspur OAI Product Introduction


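• RMC Web Server: Rack Manager controller service based on OpenBMC

• RMC Web UI: resource collection and service configuration files

• Southbound interface: supports the Redfish RESTful API

• Northbound interface: supports the Redfish RESTful API and extends the service configuration files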
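To illustrate what a Redfish RESTful interface looks like from a client, here is a minimal sketch using only the Python standard library. The `/redfish/v1/` service root and the `Members` / `@odata.id` collection layout come from the DMTF Redfish standard; the host name `rmc.example`, the choice of the `Chassis` collection, and the member IDs are assumptions for illustration, since the slides do not say which resources the RMC exposes. Authentication and TLS configuration are deliberately left out.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen

REDFISH_ROOT = "/redfish/v1/"  # service root path defined by the Redfish standard


def collection_url(base_url: str, collection: str) -> str:
    """Build the URL of a top-level Redfish collection such as 'Chassis'."""
    return urljoin(base_url, REDFISH_ROOT + collection)


def member_paths(collection_payload: dict) -> list:
    """Return the '@odata.id' link of every member in a Redfish collection."""
    return [m["@odata.id"] for m in collection_payload.get("Members", [])]


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a Redfish resource and decode its JSON body (no auth handling)."""
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical payload shaped like a Redfish collection response.
    sample = {
        "Members@odata.count": 2,
        "Members": [
            {"@odata.id": "/redfish/v1/Chassis/oai-0"},
            {"@odata.id": "/redfish/v1/Chassis/oai-1"},
        ],
    }
    print(collection_url("https://rmc.example", "Chassis"))
    print(member_paths(sample))
```

Against a live endpoint, `fetch_json(collection_url(base, "Chassis"))` would supply the payload that `member_paths` walks; the sample dictionary stands in for that response here.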

Diagram: User Tools and the RMC Web UI connect to the RMC Web Server over Redfish; the RMC Web Server connects to each BMC over Redfish.

100G Switch

1G Mgmt Switch

OAI system

Compute node x2

OAI system

OAI system

Compute node x2

OAI system

Power Shelf

48VDC Open Rack

1 pair of 48V bus bars

1 shelf per rack

Power Shelf

33kW (12 x PSU)

40V-58V

93mm (H, 2OU) x 537mm (W, 21”) x 586mm (D)

System Devices

Inspur 3OU OAI systems x4

Inspur 2OU compute node x4
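The rack figures above can be combined into a back-of-the-envelope power budget. The sketch below uses only numbers stated in this deck (33kW shelf with 12 PSUs, four OAI systems, 1570W per system without OAMs, up to eight 48~54V OAMs at 450W each). It assumes the full 33kW is deliverable with all 12 PSUs active, ignores conversion losses, and leaves compute-node and switch power as an unknown remainder because the deck does not state them. All names are illustrative.

```python
# Figures quoted in the deck.
SHELF_CAPACITY_W = 33_000   # power shelf rating
PSU_COUNT = 12              # PSUs per shelf
OAI_SYSTEMS = 4             # 3OU OAI systems per rack
OAI_BASE_W = 1570           # per-system power without OAMs
OAMS_PER_SYSTEM = 8         # maximum OAMs per system
OAM_MAX_W = 450             # per-OAM ceiling for 48~54V modules


def per_psu_rating_w() -> float:
    """Average rating per PSU, assuming all PSUs share the load equally."""
    return SHELF_CAPACITY_W / PSU_COUNT


def oai_load_ceiling_w() -> int:
    """Worst-case draw of all OAI systems with every OAM at its ceiling."""
    return OAI_SYSTEMS * (OAI_BASE_W + OAMS_PER_SYSTEM * OAM_MAX_W)


def remaining_budget_w() -> int:
    """Shelf capacity left for compute nodes and switches (power not given)."""
    return SHELF_CAPACITY_W - oai_load_ceiling_w()


if __name__ == "__main__":
    print(f"Per-PSU rating:        {per_psu_rating_w():.0f} W")
    print(f"OAI systems (ceiling): {oai_load_ceiling_w()} W")
    print(f"Remaining budget:      {remaining_budget_w()} W")
```

If some PSUs are reserved for redundancy, the usable capacity and therefore the remaining budget would be lower than this estimate.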

Page 12: Inspur OAI Product Introduction