Naohiro Tamura
Professional Engineer
Fujitsu Limited
Ironic
towards truly open and reliable,
eventually for mission critical
Copyright 2015 FUJITSU LIMITED
OpenStack Summit October 2015 Tokyo
Thursday, October 29 • 11:00am - 11:40am
TOC
Introduction
Who am I?
What is Fujitsu good at?
Vision
Customer Values: What are the most important for customer?
Mid Term Vision: Truly Open and Reliable
Long Term Vision: Eventually Mission Critical
Contribution
What we have done and are doing
What we are going to do
Conclusion
Who am I?
I joined the Ironic community at the beginning of the Kilo development cycle, and I focus on Ironic driver development.
Before that, I mainly worked on proprietary software development for system management.
I developed bare metal provisioning and IO virtualization for N+1 server redundancy by enhancing a PXE server on legacy BIOS and UEFI.
OpenStack is my first open source project; it’s a whole new experience
Joyful part – working with talented and smart people
Interesting phenomenon – bikeshedding
Contacts
Email: [email protected] IRC: naohirot
What is Fujitsu good at?
Fujitsu sustains many social infrastructures
Banking
Stock exchange
Factory automation
Government agency system
For example, Tokyo Stock Exchange Trading System
Highly Reliable IA x64 server (PRIMEQUEST/PRIMERGY)
Open Source Linux (RHEL)
Fujitsu’s in-memory database system
Fujitsu is good at mission-critical systems
Three Customer Values We See
Truly open, no vendor lock-in
to give customers the freedom to switch to any vendor at any time
Reliable, robust, highly available system
to operate the customer's business continuously
Responsive, responsible, competent support
to resolve the customer's incidents quickly and accurately.
Mid Term Vision : Truly Open and Reliable
What should truly open and reliable be?
The Android Robot logo is licensed under the terms of the Creative Commons Attribution license
[Figure: "Used to be" vs. "Current"]
Mid Term Vision : Truly Open and Reliable
Android as a concrete reference model
2015Q2 WW Smartphone Shipments: Android 82.8%, iOS 13.9%
Source: Worldwide Smartphone Growth Expected to Slow to 10.4% in 2015, Down From 27.5% Growth in 2014, According to IDC http://www.idc.com/getdoc.jsp?containerId=prUS25860315
No vendor lock-in.
Customers can switch from one vendor to another whenever they want, because the platform is truly open and reliable.
Current Future
Mid Term Vision : Truly Open and Reliable
OpenStack would be in the same situation as Android
Big Four
Customers can switch from one cloud to another whenever they want, if it’s truly open and reliable.
Interoperability is key for no vendor lock-in among public, private and hybrid clouds.
Mid Term Vision : Truly Open and Reliable
Situation Comparison between Android and Ironic
Android
Android defines the hardware.
Customers can switch at any time to whichever vendor they like.
From the UI’s point of view, all Android smartphones have the same functionality.
Hardware reliability and support responsiveness differ among vendors.
Ironic
Ironic will define the datacenter server hardware specification if the market accepts Ironic.
Customers will be able to switch at any time to whichever vendor they like.
From the point of view of the API, CLI, and UI, Ironic will offer the same functionality for all bare metal servers.
Hardware reliability and support responsiveness will differ among vendors.
Mid Term Vision: Truly Open and Reliable
How can we achieve Truly Open and Reliable?
Mid Term Vision: Truly Open and Reliable
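1. First of all, we need to complete the following table to create the same situation as Android, that is, a situation with no vendor lock-in.
Current status as of Liberty: Ironic BMC Driver Implementation
Legend: ✔ done, △ ongoing, × not yet, - not applicable. Green: Contributed, Yellow: Contributing, Pink: plan to contribute
+-----------------+---------+--------+-----------+-----------+-----------+-----------+-----------+-----------+
| State           | I/F     | method | IPMI      | AMT       | DRAC      | iLO       | iRMC      | UCS       |
+-----------------+---------+--------+-----------+-----------+-----------+-----------+-----------+-----------+
| Power On/Off    | Power   | hard   | ✔         | ✔ kilo    | ✔         | ✔         | ✔ kilo    | ✔ liberty |
| Power On/Off    | Power   | soft   | △ liberty | ×         | ×         | ×         | △ liberty | ×         |
| Power Off to On | Boot    | pxe    | ✔         | ✔ kilo    | ✔         | ✔         | ✔ kilo    | ✔ liberty |
| Power Off to On | Boot    | vmedia | -         | ×         | ×         | ✔         | ✔ liberty | ×         |
| Deploy          | Deploy  | iscsi  | ✔         | ✔ kilo    | ✔         | ✔         | ✔ kilo    | ✔ liberty |
| Deploy          | Deploy  | agent  | ✔         | ×         | ×         | ✔         | ✔ liberty | ✔ liberty |
| Active          | Mgmt    | oob    | ✔         | ✔ kilo    | ✔         | ✔         | ✔ kilo    | ✔ liberty |
| Inspect         | Inspect | ib     | ✔ kilo    | ×         | ✔ kilo    | △ liberty | ×         | ×         |
| Inspect         | Inspect | oob    | -         | ×         | ×         | ✔ kilo    | △ liberty | △ liberty |
| Clean           | Clean   | iscsi  | ×         | ×         | ×         | ✔ kilo    | ×         | ×         |
| Clean           | Clean   | agent  | ✔ kilo    | ×         | ×         | ✔ kilo    | ×         | ×         |
| Zap             | Raid    | ib     | ×         | ×         | ×         | △ liberty | ×         | ×         |
| Zap             | Raid    | oob    | -         | ×         | ×         | ×         | ×         | ×         |
| Rescue          | Rescue  | pxe    | ×         | ×         | ×         | ×         | ×         | ×         |
| Rescue          | Rescue  | vmedia | -         | ×         | ×         | ×         | ×         | ×         |
+-----------------+---------+--------+-----------+-----------+-----------+-----------+-----------+-----------+
2. And then enhance proactive/reactive features to achieve higher reliability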
Long Term Vision: Eventually Mission Critical
Can you imagine the Tokyo Stock Exchange Trading System running on Ironic?
No, I can’t right now. The stock market involves a huge amount of investment money.
Our vision is really challenging: “Ironic for Mission Critical”.
We believe that it’s difficult for a single company to achieve this vision alone.
But we believe that we can achieve this vision as a community, because there are a lot of things to be done.
https://ja.wikipedia.org/wiki/%E6%9D%B1%E8%A8%BCArrows
To realize the mid term vision:
1) Complete the table for truly open
2) Proactive/reactive features for reliability
What we have done and are doing
Virtual Media Deployment
• Out of Band Boot
Soft Power Off and Inject NMI
• Power Control Finite State Machine
• Abort Task
What we are going to do
Rescue Mode in Tenant Network
• Repair Instance Image in Cinder by Virtual Media Boot
Bare Metal N+1 Redundancy
• Cold Migration by Soft Power Off and Virtual Media Boot
Virtual Media Deployment
Virtual Media enables Out of Band (OOB) Boot
It suits multi-tenant and networked-storage environments
Element Technology
Note: Ironic Deploy Basics
Boot methods
PXE (network) - IB (In Band)
Virtual Media - OOB (Out Of Band)
Types of Image
Deploy image (Deploy ramdisk)
User image (Boot ramdisk, Instance boot image, OS instance)
Deploy methods
iSCSI
Ironic Python Agent (http/https)
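The boot and deploy methods above combine per driver. The following is a minimal sketch of that idea, not Ironic code: `DriverProfile`, `PROFILES` and `boot_band` are hypothetical names introduced for illustration, and only the driver names `iscsi_irmc` and `agent_irmc` and the method names come from the slides.

```python
from dataclasses import dataclass

# Boot methods and their band, as listed on the slide.
BOOT_BAND = {"pxe": "in-band", "vmedia": "out-of-band"}


@dataclass(frozen=True)
class DriverProfile:
    """Hypothetical summary of how a driver boots and deploys a node."""
    boot: str    # "pxe" or "vmedia"
    deploy: str  # "iscsi" or "agent"


# The two iRMC virtual media drivers discussed in this talk.
PROFILES = {
    "iscsi_irmc": DriverProfile(boot="vmedia", deploy="iscsi"),
    "agent_irmc": DriverProfile(boot="vmedia", deploy="agent"),
}


def boot_band(driver: str) -> str:
    """Return whether the driver boots the deploy image in band or out of band."""
    return BOOT_BAND[PROFILES[driver].boot]
```

Both iRMC drivers share the out-of-band boot method and differ only in how the user image reaches the local disk.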
Virtual Media Deployment
How does iscsi_irmc driver work?
Components: Ironic API, Ironic Conductor, Image Service (Instance Boot Image), CIFS/NFS (deploy.iso, floppy.img), Bare Metal Server (BMC, Local disk), Management Network, Tenant Network
1) Create virtual floppy and copy it into CIFS/NFS
2) Attach virtual cdrom and floppy (Mount cd/fd)
3) Boot from virtual cdrom
4) Export local disk as iscsi target, Call Ironic API to continue
5) Dispatch Ironic API call to conductor
6) Call Image service
7) Download boot image
8) Attach local disk by iscsi, DD boot image to local disk
9) Boot from local disk
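The nine iscsi_irmc steps can be traced with a small simulation. This is a sketch under stated assumptions, not the driver's implementation: `FakeNode` and `deploy_iscsi_vmedia` are hypothetical names, the share and image service are plain dictionaries, and the contents of the virtual floppy are assumed.

```python
class FakeNode:
    """Stand-in for a bare metal server plus its BMC (illustration only)."""

    def __init__(self):
        self.virtual_media = []   # images the BMC has attached
        self.boot_device = None   # "cdrom" or "disk"
        self.local_disk = None    # contents written to the local disk


def deploy_iscsi_vmedia(node, share, image_service, image_name):
    """Walk through steps 1-9 and return a log of what happened."""
    log = []

    # 1) Create the virtual floppy and copy it to the CIFS/NFS share.
    #    What the floppy carries (deploy parameters) is an assumption.
    share["floppy.img"] = {"image": image_name}
    log.append("1 floppy created")

    # 2) Attach virtual cdrom and floppy through the BMC.
    node.virtual_media = ["deploy.iso", "floppy.img"]
    log.append("2 media attached")

    # 3) Boot the deploy image from the virtual cdrom.
    node.boot_device = "cdrom"
    log.append("3 booted from cdrom")

    # 4-5) The deploy image exports the local disk as an iSCSI target and
    #      calls the Ironic API, which dispatches the call to the conductor.
    log.append("4 disk exported")
    log.append("5 call dispatched")

    # 6-7) The conductor calls the image service and downloads the image.
    image = image_service[image_name]
    log.append("6 image service called")
    log.append("7 image downloaded")

    # 8) The conductor attaches the disk by iSCSI and writes the image.
    node.local_disk = image
    log.append("8 image written")

    # 9) Boot from the local disk.
    node.boot_device = "disk"
    log.append("9 booted from disk")
    return log
```

In this flow the conductor does the image transfer itself, which is the main contrast with the agent-based driver.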
Virtual Media Deployment
How does agent_irmc driver work?
Components: Ironic API, Ironic Conductor, Image Service (Instance Boot Image), CIFS/NFS (deploy.iso, floppy.img), Bare Metal Server (BMC, Local disk), Management Network, Tenant Network
1) Create virtual floppy and copy it into CIFS/NFS
2) Attach virtual cdrom and floppy (Mount cd/fd)
3) Boot from virtual cdrom
4) Export IPA (Ironic Python Agent) API, Call Ironic API to heartbeat
5) Dispatch Ironic API call to conductor
6) Call IPA API to start boot image download
7) Download boot image by HTTP to local disk
8) Call IPA API to see if deploy has been done
9) Boot from local disk
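Steps 4 to 9 of the agent_irmc flow are driven by agent heartbeats. The sketch below illustrates that control loop only; `FakeAgent`, `on_heartbeat` and the `phase` values are hypothetical names, and the synchronous download stands in for the HTTP transfer the agent performs in reality.

```python
class FakeAgent:
    """Stand-in for the Ironic Python Agent running on the node."""

    def __init__(self, images):
        self._images = images   # plays the role of the image service
        self.local_disk = None
        self.status = "idle"

    def start_download(self, image_name):
        # 6-7) The agent itself fetches the image and writes the local disk.
        self.status = "running"
        self.local_disk = self._images[image_name]
        self.status = "done"

    def deploy_done(self):
        # 8) The conductor asks whether the deploy has finished.
        return self.status == "done"


def on_heartbeat(agent, image_name, node_state):
    """Conductor-side handler for one agent heartbeat (steps 4-5)."""
    if node_state["phase"] == "waiting":
        agent.start_download(image_name)
        node_state["phase"] = "deploying"
    elif node_state["phase"] == "deploying" and agent.deploy_done():
        # 9) Switch the node to boot from its local disk.
        node_state["boot_device"] = "disk"
        node_state["phase"] = "active"
    return node_state
```

Because the agent pulls the image, the conductor only issues commands and polls, instead of moving image data itself.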
Use cases of Soft Power Off and Inject NMI*
In what situation or scenario does Soft Power Off help?
Unscheduled hardware maintenance, because the cloud provider cannot log on to the customer’s instance.
Scheduled hardware maintenance, when the customer didn’t shut down the instance.
In what situation or scenario does Inject NMI help?
Cloud provider support can ask the customer to provide an OS dump to resolve the customer's incident quickly and accurately.
Customers can investigate the problem by themselves while keeping sensitive business data, such as credit card numbers, to themselves.
*NMI: Non-maskable interrupt https://en.wikipedia.org/wiki/Non-maskable_interrupt
Benefits of Soft Power Off and Inject NMI
Soft Power Off protects customer’s data
Current Power Control is “hard” only. Imagine an in-memory database is running; it’s a very dangerous operation!
• ironic node-set-power-state off
Soft Power Off shuts down the OS gracefully, and it’s abortable
• ironic node-set-power-state soft_off
• ironic node-set-power-state abort_soft_off
Inject NMI enables responsive support
Inject NMI behaves like a reboot, and takes an OS dump when the reboot is done
• ironic node-set-power-state inject_nmi
Soft Power Off and Inject NMI
Power State and Target Power State
Power control is basic, but it is neither simple nor easy to implement
$ ironic node-show-states $NODE_UUID
+------------------------+---------------------------+
| Property               | Value                     |
+------------------------+---------------------------+
| target_power_state     | None                      |
| target_provision_state | None                      |
| last_error             | None                      |
| console_enabled        | False                     |
| provision_updated_at   | 2015-10-01T05:20:15+00:00 |
| power_state            | power off                 |
| provision_state        | available                 |
+------------------------+---------------------------+
power_state: Power On | Power Off | Error
target_power_state: power on | power off, plus the proposed soft power off | inject NMI
Soft Power Off and Inject NMI
Current Implementation
No Power Control Finite State Machine such as the one for Deployment State
Stable States: Power ON, Power OFF, Error
Existing Target States: Power ON, Power OFF
Reboot = Power Cycle (Power OFF + Power ON)
Events shown in the diagram: Timeout | IOError
Soft Power Off and Inject NMI
Proposed Implementation
Create Power Control Finite State Machine such as the one for Deployment State
The most difficult part is to support Abort
Stable States: Power ON, Power OFF, Error
Existing Target States: Power ON, Power OFF
New Target States: Power OFF SOFT, Inject NMI
Reboot = Power Cycle (Power OFF + Power ON)
Events shown in the diagram: Abort | Timeout | IOError
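The proposed power control state machine can be sketched in a few lines. This is an illustration under assumptions, not the proposed Ironic code: `Stable`, `Target`, `RESULT` and `PowerFSM` are hypothetical names; that Timeout and IOError end in Error, that Abort keeps the previous stable state, and that Inject NMI ends in Power ON are all assumptions about transitions the slide does not spell out.

```python
from enum import Enum


class Stable(Enum):
    POWER_ON = "power on"
    POWER_OFF = "power off"
    ERROR = "error"


class Target(Enum):
    POWER_ON = "power on"              # existing
    POWER_OFF = "power off"            # existing (hard)
    SOFT_POWER_OFF = "soft power off"  # new
    INJECT_NMI = "inject nmi"          # new


# Stable state reached when a target completes. Treating Inject NMI as
# ending in POWER_ON follows "behaves like reboot" and is an assumption.
RESULT = {
    Target.POWER_ON: Stable.POWER_ON,
    Target.POWER_OFF: Stable.POWER_OFF,
    Target.SOFT_POWER_OFF: Stable.POWER_OFF,
    Target.INJECT_NMI: Stable.POWER_ON,
}


class PowerFSM:
    def __init__(self, state):
        self.state = state
        self.target = None  # None while the node is in a stable state

    def start(self, target):
        """Begin a transition; only one may be in flight per node."""
        if self.target is not None:
            raise RuntimeError("transition already in progress")
        self.target = target

    def finish(self, event):
        """Resolve the transition with 'done', 'abort', 'timeout' or 'ioerror'."""
        if self.target is None:
            raise RuntimeError("no transition in progress")
        if event == "done":
            self.state = RESULT[self.target]
        elif event in ("timeout", "ioerror"):
            self.state = Stable.ERROR
        elif event != "abort":
            raise ValueError("unknown event: " + event)
        # On abort the previous stable state is kept (assumption).
        self.target = None
        return self.state
```

Making the in-flight target explicit is what gives an abort request something to cancel.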
Soft Power Off and Inject NMI
How to implement Abort
How to handle Abort/Cancel/Timeout/Exception of a background task is a common problem in concurrent programming
How should we implement it in eventlet green threads?
• CSP (Communicating Sequential Processes) Channel
• Channel Registry such as the Erlang process registry
Components: Ironic API, Ironic Conductor, Database (node state per node), Channel Registry (node uuid to channel), Soft Power OFF Task (green thread), Abort Task (green thread)
1) Soft Power Off
2) Exclusive Node Lock
3) Get Chan
4, 8) Read Chan
5) Abort
6) Get Chan
7) Send Message (Abort Message)
9) If Abort Message: Exit
10) Unlock Node
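The channel registry idea can be sketched with the Python standard library. This is a stand-in, not the proposed implementation: it uses `threading` and `queue` in place of eventlet green threads and channels, and `ChannelRegistry`, `soft_power_off`, `abort` and the polling parameters are hypothetical names chosen for illustration.

```python
import queue
import threading


class ChannelRegistry:
    """Maps a node UUID to the channel of the task working on that node."""

    def __init__(self):
        self._lock = threading.Lock()
        self._channels = {}

    def register(self, node_uuid):
        with self._lock:
            return self._channels.setdefault(node_uuid, queue.Queue())

    def lookup(self, node_uuid):
        with self._lock:
            return self._channels.get(node_uuid)

    def unregister(self, node_uuid):
        with self._lock:
            self._channels.pop(node_uuid, None)


def soft_power_off(registry, node_locks, node_uuid, result,
                   polls=50, interval=0.01):
    """Steps 1-4 and 8-10: run under the node lock, watching the channel."""
    with node_locks[node_uuid]:                       # 2) exclusive node lock
        chan = registry.register(node_uuid)           # 3) get channel
        try:
            for _ in range(polls):
                try:
                    msg = chan.get(timeout=interval)  # 4, 8) read channel
                except queue.Empty:
                    continue  # a real task would poll the BMC here
                if msg == "abort":                    # 9) exit on abort
                    result["outcome"] = "aborted"
                    return
            result["outcome"] = "timeout"
        finally:
            registry.unregister(node_uuid)
    # 10) the node lock is released when the with block exits


def abort(registry, node_uuid):
    """Steps 5-7: find the running task's channel and send the abort message."""
    chan = registry.lookup(node_uuid)                 # 6) get channel
    if chan is None:
        return False
    chan.put("abort")                                 # 7) send message
    return True
```

The abort path never takes the exclusive node lock, which is why it can reach a task that is still holding that lock.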
Rescue Mode in Tenant Network
Rescue Usecase in Multi Tenant Support
Environment: Multi Tenant, and Networked storage
The instance image is deployed by PXE (In Band) boot and a network flip
What if the instance is damaged?
Components: Ironic Conductor, Deploy Image, Neutron, L2 Switch, Cinder (Instance Boot Image), Bare Metal Server 1 (BMC), Management Network, Deploy Network, Tenant Network
Rescue Mode in Tenant Network
Multi Tenant Network Support: Provider Network
Environment: Multi Tenant, and Networked storage
Rescue Image needs Rescue Network and Tenant Network
Bare Metal Server 1 now has a different network configuration from the production environment, which could make rescue difficult
Fix the damaged instance in a different network configuration
Components: Ironic Conductor, Rescue Image, Neutron, L2 Switch, Cinder (Instance Boot Image), Bare Metal Server 1 (BMC), Management Network, Rescue Network, Tenant Network
Rescue Mode in Tenant Network
Environment: Multi Tenant, and Networked storage
Virtual Media Boot provides Out Of Band Rescue Mode with the same tenant network configuration as the real instance
Fix the damaged instance in the same network configuration
Components: Ironic Conductor, Rescue Image, CIFS/NFS (rescue.iso), Neutron, L2 Switch, Cinder (Instance Boot Image), Bare Metal Server 1 (BMC), Management Network, Rescue Network, Tenant Network
Bare Metal N+1 Redundancy
Cold Migration by Soft Power Off and Virtual Media Boot
Environment: Multi Tenant, and Networked storage
Components: Ironic Conductor, CIFS/NFS (migration.iso, floppy.img), Cinder (Instance Boot Image), Bare Metal Servers 1 … N (each with BMC), Bare Metal Server N+1 (spare server, with BMC), Management Network, Tenant Network
1) Bare Metal Server 1 is running normally
2) A sign of failure is detected
3) % ironic cold-migration “Bare Metal Server 1”
4) Graceful shutdown by Soft Power Off
5) Boot migration.iso from Virtual Media, and set IO to attach Cinder
6) Reboot from the same instance boot image in Cinder
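Steps 4 to 6 of the cold migration can be sketched as an ordering of operations on two servers. This is an illustration only: `cold_migrate` and the dictionary layout are hypothetical, and no real Ironic or Cinder call is made.

```python
def cold_migrate(servers, failing, spare, volume):
    """Move the instance from `failing` to `spare`; return the steps taken."""
    if servers[spare]["volume"] is not None:
        raise ValueError("spare server is already in use")
    steps = []

    # 4) Graceful shutdown of the failing server by soft power off.
    servers[failing]["power"] = "off"
    servers[failing]["volume"] = None
    steps.append("soft power off " + failing)

    # 5) Boot migration.iso on the spare from virtual media and point
    #    its IO at the same Cinder volume.
    servers[spare]["vmedia"] = "migration.iso"
    servers[spare]["volume"] = volume
    steps.append("boot migration.iso on " + spare)

    # 6) Reboot the spare from the instance boot image in Cinder.
    servers[spare]["vmedia"] = None
    servers[spare]["power"] = "on"
    steps.append("reboot " + spare + " from " + volume)
    return steps
```

The instance image never moves: only the server attached to the Cinder volume changes, which is why networked storage is a precondition.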
Recap
Customer Values
Truly open, no vendor lock-in
Reliable, robust, highly available system
Responsive, responsible, competent support
Mid Term Vision
Truly Open
Reliable (Proactive, Reactive)
Contribution
Virtual Media Deployment
Rescue Mode in Tenant Network
Soft Power Off and Inject NMI
Bare Metal N+1 Redundancy
Long Term Vision: Eventually for Mission Critical