12/03/03 Lisa Curhan
Blade Server DSM Analysis
Lisa Curhan
As a product engineer for Volume Products operations at Sun, I act as an organizational interface between the hardware and software design groups and the manufacturing organization. The engineers in my group need to be product generalists who know enough about both the hardware and software architecture to design good test processes for manufacturing and to act as information conduits between the external manufacturers and engineering. We also try to influence the hardware and software design for ease of manufacture and test. We therefore have a strong interest in the architecture of the system and in how the system development process evolves.
Courtesy of Lisa Curhan. Used with permission.
Outline
• What is a Blade Server?
• High Level Object-Process Diagram
• Organizational Resources
• Project Challenges
• Component Relationship Diagrams
• Initial Component DSM
• Partitioned/Torn Component DSM
• Present Organizational DSM
• Comparison of Organization and Architecture
• Project Management Implications
• Conclusions
What is a Blade Server?
As you can see, a blade server chassis looks like a library shelf full of books. Each “book” is an independently functional computer system.
The chassis supplies power, networking capability and system management in addition to the blade modules. The idea behind this form for a server is increased density and expandability. The challenges are cooling, interconnecting the system and management. This type of system has proven popular with financial institutions, and forecasts are that the market will grow. This kind of system was actually popularized by startup companies (Egenera and RLX).
The blade project in development is a distant cousin to this one, which is in production.
Object-Process Diagram
[Figure: object-process diagram. Processes: Information Processing, Computation, Information Storage, External Communication, Internal Communication, System Monitoring and Control, Mechanical Containment and Cooling. Objects: CPU/Memory Blades, Backplane, I/O Modules, System Controller Board, Switch Board, Chassis, Server Software, Server Firmware.]
This very high-level object-process diagram shows how the hardware is fairly well modularized by major process, but the software and firmware effort is spread over many processes. The split between software and firmware seems arbitrary, but it is a traditional organizational division.
Organizational Resources
[Figure: organization chart mapping project work to Sun organizations and sites. Legible labels include the organizations Enterprise Systems, Network Systems, Entry Servers, HEES and Software; the work areas HW design, boot FW, platform code, blade FW, SC FW and self tests, SC SW, OS mgmt SW, OS I/O layers, I/O firmware layers, drivers, external I/O layer and Solaris; and the sites MA, CA, UK, Norway and India.]
The software effort for this project is spread all over the globe, and there are 8 to 10 software managers involved. The hardware effort has several groups involved, but most of the resources are in Massachusetts. This makes software coordination a complex effort for the project managers. Communication between geographically distant and organizationally distant groups is somewhat constrained, which can lead to rework and delay.
Blade Component Relationships
[Figure: blade component relationship diagram; connecting lines denote an interface or dependency. Components: NETWORK ACCESS, HIGH SPEED IO ACCESS, IO FRAMEWORK, ASICx DRIVER, ASICy DRIVER, ASICz DRIVER, OS KERNEL AND FAULT MANAGEMENT, STORAGE ACCESS, BLADE BOOT FIRMWARE, BLADE SELF TEST, ASICx FIRMWARE, ASICy FIRMWARE, DIAGNOSTIC APPLICATION, DIAGNOSTIC SCRIPT, BLADE HARDWARE. External connection: To Network.]
This is a very generic representation of the software components on the blade, showing their interactions with the hardware and with each other. Even in this very simplified form you can see that more than one level of the software interacts with the hardware. I’ve portrayed the hardware as a single block because it’s basically all coupled internally.
SC Component Relationships
[Figure: system controller (SC) component relationship diagram; connecting lines denote an interface or dependency. Components: SYSTEM MANAGEMENT APPLICATIONS AND USER INTERFACE, EXTERNAL MANAGEMENT PROTOCOL, FAULT MANAGEMENT FRAMEWORK, BLADE MONITOR, SWITCH MONITOR, CHASSIS MONITOR, SELF TESTS, BOOT FIRMWARE, EMBEDDED KERNEL/DRIVERS, SC HARDWARE, SWITCH HARDWARE, CHASSIS HARDWARE. External connections: To BLADE FIRMWARE, MULTISYSTEM MANAGEMENT.]
Here is a similar chart of the system controller. You can see it has monitoring functions, and its upper levels communicate externally to the chassis. The SC talks to both the blades and the switch. A complication of this project is that all the hardware is basically coupled due to both electrical and mechanical interactions. I haven’t bothered to show a hardware diagram because you’ll see the hardware on the DSM as a highly coupled block. Also, I’m more interested in the hardware/software relationships, which usually aren’t as closely examined.
Switch Component Relationships
[Figure: switch component relationship diagram; connecting lines denote a dependency or interface. Components: SYSTEM MANAGEMENT APPLICATIONS AND USER INTERFACE, EXTERNAL MGMT PROTOCOL, IO RESOURCE MANAGEMENT, PLATFORM INTERFACE, MANAGEMENT LAYERS, DEVICE INTERFACE LAYER, SWITCH COMPONENT DRIVERS, MGMT AGENT, EMBEDDED KERNEL, SWITCH BOOT FW, SWITCH SELF TEST, SWITCH FAULT MANAGEMENT, SWITCH HARDWARE, IO LINK. External connections: To IO MODULE, MULTI-SYSTEM MANAGEMENT, To Network.]
The switch module has many layers of software to handle communications both internally and externally to the chassis. Notice that it handles communication between blades. It may not be a good commentary on our project documentation, but I had to build block diagrams similar to these myself from interviews. There was no single document that showed these relationships.
Initial Component DSM
[Figure: initial component DSM. Block labels: Blade SW Team, Switch SW Team, System Controller SW, Hardware Team.]
The initial DSM is ordered by the firmware and software associated with each major functional board, with the hardware as a block at the end. This is somewhat similar to the organizational arrangement of this project. It is actually not a bad clustering, except that the hardware and software interactions are clearly not taken into account. This clustering is roughly analogous to the project team structure.
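As a rough illustration of a team-ordered component DSM, the Python sketch below builds a small binary matrix from a list of dependencies and counts how many marks fall inside a team's block versus across team boundaries. The component names loosely echo the diagrams, but the team assignments and dependency pairs are invented for illustration and are not the project's data.

```python
# Illustrative only: team assignments and dependency pairs are assumptions,
# not data from the project described in the slides.
TEAMS = {
    "Blade SW": ["blade_boot_fw", "blade_self_test", "os_kernel"],
    "Switch SW": ["switch_boot_fw", "switch_drivers"],
    "SC SW": ["sc_firmware", "blade_monitor"],
    "Hardware": ["blade_hw", "switch_hw", "sc_hw"],
}

# (a, b) means "a depends on or interfaces with b".
DEPENDENCIES = [
    ("blade_boot_fw", "blade_hw"),
    ("blade_self_test", "blade_hw"),
    ("os_kernel", "blade_boot_fw"),
    ("switch_boot_fw", "switch_hw"),
    ("switch_drivers", "switch_hw"),
    ("sc_firmware", "sc_hw"),
    ("blade_monitor", "blade_boot_fw"),
    ("blade_hw", "switch_hw"),
    ("switch_hw", "blade_hw"),
    ("sc_hw", "switch_hw"),
    ("switch_hw", "sc_hw"),
]


def build_dsm(teams, dependencies):
    """Return (component order, square 0/1 matrix) with rows in team order."""
    order = [c for members in teams.values() for c in members]
    index = {c: i for i, c in enumerate(order)}
    dsm = [[0] * len(order) for _ in order]
    for a, b in dependencies:
        dsm[index[a]][index[b]] = 1  # the row component depends on the column component
    return order, dsm


def count_marks(teams, order, dsm):
    """Count marks inside team blocks and marks that cross team boundaries."""
    team_of = {c: t for t, members in teams.items() for c in members}
    inside = outside = 0
    for i, a in enumerate(order):
        for j, b in enumerate(order):
            if dsm[i][j]:
                if team_of[a] == team_of[b]:
                    inside += 1
                else:
                    outside += 1
    return inside, outside


order, dsm = build_dsm(TEAMS, DEPENDENCIES)
inside, outside = count_marks(TEAMS, order, dsm)
print(f"marks inside team blocks: {inside}, crossing team blocks: {outside}")
```

In this toy data the hardware block is internally coupled, while most firmware-to-hardware marks land outside the team blocks, which is the kind of pattern a team-ordered DSM tends to hide.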
Partitioned Component DSM
[Figure: partitioned component DSM. Cluster labels: Hardware, Switch SW, Blade OS and Fault Management, External Communication, SC Monitoring/Management.]
I had a lot of trouble with the partitioning and tearing of this DSM, as there are so many interactions. This attempt is not ideal, but it does present some interesting opportunities. The upper left-hand corner of the green square presents a cluster involving SC and switch firmware that overlaps with the SC and switch part of the hardware cluster (the first purple square inside the green square). This shows us that the hardware cluster, which is tightly coupled, is also coupled with a set of key firmware and software applications. You can note this again by observing the “torn” hardware elements on the right-hand side of the green square.
I would have preferred that the blade self test turned out to be blocked with the blade hardware at level zero, but it is in the larger square because the coupling between hardware components is even stronger than that between firmware and hardware. The larger purple square under the hardware group is what I call the “fault management” cluster. It shows that the fault management software is tightly coupled to several other pieces of code.
The last purple group on the right/bottom is an “external communication” cluster. This involves code that works to communicate between systems. I think the main advantage of this partition is that the large group in the middle depends on few items that lie outside the square.
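One common way to approach the partitioning step is to find groups of mutually dependent components (cycles) and then sequence those groups so that each depends only on earlier ones. The sketch below does this with Tarjan's strongly connected components algorithm. It is not the method used by the tool behind these slides, it does not attempt tearing, and the component names and dependencies are hypothetical.

```python
def partition(deps):
    """Group mutually dependent components and order the groups.

    deps maps each component to the set of components it depends on.
    Returns a list of blocks; every block depends only on itself and on
    blocks that appear earlier in the list (Tarjan's SCC algorithm).
    """
    index, low, on_stack = {}, {}, set()
    stack, blocks = [], []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in sorted(deps.get(node, ())):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:
            # node is the root of a coupled block: pop the whole block.
            block = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                block.append(member)
                if member == node:
                    break
            blocks.append(sorted(block))

    for node in sorted(deps):
        if node not in index:
            visit(node)
    return blocks


# Hypothetical dependencies, chosen only to show a coupled hardware block
# and a coupled fault management block.
DEPS = {
    "blade_hw": {"switch_hw", "sc_hw"},
    "switch_hw": {"blade_hw"},
    "sc_hw": {"switch_hw"},
    "sc_firmware": {"sc_hw", "fault_mgmt"},
    "fault_mgmt": {"sc_firmware", "blade_hw"},
    "external_protocol": {"fault_mgmt"},
}

blocks = partition(DEPS)
for level, block in enumerate(blocks):
    print(level, block)
```

With this toy input the hardware components form the first block, the fault management pair forms the second, and the external protocol comes last. A real 40x40 matrix with many interactions may collapse into one very large block, which is where tearing decisions become necessary.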
Organizational DSM Ordered by Project Team
[Figure: organizational DSM with outlined project team blocks: Blade, Switch, SC, Hardware.]
This organizational DSM was not based on a team communications survey. Instead, I applied a few simple rules to the organizational structure and to which components were assigned to the people in different parts of the organization. I wanted to see where communications might be hampered by organizational and geographical barriers. So I used the following rules for assumed interaction:
0 = most interaction, for groups with the same director at the same site
1 = groups within the same site with different directors
2 = groups with the same director at different sites
blank = groups with different directors at different sites
This probably exaggerates the known problem with geographical and organizational distance (especially geographical distance), but it clearly shows where there may be issues (as blank or pink spaces within the outlined project teams).
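The four rules above can be written as a small function. The following Python sketch assumes each group is described only by a director and a site; the group names, director labels and site assignments are placeholders invented for the example.

```python
from itertools import combinations
from typing import NamedTuple, Optional


class Group(NamedTuple):
    name: str
    director: str
    site: str


def assumed_interaction(a: Group, b: Group) -> Optional[int]:
    """Apply the slide's rules; None stands for a blank cell."""
    same_director = a.director == b.director
    same_site = a.site == b.site
    if same_director and same_site:
        return 0  # most interaction
    if same_site:
        return 1  # same site, different directors
    if same_director:
        return 2  # same director, different sites
    return None   # different directors at different sites


# Placeholder groups: directors and site assignments are assumptions.
GROUPS = [
    Group("blade_fw", "dir_A", "MA"),
    Group("boot_fw", "dir_A", "MA"),
    Group("sc_fw", "dir_B", "MA"),
    Group("hw_design", "dir_A", "CA"),
    Group("io_drivers", "dir_C", "India"),
]

pairs = {
    (a.name, b.name): assumed_interaction(a, b)
    for a, b in combinations(GROUPS, 2)
}

for (a, b), level in pairs.items():
    print(f"{a:10s} {b:10s} {'blank' if level is None else level}")
```

Because the rules look only at director and site, the result is a proxy for likely communication rather than a measurement of it.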
Organizational DSM Rearranged by Component Function
This comparison of an organizational DSM arranged in the same order as the functional partitioned blocking shows that the fault management function has some challenges with regard to geographical and organizational impediments to communication.
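A comparison of this kind can be sketched as an overlay: for every component dependency, look up the groups that own the two components and flag the dependency if their assumed interaction is weak or blank. The ownership map, interaction levels and dependencies below are hypothetical, and the threshold of 2 is an assumption made for the example.

```python
# Hypothetical ownership, interaction levels and dependencies.
OWNER = {
    "fault_mgmt_framework": "sc_sw",
    "blade_monitor": "sc_sw",
    "switch_fault_mgmt": "switch_sw",
    "os_fault_mgmt": "os_sw",
}

# Assumed interaction between groups: 0, 1, 2, or None for blank.
ORG_LEVEL = {
    frozenset({"sc_sw", "switch_sw"}): 2,
    frozenset({"sc_sw", "os_sw"}): None,
    frozenset({"switch_sw", "os_sw"}): 1,
}

DEPENDENCIES = [
    ("fault_mgmt_framework", "switch_fault_mgmt"),
    ("fault_mgmt_framework", "os_fault_mgmt"),
    ("blade_monitor", "fault_mgmt_framework"),
    ("switch_fault_mgmt", "os_fault_mgmt"),
]


def risky_interfaces(dependencies, owner, org_level, threshold=2):
    """Return dependencies whose owning groups interact weakly or not at all.

    A dependency is flagged when the owners differ and their assumed
    interaction is blank (None, including unknown pairs) or >= threshold.
    """
    flagged = []
    for a, b in dependencies:
        group_a, group_b = owner[a], owner[b]
        if group_a == group_b:
            continue  # same group: no organizational barrier assumed
        level = org_level.get(frozenset({group_a, group_b}))
        if level is None or level >= threshold:
            flagged.append((a, b, level))
    return flagged


flagged = risky_interfaces(DEPENDENCIES, OWNER, ORG_LEVEL)
for a, b, level in flagged:
    print(f"{a} -> {b}: {'blank' if level is None else level}")
```

Flagged entries are candidates for a working group or a shared interface document rather than proof of a communication failure.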
Conclusions
• Present teams may work as long as the hardware group acts as an integration team for software activities. They must remain involved, and software teams, especially blade software, must stay on top of hardware issues.
• Switch and SC teams may want to form some working groups to deal with fault management issues and other linkages, or form a cross-functional fault management team.
• Blade and Switch Software teams have some geographical and organizational communication challenges to deal with.
• Fault management function has a lot of communication challenges as well.
Other conclusions:
This sort of project would be most effective when the architecture has been discussed but project teams are not yet firmed up. It is usually unwise to re-arrange teams in mid-project.
I need better graphics software for presentations to deal with images like this! The PSM32 student version is not sufficient for a project of this size. It also needs better import/export functions.
Project Extension
• Get software that can work with larger datasets to look at the real 70x70 relationship matrix.
• Extract better information on the strength of functional relationships from the project team.
• Pursue the issue of insufficient platform-level software architecture documentation with the team. I think a minimal investment in documents would save the team time over the course of the project.
I was worried that the component mergers I needed to fit into a 40x40 matrix, in order to use the available software, led to some missing information or even false conclusions. For example, the geographical and organizational challenges may be even greater than shown, because some merged functional component groups in this component list are actually divided between geographical locations.
Questions?