
XTRATUM: AN OPEN SOURCE HYPERVISOR FOR TSP EMBEDDED SYSTEMS IN AEROSPACE

A. Crespo¹, I. Ripoll¹, M. Masmano¹, P. Arberet², and J.J. Metge²

¹ Instituto de Informática Industrial, Universidad Politécnica de Valencia, Spain
² CNES, France

ABSTRACT

XtratuM is a hypervisor designed to meet safety critical requirements. XtratuM 2.1.0 is a redesign of the former version XtratuM 2.0 (for x86 architectures). It has been ported to the SPARC v8 architecture, and in particular to the LEON2 processor, which is the reference platform for the space sector. The adaptation involved a substantial redesign effort to move closer to the ARINC-653 standard. As far as we know, XtratuM is the first hypervisor for the SPARC v8 architecture. In this paper, the main design aspects are discussed and the internal architecture is described. An initial evaluation of the most significant metrics is also provided.

Key words: Partitioning systems, hypervisors, microkernels.

1. INTRODUCTION

Although virtualisation has been used in mainframe systems since the 1960s, it was the advances in the processing power of desktop processors in the mid-1990s that opened the possibility of using it in the PC market. The embedded market is now ready to take advantage of this promising technology. Most of the recent advances in virtualization have been made for desktop systems, and transferring these results to embedded systems is not as direct as it may seem.

The current state of virtualization technology is the result of a convergence of several technologies: operating system design, compilers, interpreters, hardware support, etc. This heterogeneous origin, jointly with its fast evolution, has caused confusion in the terminology. The same term is used to refer to different ideas, and the same concept is named differently depending on the engineer's background.

A virtual machine (VM) is a software implementation of a machine (computer) that executes programs like a real machine. A hypervisor (also known as a virtual machine monitor, VMM [Gol74]) is a layer of software (or a combination of software and hardware) that allows several independent execution environments¹ to run on a single computer. The key difference between hypervisor technology and other kinds of virtualization (such as the Java virtual machine or software emulation) is performance. Hypervisor solutions have to introduce a very low overhead; the throughput of the virtual machines has to be very close to that of the native hardware. The hypervisor is a new and promising technology, but it has to be adapted and customized to the requirements of the target application. As far as we know, there are no previous experiences with hypervisors for space systems.

When a hypervisor is designed for real-time embedded systems, the main issues that have to be considered are: temporal and spatial isolation, basic resource virtualisation (clock and timers, interrupts, memory, CPU time, serial I/O), real-time scheduling policy, deterministic hypervisor system calls, efficient inter-partition communication, efficient context switches, low overhead and a low footprint.

In this paper, we present the design, implementation and evaluation of XtratuM for the LEON2 processor. Although XtratuM was initially implemented for x86 architectures, porting it to LEON2 has required a substantial redesign and implementation effort due to the architecture's constraints.

2. VIRTUALISATION TECHNOLOGIES

Depending on the resources used by the hypervisor, there are two classes of hypervisors, called type 1 and type 2. Type 1 hypervisors run directly on the native hardware (they are also named native or bare-metal hypervisors); type 2 hypervisors are executed on top of an operating system. The native operating system is called the host operating system, and the operating systems that are executed in the virtual environment are called guest operating systems.

Although the basic idea of virtualization [Cor] is widely understood: “any way to recreate an execution environment which is not the original (native) one”; there are substantial differences between the different technological approaches used to achieve this goal.

¹ We will use the terms guest, virtual machine and partition as synonyms.

Virtualization is a very active area; several competing technologies are actively being developed. There is still no clear solution, nor a technology that has won out over the rest. Some virtualization technologies are better than others for a given target. For example, on desktop systems, para-virtualization is the best choice if the source code of the virtualized environment is available; otherwise full virtualization is the only possible solution.

A detailed description and analysis of the techniques and the existing solutions is beyond the scope of this report (the reader is referred to the document “Virtualization: State of the Art” [SCO08]). The currently available solutions for real-time embedded systems can be summarised as follows:

2.1. Separation kernel

Also known as operating system level virtualization. In this approach the operating system is extended (or improved) to enforce a stronger isolation between processes or groups of processes. Each group of isolated processes is considered a partition. In this solution, all the partitions must use the same operating system; it is as if several instances of the same OS were executed on the same hardware.

An important disadvantage of this solution is the large amount of code and the high complexity (the whole OS) of the virtualizer.

2.2. Micro-kernel

This was originally an architectural solution for developing large and complex operating systems. The idea is to separate the core kernel services from the rest of the more complex and “baroque” services. The core kernel services are implemented by a layer of code called the micro-kernel, and consist of: context switching, basic memory management, and simple communication services (IPC).

Only the micro-kernel code is executed in the processor's privileged mode; the remaining operating system subsystems (scheduler, user process management, memory services, filesystem, network stack, etc.) are implemented as micro-kernel threads. The operating system itself is designed as a set of cooperating servers; each server is a thread with its own protected memory space that communicates with other servers via the micro-kernel IPC facilities. The micro-kernel implements only the mechanisms; the policies are implemented by the micro-kernel threads at user level (even the scheduling policy).

“The microkernel implements only the mechanism to select one of the time partitions as active foreground partition. The policy part of deciding which of the time partitions is activated when is left to the user level. ...”

The micro-kernel was initially presented as a solution (the right way to do it) to the supposed problems of monolithic kernels². This strong isolation between the components of the operating system, enforced by the micro-kernel, prevents an error in one component from affecting the behavior of the rest of the system. Although the micro-kernel technology was developed as a paradigm for implementing a single operating system, the services provided by the micro-kernel can be used to build several different operating systems, resulting in a virtualized system. Currently the main drawbacks are the non-negligible overhead introduced by the communication mechanism and the high abstraction level of the processor. The virtualized operating system must be heavily modified (ported) to meet the micro-kernel API and philosophy. The micro-kernel approach started with the Mach micro-kernel. The most representative implementation of a micro-kernel is L4 [Lie95].

2.3. Bare-metal hypervisor

A bare-metal hypervisor is a thin layer of software that virtualizes the critical hardware devices to create several isolated partitions. The hypervisor also provides other virtual services: inter-partition communication and partition control services.

The hypervisor does not define an abstract virtual machine, but instead tries to reuse and adapt itself to the underlying hardware as much as possible in order to reduce the virtualization overhead. In other words, the virtual machine is kept close to the native hardware so that the native hardware can be used directly as much as possible, without jeopardizing the temporal and spatial isolation. Several such hypervisors are available for general purpose operating systems.

2.4. Para-virtualization

The para-virtualization technique (a term coined in the Xen [DFH+03] project) consists of explicitly replacing the conflicting instructions³ with functions provided by the hypervisor. In this case, the partition code has to be aware of the limitations of the virtual environment and use the hypervisor services. Those services are provided through a set of hypercalls.

The hypervisor is still in charge of managing the hardware resources of the system, and of enforcing the spatial and temporal isolation of the guests. Direct access to the native hardware is not allowed.

Para-virtualization is the technique that best fits the requirements of embedded systems: it is faster, simpler and smaller, and the customization (para-virtualization) of the guest operating system is not a problem because the source code is available. Also, this technique does not require special processor features that may increase the cost of the product.

² Linux is a monolithic kernel.
³ Conflicting instructions: instructions that operate directly on the native hardware and may break the isolation.

2.5. Dedicated devices

In the server and desktop segments, the virtualizer provides a complete (or full) virtualized environment for each virtual machine. That is, each virtual machine is fully isolated from the native hardware and has no direct access to any peripheral.

Some virtualizers allow a virtual machine to directly access some parts of the native hardware. This concept is known as partial virtualization. This technique is widely used in embedded systems when a device is not shared among several partitions, or when the complexity of the driver is so high that it is not worth including it in the virtualizer layer. In some cases, when the policy for sharing a peripheral is user specific (or when the user requires fine-grained control over the peripheral), it is better to allocate the native peripheral to a designated manager virtual machine.

In this case, the virtualizer is not aware of the exact operation of the peripheral, but it enforces that only the allowed partition uses it.

This technique is frequently used in embedded systems due to the use of home-designed (or customised) peripherals.

3. XTRATUM OVERVIEW

XtratuM 1.0 [MRC05a, MRC05b] was initially designed as a substitute for RTLinux [Yod, Bar97] that would meet temporal and spatial isolation requirements. XtratuM was designed as a nanokernel which virtualises the essential hardware devices in order to execute several OSes concurrently, at least one of them being an RTOS. The other hardware devices (including booting) were left to a special domain, named the root domain. This approach let us speed up the implementation of a first working prototype. Like RTLinux before it, XtratuM was implemented as a loadable kernel module (LKM) which, once loaded, takes over the machine; Linux was executed as the root domain. XtratuM and the root domain still shared the same memory space. The rest of the domains, however, were run in different memory maps, providing partial spatial isolation.

After this experience, it was redesigned to be independent of Linux and bootable. The result is XtratuM 2.0. In the rest of this paper, the term XtratuM refers to this second version.

XtratuM is a type 1 hypervisor that uses para-virtualization. The para-virtualized operations are as close to the hardware as possible. Therefore, porting an operating system that already works on the native system is a simple task: replace some parts of the operating system HAL (Hardware Abstraction Layer) with the corresponding hypercalls. A minimal sketch of what such a replacement might look like is shown below.
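As an illustration only: XM_get_time is one of the hypercall names measured in Section 6, but its signature, the XM_HW_CLOCK constant and the register address used here are assumptions made for this sketch, not the actual XtratuM API.

#include <stdint.h>

/* Illustrative LEON2 timer register address (assumption). */
#define LEON2_TIMER1_COUNTER ((volatile uint32_t *)0x80000040)

/* Assumed hypercall prototype; XM_get_time is a real hypercall
   name (see Table 5), but this signature is a guess. */
extern int XM_get_time(int clock_id, uint64_t *time_us);
#define XM_HW_CLOCK 0

/* Native HAL routine: direct privileged access to the hardware
   timer. Executed in user mode under XtratuM, this would trap. */
static uint64_t hal_get_time_native(void)
{
    return (uint64_t)*LEON2_TIMER1_COUNTER;
}

/* Para-virtualised HAL routine: the same service via a hypercall. */
static uint64_t hal_get_time_xm(void)
{
    uint64_t now = 0;
    (void)XM_get_time(XM_HW_CLOCK, &now);
    return now;
}

Porting an OS amounts to substituting routines of the second kind for routines of the first kind throughout its HAL.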

The ARINC-653 [Air96] standard specifies the baseline operating environment for application software used within Integrated Modular Avionics (IMA), based on a partitioned architecture. Although not explicitly stated in the standard, it was developed considering that the underlying technology used to implement the partitions is the separation kernel. Although it is not a hypervisor standard, some parts of the APEX model of ARINC-653 are very close to the functionality provided by a hypervisor. For this reason, it was used as a reference in the design of XtratuM.

ARINC-653 relies on the idea of a “separation kernel”, which basically consists of extending and enforcing the isolation between a process or a group of processes. It defines both the API and operation of the partitions, as well as how the threads or processes are managed inside each partition. From this point of view, XtratuM covers the part of the ARINC-653 specification related to the low-level layer.

In a hypervisor, and in particular in XtratuM, a partition is a virtual computer rather than a group of strongly isolated processes. When multi-threading (or tasking) support is needed in a partition, an operating system or a run-time support library has to provide support for the application threads. In fact, it is possible to run a different operating system on each XtratuM partition.

XtratuM was designed to meet safety critical real-time requirements. The most relevant features are:

• Bare-metal hypervisor
• Employs para-virtualisation techniques
• A hypervisor designed for embedded systems: some devices can be directly managed by a designated partition
• Strong temporal isolation: fixed cyclic scheduler
• Strong spatial isolation: all partitions are executed in processor user mode and do not share memory
• Fine-grained hardware resource allocation via a configuration file
• Robust communication mechanisms (XtratuM sampling and queuing ports)
• Full management of exceptions and errors via the Health Monitor

4. ARCHITECTURE AND DESIGN

Figure 1 shows the complete system architecture. The main components of this architecture are:

4.1. Hypervisor

XtratuM is in charge of providing virtualisation services to partitions. It is executed in supervisor processor mode and virtualises the CPU, memory, interrupts and some specific peripherals. The internal XtratuM architecture includes: memory management, scheduling (fixed cyclic scheduling), interrupt management, clock and timer management, partition communication management (ARINC-653 communication model), health monitoring and tracing facilities. Three layers can be identified:

Figure 1. System architecture.

Hardware-dependent layer: It implements the set of drivers required to manage the strictly necessary hardware: processor, interrupts, hardware clocks, hardware timers, paging, etc. This layer is isolated from the rest through the Hardware Abstraction Layer (HAL). Thus, the HAL hides the complexity of the underlying hardware by offering a high-level abstraction of it (for instance, a ktimer is the common abstraction for all the hardware timers).

Internal-service layer: These services are not available to the partitions. This layer includes a minimal C library which provides the strictly required set of standard C functions (e.g. strcpy, memcpy, sprintf) and a bundle of data structures. The system boot is also part of the internal services.

Virtualization-service layer: It provides the para-virtualisation services, which are offered to partitions via the hypercall mechanism. Some of these services are also used by other XtratuM modules.

Four strategies have been used to prevent partitions from subverting the temporal isolation:

• Partitions cannot disable native traps. Partitions are executed in user mode, thus guaranteeing that they have no access to control registers. Additionally, a partition cannot interact with native traps.

• A partition cannot mask hardware interrupts that are not allocated to it.

• Partitions have no access to the trap table. Thus, partitions are unable to install their own trap handlers. All traps are first handled by XtratuM and, when required, propagated to partitions.

• Partitions cannot access the system hardware timer, only virtual timers. Partitions can set virtual timers based on the native hardware clock or on the partition execution clock.

4.2. Trap and interrupt management

A trap is the mechanism provided by the SPARC v8 processor to implement asynchronous transfers of control. When a trap occurs, the processor switches to supervisor mode and jumps unconditionally into a predefined handler. XtratuM extends the concept of processor traps by adding 32 additional interrupt numbers. This new range is used to implement the concept of asynchronous events, which are used to indicate to the partitions the need for attention on the occurrence of an event. A sketch of how a partition might handle such events is given below.
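The following C sketch illustrates partition-side dispatch of these extended events. XM_unmask_event is a hypercall name taken from the measurements in Section 6; the event numbering, the handler table and the entry point are illustrative assumptions, not the actual XtratuM delivery mechanism.

/* 32 extended interrupt numbers (asynchronous events). */
#define XM_NR_EVENTS 32

typedef void (*xm_event_handler_t)(int event_nr);

static xm_event_handler_t event_table[XM_NR_EVENTS];

extern int XM_unmask_event(int event_nr);   /* assumed prototype */

void partition_install_event(int event_nr, xm_event_handler_t h)
{
    if (event_nr >= 0 && event_nr < XM_NR_EVENTS) {
        event_table[event_nr] = h;
        XM_unmask_event(event_nr);  /* let the hypervisor deliver it */
    }
}

/* Entry point the guest would register for virtualised events:
   XtratuM propagates the event here instead of letting the
   partition touch the native trap table. */
void partition_event_entry(int event_nr)
{
    if (event_nr >= 0 && event_nr < XM_NR_EVENTS && event_table[event_nr])
        event_table[event_nr](event_nr);
}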

The trap mechanism is used to handle hardware inter-rupts, software traps and processor exceptions.

In a fully virtualised environment, a partition should not need to manage native interrupt hardware. XtratuM, however, only virtualises the hardware peripherals that may endanger the isolation, and leaves the partitions to directly manage non-critical devices. A hardware interrupt can only be allocated to one partition (in the configuration file).

A partition using a device (peripheral) exclusively can access the device through a device driver implemented in the partition. The partition is in charge of handling the device properly. The configuration file has to specify the I/O ports and the interrupt lines that will be used by each partition. Two partitions cannot use the same I/O port or the same interrupt line. When a device is used by several partitions, a specific I/O server partition is in charge of the device management. An I/O server partition is a specific partition which accesses and controls the devices attached to it, and exports a set of services via the inter-partition communication mechanisms, enabling the rest of the partitions to make use of the managed peripherals.

4.3. Inter-partition communication

Inter-partition communication refers to the communication between two partitions or between a partition and the hypervisor. XtratuM implements a message passing model which closely resembles the one defined in ARINC-653. A communication channel is the logical path between one source and one or more destinations. Two basic transfer modes are provided: sampling and queuing. Partitions can access channels through access points named ports. The hypervisor is responsible for encapsulating and transporting messages. At the partition level, messages are atomic entities. Partition developers are responsible for agreeing on the format (data types, endianness, padding, etc.). Channels, ports, the maximum message sizes and the maximum number of messages (queuing ports) are entirely defined in the configuration file. A sketch of how a pair of partitions might use a sampling port follows.
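This sketch mirrors the counter scenario used in the evaluation (Section 6). The hypercall names appear in Table 5; their signatures, the direction constants and the port name are assumptions, and the ports must in any case match the configuration file.

#include <stdint.h>

#define XM_SOURCE_PORT      0   /* assumed direction constants */
#define XM_DESTINATION_PORT 1

/* Assumed prototypes for hypercalls named in Table 5. */
extern int XM_create_sampling_port(const char *name,
                                   uint32_t max_msg_size,
                                   uint32_t direction);
extern int XM_write_sampling_port(int port_id, const void *msg,
                                  uint32_t size);
extern int XM_read_sampling_port(int port_id, void *msg,
                                 uint32_t size);

/* Producer partition: overwrite the single message slot each cycle. */
void producer_step(uint32_t counter)
{
    static int port = -1;
    if (port < 0)
        port = XM_create_sampling_port("counterPort", sizeof counter,
                                       XM_SOURCE_PORT);
    XM_write_sampling_port(port, &counter, sizeof counter);
}

/* Consumer partition: always reads the freshest value, never blocks. */
uint32_t consumer_step(void)
{
    static int port = -1;
    uint32_t counter = 0;
    if (port < 0)
        port = XM_create_sampling_port("counterPort", sizeof counter,
                                       XM_DESTINATION_PORT);
    XM_read_sampling_port(port, &counter, sizeof counter);
    return counter;
}

A queuing port would behave analogously, except that messages are buffered up to the configured maximum instead of being overwritten.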

4.4. Health Monitor

The health monitor (HM) is the part of XtratuM that detects and reacts to anomalous events or states. The purpose of the HM is to discover errors at an early stage and to try to solve them, or to confine the faulting subsystem, in order to avoid or reduce the possible consequences. As a result of enforcing the isolation of the partitions, XtratuM contains a lot of consistency and security checks; therefore, it can detect a large number of errors. Errors are grouped into categories. Once an error is detected, XtratuM reacts to it with one of a simple set of predefined actions.

The XtratuM HM subsystem is composed of four logical components:

HM event detection: detects abnormal states, using logical probes in the XtratuM code.

HM actions: a set of predefined actions to recover from the fault or confine the error.

HM configuration: binds the occurrence of each HM event to the appropriate HM action.

HM notification: reports the occurrence of HM events.

Once an HM event is raised, XtratuM performs the action specified in the configuration file. The next table shows the list of HM events and the predefined default action at hypervisor and partition level.

                        XtratuM scope          Partition scope
Event name              Def. action      Log   Def. action      Log

Processor triggered
WRITE_ERROR             Warm Reset       Y     Propagate        N
INST_ACC_EXCEPTION      Warm Reset       Y     Propagate        Y
ILLEGAL_INST            Warm Reset       Y     Propagate        N
PRIVILEGED_INST         Warm Reset       Y     Propagate        Y
FP_DISABLED             Warm Reset       Y     Propagate        N
CP_DISABLED             Warm Reset       Y     Propagate        N
REG_HW_ERROR            Warm Reset       Y     Suspend          Y
MEM_ADDR_NOT_ALIG       Warm Reset       Y     Propagate        N
FP_EXCEPTION            Warm Reset       Y     Propagate        N
DATA_ACC_EXCEPTION      Warm Reset       Y     Propagate        Y
TAG_OVERFLOW            Warm Reset       Y     Propagate        N
DIVIDE_EXCEPTION        Warm Reset       Y     Propagate        N

User triggered
PARTITION_ERROR         Halt             Y

XtratuM triggered
MEM_PROTECTION          Suspend          Y
PARTITION_INTEGRITY     Suspend          Y
PARTITION_UNRECOV       Halt             Y
OVERRUN                 Ignore           Y
SCHED_ERROR             Ignore           Y
INTERNAL_ERROR          Warm Reset       Y
UNEXPECTED_TRAP         Ignore           Y

Table 1. Health monitoring events and actions

4.5. Tracing

XtratuM provides a mechanism to store and retrieve the traces generated by partitions and by XtratuM itself. Traces can be used for debugging during the development phase of the application, but also to log relevant events or states during the production phase. In order to enforce resource isolation, each partition (as well as XtratuM) has a dedicated trace log stream to store its own trace messages. Trace streams are stored in buffers (RAM or FLASH). Only supervisor partitions can read from a trace stream.

4.6. Partitions

A partition is an execution environment managed by the hypervisor which uses the virtualised services. Each partition consists of one or more concurrent processes (implemented by the operating system of each partition) that share access to processor resources based upon the requirements of the application. The partition code can be:

• An application compiled to be executed on a bare-machine (bare-application).

• A real-time operating system (or runtime support)and its applications.

• A general purpose operating system and its applica-tions.

Partitions need to be virtualised to be executed on top of a hypervisor. Depending on the type of execution environment, the virtualisation implications in each case can be summarised as:

Bare application: The application has to be virtualised using the services provided by XtratuM. The application is designed to run directly on the hardware, and it has to be aware of this.

Operating system application: When the application runs on top of a (real-time) operating system, it uses the services provided by the operating system and does not need to be virtualised. But the operating system has to deal with the virtualisation: the operating system has to be virtualised (ported on top of XtratuM).

XtratuM defines two types of partitions: normal and supervisor. Supervisor partitions are allowed to manage and monitor the state of the system and of other partitions. Some hypercalls cannot be called by a normal partition, or have restricted functionality. Note that supervisor rights relate to the capability to manage the system, not to the capability to access the native hardware directly or to break the isolation: a supervisor partition is scheduled like a normal partition, and it can only use the resources allocated to it in the configuration file.

4.7. Design issues

XtratuM has been designed specifically to meet real-time constraints and to be as efficient as possible. The main decisions regarding these requirements are:

Data structures are static: All data structures are pre-defined at build time from the configuration file; therefore: 1) more efficient algorithms can be used; 2) the exact resources used by XtratuM are known.

XtratuM code is non-preemptive⁴: Although preemption is desirable in most operating systems, there is no benefit from it in the case of a small hypervisor. The code is simpler (no fine-grained critical sections) and faster.

All services (hypercalls) are deterministic and fast: XtratuM provides the minimal services needed to guarantee the temporal and spatial isolation of partitions.

Peripherals are managed by partitions: XtratuM only supervises the access to I/O ports as defined in the configuration file.

Interrupt occurrence isolation: When a partition is executing, only the interrupts managed by this partition are enabled, which minimizes inter-partition interference through hardware.

Extended management of traps, interrupts and errors: All these events are handled by the Health Monitor module and processed according to the configuration table. A trace system provides the mechanisms to log messages.

Full control of the resources: All the system resources allocated to partitions are specified in a configuration file.

⁴ Note that XtratuM is non-preemptive, but it is prepared to be re-entrant, which allows multiple processors to execute XtratuM code concurrently.

One-shot timer: It provides 1 microsecond resolution for both timers and clocks with a very low overhead. A sketch of how several software timers can be multiplexed onto a single one-shot timer follows.
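A common way to obtain this behaviour is to keep all armed software timers sorted by deadline and always program the one-shot hardware timer with the earliest one. The sketch below is an illustrative reconstruction under that assumption; the ktimer name comes from Section 4.1, but this code and the two HAL functions are not the actual XtratuM implementation.

#include <stdint.h>

struct ktimer {
    uint64_t expiry_us;        /* absolute expiry time, microseconds */
    void (*handler)(void);
    struct ktimer *next;       /* list kept sorted by expiry         */
};

static struct ktimer *armed;   /* head = earliest deadline           */

extern void set_oneshot_hw_timer(uint64_t expiry_us);  /* HAL, assumed */
extern uint64_t hw_clock_us(void);                     /* HAL, assumed */

void ktimer_arm(struct ktimer *t)
{
    struct ktimer **p = &armed;
    while (*p && (*p)->expiry_us <= t->expiry_us)  /* insert sorted */
        p = &(*p)->next;
    t->next = *p;
    *p = t;
    if (armed == t)                    /* new earliest deadline */
        set_oneshot_hw_timer(t->expiry_us);
}

/* Called from the hardware timer interrupt. */
void ktimer_tick(void)
{
    uint64_t now = hw_clock_us();
    while (armed && armed->expiry_us <= now) {
        struct ktimer *t = armed;
        armed = t->next;               /* pop before firing */
        t->handler();
    }
    if (armed)                         /* re-arm for the next deadline */
        set_oneshot_hw_timer(armed->expiry_us);
}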

4.8. LEON2 virtualisation issues

The SPARC v8 architecture does not provide any kind of virtualization support. It implements the classical two privilege levels, supervisor and user, used by the operating system to control user applications. In order to guarantee the isolation, partition code has to be executed in user mode; only XtratuM is executed in supervisor mode.

Additionally, the design of XtratuM for LEON2 processors has to address some further aspects, namely the register window mechanism and the MPU (LEON2 has no MMU).

LEON2 has no MMU: In a processor without an MMU, the simplest and most convenient way to manage system memory is with a fixed partition memory scheme. The required amount of memory is allocated to each partition at boot (or load) time. Partition memory cannot grow or shrink.

LEON2 provides a register window mechanism: A very particular feature of the LEON (SPARC v8) processor is the register window, a mechanism that replaces the standard stack operation with a large set of internal processor registers. Partitions have full access to the 32 general purpose registers; XtratuM guarantees their consistency when the partition context switch is performed. XtratuM provides transparent management of the stack: stack overflow and underflow are directly managed by XtratuM without the intervention of the partition. The partition code must load the stack register with valid values.

5. SYSTEM CONFIGURATION AND DEPLOY-MENT

The integrator, jointly with the partition developers, has to define the resources allocated to each partition. The configuration file that contains all the information allocated to each partition, as well as specific XtratuM parameters, is called XM_CF.xml. It contains information such as: memory requirements, processor sharing, peripherals, health monitoring actions, etc. (a schematic sketch of such a file follows the list below).

Memory requirements: The amount of physical memory available on the board and the memory allocated to each partition.

Processor sharing: How the processor is allocated to each partition: the scheduling plan.

Native peripherals: Those peripherals not managed by XtratuM can be used by one partition: the I/O port ranges and the interrupt line, if any.

Health monitoring: How the detected errors are managed: direct action, delivered to the offending partition, creation of a log entry, etc.

Inter-partition communication: The ports that each partition can use and the channels that link the source and destination ports.
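To make the structure concrete, the following is a schematic sketch of the kind of content XM_CF.xml could hold. All element and attribute names here are illustrative assumptions; the actual schema is defined by the XtratuM configuration tools.

<!-- Illustrative sketch only: not the actual XM_CF.xml schema. -->
<SystemDescription>
  <Memory size="4MB"/>
  <CyclicPlan majorFrame="1000ms">
    <Slot partition="p1" start="0ms"   duration="200ms"/>
    <Slot partition="p2" start="200ms" duration="200ms"/>
  </CyclicPlan>
  <Partition id="p1" supervisor="no">
    <MemoryArea start="0x40100000" size="512KB"/>
    <IoPorts base="0x80000100" range="16"/>
    <Interrupt line="5"/>
    <Port name="counterPort" type="sampling" direction="source"/>
  </Partition>
  <Channel type="sampling" maxMessageSize="16">
    <Source partition="p1" port="counterPort"/>
    <Destination partition="p2" port="counterPort"/>
  </Channel>
  <HealthMonitor>
    <Event name="MEM_PROTECTION" action="Suspend" log="yes"/>
  </HealthMonitor>
</SystemDescription>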

Since XM_CF.xml defines the resources allocated to each partition, this file represents a contract between the integrator and the partition developers. A partner (the integrator or any of the partition developers) should not change the contents of the configuration file on its own. All the partners should be aware of the changes and should agree on the new configuration, in order to avoid problems later during the integration phase.

In order to reduce the complexity of the XtratuM hypervisor, XM_CF.xml is parsed and translated into a binary representation that can be directly used by the XtratuM code. This process is performed by two tools: xmcf_parser and xmct_builder.

In order to ensure that each partition does not depend on or affect other partitions or the hypervisor through shared symbols, the partition binary is not an ELF file. It is a raw binary file which contains the machine code and the initialized data.

The system image is a single file which contains all the code, data and configuration information that will be loaded on the target board. The tool xm_embpack is a program that reads all the executable images and the configuration files and produces the system image. The final system image also contains the necessary code to boot the system. The system image file can be written into the ROM of the board.

6. EVALUATION

In this section we provide an initial evaluation of XtratuM for the LEON2 processor. A development board (GR-CPCI-AT697 LEON2-FT at 80 MHz, with 8 MB flash PROM, 4 MB RAM, and a 33 MHz 32-bit PCI bus) has been used during the evaluation.

The following metrics have been measured:

• Partition context switch: Time needed by the hypervisor to switch between partitions. Four main activities can be identified:

  1. Save the context
  2. Timer interrupt management
  3. Scheduling decision
  4. Load the context of the new partition

• Clock interrupt management: Time needed to pro-cess a clock interrupt

• Effective slot duration: Time during which the partition is executing partition code within a slot.

• Partition performance loss: This measurement provides a measure of the overhead introduced by XtratuM. It is measured at the partition level.

• Hypercall cost: Time spent in some of the hypercalls

The measurements involving internal activity of XtratuM have been performed by adding breakpoints at the beginning and end of the code to be measured.

Internal operation                  Time (microseconds)
Partition context switch            25
  1. Save the context               5
  2. Timer interrupt management     9
  3. Scheduler                      7
  4. Load the context               5
Clock interrupt management          2
Effective slot duration             slot duration - 25

Table 2. Context switch evaluation

6.1. Performance evaluation

In order to evaluate the partition performance loss, a scenario was built consisting of three partitions with the same code, which increments a counter. They write the address of the counter in a sampling port. A fourth partition reads the ports and prints the counter value of each partition.

In this scenario, several plans are built:

Case 1: Partitions 1, 2 and 3 are executed sequentially with a slot duration of 1 second. The scheduling plan consists of the execution of p1; p2; p3; p4. When p4 executes, it reads the final counters of the partitions executed during 1 second. This case is taken as the reference.

Case 2: Partitions 1, 2 and 3 are executed sequentially with a slot duration of 0.2 seconds. The scheduling plan consists of the execution of 5 sequences of p1; p2; p3; ... (5 times) ...; p4. When p4 executes, it reads the final counters of the partitions, executed 5 times (1 second in total).

Case 3: Partitions 1, 2 and 3 are executed sequentially with a slot duration of 0.1 seconds. The scheduling plan consists of the execution of 10 sequences of p1; p2; p3; ... (10 times) ...; p4. When p4 executes, it reads the final counters of the partitions, executed 10 times (1 second in total).

Case 4: Partitions 1, 2 and 3 are executed sequentially with a slot duration of 0.2 seconds. The scheduling plan consists of the execution of 5 sequences of p1; gap; p2; gap; p3; gap; ... (5 times) ...; p4. When p4 executes, it reads the final counters of the partitions, executed 5 times (1 second in total).

Figure 2. The big picture of building a XtratuM system.

The next table summarises the results obtained.

Case    Counter     Performance (loss)
1       7272518     -
2       7271682     0.011 %
3       7270643     0.026 %
4       7271706     -0.0003 %

Table 3. Performance loss

Case 4 is compared with Case 2. These results provide an idea of the overhead (or performance loss) introduced by XtratuM when a partition is split into several time slots.

In general, the overhead can be modeled with two components: the effect of the clock interrupt, which occurs once every two seconds, and the partition context switch at each slot boundary.
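A minimal sketch of such a model, written in LaTeX, assuming a fixed per-slot cost C_cs (the 25 microsecond context switch of Table 2 plus slot-boundary interrupt handling; the fitted value below is our assumption, not a figure from the paper) and the 2 microsecond clock interrupt every T_clk = 2 s:

\mathrm{overhead}(s) \;\approx\; \frac{C_{cs}}{s} + \frac{C_{clk}}{T_{clk}},
\qquad C_{cs} \approx 27\,\mu\mathrm{s},\;
C_{clk} = 2\,\mu\mathrm{s},\; T_{clk} = 2\,\mathrm{s}

For s = 1 ms this gives about 2.7%, and for s = 10 ms about 0.27%, consistent with the measured and modeled values in the table below.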

The next table summarizes the overhead depending on the partition slot duration. Two overheads are provided: the one measured in the experiments and the theoretical one. Slot duration is in milliseconds.

Slot duration (ms)    Measured (%)    Model (%)
1000                  0.00            0.00
 200                  0.01            0.01
 100                  0.03            0.03
  50                  0.05            0.06
  20                  0.14            0.14
  10                  0.29            0.28
   5                  0.57            0.56
   1                  2.67            2.68

Table 4. Overhead

6.2. Hypercall cost

A set of tests has been designed to measure the cost of the hypercalls. All hypercalls, except those that perform a copy of the data sent or received (reads or writes on ports), should have a constant cost. Read and write operations on ports perform a copy of the data stream to the kernel space.

The costs measured are (in microseconds):

Hypercall                               Time
XM_get_time                             9
XM_set_time                             9
XM_enable_irqs                          10
XM_unmask_event                         1
XM_create_sampling_port                 23
XM_write_sampling_port (16 bytes)       19
XM_write_sampling_port (1024 bytes)     52
XM_write_sampling_port (4096 bytes)     153
XM_read_sampling_port (16 bytes)        19
XM_read_sampling_port (1024 bytes)      53
XM_read_sampling_port (4096 bytes)      154

Table 5. Hypercall measurement

Read and write operations are currently implemented in a naive way, copying one byte at a time. The results obtained are consistent with this implementation, and there is a large margin for improvement. An optimisation of these and other hypercalls will be done in the next version of XtratuM; an illustrative sketch of the obvious optimisation follows.
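The natural improvement is to copy word-by-word when alignment allows, falling back to bytes otherwise. This is a generic sketch, not the actual XtratuM code.

#include <stdint.h>
#include <stddef.h>

void port_copy(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Fast path: 32-bit moves when both pointers are 4-byte aligned. */
    if ((((uintptr_t)d | (uintptr_t)s) & 3u) == 0) {
        while (n >= 4) {
            *(uint32_t *)(void *)d = *(const uint32_t *)(const void *)s;
            d += 4; s += 4; n -= 4;
        }
    }
    while (n--)            /* tail (or unaligned) bytes one at a time */
        *d++ = *s++;
}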

7. CONCLUSIONS

XtratuM 2.1.0 is the first implementation of a hypervisor for the LEON2 processor. The initial versions of XtratuM (1.2 and 2.0, implemented on the x86 architecture only) were designed for conventional real-time systems: dynamic partition loading, fixed priority scheduling of partitions, Linux in a partition, etc.

This version of XtratuM, besides the port to the SPARC v8 architecture, is a major redesign to meet highly critical requirements: health monitoring services, cyclic plan scheduling, strict resource allocation via a configuration file, etc.

The resources allocated to each partition (processor time, memory space, I/O ports and interrupts, communication ports, monitoring events, etc.) are defined in a configuration file (with XML syntax). A tool to analyze and compile the configuration file has also been developed.

Two features of the initial implementation of XtratuM have not been ported to LEON2: the MMU and the multiprocessor support. The effort of porting the MMU support (for LEON3 with MMU) should be relatively low, and it would permit stepping up the spatial isolation of partitions. The XtratuM 2.1.0 code has been designed to be multiprocessor-capable (the x86 version was multiprocessor). Therefore, XtratuM may be the fastest and safest way to use a multiprocessor system (SMP: symmetric multiprocessor) in a highly critical environment.

7.1. Future work

Although this work was planned as a prototype development, the result achieved, in efficiency and performance, is closer to a product than to a proof of concept. Additional steps are being taken to obtain a product in terms of procedures (following the appropriate standards), documentation and code optimisation.

The roadmap of XtratuM includes the port to the LEON3 processor and the consideration of several aspects such as multi-plan scheduling, multi-core support and other scheduling policies. Additionally, the design and implementation of a minimal run-time for partitions following the ARINC-653 standard is under study.

Partition-based scheduling tools are one of the weak areas in the integration of partitioned systems with real-time constraints. The improvement of the scheduling tools to optimise the scheduling plan is another important activity to be carried out in the coming months.

REFERENCES

[Air96] Airlines Electronic Engineering Committee, 2551 Riva Road, Annapolis, Maryland 21401-7435. Avionics Application Software Standard Interface (ARINC-653), March 1996.

[Bar97] M. Barabanov. A Linux-Based Real-Time Operating System. Master's thesis, New Mexico Institute of Mining and Technology, Socorro, New Mexico, June 1997.

[Cor] IBM Corporation. IBM systems virtualization. Version 2 Release 1 (2005). http://publib.boulder.ibm.com/infocenter/eserver/v1r2/topic/eicay/eicay.pdf.

[DFH+03] B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, I. Pratt, A. Warfield, P. Barham, and R. Neugebauer. Xen and the art of virtualization. In Proceedings of the ACM Symposium on Operating Systems Principles, October 2003.

[Gol74] R.P. Goldberg. Survey of virtual machine research. IEEE Computer Magazine, 7(6):34-45, 1974.

[Lie95] Jochen Liedtke. On micro-kernel construction. In Proceedings of the 15th ACM Symposium on Operating System Principles (SOSP-15), Copper Mountain Resort, CO, December 1995.

[MRC05a] M. Masmano, I. Ripoll, and A. Crespo. Introduction to XtratuM. 2005.

[MRC05b] M. Masmano, I. Ripoll, and A. Crespo. An overview of the XtratuM nanokernel. In Proceedings of the Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT), 2005.

[SCO08] SCOPE Alliance (Promoting Open Carrier Grade Base Platforms). Virtualization: State of the Art, April 2008. http://www.scopealliance.org.

[Yod] V. Yodaiken. The RTLinux manifesto. http://www.fsmlabs.com/developers/white_papers/rtmanifesto.pdf.