User-space System Device Enumeration (uSDE) Mark Bellon MontaVista Software, Inc.


User-space System Device Enumeration (uSDE)

Mark Bellon

MontaVista Software, Inc.


uSDE

• Enumerate - to specify one after another
  – Specify/instantiate/remove system devices
    • create
    • delete
    • diagnostics
  – Deal with devices in a dynamic environment
    • system start up
    • hot insertions and removals


uSDE (1)?

• An architecturally and philosophically neutral framework for enumerating the devices attached to a computer system

• An open, extensible implementation (even in real-time!) of device enumeration that supports one or more systems of enumeration - simultaneously if necessary!


uSDE (2)?

• Provides transaction-protected, consistent, real-time (low-latency) access to data

• Designed for carrier grade and embedded environments; desktops fall out trivially

• Optimized for speed; can handle a huge number of devices

• Small and reliable


uSDE (3)?

• It did not start life as a specialized or limited handler; from its beginning it has been designed to handle all device types

• It does not mandate a formal database

• It operates entirely in user space
  – MVL CGE 3.1
  – 2.6 test 6 or later


uSDE Overview

[Diagram components: uSDE executive daemon; uSDE /sbin/hotplug replacement; uSDE scanner; uSDE agents; uSDE utilities; configuration files; backing-store (optional); exec-cache; policy methods]


uSDE External Stimuli (1)

[Diagram: the uSDE /sbin/hotplug replacement, the uSDE scanner and the uSDE agent deliver external events to the uSDE executive daemon: appear events, insert/remove events and aspect-change events]


uSDE External Stimuli (2)

• uSDE /sbin/hotplug replacement
  – A binary that provides the functionality of existing shell scripts
  – Forwards all hotplug events to the uSDE executive for processing
  – Device insert and remove events are of particular interest


uSDE External Stimuli (3)

• uSDE scanner
  – Invoked by the uSDE executive to determine the initial ensemble of system devices
  – Scans sysfs for appropriate devices and sends “appear” events
  – Typically runs only once (when the uSDE executive runs for “the first time”)


uSDE External Stimuli (4)

• uSDE agent
  – A program, usually a daemon, that provides information necessary for the manipulation of a device that is otherwise unavailable from sysfs, /proc or the kernel
  – Commonly used to send aspect-change events
    • Multi-chassis, geographical addressing
      – ATCA
      – “well known” platforms
    • IPMI and/or networks


uSDE Executive (1)

[Diagram: the uSDE executive daemon sends internal events to the policy methods]


uSDE Executive (2)

• Loads configuration files
• Determines initial device ensemble
  – device scanner
• Initializes event/device handlers
  – sends (internal) “init” event to each handler
• Processes events
  – handles out-of-order arrival issues


uSDE Executive (3)

• Event processing
  – Classifies device associated with an event
  – Maps external event to an internal event
  – Queues the internal event for servicing
  – Schedules internal event processing
  – Provides logging of critical data


uSDE Executive (4)

• Device classification (phase 1)
  – Derived directly from device’s sysfs path
    • class
      – disk, ethernet, cdrom, floppy, loop, raid, etc.
    • sub-class
      – sda -> class “disk”, sub-class “scsi”
      – eth0 -> class “ethernet”, sub-class “generic”
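The phase 1 idea can be sketched as a rule table keyed on the sysfs path. This is only an illustration, not the uSDE implementation: the `RULES` table, the exact sysfs locations and the function name `classify_phase1` are assumptions, and only the sda and eth0 outcomes are taken from the slide.

```python
import re

# Hypothetical rule table: (pattern on the sysfs path, class, sub-class).
# Only the "sda" and "eth0" results come from the slide; the other rows
# are illustrative guesses at what a phase 1 table might contain.
RULES = [
    (re.compile(r"^/sys/block/sd[a-z]+$"), "disk", "scsi"),
    (re.compile(r"^/sys/block/hd[a-z]+$"), "disk", "ide"),
    (re.compile(r"^/sys/block/fd\d+$"), "floppy", "generic"),
    (re.compile(r"^/sys/block/loop\d+$"), "loop", "generic"),
    (re.compile(r"^/sys/class/net/eth\d+$"), "ethernet", "generic"),
]


def classify_phase1(sysfs_path):
    """Return (class, sub-class) derived from the path alone, or None."""
    for pattern, dev_class, sub_class in RULES:
        if pattern.match(sysfs_path):
            return dev_class, sub_class
    return None
```

A path that matches no rule returns `None`, which a caller could treat as "leave the device to a catch-all handler".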


uSDE Executive (5)

• Device classification (phase 2)
  – sub-class from phase 1 may be updated
    • Determine parent device
    • Search for additional information and, if present, override initial classification
      – “scsi” may become “fibrechannel”, “ieee-1394”, etc.
      – “ide” may become “eide”, “serial-ata”, etc.
  – No limitations on sub-class override
  – pci-info file provides information for this phase
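A minimal sketch of the phase 2 override, assuming the pci-info data has already been loaded into a dictionary keyed by the parent device's PCI vendor and product IDs. The dictionary layout, the function name and the IDs are all hypothetical; the real pci-info file format is not shown in these slides.

```python
# Hypothetical in-memory form of pci-info:
# (vendor ID, product ID) of the parent device -> sub-class.
# The IDs are made-up placeholders, not real PCI IDs.
PCI_INFO = {
    (0x1111, 0x0001): "fibrechannel",
    (0x2222, 0x0002): "serial-ata",
}


def classify_phase2(dev_class, sub_class, parent_pci_id, pci_info=PCI_INFO):
    """Override the phase 1 sub-class when the parent device is listed."""
    override = pci_info.get(parent_pci_id)
    if override is not None:
        sub_class = override
    # The class chosen in phase 1 is left untouched.
    return dev_class, sub_class
```

When the parent is not listed, the phase 1 result passes through unchanged, which matches the "if present, override" wording above.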


uSDE Executive (6)

• The internal event is queued for service
  – sysfs path of device
  – internal event type
  – class and sub-class assigned to device
• Enumeration service maintains queues
  – each class has a queue
  – sub-class is ignored


uSDE Executive (7)

• Device queues are aggressively scheduled
  – All queues may be running concurrently
  – No concurrent servicing within a queue
• Events may be coalesced
  – identical event type and sub-class
  – each sysfs path is added to a list

• A service container is invoked in response to an event
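The queueing and coalescing rules above can be illustrated with a small data structure. This is a sketch under assumptions, not the executive's code: the class name `ClassQueues`, the dictionary layout of a queued entry and the choice to coalesce only with the most recently queued entry (to preserve arrival order) are all hypothetical, and concurrency between queues is not modelled.

```python
from collections import defaultdict, deque


class ClassQueues:
    """One FIFO queue per device class; sub-class does not select the queue."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def enqueue(self, dev_class, sub_class, event_type, sysfs_path):
        queue = self.queues[dev_class]
        # Coalesce with the newest pending entry when event type and
        # sub-class are identical: just add the sysfs path to its list.
        if queue:
            tail = queue[-1]
            if tail["event"] == event_type and tail["sub_class"] == sub_class:
                tail["paths"].append(sysfs_path)
                return
        queue.append(
            {"event": event_type, "sub_class": sub_class, "paths": [sysfs_path]}
        )

    def next_batch(self, dev_class):
        """Pop the oldest entry for a class, or None if the queue is empty."""
        queue = self.queues[dev_class]
        return queue.popleft() if queue else None
```

Each popped entry carries a list of sysfs paths, so a single service container invocation can cover several devices.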


uSDE Executive (8)

• A service container is a list of one or more actions that are invoked in a definite order
  – a configuration file specifies the service containers
• Class and sub-class control handling
  – A service container is associated with each class and sub-class
• An internal event is sent to each action within the service container


uSDE Executive (9)

• An action contained in a service container is known as a policy method
  – It implements the policies of its designer
  – Each policy method is sent the same parameters
• Policy methods must be prepared to accept multiple arguments (devices)
  – minimizes the number of invocations
  – “closeness” optimizations are possible


uSDE Policy Methods (1)

[Diagram: policy methods]


uSDE Policy Methods (2)

• Policy methods:
  – Are Linux programs
    • Write in any language you wish including shells
  – Are invoked with a standardized command line
    • class
    • sub-class
    • event type
    • device argument(s) - sysfs path
    • standardized options
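As an illustration of what a policy method's entry point might look like, here is a sketch of command-line handling in Python. The argument order follows the list above, but the exact order, the program name and the `--verbose` option are assumptions; the slides do not spell out the real command line or its standardized options.

```python
import argparse


def parse_policy_args(argv):
    """Parse a hypothetical policy-method command line.

    Assumed layout: [options] CLASS SUB_CLASS EVENT_TYPE SYSFS_PATH...
    """
    parser = argparse.ArgumentParser(prog="example-policy")
    # Placeholder for the "standardized options"; the real ones are unknown.
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("dev_class")
    parser.add_argument("sub_class")
    parser.add_argument("event_type")
    # One or more devices, because events may have been coalesced.
    parser.add_argument("devices", nargs="+", metavar="sysfs_path")
    return parser.parse_args(argv)


def handle(argv):
    """Return one description per device; a real method would act on each."""
    args = parse_policy_args(argv)
    return [
        f"{args.event_type} {args.dev_class}/{args.sub_class} {path}"
        for path in args.devices
    ]
```

A script built this way would call `handle(sys.argv[1:])` from its entry point; accepting several device paths per invocation is what keeps the number of invocations low.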


uSDE Policy Methods (3)

• Policy methods:
  – actually enumerate a device
  – determine which instance within a class should be associated with a device
  – are free to implement whatever policies they see fit


uSDE Files (1)

[Diagram components: uSDE executive daemon; uSDE utilities; configuration files; backing store (optional); exec-cache; policy methods]


uSDE Files (2)

• Human readable - ASCII

• Formal grammars (YACC) for each file
  – One can be sure the file is valid
• Hand optimized lexer for speed
  – still room for improvement
• Separate API for each file via shared library
  – No wasted memory


uSDE Files (3)

• deployment-model
  – how to handle events and permissions
• hardware-map (optional)
  – how to control your special hardware
• pci-info (optional)
  – additional information for classification
• backing-store (optional)
  – a place to retain critical information
• exec-cache (optional in the future (special case))
  – executive caches classification here


uSDE Policy Method Toolkit (1)

[Diagram: trivial policy method; persistent policy method; emulation policy method]

A wonderful set of sample code to play with...


Policy Method Toolkit (2)

• disk-ide-policy
  – implements persistent device naming
    • Vendor/model string, Serial number
  – handles IDE, EIDE, serial ATA and USB hosted [E]IDE devices
  – implements replacement and relocation policies for [E]IDE and mapped serial ATA


Policy Method Toolkit (3)

• disk-scsi-policy
  – implements persistent device naming
    • Vendor ID, Product ID, Serial number
  – handles parallel SCSI, IEEE-1394, FibreChannel and USB hosted SCSI devices
  – handles multi-ported storage devices
  – implements replacement and relocation policies for parallel SCSI


Policy Method Toolkit (4)

• floppy-policy
  – handles internal floppies
  – USB floppies show up as disks
• simple-device-policy
  – handles block and character devices
  – “catch all” for many device classes


Policy Method Toolkit (5)

• ethernet-policy
  – implements persistent device naming
    • initial MAC address
  – implements replacement and relocation policies
    • USB ethernet devices not supported yet (trivial)
  – uses the hardware-map file to ensure specific interfaces retain their names despite device search order


Policy Method Toolkit (6)

• Emulation policies (for those that need it)
  – devfs
  – Linux Standard Base (LSB)


Policy Method Toolkit (7)

• Special purpose policies
  – disk-cs-policy
    • An example of a policy that makes use of an agent
    • Names are based on the geographical address of a disk in a chassis/slot environment
  – multipath-policy
    • automatic provisioning of multi-ported disks
    • Not limited to SCSI or FibreChannel


Where is it?

• http://sourceforge.net/projects/usde

• http://source.mvista.com/sde


Future Directions (1)

• A sufficient portion of our ideas are expressed in this prototype; it’s time to get lots of feedback and additional input

• Implementation is open source and available

• SourceForge project is up and running


Future Directions (2)

• Event mechanism is a closed socket hack. This should be replaced with an open messaging system

• grammar cleanup throughout

• classification scheme should be reviewed, simplified; scripted?

• Utilities should be improved and expanded
  – helpers for scripted policies that want retention


Future Directions (3)

• general walk-through and review

• multipathing - additional controls

• more device classes; more policies

• devfs and LSB emulation need work

• flood of ideas from the community

• backing store content wars


Discussion Items

• Disk naming
• Multi-chassis agent example
• backing-store and deployment-model examples
• Critical definitions
• Configuration file details
• Transaction details
• More on events


A Few Definitions (1)

• Interface Technology Path (ITP)
  – The unique, unambiguous and repeatable path over which a system traverses hardware to arrive at the “location” of a device
  – Must remain constant across system crashes, resets and reboots
  – For PCI devices the ITP is the Slot Path Address (SPA) of a device


A Few Definitions (2)

• Interface Domain IDentifier (IDID)
  – The unique identification of a device within the domain managed by the device’s parent device (controller/interface/adapter)
  – Examples: address/LUN, Dev/Func


A Few Definitions (3)

• Device Discrimination (DD)
  – The ability to discern a difference between devices that on the surface appear to be identical. Specifically, it is the ability to uniquely distinguish one device from another where the devices share the same class, vendor and product descriptions


A Few Definitions (4)

• Device Discrimination (continued)
  – The most common form of device discrimination is implemented via a serial number
  – When a device cannot be discriminated, a useful equivalent is possible - use the ITP and IDID!
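The fallback described above can be sketched as a small helper: prefer the serial number, otherwise fall back to the device's location. The function name and the shape of the returned key are assumptions made for illustration.

```python
def discriminator(serial, itp, idid):
    """Pick a key that tells otherwise identical devices apart.

    Returns ("serial", serial) when a serial number is available,
    else ("location", (itp, idid)) built from the Interface Technology
    Path and the Interface Domain IDentifier.
    """
    serial = (serial or "").strip()
    if serial:
        return ("serial", serial)
    # No usable serial number: the device is identified by where it sits,
    # so the key changes if the device is moved.
    return ("location", (itp, idid))
```

The trade-off is visible in the return value: a serial-based key follows the device when it is relocated, while a location-based key identifies whatever device occupies that position.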


A Few Definitions (5)

• Persistent Device Naming (PDN)
  – Associates a unique name with a device based on several of the device’s attributes
  – This differs from the current Linux device naming scheme where the “name” of a device is actually a (shorthand) description of the data path and selection criteria used to access the device


A Few Definitions (6)

• Persistent Device Naming (PDN) (cont.)
  – Persistently named devices must provide an ensemble of attributes, including the ITP, IDID and DD, that unambiguously discriminates one device from all others. It is then possible to recognize the device and ensure that its name remains constant regardless of how the device is interfaced to a system


A Few Definitions (7)

• Persistent Device Naming (PDN) (cont.)
  – When a device’s name cannot be built directly from its attributes, some form of non-volatile storage must be available to record the unique attributes along with the name assigned (aliased) to the device


uSDE Files in Detail (1)

[Diagram components: uSDE executive daemon; uSDE utilities; configuration files; backing store (optional); exec-cache; policy methods]


deployment-model File (1)

• service directive
  – specifies which list of policy methods is associated with a given class and sub-class
• device-node-default directive
  – specifies the device node control information for a given class and sub-class
    • mode
    • group
    • owner


deployment-model File (2)

• device-node-specific directive
  – specifies the device node control information for a specific device - class and instance within class
    • mode
    • group
    • owner
• alias directive
  – specifies an alias associated with a specific device - class and instance within class


hardware-map File

• Optional

• map directive
  – specifies that a particular device, identified by its ITP, is to be treated as a specific instance within a class
  – force eth0 hardware to stay eth0 no matter what the discovery order

• Additional information in the future
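One way to picture the effect of the map directive is an instance allocator that honours pinned devices first. The sketch below is an assumption-heavy illustration: the dictionary form of the hardware map, the function name and the "lowest free instance" rule for unmapped devices are hypothetical, and the ITP strings in the usage are placeholders.

```python
def assign_instance(dev_class, itp, hardware_map, in_use):
    """Choose the instance number within a class for a device.

    hardware_map: {(class, ITP): instance} pinned by map directives.
    in_use: set of instance numbers already taken in this class.
    """
    pinned = hardware_map.get((dev_class, itp))
    if pinned is not None:
        return pinned
    # Unmapped devices take the lowest instance that is neither in use
    # nor reserved for a mapped device that may show up later.
    reserved = {
        inst for (cls, _itp), inst in hardware_map.items() if cls == dev_class
    }
    instance = 0
    while instance in in_use or instance in reserved:
        instance += 1
    return instance
```

With instance 0 pinned to one ITP, another Ethernet device discovered first still becomes instance 1, so the pinned hardware keeps the eth0 name regardless of discovery order.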


pci-info File

• Specifies the sub-class associated with a given PCI device by mapping the PCI vendor and product registers to a sub-class

• Will be generalized to handle other interfaces in the near future

• Optional


exec-cache File

• Not a configurable file
• Used internally by the uSDE executive to cache the mapping of a sysfs path to class and sub-class
  – Have to remember how a device was classified so the correct service action can be invoked upon remove/disappear
• Will be made optional via a special insert/appear mode of the executive in the near future


backing-store File

• Optional file used to store non-volatile information
  – policy methods store their data, if any, here
  – simple “data base”
  – hierarchical model


File Transactions (1)

• All files are protected via a transaction framework

• Transaction framework is tuned for speed and simplicity
  – lock contention is expected to be minimal
  – files are expected to be small
  – files are human readable - ASCII


File Transactions (2)

• Serialization is performed at transaction start and end times:
  – the lock is held only within the formal transaction start and end routines
  – all of the files involved in the transaction are read into memory
  – files are rewritten if modified
  – the transaction must be repeated if a file it modified has already been modified by another thread of execution since this thread’s transaction started
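The transaction scheme above resembles optimistic concurrency control, and can be sketched as follows. This is not the uSDE transaction API: the `Transaction` class, `run_transaction` helper, the in-process `threading.Lock` and the use of content comparison to detect a conflicting change are all assumptions made for illustration.

```python
import threading
from pathlib import Path

# Held only inside begin() and commit(), never while the caller works.
_lock = threading.Lock()


class Transaction:
    def __init__(self, paths):
        self.paths = [Path(p) for p in paths]
        self.snapshot = {}  # file contents as read at transaction start
        self.working = {}   # in-memory copies the caller may modify

    def begin(self):
        with _lock:
            for p in self.paths:
                text = p.read_text() if p.exists() else ""
                self.snapshot[p] = text
                self.working[p] = text

    def commit(self):
        """Rewrite modified files; return False if the caller must repeat."""
        with _lock:
            modified = [p for p in self.paths if self.working[p] != self.snapshot[p]]
            for p in modified:
                current = p.read_text() if p.exists() else ""
                if current != self.snapshot[p]:
                    return False  # someone else changed it after we started
            for p in modified:
                p.write_text(self.working[p])
            return True


def run_transaction(paths, update, max_attempts=10):
    """Call update(working) and commit, repeating on conflict."""
    for _ in range(max_attempts):
        txn = Transaction(paths)
        txn.begin()
        update(txn.working)
        if txn.commit():
            return True
    return False
```

Because the lock is released between `begin()` and `commit()`, slow policy logic never blocks other threads; the cost is an occasional repeated transaction, which stays cheap while files are small and contention is low.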


uSDE External Events (1)

• Insert event
  – a device has been physically inserted into the system
• Remove event
  – a device has been physically removed from the system


uSDE External Events (2)

• Appear event
  – a device has been detected that was not inserted
    • initial device scanning
    • diagnostics (return to service)
• Disappear event
  – a device currently known to the system and in service has disappeared from the system
    • no longer in service
    • diagnostics (removal from service)


uSDE External Events (3)

• Aspect-change event
  – A parameter associated with a device has become available or has changed
    • information otherwise unavailable from the kernel
    • “unusual” information sources - “out of band”


Unambiguous Disk Naming (1)

• Names should be persistent
  – Name remains fixed across reboots and configuration changes
• Multi-ported disks are a challenge:
  – How is a disk named?
  – How does one unambiguously access a port?
  – How does generic SCSI logically work?
    • One node or multiple?


Unambiguous Disk Naming (2)

• /dev/sde-disk/disk-name/d<n>p<m>
  – <n> is data port number (all disks have 0)
  – <m> is partition number
• generic SCSI node is either:
  – generic (if one)
  – generic_d<n> (if multiple)
• multi-path nodes are “multi_p<m>”
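The naming scheme above is easy to express as string templates. The helper names below are hypothetical, and reading "if one" as "the disk has a single data port" is an interpretation of the slide rather than something it states outright.

```python
def partition_node(disk_name, port, partition):
    """Node for partition <m> reached through data port <n>."""
    return f"/dev/sde-disk/{disk_name}/d{port}p{partition}"


def generic_node_name(port, port_count):
    """Generic SCSI node name: 'generic' for a single-ported disk
    (assumed meaning of "if one"), else 'generic_d<n>'."""
    return "generic" if port_count == 1 else f"generic_d{port}"


def multipath_node_name(partition):
    """Multi-path node name for partition <m>."""
    return f"multi_p{partition}"
```

For a dual-ported disk named disk0, partition 1 would be reachable as `d0p1` and `d1p1`, one node per port, plus `multi_p1` for the multi-path view.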


backing-store details (1)

object "ethernet0"
{
    string "class" "ethernet"
    string "sub-class" "generic"
    string "vendor-string" "Intel Corp. 82544EI Gigabit Ethernet Controller"
    string "product-string" "Intel Corp. 82544EI Gigabit Ethernet Controller"
    string "discriminator" "00:02:b3:c3:5d:ac"
    string "interface-technology-path" "/devices/pci0000:00/0000:XX:03.0/0000:XX:1d.0/0000:XX:01.0"
    integer "class-instance" 0
    string "state" "present"
}


backing-store details (2)

object "disk0"
{
    string "device-path" "/dev/sde-disk/disk0"
    string "class" "disk"
    string "sub-class" "fibrechannel"
    string "vendor-string" "IBM "
    string "product-string" "DDYF-T36950R "
    string "discriminator" "TFF6C829"
    integer "class-instance" 0
    string "state" "present"
    string "service-location" "unknown"
    object "ports"
    {
        object "0"
        {
            string "interface-technology-path" "/devices/pci0000:00/0000:XX:02.0/0000:XX:1d.0/0000:XX:01.0"
            string "interface-domain-ID" "0:9:0"
            string "sysfs-path" "/sys/block/sdd"
            integer "reference-count" 3
        }
        object "1"
        {
            string "interface-technology-path" "/devices/pci0000:00/0000:XX:02.0/0000:XX:1d.0/0000:XX:01.1"
            string "interface-domain-ID" "0:9:0"
            string "sysfs-path" "/sys/block/sdb"
            integer "reference-count" 3
        }
    }
}
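To show how the hierarchical model maps onto ordinary data structures, here is a sketch of a reader for the syntax seen in the two examples above. The grammar is inferred from those examples only (`object`, `string` and `integer` entries with quoted names); the real uSDE grammar is defined in YACC and may accept more. All names in the code are hypothetical.

```python
import re

# Quoted string, brace, or bare word (keyword or integer literal).
_TOKEN = re.compile(r'"([^"]*)"|([{}])|([^\s{}"]+)')


def tokenize(text):
    tokens = []
    for m in _TOKEN.finditer(text):
        if m.group(1) is not None:
            tokens.append(("str", m.group(1)))
        elif m.group(2):
            tokens.append(("brace", m.group(2)))
        else:
            tokens.append(("word", m.group(3)))
    return tokens


class _Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def take(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise ValueError(f"unexpected token {tok} at position {self.pos}")
        self.pos += 1
        return tok[1]

    def entries(self, nested):
        """Parse entries up to the closing brace (or end of input at top level)."""
        result = {}
        while True:
            kind, value = self.peek()
            if kind is None:
                if nested:
                    raise ValueError("missing closing brace")
                return result
            if (kind, value) == ("brace", "}"):
                if not nested:
                    raise ValueError("unexpected closing brace")
                self.pos += 1
                return result
            keyword = self.take("word")
            name = self.take("str")
            if keyword == "object":
                self.take("brace", "{")
                result[name] = self.entries(nested=True)
            elif keyword == "string":
                result[name] = self.take("str")
            elif keyword == "integer":
                result[name] = int(self.take("word"))
            else:
                raise ValueError(f"unknown entry type {keyword!r}")


def parse_backing_store(text):
    """Return the backing-store text as nested dictionaries."""
    return _Parser(text).entries(nested=False)
```

With the disk0 example loaded, a lookup such as `store["disk0"]["ports"]["1"]["sysfs-path"]` would reach the second port's sysfs path.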


deployment-model details

service-container disk fibrechannel { disk-scsi-policy multipath-policy }
service-container disk ide { disk-ide-policy lsb-policy devfs-policy }
service-container ethernet generic { ethernet-policy }

device-node-default disk fibrechannel
{
    mode 0x642
    owner "root"
    group "foo"
}
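The service-container lines above amount to a lookup table from (class, sub-class) to an ordered list of policy methods. A rough sketch of reading them, assuming the single-line form shown here is the only one that matters; the regular expression and function name are illustrative and do not reflect the real YACC grammar.

```python
import re

# Matches: service-container <class> <sub-class> { <policy> ... }
_SERVICE = re.compile(r"service-container\s+(\S+)\s+(\S+)\s*\{([^}]*)\}")


def parse_service_containers(text):
    """Map (class, sub-class) to its ordered list of policy methods."""
    containers = {}
    for dev_class, sub_class, body in _SERVICE.findall(text):
        # Order matters: actions are invoked in a definite order.
        containers[(dev_class, sub_class)] = body.split()
    return containers
```

Given the three lines above, looking up `("disk", "ide")` would yield `disk-ide-policy`, `lsb-policy` and `devfs-policy`, in that order.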


Multi-chassis agent example (1)

[Diagram: two chassis, 0x1234 and 0x5678, each holding three CPU boards and one disk in numbered slots, joined by shared disk and network interconnects]

• Chassis have their disks and networks interconnected
• Hot swap notification is limited to the chassis (IPMI)
• A publisher agent broadcasts hot swap events to other chassis
• Each CPU runs a subscriber agent - processes hot swap events
• Each CPU is running a uSDE executive


Multi-chassis agent example (2)

[Diagram: the publisher sends “Insert event for chassis 0x5678, slot 2, disk ID” to the hot swap subscriber and uSDE agent on the CPUs of chassis 0x1234 (slots 1, 2, 3) and of chassis 0x5678 (slots 1, 3, 4). On each side the agent passes an aspect-change event to the uSDE executive, and disk-cs-policy creates the device node /dev/chassis5678/slot2/...]
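The geographical name in this example can be derived directly from the chassis ID and slot number carried by the event. A tiny sketch, assuming the chassis ID is written in hexadecimal without the 0x prefix (as the /dev/chassis5678 path suggests); the function name is hypothetical and the part of the path after the slot directory is left out because the slide elides it.

```python
def chassis_slot_dir(chassis_id, slot):
    """Directory prefix for devices at a chassis/slot address."""
    return f"/dev/chassis{chassis_id:x}/slot{slot}"
```

Every CPU that receives the same aspect-change event computes the same prefix, so the disk gets one name across both chassis.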