Tail-f Systems Whitepaper - Practical Examples of NETCONF Protocol

Tail-f White Paper

Tail-f Systems © 2008

NETCONF by Example

Practical examples of network management using NETCONF

Executive Summary

The NETCONF protocol is a modern building block for device management automation, offering a

unified approach to network configuration, rather than a device-specific methodology. To equipment

vendors, NETCONF is a way to standardize the management interface of network elements. To service

providers, NETCONF is a way to optimize the administrative workflow, by moving management

intelligence out of the device under management, and consolidating this management into higher-level

applications.

This paper provides a brief overview of NETCONF, presents its commands and operations, describes

how the protocol works in a series of practical scenarios, and describes its sophisticated configuration

capabilities.

Overview of NETCONF

NETCONF is a protocol which was officially published as an RFC by the IETF NETCONF Working

Group on December 13, 2006 [1, 2]. It provides mechanisms for installing, querying, manipulating,

and deleting the configurations of network devices.

As shown in Figure 1, NETCONF consists of four layers.

Layer                                   Function
Content Layer                           Transports configuration data
Operations Layer                        Queries and edits the configuration
RPC Layer                               Performs the actual <rpc> and <rpc-reply> operations
Application Protocol (SSH, SSL, etc.)   Transports the protocol

NETCONF exposes a standardized RPC-style API based on XML. The XML requests and responses are

sent over a persistent, secure, authenticated transport protocol, such as SSH. The use of encryption

means the requests and responses are confidential and tamper-proof. This enables devices to be

managed over an untrusted wide area network (WAN) using well-known security technologies.

Configuration over a WAN means network management can be centralized, by consolidating all

management to a single site, or decentralized, by permitting multiple sites to share device management

work.

Figure 1: NETCONF comprises four distinct layers.

In addition to secure communications, NETCONF requires devices to track client identities and enforce

permissions associated with identities. Identities are managed at the underlying secure transport layer,

such as SSH, and reported to the NETCONF agent. The NETCONF agent then enforces any restrictions

based on whatever security model is implemented by the node.

NETCONF is extensible and future-proof. NETCONF sessions begin with a capability discovery phase,

where the network element exposes its capabilities to the management device and the parties

subsequently discard unknown capabilities. New features can be defined locally, but formally, with a

rigorous syntax and semantics.

NETCONF Commands and Operations

NETCONF consists of a base set of commands, extended by capabilities. A capability is identified by a

URI, and augments the base operations with new commands, parameters, values and named entities.

NETCONF commands operate mainly on data stores, which are versions of the device state. A data

store consists of configuration data, which represents settable device parameters, and state data, which

represents device statistics. Configuration data can be read and written while state data is read-only. By

default, a NETCONF device provides a single data store, named <running>. Capabilities add extra data

stores, such as <startup>, which represents the device’s startup state, or <candidate>, which represents a

temporary state before it is made permanent.

NETCONF uses a remote procedure call model. Requests are XML documents with a top-level tag

<rpc>. An XML namespace [3] and a unique message identifier are associated with each request. An

example that defines a trivial query <get> of the configuration follows. (Note that, when appropriate,

the core portion of XML requests will be highlighted to help understanding.)

<rpc message-id="101"

xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">

<get>

<filter type="subtree">

</filter>

</get>

</rpc>

The request specifies a filter, written <filter>, which tells the device what data the client wants.

The device returns the following:

<rpc-reply message-id="101"

xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <data> </data> </rpc-reply>

As shown, the request and reply both include the same message-id number. In this example, no data is

returned since the filter is empty and does not specify any desired fields.
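A client typically pairs each reply with its request through the message-id attribute. The following Python sketch, using only the standard library, builds the <get> request shown above and checks that a reply carries the same identifier. The helper names build_get and reply_matches are illustrative assumptions, not part of any NETCONF library.

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:netconf:base:1.0"  # NETCONF base namespace


def build_get(message_id):
    """Build an <rpc> carrying a <get> with an empty subtree filter."""
    rpc = ET.Element(f"{{{NS}}}rpc", {"message-id": str(message_id)})
    get = ET.SubElement(rpc, f"{{{NS}}}get")
    ET.SubElement(get, f"{{{NS}}}filter", {"type": "subtree"})
    return rpc


def reply_matches(request, reply_xml):
    """Return True if the reply echoes the request's message-id."""
    reply = ET.fromstring(reply_xml)
    return reply.get("message-id") == request.get("message-id")


request = build_get(101)
reply = (
    '<rpc-reply message-id="101" '
    'xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"><data/></rpc-reply>'
)
print(ET.tostring(request, encoding="unicode"))
print(reply_matches(request, reply))
```

A real client would keep a table of outstanding message-ids so that replies can be matched even when several requests are in flight.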

In general, filters are used to express simple database queries. For example, 'return the <name> and

<ip> fields for all <interface> nodes of type "ethernet"' is expressed as follows:

<filter type="subtree">
  <top xmlns="http://example.com/schema/1.2/config">
    <interfaces>
      <interface>
        <type>ethernet</type>
        <name/>
        <ip/>
      </interface>
    </interfaces>
  </top>
</filter>

By specifying a concrete value, ethernet, for the <type> field, the server is directed to select only

“records” matching this specification. By specifying the <name> and <ip> fields, the server is told to

only return those fields from the selected records. (If no fields are given, the whole record is returned.)

Filters are discussed in greater detail below.
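The selection logic can be illustrated with a small Python sketch. It is a deliberately simplified, one-level model of subtree filtering over in-memory records, not an implementation of the full filtering rules: namespaces are omitted, and the function name apply_interface_filter and the sample data are assumptions made for illustration. In this sketch the matched <type> field is kept in the output alongside the selected fields.

```python
import xml.etree.ElementTree as ET


def apply_interface_filter(filter_elem, interfaces):
    """Apply a one-level subtree filter to a list of <interface> elements.

    Filter children with text are content-match nodes: a record must hold
    the same value. Empty children are selection nodes: they name the fields
    to return. With no selection nodes, the whole record is returned.
    """
    matches = {c.tag: c.text.strip() for c in filter_elem if (c.text or "").strip()}
    selected = [c.tag for c in filter_elem if not (c.text or "").strip()]
    result = []
    for rec in interfaces:
        if any((rec.findtext(tag) or "").strip() != val for tag, val in matches.items()):
            continue  # record does not satisfy a content-match node
        out = ET.Element(rec.tag)
        for child in rec:
            if not selected or child.tag in selected or child.tag in matches:
                out.append(child)
        result.append(out)
    return result


data = ET.fromstring(
    "<interfaces>"
    "<interface><name>eth0</name><type>ethernet</type>"
    "<ip>192.0.2.1</ip><mtu>1500</mtu></interface>"
    "<interface><name>lo0</name><type>loopback</type>"
    "<ip>127.0.0.1</ip></interface>"
    "</interfaces>"
)
flt = ET.fromstring("<interface><type>ethernet</type><name/><ip/></interface>")

for rec in apply_interface_filter(flt, list(data)):
    print(ET.tostring(rec, encoding="unicode"))
```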

A NETCONF reply can also indicate one or more errors of varying severity. A simple example follows.

The client asks for the <running> data store:

<rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <get-config> <source> <running/> </source> </get-config> </rpc>

However, the client has omitted the message-id attribute. The server thus replies:

<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <rpc-error>
    <error-type>rpc</error-type>
    <error-tag>missing-attribute</error-tag>
    <error-severity>error</error-severity>
    <error-info>
      <bad-attribute>message-id</bad-attribute>
      <bad-element>rpc</bad-element>
    </error-info>
  </rpc-error>
</rpc-reply>

The <rpc-error> structures the error message to provide type, severity, and other error-specific details.

In contrast with a text-based CLI, a structured error can be easily analyzed by the management

application and presented in the administrator’s preferred format. Sophisticated management

applications can in many instances detect and handle some errors without operator intervention.
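As a sketch of how a management application might consume such a structured error, the following Python fragment parses the reply above into a dictionary using the standard library. The function name parse_errors and the flattening of <error-info> children into the same dictionary are choices made for this illustration only.

```python
import xml.etree.ElementTree as ET

NS = {"nc": "urn:ietf:params:xml:ns:netconf:base:1.0"}

REPLY = """<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <rpc-error>
    <error-type>rpc</error-type>
    <error-tag>missing-attribute</error-tag>
    <error-severity>error</error-severity>
    <error-info>
      <bad-attribute>message-id</bad-attribute>
      <bad-element>rpc</bad-element>
    </error-info>
  </rpc-error>
</rpc-reply>"""


def parse_errors(reply_xml):
    """Return one dict per <rpc-error>, keyed by element name without namespace."""
    root = ET.fromstring(reply_xml)
    errors = []
    for err in root.findall("nc:rpc-error", NS):
        fields = {}
        for elem in err.iter():
            if elem is not err and len(elem) == 0:  # leaf elements only
                fields[elem.tag.split("}", 1)[-1]] = (elem.text or "").strip()
        errors.append(fields)
    return errors


for error in parse_errors(REPLY):
    print(error["error-severity"], error["error-tag"], error.get("bad-attribute"))
```

With the error in this form, an application could, for example, add the missing attribute and resend the request without involving an operator.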

The base NETCONF protocol defines a number of RPC operations:

• <hello> advertises capabilities at the beginning of a session.

• <get> and <get-config> retrieve the configuration.

• <copy-config> and <delete-config> are used for bulk access to datastores.

• <edit-config> updates the configuration, by a detailed specification or bulk access.

• <lock> and <unlock> are used to get and release exclusive access to the device configuration.

• <close-session> and <kill-session> are used to terminate your own or some other session (as identified by a <session-id> tag).

This set of NETCONF operations is extensible with respect to operations, parameters, data stores, and

so on, by specifying and using NETCONF capabilities. Of particular interest are the locking operations.

A management system can ensure exclusive access to a collection of devices by locking and

reconfiguring each device, and committing the changes. Furthermore, by using locks, multiple system

administrators or tools can work concurrently and safely on the same network.

NETCONF in action

A number of detailed examples are presented here to demonstrate how NETCONF performs actual

administrative tasks. Processes examined include a capability negotiation, installation of a new network

element and finally, a multi-device configuration.

Capability negotiation

A NETCONF session begins with capability negotiation. This phase is common to all sessions, so is

discussed in detail here, but omitted from subsequent examples.

The session begins with the management application (client) connecting to the network element

(server). The secure transport layer, e.g., SSH, takes care of client authentication, and the NETCONF

device takes care of authorization; we do not consider these issues further here [4]. The client then sends

an initial capability summary:

<?xml version="1.0" encoding="UTF-8"?> <hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <capabilities> <capability>urn:ietf:params:netconf:base:1.0</capability> </capabilities> </hello>

The request begins by defining the XML version, character encoding and the XML namespace (in this

case, the basic NETCONF 1.0 namespace). Each capability is denoted by a Uniform Resource

Identifier (URI). In this example, the capability urn:ietf:params:netconf:base:1.0 advertises the basic

NETCONF protocol.

Simultaneously, without waiting for the client, the server sends its own session initiation message:

<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <capabilities>
    <capability>urn:ietf:params:netconf:base:1.0</capability>
    <capability>urn:ietf:params:netconf:capability:startup:1.0</capability>
    <capability>http://example.net/router/2.3/myfeature</capability>
  </capabilities>
  <session-id>4</session-id>
</hello>

The server advertises three capabilities: the basic NETCONF capability, the :startup data store

capability, and an implementation-defined “myfeature” capability. The server is also responsible for

assigning and advertising a unique identifier to the session, in this example, the value 4.

As can be seen, the only common capability is the base protocol. Both server and client must therefore

limit themselves to using only this capability.
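The common capability set is simply the intersection of the two advertised sets. A minimal Python sketch of that computation follows, using the two <hello> messages from this example (XML declarations omitted for brevity). The function name capabilities is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"nc": "urn:ietf:params:xml:ns:netconf:base:1.0"}

CLIENT_HELLO = """<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <capabilities>
    <capability>urn:ietf:params:netconf:base:1.0</capability>
  </capabilities>
</hello>"""

SERVER_HELLO = """<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <capabilities>
    <capability>urn:ietf:params:netconf:base:1.0</capability>
    <capability>urn:ietf:params:netconf:capability:startup:1.0</capability>
    <capability>http://example.net/router/2.3/myfeature</capability>
  </capabilities>
  <session-id>4</session-id>
</hello>"""


def capabilities(hello_xml):
    """Extract the set of capability URIs from a <hello> message."""
    root = ET.fromstring(hello_xml)
    return {
        (cap.text or "").strip()
        for cap in root.findall("nc:capabilities/nc:capability", NS)
    }


# Only capabilities advertised by both peers may be used in the session.
common = capabilities(CLIENT_HELLO) & capabilities(SERVER_HELLO)
print(sorted(common))
```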

Capabilities can include a variety of functions, including but not limited to adding commands and data

stores and modifying the parameters of commands. A partial list of capabilities defined in the

NETCONF document follows:

• :url specifies that URLs can be used to specify locations.

• :writable-running means the device supports updates directly to the <running> data store.

• :candidate indicates that the device provides a data store for candidate configurations, which can be used as a work area, then committed or discarded.

• :confirmed-commit extends the :candidate capability with a timeout; an installed but uncommitted configuration will be discarded after the timeout.

• :rollback-on-error reverts the configuration if an error occurs.

• :validate checks configuration for errors before applying it to the device.

• :startup defines an initial (“startup”) data store, separate from the <running> data store.

Installation of a network element

In this example, NETCONF installs a new network element. Essentially, the system administrator plugs

in the device, which prompts the management application to contact the device; determine vendor, make

and version; generate an appropriate configuration; and install the configuration.

First, the administrator installs the necessary security information on the network element so that it can

establish a NETCONF session. The management application (which is the “client” with respect to

NETCONF) sets up a NETCONF session with the device and they exchange capabilities. The

negotiated capabilities are assumed to be :url and :writable-running.

Next, the client queries the device to determine details such as model number and software version.

Generally, a filter can specify deep XML subtrees, but here a simplistic data model is assumed to keep

our examples clear. We define the following filter:

<filter type="subtree">
  <top xmlns="http://example.com/schema/1.2/config/">
    <vendor/>
    <product-name/>
    <vendor-os-release/>
    <vendor-application-release/>
  </top>
</filter>

This filter specifies the namespace to use and the filter type, subtree filtering, which is the basic form of

query filtering supported by NETCONF. (Other forms of filtering, e.g., based on XPath [5], can be

provided as capabilities.) The filter selects and returns the vendor, product name, and the relevant

software revisions for node operating system and application. The following full request uses the filter

to get the configuration data:

<rpc message-id="220" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <get>
    <filter type="subtree">
      <top xmlns="http://example.com/schema/1.2/config/">
        <vendor/>
        <product-name/>
        <vendor-os-release/>
        <vendor-application-release/>
      </top>
    </filter>
  </get>
</rpc>

The network element responds:

<rpc-reply message-id="220" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <data>
    <vendor> ExampleBrand </vendor>
    <product-name> Ethernet Switch Model 300 </product-name>
    <vendor-os-release> 0.9 </vendor-os-release>
    <vendor-application-release> 0.4 </vendor-application-release>
  </data>
</rpc-reply>

Given this information, the management application looks up or generates a device-specific

configuration that satisfies the network role of the device (e.g., regarding connectivity).

The configuration generation might start from a general XML document, which is rewritten using XSLT

[6] to a form suitable for the specific device capabilities and data model and updated to account for

aspects such as network connectivity. The exact details of this process are beyond the scope of this

paper, so they are left unspecified. The generated configuration document is stored at a URL uuu in the

management application. The management application then installs the configuration document on the

device by issuing a <copy-config> or <edit-config> command.

Depending on the network element and its advertised capabilities, the configuration document can be

installed by sending the configuration embedded in the XML request, or by instructing the network

element to fetch it from the specified URL. The configuration contents can be sent as XML, plain text,

or some other negotiated format. In the following example, we assume that the device supports the

:writable-running and :url capabilities (the latter including secure HTTP), which it advertises as follows:

<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <capabilities>
    <capability>urn:ietf:params:netconf:capability:writable-running:1.0</capability>
    <capability>urn:ietf:params:netconf:capability:url:1.0?protocol=http,https,ftp,file</capability>
  </capabilities>
  <session-id>11902</session-id>
</hello>

The :url capability specifies that the protocols http, https, ftp and file are supported in URLs. After

capability exchange, the client sends a <copy-config> request:

<rpc message-id="102" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <copy-config>
    <target>
      <running/>
    </target>
    <source>
      <url>https://[email protected]:passphrase/cfg/new.txt</url>
    </source>
  </copy-config>
</rpc>

The network element copies the configuration from the URL to its <running> data store using secure

HTTP and then replies:

<rpc-reply message-id="102" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <ok/> </rpc-reply>

In general, the management application can now modify the device configuration as appropriate, using

<edit-config>. In this example, however, the configuration is now assumed to be complete and the

device is ready for operation. The NETCONF session can either be stopped or left running, depending

on management policy.
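Before issuing a <copy-config> with a <url> source, as in the installation example, a client might check that the URL scheme is among those the device advertised in its :url capability. The following Python sketch shows one way to do this. The function name url_schemes and the source URL are hypothetical; the parameter name protocol is taken from the hello message in this example, and accepting scheme as an alternative spelling is an assumption of the sketch.

```python
def url_schemes(capability_uri):
    """Return the URL schemes listed in a :url capability URI, or [] if absent."""
    base, _, query = capability_uri.strip().partition("?")
    if not base.endswith(":capability:url:1.0"):
        return []  # not the :url capability
    for param in query.split("&"):
        name, _, value = param.partition("=")
        # "protocol" as in the example hello; "scheme" accepted as an assumption
        if name in ("protocol", "scheme"):
            return [s for s in value.split(",") if s]
    return []


advertised = "urn:ietf:params:netconf:capability:url:1.0?protocol=http,https,ftp,file"
source_url = "https://config.example.com/cfg/new.txt"  # hypothetical location

scheme = source_url.split(":", 1)[0]
if scheme in url_schemes(advertised):
    print("device can fetch", source_url)
else:
    print("embed the configuration in the request instead")
```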

Multi-device configuration

The next example demonstrates how to safely configure two devices at a time with locking. Two

devices are to be reconfigured, A and B. First, NETCONF sessions are established from the

management application to both devices. The management application then locks the <running>

configurations of both devices. First it requests a lock for device A:

<rpc message-id="500" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <lock> <target> <running/> </target> </lock> </rpc>

Device A replies that the lock is granted:

<rpc-reply message-id="500" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <ok/> </rpc-reply>

The same process is repeated for device B. If another NETCONF session has already locked the

configuration, the operation will fail and an error is returned:

<rpc-reply message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <rpc-error>
    <error-type>protocol</error-type>
    <error-tag>lock-denied</error-tag>
    <error-severity>error</error-severity>
    <error-message> Lock failed, lock is already held </error-message>
    <error-info>
      <session-id>32500</session-id>
    </error-info>
  </rpc-error>
</rpc-reply>

In this event, the management application must retry its lock request on B. When

multiple managers compete for a set of devices, care must be taken to ensure that deadlock does not

occur. Standard policies handle this, normally by locking the devices in a well-defined order. If a

deadlock occurs for some reason, a lock can be broken by an outside agent, via the <kill-session>

command. This terminates the lock-owning session.
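The ordered-locking policy can be sketched in a few lines of Python. The FakeDevice class below is a hypothetical in-memory stand-in for a NETCONF server's <running> lock, not a real client; it only models whether a lock request is granted or denied. Releasing already-acquired locks when a later one is denied is a design choice of this sketch, intended to keep a waiting manager from blocking others.

```python
class FakeDevice:
    """In-memory stand-in for the <running> lock of one device (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.lock_owner = None

    def lock(self, session):
        if self.lock_owner is not None:
            return False  # corresponds to a lock-denied <rpc-error>
        self.lock_owner = session
        return True

    def unlock(self, session):
        if self.lock_owner == session:
            self.lock_owner = None


def lock_all(devices, session):
    """Lock devices in name order; on failure, release the locks already taken."""
    taken = []
    for dev in sorted(devices, key=lambda d: d.name):
        if not dev.lock(session):
            for held in reversed(taken):
                held.unlock(session)
            return False
        taken.append(dev)
    return True


a, b = FakeDevice("A"), FakeDevice("B")
b.lock("other-manager")           # B is already locked by another session
print(lock_all([b, a], "mgr-1"))  # False: B is denied, so the lock on A is released
b.unlock("other-manager")
print(lock_all([b, a], "mgr-1"))  # True: A and B are locked, in that order
```

Because every manager requests locks in the same order, two managers can never each hold one lock while waiting for the other.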

Once both A and B have been locked, the management application modifies both data stores using

<edit-config>. The <edit-config> operation specifies what should be done with a decorated query filter,

an XML tree where certain nodes are decorated with operations indicating whether the node should

merge with the existing configuration, replace it, delete it, or perform some other operation. Here is an

example:

<config xmlns:xc="urn:ietf:params:xml:ns:netconf:base:1.0">
  <top xmlns="http://example.com/schema/1.2/config">
    <interface xc:operation="replace">
      <name>Ethernet0/0</name>
      <mtu>1500</mtu>
      <address>
        <name>192.0.2.4</name>
        <prefix-length>24</prefix-length>
      </address>
    </interface>
  </top>
</config>

The xmlns:xc definition binds the identifier xc to the NETCONF base namespace, which then defines

the meaning of the xc:operation="replace" attribute two lines below.

Let us assume that the data model of the device specifies that the interface name is the unique key

identifying the subtree. The filter in the above example must locate the interface with name Ethernet0/0 and

replace the associated information with the specified mtu and address information. If no such interface

exists, an error is returned. (To create a new interface with this information, the management application

could instead specify xc:operation="create".)
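A management application would normally generate such a decorated tree rather than write it by hand. The following Python sketch builds a shortened version of the <config> element above with the standard library. The function name interface_edit is an assumption made for illustration, and the serializer emits the <config> element with an explicit xc prefix, which is equivalent to the form shown above once it is embedded in a full <rpc> request.

```python
import xml.etree.ElementTree as ET

NC = "urn:ietf:params:xml:ns:netconf:base:1.0"   # NETCONF base namespace
CFG = "http://example.com/schema/1.2/config"     # example data-model namespace
OPERATIONS = {"merge", "replace", "create", "delete"}

ET.register_namespace("xc", NC)  # serialize the base namespace with prefix xc


def interface_edit(name, mtu, operation="replace"):
    """Build a <config> tree that applies `operation` to one interface."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    config = ET.Element(f"{{{NC}}}config")
    top = ET.SubElement(config, f"{{{CFG}}}top")
    iface = ET.SubElement(
        top, f"{{{CFG}}}interface", {f"{{{NC}}}operation": operation}
    )
    ET.SubElement(iface, f"{{{CFG}}}name").text = name  # key of the subtree
    ET.SubElement(iface, f"{{{CFG}}}mtu").text = str(mtu)
    return config


print(ET.tostring(interface_edit("Ethernet0/0", 1500), encoding="unicode"))
```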

In the example of an installation of a new network element, we wrote directly into the <running> data

store, but other options exist. If the :candidate capability is available, the management application can

write to the <candidate> data store, then commit it later. The commit operation copies the <candidate>

data store to <running>.

We have also assumed that the entire configuration update of a single device is successful. If the update

fails, the data store may be left partially updated in the basic case (this is the behavior of the stop-on-error

option, which stops after the first error, and of continue-on-error, which runs all updates and reports any errors

afterwards). To revert to the original configuration or repair a partial reconfiguration, changes would

have to be manually undone by the client, for example by examining the errors and updates, followed by

issuing appropriate <edit-config> commands, or by copying and restoring the original configuration. A

more convenient alternative is provided by the (optional) :rollback-on-error capability, which discards

all changes when an update fails, and therefore automatically reverts to the pre-update data store.

NETCONF furthermore supports a rich spectrum of options for operating on data stores, including the

validation of submitted configuration values before accepting them.

Returning to the multi-device example, the management application unlocks both devices. Device A is

unlocked with the request below, and device B by an equivalent request.

<rpc message-id="456" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <unlock> <target> <running/> </target> </unlock> </rpc>

Device A replies that the operation was successful and that the lock has been released:

<rpc-reply message-id="456" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <ok/> </rpc-reply>

Device B gives a similar indication of success. The NETCONF sessions can now be terminated or

continued as desired. (Note that terminating a session implicitly releases any held locks, so in this case,

the above explicit unlocking is not strictly needed if the sessions are terminated.)

At this point, the configurations of both devices have been updated, avoiding possible inconsistencies

due to multiple managers updating overlapping equipment.

Sophisticated configuration capabilities of NETCONF

In addition to support in basic scenarios such as those described above, NETCONF offers sophisticated

configuration capabilities. One of them is the :candidate capability, which provides a tentative data store

<candidate>, where a configuration can be modified without impacting the current device configuration.

A <commit> operation makes the candidate store permanent, i.e., copies <candidate> to the <running>

data store.

Note that the <candidate> data store is not private by default. A client should lock the store before

modifying it. When all modifications are done, the client either makes them permanent with <commit>

or discards them, using <discard-changes>.

In some cases, device reconfiguration may potentially disable a device. For example, assume a device

has a single interface, if0. While reconfiguring, Joe deletes the interface if0 and adds a new one, if1.

Unfortunately, once he commits the operation, Joe realizes that if1 was misconfigured and the device is

now unreachable. What should be done?

The conventional approach is to reset the device to a known factory state and reconfigure it to a known

good configuration. However, taking such an approach is time consuming, and, in a modern, device-rich

installation, will not scale.

The underlying problem is that reconfiguring a device may essentially disable it, putting it into a state

which cannot itself be reconfigured. At this point, administrators are normally forced to do a factory

reset, manual debugging via a serial line, or the equivalent. However, NETCONF provides a high-level

way to back out of such “disabled” states, using the :confirmed-commit capability. This extends the

<commit> operation with the following protocol:

1. Lock and modify the configuration

2. Issue a <commit> with an extra confirm timeout parameter.

3. Test the configuration.

4. If the configuration is satisfactory, issue a second <commit> with an extra confirmation parameter.

After the second commit, the configuration is made permanent. Alternatively, if the confirming commit is

not issued before the timeout expires, the device reverts to its pre-commit configuration.

Reconsider the scenario above and assume the device gets a bad if1 definition and becomes

unreachable. If the confirming commit cannot reach the device, the device automatically reverts to the

working configuration after the timeout. To trigger this behavior, the first commit is extended with a

confirm-timeout of 120 seconds:

<rpc message-id="747" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">

<commit> <confirmed/> <confirm-timeout>120</confirm-timeout> </commit> </rpc>

This copies the configuration to <running>. Next, the administrator commences and successfully

completes configuration testing. Finally, the administrator issues the second, confirming commit, which

is simply a regular commit operation:

<rpc message-id="757"

xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"> <commit/> </rpc>
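The timeout behavior can be modeled with a toy device in Python. This is a sketch of the device-side logic under simplifying assumptions, not an implementation of a NETCONF server: the class name ConfirmedCommitDevice is hypothetical, time is passed in explicitly as a number rather than read from a clock, and resetting <candidate> on rollback is a simplification of this sketch.

```python
class ConfirmedCommitDevice:
    """Toy model of :confirmed-commit; `now` values are in seconds."""

    def __init__(self, running):
        self.running = dict(running)
        self.candidate = dict(running)
        self._saved = None      # configuration to restore if never confirmed
        self._deadline = None   # time at which the rollback happens

    def tick(self, now):
        """Roll back if a confirmed commit was not confirmed in time."""
        if self._deadline is not None and now >= self._deadline:
            self.running = dict(self._saved)
            self.candidate = dict(self._saved)  # simplification of this sketch
            self._saved = None
            self._deadline = None

    def commit(self, now, confirmed=False, confirm_timeout=120):
        self.tick(now)  # a commit arriving after the deadline comes too late
        if confirmed:
            if self._saved is None:
                self._saved = dict(self.running)
            self._deadline = now + confirm_timeout
        else:
            self._saved = None  # confirming commit: the change is permanent
            self._deadline = None
        self.running = dict(self.candidate)


dev = ConfirmedCommitDevice({"if0": "192.0.2.4/24"})
dev.candidate = {"if1": "misconfigured"}
dev.commit(now=0, confirmed=True, confirm_timeout=120)
print(dev.running)   # the new, unconfirmed configuration is active
dev.tick(now=120)    # no confirming commit arrived within 120 seconds
print(dev.running)   # the device is back on its working configuration
```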

Confirmed commit can also be used, quite easily, for commits spanning multiple devices, as follows:

1. For each device, update its configuration and emit a first (confirmed) commit with a timeout.

2. If all updates are successful and testing passes, then issue a second (confirming) commit to each device.

Most significantly, if the configuration update for any device fails, then all the involved devices are

forced to revert to their old configurations simply by not issuing the second, confirming commit to any

of them.
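The two-step procedure above can be sketched as follows. ToyDevice is a hypothetical in-memory model in which a failed update is represented by a configuration of None and the confirm timeout is triggered by calling expire(); the function name commit_everywhere and the test callback are likewise assumptions made for illustration.

```python
class ToyDevice:
    """In-memory model of a device supporting confirmed commit (hypothetical)."""

    def __init__(self, name, running):
        self.name = name
        self.running = running
        self._previous = None  # set while a confirmed commit is pending

    def confirmed_commit(self, config, timeout):
        # `timeout` is unused in this toy model; expire() simulates it instead.
        if config is None:     # stands in for any <rpc-error> during the update
            return False
        self._previous, self.running = self.running, config
        return True

    def confirm(self):
        self._previous = None  # the new configuration becomes permanent

    def expire(self):
        """Simulate the confirm timeout elapsing."""
        if self._previous is not None:
            self.running, self._previous = self._previous, None


def commit_everywhere(devices, new_config, test, timeout=120):
    """Confirmed commit on every device, then confirm all or none."""
    for dev in devices:
        if not dev.confirmed_commit(new_config[dev.name], timeout):
            return False       # send no confirming commit; timeouts undo the rest
    if not test(devices):
        return False
    for dev in devices:
        dev.confirm()
    return True


devices = [ToyDevice("A", "old-A"), ToyDevice("B", "old-B")]
ok = commit_everywhere(devices, {"A": "new-A", "B": None}, test=lambda devs: True)
for dev in devices:
    dev.expire()               # the timeouts elapse without confirmation
print(ok, [dev.running for dev in devices])
```

In this run the update of B fails, so A is never confirmed and returns to its old configuration once its timeout expires.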

Conclusion

This paper shows how a selection of NETCONF protocol features combine to enable safe and

expressive remote management of multiple NETCONF-enabled network devices. The protocol security

features provide secure access over local and wide-area networks while supporting administrator roles.

Filters provide support for XML-database queries and updates. Configuration data stores permit safe

concurrent operations through a locking mechanism. Finally, confirmed commit operations allow

networks to automatically recover from previously fatal configuration errors.

References

[1] IETF NETCONF Working Group. (http://www.ops.ietf.org/netconf/).

[2] R. Enns. NETCONF Configuration Protocol. RFC 4741, IETF NETCONF Working Group,

December 2006 (http://www.ops.ietf.org/netconf/).

[3] Namespaces in XML. W3C specification (http://www.w3.org/TR/REC-xml-names/).

[4] M. Wasserman, T. Goddard. Using the NETCONF Configuration Protocol over Secure

Shell (SSH). Work in Progress (version 6), IETF NETCONF Working Group

(http://www.ops.ietf.org/netconf/).

[5] XML Path Language. W3C specification (http://www.w3.org/TR/xpath).

[6] XSL Transformations. W3C specification (http://www.w3.org/TR/xslt).

Tail-f Systems Headquarters

Klara Norra Kyrkogata 31

SE-111 22, Stockholm, Sweden

Phone: +46 8 21 37 40

www.tail-f.com

[email protected]

Tail-f Systems North America

109 S. King Street, Suite 4

Leesburg, VA 20175, USA

Phone: +1 703-777-1936
