Troubleshooting High CPU on 6500 With Sup 720

Postings may contain unverified user-created content and change frequently. The content is provided as-is and is not warrantied by Cisco.

The purpose of this document is to cover how to determine the cause of high CPU on a 6500/7600 with a sup720. The troubleshooting methods discussed in this documentation will make it possible to determine the cause of 90% of all high CPU issues on the sup720.

The majority of high CPU on the sup720 is related to CPU usage on the MSFC, thus the majority of this document will cover high CPU on the MSFC.

Because it is not possible to cover every reason for high CPU on the sup720, I will demonstrate how to use some of the tools built in to the sup720 to show general methods for narrowing down the cause of high CPU.

If you are unable to determine the reason based on this documentation, please open a TAC case to investigate the issue further.

**Note that these methods can also be used to determine the cause of high CPU on an RSP720, Sup32 and VS-S720, due to the common architecture.

Determining Where the CPU utilization is occurring:

Within the sup720 6500, there are two types of CPUs. One is used for layer 2 operations and is commonly referred to as the SP (Switch Processor) CPU. The other CPU is used for layer 3/4 operations and is commonly referred to as the RP (Route Processor) CPU. Both of these processors are located on the MSFC3 complex, each with a 1 gig in-band channel to the supervisor.

Also, depending on the module, you may have a DFC (Distributed Feature Card) to perform forwarding locally on that module. The DFC also has its own CPU, which performs processing locally on the line card. Under certain scenarios high CPU can be seen on these modules.

High CPU on the SP (Switch processor):

High CPU on the SP is much less common than high CPU on the RP. The reasons for high CPU on the SP are typically related to layer 2 operations of the sup720, such as spanning-tree (processing of BPDUs), processing of IGMP snooping/IGMP queries/membership reports, or LACP/PAGP.

You can view the CPU utilization using the following command:

SP CPU Util:

Switch# remote command switch show process cpu

OR

Switch#remote login switch

Switch-sp#show process cpu

High CPU on the RP (Route Processor):

This will be traffic that needs to be processed for layer 3 operations, such as ARP, HSRP, or forwarding traffic in software. Below I will go over troubleshooting steps when seeing high CPU on the IP Input/ARP Input process, as well as CPU utilization caused by interrupt switched traffic on the RP CPU.

You can view the CPU utilization using the following command:

RP CPU Util:

Switch#show process cpu

High CPU on a DFC/module:

The CPU on the DFC helps program the TCAM and routes in hardware, since each DFC has its own TCAM.

High CPU on a DFC is not very common and can occur for a few different reasons. One reason you may see high CPU on the DFC is Netflow Data Export. Typically some CPU usage from NDE is expected, but in rare instances it can become high enough to disrupt other processes.

You can view the CPU utilization using the following command:

DFC CPU Util:

Switch# attach <module>

Switch-DFC#show process cpu

Types of CPU utilization:

There are two types of CPU utilization within IOS: interrupt and process.

Process based CPU utilization:

CPU utilization caused by a process can have a few causes, listed below:

1.) Process switched traffic. This is traffic that is hitting a specific process in order to be forwarded OR processed by the CPU. An example of each would be traffic being forwarded via the "IP Input" process OR control-plane traffic hitting the "PIM process".

2.) A process trying to clean up tables/previous actions performed. This can be seen in processes such as "CEF Scanner" OR "BGP Scanner", which are used to clean/update the CEF and BGP tables.

Interrupt based CPU utilization:

CPU utilization caused by interrupts is always traffic based. Interrupt switched traffic is traffic that does not match a specific process, but still needs to be forwarded.

Determining the type of CPU utilization:

Process and interrupt CPU utilization are both listed in the output of the "show process cpu" command. The breakdown below shows how to determine what percentage of the CPU utilization is due to interrupt switched traffic and what percentage is due to process switched traffic:

6500-3#sh proc cpu

CPU utilization for five seconds: 0%/0%; one minute: 0%; five minutes: 0%

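In the "five seconds" field, the two values separated by the slash are read as follows:

First value (before the slash) - Percentage of total CPU utilization.

Second value (after the slash) - Percentage of the CPU that is caused by interrupts.

Percentage of process CPU util. = Total CPU util. - Interrupt CPU util.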
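The subtraction above can be sketched in a few lines of Python for anyone collecting this output with a script. This is only an illustration: the function name and the sample line are hypothetical (the percentages are invented, not taken from a real device), and it assumes the header line follows the format shown in the output above.

```python
import re

def split_cpu_utilization(header_line):
    """Return (total, interrupt, process) percentages for the five-second field."""
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", header_line)
    if match is None:
        raise ValueError("not a 'show process cpu' header line")
    total = int(match.group(1))      # value before the slash
    interrupt = int(match.group(2))  # value after the slash
    return total, interrupt, total - interrupt

# Hypothetical sample line; the values are invented for illustration.
sample = "CPU utilization for five seconds: 87%/62%; one minute: 80%; five minutes: 75%"
total, interrupt, process = split_cpu_utilization(sample)
print(f"total={total}% interrupt={interrupt}% process={process}%")
```

If most of the total is interrupt CPU, the netdr capture is the more useful tool; if most of it is process CPU, start with the process list and the interface input queues.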

Common reasons for HIGH CPU on the MSFC/RP:

IP traffic with a TTL of 1 - We need to send an ICMP time exceeded (TTL expired) message to the host, letting it know the packet has expired in transit. This cannot be done in hardware and thus the packet must be punted to the MSFC. Find the device sending traffic with a TTL of 1 and stop it from sending this traffic, increase the TTL OR install the MLS TTL rate-limiter.

Using an ACL with the log keyword - Since the log keyword requires a syslog message to be generated, this traffic must be punted to the RP CPU as it cannot be handled in hardware. Remove the log keyword from the ACL.

Using a PBR route-map without a set statement - Any traffic that matches a PBR route-map with no set statement will be punted. This is due to the fact that we need to program the next-hop in hardware, and if the next-hop is not known, this traffic must be punted to determine the next hop. Configure a set statement OR remove the policy route from the interface.

FIB TCAM Exception - If you try to install more routes than are possible into the FIB TCAM you will see the following error message in the logs:

%CFIB-SP-7-CFIB_EXCEPTION : FIB TCAM exception, Some entries will be software switched

%CFIB-SP-STBY-7-CFIB_EXCEPTION : FIB TCAM exception, Some entries will be software switched

This error message is received when the amount of available space in the TCAM is exceeded. This results in high CPU. This is a FIB TCAM limitation. Once the TCAM is full, a flag is set and the FIB TCAM exception is raised. This stops new routes from being added to the TCAM, so any routes that did not fit are software switched. Removing routes does not help resume hardware switching. Once the TCAM enters the exception state, the system must be reloaded to get out of that state. You can check whether you have hit a FIB TCAM exception with the following command:

6500-2#sh mls cef exception status

Current IPv4 FIB exception state = TRUE

Current IPv6 FIB exception state = FALSE

Current MPLS FIB exception state = FALSE

When the exception state is TRUE, the FIB TCAM has hit an exception.

The maximum number of routes that can be installed in the TCAM can be increased with the "mls cef maximum-routes" command.

This issue is common when trying to route a full BGP table on a PFC-3A or a PFC-3B.

**Note a failover of the supervisors in a dual supervisor system will not recover this exception, even though the “show mls cef exception status” command will no longer indicate a FIB exception. A full reload of the switch is required.

ICMP redirects - If traffic is taking a path that is not efficient, an ICMP redirect will be sent out to inform the host of a better next-hop. This will cause the packet to be punted in order to trigger the MSFC to send the ICMP redirect to the host. This can be seen when performing a netdr capture. An example of using netdr can be seen in the “Tools used to determine the source of the CPU utilization:” section.

Turn off ICMP redirects to stop this traffic from being punted. However, the redirects are an indication of a network inefficiency that the switch was attempting to resolve dynamically. User interaction is needed in order to track down this inefficiency.

If you need assistance in determining why ICMP redirects are being generated, please open a TAC case.

CEF Glean adjacency - This can happen when there is no ARP resolution for the next hop. All traffic must be punted in order to trigger an ARP request for the next hop. This will always manifest itself as interrupt based traffic.

To protect the RP CPU from this issue you can implement the Glean adj. mls rate-limiter.

Netflow and ACL feature configured on the same interface matching the same traffic - You cannot have an ACL based feature and a flow based feature configured on the same interface for the same traffic. An example of this would be having NAT and PBR configured on the same interface matching the same traffic.

NAT is netflow assisted, as the first packet in every flow needs to be punted to create the netflow entry in hardware. Once the netflow entry is created, all subsequent packets will hit this hardware netflow entry and thus be forwarded in hardware.

Policy-Based Routing is ACL based. This will create a “policy-route” state when a route-map is configured to use PBR and applied to that interface. This will point to a special adjacency, which is where the next-hop specified in the “set” statement of the route-map is programmed.

The issue comes when a packet matches both the NAT and the PBR feature. The traffic cannot be sent to the CPU to be put into Netflow AND be redirected to the PBR special adjacency, thus this traffic must be software switched. If these two features overlap, they are taken out of hardware and the traffic is software switched. When this occurs neither feature may be applied to the matching traffic.

If a packet does not match both the ACL based feature and Netflow based feature match criteria then the relevant function (ACL based or Netflow based) will be performed in hardware.

Therefore, for proper hardware based performance in situations where ACL based features and Netflow based features are configured on the same interfaces it is important to have unique policies.

To work around this problem do not have both an ACL based and Netflow based feature configured on the same interface, matching the same traffic.

You can read more about troubleshooting feature conflict issues via the following link:

https://supportforums.cisco.com/docs/DOC-15670

Directed Broadcast traffic - All broadcast traffic must be sent to the MSFC on a vlan when a layer 3 interface is configured within that vlan. This includes directed broadcast traffic. Use multicast instead of directed broadcast.

Bridging loop - If a bridging loop occurs on the network, this could cause high CPU on the MSFC. All broadcast traffic must be sent to the MSFC on a vlan when a layer 3 interface is configured within that vlan.

You can determine what traffic is hitting the CPU by using a netdr capture to track down the source interface of the loop (see the "Using Netdr to determine traffic punted to the CPU" section).

GRE with non-unique tunnel source - On the sup720, tunnel sources must be unique for all tunnels. Tunnels with a non-unique source will be software switched. The workaround for this limitation is to use either unique loopback interfaces for every GRE tunnel OR use secondary addresses on a loopback interface for the tunnel source addresses. For more information see CSCdy72539.

You may also see the following error:

%Warning: Using same source IP for more than one IP/GRE tunnels may cause software switching packets for tunnels using this address. If possible, use a unique tunnel source for Interface Tunnel <tun#>

Other common unsupported features on Sup720-PFC3:

The following features/traffic types are common features that are not supported by the 6500 and will cause high CPU if implemented:

**Note this is not an exhaustive list and there may be unsupported features not listed below.

NBAR

Traffic with IP options field set.

Multicast RPF drops

RSVP (INTSERV QOS) *can be used for tunnels

CEF accounting

Multicast traffic and NAT – see CSCek78254

The following link will give a larger list of all unsupported features and commands on the sup720-PFC3:

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/release/notes/features.html#wp3691673

Tools used to determine the source of the CPU utilization:

Determine the source of RP CPU utilization using interface buffers:

**Note** you will only be able to see traffic in the interface buffers on a layer 3 interface if the traffic is being process switched (see “Determining the type of CPU utilization” above). This will not work when traffic is being interrupt switched. In the case of interrupt switched traffic use the netdr capture instead.

One of the quickest ways to determine the layer 3 interface that is the source of traffic causing high CPU is to see which interface has a large number of drops or flushes on its input queue. The input queue on a layer 3 interface is the CPU queue for that interface on the sup720. If we ever see packets/drops on the input queue on the sup720, it is always due to traffic that is being sent towards the CPU. You can narrow down the location of such an interface with the following commands:

6500-2#show interface | include is up|drop

Vlan10 is up, line protocol is up

Input queue: 74/75/18063/18063 (size/max/drops/flushes); Total output drops: 0

Vlan20 is up, line protocol is up

Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

We can see that SVI (Switched Virtual Interface) 10 has 74 packets in its buffer, whose queue size is 75 packets. This demonstrates that a large amount of traffic is being punted on this interface to the RP CPU, since this queue is full.
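When the switch has many layer 3 interfaces, the same check can be scripted against saved output. The Python sketch below is only an illustration: the function name and the 50% fill threshold are assumptions, and it expects text in the filtered "show interface | include is up|drop" format shown above.

```python
import re

QUEUE_RE = re.compile(r"Input queue: (\d+)/(\d+)/(\d+)/(\d+)")

def busy_input_queues(show_interface_output, fill_threshold=0.5):
    """Return (interface, size, max, drops, flushes) for queues that are filling or dropping."""
    results = []
    current = None
    for line in show_interface_output.splitlines():
        line = line.strip()
        if " is up" in line and "line protocol" in line:
            current = line.split()[0]  # e.g. "Vlan10"
            continue
        match = QUEUE_RE.search(line)
        if match and current is not None:
            size, qmax, drops, flushes = map(int, match.groups())
            # Flag the queue if it has dropped packets or is at least half full.
            if drops > 0 or (qmax > 0 and size / qmax >= fill_threshold):
                results.append((current, size, qmax, drops, flushes))
    return results

sample = """Vlan10 is up, line protocol is up
Input queue: 74/75/18063/18063 (size/max/drops/flushes); Total output drops: 0
Vlan20 is up, line protocol is up
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0"""

for entry in busy_input_queues(sample):
    print(entry)
```

With the sample output from this document only Vlan10 is reported, which matches the manual reading above.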

Now that we can see a large amount of traffic within this queue, we can look at what is in this queue with the command "show buffers input-interface vlan 10 header". This command will display the IP header of the packet so we can attempt to determine the source. If you want to look at the entire packet you can use the command "show buffers input-interface vlan 10 packet".

Below is the output from this command for SVI 10:

6500-2#sh buffers input-interface vlan 10 header

Buffer information for Small buffer at 0x4667A08C

data_area 0x802F664, refcount 1, next 0x466AE968, flags 0x200

linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1

if_input 0x530D5048 (Vlan10), if_output 0x0 (None)

inputtime 00:00:00.000 (elapsed never)

outputtime 00:00:00.000 (elapsed never), oqnumber 65535

datagramstart 0x802F6DA, datagramsize 60, maximum size 308

mac_start 0x802F6DA, addr_start 0x802F6DA, info_start 0x0

network_start 0x802F6E8, transport_start 0x802F6FC, caller_pc 0x41F78790

source: 10.10.10.2, destination: 10.100.101.10, id: 0x0000, ttl: 1,

TOS: 0 prot: 6, source port 0, destination port 0

Above we can see the basic information about this traffic that is included in the IP header, including the TOS, TTL and protocol encapsulated within the IP header.

If we view the entire packet we can look at more in-depth information, including the layer 2 information, as can be seen below:

6500-2#sh buffers input-interface vlan 10 packet

Buffer information for Small buffer at 0x466A23B0

data_area 0x80340A4, refcount 1, next 0x466E991C, flags 0x200

linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1

if_input 0x52836BE4 (Vlan10), if_output 0x0 (None)

inputtime 16:32:10.292 (elapsed 00:00:50.608)

outputtime 00:00:00.000 (elapsed never), oqnumber 65535

datagramstart 0x803411A, datagramsize 60, maximum size 308

mac_start 0x803411A, addr_start 0x803411A, info_start 0x0

network_start 0x8034128, transport_start 0x0, caller_pc 0x41F78790

source: 10.10.10.2, destination: 10.100.101.10, id: 0x0000, ttl: 1,

TOS: 0 prot: 6, source port 0, destination port 0

0: 0015C726 FB800000 01000600 08004500 ..G&{.........E.

16: 002E0000 00000106 36510A0A 0A020A64 ........6Q.....d

32: 650A0000 00000000 00000000 00005000 e.............P.

48: 0000265C 00000001 02030405 FD ..&\........}

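The fields in the hex dump above are located as follows:

Dest MAC = bytes 0-5 (0015C726 FB80)

Source MAC = bytes 6-11 (0000 01000600)

Ethertype = bytes 12-13 (0800 for IP traffic)

Src. IP = bytes 26-29 (0A0A0A02 = 10.10.10.2)

Dest IP = bytes 30-33 (0A64650A = 10.100.101.10)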
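The hex dump can also be decoded with a short script rather than by eye. The Python sketch below is an illustration only: the function name is hypothetical, and it assumes an untagged Ethernet II frame carrying IPv4 and an ASCII rendering column whose first token is not made up purely of hex digits.

```python
import ipaddress
import string

def decode_buffer_dump(dump_lines):
    """Decode MACs, ethertype and IPv4 fields from 'show buffers ... packet' hex lines."""
    hex_text = ""
    for line in dump_lines:
        # Each line is "<offset>: <hex words> <ASCII rendering>"; drop the offset.
        _, _, rest = line.partition(":")
        for word in rest.split():
            if not all(ch in string.hexdigits for ch in word):
                break  # reached the ASCII rendering column
            hex_text += word
    frame = bytes.fromhex(hex_text)

    def mac(raw):
        return ":".join(f"{b:02x}" for b in raw)

    return {
        "dest_mac": mac(frame[0:6]),
        "src_mac": mac(frame[6:12]),
        "ethertype": f"0x{int.from_bytes(frame[12:14], 'big'):04x}",
        "ttl": frame[22],       # IPv4 header starts at byte 14; TTL is its 9th byte
        "protocol": frame[23],
        "src_ip": str(ipaddress.IPv4Address(frame[26:30])),
        "dest_ip": str(ipaddress.IPv4Address(frame[30:34])),
    }

dump = [
    r"0: 0015C726 FB800000 01000600 08004500 ..G&{.........E.",
    r"16: 002E0000 00000106 36510A0A 0A020A64 ........6Q.....d",
    r"32: 650A0000 00000000 00000000 00005000 e.............P.",
    r"48: 0000265C 00000001 02030405 FD ..&\........}",
]
for name, value in decode_buffer_dump(dump).items():
    print(name, value)
```

For the dump in this document the decoded TTL is 1 and the protocol is 6 (TCP), which agrees with the header summary printed by the switch.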

Using “show ip traffic” statistics to see why traffic is punted:

6500-2#show interface | i is up|drop

Vlan10 is up, line protocol is up

Input queue: 74/75/18063/18063 (size/max/drops/flushes); Total output drops: 0

Vlan20 is up, line protocol is up

Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

SVI 10/Interface Vlan 10 is receiving a large amount of traffic that is being punted to the RP CPU. We can look at what is in this queue with the command "show buffers input-interface vlan 10 header".

Below is the output from this command for SVI 10

6500-2#sh buffers input-interface vlan 10 header

Buffer information for Small buffer at 0x4667A08C

data_area 0x802F664, refcount 1, next 0x466AE968, flags 0x200

linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1

if_input 0x530D5048 (Vlan10), if_output 0x0 (None)

inputtime 00:00:00.000 (elapsed never)

outputtime 00:00:00.000 (elapsed never), oqnumber 65535

datagramstart 0x802F6DA, datagramsize 60, maximum size 308

mac_start 0x802F6DA, addr_start 0x802F6DA, info_start 0x0

network_start 0x802F6E8, transport_start 0x802F6FC, caller_pc 0x41F78790

source: 10.10.10.2, destination: 10.100.101.10, id: 0x0000, ttl: 1,

TOS: 0 prot: 6, source port 0, destination port 0

Buffer information for Small buffer at 0x4667C7E8

data_area 0x80314A4, refcount 1, next 0x46695FD0, flags 0x200

linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1

if_input 0x530D5048 (Vlan10), if_output 0x0 (None)

inputtime 00:00:00.000 (elapsed never)

outputtime 00:00:00.000 (elapsed never), oqnumber 65535

datagramstart 0x803151A, datagramsize 60, maximum size 308

mac_start 0x803151A, addr_start 0x803151A, info_start 0x0

network_start 0x8031528, transport_start 0x803153C, caller_pc 0x41F78790

source: 10.10.10.1, destination: 10.10.10.2, id: 0xD096, ttl: 255, prot: 1

Since at this point we are unsure why this traffic is being punted, we can look at "show ip traffic" statistics to see why this traffic is being punted to the CPU. First start by clearing the IP traffic statistics. We can then see what is incrementing in these counters to see what would be the cause:

6500-2#clear ip traffic

Clear "show ip traffic" counters [confirm]

6500-2#sh ip traffic

IP statistics:

Rcvd: 33516 total, 0 local destination

0 format errors, 0 checksum errors, 33516 bad hop count <------ We can see that the bad hop count in this case is incrementing

0 unknown protocol, 0 not a gateway

0 security failures, 0 bad options, 0 with options

Opts: 0 end, 0 nop, 0 basic security, 0 loose source route

0 timestamp, 0 extended security, 0 record route

0 stream ID, 0 strict source route, 0 alert, 0 cipso, 0 ump

0 other

Frags: 0 reassembled, 0 timeouts, 0 couldn't reassemble

0 fragmented, 0 couldn't fragment

Bcast: 0 received, 0 sent

Mcast: 0 received, 0 sent

Sent: 0 generated, 0 forwarded

Drop: 40005 encapsulation failed, 0 unresolved, 0 no adjacency

0 no route, 0 unicast RPF, 0 forced drop

0 options denied, 0 source IP address zero

ICMP statistics:

Rcvd: 0 format errors, 0 checksum errors, 0 redirects, 0 unreachable

0 echo, 0 echo reply, 0 mask requests, 0 mask replies, 0 quench

0 parameter, 0 timestamp, 0 info request, 0 other

0 irdp solicitations, 0 irdp advertisements

0 time exceeded, 0 timestamp replies, 0 info replies

Sent: 0 redirects, 0 unreachable, 0 echo, 0 echo reply

0 mask requests, 0 mask replies, 0 quench, 0 timestamp

0 info reply, 58464 time exceeded, 0 parameter problem <---- We can also see that the 6500 is sending ICMP TTL expired messages as well.

0 irdp solicitations, 0 irdp advertisements

<snip>

Looking at the traffic statistics we can see that the bad hop count counter is incrementing and the switch is sending ICMP time exceeded messages.

On the 6500 all traffic with a TTL of 1 is punted to the CPU so that an ICMP TTL expired message can be sent to the host who sent this traffic.

Also, the first packet in the buffer can be seen to have a TTL of 1, which is why this traffic is punted. We can see that the 2nd packet is sourced from 10.10.10.1 (SVI 10) and sent to 10.10.10.2. This packet is an ICMP TTL expired message.
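Because the counters were cleared first, anything non-zero in the output is worth a look. The Python sketch below pulls the non-zero counters out of saved "show ip traffic" text. It is only an illustration: the function name is hypothetical, and it assumes counters are written as "<number> <name>" separated by commas, as in the output above. It does not keep track of which section (Rcvd, Sent, Drop) a counter belongs to.

```python
import re

COUNTER_RE = re.compile(r"(\d+)\s+([A-Za-z][A-Za-z' ]*)")

def nonzero_counters(show_ip_traffic_output):
    """Return (counter name, value) pairs for every counter that is not zero."""
    found = []
    for line in show_ip_traffic_output.splitlines():
        line = line.split("<-")[0]  # drop hand-written "<---- ..." annotations
        for value, name in COUNTER_RE.findall(line):
            if int(value) > 0:
                found.append((name.strip(), int(value)))
    return found

sample = """Rcvd: 33516 total, 0 local destination
0 format errors, 0 checksum errors, 33516 bad hop count <------ annotation
Drop: 40005 encapsulation failed, 0 unresolved, 0 no adjacency
0 info reply, 58464 time exceeded, 0 parameter problem"""

for name, value in nonzero_counters(sample):
    print(f"{value:>8}  {name}")
```

Running the same function on two snapshots taken a few seconds apart and comparing the values shows which counters are actually incrementing.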

Using Netdr to determine traffic punted to the CPU:

A netdr capture is performed on the MSFC CPU controller. This is the closest location at which you can capture a packet on the MSFC in order to determine why traffic is being punted to the MSFC/RP CPU. With a Sup720 or Sup32 it allows one to capture packets on the RP or SP inband. The netdr command can be used to capture both Tx and Rx packets in the software-switching path.

Cat6500#debug netdr capture ?

acl (11) Capture packets matching an acl

and-filter (3) Apply filters in an and function: all must match

continuous (1) Capture packets continuously: cyclic overwrite

destination-ip-address (10) Capture all packets matching ip dst address

dstindex (7) Capture all packets matching destination index

ethertype (8) Capture all packets matching ethertype

interface (4) Capture packets related to this interface

or-filter (3) Apply filters in an or function: only one must match

rx (2) Capture incoming packets only

source-ip-address (9) Capture all packets matching ip src address

srcindex (6) Capture all packets matching source index

tx (2) Capture outgoing packets only

vlan (5) Capture packets matching this vlan number

<cr>

OPTIONS:

• Using the continuous option, the switch captures packets on the RP inband continuously: it fills the entire capture buffer (4096 packets) and then starts to overwrite the buffer in a FIFO fashion.

• The tx and rx options capture packets coming from the MSFC and going to the MSFC respectively.

• The and-filter and the or-filter specify that an "and" or an "or" will be applied respectively to all of the options that follow. For example, if you use the syntax below, then both option #1 and option #2 must match for the packet to be captured. Similarly, if the or-filter is used, either option #1 or option #2 or both must match for the packet to be captured.

debug netdr capture and-filter option#1 option#2

• The interface option is used to capture packets to or from the specified interface. The interface can be either an SVI or a L3 interface on the switch.

• The vlan option is used to capture all packets in the specified VLAN. The VLAN specified can also be one of the internal VLANs associated with a L3 interface.

• The srcindex and dstindex options are used to capture all packets matching the source ltl and destination ltl indices respectively. Note that the interface option above only allows the capture of packets to or from a L3 interface (SVI or physical). Using the srcindex or dstindex options allows the capture of Tx or Rx packets on a given L2 interface. The srcindex and dstindex options work with either L2 or L3 interface indices.

• The ethertype option allows the capture of all packets matching the specified ethertype.

• The source-ip-address and destination-ip-address options allow the capture of all packets matching the specified source or destination IP address respectively.

Below is an example of capturing traffic destined to 10.100.101.10 sourced from 10.10.10.2 going to the RP CPU:

6500-2#debug netdr cap rx and-filter source-ip-address 10.10.10.2 destination-ip-address 10.100.101.10

6500-2#sh netdr cap

A total of 4096 packets have been captured

The capture buffer wrapped 0 times

Total capture capacity: 4096 packets

------- dump of incoming inband packet -------

interface Vl10, routine mistral_process_rx_packet_inlin, timestamp 00:00:11

dbus info: src_vlan 0xA(10), src_indx 0xC0(192), len 0x40(64)

bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)

10020400 000A0000 00C00000 40080000 00060468 0E000040 00000000 03800000

mistral hdr: req_token 0x0(0), src_index 0xC0(192), rx_offset 0x76(118)

requeue 0, obl_pkt 0, vlan 0xA(10)

destmac 00.15.C7.26.FB.80, srcmac 00.00.01.00.06.00, protocol 0800

protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 46, identifier 0

df 0, mf 0, fo 0, ttl 100, src 10.10.10.2, dst 10.100.101.10

tcp src 0, dst 0, seq 0, ack 0, win 0 off 5 checksum 0x265C

The fields of interest in the capture above are:

Ingress Vlan of traffic = src_vlan 0xA(10)

Layer 3 interface traffic is coming from = interface Vl10

Ethertype and SRC/DST MAC addresses = the "destmac ... srcmac ... protocol 0800" line

IP Header = the "protocol ip:" line and the line that follows it

SRC index (source of ingress traffic) = src_indx 0xC0(192)

Dest Index (where traffic is being sent) = dest_indx 0x380(896)

You can use this information to track down the source of the traffic being punted. Please refer to the "Troubleshooting with a NETDR capture on a sup720/6500" documentation for a further explanation of how to interpret this data.

Please open a TAC case if you need further assistance interpreting this data.
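A full netdr buffer holds 4096 packets, so it can help to summarise a saved capture before reading individual packets. The Python sketch below counts packets per ingress interface, source, destination and TTL. It is an illustration only: the function name is hypothetical, and it assumes each packet is printed in the "interface ..." / "ttl ..., src ..., dst ..." layout shown in the example output above.

```python
import re
from collections import Counter

IFACE_RE = re.compile(r"^interface (\S+),")
IP_RE = re.compile(r"ttl (\d+), src (\S+), dst (\S+)")

def top_punted_flows(netdr_output):
    """Count captured packets per (interface, src IP, dst IP, TTL), most frequent first."""
    flows = Counter()
    interface = None
    for line in netdr_output.splitlines():
        line = line.strip()
        match = IFACE_RE.match(line)
        if match:
            interface = match.group(1)  # remember the interface of the current packet
            continue
        match = IP_RE.search(line)
        if match:
            ttl, src, dst = match.groups()
            flows[(interface, src, dst, int(ttl))] += 1
    return flows.most_common()

packet = """interface Vl10, routine mistral_process_rx_packet_inlin, timestamp 00:00:11
destmac 00.15.C7.26.FB.80, srcmac 00.00.01.00.06.00, protocol 0800
df 0, mf 0, fo 0, ttl 100, src 10.10.10.2, dst 10.100.101.10
"""
sample = packet * 2  # two identical packets, as a small stand-in for a real capture

for flow, count in top_punted_flows(sample):
    print(count, flow)
```

The flows at the top of the list are the first candidates to trace back using the source MAC and source index fields.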

Using a CPU SPAN to determine traffic being punted to the CPU

This capture is performed on the ASIC that is connected to the RP/SP CPU. It allows you to replicate traffic that is being sent to the RP or SP CPU to a capture device. This can be handy for determining the cause of the high CPU OR determining if traffic is being sent to or from the CPU for processing (such as HSRP/OSPF/PIM control plane traffic).

When using the 12.2(18)SXF train and earlier, the configuration for an inband span session is as follows:

RP Console:

Router#monitor session <1-66> <source|destination|filter> <interface|remote|vlan> <FastEthernet|GigabitEthernet|Port-channel|GE-WAN> <tx|rx|both>

SP Console:

Router#remote login switch

Router-sp#test monitor session <1-66> <add|del|show> <rp-inband|sp-inband> <tx|rx|both>

-OR-

Router#remote login switch

Router-sp#test monitor <add|del|> <session: 1-66> <rp-inband|sp-inband> <tx|rx|both>

Router-sp#test monitor session <1-66> <show>

On the 12.2(33)SXH train and later, this is the configuration for an inband sp->rp span session:

Router(config)# monitor session 1 type local

Router(config-mon-local)# source cpu <rp|sp> <tx|rx|both>

Router(config-mon-local)# destination interface gigabitethernet 1/2

Router(config-mon-local)# no shutdown

For more information please reference the following link:

http://cco.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/span.html#wp1109488

Once this information is collected you can then use the source MAC/source IP information to determine the source of the traffic.

Troubleshooting CPU spikes.

At times it is not possible to determine the cause of a CPU spike, since a "show process cpu" cannot be run during the times of the issue. One way to get around this would be to set up an EEM script to run the command for you when the CPU goes above a certain value. The following EEM script will run a "show process cpu sorted" when the CPU utilization of the device goes above 50%:

event manager scheduler script thread class default number 1

event manager applet High_CPU

event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.3.1 get-type exact entry-op ge entry-val 50 poll-interval 0.5

action 0.0 syslog msg "High CPU DETECTED. Please wait - logging Information to <file system>:high_cpu.txt"

action 0.1 cli command "enable"

action 0.2 cli command "show clock | append <file system>:high_cpu.txt"

action 1.2 cli command "term length 0"

action 1.3 cli command "show process cpu sorted | append <file system>:high_cpu.txt"

Please fill in <file system> with the location of the file system, without the "<" or ">".

If you need further assistance in determining the cause of your high CPU please open a Cisco TAC case.