8/9/2019 BIOS POST trouble shooting guide.pdf
Troubleshooting Guide for BIOS POST on Dell 13th Generation of PowerEdge Servers
Table of contents
Revisions
Executive summary
1. BIOS Splash Screen Display
2. POST Error and Warning Messages
3. Post Code in iDRAC Web GUI
4. Driver Health Status Report
5. Dell Diagnostics (ePSA)
6. Red Screen of Death (RSOD)
7. Yellow Screen of Death (YSOD)
Executive summary
The Unified Extensible Firmware Interface (UEFI) is a set of industry-standard firmware interfaces that is
designed to replace the legacy BIOS to support modern operating systems and hardware architectures.
Dell has been shipping UEFI support in the BIOS since the 11th generation of PowerEdge servers through a
UEFI-over-Legacy model, in which the legacy BIOS initializes the whole system and loads the UEFI
layer at the end of Power-On Self-Test (POST) if needed. The Dell Lifecycle Controller technology is built
upon UEFI as well.
The BIOS on the 13th generation of Dell PowerEdge servers is now a native UEFI implementation, with a
Compatibility Support Module (CSM) to provide legacy BIOS interfaces to support operating systems that
are not UEFI-aware. The look and feel of the boot process is dramatically different from the previous
generations.
This guide provides troubleshooting solutions for issues that may arise during POST and in the pre-boot
environment on the 13th generation of PowerEdge servers.
1. BIOS Splash Screen Display
After the system is powered on, the Dell server BIOS may get to video display almost instantly. Fig. 1 is a
sample snapshot of the POST splash screen. The text next to the progress bar on the bottom of the screen
indicates various phases of POST. The text can aid in troubleshooting issues that happen during the
system boot process.
The following table lists the currently supported progress texts in the BIOS:
Text Display
    Phase of the Boot Process

Initializing Intel QuickPath Interconnect...
    BIOS performs an early initialization of the chipset, processors, and QPI interfaces.
Configuring Memory
    BIOS initializes the system memory.
Loading BIOS Drivers
    BIOS starts the Driver Execution Environment (DXE) phase, and loads and executes DXE drivers to
    perform additional chipset, processor, and hardware initializations.
Initializing iDRAC
    BIOS waits for iDRAC to become ready. This phase may take more than a few seconds on the first
    AC power-on of the system.
Initializing iDRAC Done
    iDRAC initialization has completed.
Initializing PCIe, USB and Video
    Start of PCI enumeration and detection of USB keyboard devices.
Initializing PCIe, USB and Video Done
    PCI and USB enumeration has completed.
Legacy PCI option ROM initialization (BIOS boot mode only)
    Applies to the BIOS boot mode only. The onscreen display varies, depending on the type of PCIe
    cards that are installed in the system.
Testing Memory (X% Complete)
    Software-based memory test phase. The percentage of progress is displayed.
    Note: The memory test is disabled in the BIOS setup by default.
Testing Memory Done [No Errors]
    Memory test completed without any issue.
Testing Memory Done [Errors Encountered]
    Memory test has found error(s).
Testing Memory Aborted
    Memory test was aborted by a key press (for example, the spacebar).
Loading Lifecycle Controller Drivers
    BIOS loads the Lifecycle Controller drivers.
Loading Lifecycle Controller Drivers Done
    BIOS has finished loading the Lifecycle Controller drivers.
Initializing Firmware Interfaces
    BIOS connects the UEFI drivers to the device handles. The UEFI drivers from add-in PCIe cards
    are expected to be installed in this phase.
Running In-System Characterization...
    In-System Characterization (ISC) is in progress.
Connecting iSCSI device(s)
    The UEFI iSCSI device drivers are connected. This display applies to UEFI boot mode only. It is
    displayed when one or more iSCSI boot devices have been configured.
Enumerating Boot options
    BIOS starts to enumerate Boot Options in the system.
Enumerating Boot options Done
    The enumeration of Boot Options has completed.
Entering Lifecycle Controller
    The system is booting into the Lifecycle Controller.
Lifecycle Controller: Applying Updates or Setting System Configuration
    An Automated Task Application is being scheduled in the Lifecycle Controller.
Lifecycle Controller: Collecting System Inventory
    Lifecycle Controller is collecting system inventory for this boot.
Lifecycle Controller: Done
    Lifecycle Controller has finished execution.
Booting
    BIOS has finished POST and is giving control to the operating system.
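
When a hang is reported only as the last progress text seen on the console, a small lookup can translate that text back into the boot phase. The following is a minimal sketch in Python, assuming the progress texts listed above; the names POST_PHASES and describe_phase are illustrative and are not part of any Dell tool, and the descriptions are shortened paraphrases of the table.

```python
# Progress texts taken from the table above (a subset). The dictionary and
# function names are hypothetical helpers, not a Dell-provided interface.
POST_PHASES = {
    "Initializing Intel QuickPath Interconnect": "Early initialization of chipset, processors, and QPI",
    "Configuring Memory": "System memory initialization",
    "Loading BIOS Drivers": "DXE phase: DXE drivers are loaded and executed",
    "Initializing iDRAC": "BIOS waits for iDRAC to become ready",
    "Initializing PCIe, USB and Video": "PCI enumeration and USB keyboard detection",
    "Testing Memory": "Software-based memory test",
    "Loading Lifecycle Controller Drivers": "Lifecycle Controller drivers are loaded",
    "Initializing Firmware Interfaces": "UEFI drivers are connected to device handles",
    "Running In-System Characterization": "In-System Characterization (ISC)",
    "Connecting iSCSI device(s)": "UEFI iSCSI device drivers are connected",
    "Enumerating Boot options": "Boot Options are enumerated",
    "Entering Lifecycle Controller": "System is booting into the Lifecycle Controller",
    "Booting": "POST finished; control passes to the operating system",
}


def describe_phase(progress_text):
    """Return the description for the longest matching text prefix, or None."""
    text = progress_text.strip()
    matches = [key for key in POST_PHASES if text.startswith(key)]
    if not matches:
        return None
    # Prefer the longest key so that suffixes such as "Done" or
    # "(X% Complete)" still resolve to their base phase.
    return POST_PHASES[max(matches, key=len)]
```

For example, describe_phase("Testing Memory (42% Complete)") resolves to the memory test entry, and an unrecognized text returns None so that the caller can fall back to the full table.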
Fig. 2 An error message box in early POST
If the issue is detected later in POST, the corresponding error and warning messages are displayed
on the screen with a UEFIxxxx prefix. An event entry is logged in the Lifecycle Controller log (LC log) as
well. Depending on the severity of the error or warning, the system may continue to boot,
prompt with F1/F2/F10/F11 for user input, reset, or halt. The message comprises two parts: the
error or warning message itself, and a recommended response action. You can follow the
recommended response action to address the issue. For a complete list of POST error and warning
messages, see the Event and Error Message Reference Guide for 13th Generation Dell PowerEdge Servers.
In the following example, the UEFI driver for the Integrated Network card is not signed. The user has just
turned on Secure Boot in the BIOS setup utility. On the next boot, a few error messages are displayed on the
screen during POST.
- The first error message (UEFI0072) indicates that the UEFI driver from the Integrated NIC 1 Port 1
Partition 1 was not loaded because it failed the Secure Boot authentication. You may address this issue
by updating the NIC firmware to a version that supports UEFI driver signing.
- The second error message (UEFI0071) indicates that the previously configured UEFI network boot
interface is no longer available. This is a result of the corresponding UEFI driver not being loaded.
- The third warning message (UEFI0074) indicates that the Secure Boot policy has been modified since
the last time the system was booted. In this particular example, the user enabled Secure Boot on
purpose, so no action needs to be taken.
http://en.community.dell.com/techcenter/systems-management/w/wiki/lifecycle-controller#attributereg
Fig. 3 An example of POST error messages
Corresponding logs for the error and warning messages will be recorded in the Lifecycle Log (Fig. 4).
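
Because every later-POST message carries a UEFIxxxx identifier, the IDs can be pulled out of any saved text (a console capture or an exported log) and looked up in the reference guide. The sketch below is one way to do that in Python; the function name is hypothetical, and the four-digit width of the ID is an assumption based on the IDs shown in this example (UEFI0071, UEFI0072, UEFI0074).

```python
import re

# Matches IDs such as UEFI0072. The four-digit width is an assumption based
# on the message IDs that appear in this guide.
UEFI_ID = re.compile(r"\bUEFI\d{4}\b")


def find_uefi_message_ids(text):
    """Return the distinct UEFI message IDs in text, in order of first appearance."""
    seen = []
    for match in UEFI_ID.finditer(text):
        message_id = match.group(0)
        if message_id not in seen:
            seen.append(message_id)
    return seen


# The sample text below is made up for illustration; it is not real log output.
sample = "UEFI0072 driver not loaded. UEFI0071 boot interface missing. UEFI0074 policy changed. UEFI0072 again."
print(find_uefi_message_ids(sample))
```

Each ID returned can then be matched against the Event and Error Message Reference Guide to find the recommended response action.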
Fig. 4 Screen shot of the Lifecycle Log
3. Post Code in iDRAC Web GUI
If you cannot get to the screen display, the Post Code feature available in the iDRAC web GUI may
come in handy. This page displays the last system POST code with descriptive text. The POST code helps to
detect pre-video hangs, report fatal errors, and analyze system failures during POST.
Fig. 5 An example of the Post Code in the iDRAC Web GUI
4. Driver Health Status Report
The UEFI specification defines a Driver Health Protocol (DHP). The DHP provides services that allow a UEFI
driver to express the health status of a controller, return status messages associated with the health status,
perform repair operations if necessary, and request configuration changes to place the controller back in a
usable state.
The Dell server BIOS checks the driver health status of each UEFI driver in the system and displays the status
messages. The BIOS may invoke the repair and configuration utility if a repair or reconfiguration operation
is required. In most cases, you can follow the instructions on the screen to proceed.
Fig. 6 is an example display where the BIOS halts on errors returned from DHP. In this particular
example, the iDRAC DHP detected that the backplane 2 power cable has been disconnected, and the LSI SAS
controller requires configuration changes, possibly due to a catastrophic issue.
Fig. 6 Example of errors detected by UEFI Driver Health Protocol
Fig. 7 is a snapshot of the Driver Health Manager in a case where a driver requires a
configuration change. The Driver Health Manager lists all the device instances that require reconfiguration.
You can select each one of them and follow the instructions on the screen to configure the devices.
Fig. 7 Driver Health Manager
5. Dell Diagnostics (ePSA)
Dell Enhanced Pre-Boot System Diagnostics (ePSA) are diagnostic tests that are embedded in the system
(Fig. 8). These tests allow you to check the hardware health status outside the operating system
environment. The findings of these diagnostics can assist you in troubleshooting the fault and working
toward a resolution to the issue.
The ePSA can be launched from Boot Manager -> System Utilities -> Launch Diagnostics (Fig. 9).
Fig. 8 Sample screen shot of ePSA
Fig. 9 Launching diagnostics from Boot Manager
6. Red Screen of Death (RSOD)
The Dell server BIOS implements an enhanced CPU exception handler (RSOD), which helps the user and
tech support analyze the software exception when the system crashes in the pre-boot UEFI
environment. The debug information is displayed on the screen, and additional information and stack
traces can be retrieved through the serial port (if available). You can save the dump and use it for
debugging offline.
A sample RSOD display is depicted in Fig. 10.
Fig. 10 An example of the RSOD screen shot
When an exception is raised by the processor, the BIOS displays the RSOD screen with the following
information related to the exception:
- The exception type, such as Page Fault, General Protection Fault, Divide by Zero,
Breakpoint, and so on.
- A Dell-defined error value, prefixed with UEFIxxxx. Note that a corresponding error is
logged to the LC log as well.
- Partial register set (x86 64-bit).
- Last-Branch records and associated module names, if available.
- Current RIP and faulting driver module name.
- Stack trace back from the faulted module.
Additional information is available from the serial port dump. To retrieve the serial dump, you can connect
the server to a client system with a null modem cable and use any terminal program (for example, PuTTY or
HyperTerminal) with the baud rate set to 115200 bps, then press . The serial dump can be
retrieved through the Serial over LAN (SOL) method as well.
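
If a terminal program is not convenient, the dump can also be saved with a short script so that it can be attached to a support case. The following Python sketch copies bytes from an already opened stream into a file until the stream goes quiet. The function name, its parameters, and the idea of stopping after several empty reads are assumptions made for illustration. The stream can be any object with a read(n) method that returns bytes; one possibility, assuming the third-party pyserial package is installed, is serial.Serial(port, 115200, timeout=1), where the port name depends on the client system.

```python
def capture_dump(stream, out_path, idle_reads=3, chunk_size=4096):
    """Copy bytes from stream to out_path until several reads in a row are empty.

    stream: any object whose read(n) returns bytes (hypothetical source, for
            example a serial port opened at 115200 bps with a read timeout).
    Returns the number of bytes written.
    """
    total = 0
    empty = 0
    with open(out_path, "wb") as out:
        while empty < idle_reads:
            chunk = stream.read(chunk_size)
            if chunk:
                out.write(chunk)
                total += len(chunk)
                empty = 0  # data arrived, so restart the idle counter
            else:
                empty += 1  # timeout or end of stream
    return total
```

Writing the raw bytes unchanged keeps the dump suitable for offline debugging; any decoding or filtering can be done later on the saved file.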
Note: The RSOD serial dump can be obtained at the point of failure. The serial session does not have to
be started prior to the RSOD.
RSODs are usually caused by software issues and may be resolved by updating the BIOS, Lifecycle
Controller, or the UEFI firmware for PCIe cards. If you still encounter an RSOD after all the firmware
updates, you may send the screen shot and serial dump to Dell support for further analysis.
7. Yellow Screen of Death (YSOD)
When a hardware error occurs in the UEFI pre-boot environment (excluding the CSM phase in BIOS boot
mode), the Dell server BIOS may display a Yellow Screen of Death (YSOD) with some of the software
context at the time the issue is detected.
The hardware errors include Non-Maskable Interrupts (NMI) and Machine Check Errors (MCE). You should
check the System Event Log (SEL) to identify the source and type of the error. Update the corresponding
device firmware if the error originated from a PCIe device.
Note: The stack trace displayed on the YSOD screen only provides some context information before the
failure, and not the source of the problem.
A sample YSOD is depicted in Fig. 11.
Fig. 11 An example of the YSOD screen shot