Microsoft Exchange Server Best Practices Analyzer Tool Paul Bowden Program Manager Exchange Server Development Microsoft Corporation


Microsoft Exchange Server Best Practices Analyzer Tool

Paul Bowden

Program Manager

Exchange Server Development

Microsoft Corporation

What is it?

• The Exchange Server Best Practices Analyzer 'encodes' the top product support issues into a tool which can be run against a live deployment.
  – Step-by-step documentation tells you how to resolve each problem
• The tool can be run as part of a proactive 'health check' which can expose availability or scalability problems. Additionally, the tool can be run as part of a reactive troubleshooting step for problem diagnosis and identification.
  – The tool will report issues currently causing problems within the topology, and discrepancies which may cause future outages.
• The tool can be used to actively document the design and configuration of the Exchange topology. This data can be used to track the history of a deployment, or provide a 'quick-start' to administrators and product support staff who need to analyze the history and configuration of an unfamiliar deployment.

Why we developed it

• Administrators are finding it difficult to keep up with the documentation that we produce
  – Urgency
  – Relevance

• Customers find it difficult to keep track of whether they are conforming to all the best practices

• Exchange has many options, and finding the root cause of a problem can be a long process
  – ~60% of Exchange problems are misconfigurations

• We have many tools for collecting information, but not many provide auto-analysis

Design Principles

• Concentrate on Performance, Scalability and Availability of Exchange Servers
  – ExBPA does not check security configuration
• Make it easy to run
  – No complex configuration settings
  – Auto-detect everything
  – Allow multiple credentials to be entered
  – No server-side components to install
  – No impact on Exchange performance, even at peak periods
• Don’t leave me hanging
  – Every Error | Warning | NonDefault rule has a specific article which tells you more about the problem and how we detected it
• Keep it up-to-date
  – Provide best practice updates every month
  – Make the tool auto-download the updates
• Work in all environments
  – From single server SBS implementations through to the largest enterprise
  – Make the tool work seamlessly in both open and closed networks

Similar Tools

• MBSA – Microsoft Baseline Security Analyzer

• SQLBPA – Microsoft SQL Server Best Practices Analyzer

• The ExBPA engine has now been mandated as part of the WSS 2006 Common Engineering Criteria
  – BPAs for other Microsoft products are forthcoming

Architecture

• One tool runs against all versions of Exchange
  – No support for pure Exchange 5.5 topologies

• You generally install the tool on a Windows XP workstation, and it remotely collects the data
  – Don’t need to install any components on the server

• ExBPA is written in managed code (C#)

• Input/output data model is XML based

• Analysis engine is based on XPath
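The XML data model and XPath-based analysis can be pictured with a small sketch. It is not ExBPA's actual rule format: the element names, attribute names, and the `RULE` dictionary below are invented for illustration, and Python's `xml.etree.ElementTree` (which supports only a subset of XPath) stands in for the tool's C# engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical collected data; the schema is an assumption, not ExBPA's real output.
COLLECTED = """
<Organization name="Example">
  <Server name="MBX01" role="mailbox">
    <Setting name="CircularLogging">true</Setting>
  </Server>
  <Server name="BH01" role="bridgehead">
    <Setting name="CircularLogging">true</Setting>
  </Server>
</Organization>
"""

# Hypothetical rule: which nodes to inspect, and which value counts as a problem.
RULE = {
    "scope": ".//Server[@role='mailbox']",
    "setting": "Setting[@name='CircularLogging']",
    "bad_value": "true",
    "severity": "Error",
    "text": "Circular logging is enabled on a mailbox server",
}


def evaluate(rule, root):
    """Return (severity, server name, text) for every server that matches the rule."""
    findings = []
    for server in root.findall(rule["scope"]):
        node = server.find(rule["setting"])
        if node is not None and (node.text or "").strip() == rule["bad_value"]:
            findings.append((rule["severity"], server.get("name"), rule["text"]))
    return findings


if __name__ == "__main__":
    for finding in evaluate(RULE, ET.fromstring(COLLECTED)):
        print(finding)
```

Keeping the rule as data rather than code is what would allow rule updates to be shipped without changing the engine.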

Where do we look?

• We look for data in…
  – Active Directory
  – DNS
  – WMI
  – Registry
  – Metabase
  – Performance Monitor
  – Files on disk
  – TCP/IP ports
• First pass of execution – collection
  – ExBPA collects the data and places it in the same namespace
• Second pass of execution – analysis
  – Individual settings are analysed against the defined rules. Cross-checking between data sources is possible as the data is in the same hierarchy
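The two passes can be sketched as follows. This is a minimal illustration under stated assumptions: the collector functions return canned values instead of querying real systems, and the `Source`/`Setting` element names and the `SmtpMaxMessageSize` setting are hypothetical. The point is only that once every source lands in one tree, a single rule can compare values across sources.

```python
import xml.etree.ElementTree as ET


def collect_active_directory():
    # Stand-in for a real Active Directory collector (canned value).
    return {"SmtpMaxMessageSize": "10240"}


def collect_metabase():
    # Stand-in for a real metabase collector (canned value).
    return {"SmtpMaxMessageSize": "2048"}


COLLECTORS = {
    "ActiveDirectory": collect_active_directory,
    "Metabase": collect_metabase,
}


def collect():
    """First pass: run every collector and store the results in one hierarchy."""
    root = ET.Element("Data")
    for source, collector in COLLECTORS.items():
        branch = ET.SubElement(root, "Source", name=source)
        for key, value in collector().items():
            ET.SubElement(branch, "Setting", name=key).text = value
    return root


def analyze(root):
    """Second pass: cross-check the same setting across two data sources."""
    issues = []
    ad = root.find("./Source[@name='ActiveDirectory']/Setting[@name='SmtpMaxMessageSize']")
    mb = root.find("./Source[@name='Metabase']/Setting[@name='SmtpMaxMessageSize']")
    if ad is not None and mb is not None and ad.text != mb.text:
        issues.append(f"SmtpMaxMessageSize differs: AD={ad.text}, metabase={mb.text}")
    return issues


if __name__ == "__main__":
    print(analyze(collect()))
```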

How it works

[Diagram. Components shown: Active Directory, Exchange Servers, ExBPA Dispatcher, XML Rules, collectors, Output Data, ExBPA Analyzer, Import, XML Export, ExBPA Interface]

Demonstration…

What does ExBPA check today?

The following is not an exhaustive list of the checks that the tool performs, but it should give you a general idea!

Exchange Roles

• ExBPA detects and understands the difference between…
  – Small mailbox servers
  – Large mailbox servers
  – Clustered Exchange servers
  – Front-end servers
  – Bridgehead servers

• Rules are conditioned for their roles (e.g. Circular logging needs to be disabled on mailbox servers, but should be enabled on bridgehead servers)
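A role-conditioned rule might look like the sketch below. The expected values come from the circular logging example on this slide; the function name, the role strings, and the message wording are assumptions made for illustration.

```python
# Expected circular logging state per role, taken from the slide's example.
EXPECTED_CIRCULAR_LOGGING = {
    "mailbox": False,     # must be disabled on mailbox servers
    "bridgehead": True,   # should be enabled on bridgehead servers
}


def check_circular_logging(server_name, role, enabled):
    """Return a finding string, or None when the setting suits the role."""
    expected = EXPECTED_CIRCULAR_LOGGING.get(role)
    if expected is None or enabled == expected:
        return None  # no rule for this role, or the setting is as expected
    state = "enabled" if expected else "disabled"
    return f"{server_name}: circular logging should be {state} on a {role} server"


if __name__ == "__main__":
    print(check_circular_logging("MBX01", "mailbox", True))
    print(check_circular_logging("BH01", "bridgehead", True))
```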

Rule Types

• Error
  – We found something that is causing, or will cause a problem
  – Example: No maximum message size set for the organization
• Warning
  – We found something that looks suspicious
  – Example: An ADC connection agreement is scheduled to ‘Never’
• NonDefault
  – We found a setting which has been changed
  – Example: One of the many store parameters has been tuned/tweaked
• Time
  – We found something that was changed during the past 5 days
  – Example: The cost on an SMTP connector was changed
• BestPractice
  – We found that a best practice is not being followed
  – Example: Dr. Watson crashes are not being uploaded to Microsoft for analysis
• Info
  – We found something of interest
  – Example: Your server has 8 processors installed
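The six rule types and the 5-day window of the Time rule can be modelled in a few lines. This is only a sketch: the enum, the `time_rule` helper, and its message text are hypothetical, and whether the tool treats the window boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta
from enum import Enum


class RuleType(Enum):
    ERROR = "Error"
    WARNING = "Warning"
    NON_DEFAULT = "NonDefault"
    TIME = "Time"
    BEST_PRACTICE = "BestPractice"
    INFO = "Info"


# The slide states "during the past 5 days".
RECENT_WINDOW = timedelta(days=5)


def time_rule(setting, changed_at, now):
    """Return a Time finding if the setting changed within the window, else None."""
    if timedelta(0) <= now - changed_at <= RECENT_WINDOW:
        return (RuleType.TIME, f"{setting} was changed during the past 5 days")
    return None


if __name__ == "__main__":
    now = datetime(2005, 3, 10)
    print(time_rule("SMTP connector cost", now - timedelta(days=3), now))
    print(time_rule("SMTP connector cost", now - timedelta(days=6), now))
```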

Active Directory

• Forest-wide
  – Forest functionality level
  – Exchange schema extensions
  – Default policy changes

• Per-domain
  – Domain functionality level
  – Domains which have been renamed
  – Check availability of FSMO servers
  – EDS/EES group renamed/deleted/moved
  – MESO container renamed/deleted/moved

Active Directory Connector

• ADC Server
  – Server is overloaded
  – Server is idle (i.e. no connection agreements)
  – There’s a newer version of the ADC available
  – Server is running the latest OS Service Pack

• Connection Agreements
  – Orphaned agreements
  – Schedule set to never
  – Nominated server is missing
  – One way agreements
  – Out-of-date agreements

Exchange Organization

• Check
  – Global message size limits are enforced
  – Stray Exchange objects in LostAndFound container
  – More than 10 administrators defined
  – ForestPrep version
  – Mixed/native mode
  – OMA/EAS options
  – UCE thresholds
  – Recipient Update Service definitions
  – Address List and OAB definitions

Admin Groups

• Check
  – Validity of legacyExchangeDN
  – Policy containers intact

• Routing Groups
  – Check for valid routing master
  – Enumerate all connectors
  – Check for connectors that have recently changed

Exchange Server object

• Check
  – Validity of server name
  – FQDN/NetBIOS name resolution
  – Latest Exchange Service Pack / Roll-up
  – Time synchronization with the Active Directory

Cluster Configuration

• Checks both Active and Passive nodes
• Cluster-specific checks
  – Number of nodes in the cluster
  – Configuration discrepancies between nodes
  – Cluster account TEMP/TMP path
  – Quorum configuration
  – Heartbeat configuration
  – DNS/WINS configuration
  – Enumerates all resources and parameters
  – Kerberos configuration

Directory Access

• Check
  – DSAccess cache configuration and non-default parameters. E.g.
    • MaxMemoryUser | MaxMemoryConfig
    • LdapKeepAliveSecs, DisableNetLogonCheck
    • MinUserDC
  – DSAccess cache efficiency

• DSAccess topology
  – Round-trip times between Exchange and each DC/GC in the topology
  – Hardware/OS configuration of each DC/GC
  – Calculates the GC to Exchange processor ratio
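The processor ratio calculation amounts to simple arithmetic, sketched below. The function name and inputs are hypothetical, and the slide does not state what ratio the tool considers acceptable, so the sketch only computes the reduced ratio and applies no threshold.

```python
from math import gcd


def processor_ratio(gc_processors, exchange_processors):
    """Return the GC-to-Exchange processor ratio as a reduced 'a:b' string.

    Each argument is a list of processor counts, one entry per server.
    """
    gc_total = sum(gc_processors)
    exchange_total = sum(exchange_processors)
    if gc_total <= 0 or exchange_total <= 0:
        raise ValueError("need at least one processor on each side")
    divisor = gcd(gc_total, exchange_total)
    return f"{gc_total // divisor}:{exchange_total // divisor}"


if __name__ == "__main__":
    # Two dual-processor GCs serving two 8-processor Exchange servers.
    print(processor_ratio([2, 2], [8, 8]))
```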

Information Store

• Check
  – ESE cache configuration
  – Current state of virtual memory
  – Online maintenance window
  – Checkpoint depth
  – Circular logging state
  – Log buffer configuration
  – Log generation level
  – File system characteristics (NTFS/Compression/Encryption)
  – Validity of legacyExchangeDN
  – Database and logs on the same LUN
  – Content Indexing state
  – Non-default parameters in Private|Public-GUID registry
  – Database size
  – E-mail address on Public Folder stores
  – RPC Compression / Buffer Packing settings
  – Hard-coded TCP/IP ports, and clashes with other Exchange ports
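The last check in this list, detecting clashes between hard-coded ports, reduces to grouping services by port. The sketch below assumes the statically configured ports have already been collected into a dictionary; the service labels and port numbers are made-up examples, not values read from a real server.

```python
from collections import defaultdict


def find_port_clashes(static_ports):
    """Map each port used by more than one service to the sorted service names."""
    by_port = defaultdict(list)
    for service, port in static_ports.items():
        by_port[port].append(service)
    return {
        port: sorted(services)
        for port, services in by_port.items()
        if len(services) > 1
    }


if __name__ == "__main__":
    # Hypothetical static port assignments.
    configured = {
        "Information Store": 5000,
        "System Attendant RFR": 5000,
        "System Attendant NSPI": 5001,
    }
    print(find_port_clashes(configured))
```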

Store Process Parameters

• Examples
  – Disable MAPI Clients
  – Enable Tracing
  – Initial Memory Percentage
  – Initial Reserve Size KB
  – Ignore Zombie Users
  – Logon Only As
  – Mailbox Cache Age Limit
  – Mailbox Cache Idle Limit
  – Mailbox Cache Size
  – MaxOpenMessagesPerLogon
  – Reserve Increment KB
  – SuppressOOFsToDistributionLists
  – Trace User LegacyDN
  – VM Warning|Error Level

• More
  – objtAttachment
  – objtFolder
  – objtFolderView
  – objtMessage
  – ProrateFactor
  – ProrateStart
  – ProrateMax
  – IMAIL settings
  – ExIFS drive

Check for non-default settings and bad values

Transport

• Check
  – Main configuration parameters in the AD
  – Cross-check AD and metabase for consistency
  – Non-default settings
  – File system characteristics for ‘mailroot’ folders (NTFS/Compression/Encryption)
  – SMTP stack verb validation (e.g. X-LINK2STATE)
  – SMTP mail submission test
  – Enumeration of transport event sinks
  – Enumeration of MTA settings, calling out any non-defaults
  – Detection of Archive Sink and configuration
  – Non-default routing parameters (e.g. SuppressStateChanges)

System Attendant

• Check
  – Service state
  – File system characteristics for message tracking folder (NTFS/Compression/Encryption)
  – RFR service
  – RFR / NSPI Target Server configuration
  – Hard-coded TCP/IP ports

Anti-virus Support

• CA eTrust 6/7 file-level AV configuration and exclusions
• Trend Micro ScanMail
  – Patch level
  – Performance tuning configuration (threads/thresholds/debug settings)
• Product detection and configuration settings for
  – McAfee GroupShield
  – Symantec Mail Security for Exchange
  – Sybari Antigen
• VS API configuration settings
  – Warn if number of threads is not appropriate for underlying hardware

Other Installed Applications

• Check
  – RPC Client|Server binding order configuration
  – Presence of LeakDiag
  – Old versions of Simpler-Webb ERM
  – ISA 2000 Service Pack level
  – Presence of MOM Agent

Hardware Configuration

• Check
  – System BIOS is not over a year old
  – Specific support for HP, Dell and IBM servers
  – Processor configuration
  – Physical memory installed

Disk Storage System

• Check
  – Performance counters are enabled
  – Enumeration of physical and logical disks
  – Enumeration and identification of mount points
  – Enumeration of disk controllers and driver levels
  – Configuration of Host Bus Adaptors
  – Version of multi-pathing software (e.g. SecurePath, PowerPath)

File Versions

• Verify 29 key Exchange binaries
  – Physical presence
  – Make sure that they’re not too old
  – Identify binaries which are hotfixes

• Check
  – Server MAPI subsystem
  – Presence of old Roll-ups
  – Presence of ESE API virus scanners

Hotfixes

• Detect all hotfixes and Service Packs installed for
  – Windows 2000
  – Windows 2003
  – Exchange 5.5
  – Exchange 2000
  – Exchange 2003

• Call out any updates that were installed during the past 5 days, and the logon name of the user that performed the installation

Network Subsystem

• Enumerate all network cards

• Check
  – NIC connection status
  – DNS/WINS configuration
  – IP Gateway settings
  – Primary DNS is alive
  – Domain suffix

Operating System

• Check
  – Page Table Entry (PTE) levels
  – Paged|NonPaged pool configuration
  – CrashOnAuditFail configuration
  – HeapDeCommitFreeBlockThreshold
  – TEMP/TMP paths
  – SystemPages configuration
  – /3GB /USERVA configuration
  – Physical Address Extensions (PAE) detection
  – OS Version and SKU (e.g. Standard, Enterprise, etc)
  – Dr. Watson configuration
  – Debug settings (including GlobalFlag, PageHeapFlags)
  – Virtual PC / Virtual Server / VMWare detection

Success Stories

• Identified that circular logging was enabled on a 12,000 user Exchange cluster
  – Was a potential time-bomb

• Identified incorrect memory configuration that required the Exchange server to be restarted every two weeks

• Identified a case where database files were being stored on a compressed volume
  – Root cause of the performance problems

ExBPA Timeline

• V1.0 – September 21st
  – 1200 point collection / 800 rules
• V1.1 – December 6th
  – Usability improvements
  – 1300 point collection / 900 rules
• V2.0 – Early March
  – Localized in all Exchange Server languages
  – Performance sampling and root cause analysis infrastructure
  – Admin API support (e.g. find out time of last backup)
  – Optional integration with MOM 2005
  – Export to XML / HTM / CSV
  – New baseline logic
• V3.0 – Later on in the year
  – More rules and refinements
  – MAPI.NET collector

Appendix: Screen Shots