Terminal Server With x64 Processor Architectures

Costin Hagiu
Test Lead, Terminal Server
costinh@microsoft.com
Microsoft Corporation

Session Outline

Terminal Server and x86 architecture

Impact of x64 architecture

Evaluating scalability for Terminal Server (TS)

Infrastructure and tools

Evaluation criteria

x64 test results

Critical considerations for x64 systems

Session Goals

Attendees should leave this session with the following:

A better understanding of the benefits that Windows Server 2003 x64 Editions and their architecture bring to Terminal Server scenarios

Knowledge of where to find resources for optimizing and evaluating server platforms for Terminal Server deployments

Terminal Server Architecture

Creates a virtual remote desktop

[Architecture diagram. Components: TermSrv, TermDD, RDPWD.sys, RDPDD.dll, Win32K.sys, TdTcp.sys, Csrss, Application, Winlogon. Regions: User, Kernel, Session, System. Data paths: Graphics, Keyboard/Mouse.]

x86 Windows Memory Layout And Usage

4 GB (2^32 bytes) address space per process

2 GB user mode virtual address (VA) space

Each process has its own

2 GB kernel mode (KM) virtual address space

Shared across processes

Kernel VA includes

System Page Table Entry (PTE) area – KM thread stacks: ~900 MB

Paged Pool – page tables, kernel objects: ~250 MB

System Cache – file cache, registry: ~512 MB

Others (Non Paged Pool, images)

[Diagram: address space of processes 1 to N. Each process has its own User VA (2 GB); all processes share one Kernel VA (2 GB) containing System PTEs (~900 MB), System Cache (~500 MB), Paged Pool (~270 MB), and Non Paged Pool, images, etc.]
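The kernel VA budget can be checked with a little arithmetic. The sketch below uses the approximate region sizes from the bullet list (the diagram labels differ slightly) and computes what is left of the 2 GB kernel VA for everything else; the function and variable names are illustrative only.

```python
# Kernel VA budget on x86 Windows, using the approximate sizes from the slide.
MB = 1024 * 1024
GB = 1024 * MB

TOTAL_VA = 2 ** 32             # 4 GB address space per process
USER_VA = 2 * GB               # private to each process
KERNEL_VA = TOTAL_VA - USER_VA # shared across processes

# Approximate region sizes (MB) taken from the bullet list.
kernel_regions_mb = {
    "System PTEs": 900,
    "Paged Pool": 250,
    "System Cache": 512,
}

def remaining_kernel_va_mb(regions_mb, kernel_va_bytes=KERNEL_VA):
    """Return the kernel VA (in MB) left for regions that are not listed."""
    return kernel_va_bytes // MB - sum(regions_mb.values())

remaining = remaining_kernel_va_mb(kernel_regions_mb)
print(f"Kernel VA: {KERNEL_VA // GB} GB, left for other uses: ~{remaining} MB")
```

With these figures, a few hundred megabytes remain for Non Paged Pool, images, and other kernel data.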

x86 Windows Constraints

Session count limitations

Knowledge Worker Scenario – maximum 300 sessions

Paged Pool/System PTEs exhaustion

Highest consumption – thread kernel stacks

Large amounts of physical memory use significant Kernel VA for system data structures

Typically, little justification for more than 8 GB

Strange performance implications

High Paged Pool usage triggers a reclaim process for System Cache data structures, which in turn degrades cache performance

Special hardware configurations further reduce the amount of KM VA available

Roughly, 2 bytes of VA are lost for each byte of hot-swap memory

x64 Windows Advantages

Very large Kernel VA space

8 TB Kernel virtual address space (128 GB paged pool, 128 GB system PTEs)

High theoretical number of sessions supported

Based on current constants built into the system – on the order of 10,000 sessions

This can easily be pushed further

Test results show that a 4P AMD Opteron dual-core system can go as high as 600 Knowledge Worker (KW) users

Little penalty for large RAM

Data structures used to manage RAM now have little impact
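To put the x64 figures in perspective, the sketch below divides them by the approximate x86 sizes quoted earlier in the deck (2 GB kernel VA, ~250 MB paged pool, ~900 MB system PTEs). The dictionary keys and the helper name are illustrative.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# Approximate x86 sizes from the memory layout slide.
x86 = {"kernel_va": 2 * GB, "paged_pool": 250 * MB, "system_ptes": 900 * MB}
# x64 sizes from this slide.
x64 = {"kernel_va": 8 * TB, "paged_pool": 128 * GB, "system_ptes": 128 * GB}

def growth_factors(old, new):
    """Return how many times larger each region is in `new` than in `old`."""
    return {name: new[name] / old[name] for name in old}

factors = growth_factors(x86, x64)
for name, factor in factors.items():
    print(f"{name}: about {factor:,.0f}x larger on x64")
```

The regions that run out first on x86 (paged pool and system PTEs) grow by two to three orders of magnitude.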

Evaluating Terminal Server Scalability

Test bed

Hardware infrastructure for simulating a TS deployment

User simulation tools

Tools that simulate user interaction with Terminal Server based on specific scenarios

Ideally they do not interfere with server activity

Set of criteria for determining acceptable load

Perceived system responsiveness is the key indicator

Other typical metrics: CPU usage, paging rate, I/O activity, network usage

Test Setup Configuration

Clients run the controller agent (roboclient) and execute scripts that drive session interaction

The Domain Controller runs key infrastructure services (Active Directory, DNS, Dynamic Host Configuration Protocol (DHCP)) and hosts the test controller (roboserver)

The Application Server hosts services required by the test scenario (Exchange, Internet Information Services (IIS))

Everything runs on an isolated local network to avoid interference from other network traffic

Scenario Emulation Tools

Drive TS client to simulate user activity

Based on Visual Basic Script with some TS specific extensions

SendKey

WaitForText

Sample:

    …
    '// #measure "save as" dialog pop-up time
    CanaryLog_Record "Excel- Save As dialog","KeyPress","a","Saveastype", 3000

    '// #measure "file overwrite" dialog pop-up time
    TClient.SendText(FILE("MSExcelSpr"))
    CanaryLog_Record "Excel- overwrite pop-up","VKeyPress",VK_RETURN,"Doyouwanttoreplace", 3000

    TClient.SendText("y")
    WScript.Sleep(6000)
    …

Scenario Emulation Tools

…where CanaryLog_Record looks like:

    sub CanaryLog_Record(Label, KeyCode, KeyChar, WaitText, SleepTime)

        CanaryLog_NewLabel(Label)
        if KeyCode = "KeyAlt" Then
            WaitTime = TClient.PressKeyAndWait(asc(KeyChar), AltFlag, WaitText)
        elseif KeyCode = "KeyCtrl" Then
            WaitTime = TClient.PressKeyAndWait(asc(KeyChar), CtrlFlag, WaitText)
        …
        end if

        WScript.Sleep SleepTime
        CanaryLog_FinishLabelWithTime(WaitTime)

    end sub
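The measure-then-log pattern in CanaryLog_Record can be sketched outside the test harness as well. The Python below is only an analogy: FakeClient, press_key_and_wait, and record_action are hypothetical names, and the simulated 10 ms delay stands in for the time a real server would take to display the expected text.

```python
import time

class FakeClient:
    """Hypothetical stand-in for the TS client object; not a real API."""

    def press_key_and_wait(self, key, wait_text):
        """Pretend to send `key`, wait for `wait_text`, return elapsed ms."""
        start = time.perf_counter()
        time.sleep(0.01)  # simulated server response delay
        return (time.perf_counter() - start) * 1000.0

def record_action(client, log, label, key, wait_text, think_time_s=0.0):
    """Time one user action, then pause the way a user would."""
    elapsed_ms = client.press_key_and_wait(key, wait_text)
    time.sleep(think_time_s)  # think time is excluded from the measurement
    log.append((label, elapsed_ms))
    return elapsed_ms

log = []
record_action(FakeClient(), log, "Excel- Save As dialog", "a", "Saveastype")
print(log)
```

As in the original script, the pause after the action is kept out of the recorded time, so the log reflects server responsiveness rather than scripted think time.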

Response Time Analysis

Sample output from a test client

Excel- Chart dialog - Step 4,smc003,12:01:24,22,22

Excel- Save As dialog,smc003,12:01:49,149,149

Excel- overwrite pop-up,smc003,12:02:04,20,20

Excel- Close Menu,smc003,12:02:10,12,12

Excel- File Open menu,smc003,12:02:39,11,11

Excel- File Open dialog,smc003,12:02:44,142,142

Excel- Print dialog,smc003,12:03:06,28,28

Excel- File close menu,smc003,12:03:16,11,11

Excel- Save changes dialog,smc003,12:03:20,47,47

Outlook- Open New Menu,smc003,12:03:34,10,10

Outlook- Open Mail Message Menu,smc003,12:03:35,8,8

Outlook- Open Mail editor,smc003,12:03:36,141,141

IE- File open menu,smc003,12:07:27,26,26

IE- File open dialog,smc003,12:07:29,34,34
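Log lines in this shape are easy to summarize with a short script. The sketch below assumes the fields are action label, client name, time of day, and response time, with the fourth field read as the response time; the deck does not state the units or the meaning of the repeated fifth field, so both are left uninterpreted. The sample lines are copied from the output above.

```python
import csv
from collections import defaultdict
from statistics import mean

SAMPLE = """\
Excel- Save As dialog,smc003,12:01:49,149,149
Excel- overwrite pop-up,smc003,12:02:04,20,20
Excel- File Open dialog,smc003,12:02:44,142,142
Outlook- Open Mail editor,smc003,12:03:36,141,141
IE- File open dialog,smc003,12:07:29,34,34
"""

def parse_log(text):
    """Yield (application, action, client, time, response) for each line."""
    for row in csv.reader(text.splitlines()):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        label, client, timestamp, response = row[0], row[1], row[2], row[3]
        app, _, action = label.partition("- ")
        yield app, action, client, timestamp, int(response)

def mean_response_by_app(text):
    """Return the mean response value per application."""
    by_app = defaultdict(list)
    for app, _action, _client, _timestamp, response in parse_log(text):
        by_app[app].append(response)
    return {app: mean(values) for app, values in by_app.items()}

summary = mean_response_by_app(SAMPLE)
print(summary)
```

Grouping by application is one choice; grouping by full action label would separate menu pop-ups from the slower dialog openings.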

Response Time Analysis

[Chart: response time analysis, with the critical region marked]

Data from an IWill H8501 (8-way) server with 40 GB memory
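One way to locate such a critical region programmatically is to flag the first load level at which response time rises well above its low-load baseline. The deck does not say how the region was determined, so the rule below (1.5x the average of the first three samples) is purely an assumption, and the data points are made up for illustration.

```python
def find_cutoff(samples, baseline_count=3, factor=1.5):
    """Return the first user count whose response time exceeds
    factor x baseline, or None if no sample does.

    samples: list of (user_count, response_time), sorted by user_count.
    The baseline is the mean response time of the first few samples.
    """
    baseline = sum(rt for _, rt in samples[:baseline_count]) / baseline_count
    for users, rt in samples:
        if rt > factor * baseline:
            return users
    return None

# Synthetic, made-up data for illustration only (not measured results).
synthetic = [(100, 20), (200, 21), (300, 22), (400, 25), (500, 40), (600, 95)]
cutoff = find_cutoff(synthetic)
print(cutoff)
```

A stricter or looser factor moves the reported cut-off, which is one reason different tools report different absolute user counts for the same hardware.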

Response Time – High Noise

Data from a Newisys 4300-E (4-way) server with 64 GB memory
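Noisy response-time series are usually smoothed before a cut-off is read from them. The sketch below applies a centered rolling median, which suppresses isolated spikes; this is a generic technique rather than the method used for the chart, and the sample values are made up.

```python
from statistics import median

def rolling_median(values, window=5):
    """Centered rolling median; the window shrinks near both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed

# Made-up series with two isolated spikes, for illustration only.
noisy = [20, 22, 180, 21, 23, 25, 24, 300, 26, 28]
smoothed = rolling_median(noisy)
print(smoothed)
```

A median is preferred over a mean here because a single slow response does not drag the smoothed value upward.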

System Performance Analysis

Data from an IWill H8501 (8-way) server with 40 GB memory

Terminal Server Benchmark Caveats

This is a synthetic benchmark

Real-life deployments should plan for more conservative load levels

Small scenario changes may trigger significant changes in CPU usage

The nature of the application greatly influences CPU/memory usage patterns

A lot of deployments focus on in-house applications

Java applications tend to have dramatically higher CPU/memory usage

Different criteria used to determine acceptable load

Other Terminal Server Benchmarking Tools

Fujitsu-Siemens Computers (TS4U)

Similar approach to Microsoft’s

Based on a custom scripting system

Tracks response times to determine load viability

Absolute numbers are lower when compared to those provided by Microsoft tools, but relative numbers are comparable

Other commercial tools

Load Runner

Scapa Technologies

Comparative Benchmark Results

This is a synthetic benchmark with a break point close to system resource exhaustion – real life deployments should plan for more conservative load levels

Based on Knowledge Worker workload

Multiple applications (Word, Outlook, Excel, Internet Explorer)

35 words per minute typing rate

IWill H8501 (8-way) server with 40 GB memory (2 GB Samsung ECC Registered DIMMs)

8 x Single-core Opteron processors – response time cut-off around 540 users

8 x Dual-core Opteron processors – response time cut-off around 580 users

CPU usage at 600 users < 70%

Newisys 4300-E (4-way) with 64 GB memory

4 x Dual-core Opteron processors – response time cut-off around 580 users
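Dividing each cut-off by the number of cores makes the comparison easier to read. The sketch below uses only the cut-off figures listed above; the tuple layout and function name are illustrative.

```python
results = [
    # (configuration, sockets, cores per socket, response time cut-off in users)
    ("IWill H8501, 8 x single-core", 8, 1, 540),
    ("IWill H8501, 8 x dual-core", 8, 2, 580),
    ("Newisys 4300-E, 4 x dual-core", 4, 2, 580),
]

def users_per_core(sockets, cores_per_socket, users):
    """Return the cut-off user count divided by the total core count."""
    return users / (sockets * cores_per_socket)

per_core = {name: users_per_core(s, c, u) for name, s, c, u in results}
for name, value in per_core.items():
    print(f"{name}: {value:.2f} users per core")
```

Doubling the core count on the IWill system raised the cut-off by well under 10%, which, together with CPU usage below 70% at 600 users, suggests that something other than CPU was limiting these runs.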

Considerations For x64 TS Deployments: CPU And Cache

CPU usage is a common limiting factor

Most Terminal Server desktop applications are 32-bit – they run under WoW64 on 64-bit Windows Server

On x64, WoW64 is more of a translation layer – very efficient

If a system is CPU-limited under x86 Windows, it will reach that limit at a slightly lower load under x64 Windows

CPU cache hit ratio drops slightly

Larger 64-bit data structures → fewer fit in cache

For AMD Opteron 875 2.2 GHz, 2 MB L2 cache

99% hit ratio with 32-bit OS

98% hit ratio with 64-bit OS
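A one-point drop in hit ratio looks small, but it doubles the number of misses. The sketch below works that out and shows the effect on average access time; the hit and miss latencies (5 ns and 100 ns) are hypothetical values chosen only to illustrate the calculation, not figures for the Opteron 875.

```python
def miss_ratio(hit_ratio):
    """Return the fraction of accesses that miss the cache."""
    return 1.0 - hit_ratio

def average_access_ns(hit_ratio, hit_ns, miss_ns):
    """Weighted average access time for a given hit ratio."""
    return hit_ratio * hit_ns + (1.0 - hit_ratio) * miss_ns

# Hit ratios from the slide.
HIT_32BIT, HIT_64BIT = 0.99, 0.98
relative_misses = miss_ratio(HIT_64BIT) / miss_ratio(HIT_32BIT)

# Hypothetical latencies, for illustration only.
HIT_NS, MISS_NS = 5.0, 100.0
t32 = average_access_ns(HIT_32BIT, HIT_NS, MISS_NS)
t64 = average_access_ns(HIT_64BIT, HIT_NS, MISS_NS)

print(f"Misses on 64-bit relative to 32-bit: {relative_misses:.1f}x")
print(f"Average access time: {t32:.2f} ns vs {t64:.2f} ns")
```

This is one reason the call to action later recommends processors with larger caches.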

Considerations For x64 TS Deployments: Memory And I/O

Memory consumption increase

Data structures are larger on 64-bit

For example, pointers are 2x larger

Overall memory consumption is about 2x higher when using x64

System File Cache requires more memory for performance similar to 32-bit Windows

Higher number of sessions raises pressure on disk I/O subsystem

Many concurrent file operations

High paging rate

High performance storage is a necessity
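The roughly 2x memory factor feeds directly into RAM sizing. In the sketch below, the per-session working set on x86 (30 MB) and the memory reserved for the operating system (2 GB) are hypothetical placeholders; only the 2x factor comes from the slide. Measured values for the actual workload should replace them.

```python
X64_MEMORY_FACTOR = 2.0   # from the slide: about 2x the x86 consumption

def sessions_that_fit(ram_gb, os_overhead_gb, per_session_mb):
    """Return how many sessions fit in RAM after reserving OS overhead."""
    usable_mb = (ram_gb - os_overhead_gb) * 1024
    return int(usable_mb // per_session_mb)

# Hypothetical inputs, for illustration only.
PER_SESSION_X86_MB = 30
OS_OVERHEAD_GB = 2

per_session_x64_mb = PER_SESSION_X86_MB * X64_MEMORY_FACTOR
fit_40gb = sessions_that_fit(40, OS_OVERHEAD_GB, per_session_x64_mb)
print(f"Sessions that fit in 40 GB on x64: {fit_40gb}")
```

This estimate covers RAM alone; paging behavior and disk I/O pressure, noted above, can lower the practical limit.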

Call To Action

We need to be ready for Terminal Server deployments with 300+ users

As a designer, exploit the right “ingredients” for your server designs

The right processor architecture – x64

The full range of server designs

DP, 4-way, and 8-way

Single- and Dual-core capabilities

High-density memory products – 1/2/4/8 GB ECC Registered DIMMs

As an administrator, configure your servers to be well suited for x64 Terminal Server support

Large memory configurations

Consider processors with larger cache sizes

Consider high performance storage

Community Resources

Windows Hardware and Driver Central (WHDC) www.microsoft.com/whdc/default.mspx

Technical Communities www.microsoft.com/communities/products/default.mspx

Non-Microsoft Community Sites www.microsoft.com/communities/related/default.mspx

Microsoft Public Newsgroups www.microsoft.com/communities/newsgroups

Technical Chats and Webcasts www.microsoft.com/communities/chats/default.mspx www.microsoft.com/webcasts

Microsoft Blogs www.microsoft.com/communities/blogs

Additional Resources

Whitepapers

http://www.microsoft.com/windowsserver2003/techinfo/overview/tsscaling.mspx

Fujitsu-Siemens Computers Terminal Server Sizing Guide http://vilpublic.fujitsu-siemens.com/vil/pc/vil/primergy/performance/sizing/terminal_server_sizing_guide_en.pdf

Bernhard Tritsch’s Terminal Server scalability evaluation http://www.wtstek.com/item2/Article20041125.htm

Other Resources

Windows Server 2003 Resource Kit Tools – TsScaling.exe http://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&displaylang=en

Load Runner http://www.mercury.com

Scapa Technologies http://www.scapatech.com/home.html

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.