Remote Desktop Session Host Capacity Planning in Windows Server 2008 R2
Microsoft Corporation
Published: February 2010
Abstract
The Remote Desktop Session Host (RD Session Host) role service lets multiple concurrent users run Windows®-based applications on a remote computer running Windows Server® 2008 R2. This white paper is intended as a guide for capacity planning of RD Session Host in Windows Server 2008 R2. It describes the most relevant factors that influence the capacity of a given deployment, methodologies to evaluate capacity for specific deployments, and a set of experimental results for different combinations of usage scenarios and hardware configurations.
Copyright Information
The information contained in this document represents the current view of Microsoft Corporation on the
issues discussed as of the date of publication. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft
cannot guarantee the accuracy of any information presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS,
IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights
under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval
system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or
otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
© 2010 Microsoft Corporation. All rights reserved.
Microsoft, Hyper-V, Windows, and Windows Server are trademarks of the Microsoft group of companies.
All other trademarks are property of their respective owners.
Contents
Introduction
Capacity Planning for a Specific Deployment
  Problem statement
  What determines the capacity of a system?
    Usage scenario
    Hardware resources
  Typical evaluation approaches
    Load simulation tests
  Testing methodology
    Test bed configuration
    Load generation
    Response time measurement
    Scenarios
  Examples of test results for different scenarios
Tuning Your Server to Maximize Capacity
  Impact of hardware on server capacity
    CPU
    Memory
    Disk storage
    Network
  Impact of Remote Desktop Services features on server capacity
    32-bit color depth
    Windows printer redirection (XPS)
    Compression algorithm for RDP data
    Desktop Experience pack
  RemoteApp programs
  Hyper-V
  Impact of Windows System Resource Manager (WSRM)
  Comparison with Windows Server 2008
Conclusions
Appendix A: Test Hardware Details
Appendix B: Testing Tools
  Test control infrastructure
  Scenario execution tools
Appendix C: Test Scenario Definitions and Flow Chart
  Knowledge Worker v2
  Knowledge Worker v1
Appendix D: Remote Desktop Session Host Settings
Introduction
The Remote Desktop Session Host (RD Session Host) role service lets multiple concurrent users run Windows®-based applications on a server running Windows Server® 2008 R2. This white paper is intended as a guide for capacity planning of an RD Session Host server running Windows Server 2008 R2.
In a server-based computing environment, all application execution and data processing occurs on the
server. As a consequence, the server is one of the most likely systems to run out of resources under
peak load and cause disruption across the deployment. Therefore it is very valuable to test the
scalability and capacity of the server system to determine how many client sessions a specific server can
support for specific deployment scenarios.
This document presents guidelines and a general approach for evaluating the capacity of a system in the
context of a specific deployment. Most of the key recommendations are also illustrated with examples
based on a few scenarios that use Microsoft® Office applications. The document also provides guidance
on the hardware and software parameters that can have a significant impact on the number of sessions
a server can support effectively.
Capacity Planning for a Specific Deployment
Problem statement
One of the key questions faced by anyone planning a Remote Desktop Session Host server
deployment is: “How many users will this server be able to host?” (or one of its variants: “How much
hardware is required to properly host all my users?” or “What kind of server is required to host <N>
users?”). Determining the system configuration able to support the load generated by users is a typical
challenge faced by any service (such as Microsoft Exchange, Internet Information Services (IIS), SQL
Server). This is a difficult question to answer even for server roles that support workloads defined by a
relatively small set of transactions and parameters that characterize the profile of a workload (DNS is a
good example where the load can be well defined by DNS queries). The RD Session Host servers find
themselves at the other end of the spectrum because the load is defined fundamentally by the deployed
applications, the clients, and the user interaction. While one deployment may host a relatively
lightweight application that users access infrequently and with low resource costs (like a data entry
application), another may host a very demanding CAD application requiring a lot of CPU, RAM, disk
and/or network bandwidth.
There are a few assumptions implied by this question that are worth clarifying:
1. The deployment needs to be sized such that users’ applications perform at an acceptable level.
2. The amount of resources that servers are provisioned with does not significantly exceed the
amount required for meeting the deployment goals.
The performance criterion is difficult to state in objective terms because of the large spectrum of applications that may be involved and the variety of ways users can exercise them. The most common complaint about the performance of RD Session Host server applications is that they are slow or unresponsive, but degradation can take other forms as well: for example, jittery behavior rather than a smooth, even response, sometimes in alternating bursts and lags that can be extremely annoying even when the average performance is acceptable. Tolerance for performance degradation also varies substantially across deployments: some systems are business-critical and accept no substantial degradation at any time, while others tolerate short spans of peak load where performance is quite poor. Clarity about users' performance expectations is therefore a key input when sizing the capacity of a deployment.
Regarding the second goal, it is commonly expected that the planning exercise will estimate resource requirements reasonably close to the values actually needed, without overestimating by large margins. For example, if a server requires 14 gigabytes (GB) of RAM to properly accommodate the target number of 100 users for a certain deployment, including peak load situations (such as all users opening a memory-intensive application at the same time), it is reasonable to expect the estimate coming from the planning exercise to fall within the 14-16 GB range. An estimate of 24 GB of RAM, by contrast, would be a significant waste of resources, because a significant fraction of that RAM (10 GB) would never be used.
What determines the capacity of a system?
Before we discuss the details of testing a certain scenario on a server, it is important to know what factors impact the scalability of the server. At a macro level, these factors fall into two buckets:
Usage scenario
An extremely important factor in determining the capacity of a given server is the usage scenario – the
typical sequence of interactions users have with the applications deployed on the server. A server of a
given hardware configuration may support 2 users or 200 users depending on the scenario. If the
scenario is light in resource usage, the server will be able to support a lot of users. An example of such a
light scenario is a user entering data in a simple line of business application. On the other hand, if the
scenario is heavy in resource usage, the server will not be able to support as many users. An example of
a heavy scenario is a user working with a CAD application or with a complex software development
environment that’s very CPU and input/output intensive.
This means that when trying to estimate the number of users a server can support, that number only
makes sense in the context of a particular scenario. If the scenario changes, the number of supported
users will also change.
Generally the scenario is defined by system software configuration, applications used, specific features
exercised for each application, the amount and content of data being processed, actions performed, and
the speed with which actions are being performed. Following are a few examples of significant factors
that can influence a simple scenario like editing a document:
Is the user typing in Notepad or Microsoft Word?
What version of Microsoft Word is used?
Is the spelling checker enabled?
Does the document contain pictures? Does it contain graphs?
What is the typing speed?
What is the session color depth?
Answering any of the questions incorrectly may throw off the results by significant amounts.
Hardware resources
The server hardware has a major impact on the capacity of a server. The main hardware factors that
have to be considered are CPU, memory, disk storage, and network. The impact of each of these factors
will be addressed in more detail later in this white paper.
Typical evaluation approaches
The above considerations should make it clear that it is not possible to answer the capacity planning questions with reasonable accuracy based on a set of pre-configured numbers. Surveys of Remote Desktop Session Host server deployments show that the overwhelming majority of deployments support between 25 and 150 users, so stating that a deployment will host 85 users with an 85% error margin is accurate, but not very useful. Similarly, taking a number measured on an actual deployment or simulation and applying it to another deployment with significant differences in scenario or hardware configuration is no more useful, given the potential error. Therefore, unless careful consideration is given to the factors affecting the deployment scenario, it is not reasonable to expect high accuracy. There are practical approaches that can help reduce the estimation error to more reasonable values; they typically represent different trade-offs between effort invested and accuracy of results. To enumerate a few:
1. Piloting. This is probably the most common and simple approach. One test server is configured
and deployed, and then load is gradually increased over time while monitoring user feedback.
Based on user feedback, the system load is adjusted up and down until the load stabilizes
around the highest level that provides an acceptable user experience. This approach has the
advantage that it is fairly reliable and simple, but will require initial investments in hardware or
software that may turn out to be ultimately unsuitable for the deployment goals (for example,
the server cannot support enough memory to achieve desired consolidation). This approach can
be further enhanced by monitoring various load indicators (CPU usage, paging, disk and network
queue length etc.) to determine potential bottlenecks, and overcome them by adding hardware
resources (CPUs, RAM, disks, network adapters). However, the lack of control on the level of
load makes it difficult to correlate variation in indicators with actual system activity.
2. Simulation. In this approach, based on data collected about the specific usage scenario, you
build a simulation by using tools that generate various (typically increasing) levels of load
against a test server while monitoring the server's ability to handle user interactions in a
timely manner. This approach requires a fairly high initial investment for building the usage
scenario simulation and relies significantly on the simulated scenario being a good
approximation of the actual usage scenario. However, assuming the simulation is accurate, it
allows you to determine very accurately the acceptable levels of load and the limiting factors,
and offers a good environment for iterating while adjusting various software and hardware
configurations.
3. Projection based on single user systems. This approach uses extrapolation based on data
collected from a single user system. In this case, various key metrics like memory usage, disk
usage, and network usage are collected from a single user system and then used as a reference
for projecting expected capacity on a multi-user system. This approach is fairly difficult to
implement because it requires detailed knowledge of system and application operations.
Furthermore, it is rather unreliable because the single user system data contain a significant
level of “noise” generated by interference with the system software. Also, in the absence of
sophisticated system modeling, translating the hardware performance metrics (CPU speed, disk
speed) to the target server from the reference system used to collect the data is a complex and
difficult process.
In general, the first approach will prove to be more time and cost effective for relatively small
deployments, while the second approach may be preferable for large deployments where making an
accurate determination of server capacity could have a more significant impact on purchasing decisions.
Load simulation tests
Load simulation, as outlined above, is one of the more accurate techniques for estimating the capacity
of a given system. This approach works well in a context in which the user scenarios are clearly
understood, relatively limited in variation, and not very complicated. Generally it involves several
distinct phases:
1. Scenario definition. Having a good definition of the usage scenarios targeted by the deployment
is a key prerequisite. Defining the scenarios may turn out to be complicated, either because of
the large variety of applications involved or complex usage patterns. Getting a reasonably
accurate usage scenario is likely the most costly stage of this approach. It is equally important to
capture not only the right sequence of user interactions, but also to use the right data content
(such as documents, data files, media content) because this also may play a significant role in
the overall resource usage on the system. Such a scenario can be built based on interviews with
users, monitoring user activity, tracking metrics on key infrastructure servers, project goals, etc.
2. Scenario implementation. In this phase, an automation tool is used to implement the scenario
so that multiple copies can be run simultaneously against the test system. An ideal automation
tool drives the application user interface from the Remote Desktop Connection client, has a
negligible footprint on the server, is reliable, and tolerates well the variations in application
behavior caused by server congestion. At this stage, it is also important to have a clear idea of the metrics
used to gauge how viable the system is at various load levels and to make sure that the scenario
automation tools accommodate collecting those metrics.
3. Test bed setup. The test bed typically lives on an isolated network and includes 3 categories of
computers:
a. The RD Session Host server(s) to be tested
b. Infrastructure servers required by the scenario (such as IIS, SQL Server, Exchange) or
that provide basic services (DNS, DHCP, Active Directory)
c. Test clients used to generate the load
Having an isolated network is a very important factor because it avoids interference of network
traffic with either the Remote Desktop Connection traffic or the application-specific traffic. Such
interference may cause random slowdowns that would affect the test metrics and make it
difficult to distinguish such slowdowns from the ones caused by resource exhaustion on the
server.
4. Test execution. Typically this consists of gradually increasing the load against the server while
monitoring the performance metrics used to assess system viability. It is also a good idea to
collect various performance metrics on the system to help later in identifying the type of
resources that come under pressure when system responsiveness degrades. This step may be
repeated for various adjustments made based on conclusions derived from step 5.
5. Result evaluation. This is the final step where, based on the performance metrics and other
performance data collected during the test, you can make a determination of the acceptable
load the system can support while meeting the deployment performance requirements and the
type of resources whose shortage causes the performance to start degrading. The conclusions
reached in this step can be a starting point for a new iteration on hardware adjusted to mitigate
the critical resource shortage in order to increase load capacity.
Coming up with a single application-independent criterion for defining when application performance
degrades is fairly difficult. However, there is an interaction sequence that captures the most
fundamental transaction of an interactive application: sending input, such as from a keyboard or mouse,
to the application and having the application draw something back in response. The most trivial case of
this would be typing, but other interactions like clicking a button, or selecting a check box or menu item
also map in a very straightforward way to this type of transaction. The reason this interaction pattern
stands out is that it captures the fundamental intention of connecting to a remote desktop: allowing a
user to interact with a rich user interface running on a remote system the same way he or she would if
the application were running locally. Although this metric will not cover all relevant metrics for tracking
application performance, it is a very good approximation for many scenarios, and degradation measured
through this metric correlates well in general with degradation from other metrics.
This capacity evaluation approach is what we recommend when a reasonably accurate number is
required, especially for cases like large system deployments where sizing the hardware accurately has
significant implications in terms of cost and a low error margin is desirable. We used the same approach
for the experimental data that we used to illustrate various points in this document, for the following
reasons:
This approach allowed us to make fairly accurate measurements of the server capacity under
specific conditions.
It makes it possible for independent parties to replicate and confirm the test results.
It allows a more accurate evaluation of various configuration changes on a reference test bed.
Testing methodology
We included various results obtained in the Microsoft test laboratories to illustrate many of the assertions made in this document. The tests used a set of tools developed specifically for the purpose of Remote Desktop Session Host server load test simulations, so that they meet all the requirements outlined above for effective load test execution. These tools were used to implement a few scenarios based on Office 2007 and Internet Explorer. Response times for various actions across the scenarios were used to assess the acceptable level of load under each configuration.
Test bed configuration
The Remote Desktop test laboratory configuration is shown in Figure 1.
Figure 1 – Test setup configuration
Windows Server 2008 R2 and Office 2007 were installed by using the settings described in Appendix D.
The test tools were deployed on the test controller, workstations, and test server as described
previously. User accounts were created for all test users and their profiles were configured. For
each user in the Knowledge Worker scenario, this included copying template files used
by the applications, setting up a home page on Internet Explorer, and configuring an e-mail account in
Outlook. An automated restart of the server and client workstations was performed before each test-
run to revert to a clean state for all the components.
Load generation
The test controller was used to launch automated scenario scripts on the workstations. Each script,
when launched, starts a remote desktop connection as a test user to the target server and then runs the
scenario. The Remote Desktop users were started by the test controller in groups of ten with 30 seconds
between successive users. After the group of ten users was started, a 5-minute stabilization period was
observed in which no additional sessions were started before starting with the next group. What this
means is that it takes 4 minutes and 30 seconds to start 10 users. Taking into account the 5-minute
stabilization period, the controller takes 1 hour and 30 minutes to start 100 users.
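The ramp-up schedule described above can be sketched as a short calculation. The helper below is hypothetical (it is not part of the Microsoft test tools), but under the stated assumptions it reproduces the 4 minutes 30 seconds needed to start a group of ten and the 1 hour 30 minutes needed to start the last of 100 users:

```python
def session_start_offsets(total_users, group_size=10, user_gap_s=30,
                          stabilization_s=300):
    """Start offset in seconds for each simulated user.

    Users within a group start user_gap_s apart; after a group's last
    user starts, a stabilization period passes before the next group's
    first user starts.
    """
    group_period = (group_size - 1) * user_gap_s + stabilization_s
    offsets = []
    for user in range(total_users):
        group, slot = divmod(user, group_size)
        offsets.append(group * group_period + slot * user_gap_s)
    return offsets

offsets = session_start_offsets(100)
# The tenth user starts 270 s (4 min 30 s) in; the hundredth starts
# at 5400 s, that is, at the 90-minute mark.
```

Note that the 5-minute stabilization period after the final group is not counted here, which is why the total comes to 90 minutes rather than 95.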
This approach of logging on users one at a time has two advantages. First, it ensures that we don't
overwhelm the server by logging on 100 users at the same time. Second, we can look at the resulting
data from the test and point to a specific number of users after which the server became unresponsive.
From the results in the following sections it can be seen that the number of supported users has been
reported to the nearest 10. The reason for this is that we use a group size of 10 users and the level of
precision that we get from the test data is not sufficient to clearly distinguish between users from the
same group.
Response time measurement
A user scenario is built by grouping a series of actions. An action sequence starts with the test script
sending a key stroke through the client to one of the applications running in the session. As a result of
the key stroke, the application does some drawing. For example, sending CTRL-F to Microsoft Word
results in the application drawing the File menu.
The test methodology is based on measuring the response time of all actions that result in drawing
events (except for typing text). The response time is defined as the time taken between the key stroke
and the drawing that happens as a result. A timestamp (T1) is taken on the client side when the test
tools on the client send a keystroke to the Remote Desktop client. When the drawing happens in the
server application, it is detected by a test framework tool that runs inside each Remote Desktop session.
The test tool on the server side sends a confirmation to the client side tools and at this point the client
side tools take another timestamp (T2). The response time of the action is calculated as T2 − T1. This
measurement gives an approximation of the actual response time. It is accurate to within a few
milliseconds (ms).
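In outline, the client-side timing works like the following sketch. The two callables are placeholders for the test tools' keystroke injection and the server-side drawing confirmation, which this paper does not expose; only the T1/T2 bracketing is taken from the description above:

```python
import time

def measure_response_ms(send_keystroke, wait_for_draw_confirmation):
    """Return the action response time (T2 - T1) in milliseconds."""
    t1 = time.monotonic()          # T1: keystroke handed to the Remote Desktop client
    send_keystroke()
    wait_for_draw_confirmation()   # blocks until the in-session tool confirms drawing
    t2 = time.monotonic()          # T2: confirmation received on the client side
    return (t2 - t1) * 1000.0
```

A monotonic clock is used because the two timestamps are taken on the same machine and only their difference matters; wall-clock adjustments must not distort the interval.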
The response time measurement is important because it is the most reliable and direct measurement of
user experience as defined by system responsiveness. Looking at performance metrics such as CPU
usage and memory consumption only gives us a rough idea as to whether the system is still within
acceptable working conditions. For example, it is difficult to qualify exactly what it means for the users if
the CPU is at 90% utilization. The response times tell us exactly what the users will experience at any
point during the test.
As the number of users increases on a server, the response times for all actions start to degrade after a
certain point. This usually happens because the server starts running out of one or more hardware
resources. A degradation point is determined for the scenario beyond which the server is considered
unresponsive and therefore beyond capacity. To determine the degradation point for the entire
scenario, a degradation point is determined for each action based on the following criteria:
For actions that have an initial response time of less than 200 ms, the degradation point is
considered to be where the average response time is more than 200 ms and 110% of the initial
value.
For actions that have an initial response time of more than 200 ms, the degradation point is
considered to be the point where the average response time increases by more than 10% over the initial value.
These criteria are based on the assumption that a user will not notice degradation in a response time
when it is lower than 200 ms.
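The two per-action criteria, together with the 20% whole-test rule applied later in this section, can be written down directly. This is a sketch, not the actual test framework; it assumes per-action initial and average response times in milliseconds:

```python
def action_degraded(initial_ms, average_ms, threshold_ms=200):
    """Per-action degradation test from the two criteria above."""
    if initial_ms < threshold_ms:
        # Degraded only once the average is both above 200 ms and
        # above 110% of the initial response time.
        return average_ms > threshold_ms and average_ms > 1.10 * initial_ms
    # Already above 200 ms initially: degraded once the average grows
    # by more than 10% over the initial value.
    return average_ms > 1.10 * initial_ms

def whole_test_degraded(initial_by_action, average_by_action, fraction=0.20):
    """Whole-test rule: at least 20% of the actions have degraded."""
    degraded = sum(
        action_degraded(initial_by_action[a], average_by_action[a])
        for a in initial_by_action)
    return degraded >= fraction * len(initial_by_action)
```

For example, an action that starts at 100 ms is not considered degraded at 180 ms (still under the 200 ms threshold), while an action that starts at 300 ms is degraded as soon as its average exceeds 330 ms.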
Generally, when a server reaches CPU saturation, the response time degradation point for most actions
is reached at the same number of users. In situations where the server is running out of memory, the
actions that result in file input/output degrade faster than others (because of high paging activity
resulting in congestion in the input/output subsystem), such as opening a dialog box to select a file to
open or save. For the purpose of this testing, the degradation point for the whole test was determined
to be the point where at least 20% of the user actions have degraded. A typical user action response
time chart is shown in Figure 2. According to the criteria described above, the degradation point for this
action is at 150 users.
Figure 2 – Response time evaluation
Scenarios
The scenarios used for testing are automated and meant to simulate real user behavior. Although the
scripts used in these scenarios simulate tasks that a normal user could perform, the users simulated in
these tests are tireless—they never reduce their intensity level. The simulated clients type at a normal
rate, pause as if looking at dialog boxes, and scroll through mail messages as if to read them, but they do
not get up from their desks to get a cup of coffee, they never stop working as if interrupted by a phone
call, and they do not break for lunch. The tests thus have a rather robotic quality, with users
exercising the same functions and data sets during a thirty-minute period of activity. This approach
yields accurate but conservative results.
Knowledge Worker v2
The knowledge worker scenario consists of a series of interactions with Microsoft Office 2007
applications (Word, Excel, Outlook, and PowerPoint) and Internet Explorer. The set of actions and their
frequency in Office segments of the scenario are based on statistics collected from the Software Quality
Management data submitted by Office users and should represent a good approximation of an “average”
Office user. The scenario includes the following:
Creating and saving Word documents
Printing spreadsheets in Excel
Using e-mail communication in Outlook
Adding slides to PowerPoint presentations and running slide shows
Browsing Web pages in Internet Explorer
This scenario is described in detail in Appendix C.
Knowledge Worker v2 with text-only presentation
This scenario is identical to the Knowledge Worker scenario above except for one difference: the
PowerPoint presentation file used in this scenario is a text-only version, whereas the file used in the
original Knowledge Worker scenario is rich in content. Comparing these two scenarios is interesting
because it reveals how small differences in a scenario can impact the capacity of the server.
Knowledge Worker v2 without PowerPoint
This scenario is similar to the Knowledge Worker scenario in most ways. The significant difference in this
case is that the light Knowledge Worker scenario does not use PowerPoint. The duration of the scenario
is the same as the Knowledge Worker scenario, but instead of spending time using PowerPoint, the user
spends more time typing Word documents, filling Excel spreadsheets, and typing e-mail messages. This
scenario is significantly lighter in terms of CPU usage compared to the Knowledge Worker scenario
because PowerPoint, while taking only ~10% of the total work cycle duration, uses more than half of the
CPU. This also generates significant variation in the CPU usage during the work cycle, with much higher
levels of CPU usage during the short PowerPoint interaction sequence. There were two reasons to
introduce this scenario: PowerPoint usage data shows that it is not as widely used as the other Office
applications in the mix and this scenario gives an alternate angle on examining various factors due to its
relatively lighter load and smoother variations in resource usage.
Knowledge Worker v1
This is the Knowledge Worker scenario that was used for testing in the Windows Server 2003 Terminal
Server Capacity and Scaling (http://go.microsoft.com/fwlink/?LinkId=178901) white paper. This scenario
was significantly different from the current Knowledge Worker v2, and is described in detail in Appendix
A.
Examples of test results for different scenarios
Server Configuration | Scenario | Capacity
HP DL 585, 4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Knowledge Worker v2 | 150 users
HP DL 585, 4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Knowledge Worker v1 | 230 users
HP DL 585, 4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Knowledge Worker v2 with text-only presentation | 200 users
HP DL 585, 4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Knowledge Worker v2 without PowerPoint | 230 users
Table 1 - Server capacity by scenario
Table 1 shows the comparison of server capacity between different scenarios. The capacity numbers are
determined by using the criteria outlined above, but these numbers should be treated with caution and
may need to be adjusted for the real deployments.
The most important observation about these results is that relatively minor tweaks to the scenario have
a significant impact on scalability. Although both tests use a PowerPoint presentation containing the
same text, the difference in the way it is rendered accounts for a 33% variation in capacity. And although
the PowerPoint interaction is only ~10% of the total scenario execution cycle, removing it increased the
capacity by ~53%. These examples serve as a strong reminder that careful consideration of the scenario
used for capacity measurements is paramount to obtaining accurate numbers. They also make a
compelling case that off-the-shelf numbers are of limited use for capacity planning; any such numbers
need to be customized to your particular deployment.
Server Configuration | Scenario | Capacity
HP DL 385, 2 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 24 GB memory | Knowledge Worker v2 | 80 users
HP DL 585, 4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Knowledge Worker v2 | 150 users
4 x AMD Opteron quad-core CPUs, 2.4 GHz, 2048 KB L2 cache, 128 GB memory | Knowledge Worker v2 | 310 users
HP DL 585, 4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Knowledge Worker v2 without PowerPoint | 230 users
4 x AMD Opteron quad-core CPUs, 2.4 GHz, 2048 KB L2 cache, 128 GB memory | Knowledge Worker v2 without PowerPoint | 450 users
Table 2 - Server capacity by hardware configuration
As expected, hardware configuration changes also play a big role in the capacity numbers. With
the new x64-based architecture removing some fundamental constraints in the x86-based Windows
Server architecture, properly configured servers should be able to accommodate large numbers of users
for many mainstream workloads. There is no reason to expect that RD Session Host servers are
inherently limited to a certain number of users.
Tuning Your Server to Maximize Capacity
In the remainder of this document we will explore a series of hardware and software configuration
changes to assess their impact on the capacity of a server. The numbers below are specific to the
hardware and scenarios used in our tests and will likely differ for other scenarios/hardware
configurations, but they should still be able to give a good sense of the order of magnitude and direction
in which such a configuration change could impact a Remote Desktop Services deployment.
In general, there are two main categories of questions we are trying to address:
1. How can you tune a system to increase capacity?
2. What is the impact of turning on a certain feature?
Impact of hardware on server capacity
There are a few general considerations as to what makes a suitable server for a Remote Desktop
Session Host deployment, giving a reasonable approximation of a good server without taking the
specific scenario into consideration. There is a good range of 2U form factor servers today that have:
2 processor sockets (some even 4), supporting 8 to 12 cores (16 in the near future, when 8-core
processors become available)
4 to 9 memory DIMM slots per processor, which can be populated with up to 32–72 GB of RAM by
using cost-effective 4-GB modules
8 2.5” SAS/SATA drive slots
You can start with such a server, configured for 16 GB of RAM and with 4 disks and then, based on actual
usage data, extend RAM or disk configuration to accommodate more users. These servers have a very
good price/performance ratio, good rack density, very good storage support, and can accommodate a
lot of RAM if needed. They give you a lot of flexibility to tune the configuration to specific usage while
being very easy to scale out after there is a need for more capacity.
Going forward, we are going to focus on the hardware factors that most significantly impact the server
capacity: CPU, memory, disk storage, and network. The test results are presented below for each of
these.
CPU
The data presented in Table 3 was obtained by using two different test servers. The only difference
between them was that one had a single quad-core CPU and the other had two quad-core CPUs.
Server Configuration | Scenario | Capacity
AMD Opteron quad-core CPU, 2.7 GHz, 512 KB L2 cache, 32 GB memory | Knowledge Worker v2 | 110 users
2 x AMD Opteron quad-core CPUs, 2.7 GHz, 512 KB L2 cache, 32 GB memory | Knowledge Worker v2 | 200 users
AMD Opteron quad-core CPU, 2.7 GHz, 512 KB L2 cache, 32 GB memory | Knowledge Worker v2 without PowerPoint | 180 users
2 x AMD Opteron quad-core CPUs, 2.7 GHz, 512 KB L2 cache, 32 GB memory | Knowledge Worker v2 without PowerPoint | 300 users
Table 3 - Server capacity by CPU configuration and scenario
The data in Table 3 shows the results for two different scenarios. An important consideration here is
that the resource that determines capacity on all these systems is the CPU, which is one of the
resources most often subjected to unexpected variations and pressure points. Therefore, in a real-life
deployment it is more prudent to set aside a fraction of CPU resources to act as a cushion when
unexpected spikes of activity happen on the server (such as everyone using a certain application at the
same time). Another factor that plays a significant role in this decision is the quality of service expected
by the users: the higher the expectation, the larger the spare capacity that needs to be provisioned.
Such a margin could range anywhere from 10% to 50% of the overall capacity, and the capacity
numbers should be adjusted accordingly.
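The headroom adjustment described above amounts to a one-line calculation; this hypothetical helper simply discounts a measured capacity figure by the chosen spare-CPU margin.

```python
def provisioned_capacity(measured_capacity, headroom):
    """Apply a spare-CPU margin (anywhere from 10% to 50%, per the
    guidance above) to a measured capacity number. headroom=0.20
    keeps 20% of the measured capacity in reserve."""
    return int(measured_capacity * (1 - headroom))
```

For example, a server measured at 150 users with a 20% reserve would be provisioned for 120 users; with a 50% reserve, for 75.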
As expected, an increase in CPU power allows a server to support more users if no other limitations are
encountered. The most interesting measure of how increasing CPU capacity affects overall server
capacity is the scaling factor, defined as the ratio by which server capacity increases when CPU capacity
doubles. This scaling factor is always smaller than 2 on a system where there is no limitation other than
CPU. It is also a function of the initial number of CPUs involved, and decreases as the number of CPUs
increases (the scaling factor for going from 1 to 2 CPUs is larger than the one for going from 2 to 4
CPUs). Typically, the scaling factor for Remote Desktop Session Host servers falls in the 1.5 to 1.9 range.
Although the same hardware box was used, different scenarios yielded different scaling factors: the
normal script version yielded a scale factor of ~1.8, and the version without PowerPoint yielded a factor
of 1.67. The reason for this is that the scenario that included PowerPoint had more variation in CPU
usage, and the system with more CPU capacity available softened the impact of local usage peaks that
can overwhelm the less powerful system.
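The scaling factors quoted above follow directly from the Table 3 capacities; a minimal sketch of the calculation:

```python
def scale_factor(capacity_before_doubling, capacity_after_doubling):
    """Ratio by which server capacity grows when CPU capacity doubles."""
    return capacity_after_doubling / capacity_before_doubling

# Table 3 numbers: going from 1 to 2 quad-core CPUs.
assert round(scale_factor(110, 200), 2) == 1.82  # Knowledge Worker v2
assert round(scale_factor(180, 300), 2) == 1.67  # ...without PowerPoint
```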
Let’s take a look at the CPU usage profile for the test scenarios in more detail to understand how the
variance and fluctuation in server load impacts server capacity on a CPU limited system.
Figure 3 - CPU usage for Knowledge Worker without PowerPoint
The CPU curve in Figure 3 shows a general increase in CPU usage (green curve) as the number of active
users increases (blue curve). Looking at the CPU curve closely, we can see that every time there is an
increase in users, the CPU curve hits a peak. This peak is followed by a decline as the number of users
becomes constant for a while. This pattern is repeated throughout the test while the overall CPU keeps
rising. The CPU peak results from logon activity associated with the users that are logging on at that time
on the server. Users log on in groups of 10. Each group of users logs on within 5 minutes before the test
enters a steady state for another 5 minutes. Because the users are being logged on so close together,
the CPU spike caused by each user logon overlaps with the ones caused by users preceding/following
them and results in one large CPU peak for the group of 10 users.
The size of this CPU logon peak impacts the server capacity measurement. Server capacity is reached on
a CPU limited system when the CPU usage reaches close to saturation (100% usage). The slope of the
CPU curve is determined by the steady state load on the system as the number of users increases (this is
the CPU usage minus the logon peaks as depicted by the orange curve in Figure 3). If there was no
logon-related CPU activity, the server would reach capacity when this curve hits 100%. In reality, the
CPU hits 100% sooner because the logon peaks touch 100% (marked as 100% CPU Peak in Figure 3). The
bigger the peaks are, the sooner the CPU curve will touch 100%.
The size of the CPU logon peak is dependent on the total processing power of the server. On a 4-core
computer, the logon peak will be larger than on an 8-core computer. The 8-core computer has more
processing power to absorb the impact of the logon peak. This means that a scenario will be able to
reach further on the steady state CPU curve (the orange curve) on computers with more processing
power.
Figure 4 - Knowledge Worker CPU usage
The other thing to consider when looking at the CPU usage pattern is the variance of the workload in the
scenario. In terms of CPU usage, the variance of the workload is low when all parts of the scenario are
equally CPU intensive. If the variance is low, the CPU usage pattern will be very uniform as in Figure 3. If
the variance is high, the CPU usage pattern will be non-uniform and this can impact the server capacity.
The variance of the Knowledge Worker scenario with PowerPoint is higher when compared to the
Knowledge Worker without PowerPoint. This is because the PowerPoint part of the scenario is much
more CPU-intensive when compared to the other parts of the scenario. This means that if several users
happen to start working in PowerPoint, the CPU usage jumps up across the system. When this phase
coincides with a user logon peak, the result is that the CPU peak becomes much higher than usual.
Figure 4 shows the CPU usage profile of the Knowledge Worker scenario. The peaks where logon activity
overlaps with a high number of users working in PowerPoint are marked in Figure 4 as "High CPU Peak."
It is not easy to predict when these high peaks will occur during the test beyond a few groups of users
because it becomes increasingly difficult to calculate what all the users are doing at a given time.
Because of these very high peaks, the CPU usage hits 100% even sooner. This means that a scenario with
a low CPU variance will scale better than one with high CPU variance. Also, in this case a computer with
more processing power is able to mitigate the impact of CPU variance and the high peaks and thus
scales better.
Memory
Determining the amount of memory necessary for a particular use of an RD Session Host server is
complex. It is possible to measure how much memory an application has committed—the memory the
operating system has guaranteed the application that it can access. But the application will not
necessarily use all that memory, and it certainly is not using all that memory at any one time. The subset
of pages that an application has accessed recently is referred to as the “working set” of that process.
Because the operating system can page the memory outside a process’s working set to disk without a
performance penalty to the application, the working set is a much better measure of the amount of
memory needed.
The process performance object's working set counter, used on the _Total instance of the counter to
measure all processes in the system, measures how many bytes have been recently accessed by threads
in the process. However, if the free memory in the computer is sufficient, pages are left in the working
set of a process even if they are not in use. If free memory falls below a threshold, unused pages are
trimmed from working sets.
The method used in these tests for determining memory requirements cannot be as simple as observing
a performance counter. It must account for the dynamic behavior of a memory-limited system.
The most accurate method of calculating the amount of memory required per user is to analyze the
results of several performance counters [Memory\Pages Input/sec, Memory\Pages Output/sec,
Memory\Available Bytes and Process\Working Set(_Total)] in a memory-constrained scenario. When a
system has abundant physical RAM, the working set will initially grow at a high rate, and pages will be
left in the working set of a process even if they are not in use. Eventually, when the total working set
tends to exhaust the amount of physical memory, the operating system will be forced to trim the
unused portions of the working set until enough pages are made available to free up the memory
pressure. This trimming of unused portions of the working sets will occur when the applications
collectively need more physical memory than is available, a situation that requires the system to
constantly page to maintain all the processes’ working sets. In operating systems theory terminology,
this constant paging state is referred to as “thrashing.”
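The three stages just described can be sketched as a rough classifier over the paging counters. This is an illustrative assumption: the 1,500 pages/sec boundary mirrors the Zone 2 example given later in this section and is not a universal threshold.

```python
def memory_zone(pages_output_per_sec, trimming_started):
    """Illustrative three-stage classifier for the memory behavior
    described above. Thresholds are assumptions, not measured values."""
    if not trimming_started:
        return 1  # abundant memory: nothing is paged out, even unused pages
    if pages_output_per_sec < 1500:
        return 2  # periodic trimming of unused working-set pages
    return 3      # high pressure: constant paging ("thrashing")
```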
Figure 5 shows the values of several relevant counters from a Knowledge Worker test when performed
on a server with 8 GB of RAM installed.
Figure 5 - Stages of memory usage
Zone 1 represents the abundant memory stage. This is when physical memory is greater than the total
amount of memory that applications need. In this zone, the operating system does not page anything to
disk, even seldom used pages.
Zone 2 represents the stage when unused portions of the working sets are trimmed. In this stage the
operating system periodically trims the unused pages from the processes’ working sets whenever the
amount of available memory drops to a critical value. Each time the unused portions are trimmed, the
total working set value decreases, increasing the amount of available memory, which results in a
significant number of pages being written to page files. As more processes are created, more memory is
needed to accommodate their working sets, and the number of unused pages that can be collected
during the trimming process decreases. The page-input rate is mostly driven by pages required when
creating new processes. The average is typically below the page-output rate. This state is acceptable as
long as the system has a suitable disk storage system. The applications should respond well because, in
general, only unused pages are being paged to disk.
Zone 3 represents the high pressure zone. The working sets are trimmed to a minimal value and mostly
contain pages that are frequented by the greater number of users. Page faults will likely cause the
ejection of a page that will need to be referenced in the future, thus increasing the frequency of page
faults. The output per second of pages will increase significantly, and the page-output curve follows the
shape of the page-input curve to some degree. The system does a very good job of controlling
degradation, almost linearly, but the paging activity increases to a level where the response times are
not acceptable.
In Figure 5, it seems as though the amount of physical memory is greater than 8 GB, because the
operating system does not start to trim working sets until the total working set is well above 14 GB. This
is due to cross-process code sharing, which makes it appear as if working sets use more memory than is
actually available.
To determine the amount of memory needed per user by the system, we have to look at the three zones
again. Zone 1 is a clearly acceptable working stage for the system, while Zone 3 is clearly unacceptable.
Zone 2 needs more careful consideration. The average total paging activity (pages input and pages
output) steadily rises during this stage. In the example above, the paging activity increases from around
50 pages per second to over 1500 pages per second. This translates into an ever increasing disk access
activity. During this stage, how responsive the system will be is determined by the throughput of the
disk storage system. If, for example, the system is using only a local disk for its storage with a
low throughput, its responsiveness will be unacceptable anywhere in Zone 2. On the other hand, if the
disk storage system is capable of handling this level of disk activity, the system will be responsive during
the entire Zone 2. Even with a responsive disk storage system, it is generally good to be conservative
about choosing the spot in Zone 2 where you think the system will still be responsive. A good rule of
thumb is to choose the point where the operating system does the second large trimming of the
working set (this is the point of the second large spike on the page-output curve marked as 'optimal
point' in Figure 5). The user response times should also be looked at to verify that they are acceptable at
this point.
The amount of memory required per user can be estimated by dividing the total amount of memory in
the system by the number of users at the optimal point in Zone 2. Such an estimate would not account
for the memory overhead required to support the operating system. A more precise measurement can
be obtained by running this test for two different memory configurations (for example, 4 GB and 8 GB),
determining the number of users, and dividing the difference in memory size (8 GB – 4 GB in this case)
by the difference in number of users at the optimal point in Zone 2. In practice, the amount of memory
required for the operating system can be estimated as the memory consumed before the test starts. In
the above example, the optimal point in Zone 2 is where the system has 110 active users logged on. The
total memory available at the start of the test was 7500 MB (the remainder having been consumed by
the operating system). These numbers mean that each user requires approximately 68 MB of memory.
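The two estimation methods described above reduce to simple arithmetic. The two-configuration figures below are hypothetical; only the 7500 MB / 110-user example comes from the text.

```python
def memory_per_user(total_available_mb, users_at_optimal_point):
    """Single-configuration estimate: available memory at test start
    divided by users at the Zone 2 optimal point."""
    return total_available_mb / users_at_optimal_point

def memory_per_user_two_configs(mem_a_mb, users_a, mem_b_mb, users_b):
    """Two-configuration estimate: differencing two runs cancels out
    the operating system's fixed memory overhead."""
    return (mem_a_mb - mem_b_mb) / (users_a - users_b)

# Example from the text: 7500 MB available, 110 users -> ~68 MB/user.
assert round(memory_per_user(7500, 110)) == 68
```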
Although a reasonable amount of paging is acceptable, paging naturally consumes a small amount of the
CPU and other resources. Because the maximum number of users that could be loaded onto a system
was determined on systems with abundant physical RAM, a minimal amount of paging occurred. The
working set calculations assume that a reasonable amount of paging has occurred to trim the unused
portions of the working set, but this would only occur on a system that was memory-constrained. If you
take the base memory requirement and add it to the number of users multiplied by the required
working set, you end up with a system that is naturally memory-constrained, and therefore acceptable
paging will occur. On such a system, expect a slight decrease in performance due to the overhead of
paging. This decrease in performance can reduce the number of users who can be actively working on
the system before the response time degrades above the acceptable level.
Comparison of different memory configurations
Server Configuration | Model Number | Knowledge Worker Capacity
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 8 GB memory | DL585 | 120 users
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 4 GB memory | DL585 | 60 users
Table 4 - Server capacity by memory configuration
Table 4 shows the comparison of server capacity between different memory configurations. On systems
where physical memory is the limiting factor, the number of supported users increases linearly with the
amount of physical memory.
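The linear relationship shown in Table 4 suggests a simple sizing sketch: subtract the operating system's fixed overhead and divide by the per-user working set. The 692 MB overhead value below is an assumption for illustration; the 68 MB/user figure comes from the memory discussion above.

```python
def memory_limited_capacity(total_mb, os_overhead_mb, per_user_mb):
    """Memory-limited capacity estimate: users scale linearly with
    installed RAM once physical memory is the limiting factor."""
    return (total_mb - os_overhead_mb) // per_user_mb

# Hypothetical 8-GB server with an assumed 692 MB OS overhead.
assert memory_limited_capacity(8192, 692, 68) == 110
```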
Disk storage
Storage access is a very significant factor in determining server capacity and needs to be considered
carefully. Although the Knowledge Worker scenarios are not very demanding in terms of storage
performance (they average about 0.5 disk operations per second per user), they still provide a good
high-level view of what the concerns are in this space.
In general, these are the storage areas most likely to face high input/output loads:
1. The storage for user profiles will likely have to handle most of the input/output activity related
to file access because it holds user data, temporary file folders, application data, etc. Some of
this may be alleviated if folder redirection is configured to re-route some of the traffic to
network shares.
2. The storage holding system binaries and applications will service IOs during process creation and
application launch and page faults to executable files under higher memory pressure. This is
generally not much of a problem if the binaries (especially dlls) are not rebased during load
because their code pages are shared across processes (and across session boundaries).
3. The storage holding page files will be solicited only if the system is running low on memory, but
may face significant input/output load even under relatively moderate memory pressure
conditions due to the large amount of RAM involved. You can expect that initial trimming passes
will reclaim as much as 25% of the overall RAM size, which on a 16-GB system is 4 GB, a very
large amount of data that needs to be transferred in a relatively short period of time to disk.
Due to the potential high level of input/output involved in paging operations, we recommend isolating
the page file to its own storage device(s) to avoid its interference with the normal file operations
generated by the workload. We also recommend tracking dll base address collision/relocation problems
to avoid both unnecessary input/output traffic and memory usage.
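The page-file write burst mentioned in point 3 is easy to estimate up front when sizing the dedicated page-file storage; this helper simply applies the up-to-25% reclaim figure stated above.

```python
def initial_trim_write_gb(ram_gb, reclaim_fraction=0.25):
    """Upper-bound estimate of data written to the page file by the
    initial working-set trimming passes (up to 25% of RAM, per the
    text above)."""
    return ram_gb * reclaim_fraction

# The 16-GB example from the text: roughly 4 GB written to disk.
assert initial_trim_write_gb(16) == 4.0
```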
Network
By default, the data sent over Remote Desktop connections is compressed for all connections, which
reduces the network usage for Remote Desktop scenarios. Network usage for two scenarios is shown in
Figure 6. This includes all traffic coming in and going out of the RD Session Host server for these
scenarios.
Figure 6 - Network usage by scenario
It is apparent from this figure that the total network traffic on the server (inbound and outbound) can
vary considerably depending on the scenario: approximately 14,000 bytes per user for Knowledge
Worker v2, 3,560 for the text-only presentation variant, and 2,800 for Knowledge Worker v1. The
Knowledge Worker scenario uses richer graphics compared to the other scenarios, especially because of
the PowerPoint presentation slide show that is a part of the scenario. As can be expected, this results in
higher network usage.
Figure 7 shows network usage in bytes per user for the Knowledge Worker scenario. This is taken from
the Bytes Total/sec counter in the Network Interface performance object. The graph illustrates how the
bytes-per-user average was calculated: it converges on a single number once a sufficient number of
simulated users are running through their scripts. The number of user sessions is plotted on the primary
axis. The count includes both bytes received and sent by the RD Session Host server by using any
network protocol.
Figure 7 - Knowledge Worker scenario network usage per user
The network utilization numbers in these tests only reflect RDP traffic and a small amount of traffic from
the domain controller, Microsoft Exchange Server, IIS Server, and the test controller. In these tests, the
RD Session Host server’s local storage drives are used to store all user data and profiles; no network
home directories were used. In a normal RD Session Host server environment, there will be more traffic
on the network, especially if user profiles are not stored locally.
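The converging per-user average shown in Figure 7 can be sketched as follows; the sample values are illustrative, not measured data.

```python
def bytes_per_user(bytes_total_per_sec_samples, active_session_samples):
    """Pairwise Bytes Total/sec divided by concurrent active sessions,
    averaged over the run; the ratio converges on a single number once
    enough users are connected."""
    ratios = [
        b / u
        for b, u in zip(bytes_total_per_sec_samples, active_session_samples)
        if u > 0  # skip samples taken before any session is active
    ]
    return sum(ratios) / len(ratios)
```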
Impact of Remote Desktop Services features on server capacity
Server capacity can be impacted by choosing to use certain features and settings as opposed to the
system defaults. The default settings used for the tests performed for this white paper are described in
Appendix B. The impact of using some Remote Desktop Services features on server capacity is described
below.
32-bit color depth
Server Configuration | Model Number | Color Depth | Capacity
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | DL585 | 16 bpp | 150 users
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | DL585 | 32 bpp | 140 users
Table 5 - Server capacity by desktop color depth for Knowledge Worker scenario
Choosing 32-bit color depth for Remote Desktop Connection sessions instead of 16-bit results in a slight
increase in CPU usage. For the Knowledge Worker scenario, this results in a reduced server capacity
from 150 users to 140 users. There is also an increase in network bandwidth usage (8% in this case).
How much of an impact there will be depends on the scenario as well. A graphics-rich scenario will show
a greater impact of choosing 32-bit color depth because there will be more graphics data to process and
send over the network.
Windows printer redirection (XPS)
Windows printer redirection enables the redirection of a printer installed on the client computer to the
RD Session Host server session. Through this feature, print commands issued to server applications get
redirected to the client printer and the actual printing happens on the client side. To assess the effect of
enabling printer redirection on RD Session Host server scalability, the Knowledge Worker script was run
in a configuration where an HP LaserJet 6P printer was installed on the NULL port on each client
computer, and the clients were configured to redirect to the local printer when connecting to the server.
The script prints twice during the 30-minute work cycle: the first print job is a 19-KB Word document
and the second print job is a 16-KB Excel spreadsheet.
Test results show that network bandwidth usage is not significantly affected by printer redirection, and
the impact on other key system parameters (memory usage, CPU usage) is negligible. There is no impact
in terms of server capacity in the Knowledge Worker scenario.
Compression algorithm for RDP data
It is possible to specify which Remote Desktop Protocol (RDP) compression algorithm to use for Remote
Desktop Services connections by applying the Group Policy setting Set compression algorithm for RDP
data. By default, servers use an RDP compression algorithm that is based on the server's hardware
configuration. In the case of the server computers used for this testing, this algorithm is "Optimize to
use less memory." Testing was performed by using the default compression policy as well as setting the
policy to "Optimize to use less network bandwidth." This option uses less network bandwidth, but is
more memory-intensive. The test results show that there is no impact on server capacity by setting the
compression policy to "Optimize to use less network bandwidth." The impact on memory usage is
negligible, and there is an overall reduction in bandwidth usage. Additionally, the server is slightly more
responsive in this case after capacity is reached compared to the default compression policy.
Desktop Experience pack
The Desktop Experience feature enables you to install a variety of Windows 7 features on your server
(such as Desktop Themes, Windows SideShow, Windows Defender). For the purpose of this test, the
Desktop Composition feature was installed on the server, which enables the Themes service and applies
the Aero theme for all users. There were two different tests performed with the Desktop Experience
pack installed. In the first test, Desktop Composition remoting was disabled from the client side. In the
second test, Desktop Composition remoting was enabled. The results are displayed in Table 6.
Server Configuration | Desktop Experience Pack | Desktop Composition Remoting | Capacity
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Not installed | Disabled | 140 users
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Installed | Disabled | 140 users
4 x AMD Opteron dual-core CPUs, 2.4 GHz, 1024 KB L2 cache, 64 GB memory | Installed | Enabled | 120 users
Table 6 - Server capacity at 32 bpp color depth for Knowledge Worker scenario
In the case of the Desktop Experience pack when Desktop Composition remoting is disabled, the server
capacity remains unchanged. There is around 5% increase in memory usage, which can result in a
reduced server capacity on memory-limited systems.
In the case when Desktop Composition remoting is enabled, the server capacity drops from 140 users to
120 users due to an increase in CPU usage. There is around a 68% increase in network bandwidth
usage and a 5% increase in memory usage. When Desktop Composition remoting is enabled, there is a
significant increase in CPU and memory usage on the client side as well. A client computer running 12
instances of the Remote Desktop Connection client (mstsc.exe) showed a 100% increase in memory
usage as well as 70% increase in CPU usage when Desktop Composition remoting is enabled.
RemoteApp programs
Remote Desktop Web Access enables users to access RemoteApp programs. RemoteApp programs are
applications that are accessed remotely through Remote Desktop Services and appear as if they are
running on the end user's local computer. A RemoteApp program scenario was created so that we can
compare server capacity when using RemoteApp programs to the Remote Desktop scenario. The
RemoteApp programs scenario is mostly the same as the Knowledge Worker scenario. The difference is
in the way the connection is made to the server and how the applications are launched. The comparison
between Remote Desktop and RemoteApp programs is shown in Table 7.
Server Configuration                                                    | Model Number | Scenario                            | Capacity
4 x AMD Opteron Dual-core CPUs, 2.4 GHz, 1024 KB L2 Cache, 64 GB Memory | DL585        | Knowledge Worker                    | 150 users
4 x AMD Opteron Dual-core CPUs, 2.4 GHz, 1024 KB L2 Cache, 64 GB Memory | DL585        | Knowledge Worker RemoteApp programs | 135 users
Table 7 - Server capacity comparison of RemoteApp programs and Remote Desktop
Test results show higher CPU usage in the RemoteApp programs scenario, which results in 10% fewer
supported users compared to the Remote Desktop scenario. There is no significant difference in other
key system parameters (memory usage, network bandwidth).
Hyper-V

Hyper-V™, the Microsoft hypervisor-based server virtualization technology, enables you to consolidate
multiple server roles as separate virtual machines (VMs) running on a single physical computer, and also
run multiple different operating systems in parallel on a single server. Hyper-V tests were performed for
this white paper to compare server capacity between an RD Session Host server running natively and an
RD Session Host server hosted as a virtual machine under Hyper-V. For these tests, Windows Server
2008 R2 was installed as the Hyper-V host server.
The test server used for this evaluation had a single Quad-core AMD CPU that supports Rapid
Virtualization Indexing (RVI). This feature provides hardware acceleration for virtualization memory
management tasks and is leveraged by the new Second Level Address Translation (SLAT) feature
available in Hyper-V in Windows Server 2008 R2.
When running inside a virtual machine, Windows Server 2008 R2 was also installed with the RD Session
Host role service enabled. The VM was the only VM configured on that host, with 30 GB of the overall 32
GB of available RAM allocated to it. In addition, it was configured with the maximum of 4 virtual
processors so that it can utilize all 4 CPU cores available. The Remote Desktop clients connected to the
VM for these tests.
There were two Hyper-V tests performed. One was with the default configuration that utilizes hardware
acceleration provided by RVI (a new feature for Hyper-V available in Windows Server 2008 R2), and the
other simulated a processor with no hardware assist by disabling the hardware assist support. The
results are shown in Table 8.
Server Configuration                                              | Scenario | SLAT     | Capacity
AMD Opteron Quad-core CPU, 2.7 GHz, 512 KB L2 Cache, 30 GB Memory | Native   | N/A      | 180 users
AMD Opteron Quad-core CPU, 2.7 GHz, 512 KB L2 Cache, 30 GB Memory | Hyper-V  | Enabled  | 150 users
AMD Opteron Quad-core CPU, 2.7 GHz, 512 KB L2 Cache, 30 GB Memory | Hyper-V  | Disabled | 70 users
Table 8 - Server capacity for Knowledge Worker v2 scenario without PowerPoint
In the case of SLAT-capable hardware, the Hyper-V scenario supports 17% fewer users than
running natively without Hyper-V. When SLAT is disabled, server capacity is reduced by 53%
compared to the SLAT-enabled scenario. SLAT therefore makes a very significant difference when
running the RD Session Host role service under Hyper-V, and processors that support this feature
(Rapid Virtualization Indexing (RVI) for AMD processors and Extended Page Tables (EPT) for Intel
processors) are strongly recommended.
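The percentages above follow directly from the user counts in Table 8. As a quick sanity check, the arithmetic can be sketched as follows (only the capacities reported in the table are used):

```python
# Capacities from Table 8 (Knowledge Worker v2 scenario without PowerPoint).
native = 180        # RD Session Host running natively
hyperv_slat = 150   # under Hyper-V with SLAT enabled
hyperv_noslat = 70  # under Hyper-V with SLAT disabled

# Reduction from native to Hyper-V with SLAT (about 17%).
slat_penalty = (native - hyperv_slat) / native * 100

# Reduction from SLAT-enabled to SLAT-disabled (about 53%).
no_slat_penalty = (hyperv_slat - hyperv_noslat) / hyperv_slat * 100

print(f"Hyper-V with SLAT: {slat_penalty:.0f}% fewer users than native")
print(f"SLAT disabled: {no_slat_penalty:.0f}% fewer users than SLAT enabled")
```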
Impact of Windows System Resource Manager (WSRM)

Windows System Resource Manager (WSRM) is an administrative tool that can control how CPU and
memory resources are allocated. The WSRM management policy used for testing was "Equal per User,"
which makes sure that each user's set of processes gets an equal CPU share. In practice, this means
that one user's processes should not be able to starve other users of CPU.
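"Equal per User" is in essence a fair-share policy: when total CPU demand exceeds capacity, every user is capped at an equal share, and headroom left by lighter users is redistributed. The sketch below illustrates that allocation logic; it is a simplified model of the policy's intent, not WSRM's actual implementation:

```python
def equal_per_user(demands, capacity=100.0):
    """Max-min fair allocation: each user gets at most an equal share of
    what remains; capacity unused by light users is redistributed."""
    alloc = {}
    remaining = dict(demands)
    cap = capacity
    while remaining:
        share = cap / len(remaining)
        # Users demanding no more than the current share are fully satisfied.
        satisfied = {u: d for u, d in remaining.items() if d <= share}
        if not satisfied:
            # Everyone wants more than the fair share: cap them all equally.
            for u in remaining:
                alloc[u] = share
            return alloc
        for u, d in satisfied.items():
            alloc[u] = d
            cap -= d
            del remaining[u]
    return alloc

# One CPU-heavy user cannot starve the three lighter users.
print(equal_per_user({"heavy": 90, "a": 10, "b": 20, "c": 30}))
```

With demands of 90, 10, 20, and 30 against 100% of CPU, the heavy user is capped at 40% while the other three users get exactly what they asked for.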
The test results show that the WSRM "Equal per User" policy does not have a significant impact on
server capacity: the Knowledge Worker scenario was supported at 150 users both with and without
WSRM. However, the WSRM policy has an important effect on individual response times in the
Knowledge Worker scenario. Note that the most CPU-intensive part of the scenario is the work done
in the PowerPoint application. In the baseline case without WSRM, as CPU usage reaches 100%, most
user action response times deteriorate rapidly. In the WSRM case, the results show that actions
performed in PowerPoint become unresponsive slightly earlier than in the baseline case and at a
steeper rate, while response times for all other actions deteriorate at a noticeably gentler pace.
In other words, the system does not allow processes that consume more CPU to starve other users'
processes, and thus protects the system overall from users that cause high CPU usage.
Comparison with Windows Server 2008
Server Configuration                                                    | Model Number | OS                     | Capacity
4 x AMD Opteron Dual-core CPUs, 2.4 GHz, 1024 KB L2 Cache, 64 GB Memory | DL585        | Windows Server 2008    | 160 users
4 x AMD Opteron Dual-core CPUs, 2.4 GHz, 1024 KB L2 Cache, 64 GB Memory | DL585        | Windows Server 2008 R2 | 150 users
Table 9 - Server capacity by operating system for Knowledge Worker scenario
Table 9 shows the server capacity comparison between Windows Server 2008 and Windows Server 2008
R2 for the Knowledge Worker scenario. Memory usage on the two operating systems is very similar.
Windows Server 2008 R2 uses slightly more CPU than Windows Server 2008, resulting in a slightly
reduced server capacity.
Conclusions

Capacity planning for Remote Desktop deployments is subject to many variables, and there are no good
off-the-shelf answers. Depending on usage scenario and hardware configuration, the variance in
capacity can reach two orders of magnitude. If you need a relatively accurate estimate, deploying a
pilot or running a load simulation is likely the only reliable way to get one.
An RD Session Host server can provide good consolidation for certain scenarios if care is taken
when configuring the hardware and software. Supporting 200 users on a dual-socket 2U form factor
server is entirely viable for some of the medium-weight to lighter-weight scenarios.
When configuring an RD Session Host server, give special attention to the following:
Provide more CPU cores to not only increase overall server capacity, but also allow a server to
better absorb temporary peaks in CPU load like logon bursts or variation in load.
Provide the server with at least 8 GB of RAM, typically 16 GB.
Remember that enabling Desktop Composition will have a significant impact on resource usage
and will affect server capacity negatively.
When running RD Session Host servers in a virtualized environment, make sure the processor
supports hardware-assisted memory virtualization (RVI for AMD, EPT for Intel).
Use WSRM in deployments where there are wide swings in CPU usage.
Properly size the server input/output throughput capacity.
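The checklist above can be folded into a rough back-of-envelope estimate. The per-user figures in the sketch below are purely illustrative placeholders, not measurements from this paper; substitute numbers obtained from your own pilot or load simulation:

```python
def estimate_capacity(ram_gb, cores,
                      os_overhead_gb=2.0,      # illustrative OS/base footprint
                      mb_per_user=100.0,       # illustrative per-user working set
                      users_per_core=20):      # illustrative CPU headroom
    """Rough RD Session Host capacity estimate: the supported user count is
    the lower of the memory-limited and CPU-limited figures."""
    mem_limited = int((ram_gb - os_overhead_gb) * 1024 / mb_per_user)
    cpu_limited = cores * users_per_core
    return min(mem_limited, cpu_limited)

# Example: a dual-socket quad-core server with 16 GB of RAM.
print(estimate_capacity(ram_gb=16, cores=8))
```

The min() expresses the point made above: capacity is bounded by whichever resource, memory or CPU, runs out first.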
Appendix A: Test Hardware Details

The following servers were tested for Remote Desktop Services capacity planning data:
HP ProLiant DL 585
o 4 x AMD Opteron 8216 2.4 GHz CPUs (Dual-core)
o 1024 KB x 2 L2 Cache per processor
o 64 GB DDR2 RAM
o 8 x 72 GB 15K RPM SAS drives
o 100/1000 Mbps Intel NIC
HP ProLiant DL 385
o 2 x AMD Opteron 2216 HE 2.4 GHz CPUs (Dual-core)
o 1024 KB x 2 L2 Cache per processor
o 24 GB DDR2 RAM
o 8 x 72 GB 15K RPM SAS drives
o 100/1000 Mbps Intel NIC
Other components of the test laboratory included:
Domain Controller and Test Controller: HP ProLiant DL145
o Dual-core AMD Opteron 280 processor, 2.4 GHz
o 2 GB Memory
o Windows Server 2008 Standard
o This server is the DHCP and DNS server for the domain. It manages the workstations
running Windows 7 Ultimate, including script control, software distribution, and remote
reset of the workstations.
Mail server and Web server: Dell PowerEdge 1950
o 2 x Intel(R) Xeon(TM) Dual Core CPU 3.0 GHz
o 2 GB Memory
o Windows Server 2008 Standard
o Exchange Server 2007
Workstations: HP dx5150
o AMD Athlon 64 3000+ processor, 1.8 GHz
o 1 GB Memory
o Windows 7 Ultimate
Appendix B: Testing Tools

Microsoft developed the Remote Desktop Load Simulation Tools to perform scalability testing. Remote
Desktop Load Simulation Tools is a suite of tools that assists organizations with capacity planning for
Windows Server 2008 R2 Remote Desktop Services. These tools allow organizations to easily place and
manage simulated loads on a server. This in turn can allow an organization to determine whether or not
its environment is able to handle the load that the organization expects to place on it. If you’d like to
conduct a capacity planning exercise for your specific deployment, you can download the Remote
Desktop Load Simulation Tools from the Microsoft Download Center
(http://go.microsoft.com/fwlink/?LinkId=178956).
The automation tools included in the suite are described below.
Test control infrastructure

Test Controller - RDLoadSimulationController.exe
The RDLoadSimulationController tool is the central control point for the load simulation testing. It is
typically installed on the test controller computer. RDLoadSimulationController controls all test
parameters and defines the progression of the simulated user load. It also controls all custom
actions that are executed at any point during the test process. It communicates with
RDLoadSimulationClients and RDLoadSimulationServerAgent to synchronize and drive the client-
server remote desktop automation. It commands the RDLoadSimulationClients to run scripts that
load the RD Session Host server at operator-specified intervals.
Client Agent - RDLoadSimulationClient.exe

The RDLoadSimulationClient tool controls the client side of the load simulation testing.
RDLoadSimulationClient is typically installed on the test client computers. RDLoadSimulationClient
receives commands from RDLoadSimulationController to run scripts that load the RD Session Host
server at operator-specified intervals. It executes custom commands received from the
RDLoadSimulationController and also sends the status of the executing scripts to the
RDLoadSimulationController. RDLoadSimulationClient also performs desktop management on the
test client computers. It creates a new desktop for each script that it launches and provides the
means to navigate between all desktops.
Server Agent - RDLoadSimulationServerAgent.exe

The RDLoadSimulationServerAgent tool runs on the target Remote Desktop Session Host server. It
runs custom commands that are sent to it by the RDLoadSimulationController. It is also used by
RDLoadSimulationController for test synchronization.
SwitchDesktop.exe

The SwitchDesktop tool runs on the test client computers. It runs inside each new desktop that is
created on the client. Its only function is to provide a way to switch back to the default desktop
where the RDLoadSimulationClient is running.
Scenario execution tools

Script automation tool - RemoteUIControl.dll

RemoteUIControl.dll is a COM-based tool that provides functionality for driving the client-side load
simulation. It exposes functionality for creating RDP connections to the server, as well as sending
keyboard input to the Remote Desktop Services session. It synchronizes executions based on
drawing events in the applications that are running inside the Remote Desktop Services session.
RUIDCOM.exe

RUIDCOM is a DCOM wrapper around RemoteUIControl.dll. This tool exposes all the
functionality of RemoteUIControl.dll. Test scripts use RUIDCOM instead of directly using
RemoteUIControl.dll because it provides some extra functionality. RUIDCOM communicates with
the RDLoadSimulationClient to report the status of a simulated user.
TSAccSessionAgent.exe

TSAccSessionAgent runs on the target RD Session Host server. One instance of TSAccSessionAgent
runs inside every Remote Desktop Services session that is created for a simulated test user.
RemoteUIControl.dll on the client side communicates with TSAccSessionAgent to synchronize user
input with drawing events in the applications that are running inside the Remote Desktop Services
session.
Appendix C: Test Scenario Definitions and Flow Chart
Knowledge Worker v2

Typing Speed = 35 words per minute
Definition: the Knowledge Worker scenario includes creating and saving Word documents, printing Excel
spreadsheets, communicating by e-mail in Outlook, adding slides to PowerPoint presentations, running
slide shows, and browsing Web pages in Internet Explorer. The following workflow details the scenario.
Connect User “smcxxx”
Start (Outlook) - Send new e-mail messages
    Send a new appointment invitation
    Send a new e-mail message
    Minimize Outlook
Start (Word) - Start and exit Word
Start (Microsoft Excel) - Start and exit Excel
loop (forever)
    Start (Word) - Type a page of text and print
        Open a Word document
        Type a page of text
        Modify and format text
        Check spelling
        Print
        Save
        Exit Word
    Start (Microsoft Excel) - Load Excel spreadsheet, modify, and print it
        Load Excel spreadsheet
        Modify data and format
        Print
        Save
        Exit Excel
    Start (PowerPoint) - Load presentation and run slide show
        Load a PowerPoint presentation
        Navigate
        Add a new slide
        Format text
        Run slide show
        Save file
        Exit PowerPoint
    Switch To Process (Outlook) - Send e-mail, read message, and respond
        Send e-mail to other users
        Read e-mail and respond
        Minimize Outlook
    Start (Internet Explorer) - Browse Web pages
        Loop (2)
            URL http://tsexchange/tsperf/WindowsServer.htm
            URL http://tsexchange/tsperf/Office.htm
            URL http://tsexchange/tsperf/MSNMoney.htm
        End of loop
        Exit Internet Explorer
End of loop
Knowledge Worker v1

Typing Speed = 35 words per minute
Definition: a knowledge worker gathers, adds value to, and communicates information in a decision
support process. Cost of downtime is variable but highly visible. These workers are driven by
projects and ad hoc needs, make their own decisions on what to work on, and decide how to
accomplish the task. Typical tasks include marketing, project management, sales, desktop
publishing, decision support, data mining, financial analysis, executive and supervisory
management, design, and authoring.
Connect User “smcxxx”
Start (Microsoft Excel) - Load massive Excel spreadsheet and print it
    Open File c:\documents and settings\smcxxx\Carolinas Workbook.xls
    Print
    Close document
    Minimize Excel
Start (Outlook) - Send a new, short e-mail message (e-mail2)
    Minimize Outlook
Start (Internet Explorer)
    URL http://tsexchange/tsperf/Functions_JScript.asp
    Minimize Internet Explorer
Start (Word) - Type a page of text (Document2)
    Save
    Print
    Close document
    Minimize Word
Switch To (Excel)
    Create a spreadsheet of sales vs. months (spreadsheet)
    Create graph (graph)
    Save
    Close document
    Minimize Excel
Switch To Process (Outlook) - Read e-mail message and respond (Reply2)
    Minimize Outlook
Now, toggle between applications in a loop:
loop (forever)
    Switch To Process (Excel)
        Open File c:\documents and settings\smcxxx\Carolinas Workbook.xls
        Print
        Close document
        Minimize Excel
    Switch To Process (Outlook) - E-mail message (e-mail2)
        Minimize Outlook
    Switch To Process (Internet Explorer)
        Loop (2)
            URL http://tsexchange/tsperf/Functions_JScript.asp
            URL http://tsexchange/tsperf/Conditional_VBScript.asp
            URL http://tsexchange/tsperf/Conditional_JScript.asp
            URL http://tsexchange/tsperf/Arrays_VBScript.asp
            URL http://tsexchange/tsperf/Arrays_JScript.asp
        End of loop
        Minimize Internet Explorer
    Switch To Process (Word) - Type a page of text (Document2)
        Save
        Print
        Close document
        Minimize Word
    Switch To Process (Excel)
        Create a spreadsheet of sales vs. months (spreadsheet)
        Create graph (graph)
        Save
        Close document
        Minimize Excel
    Switch To Process (Outlook) - Read message and respond (reply2)
        Minimize Outlook
End of loop
Log off
Appendix D: Remote Desktop Session Host Settings

Operating system installation
All drives formatted by using NTFS
Roles
o Remote Desktop Session Host role service installed
Networking left at default with typical network settings
Server joined as a member to a Windows Server 2008 domain
Page file initial and maximum size set to 56 GB
System and user profiles data resides on a single logical RAID 5 drive
Page files reside on a single logical RAID 5 drive that is separate from the one used for system and
user profiles data
RDP protocol client settings
Disable all redirections (drive, Windows printer, Clipboard, LPT, COM, audio and video playback,
audio recording, Plug and Play devices)
Color depth is set to 16 bit for Remote Desktop Services connections
Office 2007 settings
Office 2007 installed, with the following features enabled through Office customization:
o Microsoft Office Excel
o Microsoft Office Outlook
o Microsoft Office PowerPoint
o Microsoft Office Word
o Office Shared Features
o Office Tools
Outlook settings
Mailbox on Exchange server
E-mail options
o AutoSave of messages disabled
o Automatic name checking disabled
o Do Not Display New Mail Alert for users enabled
o Suggest names while completing To, Cc, and Bcc fields disabled
o Return e-mail alias if it exactly matches the provided e-mail address when searching OAB
enabled
o AutoArchive disabled
Word Settings
o Background grammar-checking disabled
o Check grammar with spelling disabled
o Background saves disabled
o Save AutoRecover information disabled
o Always show full menus enabled
o Microsoft Office Online disabled
o Customer Experience Improvement Program disabled
o Automatically receive small updates to improve reliability disabled
Printer settings
HP Color LaserJet 9500 PCL 6 created to print to NUL port
User profiles
Configuration script executed to pre-create cached profiles, copy template files for applications,
configure e-mail accounts, and set home page on Internet Explorer
Roaming profiles used for all users
Performance logger
Performance counters are logged on to the RD Session Host server itself
General settings
o Disable screen saver for all users through Group Policy
o Disable Windows Firewall
o Enable Remote Desktop Connections
o Set power settings to High Performance
o Delete all Office and XPS printers installed at setup