Windows Internal Ch. 9 Memory Management Kent Huang


Page Fault Handling

• Reason for fault: accessing a page that isn’t resident in memory but is on disk in a page file or a mapped file
  – Result: allocate a physical page, and read the desired page from disk into the relevant working set
• Reason for fault: accessing a page that is on the standby or modified list
  – Result: transition the page to the relevant process, session, or system working set
• Reason for fault: accessing a page that isn’t committed (for example, reserved address space or address space that isn’t allocated)
  – Result: access violation
• Reason for fault: accessing a page from user mode that can be accessed only in kernel mode
  – Result: access violation
• Reason for fault: writing to a page that is read-only
  – Result: access violation
• Reason for fault: accessing a demand-zero page
  – Result: add a zero-filled page to the relevant working set
• Reason for fault: writing to a guard page
  – Result: guard-page violation (if it is a reference to a user-mode stack, perform automatic stack expansion)
• Reason for fault: writing to a copy-on-write page
  – Result: make a process-private (or session-private) copy of the page, and replace the original in the process, session, or system working set
• Reason for fault: writing to a page that is valid but hasn’t been written to the current backing store copy
  – Result: set the Dirty bit in the PTE
• Reason for fault: executing code in a page that is marked as no execute
  – Result: access violation (supported only on hardware platforms that support no-execute protection)
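The reason/result pairs above amount to a dispatch table. A minimal Python sketch of that idea follows; the enum names and result strings are illustrative labels chosen here, not identifiers used by the Windows memory manager.

```python
from enum import Enum, auto


class Fault(Enum):
    """Illustrative fault reasons; the names are assumptions, not kernel identifiers."""
    NOT_RESIDENT = auto()      # on disk in a page file or mapped file
    TRANSITION = auto()        # on the standby or modified list
    NOT_COMMITTED = auto()     # reserved or unallocated address space
    KERNEL_ONLY = auto()       # user-mode access to a kernel-only page
    WRITE_READ_ONLY = auto()   # write to a read-only page
    DEMAND_ZERO = auto()
    GUARD_PAGE = auto()
    COPY_ON_WRITE = auto()
    FIRST_WRITE = auto()       # valid page not yet written to its backing store
    NO_EXECUTE = auto()


# Each reason maps to the result listed in the table.
RESOLUTION = {
    Fault.NOT_RESIDENT: "allocate page and read from disk",
    Fault.TRANSITION: "transition page into working set",
    Fault.NOT_COMMITTED: "access violation",
    Fault.KERNEL_ONLY: "access violation",
    Fault.WRITE_READ_ONLY: "access violation",
    Fault.DEMAND_ZERO: "add zero-filled page to working set",
    Fault.GUARD_PAGE: "guard-page violation",
    Fault.COPY_ON_WRITE: "make private copy of page",
    Fault.FIRST_WRITE: "set dirty bit in PTE",
    Fault.NO_EXECUTE: "access violation",
}


def resolve(fault: Fault) -> str:
    """Return the result the table associates with a fault reason."""
    return RESOLUTION[fault]
```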

9.7 Page Fault

• 9.7.1 Invalid PTEs
• 9.7.2 Prototype PTEs
• 9.7.3 In-Paging I/O
• 9.7.4 Collided Page Fault
• 9.7.5 Clustered Page Fault
• 9.7.6 Page Files

Invalid PTEs

• Page file
  – The page resides within a paging file
• Demand zero
  – The desired page must be satisfied with a page of zeros
• Transition
  – The page is in memory on the standby, modified, or modified-no-write list, or not on any list
• Unknown
  – The PTE is zero, or the page table doesn’t yet exist

Prototype PTEs

• If a page can be shared between two processes, the memory manager uses a software structure called prototype page table entries (prototype PTEs) to map it

In-Paging I/O

• In-paging I/O occurs when a read operation must be issued to a file (paging or mapped) to satisfy a page fault

Collided Page Faults

• A collided page fault occurs when another thread in the same process, or in a different process, faults on a page that is currently being in-paged
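One way to picture a collided fault is a second thread waiting for an in-page read that is already in flight instead of issuing its own. The sketch below is a user-mode simulation under that assumption; `Pager`, its fields, and the simulated disk delay are all hypothetical and do not reflect the kernel's actual data structures.

```python
import threading
import time


class Pager:
    """Toy pager: at most one simulated disk read is issued per page."""

    def __init__(self):
        self._lock = threading.Lock()
        self._resident = {}    # page number -> contents
        self._in_flight = {}   # page number -> Event set when the read completes
        self.disk_reads = 0

    def _read_from_disk(self, page):
        time.sleep(0.05)       # stand-in for the in-paging I/O
        return f"data-{page}"

    def fault(self, page):
        with self._lock:
            if page in self._resident:
                return self._resident[page]
            event = self._in_flight.get(page)
            owner = event is None
            if owner:
                # First faulter: record that the page is being in-paged.
                event = threading.Event()
                self._in_flight[page] = event
                self.disk_reads += 1
        if owner:
            data = self._read_from_disk(page)
            with self._lock:
                self._resident[page] = data
                del self._in_flight[page]
            event.set()
            return data
        # Collided fault: wait for the read already in progress.
        event.wait()
        with self._lock:
            return self._resident[page]
```

If four threads call `fault(7)` at the same time, only the first issues a read; the other three block on the event and then pick up the resident page.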

Clustered Page Faults

• Memory manager prefetches large clusters of pages to satisfy page faults and populate the system cache

Page Files

• Page files are used to store modified pages that are still in use by some process but have had to be written to disk

• Windows supports up to 16 page files
  – x86: 4 GB per page file
  – x64: 16 TB per page file
• Page files contain parts of process and kernel virtual memory; for security reasons, the system can be configured to clear the page file at system shutdown
  – HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\ClearPageFileAtShutdown = 0x1

9.8.1 User Stacks

• When a thread is created, the memory manager automatically reserves a predetermined amount of virtual memory for its user stack, which by default is 1 MB
  – Can be configured in calls to “CreateThread” and “CreateRemoteThread”
  – Or at build time with the linker option “/STACK:reserve”
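As a rough analogue of requesting a non-default stack size at thread creation, Python exposes `threading.stack_size`, which sets the stack size for threads created afterwards. This is the Python runtime's interface, not the Win32 API itself, and the helper name `run_with_stack` is made up for this sketch.

```python
import threading


def run_with_stack(func, stack_bytes):
    """Run func() on a new thread whose stack size is stack_bytes."""
    # stack_size() returns the previous setting (0 means the platform default).
    previous = threading.stack_size(stack_bytes)
    result = []
    try:
        worker = threading.Thread(target=lambda: result.append(func()))
        worker.start()
        worker.join()
    finally:
        # Restore the earlier setting so later threads are unaffected.
        threading.stack_size(previous)
    return result[0]
```

For example, `run_with_stack(lambda: 42, 2 * 1024 * 1024)` runs the callable on a thread with a 2 MB stack.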

9.8.2 Kernel Stacks

• The kernel stack is significantly smaller: 12 KB on x86 and 16 KB on x64
• Special case: “KeExpandKernelStackAndCallout”
  – Graphics system calls handled by Win32k.sys, and their subsequent callbacks into user mode, can cause recursive re-entries into the kernel on the same kernel stack

9.8.3 DPC Stack

• Windows keeps a per-processor DPC stack available for use by the system whenever DPCs are executing, an approach that isolates the DPC code from the current thread’s kernel stack

9.9 Virtual Address Descriptors

• 9.9.1 Process VADs
  – VADs are organized into a self-balancing AVL tree
• 9.9.2 Rotate VADs
  – Graphics drivers usually need to copy data from user mode to other kinds of system memory (AGP, GPU)
  – Rotate VADs let the driver access memory on the graphics device directly

Process VADs

Rotate VADs

9.10 NUMA

• Used in large server systems that have many physical CPUs
• The memory manager chooses the best node from which to allocate memory

Section Objects

• The section object, which the Windows subsystem calls a “file mapping object”, represents a block of memory that two or more processes can share

• Windows ensures that any process that accesses (reads or writes) a file will always see the same, consistent data

• Special case (duplicated pages):
  – When an image file has been accessed as a data file and then run as an executable image

Driver Verifier

• Initializing Driver Verifier
  – 1. Check the registry when the system boots
  – 2. Determine which drivers need to be verified
  – 3. Load each such driver through the function VfLoadDriver, provided by Driver Verifier
  – 4. Replace the driver’s references to kernel functions with Driver Verifier’s equivalents

Special Pool

• Special pool places an invalid page before and after each pool buffer
• Accessing an invalid page causes a BSOD, which is used to detect overruns
• Also checks the IRQL level
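The guard-page idea can be imitated in user mode by bracketing a buffer with regions that raise on any access. In this sketch the class name, the 4 KB page size, and the choice to align the buffer with the end of its page (so the first byte past it lands in the trailing guard page) are assumptions made for illustration.

```python
PAGE = 4096  # assumed page size for the simulation


class SpecialPoolBuffer:
    """Layout: [guard page][data page][guard page]; the buffer ends the data page."""

    def __init__(self, size):
        if not 0 < size <= PAGE:
            raise ValueError("size must fit in one page")
        self.size = size
        self._mem = bytearray(3 * PAGE)
        # Align the buffer with the end of the data page to catch overruns at once.
        self._start = 2 * PAGE - size

    def _address(self, index):
        addr = self._start + index
        if addr < PAGE or addr >= 2 * PAGE:
            # Stand-in for the bug check raised on touching an invalid page.
            raise MemoryError("guard page touched")
        return addr

    def write(self, index, value):
        self._mem[self._address(index)] = value

    def read(self, index):
        return self._mem[self._address(index)]
```

With this alignment an overrun by a single byte is caught immediately, while a small underrun falls into the unused slack of the data page and is only caught once it reaches the leading guard page.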

Pool Tracking

• If pool tracking is enabled, the memory manager checks at driver unload time whether the driver freed all the memory allocations it made

• Used to detect memory leaks in the kernel
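A minimal sketch of the bookkeeping behind pool tracking: every allocation is recorded against its driver, and whatever is still recorded at unload time is a leak. The class, method names, and fake addresses are hypothetical.

```python
class PoolTracker:
    """Records live allocations per driver and reports leaks at unload."""

    def __init__(self):
        self._live = {}          # address -> (driver name, size)
        self._next_address = 0x1000

    def allocate(self, driver, size):
        address = self._next_address
        self._next_address += size
        self._live[address] = (driver, size)
        return address

    def free(self, address):
        del self._live[address]

    def check_unload(self, driver):
        """Return the addresses the driver never freed, in ascending order."""
        return sorted(a for a, (owner, _) in self._live.items() if owner == driver)
```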

Force IRQL Checking

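• Pageable memory cannot be accessed at DPC/dispatch level or a higher IRQL, because a page fault cannot be serviced there
• This option forces all pageable memory to be paged out when the IRQL is raised, so that such accesses fail immediately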

Low Resources Simulation

• Randomly fails memory allocations that verified device drivers perform
• The failure rate (default 6%) and delay time can be set
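Random allocation failure can be sketched as an allocator wrapper that returns `None` with a configurable probability. The 6% default comes from the slide; the class name and the `seed` parameter are assumptions, and the delay setting is not modeled.

```python
import random


class FailingAllocator:
    """Allocator that randomly fails a fraction of requests."""

    def __init__(self, fail_rate=0.06, seed=None):
        self.fail_rate = fail_rate
        self._rng = random.Random(seed)

    def allocate(self, size):
        # random() is in [0, 1), so fail_rate=0 never fails and 1 always fails.
        if self._rng.random() < self.fail_rate:
            return None
        return bytearray(size)
```

Code under test then has to handle a `None` result on every allocation, which is the error path this option is meant to exercise.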

Miscellaneous Checks

• Active work items in freed memory
• Active resources in freed memory
• Active look-aside lists in freed memory
• Etc.

Page Frame Number Database

• How does Windows manage physical memory?
  – Working set
    • The resident pages owned by a process or the system
  – PFN
    • The state of each page in physical memory

Page List – Get a zero page

• When a zero-initialized page is needed (for example, for a user-mode committed private page)
  1. A demand-zero page fault is triggered
  2. Search the zero page list
  3. Search the free page list
  4. Search the standby list

Page List – Get a page

• A page fault is triggered
  1. Search the free page list
  2. Search the zero page list
  3. Search the standby page list
  4. Remove the invalid flag from the PTE in the page table
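The two search orders can be sketched with three queues. The class and method names are hypothetical, and the `needs_zeroing` flag reflects an assumption made here: a page taken from the free or standby list must be zeroed before it can satisfy a demand-zero fault.

```python
from collections import deque


class PageLists:
    """Toy zeroed/free/standby lists holding page frame numbers."""

    def __init__(self, zeroed=(), free=(), standby=()):
        self.lists = {
            "zeroed": deque(zeroed),
            "free": deque(free),
            "standby": deque(standby),
        }

    def _take(self, order):
        # Return the first page found, searching the lists in the given order.
        for name in order:
            if self.lists[name]:
                return name, self.lists[name].popleft()
        raise MemoryError("no page available")

    def get_zero_page(self):
        """Demand-zero fault: zeroed list first, then free, then standby."""
        name, pfn = self._take(("zeroed", "free", "standby"))
        needs_zeroing = name != "zeroed"
        return pfn, needs_zeroing

    def get_page(self):
        """Other faults: free list first, then zeroed, then standby."""
        _, pfn = self._take(("free", "zeroed", "standby"))
        return pfn
```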

Page List – Remove page

• To remove a page from a working set
  1. Check whether the page has been modified
  2. If not, move the page to the standby list
  3. If so, move the page to the modified list
  4. All private pages are moved to the free list when the process exits

Page Priority

• The page priority is a number in the range 0 to 7.

• The standby list is divided into 8 sublists, one per priority.

• Each thread and process in the system is also assigned a page priority.
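A sketch of a standby list split into eight sublists follows. It assumes, beyond what the slide states, that pages are repurposed from the lowest-priority sublist first; the class and method names are hypothetical.

```python
from collections import deque


class PrioritizedStandby:
    """Standby list kept as 8 FIFO sublists, indexed by page priority 0..7."""

    def __init__(self):
        self._sublists = [deque() for _ in range(8)]

    def insert(self, pfn, priority):
        if not 0 <= priority <= 7:
            raise ValueError("page priority must be in the range 0 to 7")
        self._sublists[priority].append(pfn)

    def repurpose(self):
        """Take the oldest page from the lowest non-empty priority, or None."""
        for sublist in self._sublists:
            if sublist:
                return sublist.popleft()
        return None
```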

Modified Page Writer

• The system uses 2 threads to write pages to disk and move those pages back to the standby lists based on their priority
  – MiModifiedPageWriter
  – MiMappedPageWriter
• Two threads are used to prevent deadlock

MmMappedPageWriterEvent

• Routines (for example, MmWorkingSetManager) signal this event when the modified page list has more than 800 pages.

MmModifiedPageWriterGate

• The modified page writer waits on this object until one of the following occurs
  – The total size of the zeroed and free page lists has dropped below 20,000 pages
  – A request to flush all pages has been received
  – The number of available pages has dropped below 262,144 pages, or below 256 pages during a page list operation

PFN Data Structures

Physical Memory Limits (Win7)

Windows Client Memory Limits

• A 32-bit OS can address more than 4 GB of RAM by using PAE addressing modes
• However, some third-party drivers are not designed to handle more than 4 GB of RAM on 32-bit Windows
• BCD options can be used to enable RAM above 4 GB

32-Bit Client Effective Memory Limits

• The effective limit is actually lower and dependent on the system’s chipset and connected devices.

• Physical address map includes not only RAM but device memory

Working Sets

• How Windows keeps track of physical memory
  – The subset of virtual pages resident in physical memory is called a “working set”
• Process working sets
  – For a single process
• System working sets
  – For the system (e.g., Ntoskrnl.exe and drivers)
• Session working sets
  – For each session

Demand Paging

• The memory manager uses a demand-paging algorithm with clustering to load pages into memory
• On a page fault:
  – Load several pages around the faulted page into memory
• If the fault is in an executable file:
  – Load 3 pages
• Otherwise:
  – Load 7 pages
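The cluster sizes above can be turned into a small helper that lists which pages one fault would bring in. Two points are assumptions of this sketch rather than statements from the slide: the cluster is taken to start at the faulting page and extend forward, and it is clipped at the last page of the file.

```python
def pages_to_read(faulting_page, is_executable, last_page):
    """Return the page numbers read in for one fault, clipped to the file."""
    cluster_size = 3 if is_executable else 7
    end = min(faulting_page + cluster_size - 1, last_page)
    return list(range(faulting_page, end + 1))
```

For instance, a fault on page 10 of an executable would read pages 10 through 12 under these assumptions.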

Logical Prefetcher

• Why?
  – During a typical system boot or application startup, file access may not be sequential
  – Non-sequential file access can cause many page faults

Logical Prefetcher

• The prefetcher tries to speed up the boot process and application startup by monitoring the data and code accessed, and then using that information at the beginning of a subsequent boot or application startup to read in the code and data
  – Application startup: the first 10 seconds are monitored
  – System boot: 30 to 120 seconds are monitored

Lab Enable Prefetch

Logical Prefetcher

• Prefetcher data is stored in files (*.pf)
• Naming rule
  – Application: [exe_name]-[hash of path].pf
    • Ex. NOTEPAD.EXE-AF43252301.PF
  – Boot: NTOSBOOT-B00DFAAD.PF
• Every 3 days, the order in which files are loaded during boot or application start is recorded
  – Layout.ini
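The naming rule can be illustrated as follows. The hash Windows actually applies to the path is not given in the slides, so this sketch substitutes CRC-32 purely as a placeholder; the resulting hash digits will not match real prefetch file names, and the function name is hypothetical.

```python
import ntpath
import zlib


def prefetch_file_name(image_path):
    """Build '[EXE_NAME]-[hash of path].pf' using a placeholder hash."""
    exe_name = ntpath.basename(image_path).upper()
    # Placeholder only: CRC-32 of the upper-cased path, not the Windows hash.
    digest = zlib.crc32(image_path.upper().encode("utf-16-le"))
    return f"{exe_name}-{digest:08X}.pf"
```

Because the hash covers the full path, two copies of the same executable in different directories would get different prefetch files.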

Placement Policy

• On a page fault, the memory manager needs to determine which page in physical memory should be dropped
  – Least recently used (LRU)
  – First in, first out (FIFO)
• Global
  – Allows a page fault to be satisfied by any page frame
• Local
  – Limits the search for the oldest page to the set of pages already owned by the process
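The difference between LRU and FIFO shows up when counting faults over a reference string with a fixed number of frames. The sketch below is a textbook simulation of the two policies, not the Windows implementation; the function names are chosen here.

```python
from collections import OrderedDict, deque


def count_faults_fifo(references, frames):
    """Count page faults when the oldest-loaded page is evicted first."""
    resident, load_order, faults = set(), deque(), 0
    for page in references:
        if page in resident:
            continue
        faults += 1
        if len(resident) == frames:
            resident.discard(load_order.popleft())
        resident.add(page)
        load_order.append(page)
    return faults


def count_faults_lru(references, frames):
    """Count page faults when the least recently used page is evicted first."""
    resident, faults = OrderedDict(), 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)   # mark as most recently used
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)  # evict the least recently used
        resident[page] = True
    return faults
```

A local policy would run such a count per process over its own frames, while a global policy would run it over all frames in the system.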

Working Sets Management

• Only hard working set limits are effective
• Other working set limit settings may be ignored by the working set manager

Balance Set Manager and Swapper

• Balance set manager (KeBalanceSetManager)
  – Waits on 2 events:
    • A 1-second timer
    • An internal event from the working set manager
  – Every 1 second:
    • Queue a DPC associated with the 1-second timer
    • Call the swapper every 4 seconds
    • Check the look-aside lists
    • Adjust IRP credits
    • Call the working set manager
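The once-per-second schedule can be sketched as a function of the tick count. The action strings are descriptive labels, and counting ticks from 1 so that the swapper runs on every fourth tick is an assumption of this sketch.

```python
def balance_set_tick(tick):
    """Return the actions taken on the given 1-second tick (ticks start at 1)."""
    actions = []
    if tick % 4 == 0:
        # The swapper is only woken on every fourth timer expiry.
        actions.append("wake swapper")
    actions.append("check look-aside lists")
    actions.append("adjust IRP credits")
    actions.append("call working set manager")
    return actions
```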

System Working Set

Memory Notification Events

Proactive Memory Management (Superfetch)

• Standby list management in previous Windows versions had two limitations
  – Prioritization of pages relies only on the recent past behavior of processes and does not anticipate their future memory requirements
  – Data used for prioritization is limited to the list of pages owned by a process at any given point in time

SuperFetch

• Tracer
  – Traces information about pages, sessions, processes, and files
• Trace collector and processor
  – Creates a raw data log from the tracer
• Agents
  – Maintain the history log and group the data
• Scenario manager
  – Manages three plans: hibernation, standby, and fast user switching
• Rebalancer
  – Adjusts the priority of each page and builds the prioritized standby lists

Scenarios

• Hibernation
  – Intelligently decide which pages are saved in the hibernation file, other than the existing working set pages
• Standby
  – Completely remove hard faults after resume
• Fast user switching
  – Keep an accurate priority and understanding of each user’s memory

Tracing and Logging

• Superfetch always keeps a trace running and continuously queries trace data from the system
  – Page usage and access
  – File information, from the FileInfo driver
• User mode: the Superfetch service

Page Priority and Rebalancing

• Superfetch assigns page priority based on an internal score it keeps for each page, part of which is based on frequency-based usage

• Priority can be set from 1 to 6
  – Normal applications: p5
  – Background applications: p1
  – High-importance pages: p6
  – Tracing data and history log: p7

Robust Performance

• Watches for specific file I/O access that might harm system performance by populating the standby lists with unneeded data
  – Ex. Copying a large file can fill the standby list
• When Superfetch detects:
  – Sequential file access
  – Sequential directory access
• It sets the page priority to 2
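A sketch of the idea: if a run of reads looks sequential, the pages get priority 2 instead of the normal-application priority 5. How Superfetch really detects sequential access is not described in the slides, so the fixed-stride test, the `min_run` threshold, and the function name are all assumptions.

```python
def standby_priority(offsets, block_size=4096, min_run=4):
    """Return 2 for a sequential run of read offsets, otherwise 5."""
    is_sequential = len(offsets) >= min_run and all(
        later - earlier == block_size
        for earlier, later in zip(offsets, offsets[1:])
    )
    return 2 if is_sequential else 5
```

Pages read during a large file copy would then land on a low-priority standby sublist and be repurposed before the pages of interactive applications.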

ReadyBoost

• Uses a USB flash disk to store a file cache
• Random disk I/O is faster on a USB flash disk
• Creates the file “ReadyBoost.sfcache” on the USB disk
• Compression ratio of 2:1
• Encrypted with AES

RAM Optimization Software