Cache-Conscious Structure Definition
By Trishul M. Chilimbi, Bob Davidson, and James R. Larus
Presented by Shelley Chen
March 10, 2003
Motivation
Processor-Memory Performance Gap Growing at 53% per year!
Chilimbi and Larus's previous work
  Placed objects with high temporal locality in the same cache block
  Works best with objects < ½ cache block
Current paper proposes techniques for larger data structures
Improving Cache Performance
Two Cache-Conscious Definition Techniques
Structure Splitting
  Split large data structures into two smaller structures
Field Reordering
  Group fields in a structure with high temporal locality into the same cache block
Structure Splitting
Split large structures into two smaller structures
  “hot” structure contains frequently accessed fields
  “cold” structure contains rarely accessed fields
Allows more “hot” structures to fit into the cache
Has been done manually in the past; this paper is the first to automate it
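To make the payoff of splitting concrete, here is a minimal sketch using Python's `ctypes` to measure structure sizes. The field names, the 64-byte block size, and the `cold_ref` index that links the hot part to its cold part are all assumptions made for illustration; they are not taken from the paper.

```python
import ctypes

CACHE_BLOCK_BYTES = 64  # assumed block size, for illustration only


class Unsplit(ctypes.Structure):
    """Hypothetical structure mixing hot and cold fields."""
    _fields_ = [
        ("key", ctypes.c_int32),        # hot: read on every lookup
        ("count", ctypes.c_int32),      # hot: updated on every lookup
        ("name", ctypes.c_char * 40),   # cold: only used for reporting
        ("created", ctypes.c_int64),    # cold
        ("notes", ctypes.c_char * 72),  # cold
    ]


class HotPart(ctypes.Structure):
    """Frequently accessed fields plus a reference to the cold part."""
    _fields_ = [
        ("key", ctypes.c_int32),
        ("count", ctypes.c_int32),
        ("cold_ref", ctypes.c_int64),   # hypothetical index into a cold array
    ]


class ColdPart(ctypes.Structure):
    """Rarely accessed fields, stored separately."""
    _fields_ = [
        ("name", ctypes.c_char * 40),
        ("created", ctypes.c_int64),
        ("notes", ctypes.c_char * 72),
    ]


def instances_per_block(struct_type, block_bytes=CACHE_BLOCK_BYTES):
    """How many whole instances of struct_type fit in one cache block."""
    return block_bytes // ctypes.sizeof(struct_type)


if __name__ == "__main__":
    for struct_type in (Unsplit, HotPart, ColdPart):
        print(struct_type.__name__, ctypes.sizeof(struct_type),
              instances_per_block(struct_type))
```

In this sketch the unsplit structure is larger than a block, so no instance fits in one, while several hot parts share a single block. The cost is the extra indirection through `cold_ref` whenever a cold field is needed.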
Class Splitting Overview
Experimental Setup
Ran compiled programs on Sun Ultraserver E5000
  2 GB of memory
  L1 dcache: 16 KB, direct-mapped, 16-byte blocks
  L2 cache: 1 MB, direct-mapped, 64-byte blocks
Results: Structure Splitting
Reduces L2 miss rates by 10-27%; improves execution time by 10-20%
Field Reordering
The logical ordering of fields in a program is not usually consistent with its data access patterns
  Frequently accessed fields may be declared next to rarely accessed fields, putting them in the same cache block
  This can cause excessive cache misses
Reorder field definitions so that fields with high temporal affinity fall in the same cache block
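The effect of declaration order on cache block placement can be checked with a small sketch. It uses `ctypes` field offsets and assumes 64-byte blocks and a structure that starts on a block boundary; the structures and field names are hypothetical.

```python
import ctypes

CACHE_BLOCK_BYTES = 64  # assumed block size, for illustration only


class Original(ctypes.Structure):
    """Hypothetical layout: two hot fields separated by a large cold field."""
    _fields_ = [
        ("hot_a", ctypes.c_int32),
        ("cold_blob", ctypes.c_char * 100),
        ("hot_b", ctypes.c_int32),
    ]


class Reordered(ctypes.Structure):
    """Same fields, with the two hot fields declared next to each other."""
    _fields_ = [
        ("hot_a", ctypes.c_int32),
        ("hot_b", ctypes.c_int32),
        ("cold_blob", ctypes.c_char * 100),
    ]


def blocks_touched(struct_type, field_names, block_bytes=CACHE_BLOCK_BYTES):
    """Set of block indices covered by the named fields.

    Assumes the structure itself begins at a block-aligned address.
    """
    blocks = set()
    for name in field_names:
        field = getattr(struct_type, name)  # ctypes field descriptor
        first = field.offset // block_bytes
        last = (field.offset + field.size - 1) // block_bytes
        blocks.update(range(first, last + 1))
    return blocks


if __name__ == "__main__":
    hot = ["hot_a", "hot_b"]
    print("original :", sorted(blocks_touched(Original, hot)))
    print("reordered:", sorted(blocks_touched(Reordered, hot)))
```

Both layouts have the same size; only the number of blocks needed to reach the hot fields changes.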
bbcache
Tool that produces structure field reordering recommendations
  Constructs a database containing both static and dynamic information about structure field accesses
  Processes the database to construct a field affinity graph for each structure
  Produces structure field order recommendations from the affinity graphs
Attempts to group fields with high temporal affinity into the same cache block
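The pipeline above can be sketched roughly as follows. This is a simplified illustration, not bbcache's actual algorithm: it assumes the trace is a list of `(timestamp, field)` pairs, counts two fields as affine when they are accessed in the same fixed-length interval, and orders fields with a simple greedy pass. The function names and the interval length are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_affinity_graph(trace, interval):
    """Count, for each pair of fields, the intervals in which both appear.

    trace: iterable of (timestamp, field_name) pairs.
    Returns a dict mapping (field_a, field_b), with field_a < field_b,
    to an affinity weight.
    """
    fields_by_interval = defaultdict(set)
    for timestamp, field in trace:
        fields_by_interval[timestamp // interval].add(field)

    graph = defaultdict(int)
    for fields in fields_by_interval.values():
        for a, b in combinations(sorted(fields), 2):
            graph[(a, b)] += 1
    return dict(graph)


def recommend_order(fields, graph):
    """Greedy ordering: seed with the heaviest edge, then repeatedly append
    the field with the highest total affinity to the fields already placed."""
    if not graph:
        return list(fields)

    def weight(a, b):
        return graph.get((a, b), 0) + graph.get((b, a), 0)

    first, second = max(graph, key=graph.get)
    order = [first, second]
    remaining = [f for f in fields if f not in order]
    while remaining:
        best = max(remaining,
                   key=lambda f: sum(weight(f, placed) for placed in order))
        order.append(best)
        remaining.remove(best)
    return order


if __name__ == "__main__":
    # Hypothetical trace: "id" and "lock" are usually accessed together.
    trace = [(0, "id"), (10, "lock"), (120, "id"), (130, "lock"),
             (250, "id"), (260, "stats"), (300, "log")]
    graph = build_affinity_graph(trace, interval=100)
    print(graph)
    print(recommend_order(["id", "log", "lock", "stats"], graph))
```

A real tool would also have to respect field sizes, alignment, and cache block boundaries when turning this order into a layout; the sketch only ranks fields by affinity.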
Experimental Setup
4-processor 400 MHz Pentium II Xeon system
  1 MB L2 cache per processor
  4 GB memory
Ran TPC-C on Microsoft SQL Server 7.0 to collect traces as input to bbcache
Chose the 5 structures that showed the largest potential benefit from reordering
Results: Field Reordering
Cache block pressure = $\left(\sum_{i=1}^{n} b_i\right) / n$
Cache block utilization = $\left(\sum_{i=1}^{n} \sum_{j=1}^{b_i} f_{ij}\right) / \left(\sum_{i=1}^{n} b_i\right)$
Modified SQL Server was consistently better by 2-3%
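As a worked example of the two metrics, the sketch below computes both from made-up data. It assumes that n is the number of measurement intervals, that b_i is the number of cache blocks accessed in interval i, and that f_ij is the fraction of block j actually used in interval i; the slide does not define these symbols, so treat that reading as an assumption. The function name and input format are hypothetical.

```python
def cache_block_metrics(intervals):
    """Return (pressure, utilization) for a list of intervals.

    intervals: one list per interval, holding one fraction (0.0 to 1.0)
    per cache block accessed in that interval, so len(intervals[i]) plays
    the role of b_i and intervals[i][j] the role of f_ij.
    """
    n = len(intervals)
    total_blocks = sum(len(fractions) for fractions in intervals)
    total_fraction = sum(sum(fractions) for fractions in intervals)
    pressure = total_blocks / n                   # average blocks per interval
    utilization = total_fraction / total_blocks   # average fraction of a block used
    return pressure, utilization


if __name__ == "__main__":
    # Three intervals touching 2, 1, and 3 blocks respectively.
    sample = [[0.5, 0.25], [1.0], [0.25, 0.5, 0.5]]
    print(cache_block_metrics(sample))
```

Under this reading, a good field order lowers pressure (fewer blocks touched per interval) and raises utilization (more of each touched block is useful).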
Conclusion
Two techniques improve cache performance by changing the internal organization of fields in a data structure
  Structure splitting
  Field reordering
Structure layouts are better left to the compiler
  Easier to determine field access order
Questions?
Class Splitting Algorithm
Structure Access Database
Building Field Affinity Graphs