


Sharky Extreme : November 21, 2008







While the actual operation is much more complex than the previous example, we can see how processor speeds have outrun DRAM speeds. As indicated in the 'Memory Basics' article, a small amount of SRAM cache was added to the memory subsystem early on to help offset this speed mismatch. Over time, this cache has grown from just a few kilobytes to as much as 2MB for desktop systems. Some high-end server systems, where low price is not as critical, include even more cache.

Cache is essentially a temporary storage area, relatively 'close' to the CPU, that holds small amounts of data. The principle that cache theory is based upon is called locality of time and space (or temporal and spatial locality). Temporal locality simply means that a piece of data that was accessed once has a high probability of being accessed again soon, while spatial locality means that there is a high probability that the next piece of data to be accessed will be near the one just accessed.

One way to picture this would be a legal office environment where paperwork is stored in a filing cabinet across the room. When a case is to be worked on, the entire case file may be pulled and placed in the lawyer's in-basket so all related documents are within arm's reach (spatial locality). During the research, it is likely that a number of these documents will be referred to several times (temporal locality). The alternative would be for the lawyer to get up and pull or re-file each individual document as it is used - very time consuming!
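The two kinds of locality can be illustrated with a toy simulation. The sketch below is not how any real processor is built: it assumes a tiny direct-mapped cache, and the names `simulate`, `num_lines`, and `line_size`, along with the sizes chosen, are made up for illustration. It simply counts how many accesses are satisfied from the cache for three access patterns.

```python
def simulate(addresses, num_lines=4, line_size=8):
    """Count (hits, misses) for a toy direct-mapped cache.

    num_lines and line_size are illustrative values, not taken
    from any real processor.
    """
    lines = [None] * num_lines  # each slot remembers which memory block it holds
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size   # the line-sized block of memory containing addr
        slot = block % num_lines    # direct-mapped: each block can live in one slot only
        if lines[slot] == block:
            hits += 1
        else:
            misses += 1
            lines[slot] = block     # fetch the whole block, like pulling the whole case file
    return hits, misses


# Spatial locality: walk through 32 consecutive addresses.
sequential = list(range(32))
# Temporal locality: return to the same 4 addresses 8 times over.
repeated = [0, 1, 2, 3] * 8
# Poor locality: every access lands in a different block that maps to the same slot.
scattered = [i * 64 for i in range(32)]

for name, pattern in [("sequential", sequential),
                      ("repeated", repeated),
                      ("scattered", scattered)]:
    hits, misses = simulate(pattern)
    print(f"{name:10s} hits={hits:2d} misses={misses:2d}")
```

With these made-up sizes, the sequential walk misses only once per block and the repeated pattern misses only on its first access, while the scattered pattern never finds its data in the cache.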

For most applications, this implementation works very well, and studies have shown that as many as 98% of all requests from the CPU are satisfied from cache when it is implemented properly. Unfortunately, as processors continue to increase in speed and the working set size of mainstream applications grows larger, cache implementations must be constantly modified and tuned. In fact, cache theory has become a study unto itself, which we will investigate in a future article.
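To see why the hit rate matters so much, it helps to work out the average time per memory request. The sketch below uses a deliberately simplified model with a single cache level, and the latencies (2 ns for cache, 60 ns for DRAM) are assumed round numbers chosen for illustration, not measurements of any particular system. The function name `average_access_time` is likewise hypothetical.

```python
def average_access_time(hit_rate, cache_ns, dram_ns):
    """Expected time per request for one cache level in front of DRAM.

    Simplified model: a hit costs cache_ns, a miss costs dram_ns.
    """
    return hit_rate * cache_ns + (1.0 - hit_rate) * dram_ns


CACHE_NS = 2.0   # assumed cache latency (illustrative only)
DRAM_NS = 60.0   # assumed DRAM latency (illustrative only)

for hit_rate in (0.98, 0.90, 0.50):
    avg = average_access_time(hit_rate, CACHE_NS, DRAM_NS)
    print(f"hit rate {hit_rate:.0%}: about {avg:.2f} ns per request")
```

Under these assumptions, a 98% hit rate keeps the average close to the cache latency, while a drop to 90% more than doubles it, which is why a growing working set can hurt performance even when most requests still hit.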

Simply adding more cache (or more cache levels) eventually reaches a point of diminishing returns, and may even begin to reduce performance. Using the previous analogy, you can see the problem that might exist if, instead of one or two case files in the lawyer's in-basket, there are 40 or 50: finding the right document in the pile starts to take time of its own. With some applications, the working set is large enough to overflow even a 2MB cache, and the streaming (non-temporal) store instructions in the SSE instruction set allow the cache to be bypassed completely. Under these circumstances, other methods must be used to get the data from the DRAM and/or SRAM to the CPU faster.
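The effect of a working set that overflows the cache can also be sketched in a few lines. The example below assumes a least-recently-used (LRU) replacement policy and a program that loops over its data repeatedly. Both are simplifying assumptions, since real caches use a variety of policies and organizations, and the names `lru_hit_rate` and `looping_working_set` are invented for this illustration.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Fraction of accesses served by a toy LRU cache holding `capacity` items."""
    cache = OrderedDict()
    hits = 0
    for item in accesses:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used item
            cache[item] = True
    return hits / len(accesses)


def looping_working_set(size, passes=10):
    """A program that sweeps over `size` items, `passes` times in a row."""
    return [i for _ in range(passes) for i in range(size)]


CAPACITY = 8  # illustrative cache size, in items

for size in (4, 8, 9, 16):
    rate = lru_hit_rate(looping_working_set(size), CAPACITY)
    print(f"working set of {size:2d} items: hit rate {rate:.0%}")
```

In this model, a working set that fits in the cache misses only on the first pass, but a working set just one item larger than the cache causes every access to miss, because each item is evicted shortly before it is needed again.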




Copyright © 2001 INT Media Group, Incorporated. All Rights Reserved.