Cache algorithm

A Cache algorithm is an algorithm used to manage a cache or group of data. When the cache is full, it decides which item should be deleted from the cache. The word hit rate describes how often a request can be served from the cache. The term latency describes for how long a cached item can be obtained. Cache alorithms are a trade-off between hit-rate and latency.

  • The most efficient caching algorithm would be to always discard the information that will not be needed for the longest time in the future. This optimal result is referred to as Belady's optimal algorithm or the clairvoyant algorithm. Since it is generally impossible to predict how far in the future information will be needed, this is generally not implementable in practice. The practical minimum can be calculated only after experimentation, and one can compare the effectiveness of the actually chosen cache algorithm with the optimal minimum.
  • Least Recently Used (LRU): deletes the least recently used items first. This algorithm requires keeping track of what was used when. This is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require to keep "age bits" for cache-lines and track the "Least Recently Used" cache-line based on age-bits. In such implementation, every time a cache-line is used, the age of all other cache-lines changes. LRU is actually a family of caching algorithms with members including: 2Q by Theodore Johnson and Dennis Shasha and LRU/K by Pat O'Neil, Betty O'Neil and Gerhard Weikum.
  • Most Recently Used (MRU): This algorithm deletes the most recently used items first. This caching mechanism is commonly used for database memory caches.
  • Pseudo-LRU (PLRU): There are certain cases where LRU is very difficult to implement. In such cases it may be enough to know that in most cases, one of the least recently used items is deleted. The PLRU algorithm can be used there, because it only needs one bit per cache item to work.
  • 2-way set associative: for high-speed CPU caches where even PLRU is too slow. The address of a new item is used to calculate one of two possible locations in the cache where it is allowed to go. The LRU of the two is discarded. This requires one bit per pair of cache lines, to indicate which of the two was the least recently used.
  • Direct-mapped cache: for the highest-speed CPU caches where even 2-way set associative caches are too slow. The address of the new item is used to calculate the one location in the cache where it is allowed to go. Whatever was there before is discarded.
  • Least Frequently Used (LFU): LFU counts how often an item is needed. Those that are used least often are discarded first.
  • Adaptive Replacement Cache (ARC):[1] constantly balances between LRU and LFU, to improve combined result.
  • Multi Queue (MQ) caching algorithm:[2] (by Y. Zhou J.F. Philbin and Kai Li).

Other things to consider:

  • Items with different cost: keep items that are expensive to obtain, e.g. those that take a long time to get.
  • Items taking up more cache: If items have different sizes, the cache may want to discard a large item to store several smaller ones.
  • Items that expire with time: Some caches keep information that expires (e.g. a news cache, a DNS cache, or a web browser cache). The computer may discard items because they are expired. Depending on the size of the cache no further caching algorithm to discard items may be necessary.

Various algorithms also exist to maintain cache coherency. This applies only to situation where multiple independent caches are used for the same data (for example multiple database servers updating the single shared data file).

References

  1. http://www.usenix.org/events/fast03/tech/full_papers/megiddo/megiddo.pdf
  2. "Index of /Events/Usenix01/Full_papers/Zhou". Archived from the original on 2005-11-24. Retrieved 2008-12-21.