Demote Scheme and Promote Scheme for Exclusive Cache

13 Mar 2018 Comparch HW Cache ideas Microarchitecture

Unlike inclusive cache, it is harder to implement an exclusive policy on cache. This post introduces two schemes for exclusive cache.

Demote

Demote is the naturally idea used in multi-level cache design, that evicted cache blocks are demoted and go to lower level cache. To guarantee the exclusiveness, it behaves as the follows:

On RFO (Request For Ownership) miss or read miss, sent request to lower level cache;
On read hit, relocate the block to the level 1 cache, which means invalidate the entry that gets hit, if it is in a lower level cache;
On eviction, level 2 cache services as the victim cache for level 1 cache, level 3 cache services as the victim cache for level 2 cache … and there is never a writeback hit at lower level caches;

As in Demote, only when a page is accessed via different paths in a multi-path hierachy can a page appear in multiple locations (which is better than enforcing absolute exclusivity).

Here “page” could be interpreted as cache block. An example is split level 1 instruction cache and level 1 data cache, which are common in modern processors, but need to be handled carefully even in inclusive cache (for coherency issue).

Compared with inclusive caches, exclusive caches utilize its storage capcity in a more efficient manner, therefore improve hit ratio. While the performance of demote scheme cache could easily degrade:

Further, an eviction, which used to be a trivial operation, now becomes an expensive operation almost equivalent to a write request to the lower cache. This leads to further degradation of performance.

Becuase demote exclusive caches intrudoce too many writeback operations by conducting writeback upon every eviction, no matter the evicted block is dirty or not.

Cache ping-pong

Here is an example of cache ping-pong to help illustrating how demote exclusive caches become worse than inclusive caches: Simply assume the processor repeats reading A then B, and A shares one entry in level 1 cache with B, shares two entries in level 2 cache with B. As the above figure shows. Inclusive caches are always getting hits at level 2 except for the first two compulsory misses. And because all are read operations, no writeback needed but simply overwrite the entry in level 1. While it is true that exclusive caches also get hits at level 2, before bring block A (or block B) to level 1, it needs to write the another block back to level 2. Unfortunately, because exclusive cache is writeback-intensive, it is highly likely the writeback buffer becomes full, then it has to stall.

Promote

Promote scheme performs cache data movements in one-way path, that it is only possible to promote a block from lower level cache to upper level cache, but lower level cache would not hold the blocks evicted from upper level cache.

According to the USENIX paper, On Multi-level Exclusive Caching: Offline Optimality and Why promotions are better than demotions, the machenism they applied to implement promote scheme is adding one bit in the return packet to indicate whether the block is already cached in lower level cache or not.

Each level cache decides whether allocate an entry for the block or simply pass it to upper block based on two criterions:

the bit coming together with the return data: if that bit says the block is already cached in lower level cache, by no means should this level cache to cache it again;
promote probability of this level cache: a value between 0 and 1 is calculated and associated to every level cache as the probability to promote a cache block (of course the top level cache is 0), and a random value is generated then compared with the value for each passed-by packet, so allocate space if the random is greater;

Upon eviction, clean blocks are dropped, and dirty blocks are written back to main memory directly. So unlike demote scheme, promote scheme is somehow immune to cache ping-pong.

The Promote technique ensures that only one cache claims ownership of a page on its way to the client. On subsequent accesses to the page, the page can either stay in the same cache or be promoted upwards.

Obviously the promote probability has great influence on the performance of the promote scheme. However, they calculate it under an idea that seems rather empirical than grounded to me.

The fundamental idea in the Promote technique is to use this leverage to create a situation where the “usefulness” of pages evicted from the various caches a roughly the same (if possible).

Another issue is that the probability is merely linked to cache level, but the access history of cache block is not used.

Credits

On Multi-level Exclusive Caching: Offline Optimality and Why promotions are better than demotions - The paper for demote scheme.
ChampSim - Playing with ChampSim gives me some exposure to cache behaviors.