Programmatically disable hardware prefetching on AMD systems

All AMD Family 10h processors (including Barcelona and Istanbul) have two different hardware prefetchers.

  1. The first is the traditional data cache prefetcher that recognizes contiguous streams of either ascending or descending cache line accesses. It can be disabled by setting bit 13 of MSRC001_1022 to "1".

  2. The other hardware prefetcher is the "memory controller prefetcher". This is a somewhat more general prefetcher, but it operates only within the memory controller (i.e., it does not send the prefetched data to a core; it just enables the memory controller to return the data more quickly when the core requests it).

    • The primary control for this prefetcher is in PCI configuration space, Function 2, offset 11Ch, with additional control in Function 2, offset 1B0h for the processors after Barcelona.
    • I have had success in disabling and re-enabling this prefetcher on a "live" Barcelona system by updating the values in PCI configuration space via the /dev/mem device driver. (Don't try this at home!)
    • The activity of the memory controller prefetcher is shown by the hardware performance counter event 1F0h, with UnitMasks 02 and 04.
    • Note that the memory controller prefetcher for Shanghai/Istanbul/MagnyCours operates "coherently", meaning that cache coherence probe operations are issued along with the memory prefetches. The memory controller prefetcher in Barcelona does not issue cache coherence operations with its prefetches; the probes are not issued until the core's request for the cache line arrives at the memory controller.
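
As a starting point for the memory controller prefetcher, here is a minimal, read-only Python sketch that inspects the Function 2, offset 11Ch register through the Linux sysfs PCI interface. This is not the /dev/mem approach I used; it is a different and more cautious route. The helper names are hypothetical, and the device numbering (bus 0, device 18h + node number) is an assumption about how the Family 10h northbridge shows up on your system. Because offset 11Ch lies above the first 256 bytes, it is only readable if the kernel exposes extended configuration space for that device. The bit definitions are in the BKDG; the sketch deliberately does not decode or modify them.

```python
import struct

F2_OFFSET = 0x11C  # Function 2, offset 11Ch (from the text above)


def nb_config_path(node: int, function: int = 2) -> str:
    """Build the sysfs config-space path for one northbridge function.

    Assumption: the northbridge for a given node appears at
    bus 0, device 0x18 + node.
    """
    return f"/sys/bus/pci/devices/0000:00:{0x18 + node:02x}.{function}/config"


def read_config_dword(path: str, offset: int) -> int:
    """Read one little-endian 32-bit register from a config-space file."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(4)
    if len(data) != 4:
        # sysfs exposes only 256 bytes when extended config space is unavailable
        raise OSError(f"offset {offset:#x} is not readable in {path}")
    return struct.unpack("<I", data)[0]
```

Typical use (as root) would be `hex(read_config_dword(nb_config_path(0), F2_OFFSET))`, repeated for each node. Compare the value against the BKDG before attempting any write.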

All of the above is documented in the BIOS and Kernel Developer's Guide for Family 10h processors: http://support.amd.com/us/Processor_TechDocs/31116.pdf
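
For the data cache prefetcher (item 1), a minimal Python sketch of the read-modify-write on MSRC001_1022 might look like the following. It assumes a Linux system with the `msr` kernel module loaded, so that `/dev/cpu/N/msr` exists, and root privileges. The function names are hypothetical. Since MSRs are written through a per-CPU device file, you would presumably need to repeat the write for every core.

```python
import os
import struct

MSR_ADDR = 0xC0011022   # MSRC001_1022 (from the text above)
PREFETCH_DISABLE_BIT = 13


def with_prefetch_disabled(value: int, disabled: bool) -> int:
    """Return the MSR value with bit 13 set (disabled) or cleared (enabled)."""
    mask = 1 << PREFETCH_DISABLE_BIT
    return (value | mask) if disabled else (value & ~mask)


def read_msr(cpu: int, msr: int) -> int:
    """Read a 64-bit MSR; the file offset selects the MSR number."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


def write_msr(cpu: int, msr: int, value: int) -> None:
    """Write a 64-bit MSR on the given CPU."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), msr)
    finally:
        os.close(fd)


def set_dc_prefetcher(cpu: int, disabled: bool) -> None:
    """Read-modify-write so that all other bits in the MSR are preserved."""
    old = read_msr(cpu, MSR_ADDR)
    write_msr(cpu, MSR_ADDR, with_prefetch_disabled(old, disabled))
```

Calling `set_dc_prefetcher(cpu, True)` for each core disables the prefetcher, and `set_dc_prefetcher(cpu, False)` re-enables it. Writing the wrong value to an MSR can hang the machine, so check the current contents with `read_msr` first.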