How reliable is page write-tracking in Windows, given processor caches?

It's no coincidence that a write watch operates at page granularity. That's because it is handled at the CPU level, via the page tables used by the MMU. I can't find an authoritative source, but I understand that this works via the read-only page attribute: a watched page is mapped read-only, and the soft page fault raised by a write is handled by adding the watched page to the modified list.

As such, stale data in processor caches is irrelevant. This is handled at the MMU level, and the MMU is closely coupled to the caches anyway.

I'd be more worried about race conditions, because those appear at the C++ level. A write to the watched page could happen from another thread even while GetWriteWatch is running.

The blunt answer: presumably yes.

While the documentation doesn't give an explicit guarantee, it can be assumed, since this deals with the MMU, the CPU, and low-level memory management. It works like the rest of the API: see creating guard pages, etc. Taken together, all these guard and protect features of the API wouldn't be half as useful as they actually are if you couldn't count on them being precise down to the instruction causing the fault. That being said, how this is actually accomplished by the OS/CPU/MMU/TLB/cache is somewhat in the dark to me; I will update if I figure it out.

In your example, I would be more worried about the compiler/optimizer playing some trick on you, so perhaps take a look at the generated assembly and see where the actual write is.
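If inspecting the assembly is inconvenient, one way to rule out the optimizer is to do the write through a volatile pointer. This is only a sketch: write_byte_volatile is a hypothetical helper of mine, not something from your code or from the Windows API. A volatile store may not be elided by the compiler, and it keeps its order relative to other volatile accesses, so the store instruction is emitted where the source says it is.

```cpp
#include <cstddef>

// Hypothetical helper: store one byte through a volatile pointer so that the
// compiler has to emit the store and cannot drop it as a dead write.
inline void write_byte_volatile(void* base, std::size_t offset,
                                unsigned char value) {
    static_cast<volatile unsigned char*>(base)[offset] = value;
}
```

Note that volatile only constrains the compiler. It says nothing about ordering between threads, so it does not address the race mentioned in the other answer.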