Atomic operations on floats

To improve upon previous answers in the context of Go, we can use https://golang.org/pkg/math/#Float64bits and https://golang.org/pkg/math/#Float64frombits to convert float64s to and from uint64s without using the unsafe package directly.

Given a uint64, we can then use all the available atomic primitives.

import (
    "math"
    "sync/atomic"
)

type AtomicFloat64 uint64

// Value atomically loads the current value.
func (f *AtomicFloat64) Value() float64 {
    return math.Float64frombits(atomic.LoadUint64((*uint64)(f)))
}

// Add atomically adds n via a compare-and-swap loop and returns the new value.
func (f *AtomicFloat64) Add(n float64) float64 {
    for {
        a := atomic.LoadUint64((*uint64)(f))
        b := math.Float64bits(math.Float64frombits(a) + n)
        if atomic.CompareAndSwapUint64((*uint64)(f), a, b) {
            return math.Float64frombits(b)
        }
    }
}

Let's ponder floating-point atomics from the OS/hardware design point of view...

Atomics exist because they're needed for synchronisation. What does the majority of synchronisation involve? Handles, flags, mutexes, spinlocks - things whose actual value is meaningless as long as it's consistent per user and different between users. Even for something like a semaphore where the value is more meaningful - it's still about counting rather than measuring, so 32 bits is worth 32 bits whatever we deem it to represent.

Secondly, technical issues. Pretty much anything we can program on does integer operations. Not so floating point - when FP operations are being emulated by the C library, those atomics are going to be between difficult and impossible to implement. Even in hardware, FP operations are usually going to be slower than integer, and who wants slow locks? The design of the FPU itself may even make it difficult to implement atomic operations - e.g. if it's hanging off a coprocessor interface without any direct access to the memory bus.

Second-and-a-halfth, if we want float, surely we want double as well? But double often has the problem of being bigger than a machine word, ruling out atomicity of even loads and stores on many architectures.

Third, when it comes to things like atomics, CPU architects tend to implement what system designers and OS folks are demanding, and OS folks don't exactly love floating point in general - stupid extra registers to save, slowing down context switches... More instructions/features in the hardware cost power and complexity, and if the customers don't want them...

So, in short, there's not enough of a use case, so there's no hardware support, so there's no language support. Of course, on some architectures you can roll your own atomics, and I imagine GPU compute may have more demand for synchronisation on primarily floating-point hardware, so who knows if it'll stay that way?