Is it appropriate to use haveged as a source of entropy on virtual machines?

(Caveat: I certainly don't claim that HAVEGE lives up to its claims. I have not checked their theory or implementation.)

To get randomness, HAVEGE and similar systems feed on "physical events", and in particular on the timing of physical events. Such events include occurrences of hardware interrupts (whose timing, in turn, captures data about key strokes, mouse movements, incoming ethernet packets, the time for a hard disk to complete a write request...). HAVEGE also claims to feed on all the types of cache misses which occur in a CPU (L1 cache, L2 cache, TLB, branch prediction...); the behaviour of these elements depends on what the CPU has been doing in the previous few thousand clock cycles, so there is potential for some "randomness". This hinges on being able to measure the current time with great precision (not necessarily accuracy), which is where the rdtsc instruction comes into play. It returns the current contents of an internal counter which is incremented at each clock cycle, so it offers sub-nanosecond precision.
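
As a (much simplified) illustration of the general idea, the sketch below folds the low-order bits of successive rdtsc deltas into a small pool. It assumes an x86-64 CPU and GCC; the cycles() helper and the ad-hoc rotate-and-XOR mixing are mine, not HAVEGE's (the real thing additionally walks large tables to provoke the cache and TLB misses mentioned above):

#include <stdio.h>
#include <stdint.h>

/* Read the CPU cycle counter (x86-64, GCC inline assembly). */
static __inline__ uint64_t
cycles(void)
{
        uint32_t lo, hi;

        __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}

int
main(void)
{
        uint64_t pool[4] = { 0, 0, 0, 0 };
        uint64_t prev, now, delta;
        long i;

        prev = cycles();
        for (i = 0; i < 1000000; i ++) {
                now = cycles();
                delta = now - prev;   /* the jitter lives in the low bits */
                prev = now;
                /* Fold the delta into the pool with a rotate-and-XOR;
                 * a real harvester would use a proper mixing function. */
                pool[i & 3] = ((pool[i & 3] << 7) | (pool[i & 3] >> 57))
                              ^ delta;
        }
        for (i = 0; i < 4; i ++)
                printf("%016llx\n", (unsigned long long)pool[i]);
        return 0;
}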

For a virtual machine system, there are three choices with regards to this instruction:

  1. Let the instruction go to the hardware directly.
  2. Trap the instruction and emulate it.
  3. Disable the instruction altogether.

If the VM manager chooses the first solution, then rdtsc has all the needed precision, and should work as well as it would on a physical machine, for the purpose of gathering entropy from hardware events. However, since this is a virtual machine, it is an application on the host system; it does not get the CPU all the time. From the point of view of the guest operating system using rdtsc, this looks as if its CPU were "stolen" occasionally: two successive rdtsc instructions, nominally separated by a single clock cycle, may report an increase of the counter by several millions. In short, when rdtsc is simply applied on the hardware, the guest OS can use it to detect the presence of a hypervisor.
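
You can observe the "stolen cycles" effect directly by recording the largest gap between two back-to-back reads of the counter. The sketch below (again x86-64/GCC, with the same cycles() helper) will typically print a figure in the millions on a VM whose host is busy, versus a much smaller figure on bare metal, where only interrupts get in the way:

#include <stdio.h>
#include <stdint.h>

static __inline__ uint64_t
cycles(void)
{
        uint32_t lo, hi;

        __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}

int
main(void)
{
        uint64_t prev, now, max_gap;
        long i;

        max_gap = 0;
        prev = cycles();
        for (i = 0; i < 10000000; i ++) {
                now = cycles();
                if (now - prev > max_gap)
                        max_gap = now - prev;
                prev = now;
        }
        printf("largest gap: %llu cycles\n", (unsigned long long)max_gap);
        return 0;
}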

The second solution is meant to make the emulation more "perfect", by maintaining a virtual per-VM cycle counter which keeps track of the cycles really allocated to that VM. The upside is that rdtsc, from the point of view of the guest, no longer exhibits the "stolen cycles" effect. The downside is that this emulation is performed by triggering and trapping a CPU exception, raising the cost of the rdtsc opcode from a few dozen clock cycles (it depends on the CPU brand; some execute rdtsc in less than 10 cycles, others need 60 or 70 cycles) to more than a thousand cycles. If the guest tries to execute a lot of rdtsc instructions (as HAVEGE will be prone to do), it will slow down to a crawl. Moreover, the exception-handling code will disrupt the measurement; instead of measuring the hardware event timing, the code will measure the execution time of the exception handler, which can conceivably lower the quality of the extracted randomness.

The third solution (disabling rdtsc) will simply prevent HAVEGE from returning good randomness. Since it internally uses a PRNG, the output may still fool statistical analysis tools, because there is a huge difference between "looking random" and "being unpredictable" (statistical analysis tools follow the "look random" path, but cryptographic security relies on unpredictability).
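
The difference is easy to demonstrate. The toy generator below (xorshift64, a statistically decent but completely non-cryptographic PRNG, chosen here just for illustration) produces output that looks fine to casual statistical inspection, yet anyone who knows the 64-bit state, here a hardcoded constant, can reproduce the entire stream:

#include <stdio.h>
#include <stdint.h>

/* xorshift64 (Marsaglia): passes many statistical tests, yet is
 * trivially predictable once the 64-bit state is known. */
static uint64_t state = 88172645463325252ULL;   /* fixed, public "seed" */

static uint64_t
xorshift64(void)
{
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        return state;
}

int
main(void)
{
        int i;

        /* "Looks random" to statistical tools, but running the program
         * twice prints the exact same values: zero unpredictability. */
        for (i = 0; i < 8; i ++)
                printf("%016llx\n", (unsigned long long)xorshift64());
        return 0;
}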

The VirtualBox manual claims that VirtualBox, by default, follows the first method (rdtsc is unconditionally allowed and applied on the hardware directly), but may be configured to apply the second solution (which you don't want, in this case).

To test what your VM does, you can try this small program (compile with gcc -W -Wall -O on Linux; the -O is important, so that the rdtsc() wrapper is actually inlined and you measure the instruction itself rather than a function call):

#include <stdio.h>

#if defined(__i386__)

/* On 32-bit x86, rdtsc (encoded as 0x0f 0x31) writes the 64-bit
 * counter into edx:eax, which the "=A" constraint maps to a single
 * 64-bit value. */
static __inline__ unsigned long long rdtsc(void)
{
        unsigned long long int x;

        __asm__ __volatile__ (".byte 0x0f, 0x31" : "=A" (x));
        return x;
}

#elif defined(__x86_64__)

/* On x86-64, the two 32-bit halves must be recombined by hand,
 * because "=A" does not mean edx:eax in 64-bit mode. */
static __inline__ unsigned long long rdtsc(void)
{
        unsigned hi, lo;

        __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
        return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}

#endif

int
main(void)
{
        long i;
        unsigned long long d;

        d = 0;
        /* Average the counter difference between back-to-back reads;
         * this is essentially the cost of one rdtsc. */
        for (i = 0; i < 1000000; i ++) {
                unsigned long long b, e;

                b = rdtsc();
                e = rdtsc();
                d += e - b;
        }
        printf("average : %.3f\n", (double)d / 1000000.0);
        return 0;
}

On a non-virtual machine, with the "true" rdtsc, this should report a value between 10 and 100, depending on the CPU brand. If the reported value is 0, or if the program crashes, then rdtsc is disabled. If the value is in the thousands, then rdtsc is emulated, which means that the entropy gathering may not work as well as expected.

Note that even getting a value between 10 and 100 is not a guarantee that rdtsc is not emulated, because the VM manager, while maintaining its virtual counter, may subtract from it the expected time needed for execution of the exception handler. Ultimately, you really need to have a good look at the manual and configuration of your VM manager.


Of course, the whole premise of HAVEGE is questionable. For any practical security, you need a few "real random" bits, no more than 200, which you use as a seed in a cryptographically secure PRNG. The PRNG will produce gigabytes of pseudo-randomness indistinguishable from true randomness, and that's good enough for all practical purposes.
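
To make that concrete, here is a minimal sketch of the seed-once-then-expand pattern: 32 bytes (256 bits) are read once from the system RNG, then the ChaCha20 block function stretches that seed into an arbitrarily long keystream. This is an illustration of the pattern, not a replacement for your platform's CSPRNG; error handling is minimal, and /dev/urandom is assumed as the seed source:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define ROTL32(v, n)  (((v) << (n)) | ((v) >> (32 - (n))))
#define QR(a, b, c, d)                            \
        (a += b, d ^= a, d = ROTL32(d, 16),       \
         c += d, b ^= c, b = ROTL32(b, 12),       \
         a += b, d ^= a, d = ROTL32(d,  8),       \
         c += d, b ^= c, b = ROTL32(b,  7))

/* One ChaCha20 block: expands a 256-bit key and a 64-bit counter
 * (the nonce is fixed to zero here) into 64 bytes of keystream. */
static void
chacha20_block(const uint32_t key[8], uint64_t ctr, uint32_t out[16])
{
        uint32_t s[16] = {
                0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
                key[0], key[1], key[2], key[3],
                key[4], key[5], key[6], key[7],
                (uint32_t)ctr, (uint32_t)(ctr >> 32), 0, 0
        };
        uint32_t x[16];
        int i;

        memcpy(x, s, sizeof x);
        for (i = 0; i < 10; i ++) {   /* 20 rounds = 10 "double rounds" */
                QR(x[0], x[4], x[ 8], x[12]); QR(x[1], x[5], x[ 9], x[13]);
                QR(x[2], x[6], x[10], x[14]); QR(x[3], x[7], x[11], x[15]);
                QR(x[0], x[5], x[10], x[15]); QR(x[1], x[6], x[11], x[12]);
                QR(x[2], x[7], x[ 8], x[13]); QR(x[3], x[4], x[ 9], x[14]);
        }
        for (i = 0; i < 16; i ++)
                out[i] = x[i] + s[i];
}

int
main(void)
{
        uint32_t key[8], block[16];
        uint64_t ctr;
        int i;
        FILE *f;

        /* Seed once: 256 "real random" bits from the system RNG. */
        f = fopen("/dev/urandom", "rb");
        if (f == NULL || fread(key, sizeof key, 1, f) != 1) {
                fprintf(stderr, "cannot read seed\n");
                return 1;
        }
        fclose(f);

        /* Expand: each counter value yields 64 fresh bytes; there is
         * no need to ever go back to the hardware source. */
        for (ctr = 0; ctr < 2; ctr ++) {
                chacha20_block(key, ctr, block);
                for (i = 0; i < 16; i ++)
                        printf("%08x%c", (unsigned)block[i],
                               (i % 4 == 3) ? '\n' : ' ');
        }
        return 0;
}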

Insisting on going back to the hardware for every bit looks like yet another outbreak of that flawed idea which sees entropy as a kind of gasoline, which you burn up when you look at it.


On the polarssl advisory: A detailed technical response can be found in the most recent debian source:

http://bazaar.launchpad.net/~ubuntu-branches/debian/experimental/haveged/experimental/revision/12?start_revid=12#debian/README.Debian

The executive summary is: polarssl != haveged != HAVEGE

On the embloms.se emulator experiments:

The haveged test suite (the NIST and ent batteries) verifies that a source build has produced a functional RNG. Run-time testing is required to verify haveged's operation in a virtual environment; that is a fairly recent addition to haveged.

On statistical testing:

Suppose you have a hardware RNG, a TRNG: how do you know it isn't broken? Hardware breaks. The German standards body has a specification that deals with that very problem, AIS31, and that approach is adopted by haveged. A (biased) discussion of standards for RNGs as they apply to haveged can be found at:

http://www.issihosts.com/haveged/ais31.html

The non-predictability of HAVEGE comes from exactly the same mechanism seen in benchmarking software. It is not due to timer drift, but to the aggregation of the asynchronous behaviors in a modern processor. It really does not matter whether the performance variations come from a cache miss or a time slice granted to another process. The timer accuracy also does not matter as long as it is 'adequate'. How is that determined? By the output. By design (or perhaps over-design), /dev/random is immune to poor input data. The only way to subvert the design is to lie about the entropy added to the entropy pool. The most recent versions of haveged perform a run-time entropy estimate on the generated output to ensure that the output is consistent with an ideal TRNG.
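
haveged's actual run-time tests implement the AIS-31 procedures; as a much simpler illustration of the same principle, continuously test the output and refuse to run if it fails, here is a FIPS 140-2-style monobit check (reading /dev/urandom as the source under test is just for the demo):

#include <stdio.h>
#include <stdint.h>

/* FIPS 140-2-style monobit test: count the 1 bits in a 20000-bit
 * (2500-byte) block; pass if 9725 < ones < 10275. */
static int
monobit_ok(const uint8_t *buf, size_t len)
{
        size_t i;
        int b, ones = 0;

        for (i = 0; i < len; i ++)
                for (b = 0; b < 8; b ++)
                        ones += (buf[i] >> b) & 1;
        return ones > 9725 && ones < 10275;
}

int
main(void)
{
        uint8_t buf[2500];
        FILE *f;

        f = fopen("/dev/urandom", "rb");
        if (f == NULL || fread(buf, sizeof buf, 1, f) != 1) {
                fprintf(stderr, "cannot read source\n");
                return 1;
        }
        fclose(f);
        /* A production RNG would run such tests continuously and
         * shut down on failure, as haveged does with its AIS-31 tests. */
        printf("monobit: %s\n", monobit_ok(buf, sizeof buf) ? "PASS" : "FAIL");
        return 0;
}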

Executive summary: haveged output is indistinguishable from a TRNG according to the tests used by the German standards body to certify TRNGs. If that is not the case, haveged will shut down.


(Updated, with thanks to gwuertz, the current author/maintainer of haveged; I had missed the distinction between HAVEGE and haveged.)

haveged is a distinct implementation of the HAVEGE method for generating random numbers. It is current, maintained, and documented here: http://www.issihosts.com/haveged/ (and it no longer uses libhavege directly).

HAVEGE isn't very current (2006), though someone has patched it more recently (2009) for speed and correctness. I would be cautious, because it does not document its method, does not mention virtual machines, and, as noted, relies (heavily) on RDTSC (or a platform equivalent). (Also, the source code makes me shudder, but that's quite subjective.)

I would argue that the VM host should not be unintentionally leaking any state into its guests, so the guest environment should not be considered a good source of entropy when using this method.

A better approach on Linux is rng-tools with the virtio-rng paravirtualised RNG driver (i.e. letting a guest access entropy gathered by the host, which eliminates many potential issues with a guest's view of "random" events); you may also find Entropy Broker useful. On recent Intel chips you can also expose the RDRAND instruction to guests, side-stepping the issue entirely.
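
If your guest does see RDRAND (check for the rdrand flag in /proc/cpuinfo first; executing the instruction on a CPU that lacks it will fault), reading it looks like the sketch below. This is my own minimal example, assuming x86-64 and a reasonably recent GCC/binutils; note the retry loop, since RDRAND signals transient failure by clearing the carry flag:

#include <stdio.h>
#include <stdint.h>

/* Read one 64-bit value with RDRAND (x86-64, GCC inline assembly).
 * Returns 1 on success, 0 when the instruction clears the carry flag. */
static int
rdrand64(uint64_t *v)
{
        unsigned char ok;

        __asm__ __volatile__ ("rdrand %0; setc %1" : "=r"(*v), "=qm"(ok));
        return ok;
}

int
main(void)
{
        uint64_t v;
        int tries;

        /* RDRAND can fail transiently; retrying a few times is the
         * usual recommendation. */
        for (tries = 0; tries < 10; tries ++) {
                if (rdrand64(&v)) {
                        printf("%016llx\n", (unsigned long long)v);
                        return 0;
                }
        }
        fprintf(stderr, "RDRAND unavailable or persistently failing\n");
        return 1;
}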

This article (from hpa's talk at LinuxCon Europe 2012) is useful reading: http://lwn.net/Articles/525459/ ; it also discusses HAVEGE/haveged (though the distinction isn't clear there either).

See the answers to this question for some thoughts on how to determine lack of randomness: What statistics can be used to identify pseudorandom data?