How do I interpret the statistics of a memtest run?

TL;DR

The most important number first: The error count for healthy memory should be 0. Any number above 0 may indicate damaged/faulty sectors.


Screen explanation

     Memtest86+ v1.00      | Progress of the entire pass (test series)
CPU MODEL and clock speed  | Progress of individual, current test
Level 1 cache size & speed | Test type that is currently running
Level 2 cache size & speed | Part of the RAM (sector) that is being tested
RAM size and testing speed | Pattern that is being written to the sector
Information about the chipset that your mainboard uses
Information about your RAM set-up, clock speed, channel settings, etc.

WallTime   Cached  RsvdMem   MemMap   Cache  ECC  Test  Pass  Errors  ECC Errs
---------  ------  -------  --------  -----  ---  ----  ----  ------  --------
Elapsed    Amount  Amount    Mapping  on     on   Test  # of  # of    # of ECC
time       of RAM  of        used     or     or   type  pass  errors  errors
           cached  reserved           off    off        done  found   found
                   RAM, not
                   tested

Data/Test explanation

MemTest runs a number of tests: it writes specific patterns to every sector of the memory and reads them back. If the retrieved data differs from the data that was originally written, MemTest registers an error and increases the error count by one. Errors are usually a sign of a bad RAM module.
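As a rough illustration of that write/read-back/compare loop, here is a minimal Python sketch. It is not MemTest's actual code: the memory is just a Python list, and FaultyMemory is a hypothetical model of a module where one bit of one cell is stuck at 0.

```python
def run_pattern_test(memory, pattern):
    """Write `pattern` to every cell, read it back, and count mismatches."""
    errors = 0
    for addr in range(len(memory)):
        memory[addr] = pattern
    for addr in range(len(memory)):
        if memory[addr] != pattern:
            errors += 1
    return errors


class FaultyMemory(list):
    """Hypothetical RAM model: bit 3 of cell 5 is stuck at 0."""

    def __setitem__(self, addr, value):
        if addr == 5:
            value &= ~0b1000  # the stuck bit never stores a 1
        super().__setitem__(addr, value)


healthy = [0] * 16
faulty = FaultyMemory([0] * 16)
print(run_pattern_test(healthy, 0xFF))  # 0
print(run_pattern_test(faulty, 0xFF))   # 1
print(run_pattern_test(faulty, 0x00))   # 0, this pattern cannot see the fault
```

The last line hints at why more than one pattern is needed: an all-zeros pattern never exercises a bit that is stuck at 0.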

Since memory isn't just a notepad that holds information but has advanced functions like caching, several different tests are done. The Test # shows which of these tests is currently running.

Some (simplified) test examples:

  • Test sectors in this order: A, B, C, D, E, F. (Serial)
  • Test sectors in this order: A, C, E, B, D, F. (Moving)
  • Fill all sectors with pattern: aaaaaaaa
  • Fill all sectors with a random pattern.

More detailed description of all tests from: https://www.memtest86.com/technical.htm#detailed

Test 0 [Address test, walking ones, no cache]

Tests all address bits in all memory banks by using a walking ones address pattern.

Test 1 [Address test, own address, Sequential]

Each address is written with its own address and then is checked for consistency. In theory previous tests should have caught any memory addressing problems. This test should catch any addressing errors that somehow were not previously detected. This test is done sequentially with each available CPU.
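To see why writing each address into its own location catches addressing faults, here is a small Python sketch. The make_memory helper and its stuck_line parameter are made up for illustration; they model a module where one address line is stuck at 0, so two different addresses end up sharing a cell.

```python
def own_address_test(write, read, size):
    """Store each address in its own cell, then verify every cell."""
    for addr in range(size):
        write(addr, addr)
    return [addr for addr in range(size) if read(addr) != addr]


def make_memory(size, stuck_line=None):
    """Hypothetical RAM model; optionally one address line is stuck at 0."""
    cells = [0] * size

    def decode(addr):
        if stuck_line is not None:
            addr &= ~(1 << stuck_line)  # the stuck line always reads as 0
        return addr

    def write(addr, value):
        cells[decode(addr)] = value

    def read(addr):
        return cells[decode(addr)]

    return write, read


write, read = make_memory(8)
print(own_address_test(write, read, 8))  # []

write, read = make_memory(8, stuck_line=1)
print(own_address_test(write, read, 8))  # [0, 1, 4, 5]
```

With a constant data pattern the aliased cells would all hold the same value and the fault would go unnoticed; using the address itself as the data makes every cell's expected content unique.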

Test 2 [Address test, own address, Parallel]

Same as test 1 but the testing is done in parallel using all CPUs and using overlapping addresses.

Test 3 [Moving inversions, ones&zeros, Sequential]

This test uses the moving inversions algorithm with patterns of all ones and zeros. Cache is enabled even though it interferes to some degree with the test algorithm. With cache enabled this test does not take long and should quickly find all "hard" errors and some more subtle errors. This test is only a quick check. This test is done sequentially with each available CPU.

Test 4 [Moving inversions, ones&zeros, Parallel]

Same as test 3 but the testing is done in parallel using all CPUs.

Test 5 [Moving inversions, 8 bit pat]

This is the same as test 4 but uses an 8 bit wide pattern of "walking" ones and zeros. This test will better detect subtle errors in "wide" memory chips.

Test 6 [Moving inversions, random pattern]

Test 6 uses the same algorithm as test 4 but the data pattern is a random number and its complement. This test is particularly effective in finding difficult to detect data sensitive errors. The random number sequence is different with each pass so multiple passes increase effectiveness.

Test 7 [Block move, 64 moves]

This test stresses memory by using block move (movsl) instructions and is based on Robert Redelmeier's burnBX test. Memory is initialized with shifting patterns that are inverted every 8 bytes. Then 4mb blocks of memory are moved around using the movsl instruction. After the moves are completed the data patterns are checked. Because the data is checked only after the memory moves are completed it is not possible to know where the error occurred. The addresses reported are only for where the bad pattern was found. Since the moves are constrained to an 8mb segment of memory the failing address will always be less than 8mb away from the reported address. Errors from this test are not used to calculate BadRAM patterns.

Test 8 [Moving inversions, 32 bit pat]

This is a variation of the moving inversions algorithm that shifts the data pattern left one bit for each successive address. The starting bit position is shifted left for each pass. To use all possible data patterns 32 passes are required. This test is quite effective at detecting data sensitive errors but the execution time is long.
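The "32 passes" figure can be made concrete with a short Python sketch. It assumes the pattern is a single 1 bit in a 32-bit word and that the shift wraps around (a rotate); both are assumptions for illustration, not taken from the MemTest source, and pattern_for is a made-up helper name.

```python
MASK = 0xFFFFFFFF


def rotl32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK


def pattern_for(addr, pass_no):
    """Pattern written at word `addr` during pass `pass_no` (assumed scheme)."""
    return rotl32(1, addr + pass_no)


# Successive addresses see the bit shifted left by one position each time.
print([hex(pattern_for(addr, 0)) for addr in range(4)])
# ['0x1', '0x2', '0x4', '0x8']

# One fixed address only sees all 32 bit positions after 32 passes.
print(len({pattern_for(0, p) for p in range(32)}))  # 32
```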

Test 9 [Random number sequence]

This test writes a series of random numbers into memory. By resetting the seed for the random number the same sequence of numbers can be created for a reference. The initial pattern is checked and then complemented and checked again on the next pass. However, unlike the moving inversions test writing and checking can only be done in the forward direction.

Test 10 [Modulo 20, ones&zeros]

Using the Modulo-X algorithm should uncover errors that are not detected by moving inversions due to cache and buffering interference with the algorithm. As with test one only ones and zeros are used for data patterns.

Test 11 [Bit fade test, 90 min, 2 patterns]

The bit fade test initializes all of memory with a pattern and then sleeps for 5 minutes. Then memory is examined to see if any memory bits have changed. All ones and all zero patterns are used.

Because bad sectors may work one time and fail another, I recommend letting MemTest run a few passes. A full pass is one completed run of the whole test series (tests 0-11 above). The more passes you get without errors, the more you can trust your MemTest result. I usually run around 5 passes to be sure.

The error count for healthy memory should be 0. Any number above 0 may indicate damaged/faulty sectors.

The ECC error count should only be taken into account when ECC is set to on. ECC stands for error-correcting code; ECC memory has a mechanism to detect and correct wrong bits in memory. It is loosely comparable to the parity checks done on RAID or optical media. This technology is quite expensive and will likely only be encountered in server set-ups. The ECC count shows how many errors have been corrected by the memory's ECC mechanism. ECC shouldn't have to be invoked for healthy RAM, so an ECC error count above 0 may also indicate bad memory.


Error explanation

Example of Memtest that has encountered errors. It shows which sector/address has failed.

[Image: Memtest screen with errors]

The first column (Tst) shows which test has failed; the number corresponds to the test number in the list above. The second column (Pass) shows the number of the pass during which the error occurred. In the example, test 7 failed before any pass had completed.

The third column (Failing Address) shows exactly which part of the memory has errors. Each memory location has an address, much like an IP address, which is unique for that piece of data storage. The column shows the address that failed and, next to it, the same position expressed in megabytes (0.8MB in the example).

The fourth (Good) and fifth (Bad) columns show the data that was written and what was retrieved respectively. Both columns should be equal in non-faulty memory (obviously).

The sixth column (Err-Bits) shows the position of the exact bits that are failing.
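The relationship between the Good, Bad and Err-Bits columns can be reproduced with an XOR, assuming Err-Bits is a mask of the bits that differ. The values below are made up for illustration, not taken from the screenshot, and failing_bits is a hypothetical helper.

```python
def failing_bits(good, bad):
    """Return the positions of the bits that differ between good and bad."""
    diff = good ^ bad
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


good = 0xFFFFFFFF  # what was written (made-up value)
bad = 0xFFFFFFF7   # what was read back (made-up value)

print(f"{good ^ bad:08x}")      # 00000008
print(failing_bits(good, bad))  # [3]
```

If the same bit position keeps showing up across many addresses, that points at one consistently failing data bit rather than random corruption.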

The seventh column (Count) shows the number of consecutive errors with the same address and failing bits.

Finally, the last column (Chan) shows the channel (if multiple channels are used on the system) that the memory module is in.


If it finds errors

If MemTest discovers any errors, the best method of determining which module is faulty is covered in this Super User question and its accepted answer:

Use the process of elimination -- remove half of the modules and run the test again...

If there are no failures, then you know that these two modules are good, so put them aside and test again.

If there are failures, then cut down to half again (down to one of four memory modules now) then test again.

But, just because one failed a test, don't assume that the other doesn't fail (you could have two failing memory modules) -- where you've detected a failure with two memory modules, test each of those two separately afterwards.

Important note: With features like memory interleaving, and poor memory module socket numbering schemes by some motherboard vendors, it can be difficult to know which module is represented by a given address.
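The halving procedure above can be written down as a short recursive sketch in Python. The passes_memtest function is a hypothetical stand-in for "install exactly these modules and run MemTest"; in reality that step is manual, and some boards need modules installed in pairs or in specific slots, which this sketch ignores.

```python
def find_bad_modules(modules, passes_memtest):
    """Return every faulty module in `modules`, testing by halves."""
    if not modules or passes_memtest(modules):
        return []  # this whole group is good, set it aside
    if len(modules) == 1:
        return list(modules)  # a single module that fails is faulty
    half = len(modules) // 2
    # Test BOTH halves: more than one module may be bad.
    return (find_bad_modules(modules[:half], passes_memtest)
            + find_bad_modules(modules[half:], passes_memtest))


# Hypothetical scenario: two of four modules are faulty.
bad = {"DIMM2", "DIMM4"}


def passes_memtest(installed):
    return not any(module in bad for module in installed)


print(find_bad_modules(["DIMM1", "DIMM2", "DIMM3", "DIMM4"], passes_memtest))
# ['DIMM2', 'DIMM4']
```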


Test number: the number of the specific test that memtest is currently running. There are a lot of them.

Count of errors: The number of memory errors encountered

ECC errors: Number of errors corrected by ECC. Your chipset/memory doesn't have ECC, so this number doesn't matter.

If your memory has any number of errors above 0, you're going to want to replace it.

EDIT: The tests are the different patterns that memtest writes into memory. It writes different patterns into memory and reads them back to check for errors, and it uses different patterns to be able to test all the states of all the bits.

The count indicates the number of times that the result read back into memtest did not match what it wrote into memory, signifying that there is an error in the chunk of memory being tested.

ECC is an error correction technology built into memory modules for servers and workstations. Most desktops don't support memory modules with ECC built in. Almost all servers/workstations have support for it, and usually require it. The number of errors corrected by ECC is the number of errors that the ECC chip successfully fixed.


Number of Errors

When going through the tests, if the memory fails for any of the tests, it'll increment the number of errors. If I recall correctly, it counts the number of addresses that failed the test.

Number of ECC Errors

ECC memory is a special kind of memory chip that is used to keep data from getting corrupted. Your ECC Errs column counts how many problems were fixed by ECC.

(ECC is slow and expensive and is basically for mission-critical systems that can't be bothered to swap RAM out.)

Test Number

Memtest does different kinds of tests on your memory, which are described on the Memtest86 website. Just as a quick plain English translation:

Test 0: Walking Ones Address Test

Memtest will write 00000001 in the first memory location, 00000010 in the next, and so on, repeating this pattern every 8 bytes. Then it reads the memory and makes sure that the value didn't change. (Source)
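The data pattern described here is easy to generate. This is only a sketch of the pattern as described in this answer, not MemTest's code, and walking_ones is a made-up helper name.

```python
def walking_ones(count, width=8):
    """Return `count` values where a single 1 bit walks across `width` bits."""
    return [1 << (i % width) for i in range(count)]


for value in walking_ones(10):
    print(f"{value:08b}")
# 00000001
# 00000010
# 00000100
# 00001000
# 00010000
# 00100000
# 01000000
# 10000000
# 00000001
# 00000010
```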

Tests 1&2: Own Address Address Test

Memtest writes each memory location with its own address, and checks that the value didn't change.

Test 1 is sequential, and test 2 is parallel (i.e., uses concurrency).

Tests 3&4: Moving Inversions Test

In essence, this test loads 0s into memory, and then

  1. takes each location of memory (starting from the first/lowest location),
  2. and writes the inverse of the pattern (I would believe it's a bitwise NOT, but I couldn't find any documentation on that).

The goal here is to test every bit and its adjacent bits with "every possible combination of 0s and 1s".

Test 3 does not use concurrency, while test 4 does.
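Here is a rough Python sketch of the moving inversions idea as I understand it from the Memtest86 description: fill with a pattern, walk upward checking and writing the complement, then walk downward checking the complement and writing the pattern back. CoupledMemory is a made-up fault model (writing cell 2 disturbs bit 0 of cell 3), used only to show the kind of error this catches.

```python
MASK = 0xFFFFFFFF


def moving_inversions(memory, pattern):
    """Return the number of mismatches found by one moving-inversions run."""
    errors = 0
    inverse = ~pattern & MASK  # bitwise NOT, limited to 32 bits
    for addr in range(len(memory)):
        memory[addr] = pattern
    for addr in range(len(memory)):            # lowest address upward
        if memory[addr] != pattern:
            errors += 1
        memory[addr] = inverse
    for addr in reversed(range(len(memory))):  # highest address downward
        if memory[addr] != inverse:
            errors += 1
        memory[addr] = pattern
    return errors


class CoupledMemory(list):
    """Hypothetical fault: any write to cell 2 clears bit 0 of cell 3."""

    def __setitem__(self, addr, value):
        super().__setitem__(addr, value)
        if addr == 2:
            super().__setitem__(3, self[3] & ~1)


print(moving_inversions([0] * 8, 0x00000000))                 # 0
print(moving_inversions(CoupledMemory([0] * 8), 0x00000000))  # 0
print(moving_inversions(CoupledMemory([0] * 8), 0xFFFFFFFF))  # 1
```

In this model only the all-ones run exposes the fault, which is one reason the test is run with both ones and zeros.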

Test 5: Moving Inversions, 8-bit Pat

This does the moving inversions method again, but this time with the walking 1s from test 0 in 8-bit blocks.

Test 6: Moving Inversions, Random Pattern

Memtest uses random numbers instead of all 0s or walking 1s.

Test 7: Block move

This one's fun. It loads patterns into memory, moves them around in blocks of 4mb, and verifies them.

Test 8: Moving Inversion, 32-bit Pat

Same as test 5, but uses a 32-bit pattern instead. The pattern is shifted left one bit for each successive address and the starting position moves with each pass, so it takes 32 passes to use all the possible patterns.

Test 9: Random Numbers

This one loads pseudo-random numbers into memory and verifies. The cool thing about the pseudo-random number generator is that it's not very random (if you've ever run printf("%d", rand()); in a C program without seeding and gotten the oh-so-random 41, you know what I mean). So it verifies by resetting the random number seeder and running the generator again.
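The seed-reset trick looks roughly like this in Python. It uses Python's random module rather than whatever generator MemTest uses, and the function names and the seed value are arbitrary choices for the example.

```python
import random


def fill_random(memory, seed):
    """Fill memory with a pseudo-random sequence derived from `seed`."""
    rng = random.Random(seed)
    for addr in range(len(memory)):
        memory[addr] = rng.getrandbits(32)


def count_mismatches(memory, seed):
    """Regenerate the same sequence from `seed` and compare cell by cell."""
    rng = random.Random(seed)  # same seed, so the same sequence comes out
    return sum(1 for value in memory if value != rng.getrandbits(32))


memory = [0] * 64
fill_random(memory, seed=1234)
print(count_mismatches(memory, seed=1234))  # 0

memory[10] ^= 0x1  # simulate a single flipped bit
print(count_mismatches(memory, seed=1234))  # 1
```

The point is that no second copy of the data is needed for comparison: the generator itself is the reference.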

Test 10: Modulo-X

Every 20 locations, it writes a pattern (all 0s or all 1s) and writes the complement in all the other locations, then verifies.
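A rough Python sketch of the Modulo-20 idea follows. It is based on my reading of the Memtest86 description (repeat for each of the 20 possible offsets, and only check the locations that hold the pattern), so treat the details as assumptions. StuckBitMemory is a made-up fault model.

```python
MASK = 0xFFFFFFFF


def modulo_x_test(memory, pattern, step=20):
    """Return the number of mismatches found by a Modulo-X run."""
    errors = 0
    inverse = ~pattern & MASK
    for offset in range(step):
        # Write the pattern to every `step`-th location...
        for addr in range(offset, len(memory), step):
            memory[addr] = pattern
        # ...and the complement everywhere else.
        for addr in range(len(memory)):
            if addr % step != offset:
                memory[addr] = inverse
        # Only the locations holding the pattern are checked.
        for addr in range(offset, len(memory), step):
            if memory[addr] != pattern:
                errors += 1
    return errors


class StuckBitMemory(list):
    """Hypothetical fault: bit 3 of cell 5 is stuck at 0."""

    def __setitem__(self, addr, value):
        if addr == 5:
            value &= ~0b1000
        super().__setitem__(addr, value)


print(modulo_x_test([0] * 40, 0xFFFFFFFF))                  # 0
print(modulo_x_test(StuckBitMemory([0] * 40), 0xFFFFFFFF))  # 1
```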

Test 11: Bit fade Test

This one loads the RAM with all 1s (and again with all 0s), waits 5 minutes, and sees if any of the values change.