Do modern processors have redundancy in their logic units to compensate for production faults?

Not in the logic.

However, if a chip contains big memories (SRAM), it is common to use a memory with 'redundancy'. These memories have special logic which can be programmed to replace a faulty area, often a number of rows or columns, with spare ones.

The failing area is detected during testing and then the redundant memory is programmed to replace the faulty location(s).

However, this 'replacement' must be set up using OTP (One-Time-Programmable) bits or some other non-volatile memory which holds its value. Thus these memories are only used in chips which already have such a 'permanent memory' feature; otherwise such a programming feature must be added as well, with all the costs this incurs.
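To make the idea concrete, here is a minimal sketch of row redundancy in Python. It is a toy software model, not how any particular memory compiler implements it; the class name `RepairableSram`, the `fuses` table and the method names are all made up for illustration.

```python
class RepairableSram:
    """Toy model of an SRAM with spare rows and a one-time-programmable remap table."""

    def __init__(self, rows, spare_rows):
        self.rows = rows
        self.spare_rows = spare_rows
        # Physical storage: the regular rows followed by the spare rows.
        self.cells = [0] * (rows + spare_rows)
        # Hypothetical fuse table: faulty row -> physical spare row.
        # It stands in for the OTP bits programmed during production test.
        self.fuses = {}

    def program_repair(self, faulty_row):
        """Redirect faulty_row to the next unused spare row (can only be done once per row)."""
        if not 0 <= faulty_row < self.rows:
            raise IndexError(faulty_row)
        if faulty_row in self.fuses:
            raise ValueError("row already repaired; the fuses are one-time-programmable")
        if len(self.fuses) >= self.spare_rows:
            raise RuntimeError("no spare rows left; this die cannot be repaired")
        self.fuses[faulty_row] = self.rows + len(self.fuses)

    def _physical(self, row):
        if not 0 <= row < self.rows:
            raise IndexError(row)
        # Repaired rows are transparently redirected to their spare row.
        return self.fuses.get(row, row)

    def write(self, row, value):
        self.cells[self._physical(row)] = value

    def read(self, row):
        return self.cells[self._physical(row)]


sram = RepairableSram(rows=8, spare_rows=2)
sram.program_repair(3)   # production test found row 3 to be faulty
sram.write(3, 0xAB)      # the user still addresses "row 3"
```

After the repair, accesses to row 3 land in the first spare row (physical index 8 in this model), and the rest of the chip never notices. Once both spare rows are used up, a further failing row means the die has to be scrapped.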


This is certainly not the case for simple MCUs or typical single-core processors. The cost of having spare blocks would not be worth it: those processors don't use cutting-edge fabrication processes and don't require huge silicon areas, so the yield is good enough.

However, this is done for some multi-core processors, which have a rather large silicon area and use finer fabrication processes that can lead to higher defect rates. On these processors, entire cores (which are rather big logic blocks, containing much more than an ALU) can be disabled when they are defective. The processor is then sold as a lower-end model.
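As a rough illustration of that binning step, here is a small Python sketch. The model names and core counts in `MODELS` are invented for the example, and real binning also considers clock speed, power and cache defects, which are ignored here.

```python
# Hypothetical product line: model name -> number of working cores it needs.
MODELS = {"8-core": 8, "6-core": 6, "4-core": 4}


def bin_die(core_test_results, models=MODELS):
    """Pick the highest model this die can be sold as.

    core_test_results is a list of booleans, one per physical core,
    True meaning the core passed production test.
    Returns (model_name, enabled_core_indices), or None if the die is scrap.
    """
    good = [i for i, ok in enumerate(core_test_results) if ok]
    # Try the models from the largest core count down to the smallest.
    for name, needed in sorted(models.items(), key=lambda kv: kv[1], reverse=True):
        if len(good) >= needed:
            # Enable exactly as many good cores as the model needs; the
            # remaining cores (defective or surplus) stay disabled.
            return name, good[:needed]
    return None


# A die with two defective cores (indices 2 and 6) is sold as the 6-core model.
result = bin_die([True, True, False, True, True, True, False, True])
```

Note that a die with five good cores would end up as the 4-core model in this sketch, with one perfectly working core left disabled.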

Source: https://skeptics.stackexchange.com/questions/15704/are-low-spec-computer-parts-just-faulty-high-spec-computer-parts


As others have said, it is difficult to imagine redundant ALU logic within a core.

A core is designed to optimize throughput. Any additional logic for a redundant ALU would hurt performance, and the increased area would slow down the whole core. As process technology evolved, transistors became smaller, making cores faster while essentially using the same intellectual property. Why have redundant ALUs when space is available for redundant cores to increase production yields?

In 2011, Intel filed a patent describing a processor with at least 32 cores, 16 active and 16 spare. The patent states that failing cores would have higher temperatures, allowing a spare core to be switched in. Essentially, this is dynamic core allocation as required.

You could have high-power and low-power cores allocated as required by tasks, switch out a bad core detected by its higher temperature, or operate the cores in a checkerboard pattern to reduce heat.
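The temperature-based swap could look something like the Python sketch below. To be clear, this is not the algorithm from the patent; the function name `rebalance`, the fixed temperature limit and the "coolest spare first" policy are assumptions made purely for illustration.

```python
def rebalance(active, spares, temps, limit):
    """Replace active cores running hotter than `limit` with the coolest spares.

    active, spares: lists of core ids
    temps: dict mapping core id -> current temperature
    Returns the new (active, spares) lists. Retired cores are dropped
    for good, they do not go back into the spare pool.
    """
    active = list(active)
    # Coolest spare first, so the best candidate is always at index 0.
    spares = sorted(spares, key=lambda core: temps[core])
    for i, core in enumerate(active):
        if temps[core] <= limit:
            continue  # core looks healthy, leave it alone
        # Only swap if there is a spare and it is itself within the limit.
        if spares and temps[spares[0]] <= limit:
            active[i] = spares.pop(0)
    return active, spares


# Assumed readings: core 1 runs suspiciously hot and is swapped for spare 3.
temps = {0: 60, 1: 95, 2: 62, 3: 55, 4: 58}
new_active, new_spares = rebalance([0, 1, 2], [3, 4], temps, limit=90)
```

In a real chip this decision would be taken by firmware or a power management unit with access to the on-die thermal sensors, and it would have to migrate the running state of the retired core as well.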

Intel Patent: Enhancing Reliability of a Many-Core Processor