Is this matrix-vector multiplication function in VHDL parallelized?

In 'hardware' (synthesized VHDL or Verilog), all loops are unrolled and executed in parallel.

Thus not only your inner loop but also your outer loop is unrolled.

That is also why the loop bounds must be known at compile (synthesis) time. When the loop length is unknown, the synthesis tool will complain.
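To make the unrolling concrete, here is a minimal sketch of a combinational matrix-vector product with constant loop bounds. The package name matvec_pkg, the size N = 4, the 8-bit element width, and all type and port names are assumptions chosen for illustration; they are not taken from the question's code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package matvec_pkg is
  -- Hypothetical size, fixed at synthesis time.
  constant N : natural := 4;
  type vec_t is array (0 to N-1) of signed(7 downto 0);
  type mat_t is array (0 to N-1) of vec_t;
  -- 16-bit products, plus 2 guard bits for summing N = 4 of them.
  type res_t is array (0 to N-1) of signed(17 downto 0);
end package;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.matvec_pkg.all;

entity matvec is
  port (
    m : in  mat_t;
    v : in  vec_t;
    r : out res_t
  );
end entity;

architecture comb of matvec is
begin
  process (m, v)
    variable acc : signed(17 downto 0);
  begin
    -- Outer loop: unrolled into N independent dot products.
    for i in 0 to N-1 loop
      acc := (others => '0');
      -- Inner loop: unrolled into N multipliers feeding an adder chain.
      for j in 0 to N-1 loop
        acc := acc + resize(m(i)(j) * v(j), acc'length);
      end loop;
      r(i) <= acc;
    end loop;
  end process;
end architecture;
```

With these assumed sizes, the tool would have to build N*N = 16 multipliers plus the adders, all evaluated in the same combinational path. Both loop bounds are constants, so the amount of hardware is fixed before synthesis starts.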


It is a well-known trap for beginners coming from a software language. They try to convert:

    int a, b, c;
    c = 0;
    while (a--)
        c += b;

into VHDL/Verilog hardware. The problem is that it all works fine in simulation, but the synthesis tool needs to generate adders: c = b + b + b + ... + b;

For that, the tool needs to know how many adders to make. If a is a constant, fine! (Even if it is 4,000,000: it will run out of gates, but it will try!)

But if a is a variable, the synthesis tool is lost.
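As a rough sketch of the constant case, the repeated addition could be written in VHDL as below. The entity name repeat_add, the generic COUNT (standing in for a), and the bit widths are hypothetical and only serve the illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity repeat_add is
  generic (
    -- Hypothetical constant trip count, known at synthesis time.
    COUNT : natural := 4
  );
  port (
    b : in  unsigned(7 downto 0);
    c : out unsigned(15 downto 0)
  );
end entity;

architecture comb of repeat_add is
begin
  process (b)
    variable sum : unsigned(15 downto 0);
  begin
    sum := (others => '0');
    -- Unrolled into a chain of COUNT adders: b + b + ... + b.
    for i in 1 to COUNT loop
      sum := sum + b;  -- 16-bit + 8-bit operand gives a 16-bit result
    end loop;
    c <= sum;
  end process;
end architecture;
```

Because COUNT is a generic, the tool knows exactly how many adders to build. If the loop bound came from an input port instead, the number of adders would depend on a run-time value, which is the case the synthesis tool cannot handle. Note that the 16-bit sum wraps around if COUNT times the largest b no longer fits.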

Tags: Matrix, Fpga, Vhdl