How can I multiply two 64-bit numbers using x86 assembly language?

Use what should probably be your course textbook, Randall Hyde's "The Art of Assembly Language".

See 4.2.4 - Extended Precision Multiplication

Although an 8x8, 16x16, or 32x32 multiply is usually sufficient, there are times when you may want to multiply larger values together. You will use the x86 single operand MUL and IMUL instructions for extended precision multiplication ..

Probably the most important thing to remember when performing an extended precision multiplication is that you must also perform a multiple precision addition at the same time. Adding up all the partial products requires several additions that will produce the result. The following listing demonstrates the proper way to multiply two 64 bit values on a 32 bit processor ..

(See the link for full assembly listing and illustrations.)


This code assumes you want x86 (not x64 code), that you probably only want a 64 bit product, and that you don't care about overflow or signed numbers. (A signed version is similar).

MUL64_MEMORY:
     mov edi, val1high
     mov esi, val1low
     mov ecx, val2high
     mov ebx, val2low
MUL64_EDIESI_ECXEBX:
     mov eax, edi
     mul ebx
     xch eax, ebx  ; partial product top 32 bits
     mul esi
     xch esi, eax ; partial product lower 32 bits
     add ebx, edx
     mul ecx
     add ebx, eax  ; final upper 32 bits
; answer here in EBX:ESI

This doesn't honor the exact register constraints of OP, but the result fits entirely in the registers offered by the x86. (This code is untested, but I think it's right).

[Note: I transferred (my) this answer from another question that got closed, because NONE of the other "answers" here directly answered the question].


If this was 64x86,

function(x, y, *lower, *higher)
movq %rx,%rax     #Store x into %rax
mulq %y           #multiplies %y to %rax
#mulq stores high and low values into rax and rdx.
movq %rax,(%r8)   #Move low into &lower
movq %rdx,(%r9)   #Move high answer into &higher

Since you're on x86 you need 4 mull instructions. Split the 64bit quantities into two 32bit words and multiply the low words to the lowest and 2nd lowest word of the result, then both pairs of low and high word from different numbers (they go to the 2nd and 3rd lowest word of the result) and finally both high words into the 2 highest words of the result. Add them all together not forgetting to deal with carry. You didn't specify the memory layout of the inputs and outputs so it's impossible to write sample code.