Has any conclusion been reached wrt. overhead of 30-bit multiplication
on 32-bit systems? IIUC, the single-digit multiplication is equivalent
to the C program

    unsigned long long m(unsigned long long a, unsigned long b)
    {
        return a*b;
    }

(i.e. one digit is cast into two digits, and multiplied with the other
one). gcc 4.3.3, on x86, compiles this into

    movl    12(%esp), %eax
    movl    8(%esp), %ecx
    imull   %eax, %ecx
    mull    4(%esp)
    leal    (%ecx,%edx), %edx
    ret

In pseudo-code, this is

    tmp = high_a * b;
    high_res:low_res = low_a * b;
    high_res += tmp;

So it does use two multiply instructions (plus an add), since one
argument got cast into 64 bits.
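
For anyone who wants to check the pseudo-code, here is a minimal C sketch of the same three steps. The function name mul_64x32_by_halves and the <stdint.h> types are my own choices for illustration; this is not what gcc emits, just the same arithmetic spelled out on 32-bit halves.

```c
#include <stdint.h>

/* Hypothetical helper: computes a*b (mod 2^64) from 32-bit halves,
 * following the three pseudo-code steps. */
uint64_t mul_64x32_by_halves(uint64_t a, uint32_t b)
{
    uint32_t low_a  = (uint32_t)a;
    uint32_t high_a = (uint32_t)(a >> 32);

    /* tmp = high_a * b; only the low 32 bits are kept (like imull) */
    uint32_t tmp = high_a * b;

    /* high_res:low_res = low_a * b; full 32x32 -> 64 product (like mull) */
    uint64_t wide = (uint64_t)low_a * b;
    uint32_t low_res  = (uint32_t)wide;
    uint32_t high_res = (uint32_t)(wide >> 32);

    /* high_res += tmp; wraps modulo 2^32 */
    high_res += tmp;

    return ((uint64_t)high_res << 32) | low_res;
}
```

This should agree with a*b for all inputs, since the discarded upper half of high_a * b would only affect bits above bit 63 of the result.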
VS2008 compiles it into

    push    eax
    push    ecx
    push    0
    push    edx
    call    __allmul

i.e. it widens both arguments to 64 bits, then calls a library routine.