Message 82362 - Python tracker

Has any conclusion been reached wrt. the overhead of 30-bit multiplication
on 32-bit systems? IIUC, the single-digit multiplication is equivalent
to the C program

unsigned long long m(unsigned long long a, unsigned long b)
{
        return a*b;
}

(i.e. one digit is cast into two digits, and multiplied with the other
one). gcc 4.3.3, on x86, compiles this into

        movl    12(%esp), %eax
        movl    8(%esp), %ecx
        imull   %eax, %ecx
        mull    4(%esp)
        leal    (%ecx,%edx), %edx
        ret

In pseudo-code, this is

        tmp = high_a * b;
        high_res:low_res = low_a * b;
        high_res += tmp;

So it does use two multiply instructions (plus an add), since one
argument got cast into 64 bits.
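
For reference, the same decomposition can be written out in portable C. This is only a sketch: the name mul_64x32 and the use of the fixed-width types uint32_t/uint64_t are my own choices for illustration (the function above uses unsigned long long and unsigned long), and it mirrors the pseudo-code rather than anything a compiler is guaranteed to emit.

```c
#include <stdint.h>

/* Hypothetical helper: 64-bit by 32-bit multiply, truncated to 64 bits,
   spelled out step by step like the pseudo-code. */
uint64_t mul_64x32(uint64_t a, uint32_t b)
{
    uint32_t low_a  = (uint32_t)a;          /* low 32 bits of a */
    uint32_t high_a = (uint32_t)(a >> 32);  /* high 32 bits of a */

    /* 32x32->32 multiply (the imull); the result wraps modulo 2^32 */
    uint32_t tmp = high_a * b;

    /* 32x32->64 multiply (the mull); this one keeps the full product */
    uint64_t res = (uint64_t)low_a * b;

    /* add the partial product into the high half (the leal) */
    uint32_t high_res = (uint32_t)(res >> 32) + tmp;
    uint32_t low_res  = (uint32_t)res;

    return ((uint64_t)high_res << 32) | low_res;
}
```

The result should agree with a plain a*b on 64-bit operands for every input, since everything above bit 63 is discarded in both cases.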

VS2008 compiles it into

	push	eax
	push	ecx
	push	0
	push	edx
	call	__allmul

i.e. it widens both arguments to 64 bits, then calls a library routine.
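
To illustrate what widening both arguments costs, here is a sketch of what a generic 64x64->64 multiply has to compute on a 32-bit machine. The name mul_64x64 is made up for this example, and this is not the actual code of the VS2008 library routine, just the textbook decomposition.

```c
#include <stdint.h>

/* Hypothetical stand-in for a generic 64x64->64 multiply routine. */
uint64_t mul_64x64(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* full 64-bit product of the two low halves */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* cross terms: only their low 32 bits survive the shift by 32,
       so 32-bit wrapping arithmetic is enough here */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi;

    /* a_hi * b_hi would land entirely above bit 63 and is dropped */
    return low + ((uint64_t)cross << 32);
}
```

With b_hi known to be zero, the a_lo * b_hi term vanishes and this reduces to two multiplies and an add. A general-purpose routine only sees two 64-bit operands, so unless it tests for a zero high half at run time it pays for the extra multiply on top of the call overhead.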