I am reading about x86-64 (and assembly in general) through the book "Computer Systems: A Programmer's Perspective" (3rd edition). The author, in agreement with other sources on the web, states that
idivq takes one operand only - just as this one claims. But then, some chapters later, the author gives an example with the instruction
idivq $9, %rcx.
Two operands? I first thought this was a mistake, but it happens a lot in the book from there on.
Also, the dividend should be given from the quantity in registers
%rdx (high-order 64 bits) and
%rax (low-order 64 bits) - so if this is defined in the architecture then it does not seem possible that the second operand could be a specified dividend.
Here is an example of an exercise (too lazy to write it all down - so a picture is the way to go). It claims that GCC emits
idivq $9, %rcx when compiling a short C function.
That's a mistake. Only
imul has immediate and 2-register forms.
mul, div, or idiv still only exist in the one-operand form introduced with 8086, using RDX:RAX as the implicit double-width operand for output (and input for division).
Or EDX:EAX, DX:AX, or AH:AL, depending on operand-size of course. Consult an ISA reference like Intel's manual, not this book! https://www.felixcloutier.com/x86/idiv
x86-64's only hardware division instructions are
idiv and
div. 64-bit mode removed
aam, which does 8-bit division by an immediate. ("Dividing in Assembler x86" and "Displaying Time in Assembly" have examples of using
aam in 16-bit mode).
Of course, for division by constants,
div and idiv (and
aam) are very inefficient. Use shifts for powers of 2, or a multiplicative inverse otherwise, unless you're optimizing for code-size instead of performance.
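To make the multiplicative-inverse idea concrete, here is a minimal C sketch for unsigned 32-bit division by 9. The function name `div9_u32` is hypothetical, and the unsigned 32-bit case is a simplification chosen for illustration (the book's example is 64-bit). The constant 954437177 is ceil(2^33 / 9), and 9 * 954437177 = 2^33 + 1, which is why the rounding error stays small enough for every 32-bit input.

```c
#include <stdint.h>

/* Hypothetical helper: computes x / 9 without a division instruction.
 * 954437177 == ceil(2^33 / 9). The 64-bit product cannot overflow
 * because x < 2^32 and the constant is below 2^30. */
uint32_t div9_u32(uint32_t x)
{
    uint64_t product = (uint64_t)x * UINT64_C(954437177);
    return (uint32_t)(product >> 33);   /* drop the low 33 fraction bits */
}
```

An optimizing compiler typically does this transformation on its own when it sees `x / 9`, so in C source you would normally just write the division.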
CS:APP apparently has multiple serious x86-64 instruction-set mistakes like this, claiming that GCC emits impossible instructions. These are not just typos or subtle mistakes, but misleading nonsense that's very obviously wrong to people familiar with the x86-64 instruction set. It's not just a syntax mistake; it's trying to use instructions that aren't encodable (no syntax can exist to express them, other than a macro that expands to multiple instructions. Defining
idivq as a pseudo-instruction using a macro would be pretty weird).
e.g. "I correctly guessed missing part of a function, but gcc generated assembly code doesn't match the answer" is another one, where the book suggests that
(%rbx, %rdi, %rsi) and
(%rsi, %rsi, 9) are valid addressing modes! The scale factor is actually a 2-bit shift count, so only 1, 2, 4, and 8 can be encoded, and it can never be a register. These are total garbage and a sign of a serious lack of knowledge by the authors about the ISA they're teaching, not a typo.
Their code won't assemble with any AT&T syntax assembler.
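As a rough illustration of why those operands can't exist, here is a small C model of how the address of an x86-64 memory operand is formed. The function name `effective_address` and its parameter names are hypothetical, and the model ignores segment bases and RIP-relative addressing. The only point is that the scale is stored as a 2-bit shift count, so the index can be multiplied only by 1, 2, 4, or 8.

```c
#include <stdint.h>

/* Hypothetical model of disp(base, index, scale):
 * address = base + (index << scale_bits) + sign-extended disp.
 * scale_bits is the 2-bit field, so only shifts 0..3 exist. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale_bits, int32_t disp)
{
    uint64_t scaled = index << (scale_bits & 3u);   /* x1, x2, x4 or x8 */
    return base + scaled + (uint64_t)(int64_t)disp; /* wraps like hardware */
}
```

Note that `(%rsi,%rsi,8)` is encodable and yields 9 * rsi (base plus index times 8), which is the usual `lea` trick for multiplying by 9; a scale of 9 itself is not.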
Also, "What does this x86-64 addq instruction mean, which only have one operand? (From CSAPP book 3rd Edition)" is another example, where they have a nonsensical
addq %eax instead of
inc %rdx, and a mismatched operand-size in a
It seems that they're just making stuff up and claiming it was emitted by GCC. IDK if they start with real GCC output and edit it into what they think is a better example, or actually write it by hand from scratch without testing it.
GCC's actual output would have used multiplication by a magic constant (fixed-point multiplicative inverse) to divide by 9 (even at
-O0, but this is clearly not debug-mode code. They could have used
Presumably they didn't want to talk about "Why does GCC use multiplication by a strange number in implementing integer division?" and replaced that block of code with their made-up instruction. From context you can probably figure out where they expect the output to go; perhaps they mean
rcx /= 9.
The rest of CSAPP would have to be pretty amazing to make up for these errors and make the textbook worth using. I'm not familiar with it at all, but I wonder if it used to be good in earlier editions but porting it to x86-64 introduced a lot of errors. (If older editions were maybe 32-bit x86).
How a hypothetical idiv reg, reg or idiv $imm, reg would be designed
Also, the dividend should be given from the quantity in registers %rdx (high-order 64 bits) and %rax (low-order 64 bits) - so if this is defined in the architecture then it does not seem possible that the second operand could be a specified dividend.
If Intel or AMD had introduced new, more convenient forms of
idiv, they would have designed them to use a single-width dividend, because that's how compilers always use it.
Most languages are like C and implicitly promote both operands for + - * / to the same type and produce a result of that width. Of course if the inputs are known to be narrow that can be optimized away. (e.g. using one
imul r32 to implement
a * (int64_t)b).
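A minimal C sketch of that widening case (the function name `widening_mul` is hypothetical): both inputs are known to be 32-bit, so the full 64-bit product can come from a single widening 32x32 => 64-bit multiply, even though C formally promotes both operands to `int64_t` first. Whether a given compiler actually picks that instruction depends on the target and options.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit inputs, 64-bit result.
 * The cast widens b, and the usual arithmetic conversions then
 * widen a as well, so the multiplication cannot overflow. */
int64_t widening_mul(int32_t a, int32_t b)
{
    return a * (int64_t)b;
}
```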
idiv faults if the quotient overflows, so it's not safe to use a single 32-bit
idiv when compiling
int32_t q = (int64_t)a / (int32_t)b.
Compilers always use
xor edx,edx before DIV or
cqo before IDIV to actually do n / n => n-bit division.
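The behaviour described above can be sketched as a small C model. It uses the 32-bit form (EDX:EAX divided by a 32-bit operand, with `cdq` as the 32-bit counterpart of `cqo`) only so that the double-width dividend fits in a standard `int64_t`; the 64-bit form with RDX:RAX works the same way. All names here (`idiv32_result`, `idiv32_model`, `cdq_then_idiv`) are hypothetical, and the `fault` flag stands in for the #DE exception.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: quot models EAX, rem models EDX,
 * fault models the divide-error exception. */
typedef struct {
    int32_t quot;
    int32_t rem;
    bool fault;
} idiv32_result;

/* Model of 32-bit idiv: the 64-bit dividend EDX:EAX is implicit. */
idiv32_result idiv32_model(int32_t edx, uint32_t eax, int32_t divisor)
{
    idiv32_result r = { 0, 0, true };
    /* Rebuild EDX:EAX as a signed 64-bit value; this cannot overflow. */
    int64_t dividend = (int64_t)edx * INT64_C(4294967296) + (int64_t)eax;

    if (divisor == 0)
        return r;                      /* divide by zero faults */
    if (dividend == INT64_MIN && divisor == -1)
        return r;                      /* would overflow (and is UB in C) */

    int64_t q = dividend / divisor;    /* truncates toward zero, like idiv */
    if (q < INT32_MIN || q > INT32_MAX)
        return r;                      /* quotient does not fit in EAX */

    r.quot = (int32_t)q;
    r.rem = (int32_t)(dividend % divisor);
    r.fault = false;
    return r;
}

/* The n / n => n-bit pattern: sign-extend EAX into EDX, then divide. */
idiv32_result cdq_then_idiv(int32_t a, int32_t b)
{
    int32_t edx = (a < 0) ? -1 : 0;    /* what cdq leaves in EDX */
    return idiv32_model(edx, (uint32_t)a, b);
}
```

With a sign-extended dividend the only overflowing case is `INT32_MIN / -1`; with an arbitrary upper half, such as `idiv32_model(1, 0, 1)`, the quotient can easily be too wide, which is why a compiler cannot blindly use the narrow instruction for a wide dividend.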
Real full-width division using a dividend that isn't just zero- or sign-extended is only done by hand with intrinsics or asm (because gcc/clang and other compilers don't know when the optimization is safe), or in gcc helper functions that do e.g. 64-bit / 64-bit division in 32-bit code. (Or 128-bit division in 64-bit code).
So what would be most helpful is a div/idiv that avoids the extra instruction to set up RDX, too, as well as minimizing the number of implicit register operands. (Like
imul r32, r/m32 and
imul r32, r/m32, imm do: making the common case of non-widening multiplication more convenient with no implicit registers. That's Intel-syntax like the manuals, destination first)
The simplest way would be a 2-operand instruction that did
dst /= src. Or maybe replaced both operands with quotient and remainder. Using a VEX encoding for 3 operands like BMI1
andn, you could maybe have
idivx remainder_dst, dividend, divisor. With the 2nd operand also an output for the quotient. Or you could have the remainder written to RDX with a non-destructive destination for the quotient.
Or more likely to optimize for the simple case where only the quotient is needed,
idivx quot, dividend, divisor and not store the remainder anywhere. You can always use regular
idiv when you want the remainder too.
mulx uses an implicit
rdx input operand because its purpose is to allow multiple dep chains of add-with-carry for extended-precision multiply. So it still has to produce 2 outputs. But this hypothetical new form of
idiv would exist to save code-size and uops around normal uses of
idiv that aren't widening. So 186 or 286
imul reg, reg/mem is the point of comparison, not BMI2 mulx.
IDK if it would make sense to introduce an immediate form of
idivx as well; you'd only use it for code-size reasons. Multiplicative inverses are a more efficient way to divide by constants, so there's very little real-world use case for such an instruction.
I think your book has made a mistake.
idivq only has one operand. If I try to assemble this snippet:
idivq $9, %rcx
I get this error:
test.s: Assembler messages:
test.s:1: Error: operand type mismatch for `idiv'
but you probably already know that.
It may also be a macro (unlikely, but possible; credit to @HansPassant for this).
Perhaps you should contact the book's author so that they can add an entry to the errata.
Interestingly, gas seems to allow the following:
mov $20, %rax
mov $0, %rdx
mov $5, %rcx
idivq %rcx, %rax
ret
This still performs the one-operand division under the hood (RDX:RAX divided by %rcx); it only LOOKS like a two-operand form. As long as the first operand is a register and the second operand is specifically %rax, this works. However, in general idivq seems to require the one-operand form.