Here is a simple C function:
long foo (unsigned a, unsigned b)
{
    return ((long)b<<32)|a;
}
Compile it with GCC targeting x86-64 with optimizations enabled (-O2, for example), and you get the following instructions (in AT&T-style assembly):
foo:
    movq %rsi, %rax
    mov %edi, %edi
    salq $32, %rax
    orq %rdi, %rax
    ret
Pay attention to the second instruction, ‘mov %edi, %edi’. Literally, it assigns the value of register edi to register edi. Five years ago, anybody would have agreed that this instruction does nothing, just like a nop. But on an x86-64 system, this is not the case.
In x86-64 assembly, any instruction with a 32-bit register as its destination also zeroes the upper 32 bits of the corresponding 64-bit register. Consequently, the effect of ‘mov %edi, %edi’ is to zero bits 32 to 63 of register rdi while leaving the lower 32 bits (i.e., register edi) unchanged.
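The zeroing rule can be checked from C. The following is only a sketch: zero_upper32 and foo_model are hypothetical names, not part of the original function. It assumes GCC-style inline assembly on x86-64 and falls back to a plain cast elsewhere. foo_model replays the listing on full 64-bit register values, on the assumption that the caller may leave arbitrary bits in the upper half of rdi when passing a 32-bit argument.

```c
#include <stdint.h>

/* Hypothetical helper: applies the effect of `mov %edi, %edi`
   to a 64-bit value. */
uint64_t zero_upper32(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* %k0 asks GCC for the 32-bit name of whichever register holds x,
       so this emits something like `movl %eax, %eax`. */
    __asm__ ("movl %k0, %k0" : "+r" (x));
    return x;
#else
    /* Portable model of the same effect: truncate, then zero-extend. */
    return (uint64_t)(uint32_t)x;
#endif
}

/* Hypothetical model of the listing for foo. rdi and rsi stand for the
   full 64-bit argument registers; keep_mov selects whether the
   `mov %edi, %edi` step is performed. */
uint64_t foo_model(uint64_t rdi, uint64_t rsi, int keep_mov)
{
    uint64_t rax = rsi;            /* movq %rsi, %rax */
    if (keep_mov)
        rdi = zero_upper32(rdi);   /* mov  %edi, %edi */
    rax <<= 32;                    /* salq $32, %rax  */
    rax |= rdi;                    /* orq  %rdi, %rax */
    return rax;
}
```

With rdi = 0xdeadbeef00000001 and rsi = 2, foo_model returns 0x0000000200000001 when the mov is kept and 0xdeadbeef00000001 when it is dropped: the stale upper half of rdi leaks into the result. The upper half of rsi needs no such treatment because the shift by 32 discards it.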
One might want to rewrite it with a more intuitive ‘and’ instruction:
andq $0xffffffff, %rdi
But this does NOT assemble! The reason is that $0x00000000ffffffff is not representable as a signed 32-bit immediate, and 64-bit immediates are currently allowed only in mov instructions whose destination is a general-purpose register (such a mov is usually written explicitly as movabsq). So if one must use ‘and’, one needs something like this:
movl $0xffffffff, %eax
andq %rax, %rdi
Remember the zeroing rule for operations on 32-bit registers: ‘movl $0xffffffff, %eax’ is equivalent to ‘movabsq $0xffffffff, %rax’...
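The two encoding constraints behind this can be written down as a small C sketch. The helper names fits_simm32 and fits_uimm32 are hypothetical. The first models an immediate that the processor sign-extends from 32 to 64 bits, as in the 64-bit ‘and’. The second models a value that a 32-bit mov can produce through the zeroing rule.

```c
#include <stdint.h>

/* Hypothetical helper: can v be encoded as a 32-bit immediate that is
   sign-extended to 64 bits? True for 0 .. 0x7fffffff and for values
   whose upper 33 bits are all ones. */
int fits_simm32(uint64_t v)
{
    return v <= 0x7fffffffULL || v >= 0xffffffff80000000ULL;
}

/* Hypothetical helper: can v be produced by a 32-bit mov, whose result
   is zero-extended to 64 bits? True when the upper 32 bits are zero. */
int fits_uimm32(uint64_t v)
{
    return v == (uint64_t)(uint32_t)v;
}
```

For the mask 0x00000000ffffffff, fits_simm32 is false, which matches the rejected andq, while fits_uimm32 is true, which matches the movl that works. The all-ones value 0xffffffffffffffff is the opposite case: it fits as a sign-extended immediate (it is -1) but cannot be produced by a 32-bit mov.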
X86-64 assembly really is too ugly, at least in this sense...