Issue 22656 - SSE2 instructions have inconsistent layouts in the disassembler output
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: dmd
Version: D2
Hardware: x86_64 Linux
Importance: P1 minor
Assignee: No Owner
URL:
Keywords: disassembler, pull
Depends on:
Blocks:
 
Reported: 2022-01-08 02:42 UTC by mhh
Modified: 2022-01-13 02:31 UTC

Description mhh 2022-01-08 02:42:28 UTC
I have listed this as affecting SSE2 instructions; however, more instructions may be affected (the cause could be the prefix in general rather than these particular instructions).

```
extern(C)
void blah()
{
    asm {
        naked;
        xor EAX, EAX;
        jmp [RIP];
        cvttpd2dq XMM0, XMM1;
        movq [RAX], XMM0;
    }
}
```
Yields:
```
blah:
0000:   31 C0                   xor     EAX,EAX
0002:   48 FF 25 00 00 00 00    jmp     qword ptr [00h][RIP]
0009:66         0F E6 C1                cvttpd2dq       XMM0,XMM1
000d:66         0F D6 00                movq    [RAX],XMM0
```

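The last two instructions are not displayed properly: the 66 prefix byte is printed directly after the address, where the other lines have padding, and the remaining opcode bytes and the mnemonic are pushed out of their usual columns.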
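
For illustration, here is a minimal sketch of a layout in which prefix bytes go through the same padded hex column as every other byte, so the mnemonic column cannot shift. This is not dmd's disassembler code; `listingLine` and the column widths (24 and 12) are assumptions made up for this example.

```d
import std.algorithm : map;
import std.array : join;
import std.format : format;

/// Hypothetical helper (not taken from dmd): renders one listing line with
/// prefix bytes and opcode bytes sharing a single left-justified hex column.
string listingLine(uint address, const(ubyte)[] bytes, string mnemonic, string operands)
{
    // Every byte, prefixes included, is formatted the same way: "66 0F E6 C1".
    string hex = bytes.map!(b => format("%02X", b)).join(" ");
    // Fixed-width columns keep the mnemonic at the same offset on every line.
    return format("%04x:   %-24s%-12s%s", address, hex, mnemonic, operands);
}

void main()
{
    import std.stdio : writeln;

    immutable ubyte[] xorBytes = [0x31, 0xC0];
    immutable ubyte[] cvtBytes = [0x66, 0x0F, 0xE6, 0xC1];
    writeln(listingLine(0x0, xorBytes, "xor", "EAX,EAX"));
    writeln(listingLine(0x9, cvtBytes, "cvttpd2dq", "XMM0,XMM1"));
}
```

With this layout both lines start their mnemonic in the same column, whether or not the encoding begins with a 66 byte.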

From llvm's objdump we get:
```
0000000000000000 blah:
       0: 31 c0                         xor     eax, eax
       2: 48 ff 25 00 00 00 00          jmp     qword ptr [rip]
       9: 66 0f e6 c1                   cvttpd2dq       xmm0, xmm1
       d: 66 0f d6 00                   movq    qword ptr [rax], xmm0
      11: 00 00                         add     byte ptr [rax], al
      13: 00                            <unknown>
```

From GNU:
```
0000000000000000 <blah>:
   0:   31 c0                   xor    eax,eax
   2:   48 ff 25 00 00 00 00    rex.W jmp QWORD PTR [rip+0x0]        # 9 <blah+0x9>
   9:   66 0f e6 c1             cvttpd2dq xmm0,xmm1
   d:   66 0f d6 00             movq   QWORD PTR [rax],xmm0
  11:   00 00                   add    BYTE PTR [rax],al
```
Comment 1 Brian Callahan 2022-01-08 03:42:38 UTC
I think it may be the prefix; I can trigger this issue reliably with plain ol' lea and call instructions using code from my Intel 8080 assembler. As in your case, it's the 66 prefix that causes the problem:

Code in question:
```
import std.stdio;
import std.range;

string label;
string argument1, argument2;

void title()
{
    if ((label.empty && !argument1.empty && !argument2.empty) == false)
        stderr.writeln("argument check failed");
}
```

Relevant -vasm output:
```
_D4vasm5titleFZv:
0000:   55                      push    RBP
0001:   48 8B EC                mov     RBP,RSP
0004:66         48 8D 3D FC FF FF FF    lea     RDI,[0FFFFFFFCh][RIP]
000c:66 66      48 E8 00 00 00 00       call      L0
0014:   48 89 C7                mov     RDI,RAX
0017:   E8 00 00 00 00          call      L0
001c:   84 C0                   test    AL,AL
001e:   74 38                   je      L58
0020:66         48 8D 3D FC FF FF FF    lea     RDI,[0FFFFFFFCh][RIP]
0028:66 66      48 E8 00 00 00 00       call      L0
0030:   48 89 C7                mov     RDI,RAX
0033:   E8 00 00 00 00          call      L0
0038:   34 01                   xor     AL,1
003a:   74 1C                   je      L58
003c:66         48 8D 3D FC FF FF FF    lea     RDI,[0FFFFFFFCh][RIP]
0044:66 66      48 E8 00 00 00 00       call      L0
004c:   48 89 C7                mov     RDI,RAX
004f:   E8 00 00 00 00          call      L0
0054:   34 01                   xor     AL,1
0056:   75 04                   jne     L5c
0058:   31 C0                   xor     EAX,EAX
005a:   EB 05                   jmp short       L61
005c:   B8 01 00 00 00          mov     EAX,1
0061:   A8 01                   test    AL,1
0063:   75 19                   jne     L7e
0065:   E8 00 00 00 00          call      L0
006a:   48 8D 15 FC FF FF FF    lea     RDX,[0FFFFFFFCh][RIP]
0071:   BE 15 00 00 00          mov     EAX,015h
0076:   48 89 C7                mov     RDI,RAX
0079:   E8 00 00 00 00          call      L0
007e:   5D                      pop     RBP
007f:   C3                      ret
```
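
Every misaligned line above begins with one or more 66 bytes. As a rough sketch of how those leading bytes can be told apart from the rest of the encoding, the snippet below splits a byte sequence into its legacy prefixes and everything after them. `Split` and `splitPrefixes` are hypothetical names for this example and are not part of dmd; the prefix list covers the standard x86 legacy prefix bytes.

```d
/// Hypothetical result type for illustration; not dmd's disassembler code.
struct Split
{
    const(ubyte)[] prefixes; // leading legacy prefix bytes, e.g. 66 66
    const(ubyte)[] rest;     // REX byte (if any), opcode, and operand bytes
}

/// Separates leading legacy prefix bytes from the remainder of an encoding.
Split splitPrefixes(const(ubyte)[] code)
{
    size_t n = 0;
    while (n < code.length)
    {
        switch (code[n])
        {
            // Operand-size, address-size, lock, repne/rep, segment overrides.
            case 0x66, 0x67, 0xF0, 0xF2, 0xF3,
                 0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65:
                ++n;
                continue; // next iteration of the while loop
            default:
                break;    // leaves the switch only
        }
        break;            // first non-prefix byte: leave the while loop
    }
    return Split(code[0 .. n], code[n .. $]);
}

void main()
{
    import std.stdio : writefln;

    // The call encoding from offset 000c in the listing above.
    const(ubyte)[] callBytes = [0x66, 0x66, 0x48, 0xE8, 0x00, 0x00, 0x00, 0x00];
    auto s = splitPrefixes(callBytes);
    writefln("%s prefix byte(s), %s remaining", s.prefixes.length, s.rest.length);
}
```

The REX byte (48 here) is deliberately left in `rest`, since it is not a legacy prefix and the listing already prints it together with the opcode bytes.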

Here's GNU objdump for comparison:
```
0000000000000000 <_D4vasm5titleFZv>:
   0:   55                      push   %rbp
   1:   48 8b ec                mov    %rsp,%rbp
   4:   66 48 8d 3d 00 00 00    data16 lea 0x0(%rip),%rdi        # c <_D4vasm5titleFZv+0xc>
   b:   00
   c:   66 66 48 e8 00 00 00    data16 data16 rex.W call 14 <_D4vasm5titleFZv+0x14>
  13:   00
  14:   48 89 c7                mov    %rax,%rdi
  17:   e8 00 00 00 00          call   1c <_D4vasm5titleFZv+0x1c>
  1c:   84 c0                   test   %al,%al
  1e:   74 38                   je     58 <_D4vasm5titleFZv+0x58>
  20:   66 48 8d 3d 00 00 00    data16 lea 0x0(%rip),%rdi        # 28 <_D4vasm5titleFZv+0x28>
  27:   00
  28:   66 66 48 e8 00 00 00    data16 data16 rex.W call 30 <_D4vasm5titleFZv+0x30>
  2f:   00
  30:   48 89 c7                mov    %rax,%rdi
  33:   e8 00 00 00 00          call   38 <_D4vasm5titleFZv+0x38>
  38:   34 01                   xor    $0x1,%al
  3a:   74 1c                   je     58 <_D4vasm5titleFZv+0x58>
  3c:   66 48 8d 3d 00 00 00    data16 lea 0x0(%rip),%rdi        # 44 <_D4vasm5titleFZv+0x44>
  43:   00
  44:   66 66 48 e8 00 00 00    data16 data16 rex.W call 4c <_D4vasm5titleFZv+0x4c>
  4b:   00
  4c:   48 89 c7                mov    %rax,%rdi
  4f:   e8 00 00 00 00          call   54 <_D4vasm5titleFZv+0x54>
  54:   34 01                   xor    $0x1,%al
  56:   75 04                   jne    5c <_D4vasm5titleFZv+0x5c>
  58:   31 c0                   xor    %eax,%eax
  5a:   eb 05                   jmp    61 <_D4vasm5titleFZv+0x61>
  5c:   b8 01 00 00 00          mov    $0x1,%eax
  61:   a8 01                   test   $0x1,%al
  63:   75 19                   jne    7e <_D4vasm5titleFZv+0x7e>
  65:   e8 00 00 00 00          call   6a <_D4vasm5titleFZv+0x6a>
  6a:   48 8d 15 00 00 00 00    lea    0x0(%rip),%rdx        # 71 <_D4vasm5titleFZv+0x71>
  71:   be 15 00 00 00          mov    $0x15,%esi
  76:   48 89 c7                mov    %rax,%rdi
  79:   e8 00 00 00 00          call   7e <_D4vasm5titleFZv+0x7e>
  7e:   5d                      pop    %rbp
  7f:   c3                      ret
```

llvm-objdump looks a bit nicer:
```
0000000000000000 <_D4vasm5titleFZv>:
       0: 55                            pushq   %rbp
       1: 48 8b ec                      movq    %rsp, %rbp
       4: 66 48 8d 3d 00 00 00 00       leaq    (%rip), %rdi            # 0xc <_D4vasm5titleFZv+0xc>
       c: 66 66 48 e8 00 00 00 00       callq   0x14 <_D4vasm5titleFZv+0x14>
      14: 48 89 c7                      movq    %rax, %rdi
      17: e8 00 00 00 00                callq   0x1c <_D4vasm5titleFZv+0x1c>
      1c: 84 c0                         testb   %al, %al
      1e: 74 38                         je      0x58 <_D4vasm5titleFZv+0x58>
      20: 66 48 8d 3d 00 00 00 00       leaq    (%rip), %rdi            # 0x28 <_D4vasm5titleFZv+0x28>
      28: 66 66 48 e8 00 00 00 00       callq   0x30 <_D4vasm5titleFZv+0x30>
      30: 48 89 c7                      movq    %rax, %rdi
      33: e8 00 00 00 00                callq   0x38 <_D4vasm5titleFZv+0x38>
      38: 34 01                         xorb    $1, %al
      3a: 74 1c                         je      0x58 <_D4vasm5titleFZv+0x58>
      3c: 66 48 8d 3d 00 00 00 00       leaq    (%rip), %rdi            # 0x44 <_D4vasm5titleFZv+0x44>
      44: 66 66 48 e8 00 00 00 00       callq   0x4c <_D4vasm5titleFZv+0x4c>
      4c: 48 89 c7                      movq    %rax, %rdi
      4f: e8 00 00 00 00                callq   0x54 <_D4vasm5titleFZv+0x54>
      54: 34 01                         xorb    $1, %al
      56: 75 04                         jne     0x5c <_D4vasm5titleFZv+0x5c>
      58: 31 c0                         xorl    %eax, %eax
      5a: eb 05                         jmp     0x61 <_D4vasm5titleFZv+0x61>
      5c: b8 01 00 00 00                movl    $1, %eax
      61: a8 01                         testb   $1, %al
      63: 75 19                         jne     0x7e <_D4vasm5titleFZv+0x7e>
      65: e8 00 00 00 00                callq   0x6a <_D4vasm5titleFZv+0x6a>
      6a: 48 8d 15 00 00 00 00          leaq    (%rip), %rdx            # 0x71 <_D4vasm5titleFZv+0x71>
      71: be 15 00 00 00                movl    $21, %esi
      76: 48 89 c7                      movq    %rax, %rdi
      79: e8 00 00 00 00                callq   0x7e <_D4vasm5titleFZv+0x7e>
      7e: 5d                            popq    %rbp
      7f: c3                            retq
```
Comment 2 Dlang Bot 2022-01-11 07:23:22 UTC
@WalterBright created dlang/dmd pull request #13510 "fix Issue 22656 - SSE2 instructions have inconsistent layouts in the …" fixing this issue:

- fix Issue 22656 - SSE2 instructions have inconsistent layouts in the disassembler output

https://github.com/dlang/dmd/pull/13510
Comment 3 Dlang Bot 2022-01-13 02:31:26 UTC
dlang/dmd pull request #13510 "fix Issue 22656 - SSE2 instructions have inconsistent layouts in the …" was merged into master:

- a0809bbc4642fe679bee83d1f3d8bfe0ae3691e0 by Walter Bright:
  fix Issue 22656 - SSE2 instructions have inconsistent layouts in the disassembler output

https://github.com/dlang/dmd/pull/13510