D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 17523 - Sporadic ICEs with inline asm
Summary: Sporadic ICEs with inline asm
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D2
Hardware: x86_64 Windows
: P1 regression
Assignee: No Owner
URL:
Keywords: ice
Depends on:
Blocks:
 
Reported: 2017-06-19 11:17 UTC by Vladimir Panteleev
Modified: 2017-07-03 21:01 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Vladimir Panteleev 2017-06-19 11:17:19 UTC
With a 64-bit dmd.exe (as built using win64.mak), some inline asm instructions can randomly cause an ICE. For example, consider the given file:

////// test.d /////
void fun()
{
    asm
    {   
        fstp ST(0);
    }
}
///////////////////

Compiling the file as usual (dmd.exe -c test.d) can result in any of the following results:

63% chance: the file compiles successfully
25% chance: "Internal error: ddmd\backend\cod3.c 6869"
6% chance: "FLunde  Internal error: ddmd\backend\cod3.c 5527"
6% chance: "FLunde  Internal error: ddmd\backend\cod3.c 6757"

This affects building Druntime and Phobos - building them will likely fail with one of the above errors, or (less probably) succeed.

Introduced in https://github.com/dlang/dmd/pull/6379
Comment 1 uplink.coder 2017-06-20 15:55:24 UTC
My advice revert the PR.
It's not like iasm is a bottleneck in most cases.
Comment 2 Walter Bright 2017-06-21 00:39:51 UTC
Cannot duplicate with the Win32 build of DMD.
Comment 3 uplink.coder 2017-06-21 00:50:27 UTC
(In reply to Walter Bright from comment #2)
> Cannot duplicate with the Win32 build of DMD.

Some iasm test. It fails on my linux system sometimes.
There is definitely something wrong.
Maybe try compiling the code about 100-200 times ?
Comment 4 Vladimir Panteleev 2017-06-21 01:02:31 UTC
(In reply to Walter Bright from comment #2)
> Cannot duplicate with the Win32 build of DMD.

It is not reproduced with a 32-bit dmd.exe; you must make a Win64 build as the issue description says.
Comment 5 Walter Bright 2017-06-21 01:33:13 UTC
25% chance: "Internal error: ddmd\backend\cod3.c 6869"
6% chance: "FLunde  Internal error: ddmd\backend\cod3.c 5527"
6% chance: "FLunde  Internal error: ddmd\backend\cod3.c 6757"

These lines are nowhere near asserts in the current HEAD. Can you try with HEAD?
Comment 6 Walter Bright 2017-06-21 01:36:35 UTC
(In reply to uplink.coder from comment #3)
> Some iasm test. It fails on my linux system sometimes.
> There is definitely something wrong.
> Maybe try compiling the code about 100-200 times ?

I tried a 64 bit dmd build on Linux, several thousand times. No errors.
Comment 7 Walter Bright 2017-06-21 01:38:10 UTC
Ran it under valgrind on Linux. No errors.
Comment 8 Vladimir Panteleev 2017-06-21 01:49:31 UTC
(In reply to Walter Bright from comment #5)
> 25% chance: "Internal error: ddmd\backend\cod3.c 6869"
> 6% chance: "FLunde  Internal error: ddmd\backend\cod3.c 5527"
> 6% chance: "FLunde  Internal error: ddmd\backend\cod3.c 6757"
> 
> These lines are nowhere near asserts in the current HEAD. Can you try with
> HEAD?

At the time I couldn't due to issue 17522.

Here are the line numbers with HEAD:

FLunde  Internal error: ddmd\backend\cod3.c 5500
FLunde  Internal error: ddmd\backend\cod3.c 6730
Internal error: ddmd\backend\cod3.c 6842
Comment 9 Walter Bright 2017-06-21 01:50:52 UTC
Also tried memset'ing the stack variables opnd1 .. opnd4 to 0xFF at the close of asmSemantic(). No errors.

There is no host dependent code in iasm.d. So this is a bit mystifying.
Comment 10 Walter Bright 2017-06-21 02:02:11 UTC
(In reply to Vladimir Panteleev from comment #8)
> At the time I couldn't due to issue 17522.

17522 has been fixed.

> Here are the line numbers with HEAD:
> 
> FLunde  Internal error: ddmd\backend\cod3.c 5500
> FLunde  Internal error: ddmd\backend\cod3.c 6730
> Internal error: ddmd\backend\cod3.c 6842

These suggest a bad opcode in code.Iop.

At the end of asmSemantic(), I suggest inserting:

    printf("asmcode: %02x\n", s.asmcode.Iop);

and at the beginning of pinholeopt() before L1:

    c->print();
Comment 11 Vladimir Panteleev 2017-06-21 02:32:39 UTC
Here is the output with the changes you requested from 10 runs:

asmcode: e0
code 0000019B245D80B8: nxt=0000000000000000 op=0xE0 flg=40 FLunde  
code 0000019B245D80F0: nxt=0000000000000000 op=0xC3
code 0000019B245D80F0: nxt=0000000000000000 op=0xC3
Internal error: ddmd\backend\cod3.c 6843
=== Exited with status 1 ===

asmcode: 40
code 00000277E13B8EE8: nxt=0000000000000000 op=0x40 flg=40
code 00000277E13B8F20: nxt=0000000000000000 op=0xC3
code 00000277E13B8F20: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: d0
code 000001E109C46C18: nxt=0000000000000000 op=0xD0 flg=40 rm=0xD8=3,3,0
code 000001E109C46C50: nxt=0000000000000000 op=0xC3
code 000001E109C46C50: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: 60
code 000002285BE98B38: nxt=0000000000000000 op=0x60 flg=40
code 000002285BE98B70: nxt=0000000000000000 op=0xC3
code 000002285BE98B70: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: c0
code 00000244E18AC278: nxt=0000000000000000 op=0xC0 flg=40 rm=0xD8=3,3,0 FLunde  
code 00000244E18AC2B0: nxt=0000000000000000 op=0xC3
code 00000244E18AC2B0: nxt=0000000000000000 op=0xC3
Internal error: ddmd\backend\cod3.c 6843
=== Exited with status 1 ===

asmcode: 70
code 000002295B099D28: nxt=0000000000000000 op=0x70 flg=40 FLunde  
FLunde  Internal error: ddmd\backend\cod3.c 5501
=== Exited with status 1 ===

asmcode: 80
code 0000022F4FE57D38: nxt=0000000000000000 op=0x80 flg=40 rm=0xD8=3,3,0 FLunde  
code 0000022F4FE57D70: nxt=0000000000000000 op=0xC3
code 0000022F4FE57D70: nxt=0000000000000000 op=0xC3
Internal error: ddmd\backend\cod3.c 6843
=== Exited with status 1 ===

asmcode: 50
code 000001E7F37DC868: nxt=0000000000000000 op=0x50 flg=40
code 000001E7F37DC8A0: nxt=0000000000000000 op=0xC3
code 000001E7F37DC8A0: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: 60
code 00000292EE418E48: nxt=0000000000000000 op=0x60 flg=40
code 00000292EE418E80: nxt=0000000000000000 op=0xC3
code 00000292EE418E80: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: 00
code 0000025538918CE8: nxt=0000000000000000 op=0x00 flg=40 rm=0xD8=3,3,0
code 0000025538918D20: nxt=0000000000000000 op=0xC3
code 0000025538918D20: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

If you're having trouble getting win64.mak to work, the latest version of Digger should be able to build a 64-bit DMD on any Windows system with no other dependencies. Run "digger checkout", edit the source files under work/repo/dmd/src/ddmd, and run "digger -c build.components.dmd.dmdModel=64 rebuild --model=32 --without=phobos --without=druntime --without=rdmd --without=phobos-includes" to build a 64-bit dmd.exe under work/result/bin.
Comment 12 Walter Bright 2017-06-21 03:31:15 UTC
Hmmm, the opcode should be DD:

        DD D8                   fstp    ST

Which suggests the problem is confined to iasm.d
Comment 13 Walter Bright 2017-06-21 03:36:42 UTC
So the way forward is to follow the flow backwards to where c.Iop is being set and print that. I'd start with line 1384:

    pc.Iop = opcode;

and insert:

    printf("test1: opcode = 0x%x\n", opcode);

Ain't debugging fun? :-)
Comment 14 Vladimir Panteleev 2017-06-21 03:40:18 UTC
(In reply to Walter Bright from comment #13)
> Ain't debugging fun? :-)

Doing it through Bugzilla comments isn't very effective, though.

Have you tried building a 64-bit DMD using win64.mak or Digger yet?
Comment 15 Vladimir Panteleev 2017-06-21 04:02:32 UTC
(In reply to Walter Bright from comment #13)
>     printf("test1: opcode = 0x%x\n", opcode);

After adding this printf, the bug can no longer be reproduced.

Since that pointed at a codegen bug in the host compiler, I tried building DMD with 2.074.1 (instead of Digger's default of 2.070.2 for Win64), and I can no longer reproduce the bug either.

I'm going to bisect what caused the behaviour change, just to ensure that the bug disappeared because of a codegen fix, and not because some random change made it no longer manifest for that test case.
Comment 16 Walter Bright 2017-06-21 04:12:13 UTC
I haven't tried building Win64 dmd. I figured that would be time consuming :-)
Comment 17 Walter Bright 2017-06-21 04:13:06 UTC
(In reply to Vladimir Panteleev from comment #15)
> Since that pointed at a codegen bug in the host compiler, I tried building
> DMD with 2.074.1 (instead of Digger's default of 2.070.2 for Win64), and I
> can no longer reproduce the bug either.
> 
> I'm going to bisect what caused the behaviour change, just to ensure that
> the bug disappeared because of a codegen fix, and not because some random
> change made it no longer manifest for that test case.

That's a good plan. There's a lot that has changed from 70 to 74.
Comment 18 Vladimir Panteleev 2017-06-28 21:50:33 UTC
(In reply to Vladimir Panteleev from comment #15)
> I'm going to bisect what caused the behaviour change, just to ensure that
> the bug disappeared because of a codegen fix, and not because some random
> change made it no longer manifest for that test case.

Done. Bisecting the version to bootstrap DMD really put Digger to the test, and I needed to make some upgrades :)

The bug was fixed in this PR:

https://github.com/dlang/dmd/pull/5924

It does seem to be a fix in the backend, but it fixes an ICE, not resulting codegen. Does that make sense to you?
Comment 19 Walter Bright 2017-07-03 21:01:53 UTC
> It does seem to be a fix in the backend, but it fixes an ICE, not resulting codegen. Does that make sense to you?

Yes, because the compiler itself was corrupted by the Win64 build.