D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 9634 - [CTFE] wrong code concatenating arrays of structs
Summary: [CTFE] wrong code concatenating arrays of structs
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D2
Hardware: All All
: P2 critical
Assignee: No Owner
URL:
Keywords: CTFE, wrong-code
: 7440 (view as issue list)
Depends on:
Blocks:
 
Reported: 2013-03-02 10:50 UTC by Nils
Modified: 2013-11-28 13:56 UTC (History)
5 users (show)

See Also:


Attachments
Stripped down regex parser that shows the bug (14.04 KB, application/octet-stream)
2013-03-12 07:56 UTC, Dmitry Olshansky
Details

Note You need to log in before you can comment on or make changes to this issue.
Description Nils 2013-03-02 10:50:40 UTC
Maybe a duplicate of issue 7440. But this one gives a different error message.

---
cat > test.d <<code
import std.regex;
auto re = ctRegex!"(?:a+)";
code
dmd -c -o- test.d
---
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2739): Error: ['c', 'a', 's', 't', '(', 'I', 'R', ')', '0'][0u..9u]
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2099):        called from here: (Kickstart!(char) __ctmp1227 = 0;
 , __ctmp1227).this(this, new uint[](256u))
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2153):        called from here: this.lightPostprocess()
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(1966):        called from here: (Regex!(char) __ctmp1948 = 0;
 , __ctmp1948).this(this)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6489):        called from here: parser.program()
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6472):        called from here: regexImpl(pattern, flags)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6497):        called from here: regex("(?:a+)", [])
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2739): Error: ['c', 'a', 's', 't', '(', 'I', 'R', ')', '0'][0u..9u]
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2099):        called from here: (Kickstart!(char) __ctmp1227 = 0;
 , __ctmp1227).this(this, new uint[](256u))
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2153):        called from here: this.lightPostprocess()
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(1966):        called from here: (Regex!(char) __ctmp1948 = 0;
 , __ctmp1948).this(this)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6489):        called from here: parser.program()
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6472):        called from here: regexImpl(pattern, flags)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6499):        called from here: regex("(?:a+)", [])
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6499):        called from here: ctGenRegExCode(regex("(?:a+)", []))
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2739): Error: ['c', 'a', 's', 't', '(', 'I', 'R', ')', '0'][0u..9u]
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2099):        called from here: (Kickstart!(char) __ctmp1227 = 0;
 , __ctmp1227).this(this, new uint[](256u))
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(2153):        called from here: this.lightPostprocess()
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(1966):        called from here: (Regex!(char) __ctmp1948 = 0;
 , __ctmp1948).this(this)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6489):        called from here: parser.program()
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6472):        called from here: regexImpl(pattern, flags)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6506):        called from here: regex("(?:a+)", [])
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6506):        called from here: StaticRegex(null, Regex(null, null, null, 0u, 0u, 0u, 0u, 0u, null, null, ShiftOr(null, 0u, 0u))).this(regex("(?:a+)", []), & func)
/home/nils/d/dmd/dmd2/linux/bin32/../../src/phobos/std/regex.d(6523): Error: template instance std.regex.ctRegexImpl!("(?:a+)", []) error instantiating
test.d(2):        instantiated from here: ctRegex!("(?:a+)")
test.d(2): Error: template instance std.regex.ctRegex!("(?:a+)") error instantiating
---
Comment 1 Dmitry Olshansky 2013-03-12 07:56:06 UTC
Created attachment 1200 [details]
Stripped down regex parser that shows the bug

Don I belive I've found something that of interest.

In short - I've got cut down sample, there is no _ctfe branching here, ~500 lines and most of imports are there to print stuff.

Up to last step in bytecode generation all is OK and the result is the same, then at the last step somehow even simple fill with 1 to n loop fails.
(see below)

See lines marked as CRITICAL POINT to see this bisection of the execution.

The output is:

RT version
0       Char   (0x0)
1       Char ☺ (0x1)
2       Char ☻ (0x2)
3       Char ♥ (0x3)
4       Char ♦ (0x4)
5       cast(IR)0
6       End

CT version
0       Char   (0x0)
1       Char ♥ (0x3)
2       Char ☻ (0x2)
3       Char ♥ (0x3)
4       Char ♦ (0x4)
5       cast(IR)0
6       End

Instead it should be the same in both versions.
Comment 2 Dmitry Olshansky 2013-04-17 13:18:11 UTC
(In reply to comment #1)
> Created an attachment (id=1200) [details]
> Stripped down regex parser that shows the bug
> 

I think I've pinpointed the issue in an even smaller test case (~90 LOCs).
It directly relates to dealing with arrays of structs at CTFE.

Interesting point is that the assertion in main passes in the code below if you switch 'Bytecode' struct that is nothing more then one int to simply int. Hence my thought about structs being the trigger.


struct Bytecode
{
    int raw;
}

struct Parser
{
    dchar _current;
    bool empty;
    string pat;      
    Bytecode[] ir;     
 
    this(string pattern)
    {
        pat = pattern;
        next();
        uint fix;//fixup pointer
        for(;;)
        {
            switch(current)
            {
            case '(':
                next();
                fix = cast(uint)ir.length;
                assert(current == '?');
                next();
                assert(current == ':');                    
                ir ~= Bytecode(-1);
                next();                    
                break;
            case ')': //CRITICAL POINT: the last closing paren
                //return; // up to this point generated bytecode is the same
                next();
                //return; //still OK
                { //CRITICAL POINT                    
                    size_t cnt = ir.length-fix-1;
                    //even simple write loop is failing with awful results
                    for(size_t i = 0; i < cnt; i++)
                    {
                        auto bc = Bytecode(i+10);
                        ir[fix+i] = bc;
                    }
                }
                return; // and here it differs            
            default:
                uint start = cast(uint)ir.length;
                ir ~= Bytecode(10*current);
                next();
                uint len = cast(uint)ir.length - start;
                next();
                ir ~= Bytecode(-4);
                ir ~= ir[start .. start+len];
                ir ~= Bytecode(-1);
            }
        }
    }

    @property dchar current(){ return _current; }

    bool next()
    {
        if(pat.length == 0)
        {
            empty = true;
            return false;
        }
        _current = pat[0];
        pat = pat[1..$];
        return true;
    }
}

public auto getIr(string pattern)
{
    auto ir = Parser(pattern).ir;    
    return ir;
}

void main()
{
    auto re = getIr("(?:a+)");
    static re2 = getIr("(?:a+)");  
//uncomment to see that it's a 3rd element of 2 arrays that differs
/*
    import std.stdio;
    writeln("RT version");
    writeln(re);
    writeln("\n\nCT version");
    writeln(re2);
*/
    assert(re == re2);
}
Comment 3 Nils 2013-04-17 14:20:35 UTC
(In reply to comment #2)
> //uncomment to see that it's a 3rd element of 2 arrays that differs
> /*
>     import std.stdio;
>     writeln("RT version");
>     writeln(re);
>     writeln("\n\nCT version");
>     writeln(re2);
> */

For me, it's the second element that differs (DMD32 D Compiler v2.063-devel-0630526):
---
RT version
[Bytecode(10), Bytecode(11), Bytecode(12), Bytecode(13), Bytecode(-1)]


CT version
[Bytecode(10), Bytecode(13), Bytecode(12), Bytecode(13), Bytecode(-1)]
---
Comment 4 Dmitry Olshansky 2013-04-17 14:29:31 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > //uncomment to see that it's a 3rd element of 2 arrays that differs
> > /*
> >     import std.stdio;
> >     writeln("RT version");
> >     writeln(re);
> >     writeln("\n\nCT version");
> >     writeln(re2);
> > */
> 
> For me, it's the second element that differs (DMD32 D Compiler
> v2.063-devel-0630526):
> ---
> RT version
> [Bytecode(10), Bytecode(11), Bytecode(12), Bytecode(13), Bytecode(-1)]
> 
> 
> CT version
> [Bytecode(10), Bytecode(13), Bytecode(12), Bytecode(13), Bytecode(-1)]
> ---

You're right that must a mistake on my part.
It's 2nd one that is different as I'm seeing this exact output.
Comment 5 Nils 2013-04-17 15:12:50 UTC
Reduced the code from comment #2 further:

struct Bytecode
{
    int raw;
}

Bytecode[] getIr()
{
    Bytecode[] ir;
    
    ir ~= Bytecode(42);
    ir ~= ir[0 .. 1]; // add .dup and CTFE gets it
    assert(&ir[0] != &ir[1]); // fails in CTFE
    
    ir[0].raw = 13; // overwrites both ir[0] and ir[1]
    assert(ir[0].raw == 13);
    assert(ir[1].raw == 42); // fails in CTFE
    
    return ir;
}

void main()
{
    enum expected = [Bytecode(13), Bytecode(42)];
    assert(getIr() == expected); // passes
    static assert(getIr() == expected); // fails
}
Comment 6 Dmitry Olshansky 2013-05-12 13:42:34 UTC
(In reply to comment #5)
> Reduced the code from comment #2 further:
> 
> struct Bytecode
> {
>     int raw;
> }
> 
> Bytecode[] getIr()
> {
>     Bytecode[] ir;
> 
>     ir ~= Bytecode(42);
>     ir ~= ir[0 .. 1]; // add .dup and CTFE gets it

A-ha! That indicates that the bug is in appending slices of that same array...

That might get us workaround (without dup) and indeed if I change the above line to this:
    foreach(v; ir[0 .. 1])
        ir ~= v; 

it passes even for bigger slices.

>     assert(&ir[0] != &ir[1]); // fails in CTFE
> 
>     ir[0].raw = 13; // overwrites both ir[0] and ir[1]
>     assert(ir[0].raw == 13);
>     assert(ir[1].raw == 42); // fails in CTFE
> 
>     return ir;
> }
> 
> void main()
> {
>     enum expected = [Bytecode(13), Bytecode(42)];
>     assert(getIr() == expected); // passes
>     static assert(getIr() == expected); // fails
> }

(In reply to comment #5)
> Reduced the code from comment #2 further:
> 
> struct Bytecode
> {
>     int raw;
> }
> 
> Bytecode[] getIr()
> {
>     Bytecode[] ir;
> 
>     ir ~= Bytecode(42);
>     ir ~= ir[0 .. 1]; // add .dup and CTFE gets it
>     assert(&ir[0] != &ir[1]); // fails in CTFE
> 
>     ir[0].raw = 13; // overwrites both ir[0] and ir[1]
>     assert(ir[0].raw == 13);
>     assert(ir[1].raw == 42); // fails in CTFE
> 
>     return ir;
> }
> 
> void main()
> {
>     enum expected = [Bytecode(13), Bytecode(42)];
>     assert(getIr() == expected); // passes
>     static assert(getIr() == expected); // fails
> }
Comment 8 Dmitry Olshansky 2013-05-12 14:03:14 UTC
*** Issue 7440 has been marked as a duplicate of this issue. ***
Comment 9 Martin Nowak 2013-05-18 08:30:26 UTC
This is related to bug 6052.
Comment 10 Martin Nowak 2013-05-21 07:10:00 UTC
This necessitates a workaround in Phobos.
https://github.com/D-Programming-Language/phobos/pull/1290
Comment 12 Don 2013-06-17 23:48:12 UTC
Reduced test case shows it doesn't even require ~=.
---------
struct Bug9634 {
    int raw;
}

bool bug9634()
{
    Bug9634[] jr = [Bug9634(42)];
    Bug9634[] ir = jr ~ jr;
    assert(&ir[0] != &ir[1]);
    return true;
}

static assert(bug9634());
Comment 13 github-bugzilla 2013-06-19 02:12:22 UTC
Commit pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/baf2d607b6b723b498753b730b62b9b258a5659e
Fix bug 9634 [CTFE] wrong code concatenating arrays of structs

When concatenating, the array literals need to be duplicated, if
they are CTFE values.
Comment 14 yebblies 2013-06-19 02:13:10 UTC
(hopefully)
Comment 15 github-bugzilla 2013-11-28 13:56:20 UTC
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/3e22b63a8b06c6687dc5ea29e8076b2c6ac802b1
CTFE bug 9634 was fixed, remove the workaround

https://github.com/D-Programming-Language/phobos/commit/a419cd25ffa0f7df43005429a9d659e98d5bf1ed
Merge pull request #1732 from blackwhale/revert-wa-regex

[TRIVIAL] CTFE bug 9634 was fixed, remove the workaround