D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 18792 - Incorrect scope analysis with -dip1000 for small-sized-optimized string
Summary: Incorrect scope analysis with -dip1000 for small-sized-optimized string
Status: RESOLVED DUPLICATE of issue 21868
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D2
Hardware: All All
: P1 normal
Assignee: No Owner
URL:
Keywords: safe
Depends on:
Blocks:
 
Reported: 2018-04-23 09:46 UTC by Per Nordlöw
Modified: 2021-06-10 19:22 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Per Nordlöw 2018-04-23 09:46:43 UTC
The following code should fail to compile in functions shouldFail1 and shouldFail2 when compiled with flag -dip1000:


/** Small-size-optimized (SSO) variant of `string`.
 *
 * Store on the stack if constructed with <= `smallCapacity` number of
 * characters, otherwise on the GC heap.
 *
 * See_Also: https://forum.dlang.org/post/pb87rn$2icb$1@digitalmars.com
 */
struct SSOString
{
    private alias E = immutable(char); // immutable element type

    pure nothrow @nogc:

    scope E[] opSlice() const return @trusted // TODO @safe for -dip1000?
    {
        if (isLarge)
        {
            union RawLarge
            {
                Raw raw;
                Large large;
            }
            RawLarge copy = void;
            copy.large = cast(Large)large;
            copy.raw.length /= 2; // adjust length
            return copy.large;
        }
        else
        {
            return small.data[0 .. small.length/2]; // scoped. TODO use .ptr when proved stable
        }
    }

    /// ditto
    scope E[] opSlice(size_t i, size_t j) const return @trusted // TODO @safe for -dip1000?
    {
        return opSlice()[i .. j];
    }

private:

    /** Returns: `true` iff this is a large string, otherwise `false.` */
    @property bool isLarge() const @trusted
    {
        return large.length & 1; // first bit discriminates small from large
    }

    struct Raw                  // same memory layout as `E[]`
    {
        size_t length;          // can be bit-fiddled without GC allocation
        E* ptr;
    }

    alias Large = E[];

    enum smallCapacity = Large.sizeof - Small.length.sizeof;
    static assert(smallCapacity > 0, "No room for small source for E being " ~ E.stringof);
    version(LittleEndian) // see: http://forum.dlang.org/posting/zifyahfohbwavwkwbgmw
    {
        struct Small
        {
            ubyte length; // TODO only first 4 bits are needed to represent a length between 0-15, use other 4 bits
            E[smallCapacity] data;
        }
    }
    else
    {
        static assert(0, "TODO Add BigEndian support and test");
    }

    union
    {
        Raw raw;
        Large large;
        Small small;
    }
}

///
@safe pure nothrow @nogc unittest
{
    string shouldFail1() @safe pure nothrow @nogc
    {
        SSOString x;
        return x[];             // TODO should fail with -dip1000
    }
    string shouldFail2() @safe pure nothrow @nogc
    {
        SSOString x;
        return x[0 .. 0];       // TODO should fail with -dip1000
    }
}
Comment 1 Walter Bright 2020-03-04 10:07:42 UTC
Please reduce to a much smaller case of what the problem is.
Comment 2 Per Nordlöw 2020-03-04 11:06:48 UTC
I managed to reduce the code snippet to


struct SSOString
{
pure nothrow @nogc:
 
    inout(char)[] opSlice() inout return scope @trusted // TODO @safe for -dip1000?
    {
        return small.data[0 .. small.length]; // scoped. TODO use .ptr when proved stable
    }
    
    struct Small
    {
        ubyte length; // TODO only first 4 bits are needed to represent a length between 0-15, use other 4 bits
        char[15] data;
    }

    struct Raw                  // same memory layout as `char[]`
    {
        size_t length;          // can be bit-fiddled without GC allocation
        char* ptr;
    }

    union
    {
        Raw raw; // PROBLEM this declaration prevents DIP-1000 scope analysis from kicking in in `opSlice`
        Small small;
    }
}

@safe pure nothrow @nogc unittest
{
    char[] shouldFail1() @safe pure nothrow @nogc
    {
        SSOString x;
        return x[];             // TODO should fail with -dip1000
    }
}


When compiled with -dip100 the function `shouldFail1` should fail to compile but it does still compile. When I remove the line

    Raw raw;

scope analysis suddenly starts working which correctly triggers the error

foo.d(33,17): Error: returning `x.opSlice()` escapes a reference to local variable `x`
        return x[];             // TODO should fail with -dip1000


Small enough, Walter?
Comment 3 Per Nordlöw 2020-03-04 11:08:43 UTC
(In reply to Per Nordlöw from comment #2)
> Small enough, Walter?

Forgot to remove obselete comments. Here's a new version with irrelevant comments removed.


struct SSOString
{
pure nothrow @nogc:
 
    inout(char)[] opSlice() inout return scope @trusted 
    {
        return small.data[0 .. small.length]; 
    }
    
    struct Small
    {
        ubyte length; 
        char[15] data;
    }

    struct Raw        
    {
        size_t length;
        char* ptr;
    }

    union
    {
        Raw raw; // PROBLEM this declaration prevents DIP-1000 scope analysis from kicking in in `opSlice`
        Small small;
    }
}

@safe pure nothrow @nogc unittest
{
    char[] shouldFail1() @safe pure nothrow @nogc
    {
        SSOString x;
        return x[];             // TODO should fail with -dip1000
    }
}
Comment 4 Walter Bright 2020-03-04 21:07:07 UTC
much better, thank you
Comment 5 Dennis 2021-06-10 19:22:40 UTC
I know this issue was first, but 21868 already got a pull.

It is the same problem: `return scope` without `ref return` applies return the values (e.g. SSOString.raw.ptr), not the address of SSOString itself (e.g. SSOString.small.data.ptr), but the compiler doesn't check for return-scope or return-ref, just return. The declaration of `Raw raw` is needed because otherwise SSOString has no pointers, so return scope is stripped away, so the compiler can't mistake the return attribute anymore and raises the correct error.

*** This issue has been marked as a duplicate of issue 21868 ***