D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 9498 - Range violation using AA
Summary: Range violation using AA
Status: RESOLVED INVALID
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D2
Hardware: All All
: P2 regression
Assignee: No Owner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-12 11:26 UTC by Tim Volckmann
Modified: 2013-07-08 21:52 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Tim Volckmann 2013-02-12 11:26:48 UTC
The following worked in DMD 2.060 (and previous versions):

string[string] myValues;

ref string getValue(string v) {
   return myValues[v];
}

void main() {
   getValue("myValue") = "myString";
}

But in 2.061 it throws a range violation.
Comment 1 Maxim Fomin 2013-02-12 11:49:32 UTC
This is neither a bug nor a regression. 

An element is accessed in return statement, which makes it an rvalue, so allocation if absence does not applies here. 

It works as clarified in TDPL. From TDPL (p. 115):

"To read a value off an associative array given a key, just read aa [ key] . (The compiler distinguishes reads from writes and invokes slightly different functions.)" - it is implemented in druntime as aaGetValueX and aaGetRvalueX or something like that.

Further: 

"If you try to read the value for a key not found in the associative array, a range violation exception is thrown."
Comment 2 yebblies 2013-06-29 20:32:45 UTC
Note the ref return from getValue, which makes 'myValues[v]' an lvalue.  It could still be invalid, but not for the reasons stated.
Comment 3 Maxim Fomin 2013-06-29 21:59:13 UTC
(In reply to comment #2)
> Note the ref return from getValue, which makes 'myValues[v]' an lvalue.  It
> could still be invalid, but not for the reasons stated.

This is irrelevant. The fact that function returns by ref indicates that value should not be copied but returned by ref. If "myValue" was present that would mean that pointer to the string is returned. Irrespective of how value is returned, myValues[v] is invalid value here.
Comment 4 yebblies 2013-06-29 22:15:17 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Note the ref return from getValue, which makes 'myValues[v]' an lvalue.  It
> > could still be invalid, but not for the reasons stated.
> 
> This is irrelevant. The fact that function returns by ref indicates that value
> should not be copied but returned by ref. If "myValue" was present that would
> mean that pointer to the string is returned. Irrespective of how value is
> returned, myValues[v] is invalid value here.

This code works because indexing an AA with a key that does not exist is valid in an lvalue context:

int[string] aa;
aa["asd"] += 2;

The following code also indexes an AA with a missing key in an lvalue context:

int[string] aa;
ref int get() { return aa["asd"]; }
get() += 2;

The assumption here is that indexing an AA with a non-existing key in a context where an lvalue is expected does not result in a range violation.

Do you know of any reason why this is an incorrect assumption?
Comment 5 Maxim Fomin 2013-06-29 22:34:57 UTC
(In reply to comment #4)
> This code works because indexing an AA with a key that does not exist is valid
> in an lvalue context:
> 
> int[string] aa;
> aa["asd"] += 2;
> 
> The following code also indexes an AA with a missing key in an lvalue context:
> 
> int[string] aa;
> ref int get() { return aa["asd"]; }
> get() += 2;
> 
> The assumption here is that indexing an AA with a non-existing key in a context
> where an lvalue is expected does not result in a range violation.
> 
> Do you know of any reason why this is an incorrect assumption?

Actually D does not have notion of lvalue context - TDPL says about read and write operations. According to it, returning from function is a read operation.

And function refness does not guarantee that original lvalue will be modified. The use case could be:

int a = get(); 

which means refness is wiped out and operation is effectively a read operation. Situation could be follows:

ref int foo() { return aa["asd"]; }
int bar() { return aa["asd"]; }

not only this is confusing and inconsistent, but it defeats the purpose of having aaGetRvalueX - anyone who want allocation would be just putting ref to function declaration to make things works.
Comment 6 yebblies 2013-06-29 22:59:48 UTC
(In reply to comment #5)
> 
> Actually D does not have notion of lvalue context - TDPL says about read and
> write operations. According to it, returning from function is a read operation.
> 

TDPL does not completely define the language.  Unfortunately neither does the spec, it is defined through a combination of TDPL, the spec, and DMD.  When these disagree none are a definitive source.

The compiler most certainly has a notion of lvalue context for index expressions.

> And function refness does not guarantee that original lvalue will be modified.
> The use case could be:
> 
> int a = get(); 
> 
> which means refness is wiped out and operation is effectively a read operation.
> Situation could be follows:
> 
> ref int foo() { return aa["asd"]; }
> int bar() { return aa["asd"]; }
> 
> not only this is confusing and inconsistent, but it defeats the purpose of
> having aaGetRvalueX - anyone who want allocation would be just putting ref to
> function declaration to make things works.

I'm not sure how any of this is relevant.
Comment 7 Maxim Fomin 2013-06-29 23:43:51 UTC
(In reply to comment #6)
> > And function refness does not guarantee that original lvalue will be modified.
> > The use case could be:
> > 
> > int a = get(); 
> > 
> > which means refness is wiped out and operation is effectively a read operation.
> > Situation could be follows:
> > 
> > ref int foo() { return aa["asd"]; }
> > int bar() { return aa["asd"]; }
> > 
> > not only this is confusing and inconsistent, but it defeats the purpose of
> > having aaGetRvalueX - anyone who want allocation would be just putting ref to
> > function declaration to make things works.
> 
> I'm not sure how any of this is relevant.

Than you should read again if you don't see the relevance. Function returning AA value as discussed here is a case which should throw according to TDPL (spec is silent unfortunately) and current D implementation in case of non-ref functions. Presence of ref attribute does not matter because function performs read operation in the first place. How value is returned (by ref or not) is of secondary importance.
Comment 8 yebblies 2013-06-29 23:49:40 UTC
(In reply to comment #7)
> Than you should read again if you don't see the relevance. Function returning
> AA value as discussed here is a case which should throw according to TDPL (spec
> is silent unfortunately) and current D implementation in case of non-ref
> functions. Presence of ref attribute does not matter because function performs
> read operation in the first place. How value is returned (by ref or not) is of
> secondary importance.

TDPL is not an authoritative specification, so the fact this is in line with the way TDPL presents AAs does not conclusively prove this is correct behavior.
Comment 9 Kenji Hara 2013-07-08 21:52:32 UTC
The old behavior was wrong-code bug, and fixed in 2.061.

Introduced commit/codeline:
https://github.com/D-Programming-Language/dmd/commit/9553d0c66337b477375cb77743defcdd79b8064d#L4L3995

AA indexing + assign syntax is a special form in D. If the acquired payload from AA is _immediately_ set by assignment, it is treated as an insertion for non-existing key and won't throw RangeError. Otherwise, AA indexing should be always treated as a read of existing key - even if it is on Lvalue-context.

I'd like to show more specific example.

struct S { int value = 42; }

S[string] myValues;

ref S getValue(string v) {
   return myValues[v];
}

void main() {
    import std.stdio;
    writeln(getValue("myValue").value); // what will be displayed?
}

In 2.060 and earlier, myValues[v] had returned a reference to *uninitialized* memory. From 2.061, it correctly throws RangeError. That's @safe behavior.