Issue 7753 - Support opIndexCreate as part of index operator overloading in user-defined types
Status: NEW
Alias: None
Product: D
Classification: Unclassified
Component: dmd
Version: D2
Hardware: All All
Importance: P4 enhancement
Assignee: No Owner
Reported: 2012-03-22 21:18 UTC by hsteoh
Modified: 2023-05-10 09:08 UTC
Description hsteoh 2012-03-22 21:18:10 UTC
Currently, a nested indexing expression such as:

    a[b][c][d] = 0;

for a user-defined type that overloads opIndex* gets translated into:

    a.opIndex(b).opIndex(c).opIndexAssign(0,d);

However, if a[b] or a[b][c] does not yet exist, this will fail. This works correctly for built-in associative arrays, because the expressions get translated into a series of calls to _aaGetX(), which creates new entries if they don't already exist. But currently, there is no way for a user-defined type to accomplish the same thing.

Suggested fix: if the expression as a whole is being assigned to with an assignment operator, then the upper-level indexing calls should be translated into opIndexCreate() instead of just opIndex():

    a[b][c][d] = 0;

becomes:

    a.opIndexCreate(b).opIndexCreate(c).opIndexAssign(0,d);

If opIndexCreate is not defined, then replace it with opIndex (for backward compatibility). The semantics of opIndexCreate(k) is to return the entry indexed by k if it exists, and if it doesn't, create a new entry with key k and the .init value of the value type, and return the new entry.

Preferably, this will apply to any expression that ends with a call to opIndexAssign, opIndexUnary, and opIndexOpAssign. But at the very least, this needs to work when the expression ends with opIndexAssign.
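
To make the proposed lowering concrete, here is a minimal sketch. The compiler does not recognize opIndexCreate today, so the chain is written out by hand; the struct name Nested and the string key type are assumptions made only for illustration, and the built-in `require` function stands in for "return the entry, creating it with .init if absent".

```d
// Hypothetical: opIndexCreate is NOT recognized by the compiler,
// so it is called by hand here to show the proposed lowering.
struct Nested(V)
{
    V[string] data;

    // Plain lookup: fails with a range error on a missing key.
    ref V opIndex(string key) { return data[key]; }

    // Proposed semantics: return the entry, creating it as V.init if absent.
    ref V opIndexCreate(string key) { return data.require(key); }

    void opIndexAssign(V value, string key) { data[key] = value; }
}

unittest
{
    Nested!(Nested!(Nested!int)) a;

    // What `a["b"]["c"]["d"] = 0;` would become under the proposal:
    a.opIndexCreate("b").opIndexCreate("c").opIndexAssign(0, "d");

    assert(a["b"]["c"]["d"] == 0);
}
```

With plain opIndex in place of opIndexCreate, the same chain would fail on the first lookup, because "b" has not been inserted yet.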
Comment 1 Dmitry Olshansky 2012-03-23 03:28:52 UTC
It might be a good thing, but ...
Why not just return a proxy type upon each indexing?
The proxy type will have createIndex that will forward to others in turn.

Here is a proof of concept; I believe it could be generalized and polished.
For simplicity's sake, it's for n-dim arrays:

import std.stdio, std.exception;

struct Proxy(T)
{
		T* _this;
		int idx;
		void opAssign(X)(X value){
                    debug writeln("Proxy.opAssign"); 
		    createIndex(idx) = value;			
		}
	static if(typeof(*_this).dimension >= 2)
	{

		// somewhere in an expression: ...a[idx][jdx]... creates all levels except the last one
                auto opIndex(int jdx){
			return proxy(&_this.createIndex(idx), jdx);
		}
			// a[idx][jdx] = y; creates the entries if non-existent
                auto opIndexAssign(X)(X val, int jdx){  //TODO: constraints!
			debug writeln("Proxy.opIndexAssign");
			_this.createIndex(idx).createIndex(jdx) = val;
		}
	}

		@property ref expr(){
                    debug writeln("Proxy.expr"); 
		    return _this.normalIndex(idx);
		}

		alias expr this;
}

auto proxy(T)(T* x, int idx){ return Proxy!(T)(x,idx); }



struct M(size_t dim)
{
	static if(dim == 1){
		alias int Val;
	}
	else{
		alias M!(dim-1) Val;
	}
	enum dimension = dim;

	
	Val[] arr;
        

	ref createIndex(int idx){
		debug writeln("Created ", typeof(this).stringof);
		if(arr.length <= idx) // grow when idx is at or past the end
			arr.length = idx+1;
		return arr[idx];
	}
        ref normalIndex(int idx){
		debug writeln("Indexed ", typeof(this).stringof);
		return arr[idx];
	}
	auto opIndex(int idx){
		return Proxy!(M)(&this, idx);
	}
        alias arr this;
}

unittest{
	M!(3) d3arr;	
	d3arr[1][2] = [2, 3, 4];
	assert(d3arr[1][2][2] == 4);
	int[] x = d3arr[1][2];                                              
	assert(d3arr[1][2].length == 3); 
	assert(d3arr[1][1] == null); //inited 
	//booom used before explicit = 
	assert(collectException!Error(d3arr[2][2][1] + 1 == 1) !is null);
}
Comment 2 hsteoh 2012-03-23 07:36:46 UTC
That's a pretty neat idea. Can it be made to work with containers that contain other containers (possibly of a different type)? E.g., a linked list of arrays of AA's?
Comment 3 Dmitry Olshansky 2012-03-23 08:17:58 UTC
Well, linked list is, for sure, not indexed so not a problem ;)
As for sets, I don't see a problem; I can easily extend this idea to arbitrary sets. In fact, it's even cleaner for sets (maps) than for arrays.
If you need a head start, I can scratch up a simple version for integer sets.
Comment 4 Dmitry Olshansky 2012-03-23 08:22:00 UTC
Ahem. So the question was about heterogeneous stuff like arrays of sets (maps) or maps of arrays?
I think the two situations mentioned cover it all, so you can parametrize this idea on the basis of:
a) contiguous container: to get the item with index X you need to allocate all elements up to X. Here X can obviously only be an integer of some sort.
b) non-contiguous container: to get the item with index X you check/create only the slot indexed by X.
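
The two cases could be sketched roughly as follows. The free-function names createIndex and createKey are hypothetical (the first borrows the name from the proof of concept in comment 1), and `require` is the built-in associative-array helper that inserts a .init value when the key is missing.

```d
// (a) Contiguous: grow the array so that every index up to idx exists.
ref T createIndex(T)(ref T[] arr, size_t idx)
{
    if (arr.length <= idx)
        arr.length = idx + 1;   // new elements are T.init
    return arr[idx];
}

// (b) Non-contiguous: check/create only the slot for this key.
ref V createKey(K, V)(ref V[K] aa, K key)
{
    return aa.require(key);     // inserts V.init if the key is missing
}

unittest
{
    // Heterogeneous nesting: an array of maps of arrays.
    int[][string][] c;
    c.createIndex(2).createKey("x").createIndex(1) = 7;

    assert(c.length == 3);        // indices 0..2 were all allocated
    assert(c[0] is null);         // untouched map stays empty
    assert(c[2]["x"] == [0, 7]);  // only key "x" was created
}
```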
Comment 5 anonymous4 2015-05-28 08:59:21 UTC
Whoa, this feature is weird indeed. Given the declaration int[int][int] a; a[0][0]=0 shouldn't work because the AA has no entry with key 0, so a[0] should throw RangeError similar to how b=a[0] throws RangeError.
Comment 6 Martin Nowak 2015-05-29 11:25:17 UTC
Igor Stepanov suggested [¹] an opIndex extension.

ref Foo opIndex(bool lvalue)(size_t idx)

Where the compiler would call opIndex!true when an lvalue is required and opIndex!false when an rvalue suffices.

[¹]: http://forum.dlang.org/post/vorfqumugibjcztdrezb@forum.dlang.org
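
A rough sketch of how such a flag might be implemented on the library side. The compiler does not select the flag today, so the calls below spell it out by hand; the Table type is an assumption made for illustration.

```d
struct Table
{
    int[size_t] data;

    // Hypothetical signature taken from the suggestion above.
    ref int opIndex(bool lvalue)(size_t idx)
    {
        static if (lvalue)
            return data.require(idx);   // create the entry on demand
        else
            return data[idx];           // range error if missing
    }
}

unittest
{
    Table t;
    t.opIndex!true(3) = 42;             // what `t[3] = 42` would lower to
    assert(t.opIndex!false(3) == 42);   // what `x = t[3]` would lower to
    assert(t.data.length == 1);
}
```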
Comment 7 Martin Nowak 2015-05-29 17:47:36 UTC
(In reply to Martin Nowak from comment #6)
> Igor Stepanov suggested [¹] an opIndex extension.
> 
> ref Foo opIndex(bool lvalue)(size_t idx)

A separate opIndexCreate might be better, because it allows you to 
have a ref return for one and an rvalue return for the other 
function. It also allows those operators to be used as polymorphic 
functions in classes.
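
A small sketch of both points, assuming hypothetical classes Store and CountingStore. Since opIndexCreate is not a compiler-recognized operator, it is invoked by name here. Because it is an ordinary non-template member it is virtual, which a templated opIndex(bool lvalue) could not be.

```d
class Store
{
    protected int[string] data;

    // rvalue return: a plain read hands back a copy.
    int opIndex(string key) { return data[key]; }

    // ref return: hypothetical hook, called by hand below.
    ref int opIndexCreate(string key) { return data.require(key); }
}

class CountingStore : Store
{
    size_t created;

    // Overrides the base behaviour and counts newly created entries.
    override ref int opIndexCreate(string key)
    {
        if (key !in data) ++created;
        return super.opIndexCreate(key);
    }
}

unittest
{
    Store s = new CountingStore;
    s.opIndexCreate("a") += 5;   // dispatches to CountingStore
    s.opIndexCreate("a") += 1;   // entry already exists, not counted again
    assert(s["a"] == 6);
    assert((cast(CountingStore) s).created == 1);
}
```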
Comment 8 Steven Schveighoffer 2020-02-27 21:27:10 UTC
Just wanted to post something here.

The call a[b][c][d] = 0 results in different calls (_aaGetY now) than x = a[b][c][d] (_aaGetRvalueX).

So I think H S Teoh is on the right path. Having a different opIndex available for assignment (or lvalue manipulation) makes sense and should be straightforward to define.

Now, to define this a little more concretely, I think opIndexCreate should ONLY be used when the entire chain of indexing results in a definite lvalue requirement. This means ONLY opIndexCreate (or AA usage) available in the expression, and the final "call" should be an opAssign or opOpAssign or opIndexOpAssign. I would also throw in opUnary that expects mutation (i.e. ++ or --) because it's currently supported by AAs.

Unfortunately, the existing behavior is somewhat inconsistent:

struct S {
  int x;
  void opUnary(string s : "++")() {++x;}
}

S[int][int] aa;

aa[1][1] = S(1); // ok
aa[2][2].x = 5; // range violation
++aa[3][3]; // range violation

int[int] aa2;

aa2[4] += 3; // ok
++aa2[5]; // ok

void foo(ref int x);

foo(aa2[6]); // range violation

So there is not 100% consistency here. The most rational logical implementation would just require lvalue usage. But the reality is different.

Also note that ++aa2[5] does not match ANY operator that would be on the type of aa2. There is no opIndexOpUnary akin to opIndexOpAssign. And it doesn't work if your underlying type supports ++. That is an inconsistency that will be tough to duplicate.
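
For built-in associative arrays, the failing cases above can be made explicit with the `require` function from the D runtime, which returns the entry by reference and inserts a .init value when the key is missing. This is only a workaround sketch for the built-in case and does not help user-defined containers; the function foo is given a body here purely for illustration.

```d
struct S
{
    int x;
    void opUnary(string op : "++")() { ++x; }
}

void foo(ref int x) { x = 9; }

unittest
{
    S[int][int] aa;
    aa.require(2).require(2).x = 5;   // instead of aa[2][2].x = 5
    ++aa.require(3).require(3);       // instead of ++aa[3][3]
    assert(aa[2][2].x == 5);
    assert(aa[3][3].x == 1);

    int[int] aa2;
    foo(aa2.require(6));              // instead of foo(aa2[6])
    assert(aa2[6] == 9);
}
```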