Issue 24570 - printing a range of ranges consumes sub-ranges
Summary: printing a range of ranges consumes sub-ranges
Status: NEW
Alias: None
Product: D
Classification: Unclassified
Component: phobos (show other issues)
Version: D2
Hardware: All All
: P1 enhancement
Assignee: No Owner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-27 14:57 UTC by Steven Schveighoffer
Modified: 2024-12-01 16:42 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Steven Schveighoffer 2024-05-27 14:57:08 UTC
If you are printing something, you might correctly not expect the act of printing it to modify that type.

However, printing a range *requires* that you modify it. It's how you get at the elements inside. For this reason, a range must be treated as a *view* of data, not the thing to print.

If you print a range whose elements are ranges, those elements should behave as if you printed them. That is, the act of printing them should not modify the elements themselves.

However, currently `formatValue` will accept range elements by reference, and print them by iterating them. This means that a range-of-ranges where the outer range uses lvalue elements will consume all the nested ranges.

An example:

```d
import std.range;
import std.conv;
import std.algorithm;
struct R
{
    int* ptr;
    size_t len;
    int front() {return  *ptr;}
    void popFront() { ++ptr; --len; }
    bool empty() {return len == 0;}
    typeof(this) save() { return this; }
}

void main()
{
    import std.conv;
    auto intarr = [1, 2, 3];
    auto r = R(intarr.ptr, intarr.length);
    assert(text(r) == "[1, 2, 3]"); // ok, r is passed by value
    assert(text(r) == "[1, 2, 3]"); // still there
    auto ndarr = [r, r]; // a range of ranges
    assert(text(ndarr) == "[[1, 2, 3], [1, 2, 3]]"); // ok, but this consumed the subranges
    assert(text(ndarr) == "[[], []]"); // now they are gone
    ndarr = [r, r];

    // what phobos should do, is save every forward range before printing
    assert(text(ndarr.map!(e => e.save)) == "[[1, 2, 3], [1, 2, 3]]"); // ok
    assert(text(ndarr.map!(e => e.save)) == "[[1, 2, 3], [1, 2, 3]]"); // ok
}
```

Note: I used text instead of writeln so the asserts can be written, but the same thing happens with writeln.
Comment 1 Steven Schveighoffer 2024-05-27 14:58:33 UTC
A further note is that an array of arrays is not consumed when printed -- because formatValue has a specialized case for that.
Comment 2 Salih Dincer 2024-05-28 06:05:31 UTC
(In reply to Steven Schveighoffer from comment #1)
> A further note is that an array of arrays is not consumed when printed --
> because formatValue has a specialized case for that.

If you do the following instead of `auto save() => return this;`, the problem is solved:

```
struct R
{
  wchar* ptr;
  size_t len;
  
  this(T)(T[] range)
  {
    ptr = cast(wchar*)range.ptr;
    len = range.length;
  }
  
  auto empty() => len == 0;
  auto front() => *ptr++;
  auto popFront() => len--;
  auto save()
  {
    auto r = R([]);
    r.len = len;
    r.ptr = ptr;
    return r;
  }
}

void main()
{
  auto c = ['€', '₺', '₽'];
  auto r = R(c);
  auto arr = [r, r, r];

  assert(!arr.empty);
  
  import std.conv : text;
  auto str = arr.text; // "€₺₽"
  
  assert(!arr.empty);
}
```
Comment 3 Salih Dincer 2024-05-28 06:33:51 UTC
There's no reason why this issue can't be easily fixed. Because when you include narrow string or wchar, there is no problem of not being able to save(): https://forum.dlang.org/post/qgtrcupsniezqgazkztd@forum.dlang.org
Comment 4 Steven Schveighoffer 2024-05-28 23:29:14 UTC
(In reply to Salih Dincer from comment #2)

> If you do the following instead of `auto save() => return this;`, the
> problem is solved:
> 
> ```
> struct R
> {
>   wchar* ptr;
>   size_t len;
>   
>   this(T)(T[] range)
>   {
>     ptr = cast(wchar*)range.ptr;
>     len = range.length;
>   }
>   
>   auto empty() => len == 0;
>   auto front() => *ptr++;

this is an invalid implementation. Using `front` more than once is supported.

>   auto popFront() => len--;
>   auto save()
>   {
>     auto r = R([]);
>     r.len = len;
>     r.ptr = ptr;
>     return r;
>   }
> }
> 
> void main()
> {
>   auto c = ['€', '₺', '₽'];
>   auto r = R(c);
>   auto arr = [r, r, r];
> 
>   assert(!arr.empty);
>   
>   import std.conv : text;
>   auto str = arr.text; // "€₺₽"
>   
>   assert(!arr.empty);

But it was never arr that was empty. It's just an array of empty ranges at this point.

But the difference here is that you are using wchar and not short. format treats character data differently (as evidenced by the printout, it's not an array of items, but a string).

Yet another reason why this bug has gone undetected for so long.
Comment 5 dlangBugzillaToGithub 2024-12-01 16:42:37 UTC
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/phobos/issues/10553

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB