D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 6495 - array(file.byLine()) is a problem
Summary: array(file.byLine()) is a problem
Status: RESOLVED WONTFIX
Alias: None
Product: D
Classification: Unclassified
Component: phobos (show other issues)
Version: D2
Hardware: Other Windows
: P2 minor
Assignee: No Owner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-14 12:12 UTC by Andrej Mitrovic
Modified: 2020-03-21 03:56 UTC (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Andrej Mitrovic 2011-08-14 12:12:05 UTC
I can use readText, but if I want to use the byLine range with a call to array() to save the lines of a text I end up getting garbage (I know why):

auto file = File("foo.txt", "r");
auto lineBuff = array(file.byLine());

I don't know if it can be fixed, I mean array() is not to blame (it only exhausts a range), and byLine reuses a char[] buffer to stay efficient. But it looks like something that could be thought about.

As a workaround I can do:

lineBuff = readText(r"file.txt").splitLines;

So it's not a big issue, but I ran into it and hence this report.
Comment 1 Vladimir Panteleev 2011-08-14 12:14:15 UTC
Another workaround:

array(map!`a.idup`(file.byLine()))
Comment 2 Andrej Mitrovic 2011-08-14 12:33:03 UTC
(In reply to comment #1)
> Another workaround:
> 
> array(map!`a.idup`(file.byLine()))

Thanks, first I tried .idup on the right side but that didn't work.

I think this falls under "suble things you should be aware of", not really a bug.
Comment 3 bearophile_hugs 2012-10-16 13:28:26 UTC
One solution to avoid this problem is to modify byLine to perform a copy on default and not perform it on request:

auto file = File("foo.txt", "r");
auto lineBuff = array(file.byLine());

==> good



auto file = File("foo.txt", "r");
auto lineBuff = array(file.byLine!false());

==> does't copy, lineBuff contains garbage

This makes the code safe (but a little slower) on default, when you don't know what you are doing, and offers a simple way to have faster code when you know what you are doing and you know you don't want a copy (this ides also follows the general safety philosophy of D).
Comment 4 hsteoh 2013-07-31 22:17:25 UTC
I like the idea of making byLine non-transient by default, but allowing a compile-time option to make it transient when you want to optimize your code (so the user is responsible for making sure transience doesn't break his code before turning it on).
Comment 5 hsteoh 2013-07-31 22:18:45 UTC
I don't like the use of a boolean flag, though. Maybe byLine!CopyBuffer() (default) vs. byLine!ReuseBuffer()?
Comment 6 Andrej Mitrovic 2013-08-01 06:33:14 UTC
(In reply to comment #4)
> I like the idea of making byLine non-transient

This "non-transient" term is being used often lately but it has no definition on the website, do you think you could write up a pull for dlang.org so we can officially document it? Otherwise we're lost in the conversation. :)
Comment 7 hsteoh 2013-08-01 09:58:04 UTC
Transient just means the value of .front changes when you call .popFront(), so you cannot simply assign .front to a variable and expect it to retain its value after you popFront(). Non-transient just means that you can assign .front to a variable and rest assured that popFront() will not mutate its value from under you.

I'm not sure we want to put this on the website just yet -- it *is* a term that we coined for the sake of distinguishing between, say, the current byLine(), which reuses the .front buffer, and the proposed change where it calls .dup on every line returned.
Comment 8 monarchdodra 2013-08-01 10:15:05 UTC
(In reply to comment #7)
> Transient just means the value of .front changes when you call .popFront(), so
> you cannot simply assign .front to a variable and expect it to retain its value
> after you popFront(). Non-transient just means that you can assign .front to a
> variable and rest assured that popFront() will not mutate its value from under
> you.
> 
> I'm not sure we want to put this on the website just yet -- it *is* a term that
> we coined for the sake of distinguishing between, say, the current byLine(),
> which reuses the .front buffer, and the proposed change where it calls .dup on
> every line returned.

Not *quite* as simple as that. If front returns a "non reference type" (eg an int) but by reference, is it transient?

struct S
{
    int front;
    void popFront()
    {++front;}
    enum empty = false;
}

int  a = myRange.front; //Non transient
int* b = &myRange.front; //Transient
myRange.popFront();
assert(*b == a); //Fails

Is that range transient? Is the range "reference transient"?
Comment 9 hsteoh 2013-08-01 10:50:33 UTC
No, transience is a property of the *range*, not what you do with the data returned by the range. Taking the address of .front technically undefined, since you can't guarantee that that's even possible for arbitrary range types (in general, it's not possible, since .front may be implemented as a @property function). If .front were a ref function, then you may have a point.
Comment 10 monarchdodra 2013-08-01 11:16:36 UTC
(In reply to comment #9)
> Taking the address of .front technically undefined,
> since you can't guarantee that that's even possible for arbitrary range types
> (in general, it's not possible, since .front may be implemented as a @property
> function). If .front were a ref function, then you may have a point.

That's BS. That's like saying indexing is undefined because an arbitrary range isn't indexable. We have the `hasLValueElements` trait specifically for this. Ranges that offer references to their internal exist, and are a class of their own. Also, the "@property" thing is a non-argument: It is just buggy behaviour; The new implementation (as explained by Kenji) is that &propFun should and *will* return the address of what is returned after having called propFun.

That said, I'll *agree* that there is not a single trait that guarantees references remain valid after a pop, nor does *anybody* rely on it, so we can indeed safely say that my point was, well, pointless :)
Comment 11 basile-z 2017-08-26 18:15:25 UTC
Isn't .byLineCopy fix this issue ?
Comment 12 Steven Schveighoffer 2017-09-13 15:47:52 UTC
(In reply to b2.temp from comment #11)
> Isn't .byLineCopy fix this issue ?

Yes. Marking as WONTFIX as I think that's the closest to what happened.