D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 5743 - readf cannot read wchar or dchar from UTF-8 stdin
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: phobos
Version: D2
Hardware: Other Linux
Importance: P2 normal
Assignee: No Owner
 
Reported: 2011-03-16 13:20 UTC by Ali Cehreli
Modified: 2020-03-21 03:56 UTC

Description Ali Cehreli 2011-03-16 13:20:54 UTC
I compiled the following program with dmd 2.052 and ran it on an Ubuntu 10.10 console. It reads only the first code unit instead of the whole character.

import std.stdio;

void main()
{
    wchar c;         // Please note: same problem with dchar as well
    readf(" %s", &c);
    writeln(c);
}

For example, when the input is the character ö (encoded as the bytes 195 182 in UTF-8), only the first code unit is read, and the output is the Unicode character whose code point equals the value of that code unit (195, i.e. U+00C3, Ã).

In a sense, the program reads a code unit and outputs it as a code point.

Thank you,
Ali
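
The difference between "a code unit read as a code point" and a properly decoded character can be sketched as below. This is only an illustration of the symptom, not the actual readf implementation; the helper names firstUnitAsCodePoint and firstCodePoint are hypothetical, and the decoding relies on std.utf.decode.

```d
import std.stdio;
import std.utf : decode;

/// Hypothetical helper: mimics the reported symptom by widening the
/// first UTF-8 code unit directly into a code point.
dchar firstUnitAsCodePoint(string s)
{
    return cast(dchar) cast(ubyte) s[0];
}

/// Hypothetical helper: decodes the first complete code point,
/// consuming as many UTF-8 code units as the sequence requires.
dchar firstCodePoint(string s)
{
    size_t index = 0;
    return decode(s, index);
}

void main()
{
    string input = "\u00F6"; // ö, stored as the UTF-8 bytes 195 182

    writeln(firstUnitAsCodePoint(input)); // U+00C3, the wrong character
    writeln(firstCodePoint(input));       // U+00F6, the expected ö
}
```

For the input ö, the first helper yields U+00C3 (the value 195 taken as a code point), while the second consumes both bytes and yields U+00F6.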
Comment 1 Don 2011-03-19 17:14:45 UTC
This is marked as 'regression'. What previous version did it work with?
Comment 2 Ali Cehreli 2011-03-19 17:49:29 UTC
"regression" turns out to be my mistake. I just went back more than a dozen dmd versions and saw that std.stdio.readf (or File.readf) is pretty new.

I had been using std.cstream.din, which worked better than stdio.readf. Assuming that both used the same underlying format functions, I thought that this was a regression.
Comment 3 basile-z 2015-11-21 13:58:19 UTC
This works now with 2.069. Tested with ö and é as well.