Issue 24132 - ImportC: Add support for wchar_t, char16_t, char32_t
Status: RESOLVED WONTFIX
Alias: None
Product: D
Classification: Unclassified
Component: dmd
Version: D2
Hardware: All Windows
Importance: P1 enhancement
Assignee: Walter Bright
URL:
Keywords: ImportC, pull, rejects-valid
Depends on:
Blocks:
 
Reported: 2023-09-02 21:07 UTC by Adam Wilson
Modified: 2023-11-21 03:35 UTC

Description Adam Wilson 2023-09-02 21:07:00 UTC
Currently ImportC handles wchar_t as a ushort, which works but is painful to use in D code that expects wchar* instead of ushort* for string pointers (example: the return value of toUTF16z).
Comment 1 ryuukk_ 2023-09-03 00:14:31 UTC
Sounds like an easy PR to do, add the define here:

https://github.com/dlang/dmd/blob/master/druntime/src/importc.h
Comment 2 Adam Wilson 2023-09-03 14:45:11 UTC
(In reply to ryuukk_ from comment #1)
> Sounds like an easy PR to do, add the define here:
> 
> https://github.com/dlang/dmd/blob/master/druntime/src/importc.h

It's not *quite* that simple. wchar_t is #defined as an unsigned short in C, and is either an unsigned short *or* an intrinsic type in C++. This means that when the preprocessor runs, it emits an unsigned short. Most C preprocessors do have a switch that treats wchar_t as an intrinsic type instead of "typedef unsigned short wchar_t". However, the current ImportC implementation does not support this and vomits up errors when you try to use it.

What we need is the ability for ImportC to recognize wchar_t, char16_t, and char32_t *after* the preprocessor has run so that ImportC can emit the appropriate char/wchar/dchar types.
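To illustrate why the information is lost, here is a minimal C11 sketch (not taken from ImportC or importc.h; the macro INT_TYPE_NAME and the function wchar_t_underlying_type are hypothetical names used only for this example). It shows that, in C, wchar_t is indistinguishable from whichever integer type the platform's headers chose for it, so nothing downstream of the headers can tell it was meant to hold characters. The reported type is platform-dependent: typically unsigned short on Windows, and int or unsigned int elsewhere.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* wchar_t */

/* Hypothetical helper: maps an expression to the name of its integer type. */
#define INT_TYPE_NAME(x) _Generic((x),                      \
    short: "short",          unsigned short: "unsigned short", \
    int: "int",              unsigned int: "unsigned int",     \
    long: "long",            unsigned long: "unsigned long",   \
    default: "other")

/* wchar_t behaves like any integer type: a cast from 1.5 truncates to 1. */
static_assert((wchar_t)1.5 == 1, "wchar_t is an integer type in C");

/* Returns the name of the integer type that wchar_t aliases on this platform. */
const char *wchar_t_underlying_type(void)
{
    wchar_t w = 0;
    /* _Generic sees only the underlying integer type, never "wchar_t". */
    return INT_TYPE_NAME(w);
}
```

Because the typedef name is purely an alias, a declaration such as `wchar_t *s` and one written with the underlying integer type are the same type to a C compiler.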
Comment 3 Walter Bright 2023-09-12 00:54:48 UTC
C11 defines char32_t as uint_least32_t, which is specified to be a typedef, not a macro or a keyword.

Preprocessors usually key off the existence of __cplusplus to turn C++ semantics on and off. ImportC currently does not do that.

I suggest putting:

     typedef unsigned short wchar_t;

in your copy of importc.h and seeing how far that gets you.
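The C11 point above can be checked with a short sketch, assuming a C11 toolchain whose library ships <uchar.h> (the macro and function names below are hypothetical and exist only for this example). It asserts at compile time that char16_t and char32_t are the same types as uint_least16_t and uint_least32_t, which is why a C compiler cannot tell them apart from plain integers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint_least16_t, uint_least32_t */
#include <uchar.h>    /* char16_t, char32_t */

/* Hypothetical helpers: 1 if the expression has exactly the named type. */
#define IS_UINT_LEAST16(x) _Generic((x), uint_least16_t: 1, default: 0)
#define IS_UINT_LEAST32(x) _Generic((x), uint_least32_t: 1, default: 0)

/* C11 specifies both names as typedefs of the least-width integer types. */
static_assert(IS_UINT_LEAST16((char16_t)0), "char16_t is uint_least16_t");
static_assert(IS_UINT_LEAST32((char32_t)0), "char32_t is uint_least32_t");

/* Returns 1 when a U'...' character constant has type uint_least32_t. */
int char32_literal_is_integer(void)
{
    char32_t c = U'a';   /* U'' constants have type char32_t */
    return IS_UINT_LEAST32(c);
}
```

In C++ the same names are distinct built-in types, so these assertions describe C only.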
Comment 4 Adam Wilson 2023-10-28 09:48:02 UTC
So I was poking around the compiler source today and I noticed that in lexer.d at line 80 there is a reference to wchar_t in ImportC specific code, so it appears to know about wide-chars in C. 

Then I discovered Ckeywords in tokens.d. So all we have to do is add wchar_ and dchar_ to that list and add the following to importc.h:

#define wchar_t wchar
#define char16_t wchar
#define char32_t dchar

A little hacky maybe, but for ImportC it would work, and it would allow us to use D-style strings natively, which is the semantically correct outcome. IIRC, the wchar and dchar types are unsigned short and unsigned int respectively when using extern(C), so that should function as normal.
Comment 5 Dlang Bot 2023-10-31 03:45:55 UTC
@LightBender created dlang/dmd pull request #15757 "Issue 24132 - ImportC: Add support for wide-chars." fixing this issue:

- Fix Issue 24132. Add wchar/dchar to C Keywords list. Use #defines to convert C wide-chars to wchar/dchar.

https://github.com/dlang/dmd/pull/15757
Comment 6 Adam Wilson 2023-11-21 03:35:58 UTC
I'm going to close this as WONTFIX.

I'll try to attack my problem with a post-processor script.