Issue 22367 - Modules compiled with -betterC never generate a ModuleInfo
Status: NEW
Alias: None
Product: D
Classification: Unclassified
Component: dmd
Version: D2
Hardware: All All
Importance: P1 blocker
Assignee: No Owner
URL:
Keywords: betterC
Depends on:
Blocks:
 
Reported: 2021-10-08 16:41 UTC by Andrei Alexandrescu
Modified: 2022-12-06 09:20 UTC

Description Andrei Alexandrescu 2021-10-08 16:41:56 UTC
This issue was painstakingly tracked down by Rikki Cattermole, with the help of several other folks in the community, during work on https://github.com/dlang/dmd/pull/12832/.

Under certain conditions, not readily reproducible in the small, importing a module XYZ that is separately compiled with `-betterC` produces a linker error: the symbol __ModuleInfo, which the XYZ module is supposed to define, is missing.

One remedy is to define this inside XYZ:

extern(C) __gshared ModuleInfo _D3dmd7backend7ptrntab12__ModuleInfoZ;

The apparent reason is that importing a module in turn adds references to the ModuleInfo of every imported module; see the importedModules property defined in .object.ModuleInfo.
Comment 1 Iain Buclaw 2021-10-08 17:02:08 UTC
Is this really an issue?  What use-case is there for compiling a D library with -betterC then using it from a D program?
Comment 2 Andrei Alexandrescu 2021-10-08 17:07:35 UTC
(In reply to Iain Buclaw from comment #1)
> Is this really an issue?  What use-case is there for compiling a D library
> with -betterC then using it from a D program?

A use case is the dmd compiler itself (not sure why it's built this way).

Anyway, there are a variety of cases in which you want to build a library in D that could work with both C and D.
Comment 3 Richard (Rikki) Andrew Cattermole 2021-10-08 17:27:41 UTC
Minified:

$ dmd -betterC -lib mylib1.d mylib2.d
$ dmd -I. mylib1 myexe.d -main

// myexe.d
module myexe;
import mylib2;

// mylib1.d
module mylib1;
static this() {}

// mylib2.d
module mylib2;
import mylib1;
Comment 4 Mathias LANG 2021-10-21 13:53:41 UTC
Why does `mylib1` have a module ctor? Shouldn't `-betterC` reject it?
Comment 5 Iain Buclaw 2021-10-21 19:53:27 UTC
(In reply to Mathias LANG from comment #4)
> Why does `mylib1` have a module ctor? Shouldn't `-betterC` reject it?

I think someone had the ingenuity to make module ctors `pragma(crt_constructor)` in betterC mode.
Comment 6 Mathias LANG 2021-10-22 02:26:49 UTC
Shouldn't that be only for `shared static this`?
Comment 7 Walter Bright 2022-11-30 07:42:03 UTC
> $ dmd -betterC -lib mylib1.d mylib2.d

This compiles mylib1.d and mylib2.d, and creates a library file mylib1.lib containing the object code from both.

> $ dmd -I. mylib1 myexe.d -main

This compiles mylib1.d and myexe.d together to form an executable named mylib1.exe. It fails to find anything from mylib2.d, because that wasn't given on the command line. This is not a compiler bug.

I think what you really meant was:

    dmd -betterC mylib1.d mylib2.d

which creates mylib1.obj, containing the compiled versions of mylib1.d and mylib2.d

    dmd -I. myexe.d mylib1.obj -main

which compiles myexe.d and links it to mylib1.obj, creating an executable named myexe.exe. Or at least it tries to, as it gives:

    myexe.obj(myexe)
     Error 42: Symbol Undefined __D6mylib212__ModuleInfoZ

because -betterC suppresses generating a ModuleInfo, while myexe.d expects it to be there. This is a compiler bug, or at least a compiler problem.
Comment 8 Walter Bright 2022-11-30 08:14:07 UTC
A module generates a ModuleInfo if at least one of these is true:

1. it imports a module that generates ModuleInfo
2. it has a static constructor
3. it has a static destructor
4. it has a unit test declaration

However, generation is disabled entirely if -betterC is on. This particular bug report is the result of having a static constructor.
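
As a rough illustration of the rule above, here is a minimal sketch in D. It is a hypothetical model, not dmd's actual implementation: the names `ModuleTraits` and `generatesModuleInfo` are made up for this example.

```d
// Hypothetical model of the rule described in this comment; not dmd source.
struct ModuleTraits
{
    bool importsModuleWithInfo; // 1. imports a module that generates ModuleInfo
    bool hasStaticCtor;         // 2. has a static constructor
    bool hasStaticDtor;         // 3. has a static destructor
    bool hasUnittest;           // 4. has a unit test declaration
}

bool generatesModuleInfo(ModuleTraits m, bool betterC)
{
    if (betterC)
        return false; // -betterC suppresses ModuleInfo regardless of the triggers
    return m.importsModuleWithInfo || m.hasStaticCtor
        || m.hasStaticDtor || m.hasUnittest;
}

// mylib1 from the minified example: only a static constructor.
enum mylib1 = ModuleTraits(false, true);

static assert( generatesModuleInfo(mylib1, false)); // regular D build
static assert(!generatesModuleInfo(mylib1, true));  // -betterC build: the importer's reference dangles
```

The second `static assert` is the situation in this report: the importing module still expects the symbol, but the betterC build never emits it.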
Comment 9 Walter Bright 2022-11-30 08:25:59 UTC
Iain has the right idea. The solution is to, when in -betterC mode:

1. automatically annotate static constructors with:

    pragma(crt_constructor) extern (C)

2. do the same for static destructors

3. not set `needmoduleinfo` for (1) and (2)

This will run the constructors and destructors using the C runtime library mechanism. The downside of this is the order of construction and destruction will be in the order the object files are seen by the linker, rather than a depth-first hierarchical order.

----

Mathias' suggestion is a good one. Give an error on `static this()`, and only work with `shared static this()`.
Comment 10 Walter Bright 2022-11-30 09:06:25 UTC
That doesn't quite solve the problem. Will have to think about it some more.
Comment 11 Walter Bright 2022-11-30 09:58:32 UTC
mylib1.d has a static constructor in it. When does construction happen?

In C code, the C runtime takes care of it, in the order they appear to the linker.

In D code, the D startup code takes care of it, *after* the C runtime does its initializations, in depth-first order.

The two are different, and are irreconcilable (though most static constructors probably don't care about the order, we can't really rely on that).
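
To make the difference concrete, here is a small sketch in D that models the depth-first order only. It assumes "depth-first" means a module is constructed after everything it imports; the function name `depthFirstOrder` and the import table are hypothetical and simply mirror the minified example from comment 3.

```d
// Sketch only: models "construct a module after everything it imports".
string[] depthFirstOrder(string[][string] imports, string root)
{
    string[] order;
    bool[string] seen;

    void visit(string mod)
    {
        if (mod in seen)
            return;
        seen[mod] = true;
        if (auto deps = mod in imports)
        {
            foreach (dep; *deps)
                visit(dep); // imported modules come first
        }
        order ~= mod;
    }

    visit(root);
    return order;
}

unittest
{
    // myexe imports mylib2, which imports mylib1.
    string[][string] imports = ["myexe": ["mylib2"], "mylib2": ["mylib1"]];
    assert(depthFirstOrder(imports, "myexe") == ["mylib1", "mylib2", "myexe"]);
}
```

The C runtime mechanism has no such import graph to consult. It runs constructors in the order the linker sees the object files, which need not match the order computed here.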

myexe.d has no way to know that it is importing a betterC module, so it can't do the right thing with the construction.

So, I propose another solution. mylib1.d simply has to choose whether it wants to do C construction or D construction. C construction would be:

    pragma(crt_constructor) extern (C) static this() { ... }

D construction would be:

    static this() { ... }

myexe.d, upon seeing the D static constructor, is going to expect a ModuleInfo from mylib1.d. When compiling mylib1.d with -betterC, the compiler can create a ModuleInfo for a D static constructor when it sees one.

The programmer creating a betterC library for both betterC and D programs would use:

    pragma(crt_constructor) extern (C) static this() { ... }
Comment 12 Richard (Rikki) Andrew Cattermole 2022-11-30 21:06:09 UTC
D module constructors of course shouldn't work in -betterC code. They can throw a warning as dead code and hence don't require ModuleInfo to be generated (this should be easy to resolve). That'll prevent surprises in the future.

However, the fundamental problem here is with the pay-as-you-go runtime. You can't turn on ModuleInfo when you need it in -betterC to move into the pay-as-you-go area of the scale.

We can't turn on ModuleInfo generation right now, because DllImport is incomplete (a good bit harder to implement). If we did turn it on right now, it would result in segfaults when D DLLs are used together with a D executable that uses the runtime.

I am arguing that instead of fixing this bug, we solve the DllImport issues first and use code like this as test cases to verify that it is indeed fixed.
Comment 13 Walter Bright 2022-12-01 03:32:57 UTC
In this case, ModuleInfo is how the D runtime runs static constructors. Programs compiled with betterC are meant to link only with the C runtime library, which knows nothing about ModuleInfo.

The problem here is writing a library that is compiled with betterC, and meant to be linked with either a betterC program or a D program.

Simply turning off ModuleInfo generation means the betterC library does not run its static constructors.

Since a D program that is importing betterC modules does not know if they are betterC modules or not, it is each betterC module's responsibility to choose how to do its own static construction.

I.e. a betterC module should use the following to run its static construction:

    pragma(crt_constructor) extern (C) void doMyStaticConstruction() { ... }

If a betterC module is to be linked only with a D main, it should do:

    static this() { ... }

It should not do both, as then the static constructions will get run twice if it is linked with a D main.

The change to fix this bug report, then, is for betterC modules to generate a ModuleInfo if it has a `static this` constructor. And to add these instructions to the documentation.

The DLL export stuff is an orthogonal problem.
Comment 14 Richard (Rikki) Andrew Cattermole 2022-12-01 03:47:01 UTC
(In reply to Walter Bright from comment #13)
> The change to fix this bug report, then, is for betterC modules to generate
> a ModuleInfo if it has a `static this` constructor. And to add these
> instructions to the documentation.

This should not be automatic.

-betterC is a collection of switches all rolled in together. One of these is turning off of ModuleInfo generation. This is how it works in both LDC and GDC.

It needs to be opt-in via switches. Otherwise, behavior that wasn't expected may occur.
Comment 15 Dlang Bot 2022-12-02 06:07:19 UTC
@WalterBright created dlang/dmd pull request #14665 "fix Issue 22367 - Modules compiled with -betterC never generate a Mod…" fixing this issue:

- fix Issue 22367 - Modules compiled with -betterC never generate a ModuleInfo

https://github.com/dlang/dmd/pull/14665
Comment 16 Walter Bright 2022-12-02 06:09:53 UTC
Rather than turning ModuleInfo generation on or off, it is better to use or avoid the triggers that cause the ModuleInfo to be generated.

For example, if a static constructor is written, and the ModuleInfo is suppressed, the program will link but the static constructor will never be run, causing the resulting program to not behave as expected.
Comment 17 Walter Bright 2022-12-06 09:20:57 UTC
I looked further into this.

Essentially, betterC code cannot generate ModuleInfo, because ModuleInfo also generates a call to _d_so_registry in druntime.

Having a `static this` in a betterC module, or a betterC module importing a module with a `static this`, requires a ModuleInfo to guarantee the semantics. Simply turning off ModuleInfo generation will get the program to link, but will leave the static this code un-run, i.e. the code will not work.

Instead, betterC code must use pragma(crt_constructor) functions to perform static initializations. These functions will be called by the C runtime startup code.
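
A minimal sketch of what that looks like, assuming a single file `example.d` built with `dmd -betterC example.d`. The module name, the `tableSize` variable and the `exampleInit` function are hypothetical and only serve as illustration.

```d
// Sketch: static initialization in betterC via the C runtime, no ModuleInfo needed.
module example;

import core.stdc.stdio : printf;

__gshared int tableSize; // hypothetical state that needs startup initialization

// Called by the C runtime startup code before main.
pragma(crt_constructor)
extern (C) void exampleInit()
{
    tableSize = 42;
}

extern (C) int main()
{
    printf("tableSize = %d\n", tableSize);
    return tableSize == 42 ? 0 : 1; // exit code 0 if the constructor ran
}
```

Because the constructor is an ordinary `extern (C)` function registered with the C runtime, a D program that imports such a module has nothing to look up in a ModuleInfo for it.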

To rely on pragma(crt_constructor) means fixing the reported problems with them, and making sure they are correctly defined. To that end, there is:

https://github.com/dlang/dmd/pull/14669

which is to be followed by going through druntime to replace `static this` with `pragma(crt_constructor)` wherever possible, such as with:

https://github.com/dlang/dmd/pull/14671