Currently the D GC allocates arrays aligned to 16 bytes, which is suitable for XMM registers:

auto a1 = new double2[128];

But I think the D GC should also return this a2 aligned to 32 bytes, as needed for efficient code that uses YMM registers, which are 256 bits wide:

auto a2 = new double4[64];

Eventually the D GC should return this a3 aligned to 64 bytes, for efficient code that uses ZMM registers (Intel Xeon Phi), which are 512 bits wide:

auto a3 = new double8[32];
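A minimal way to observe the behaviour described in this report is to inspect the address the GC hands back. The sketch below is an illustration only: `Vec256` is a hypothetical stand-in for `double4` (used so the code compiles without AVX flags), and `isAligned` is a helper introduced here, not a druntime function. It prints the result rather than asserting it, since the outcome depends on the compiler and runtime version.

```d
import std.stdio : writefln;

// Hypothetical stand-in for a 256-bit vector type; align(32) mirrors the
// alignment that double4 would require.
align(32) struct Vec256
{
    double[4] lanes;
}

/// Returns true if `p` is a multiple of `alignment` (which must be a power of two).
bool isAligned(const(void)* p, size_t alignment)
{
    return (cast(size_t) p & (alignment - 1)) == 0;
}

void main()
{
    static assert(Vec256.alignof == 32);

    auto a2 = new Vec256[64];
    // Report whether the GC honoured the 32-byte requirement for this allocation.
    writefln("a2.ptr = %s, 32-byte aligned: %s",
             a2.ptr, isAligned(a2.ptr, Vec256.alignof));
}
```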
Yes, double4 should intrinsically be align(32), just like float4/double2 are intrinsically align(16). Likewise, align(64) for ZMM regs. The GC should respect the explicit alignment of any type. If it doesn't, then that is another bug.
For clarity, as a simple compiler rule, all __vector() types should be intrinsically aligned to their .sizeof. This is correct on all architectures I know of. There is the occasional architecture that might not mind a smaller alignment, but I think it's still valuable to enforce the alignment on those (rare) platforms for portability (structure consistency across platforms), especially since those platforms are often tested less thoroughly.
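The proposed rule (alignment equal to .sizeof) can be checked per type on a given compiler and target. This is only a sketch: `selfAligned` and `report` are names made up for the illustration, and the `static if (is(...))` guards are there because core.simd only defines the vector aliases the target supports. No particular output is claimed.

```d
import core.simd;
import std.stdio : writefln;

/// True when T's alignment equals its size, i.e. the rule proposed above.
bool selfAligned(T)()
{
    return T.alignof == T.sizeof;
}

/// Prints size and alignment of T as seen by the current compiler and target.
void report(T)()
{
    writefln("%s: sizeof=%s alignof=%s self-aligned=%s",
             T.stringof, T.sizeof, T.alignof, selfAligned!T());
}

void main()
{
    // Each alias exists only if the target supports that vector width.
    static if (is(float4))  report!float4();
    static if (is(double2)) report!double2();
    static if (is(double4)) report!double4();
}
```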
This came up again: https://forum.dlang.org/post/rgionugyekzpxuetyslh@forum.dlang.org I have an idea that might work: when allocating an array of items with alignment greater than 16 bytes, just offset the first element when calculating the size.
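To make the offset idea concrete, here is a user-level sketch of the same arithmetic: over-allocate by alignment minus one bytes, then start the array at the first suitably aligned address inside the block. This is not the druntime change itself; `allocAligned`, `alignUp` and `Vec256` are hypothetical names, and the code assumes a plain-data element type and a power-of-two alignment.

```d
import core.memory : GC;

// Hypothetical 32-byte aligned element type used for the demonstration.
align(32) struct Vec256
{
    double[4] lanes;
}

/// Rounds `addr` up to the next multiple of `alignment` (a power of two).
size_t alignUp(size_t addr, size_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}

/// Allocates `n` elements of T from the GC, offsetting the first element
/// so that it satisfies T.alignof. Assumes T is plain data.
T[] allocAligned(T)(size_t n)
{
    enum a = T.alignof;
    // Extra a - 1 bytes leave room to shift the start of the array.
    void* raw = GC.malloc(n * T.sizeof + a - 1);
    auto start = cast(T*) alignUp(cast(size_t) raw, a);
    T[] result = start[0 .. n];
    result[] = T.init; // GC.malloc does not initialise the memory
    return result;
}

void main()
{
    auto v = allocAligned!Vec256(64);
    assert(cast(size_t) v.ptr % Vec256.alignof == 0);
}
```

The returned slice points into the interior of the GC block, which keeps the block alive under the default block attributes. Appending to such a slice would reallocate through the normal array path, so the alignment guarantee only covers the original allocation.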
@schveiguy created dlang/druntime pull request #3192 "fix issue 10826 -- make sure large arrays obey 32-byte or greater alignment" fixing this issue: - fix issue 10826 -- make sure 32-byte aligned types (such as __vector(ubyte[32]) ) are aligned to 32-bytes when put into large arrays. https://github.com/dlang/druntime/pull/3192
Raising the importance to critical. Greater-than-natural alignments are respected by LDC pretty much everywhere AFAIK (stack and globals), except by druntime's GC, and they are used for optimizations. It's pretty embarrassing that people cannot safely GC-allocate arrays of vectors > 128 bit without potentially hitting segfaults (incl. in @safe code, obviously). And it's obviously not limited to vectors or arrays, but applies to all GC allocations of types with alignment > 16.
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/17259 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT