D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 8047 - important opcodes missing from core/simd.d
Summary: important opcodes missing from core/simd.d
Status: REOPENED
Alias: None
Product: D
Classification: Unclassified
Component: druntime
Version: D2
Hardware: x86_64 All
Importance: P2 major
Assignee: No Owner
URL:
Keywords: bootcamp, SIMD
Depends on:
Blocks:
 
Reported: 2012-05-05 00:31 UTC by Sean Cavanaugh
Modified: 2024-12-07 13:31 UTC
CC List: 6 users

Description Sean Cavanaugh 2012-05-05 00:31:58 UTC
A number of opcodes are missing, some far more critical than others. They are listed here roughly in order of importance, most important first:

missing store instructions (and some loads)
STOSS
STOSD
STOAPS
STOAPD
STOD
STOQ
(there are a few others scattered in the enum table)


movemask (critical for doing branching tests against simd registers):
MOVMSKPD
MOVMSKPS


missing comparisons
CMPPS
CMPPD
CMPSD
CMPSS


missing conversions
CVTPS2PI
CVTSD2SI
CVTSI2SD
CVTSI2SS
CVTSS2SI
CVTTPD2PI
CVTTPS2PI
CVTTSD2SI
CVTTSS2SI
Comment 1 Marco Leise 2013-11-22 16:43:26 UTC
Some mnemonics like PMOVMSKB cannot even be expressed with the interface that is offered. It returns a 32-bit word consisting of only the high bit of every byte in the MMX or SSE register.
Since I've tried other workarounds, up to inline asm and hard-coding hex values, and nothing worked, I've set this bug to 'major'.
The inline asm workaround usually ends in this:
Internal error: backend/cgcod.c 1561
But that's not what this bug report is about. I'm just stating that there are more SIMD bugs lurking under the surface.
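For readers unfamiliar with PMOVMSKB, the result described above can be modelled in plain D. This is only a scalar sketch of the semantics, not something core.simd offers: the name `pmovmskbReference` is made up for illustration, and the 128-bit register is represented as a `ubyte[16]`. For an SSE register only the low 16 bits of the returned word are ever set.

```d
// Hypothetical reference helper (not part of core.simd): a scalar model of
// what PMOVMSKB computes for a 128-bit register viewed as 16 bytes.
uint pmovmskbReference(const ubyte[16] bytes) pure nothrow @nogc @safe
{
    uint mask = 0;
    foreach (i; 0 .. 16)
    {
        // Bit 7 (the high bit) of byte i becomes bit i of the result.
        mask |= cast(uint)(bytes[i] >> 7) << i;
    }
    return mask;
}

unittest
{
    ubyte[16] v;                 // all lanes start at zero
    v[0] = 0x80;                 // high bit set
    v[3] = 0xFF;                 // high bit set
    v[15] = 0x90;                // high bit set
    v[7] = 0x7F;                 // high bit clear, contributes nothing
    assert(pmovmskbReference(v) == ((1u << 0) | (1u << 3) | (1u << 15)));
}
```

The point of the sketch is the return type: the value is an ordinary integer, which is exactly what the vector-returning `__simd` interface cannot express.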
Comment 2 John Colvin 2014-12-10 16:58:20 UTC
Also missing is PCMPGT[SDQ]

Can they just be added to the druntime file or are compiler modifications necessary?
Comment 3 Martin Nowak 2014-12-10 19:12:08 UTC
(In reply to John Colvin from comment #2)
> Also missing is PCMPGT[SDQ]
> 
> Can they just be added to the druntime file or are compiler modifications
> necessary?

Looks like most can simply be added, just have to add the correct opcode.
But PCMPGTQ is already there and works for me on 2.066.1.
https://github.com/D-Programming-Language/druntime/blob/109a604a08c7592687a9b482ac2a8bb8ded80ccc/src/core/simd.d#L3633
Comment 4 Marco Leise 2015-10-03 19:33:40 UTC
    //PMOVMSKB = 0x660FD7,

has been commented out in core.simd. We may as well comment out all instructions returning non-XMM values until this is resolved. The ones I could find so far are:

COMISD
COMISS
CVTSD2SI
CVTSS2SI
CVTTPD2PI
CVTTPS2PI
CVTTSD2SI
CVTTSS2SI
MASKMOVDQU
MASKMOVQ
MOVMSKPD
MOVMSKPS
PCMPESTRI
PCMPISTRI
PMOVMSKB
PTEST
UCOMISS
UCOMISD

CRC32, POPCNT and LZCNT don't belong in the XMM enum. They were introduced side-by-side with SSE4.2, but don't work on XMM registers, and the latter two have their own CPUID flags.
Comment 5 Walter Bright 2016-11-22 01:04:30 UTC
These have been in core.simd for a while.
Comment 6 Marco Leise 2016-11-22 08:36:34 UTC
(In reply to Walter Bright from comment #5)
> These have been in core.simd for a while.

While that is true for the original bug description, the hard issue is not the missing enum values themselves but the lack of support for them, namely returning something other than SIMD vectors, as I outlined in comments #1 and #4 above. The XMM enum is still rather messy if you look at it from some distance:

There are some non-SSE opcodes in it, as noted in their comment (e.g. POPCNT and LZCNT have nothing to do with SSE). They should be handled in core.bitop instead, IMHO.
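As a rough illustration of the core.bitop route, the sketch below uses the existing `core.bitop.popcnt` and `core.bitop.bsr` functions on plain integers. The helper `leadingZeros32` is hypothetical and only mimics the LZCNT result; whether a given compiler emits the POPCNT or LZCNT instruction for this code is not something the sketch assumes.

```d
import core.bitop : bsr, popcnt;

// Hypothetical helper (not in druntime): leading-zero count of a 32-bit value.
// bsr returns the index of the highest set bit and is undefined for 0, while
// LZCNT returns the operand width for 0, so that case is handled explicitly.
int leadingZeros32(uint x) pure nothrow @nogc @safe
{
    return x == 0 ? 32 : 31 - bsr(x);
}

unittest
{
    // Population count works on general-purpose integers, no XMM register involved.
    assert(popcnt(0xF0F0u) == 8);

    assert(leadingZeros32(1) == 31);
    assert(leadingZeros32(0x8000_0000) == 0);
    assert(leadingZeros32(0) == 32);
}
```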

Some non-working opcodes are rightfully commented out until this bug is resolved (e.g. PMOVMSKB).

Other non-working opcodes are NOT commented out (e.g. MOVMSKPD from the original description, see comment #4 for a list).

AMD's SSE4a seems to have an undecided fate, with its opcodes commented out in their entirety. This may be considered a separate bug, but then again, whoever works on this bug will probably look at them as well.

The ddoc for XMM still says: "XMM opcodes that conform to the following: opcode xmm1,xmm2/mem and do not have side effects (i.e. do not write to memory)." This description doesn't apply to e.g. CRC32 or PREFETCH.

DMD + core.simd still need some work to move SIMD support out of the proof-of-concept phase. Admittedly I haven't run any tests since 2015, so if any of the above is in good shape now, shame on me. :)
Comment 7 ponce 2021-01-07 13:55:33 UTC
Hello,

Can't implement the following intrinsics for DMD:

_mm_movemask_ps needs MOVMSKPS support. As Marco Leise said 7 years ago, it is an instruction that returns its result in a general-purpose register instead of an XMM register.

----------------------------------------------------
int _mm_movemask_ps (__m128 a) pure @trusted
{
    static if (DMD_with_DSIMD)
    {
        // suggested API ? This API returning an int doesn't exist in core.simd
        int res =  __simd_int(XMM.MOVMSKPS, a); 
        return res;
    }
    else static if (GDC_with_SSE)
    {
        return __builtin_ia32_movmskps(a);
    }
    else static if (LDC_with_SSE1)
    {
        return __builtin_ia32_movmskps(a);
    }
    else
    {
        int4 ai = cast(int4)a;
        int r = 0;
        if (ai.array[0] < 0) r += 1;
        if (ai.array[1] < 0) r += 2;
        if (ai.array[2] < 0) r += 4;
        if (ai.array[3] < 0) r += 8;
        return r;
    }
}
----------------------------------------------------


Same remark for:
- _mm_movemask_epi8 (pmovmskb)
- _mm_movemask_pd (movmskpd)
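
For comparison, a scalar fallback for the movmskpd case can follow the same pattern as the last branch of the code above. This is a sketch only: `movemaskPd` and `anyNegative` are hypothetical names, and the two lanes are passed as a `double[2]` rather than a `__m128d` so that the example does not depend on vector support in the compiler.

```d
// Hypothetical scalar stand-in for MOVMSKPD (not an existing core.simd API):
// collects the sign bit of each 64-bit lane into bits 0 and 1 of an int.
int movemaskPd(const double[2] a) pure nothrow @nogc @trusted
{
    // Reinterpret each lane as a 64-bit integer to read the IEEE 754 sign bit.
    const ulong lo = *cast(const(ulong)*) &a[0];
    const ulong hi = *cast(const(ulong)*) &a[1];
    return cast(int)(lo >>> 63) | (cast(int)(hi >>> 63) << 1);
}

// Typical use: the integer mask feeds an ordinary branch.
bool anyNegative(const double[2] a) pure nothrow @nogc @safe
{
    return movemaskPd(a) != 0;
}

unittest
{
    assert(movemaskPd([1.0, -2.0]) == 2);   // only the upper lane is negative
    assert(movemaskPd([-1.0, -1.0]) == 3);  // both lanes negative
    assert(!anyNegative([0.0, 5.0]));
}
```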
Comment 8 dlangBugzillaToGithub 2024-12-07 13:31:59 UTC
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/dmd/issues/17117

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB