D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 5728 - "rol" in core.bitop
Summary: "rol" in core.bitop
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D2
Hardware: All All
: P2 enhancement
Assignee: No Owner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-10 14:10 UTC by bearophile_hugs
Modified: 2011-08-09 18:08 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description bearophile_hugs 2011-03-10 14:10:33 UTC
I'd like the "rol" X86 instrinsic added to core.bitop module.

--------------------

Rationale:

I'd like the intrinsic because rotating bits in a 32 bit number is a common enough operation (there was recently a thread in D.learn of a person in need for the fastest rotation of the bytes of a uint), and in C-derived languages there isn't a bitwise operator for it, while most CPUs have four or more instructions to rotate a register.

In C-like languages the problem is sometimes solved making the compiler a bit smarter, but this little test shows DMD is not able to optimize two shifts plus or into a single rot instruction:

dmd -O -release -inline test.d


int main(string[] args) {
    uint x = args.length;
    return (x << 24) | (x >> 8);
}


__Dmain comdat
    mov EAX,4[ESP]
    mov ECX,4[ESP]
    shl EAX,018h
    shr ECX,8
    or  EAX,ECX
    ret

--------------------

Using inline asm is an option:

int main(string[] args) {
    uint x = args.length;
    asm {
        rol x, 8;
    }
    return x;
}

__Dmain	comdat
    push	EBP
    mov	EBP,ESP
    push	EAX
    mov	EAX,8[EBP]
    mov	-4[EBP],EAX
    rol	-4[EBP],8
    mov	EAX,-4[EBP]
    mov	ESP,EBP
    pop	EBP
    ret

-----------------

But in some cases the inline gives problems, so I think an inlined intrinsic is more handy, safer, shorter, simpler to use:

union Four {
    uint u;
    ubyte[4] a;
}
void main() {
    Four f;
    asm {
        rol f.u, 8;
    }
}


test.d(8): bad type/size of operands 'f.u'

-----------------

Note: an alternative to the intrinsic is to introduce in D/DMD a standard syntax for inlined asm expressions using __asm(), as present in LDC, but they are quite less easy to use than a rot() function-like intrinsic (but they are much more useful than a single intrinsic):

http://www.dsource.org/projects/ldc/wiki/InlineAsmExpressions
Comment 1 bearophile_hugs 2011-07-28 05:35:18 UTC
Now DMD too is able to figure by itself when a rol is needed, but I think I'd like the instrinsic still:

https://github.com/D-Programming-Language/dmd/commit/32ea0206dead162f564689339b6c43212042643c

https://github.com/D-Programming-Language/dmd/commit/25df0466ab3ed9d36c354da7576e39f0c64663a5
Comment 2 Adam D. Ruppe 2011-08-03 19:18:02 UTC
I just want to say that dmd commit is fantastic. We got a little embarrassed by that in a discussion a couple weeks ago!

Re bearophiles request: it probably doesn't need an intrinsic since the optimzation should let a plain library function do it just as well.
Comment 3 Walter Bright 2011-08-03 22:45:51 UTC
Adam's right.
Comment 4 bearophile_hugs 2011-08-07 11:23:10 UTC
As reference, DMD 2.055head compiles this D2 function:


T rot(T)(T x, int shift) {
    return (x >> shift) | (x << (T.sizeof * 8 - shift));
}
void main() {
  int a = 0b_1111_1111;
  int b = 0b_0000_0010;
  rot(a, b);
}


To (-O -release):

_D4test10__T3rotTiZ3rotFiiZi    comdat
    push    EAX
    mov EAX,8[ESP]
    mov ECX,[ESP]
    sar EAX,CL
    mov ECX,020h
    mov EDX,8[ESP]
    sub ECX,[ESP]
    shl EDX,CL
    or  EAX,EDX
    pop ECX
    ret 4
Comment 5 Walter Bright 2011-08-09 17:34:51 UTC
(In reply to comment #4)
> As reference, DMD 2.055head compiles this D2 function:
> T rot(T)(T x, int shift) {
>     return (x >> shift) | (x << (T.sizeof * 8 - shift));
> }

The mistake is this is not a correct rotate function. (x >> shift) is a signed right shift. Rewrite as (x >>> shift).
Comment 6 bearophile_hugs 2011-08-09 18:08:29 UTC
(In reply to comment #5)

> The mistake is this is not a correct rotate function. (x >> shift) is a signed
> right shift. Rewrite as (x >>> shift).

Thank you, I'm wrong all the time. Now it shows the ror:


_D4test10__T3rotTiZ3rotFNaNbxixiZi
    push EAX
    mov  EAX,8[ESP]
    mov  ECX,[ESP]
    ror  EAX,CL
    pop  ECX
    ret  4