Issue 4488 - Faster fixed-size array initialization from literal
Status: RESOLVED DUPLICATE of issue 2356
Product: D
Classification: Unclassified
Component: dmd
Version: D2
Hardware: All All
Importance: P2 enhancement
Assignee: No Owner
Keywords: performance
Reported: 2010-07-19 14:43 UTC by bearophile_hugs
Modified: 2010-07-19 23:18 UTC
Description bearophile_hugs 2010-07-19 14:43:10 UTC
This D2 program initializes a small fixed-size array allocated on the stack:


import std.c.stdio: printf;
import std.c.stdlib: atof;
void main() {
    double x = atof("1.0");
    double y = atof("2.0");
    double[2] arr = [x, y];
    printf("%f\n", arr[1]);
}


The asm generated by dmd in an optimized build shows two calls to initialize the array (one to build it on the heap and one to copy from heap to stack). Even LDC leaves the first call.

Initialization of small stack-allocated arrays like this is common in high-performance code, and these two calls can reduce performance if they sit inside a function called in an inner loop.

So can these two calls be removed in this simple situation? The compiler can recognize that no heap allocation is needed in this case.
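
For comparison, here is a minimal sketch of a manual workaround that avoids the array literal altogether, so the compiler has no literal to build on the heap. The helper name makePair is hypothetical and not part of the original report, and the imports use core.stdc rather than the std.c modules of the program above. Whether the generated code ends up free of runtime calls still depends on the compiler and its flags.

```d
import core.stdc.stdio : printf;
import core.stdc.stdlib : atof;

// Hypothetical helper (not from the report): fills a fixed-size array
// element by element instead of assigning an array literal to it.
double[2] makePair(double x, double y) {
    double[2] arr = void; // skip default initialization; both slots are written below
    arr[0] = x;
    arr[1] = y;
    return arr;
}

void main() {
    double x = atof("1.0");
    double y = atof("2.0");
    double[2] arr = makePair(x, y);
    printf("%f\n", arr[1]);
}
```

This trades the compact literal syntax for explicit element stores, which is the cost the enhancement request would remove.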

------------

DMD v.2.047 asm, optimized build:

__Dmain comdat
        sub ESP,04Ch
        mov EAX,offset FLAT:_DATA
        push    EBX
        push    ESI
        push    EAX
        call    near ptr _atof
        mov ECX,offset FLAT:_DATA[4]
        fstp    qword ptr 010h[ESP]
        push    010h
        push    ECX
        call    near ptr _atof
        add ESP,0FFFFFFFCh
        mov EDX,offset FLAT:_D12TypeInfo_xAd6__initZ
        fstp    qword ptr [ESP]
        push    dword ptr 020h[ESP]
        push    dword ptr 020h[ESP]
        push    2
        push    EDX
>       call    near ptr __d_arrayliteralT
        add ESP,018h
        push    EAX
        lea EBX,028h[ESP]
        push    EBX
>       call    near ptr _memcpy
        mov ESI,offset FLAT:_DATA[8]
        push    dword ptr 038h[ESP]
        push    dword ptr 038h[ESP]
        push    ESI
        call    near ptr _printf
        add ESP,01Ch
        xor EAX,EAX
        pop ESI
        pop EBX
        add ESP,04Ch
        ret

------------

LDC asm, optimized build:

_Dmain:
    subl    $52, %esp
    movl    $.str, (%esp)
    call    atof
    fstpt   28(%esp)
    movl    $.str1, (%esp)
    call    atof
    fstpt   16(%esp)
    movl    $2, 4(%esp)
    movl    $_D11TypeInfo_Ad6__initZ, (%esp)
>   call    _d_newarrayvT
    fldt    28(%esp)
    fstpl   (%eax)
    fldt    16(%esp)
    fstpl   40(%esp)
    movsd   40(%esp), %xmm0
    movsd   %xmm0, 8(%eax)
    movsd   %xmm0, 4(%esp)
    movl    $.str2, (%esp)
    call    printf
    xorl    %eax, %eax
    addl    $52, %esp
    ret $8
Comment 1 nfxjfg 2010-07-19 19:52:27 UTC
The problem is that array and struct initializers never worked with non-static data (which is utterly retarded; even C can initialize any data with these; what's even more retarded is that this was "fixed" by simply removing struct initializers in D2 and replaced them by constructors, which are redundant to opCall anyway... but I'm digressing.)

This means the above snippet is really:

double[2] arr;
arr = [x, y];

This makes it more obvious why it's allocating.
Comment 2 Don 2010-07-19 21:47:34 UTC
This looks like a duplicate of bug 2356.
Comment 3 bearophile_hugs 2010-07-19 23:18:32 UTC

*** This issue has been marked as a duplicate of issue 2356 ***