D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 6192 - std.algorithm.sort performance
Summary: std.algorithm.sort performance
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: phobos (show other issues)
Version: D2
Hardware: All All
: P2 enhancement
Assignee: No Owner
URL:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2011-06-21 15:51 UTC by bearophile_hugs
Modified: 2016-10-01 14:22 UTC (History)
4 users (show)

See Also:


Attachments
sort bench (5.33 KB, application/octet-stream)
2011-06-21 15:51 UTC, bearophile_hugs
Details

Note You need to log in before you can comment on or make changes to this issue.
Description bearophile_hugs 2011-06-21 15:51:37 UTC
Created attachment 1004 [details]
sort bench

The small benchmark program shows the performance of std.algorithm.sort
compared to a different one (it doesn't accept a key sort function, etc).

One output example:

sort-sort2 benchmarks (milliseconds), N=6000000:
  Random distribution:
    sort:    4131
    sort2:   2635
  Already sorted arrays:
    sort:    1979
    sort2:    787
  Inverted order arrays:
    sort:    2105
    sort2:   1374
  Few random doubles appended to the sorted arrays:
    sort:    2890
    sort2:   1551


See also bug 5077
Comment 1 Dmitry Olshansky 2011-06-26 08:35:32 UTC
Results for me on upcomming 2.054 Phobos:

sort-sort2 benchmarks (milliseconds), N=6000000:
  Random distribution:
    sort:    1040
    sort2:   1012
  Already sorted arrays:
    sort:     357
    sort2:    312
  Inverted order arrays:
    sort:     361
    sort2:    576
  Few random doubles appended to the sorted arrays:
    sort:    3657
    sort2:    482

compiled with:
dmd -O -inline -release sort_bench.d
Comment 2 bearophile_hugs 2011-07-06 10:42:20 UTC
Was this update in the sort code caused by this enhancement request, or are they unrelated?

Timings with DMD 2.054beta, a different CPU:


sort-sort2 benchmarks (milliseconds), N=6000000:
  Random distribution:
    sort:    1892
    sort2:   1131
  Already sorted arrays:
    sort:     748
    sort2:    376
  Inverted order arrays:
    sort:    1048
    sort2:    656
  Few random doubles appended to the sorted arrays:
    sort:    3085
    sort2:    734


It seems the random case is improved a lot, the already sorted case is improved, the inverted order is about the same, and the few random values added case is slower than before.
Comment 3 kennytm 2011-07-06 11:00:55 UTC
(In reply to comment #2)
> Was this update in the sort code caused by this enhancement request, or are
> they unrelated?
> 
> Timings with DMD 2.054beta, a different CPU:
> 
> 
> sort-sort2 benchmarks (milliseconds), N=6000000:
>   Random distribution:
>     sort:    1892
>     sort2:   1131
>   Already sorted arrays:
>     sort:     748
>     sort2:    376
>   Inverted order arrays:
>     sort:    1048
>     sort2:    656
>   Few random doubles appended to the sorted arrays:
>     sort:    3085
>     sort2:    734
> 
> 
> It seems the random case is improved a lot, the already sorted case is
> improved, the inverted order is about the same, and the few random values added
> case is slower than before.

https://github.com/D-Programming-Language/phobos/pull/74
Comment 4 bearophile_hugs 2011-07-11 04:58:06 UTC
Thank you for your work. The new timings (after pull 74):

sort-sort2 benchmarks (milliseconds), N=6000000:
  Random distribution:
    sort:    1261
    sort2:   1201
  Already sorted arrays:
    sort:     300
    sort2:    441
  Inverted order arrays:
    sort:     315
    sort2:    729
  Few random doubles appended to the sorted arrays:
    sort:    9962
    sort2:    632


The last case is a common use case in Python code. Python programmers sometimes append unsorted items to a sorted list, and once in a while they sort it. Because the Timsort used in Python (and Java) is able to face this case very well, this is an efficient strategy to keep a rarely updated list sorted in Python.
I don't know how much common this pattern is in real D code (but experience shows that often real-world data is already partially sorted).
Comment 5 Andrei Alexandrescu 2014-02-17 15:01:46 UTC
Just pasting this for future reference: dual-pivot quicksort

http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

Seems to be used in Java 8's sort for primitive types according to http://vkostyukov.ru/posts/dual-pivot-binary-search/
Comment 6 Andrei Alexandrescu 2016-01-12 03:22:19 UTC
https://github.com/D-Programming-Language/phobos/pull/3922
Comment 7 Andrei Alexandrescu 2016-01-12 03:27:23 UTC
(In reply to Andrei Alexandrescu from comment #6)
> https://github.com/D-Programming-Language/phobos/pull/3922

With this PR and the command line:

dmd -O -inline -release -run sort-benchmark.d 6000000

the output is:

sort-sort2 benchmarks (milliseconds), N=6000000:
  Random distribution:
    sort:     696
    sort2:    680
  Already sorted arrays:
    sort:     122
    sort2:    113
  Inverted order arrays:
    sort:     132
    sort2:    224
  Few random doubles appended to the sorted arrays:
    sort:     244
    sort2:    234
Comment 8 Andrei Alexandrescu 2016-01-12 06:06:15 UTC
After a few more tweaks:

dmd -O -inline -release -run sort-benchmark.d 6000000                                 

sort-sort2 benchmarks (milliseconds), N=6000000:
  Random distribution:
    sort:     702
    sort2:    689
  Already sorted arrays:
    sort:     108
    sort2:    115
  Inverted order arrays:
    sort:     154
    sort2:    215
  Few random doubles appended to the sorted arrays:
    sort:     247
    sort2:    251
Comment 9 Andrei Alexandrescu 2016-10-01 05:07:51 UTC
https://github.com/dlang/phobos/pull/4826
Comment 10 Andrei Alexandrescu 2016-10-01 14:22:59 UTC
Fixed by https://github.com/dlang/phobos/pull/4826