D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 14223 - TimSort algorithm is incorrect
Summary: TimSort algorithm is incorrect
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: phobos (show other issues)
Version: D2
Hardware: All All
: P1 normal
Assignee: No Owner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-02-24 21:34 UTC by Ali Cehreli
Modified: 2017-07-19 17:42 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Ali Cehreli 2015-02-24 21:34:38 UTC
The following article describes and proposes a fix for a common bug in the TimSort algorithm:

  http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

ketmar agrees that Phobos's version of TimSort has the same bug:

  http://forum.dlang.org/thread/mcigvq$11p0$1@digitalmars.com#post-mciit8:242dvo:24102:40digitalmars.com

Ali
Comment 1 Ketmar Dark 2015-02-24 21:48:10 UTC
a possible patch:

fixes TimSort implementation, according to
http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

diff --git a/std/algorithm/sorting.d b/std/algorithm/sorting.d
index a71eb47..f40e957 100644
--- a/std/algorithm/sorting.d
+++ b/std/algorithm/sorting.d
@@ -1230,21 +1230,21 @@ private template TimSortImpl(alias pred, R)
             while (stackLen > 1)
             {
                 immutable run3 = stackLen - 1;
-                immutable run2 = stackLen - 2;
+                auto run2 = stackLen - 2;
                 immutable run1 = stackLen - 3;
-                if (stackLen >= 3 && stack[run1].length <= stack[run2].length + stack[run3].length)
+                immutable run0 = stackLen - 4;
+                if ((stackLen >= 3 && stack[run1].length <= stack[run2].length + stack[run3].length) ||
+                    (stackLen >= 4 && stack[run0].length <= stack[run2].length + stack[run1].length))
                 {
-                    immutable at = stack[run1].length <= stack[run3].length
-                        ? run1 : run2;
-                    mergeAt(range, stack[0 .. stackLen], at, minGallop, temp);
-                    --stackLen;
+                    if (stack[run1].length < stack[run3].length)
+                        --run2;
                 }
-                else if (stack[run2].length <= stack[run3].length)
+                else if (stackLen < 2 || stack[run2].length > stack[run3].length)
                 {
-                    mergeAt(range, stack[0 .. stackLen], run2, minGallop, temp);
-                    --stackLen;
+                    break; // invariant is established
                 }
-                else break;
+                mergeAt(range, stack[0 .. stackLen], run2, minGallop, temp);
+                --stackLen;
             }
         }
 

i haven't analyzed the code, though, and didn't run any unittests besides the simplest ones, so i can't guarantee that i did it right.

so hereby i sommon someone more knowledgeable to properly check it.