Created attachment 1182
Source file with program that

The attached program implements part of Brent's minimization algorithm for one-dimensional functions. The code is from Numerical Recipes, 3rd edition. I use dmd 2.061.

When I run the program with "rdmd brent_test.d", it runs fine and gives the correct result. When I run it with optimization, i.e. with "rdmd -O brent_test.d", it behaves differently: it enters an infinite loop and eventually throws the expected exception for too many iterations.

You can see that I placed a writefln() on line 45, which outputs the value of variable a. When you move this writefln statement just one line down, i.e. below the if-statement, the code runs fine, even with optimization.

A colleague of mine suggested that there might be a bug related to the large number of local variables. Maybe running out of registers causes the compiler to spill values to memory and reload them incorrectly.

Appreciate help!
Stephan
During debugging, I actually looked at the value of every single local variable, and you can actually see how the value of some variables (for example "a") changes from one iteration to the next, without any assignment.
I just checked: The bug definitely was introduced with version 2.061! With dmd version 2.060, everything works fine, with and without the "-O" switch.
I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you can try.
(In reply to comment #3)
> I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
> can try.

What actually seems to be corrupted are the precompiled executables in the zip file on the web. We checked this for the OSX and the Linux versions. Both of these precompiled versions produce this bug. When we compile dmd from source, even for version 2.061 from the web, this bug does not occur.

Stephan
(In reply to comment #3)
> I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
> can try.

Sorry to jump back and forth here. I have to correct my previous statement again: with the latest version of dmd/druntime/phobos (2.062 from git), this bug does occur! But only when you compile and run separately. When you use dmd -run, both versions, with and without -O, work fine. This is quite weird.

So:

dmd -O brent_test.d
./brent_test

should produce a different outcome than

dmd brent_test.d
./brent_test

I will try to use bisect to find out when this bug was introduced.

Stephan
When I compile and run separately, it works fine. You should also clarify whether you are using -m64 or not.
Right, I use the 64-bit model. And I tested this on OSX and on Linux, with the same outcome on both platforms. It's frustrating that you can't reproduce it. Thanks for responding quickly on this anyway. I will see what I can find out with bisect.

Stephan
BTW you might be interested in std.numeric.findRoot, which is the root-finding-by-bracketing algorithm (in contrast to "Brent's algorithm" which is minima-finding-by-bracketing). In terms of number of calls, I believe it beats all published algorithms (in some cases, by an order of magnitude). I should really publish it. I did some work on the minima problem as well, and put it into Tango, but it isn't in Phobos. The code is very old now, dating from a time where there were many compiler limitations, and it could use a review.
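For readers who have not used it, here is a minimal sketch of calling std.numeric.findRoot. It assumes the three-argument form findRoot(f, a, b), where f(a) and f(b) must have opposite signs. The function cos(x) - x, the bracket [0, 1], and the helper name cosFixedPoint are illustrative choices, not taken from the attached program.

```d
import std.math : cos;
import std.numeric : findRoot;
import std.stdio : writefln;

// Hypothetical helper: root of f(x) = cos(x) - x inside the bracket [0, 1].
// f(0) = 1 > 0 and f(1) = cos(1) - 1 < 0, so the bracket contains a sign change.
double cosFixedPoint()
{
    return findRoot((double x) => cos(x) - x, 0.0, 1.0);
}

void main()
{
    writefln("root of cos(x) - x in [0, 1]: %.6f", cosFixedPoint());
}
```

Note that this finds a root, not a minimum, so it is not a drop-in replacement for the Brent minimizer in the attachment.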
...and I can reproduce your bug.
I think there is an uninitialized variable in there. When I compile with -O, if I run the same executable multiple times, sometimes it passes, sometimes it fails.
Hi Don, glad to hear that you can reproduce the bug! I tested initializing all variables by hand, and the bug still occurs. Thanks for the suggestion to use std.numeric. Looks very useful! The Numerical Recipes Code style is worse than horrible! All those 1-letter variables...
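As background to the uninitialized-variable hypothesis: in D, a floating-point local declared without an initializer is set to its type's .init value, which is NaN, unless it is explicitly declared with "= void". The sketch below only illustrates that language rule; the helper name defaultInitIsNaN is hypothetical and the snippet is not part of the failing program.

```d
import std.math : isNaN;

// Hypothetical helper: reports whether a double declared without an
// initializer holds NaN (double.init) rather than arbitrary stack contents.
bool defaultInitIsNaN()
{
    double a;   // no explicit initializer: D assigns double.init
    return isNaN(a);
}

void main()
{
    assert(defaultInitIsNaN());

    double e = 0.0;   // explicit initialization, as with d and e in the test program
    assert(!isNaN(e));
}
```

This is consistent with the observation above that initializing every variable by hand did not make the problem go away.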
Here is a more reduced test case (still enormous):

Without -O, it returns on the first pass through the loop.
With -O, one of two things happens:
(a) it hits the assert(0) on the first pass through the loop; or
(b) it generates an alignment hardware exception.

It looks as though it is an issue with misalignment of SSE registers.
Removing the assert(0) causes an ICE.
---
import std.math : abs;

void minimize()
{
    double a,b,d=0.0,etemp,fu,fv,fw,fx;
    double p;
    double q,r,tol1,tol2,u,v,w,x,xm;
    double e=0.0;
    double ax,bx,cx,fa,fb,fc;
    double tol;

    ax = 2.8541;
    bx = 3;
    cx = 3.0458;
    fa = 0.145898;
    fb = 0;
    fc = 0.381966;
    tol = 3.0e-8;
    a = ax;
    b = cx;
    v = bx;
    w = bx;
    x = bx;
    fx = 0;
    fv = fx;
    fw = fx;
    a = 2.97871347812973974456;
    b = 3.0458;
    v = 2.9442711606;
    w = 2.9787134781;
    x = 3;
    fx = 0;
    fv = 0.00310570354087098691;
    fw = 0.00045311601333306815;
    e = -0.0557288394;
    d = -0.0212865219;

    for (int iter=0;iter<1;iter++) {
        xm=0.5*(a+b);
        tol1=tol*abs(x);
        tol2=2.0*(tol1);
        if (abs(x-xm) <= (tol2-0.5*(b-a))) {
            return;
        }
        if (abs(e) > tol1) {
            r=(x-w)*(fx-fv);
            q=(x-v)*(fx-fw);
            p=(x-v)*q-(x-w)*r;
            q=2.0*(q-r);
            if (q > 0.0) p = -p;
            q=abs(q);
            etemp=e;
            e=d;
            if (abs(p) >= abs(0.5*q*etemp) || q < p) {
                d = b-x;
            } else {
                d=p/q;
                u=x+d;
                if (u-a < tol2 || b-u < tol2)
                    d = xm - x;
            }
        } else {
            d = (e=(x >= xm ? a-x : b-x));
        }
        u = (abs(d) >= tol1) ? x+d : x+3.0e-8;
        if (u < 3.01)
            return;
        else
            assert(0); // FAILS HERE
        fu = (u-3.0)*(u-3.0);
        if (fu <= fx) {
            assert(0);
        }
    }
}

void main()
{
    minimize();
}
A reduced test case for the ICE:

import std.math : abs;

void bug9387()
{
    double x = 3;
    double r = (x-2.1)*0.1;
    double q = (x-2.1)*0.1 - r;
    double p = (x-2.1)*q - (x-2.1)*r;
    if (q > 0.0)
        p = -p;
    if (abs(p) >= q ) { }
}
---
dmd -O -m64 bug.d
Internal error: backend/cgcod.c 769
ICE, further reduced:
--------------
void bug9387a(double x) { }

void ice9387()
{
    double x = 0.3;
    double r = x*0.1;
    double q = x*0.1 + r;
    double p = x*0.1 + r*0.2;
    if ( q )
        p = -p;
    bug9387a(p);
}
And a reduction for the wrong-code case. This sometimes segfaults but usually hangs. Looks like the saved RBX register gets trampled:

double brent(double x) { return x; }

void wrong9387()
{
    for (int iter=0; iter<1; iter++) {
        double v =2.94;
        if (brent(v)<= 2.9) {
            return;
        }
        double w = 2.97;
        double r = (0.2-w) * 0.1;
        double q = (0.2-v) * 0.1 - r;
        double p = 0.7*q - (0.2-v)*0.3;
        if (q > 0.0)
            p = -p;
        q = brent(q);
        double d = p-q;
        if (2.94 + d)
            w = v -v;
        brent(w);
    }
}

void main()
{
    wrong9387();
}
Commit pushed to dmd-1.x at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/bfa5d0f0ba80c7ff6e0d67806714763584666fb2
fix Issue 9387 - Compiler switch -O changes behavior of correct code
https://github.com/D-Programming-Language/dmd/pull/1584 Thanks, Don, for the minimizations which made it easy for me to find the problem. It was not a regression, although it looked like one. The bug is nasty and I'm glad to get it fixed.
Don and Walter, thanks for reducing the code and fixing the bug, all on a very short timescale! This is going to be a very important fix for me. Using the optimization switch is critical for me. Stephan
Commits pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/06d991f039eab23561398aea4ea764ea49a6dea4
fix Issue 9387 - Compiler switch -O changes behavior of correct code

https://github.com/D-Programming-Language/dmd/commit/9f3ab3f0b4713bd12a3ada71ca783bab1edae663
Merge pull request #1584 from WalterBright/b45
fix Issue 9387 - Compiler switch -O changes behavior of correct code
> Don and Walter, thanks for reducing the code and fixing the bug, all on a very short timescale!

Thanks. Optimizer bugs get top priority, and this was one of the worst bugs of all time. I found test cases where the executable was wrong, yet still produced correct results in 90% of runs. I don't think I've ever seen a bug that was so difficult to reduce.