D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 1001 - print stack trace (in debug mode) when program die
Summary: print stack trace (in debug mode) when program die
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: druntime
Version: D2
Hardware: All All
Importance: P2 enhancement
Assignee: Alexandru Razvan Caciulescu
URL:
Keywords: bootcamp
Depends on:
Blocks: 4044
Reported: 2007-02-22 23:56 UTC by sa
Modified: 2021-01-18 14:26 UTC (History)

See Also:


Attachments
first pass patch to add stack tracing for exceptions on linux (1.83 KB, patch)
2010-02-03 01:19 UTC, Brad Roberts
sort hacky move of demangle from phobos to druntime and hook into the stacktrace code (21.56 KB, patch)
2010-05-31 14:58 UTC, Brad Roberts

Description sa 2007-02-22 23:56:56 UTC
Just as in Java, this would be a great lifesaver in development, especially since we don't have a decent debugger right now.
Comment 1 Tyler Knott 2007-02-23 00:07:10 UTC
The Flectioned runtime reflection library (http://flectioned.kuehne.cn/) can provide this, though language support would be welcome.
Comment 2 Trass3r 2010-01-17 17:58:15 UTC
Flectioned hasn't been updated in over 2 years though.
Comment 3 Sean Kelly 2010-02-02 20:12:53 UTC
Flectioned doesn't work any more I'm afraid.  A replacement would be welcome, though I may be able to sort something out quickly with the stuff in ucontext.h on *nix.  Some platforms even have a backtrace call here, though output is fixed to a specific format.
Comment 4 BCS 2010-02-02 22:40:41 UTC
Does anyone know if the "backtrace" function from execinfo.h works with DMD/Linux?

If it does, then that's the way to go. Heck, I've even got a blob of CPP code I'd give away that calls addr2line to get nice output.
Comment 5 Brad Roberts 2010-02-02 22:52:04 UTC
The c function works, so it will work with dmd.  I've been meaning to hook the thing into the runtime for ages.  It's easy to use.  The only interesting trick is that the app needs to be linked with -rdynamic (check the man pages, this is off the top of my head from when I've done it at work).  That'll likely require a minor tweak to dmd itself since it's what invokes the linker.  It could probably be added to dmd.conf as an alternative.
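As a rough illustration of the approach described in this comment, the sketch below captures the current call stack with glibc's `backtrace` and prints it via `backtrace_symbols`. The function name `print_backtrace` and the frame limit are assumptions made for the example; this is not the code that went into druntime.

```c
#include <execinfo.h>   /* backtrace, backtrace_symbols (glibc extension) */
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 64   /* arbitrary limit chosen for this sketch */

/* Writes the current call stack to `out`, one frame per line.
   Returns the number of frames printed, or -1 if symbol lookup failed. */
int print_backtrace(FILE *out)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols returns a single malloc'ed block holding all
       strings, so only the outer pointer is freed. */
    char **symbols = backtrace_symbols(frames, count);
    if (symbols == NULL)
        return -1;

    for (int i = 0; i < count; ++i)
        fprintf(out, "%s\n", symbols[i]);

    free(symbols);
    return count;
}
```

Built with something like `cc -g -rdynamic demo.c`, the lines should include function names; without `-rdynamic` the same program typically prints only the binary name and raw addresses.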
Comment 6 BCS 2010-02-02 23:44:41 UTC
(In reply to comment #5)
> The c function works, so it will work with dmd.

It will work with extern C functions, but something's tickling a memory that D functions aren't necessarily the same, or is that just the argument layout?

What the heck, I'll try it tomorrow and see if it works.
Comment 7 Brad Roberts 2010-02-03 01:19:34 UTC
Created attachment 560 [details]
first pass patch to add stack tracing for exceptions on linux

It's dumping filename (always the test binary name right now) and address, but not function name for some reason.. at least on my debian amd/64 box.  I'm not sure yet why it's failing to find the symbols right now.  I need to pull the code over into a C app to make sure it works there.  That'd at least help narrow down where the problem lies.

Current output:
$ ./obj/posix/debug/unittest.brad            
object.Exception: blah
----------------
./obj/posix/debug/unittest.brad [0x804b684]
./obj/posix/debug/unittest.brad [0x804b65c]
./obj/posix/debug/unittest.brad [0x804ab0b]
./obj/posix/debug/unittest.brad [0x804a8f0]
./obj/posix/debug/unittest.brad [0x804ab27]
./obj/posix/debug/unittest.brad [0x80491de]
./obj/posix/debug/unittest.brad [0x80491f0]
./obj/posix/debug/unittest.brad [0x804b93c]
./obj/posix/debug/unittest.brad [0x804b745]
./obj/posix/debug/unittest.brad [0x804b97e]
./obj/posix/debug/unittest.brad [0x804b745]
./obj/posix/debug/unittest.brad [0x804b603]
/lib32/libc.so.6(__libc_start_main+0xe6) [0xf7e02b46]
./obj/posix/debug/unittest.brad [0x8049101]

test code:
import core.stdc.stdio;   // for printf

// Throws unconditionally so the runtime prints a stack trace.
void foo()
{
    throw new Exception("blah");
}

int main(string[] args)
{
    foo();
    printf("Success!\n");   // never reached
    return 0;
}
Comment 8 Sean Kelly 2010-02-04 16:38:11 UTC
printstack doesn't exist on OSX, so it has to be faked there (which is pretty easy to do).  The only trick is that the stack trace integration is line-based using opApply, so on platforms where backtrace isn't available, the trace should really be output to a buffer and then parsed for opApply output.  This might not be possible with printstack though.  In any case, here's some code I've used to mimic printstack on OSX.  The backtrace_symbols call is what I'd use for opApply, since it's pretty much exactly what's needed.  If I remember correctly, backtrace is just a wrapper for some of the stuff in <ucontext.h>.

#if defined(sun) || defined(__sun) || defined(_sun_) || defined(__solaris__)
#   include <ucontext.h>
#elif defined(__APPLE__)
#   include <execinfo.h>

    int printstack( int fd )
    {
        void* callstack[128];
        int   frames = backtrace( callstack, 128 );
        backtrace_symbols_fd( callstack, frames, fd );

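        /*
        Alternative that formats each frame itself
        (needs <stdio.h> for dprintf and <stdlib.h> for free):

        char** strs = backtrace_symbols( callstack, frames );
        for( int i = 0; i < frames; ++i )
        {
            dprintf( fd, "%s\n", strs[i] );
        }
        free( strs );
        */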
        return 0;
    }
#else
    int printstack( int fd )
    {
        return 0;
    }
#endif
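The line-based, opApply-style iteration mentioned in this comment could look roughly like the following in C: each frame string from `backtrace_symbols` is handed to a callback, and a nonzero return stops the walk. The names `frame_visitor`, `each_frame` and `count_frame` are hypothetical and exist only for this sketch.

```c
#include <execinfo.h>
#include <stdlib.h>

/* Callback type: receives one formatted frame. A nonzero return stops
   the iteration early, mirroring the opApply convention. */
typedef int (*frame_visitor)(const char *line, void *ctx);

/* Visits each frame of the current call stack in order.
   Returns the first nonzero visitor result, 0 if all frames were
   visited, or -1 if the symbol strings could not be allocated. */
int each_frame(frame_visitor visit, void *ctx)
{
    void *addrs[128];
    int n = backtrace(addrs, 128);
    char **lines = backtrace_symbols(addrs, n);
    if (lines == NULL)
        return -1;

    int result = 0;
    for (int i = 0; i < n && result == 0; ++i)
        result = visit(lines[i], ctx);

    free(lines);
    return result;
}

/* Example visitor: counts the frames it is shown. */
int count_frame(const char *line, void *ctx)
{
    (void)line;
    ++*(int *)ctx;
    return 0;
}
```

A caller would pass its own visitor, for instance one that appends each line to a buffer, which is the shape a line-based trace interface needs.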
Comment 9 Sean Kelly 2010-02-04 16:40:01 UTC
Oops!  Looks like Brad beat me to it.  So it looks like backtrace is available on both OSX and Linux.  I guess that leaves us needing backtrace support on Win32.
Comment 10 Witold Baryluk 2010-02-04 16:41:22 UTC
It would be great to have flexibility similar to Flectioned's:
 - allow all exceptions to be traceable or not, using runtime or compile-time switches,
 - or always trace only those exceptions that are derived from TracedException,
 - have an interface to get the stack trace from a caught TracedException object (for normal Exceptions, when runtime/compile-time tracing is enabled for all Exceptions, it should still be possible to call such a method, but it should return an error, an empty list, a null StackTrace object, or throw an exception).

The rationale is that backtrace construction is probably slow, and EH is already slow, but EH is sometimes used extensively for flow control (I know it is the wrong approach), and a backtrace will most probably triple this cost. Exception handling is also used internally for things like destruction of scope variables.
Comment 11 Witold Baryluk 2010-02-04 16:48:33 UTC
AFAIK on FreeBSD it may be necessary to link with -lexecinfo.

and according to my manual:
       These functions are GNU extensions.
and:
       backtrace(), backtrace_symbols(), and backtrace_symbols_fd() are provided in glibc since version 2.1.

So one should consult interfaces and implementation with LDC team and Tango team.
Comment 12 Sean Kelly 2010-02-04 16:50:35 UTC
Stack tracing is integrated with Throwable now, so if it's enabled it will happen for all exceptions that are created with "new" (ie. it won't happen on OutOfMemory).  It can either be shipped as a standalone library so the user has to link it, or it can be enabled/disabled via a code statement, something like:

Runtime.traceHandler = defaultTraceHandler; // turn on
Runtime.traceHandler = null; // turn off

The runtime property is already there, so it would just be a matter of exposing "defaultTraceHandler".
Comment 13 Witold Baryluk 2010-02-04 17:19:01 UTC
(In reply to comment #12)
> Stack tracing is integrated with Throwable now, so if it's enabled it will
> happen for all exceptions that are created with "new" (ie. it won't happen on
> OutOfMemory).  It can either be shipped as a standalone library so the user has
> to link it, or it can be enabled/disabled via a code statement, something like:
> 
> Runtime.traceHandler = defaultTraceHandler; // turn on
> Runtime.traceHandler = null; // turn off
> 
> The runtime property is already there, so it would just be a matter of exposing
> "defaultTraceHandler".

IMHO it is not sufficient. It is a global switch that will change the behaviour of all exceptions (beyond a few system errors).

What is problematic is that one can have an implementation of something (e.g. iterators) which uses exceptions heavily (throwing a million new ones per second). If this part of the code is already tested, we don't want to trace the stack for these control exceptions, because it is pointless and can, for example, slow down the code by an order of magnitude (just a guess).

Possible solutions:
  - do not throw a new object: save the thrown object somewhere, and when a similar control exception has to be thrown again, rethrow the saved object (this way the exception's constructor is not called again).
  - have a way to specify which exceptions are traced, which can be done in multiple ways:
    - trace all, minus blacklisted classes (defined at runtime or by deriving from some class)
    - trace none, plus whitelisted classes (as above).

It would also be useful to have a SIGSEGV signal handler which shows the trace of the thread/process that did something wrong. INT3 would also be interesting to have.


In Flectioned (which is nice, but really hackish: it has an ELF parser, lots of assembler and a whitelist of many functions), this is solved (in the Phobos part) by:
 - tracing only of exceptions derived from flectioned.TracedException
 - and optionally tracing all exception by calling first
                 flectioned.TracedException.traceAllExceptions(true);

Well maybe You have better ideas :)
Comment 14 Sean Kelly 2010-02-04 20:13:33 UTC
Exception handling is for error conditions, not flow control.  If someone is actually constructing exceptions for some other purpose then they're using the wrong tool :-)
Comment 15 Witold Baryluk 2010-02-05 05:28:30 UTC
(In reply to comment #14)
> Exception handling is for error conditions, not flow control.  If someone is
> actually constructing exceptions for some other purpose then they're using the
> wrong tool :-)

Well, yes, you are right. I just checked my old code, which I thought was using EH heavily for flow control. But I wasn't so stupid :) EH is only used for escaping on rare occasions (not for error conditions but for flow control, yes, but here I'm comparing throw more to break, as in for/while, than to continue :) )

Sorry for the problem. 


Anyway, I think the documentation of this functionality should state that the stack trace is only intended to help with debugging and crash reporting, and that no code should directly depend on the returned backtrace being nonempty or correct (because one can set traceHandler to null, or it may be disabled in -release mode, for example).
Comment 16 Sean Kelly 2010-05-05 15:03:24 UTC
This can now be enabled by:

    import core.runtime;
    ...
    Runtime.traceHandler = &defaultTraceHandler;

I can't have it automatically set in debug mode because only a release version of the lib is currently shipped.
Comment 17 Brad Roberts 2010-05-05 19:00:08 UTC
Personally, I'd suggest it just always be enabled.  Maybe leave the hook but make it something to turn off rather than on.
Comment 18 Sean Kelly 2010-05-06 10:15:57 UTC
Fair enough.  To disable the trace handler you'll do:

Runtime.traceHandler = null;

I'm looking into adding trace functionality for Windows as well (probably using StackWalk64), but that's a bit trickier.  I need to set up a Windows dev environment before I can even start on it.
Comment 19 Brad Roberts 2010-05-25 00:15:31 UTC
Sean, any objection to me submitting this minor diff:

Index: src/object_.d
===================================================================
--- src/object_.d       (revision 296)
+++ src/object_.d       (working copy)
@@ -1189,6 +1189,13 @@
     traceHandler = h;
 }
 
+/**
+ * Return the current trace handler
+ */
+extern (C) TraceHandler rt_getTraceHandler()
+{
+    return traceHandler;
+}
 
 /**
  * This function will be called when an exception is constructed.  The
Index: src/core/runtime.d
===================================================================
--- src/core/runtime.d  (revision 296)
+++ src/core/runtime.d  (working copy)
@@ -23,6 +23,7 @@
 
     extern (C) void rt_setCollectHandler( CollectHandler h );
     extern (C) void rt_setTraceHandler( TraceHandler h );
+    extern (C) TraceHandler rt_getTraceHandler();
 
     alias void delegate( Throwable ) ExceptionHandler;
     extern (C) bool rt_init( ExceptionHandler dg = null );
@@ -172,6 +173,13 @@
         rt_setTraceHandler( h );
     }
 
+    /**
+     * Return the current trace handler
+     */
+    static TraceHandler traceHandler()
+    {
+        return rt_getTraceHandler();
+    }
 
     /**
      * Overrides the default collect hander with a user-supplied version.  This


This would enable code like this:

    auto oldTH = Runtime.traceHandler;
    Runtime.traceHandler = null;
    scope(exit) Runtime.traceHandler = oldTH;

I ran across this 'need' while working on the dmd test suite that is checking some object throwing results, specifically two asserts like this:

    Object e = new Exception("hello");
    assert(e.toString() == "object.Exception: hello");
    assert(format(e) == "object.Exception: hello");
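The save/disable/restore pattern that the getter enables can be sketched in C as a swappable function pointer. This is only an analogue of the idea behind `rt_setTraceHandler` / `rt_getTraceHandler`; every name below (`trace_handler`, `set_trace_handler`, `without_tracing`, and so on) is made up for the example and none of it is druntime code.

```c
#include <stddef.h>

/* Hypothetical hook type: produces a trace string for a new error. */
typedef const char *(*trace_handler)(void);

static trace_handler current_handler = NULL;

/* Setter and getter, playing the roles of the two rt_* functions. */
void set_trace_handler(trace_handler h) { current_handler = h; }
trace_handler get_trace_handler(void)   { return current_handler; }

/* Stand-in default handler. */
const char *default_trace(void) { return "<stack trace>"; }

/* Returns the trace text for an error, or "" when no handler is set. */
const char *trace_for_error(void)
{
    return current_handler ? current_handler() : "";
}

/* Runs `body` with tracing disabled, then restores the previous handler.
   Without a getter there would be nothing to restore to. */
void without_tracing(void (*body)(void))
{
    trace_handler saved = get_trace_handler();
    set_trace_handler(NULL);
    body();
    set_trace_handler(saved);
}

/* Demo body: records whether tracing was off while it ran. */
int ran_untraced = 0;
void demo_body(void) { ran_untraced = (trace_for_error()[0] == '\0'); }
```

This mirrors the test-suite use case above: string comparisons on an error's text stay stable while the handler is off, and the caller's previous setting survives.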
Comment 20 Sean Kelly 2010-05-25 15:18:01 UTC
Not at all.  We should really make all of the Runtime properties get/settable.
Comment 21 Brad Roberts 2010-05-25 15:49:27 UTC
Submitted as svn r297.
Comment 22 bearophile_hugs 2010-05-26 04:25:20 UTC
This is an example of Python program that gives a stack trace (here I have used a lambda also to show that stack trace printing code sometimes has problems with anonymous functions):

reverser = lambda s: s[-1] + reverser(s[:-1]) if s else ""
print(reverser("this is a test" * 200))

After battling with huge stack traces in Python, there's another feature that I'd like to have in D (I am not sure whether by default or not): stack trace compression. If a recursive function keeps calling itself, or two functions keep calling each other (other possibilities exist, but those two cover most cases), the stack trace can become too long to print and read.

So just by comparing the latest printed stack frame with the penultimate one, it can compress the trace, reporting only how many times the last frame or the last two frames are repeated. The uncompressed stack trace, which shows all the line numbers too, should remain obtainable (on request if the compressed one is the default; otherwise the compressed one is on request).
Comment 23 Don 2010-05-26 05:41:33 UTC
(In reply to comment #22)

> If a recursive function keeps calling itself, or two functions
> keep calling each other (other possibilities exist, but those two cover most
> cases), the stack trace can become too much long to print and read.
> 
> So just looking at the latest stack frame printed and penultimate stack frame
> printed it can compress it, reporting only how many time the last one or the
> last two ones are repeated (the uncompressed stack trace can be obtained too
> (on request if the compressed one is on default, otherwise it's the compressed
> one that's on request), that shows all the line numbers too).

That's what's done with the template instantiation backtraces in the compiler, and I think it works very well. The basic idea is to always print out the first few frames, and only start looking for recursion beginning at frame 3 or 4.
I intend to add something similar to the interpreter, so that we have a CTFE stack trace. Still needs work though.
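One possible reading of the compression idea from comments 22 and 23 is sketched below: the first few frames are always printed, and after that runs of a single repeated frame (direct recursion) or of a repeated pair (mutual recursion) are collapsed into one copy plus a count. The function names, the `head` parameter and the marker text are assumptions for illustration, not an existing druntime or compiler feature.

```c
#include <stdio.h>
#include <string.h>

/* Number of times the `period` frames starting at `start` occur
   back to back (at least 1). */
static size_t run_length(const char **f, size_t n, size_t start, size_t period)
{
    size_t reps = 1;
    while (start + (reps + 1) * period <= n) {
        size_t k;
        for (k = 0; k < period; ++k)
            if (strcmp(f[start + k], f[start + reps * period + k]) != 0)
                break;
        if (k < period)
            break;
        ++reps;
    }
    return reps;
}

/* Prints a trace, collapsing direct recursion (period 1) and mutual
   recursion between two functions (period 2). The first `head` frames
   are always printed verbatim. Returns the number of lines written. */
size_t print_compressed(const char **f, size_t n, size_t head, FILE *out)
{
    size_t lines = 0, i = 0;
    while (i < n) {
        size_t period = 1, reps = 1;
        if (i >= head) {
            reps = run_length(f, n, i, 1);
            if (reps == 1) {
                size_t r2 = run_length(f, n, i, 2);
                if (r2 > 1) { period = 2; reps = r2; }
            }
        }
        for (size_t k = 0; k < period; ++k, ++lines)
            fprintf(out, "%s\n", f[i + k]);
        if (reps > 1) {
            fprintf(out, "  ... the %zu frame(s) above occur %zu times in a row\n",
                    period, reps);
            ++lines;
        }
        i += period * reps;
    }
    return lines;
}
```

Longer cycles are deliberately ignored here, in line with the remark that the one- and two-function cases cover most real traces.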
Comment 24 Brad Roberts 2010-05-29 00:42:56 UTC
There's two big things left on my list for stacktraces (at least on linux) that need to be done:

1) the default dmd.conf needs to have -L--export-dynamic in it
2) the strings from backtrace_symbols need to be demangled

Any collapsing of recursion is a distant second in my opinion.

Obviously, for those that use windows, traces on windows would probably go above #1 in priority -- but I'm not in that set. :)
Comment 25 Brad Roberts 2010-05-31 14:58:39 UTC
Created attachment 650 [details]
sort hacky move of demangle from phobos to druntime and hook into the stacktrace code

Arguably the demangler belongs in the runtime.  The current code is, well, less than ideal, at least from an interface standpoint.  But this gets the pieces moved and put together to demonstrate what it could be.  In the process I also noticed that the current unittest for druntime fails, which is handy for testing.  The results:

<snip a bunch of foo unittest lines>
_arraySliceSliceMulass_s unittest
core.exception.AssertError@gc.gcx(264): Assertion failure
----------------
./unittest(class core.exception.AssertError core.exception.AssertError.__ctor(immutable(char)[], uint) . +0x25) [0x8077a65]
./unittest(onAssertError+0x28) [0x8077c18]
./unittest(_d_assertm+0x16) [0x8095c02]
./unittest(void gc.gcx.__assert(int) . +0x12) [0x807f46e]
./unittest(void gc.gcx.GC.enable() . +0x54) [0x807c930]
./unittest(gc_enable+0x16) [0x807be7e]
./unittest(void core.memory.GC.enable() . +0x8) [0x8077ea0]
./unittest(_Dmain+0x2b) [0x8074b17]
./unittest(extern (C) int rt.dmain2.main(int, char**) . void runMain() . +0x14) [0x8095f44]
./unittest(extern (C) int rt.dmain2.main(int, char**) . void tryExec(void delegate()) . +0x1d) [0x8095ea9]
./unittest(extern (C) int rt.dmain2.main(int, char**) . void runAll() . +0x2d) [0x8095f81]
./unittest(extern (C) int rt.dmain2.main(int, char**) . void tryExec(void delegate()) . +0x1d) [0x8095ea9]
./unittest(main+0x88) [0x8095e58]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x12bbd6]
./unittest() [0x80741a1]

vs what it emitted before the changes:

_arraySliceSliceMulass_s unittest
core.exception.AssertError@gc.gcx(264): Assertion failure
----------------
./unittest(_D4core9exception11AssertError6__ctorMFAyakZC4core9exception11AssertError+0x25) [0x8077a65]
./unittest(onAssertError+0x28) [0x8077c18]
./unittest(_d_assertm+0x16) [0x8095be2]
./unittest(_D2gc3gcx8__assertFiZv+0x12) [0x807f426]
./unittest(_D2gc3gcx2GC6enableMFZv+0x54) [0x807c8e8]
./unittest(gc_enable+0x16) [0x807be36]
./unittest(_D4core6memory2GC6enableFZv+0x8) [0x8077ea0]
./unittest(_Dmain+0x2b) [0x8074b17]
./unittest(_D2rt6dmain24mainUiPPaZi7runMainMFZv+0x14) [0x8095f24]
./unittest(_D2rt6dmain24mainUiPPaZi7tryExecMFMDFZvZv+0x1d) [0x8095e89]
./unittest(_D2rt6dmain24mainUiPPaZi6runAllMFZv+0x2d) [0x8095f61]
./unittest(_D2rt6dmain24mainUiPPaZi7tryExecMFMDFZvZv+0x1d) [0x8095e89]
./unittest(main+0x88) [0x8095e38]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xd23bd6]
./unittest() [0x80741a1]
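Before a name can be demangled it has to be cut out of the `backtrace_symbols` line. The sketch below shows one way to do that for the glibc-style lines quoted above; `extract_symbol` is a hypothetical helper, and the demangling step itself is not shown.

```c
#include <stddef.h>
#include <string.h>

/* Copies the symbol name out of a glibc-style backtrace_symbols line
   such as "./unittest(_D2gc3gcx8__assertFiZv+0x12) [0x807f426]".
   Returns 1 on success, 0 if the line carries no symbol
   (e.g. "./unittest() [0x80741a1]") or the name does not fit in `buf`. */
int extract_symbol(const char *line, char *buf, size_t buflen)
{
    const char *open = strchr(line, '(');
    if (open == NULL)
        return 0;
    const char *start = open + 1;

    /* The name ends at "+offset" or, failing that, at ")". */
    size_t len = strcspn(start, "+)");
    if (len == 0 || start[len] == '\0' || len >= buflen)
        return 0;

    memcpy(buf, start, len);
    buf[len] = '\0';
    return 1;
}
```

The extracted string would then be passed to a demangler, and the result spliced back between the parentheses to produce output like the first listing.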
Comment 26 Jonathan M Davis 2010-08-22 13:50:51 UTC
I don't know what the current state of stack traces is overall or where it is with regards to work being done on them, but I would point out that all you get on Linux right now is a list of addresses. e.g.

object.Exception@gregorian.d(241): Invalid year.
----------------
./test() [0x805936a]
./test() [0x804b168]
./test() [0x804b269]
./test() [0x805b9fb]
./test() [0x8068d3c]
./test() [0x8060bed]
./test() [0x8068c57]
./test() [0x8061a88]
./test() [0x80619b0]
./test() [0x8061956]
/opt/lib32/lib/libc.so.6(__libc_start_main+0xe6) [0xf75cdc76]
./test() [0x8049341]


So, as it stands (using dmd 2.048), stack traces on Linux are pretty useless. I would assume that Sean is aware of this, but I thought that I should post a reminder of the current state of stack traces on Linux.
Comment 27 nfxjfg 2010-08-22 23:19:26 UTC
Note that Tango has a full implementation of backtraces on both Windows and Linux, with demangled function names, files and line numbers. They are shown on uncaught exceptions, segfaults, or when explicitly requested. The only requirements are that the D program is compiled with debug info switched on, and that backtracing is explicitly enabled by importing tango.core.tools.TraceException.

Of course Phobos can use the Tango BSD implementation since Phobos switched to Boost license, but it shows that it's very possible. I wish the Phobos team good luck duplicating the functionality.
Comment 28 faithful 2010-08-23 04:32:19 UTC
What annoys the heck out of me are the Tango fanboy faggots advertising their viral library in every possible place.

There are good reasons why D 2.0 was born. The Boost license is good for larger companies like Facebook because they simply refuse to give any attribution to anyone in internal NDA projects. Doubting the excellence of Suckerberg is bad for your Linkedin reputation when applying for a job. How would D succeed without good attitude toward companies?

Another reason are OOP crazy design of Tango and politically inflammable secret society style they use. The Phobos developer situation is much better. If the rulers like your Boost inspired code, you get a developer status.

This is all OT, sorry * 1000.
Comment 29 Johannes Pfau 2010-08-23 06:33:45 UTC
@Jonathan M Davis
You need to add "-L--export-dynamic" to the compiler flags, maybe adding it to dmd.conf is the best idea. This gives stack traces with mangled names. Sean said somewhere in the newsgroup that he's working on demangling stack traces, but druntime can't use phobos code, so he has to rewrite the demangling code.
Comment 30 Sean Kelly 2010-09-09 13:32:47 UTC
Okay, demangling added for Linux and OSX.  I'll try to make sure that -L--export-dynamic is added to dmd.conf on Linux for the next release.
Comment 31 Trass3r 2010-09-16 05:07:27 UTC
Note that ddmd has custom working stack trace code for windoze. Maybe this could help in some way.

http://dsource.org/projects/ddmd
Comment 32 vano 2010-10-02 10:18:36 UTC
(In reply to comment #24)
> There's two big things left on my list for stacktraces (at least on linux) that
> need to be done:
> 
> 1) the default dmd.conf needs to have -L--export-dynamic in it
> 2) the strings from backtrace_symbols need to be demangled
> 
> Any collapsing of recursion is a distant second in my opinion.
> 
> Obviously, for those that use windows, traces on windows would probably go
> above #1 in priority -- but I'm not in that set. :)

So true! The lack of backtrace (stacktrace) on Windows is so frustrating. I really hope that, with this bug having quite a few votes, Sean will implement it in the near future.
Comment 33 Benjamin Thaut 2010-10-07 06:39:34 UTC
For a win32 stacktrace (XP+) you might check my project on this:

http://3d.benjamin-thaut.de/?p=15

Kind Regards
Benjamin Thaut
Comment 34 Witold Baryluk 2010-11-24 08:22:14 UTC
(In reply to comment #30)
> Okay, demangling added for Linux and OSX.  I'll try to make sure that
> -L--export-dynamic is added to dmd.conf on Linux for the next release.

Wouldn't it be better if it were added by the compiler itself when the command line has any of -g, -gc, -debug, -cov, -profile?


BTW. after adding -L--export-dynamic under Linux stacktrace display works almost perfectly.

Example:

$D/dmt> ./dmt  s.dt
std.stream.OpenException: Cannot open or create file 's.dt'
----------------
./dmt(std std.stream.StreamException.__ctor(immutable(char)[])) [0x809ccf9]
./dmt(std std.stream.StreamFileException.__ctor(immutable(char)[])) [0x80a0507]
./dmt(std std.stream.OpenException.__ctor(immutable(char)[])) [0x80a0527]
./dmt(_D3std6stream4File4openMFAyaE3std6stream8FileModeZv+0xe5) [0x80a06d9]
./dmt(_D3std6stream4File6__ctorMFAyaE3std6stream8FileModeZC3std6stream4File+0x25) [0x80a05ed]
./dmt(_D3std6stream12BufferedFile6__ctorMFAyaE3std6stream8FileModekZC3std6stream12BufferedFile+0x28) [0x80a0c9c]
./dmt(bool dmt.Convert(immutable(char)[], immutable(char)[])) [0x808a7e9]
./dmt(_Dmain+0x1ef) [0x808b323]
./dmt(extern (C) int rt.dmain2.main(int, char**)) [0x8091279]
./dmt(extern (C) int rt.dmain2.main(int, char**)) [0x80911cf]
./dmt(extern (C) int rt.dmain2.main(int, char**)) [0x80912bf]
./dmt(extern (C) int rt.dmain2.main(int, char**)) [0x80911cf]
./dmt(main+0xa7) [0x8091177]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xb7630c76]
./dmt() [0x808a431]
$

I had everything compiled with -gc -L--export-dynamic, including druntime and phobos. For some reason it still does not demangle some symbols. It happens both when compiling phobos/druntime with -release and without it.
Comment 35 Sean Kelly 2011-01-20 15:40:41 UTC
(In reply to comment #33)
> For a win32 stacktrace (XP+) you might check my project on this:
> 
> http://3d.benjamin-thaut.de/?p=15

Thanks!  I've noticed that the code doesn't have a Boost-compatible copyright.  Would you be averse to changing this so some derivative could be used in druntime?  I also found this in the MSDN docs:

"All DbgHelp functions, such as this one, are single threaded. Therefore, calls from more than one thread to this function will likely result in unexpected behavior or memory corruption. To avoid this, you must synchronize all concurrent calls from more than one thread to this function."

http://msdn.microsoft.com/en-us/library/ms681327(v=vs.85).aspx

I guess that means that the stack trace generation on Windows will have to be wrapped in a synchronized block.
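The synchronization requirement can be pictured as a single process-wide lock around every call into the non-thread-safe library. The sketch uses a POSIX mutex as a portable stand-in (real Windows code would more likely use a critical section or D's `synchronized`), and `capture_fn`, `capture_serialized` and `fake_capture` are hypothetical names.

```c
#include <pthread.h>

/* Signature of a capture routine that is NOT thread-safe on its own. */
typedef int (*capture_fn)(void **frames, int max_frames);

static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER;

/* Calls `capture` while holding a process-wide lock, so at most one
   thread is inside the non-thread-safe library at any time. */
int capture_serialized(capture_fn capture, void **frames, int max_frames)
{
    pthread_mutex_lock(&trace_lock);
    int n = capture(frames, max_frames);
    pthread_mutex_unlock(&trace_lock);
    return n;
}

/* Stand-in for the real capture call (e.g. a DbgHelp-based walker);
   it just fills up to three dummy frames. */
int fake_capture(void **frames, int max_frames)
{
    int n = max_frames < 3 ? max_frames : 3;
    for (int i = 0; i < n; ++i)
        frames[i] = (void *)0;
    return n;
}
```

The lock only helps if every caller goes through the wrapper; any direct call into the library from another thread would bypass it.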
Comment 36 Trass3r 2011-03-15 05:10:07 UTC
What's the status of Windows stack trace integration?

Note that Benjamin's code needs to be updated by adding an opApply version with index to class Callstack.
Comment 37 Benjamin Thaut 2011-03-23 00:56:13 UTC
(In reply to comment #36)
> What's the status of Windows stack trace integration?
> 
> Note that Benjamin's code needs to be updated by adding an opApply version with
> index to class Callstack.

I just updated the code on my blog, the opApply issue is fixed and it is also under a new licence which hopefully removes the problem of adding it to phobos.
Comment 38 Sean Kelly 2011-03-24 12:10:44 UTC
The changes are in.  I worked from the original implementation, since I'd already taken care of the opApply issue and such.
Comment 39 Matt Peterson 2011-05-20 20:53:12 UTC
I'm not sure what the problem is, but this isn't working for me on Linux 64bit. I get the "----------------" line that signifies the start of the stack trace but nothing shows up. I've been tinkering around with the druntime library a little bit but I can't seem to get any useful information, I'm not sure where to check anyway.

What can I do to help resolve this? Would it be useful to test the latest version from github?
Comment 40 Sean Kelly 2011-05-23 15:27:35 UTC
The code for this is in core/runtime.  If you're using a custom dmd.conf, the problem may be that you're missing an -L--export-dynamic.
Comment 41 Matt Peterson 2011-05-23 15:41:50 UTC
(In reply to comment #40)
> The code for this is in core/runtime.  If you're using a custom dmd.conf, the
> problem may be that you're missing an -L--export-dynamic.

I'm not using a custom dmd.conf and that switch is there.
Comment 42 Jonathan M Davis 2011-05-23 17:44:47 UTC
I haven't been seeing the stack traces with 64-bit either. I don't even get the addresses (which is what you get if you don't have -L--export-dynamic). I'm seeing the same behavior as Matt. It works fine on my 32-bit machine, but on my 64-bit machine where I'm using pure 64-bit D (dmd, druntime, and Phobos are all 64-bit), the stack traces are always empty.
Comment 43 Matt Peterson 2011-05-23 21:05:38 UTC
I just tested building with -m32 and that works, although the symbols are still mangled. With -m64 I don't get any lines in the stack trace at all. In fact, I only see one line of ---------. I'm not getting the one there should be at the end.
Comment 44 Matt Peterson 2011-05-24 08:00:56 UTC
Nevermind, I am seeing the second line of dashes, I was just getting confused at the difference between Throwable.toString and the way the runtime prints it out when it's unhandled.
Comment 45 bearophile_hugs 2011-07-03 12:55:16 UTC
To be closed?
Comment 46 Brad Roberts 2011-07-03 12:58:44 UTC
Maybe.  I think there's a couple outstanding issues that could be separated:

1) common look and feel of the stack trace output for all platforms.  Right now every platform produces different traces.

2) there's a report that the top frame is missing from one platform or another (from you I think)

3) 64-bit traces don't seem to work at all.

I need to re-check, but I think freebsd/* might also be broken.
Comment 47 Trass3r 2011-08-09 18:51:40 UTC
Any news regarding x64 traces?
They still don't work at all.
Comment 48 Sean Kelly 2011-08-18 11:45:02 UTC
The prototype for backtrace() had a size_t when it should have been an int.  Try the latest from git and see if it works.
Comment 49 Trass3r 2011-08-18 13:08:29 UTC
No, stack trace is still empty.

btw, while trying to get an exception I noticed that neither of the following is caught:

void main()
{
	int b = 0;
	int a = 5/b;
}
--- killed by signal 8
void main()
{
	int* a;
	int b = *a;
}
--- killed by signal 11

Is this ok?
Comment 50 Jonathan M Davis 2011-08-18 13:20:46 UTC
Those don't generate exceptions. They generate signals, which are completely different. And it's the OS which generates them, I believe.
Comment 51 Trass3r 2011-08-18 13:31:02 UTC
I do know about signals, I just wasn't sure if druntime is supposed to catch (some of) them.
Comment 52 bearophile_hugs 2011-08-18 16:32:21 UTC
(In reply to comment #50)
> Those don't generate exceptions. They generate signals, which are completely
> different. And it's the OS which generates them, I believe.

If the compiler adds tests (in nonrelease mode only, or maybe only in nonrelease -debug mode only) before those operations, then they too are able to produce a stack trace. Stack traces are handy. So is this a new feature worth asking for?
Comment 53 Jonathan M Davis 2011-08-18 16:56:26 UTC
Walter refuses to add null pointer checks - even in non-release mode. There is _zero_ chance you're going to get checks for arithmetic. That would have a seriously negative impact on performance. And honestly, while I think that null pointer checks in non-release mode might be nice, I definitely don't want basic arithmetic to be being checked. And besides, on Linux at least, you get a core dump in this kind of situation, and you can use gdb to get a stack trace and see what happened - and in far greater detail than an exception would give you. So, it's not like you can't get a stack trace. It's just that you need a debugger to get it.
Comment 54 Matt Peterson 2011-08-18 17:01:16 UTC
Yeah, Linux sends a signal when you have a memory violation like accessing a null pointer, and I've had success with creating a signal handler for SIGSEGV and throwing an exception manually. It's actually really easy, and it's much faster than having null pointer checks.
Comment 55 Don 2011-08-18 17:43:11 UTC
(In reply to comment #54)
> Yeah, Linux sends a signal when you have a memory violation like accessing a
> null pointer, and I've had success with creating a signal handler for SIGSEGV
> and throwing an exception manually. It's actually really easy, and it's much
> faster than having null pointer checks.

And Windows creates an exception, which does appear in the stack trace.
The behaviour is very strongly OS-dependent.
Comment 56 Sean Kelly 2011-08-19 14:12:24 UTC
If you want exceptions for these events on Posix, create a signal handler and throw one.  On some OSes it will work and on others it won't.  Assuming you're targeting a platform where it works though, doing so might be a net win.
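For reference, a plain C version of the signal-handler approach from comments 54 to 56 might look like the sketch below. Note the differences from what is described above: it only prints the raw frames and exits, rather than throwing an exception, and the names `crash_handler` and `install_crash_handler` are assumptions. Whether a handler can safely do more than this is platform dependent, as the comments say.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler: dump raw frames to stderr, then exit without unwinding.
   backtrace_symbols_fd writes straight to the descriptor and, per its
   man page, does not call malloc, unlike backtrace_symbols. */
static void crash_handler(int sig)
{
    static const char msg[] = "fatal signal, stack trace follows:\n";
    void *frames[64];

    if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* nothing useful can be done about a failed write here */
    }
    int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(128 + sig);
}

/* Installs the handler for SIGSEGV and SIGFPE, the two signals seen
   in the examples above. Returns 0 on success, -1 on failure. */
int install_crash_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGFPE, &sa, NULL);
}
```

Calling `install_crash_handler()` early in `main` would be enough to get a trace for the null-dereference and divide-by-zero programs from comment 49, assuming the platform delivers those faults as signals.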
Comment 57 Trass3r 2011-12-01 17:25:12 UTC
Seems like stack traces are finally supported on x64. Any plans to add line numbers to them?
Comment 58 Vladimir Panteleev 2012-01-06 09:33:29 UTC
(In reply to comment #57)
> Seems like stack traces are finally supported on x64. Any plans to add line
> numbers to them?

Looks like that would be difficult without a dependency on binutils.
Even then, binutils is GPL. I'm not sure how GPL works with regards to weak (optional) dependencies?
Comment 59 bearophile_hugs 2012-07-24 06:00:03 UTC
How much work is left to do before closing down this issue as fixed?
Comment 60 Vladimir Panteleev 2012-07-24 06:43:58 UTC
I believe two matters remain:

1) Getting stack traces on unhandled signals (or at least SIGSEGV).

There is ongoing discussion in this pull request:
https://github.com/D-Programming-Language/druntime/pull/187

2) Getting line numbers on POSIX.

I don't think this goal is easily directly reachable. The corresponding library (binutils) that parses DWARF debug information and extracts line numbers is licensed under the GNU GPL, meaning that loading it dynamically and passing around data structures is out of the question (see http://www.gnu.org/licenses/gpl-faq.html#NFUseGPLPlugins). However, the D distribution can include a small GPL-licensed program that can take an exception stack trace as stdin, and convert it to include line number information, similar to the addr2line utility.
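
As a rough sketch of such a stdin filter: the program below does not parse DWARF itself but shells out to the existing `addr2line` tool (with its `-e` option), which is a different technique from linking against binutils. It assumes trace lines in the glibc `backtrace_symbols` style, ending in `[0xADDRESS]`, and that the addresses match the on-disk executable (which may not hold for position-independent executables). `extractAddress` is a hypothetical helper name.

```d
import std.process : execute;
import std.regex : matchFirst, regex;
import std.stdio : stderr, stdin, writeln;
import std.string : strip;

// Returns the bracketed address of a trace line such as
// "./app(_Dmain+0x13) [0x4012ab]", or null if the line has none.
string extractAddress(string line)
{
    auto m = matchFirst(line, regex(`\[(0x[0-9A-Fa-f]+)\]`));
    return m.empty ? null : m[1];
}

int main(string[] args)
{
    if (args.length != 2)
    {
        stderr.writeln("usage: annotate <executable> < trace.txt");
        return 1;
    }
    foreach (line; stdin.byLineCopy)
    {
        auto addr = extractAddress(line);
        if (addr is null)
        {
            writeln(line); // pass through separators and messages unchanged
            continue;
        }
        // addr2line prints "file:line" for the given address.
        auto r = execute(["addr2line", "-e", args[1], addr]);
        if (r.status == 0)
            writeln(line, " at ", r.output.strip);
        else
            writeln(line);
    }
    return 0;
}
```

The executable needs to be built with debug information (for example `dmd -g`) for `addr2line` to report anything other than `??:0`.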
Comment 61 Benjamin Thaut 2012-12-23 06:48:44 UTC
For windows x86:
https://github.com/D-Programming-Language/druntime/pull/368
Comment 62 tyler-dev 2013-05-23 01:34:51 UTC
(In reply to comment #60)
> 2) Getting line numbers on POSIX.
> 
> I don't think this goal is easily directly reachable. The corresponding library
> (binutils) that parses DWARF debug information and extracts line numbers is
> licensed under the GNU GPL, meaning that loading it dynamically and passing
> around data structures is out of the question (see
> http://www.gnu.org/licenses/gpl-faq.html#NFUseGPLPlugins). However, the D
> distribution can include a small GPL-licensed program that can take an
> exception stack trace as stdin, and convert it to include line number
> information, similar to the addr2line utility.

Would something like this work?
http://sourceforge.net/apps/trac/elftoolchain/

It's being developed for FreeBSD, so it's likely to work there and on Linux. I don't know of the progress, but it seems to be slated for the FreeBSD 10 release:
https://wiki.freebsd.org/GPLinBase
Comment 63 Martin Nowak 2014-01-05 07:28:16 UTC
So we have stack traces on all platforms by now, can I close the bug?

Regarding DWARF processing, it shouldn't be too hard.
But we could also dynamically load libdw.so from elfutils if it's installed similarly to how we load dbghelp.dll on Windows.

http://forum.dlang.org/post/eahjyebbtjynlivijqtn@forum.dlang.org
https://github.com/bombela/backward-cpp
Comment 64 Benjamin Thaut 2014-01-06 10:46:55 UTC
On 64-bit Windows, an executable still crashes silently when an access violation occurs. At least with 2.064.2.

Code used to test:
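import core.stdc.stdio;

void test()
{
  int* i;   // default-initialized to null
  *i = 5;   // null write: triggers the access violation
}

void main(string[] args)
{
  test();
  printf("done\n");   // not reached if the program crashes
}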
Comment 65 Vladimir Panteleev 2014-01-06 12:21:36 UTC
(In reply to comment #64)
> On 64-bit Windows, an executable still crashes silently when an access violation
> occurs. At least with 2.064.2.

Fixed in git HEAD, see Issue 11865.
Comment 66 Vladimir Panteleev 2014-01-06 12:22:50 UTC
(In reply to comment #65)
> (In reply to comment #64)
> > On 64-bit Windows, an executable still crashes silently when an access violation
> > occurs. At least with 2.064.2.
> 
> Fixed in git HEAD, see Issue 11865.

My bad, never mind. I get a WER dialog, though, so it's not silent?
Comment 67 Vladimir Panteleev 2014-01-06 13:55:44 UTC
(In reply to comment #66)
> My bad, never mind. I get a WER dialog, though, so it's not silent?

Putting a null pointer dereference into try/catch does not catch it, so I guess the root cause is that D exceptions are not integrated with Windows exceptions (through SEH or whatever mechanism is used on Win64).
Comment 68 Benjamin Thaut 2014-01-06 13:59:30 UTC
It's not necessary to implement Windows SEH on 64-bit to support stack traces on access violations (or other hardware exceptions). For now we could simply use SetUnhandledExceptionFilter to handle such errors and then terminate the program.
Comment 69 Martin Nowak 2014-01-07 02:20:21 UTC
(In reply to comment #67) 
> Putting a null pointer dereference into try/catch does not catch it, so I guess
> the root cause is that D exceptions are not integrated with Windows exceptions
> (through SEH or whatever mechanism is used on Win64).

Which is rather good; turning asynchronous SEH into normal Exceptions was a bad idea.
Comment 70 Vladimir Panteleev 2014-01-07 02:24:10 UTC
(In reply to comment #69)
> Which is rather good; turning asynchronous SEH into normal Exceptions was a bad
> idea.

Could you elaborate why? I thought SEH was designed so that it would integrate neatly with C++ exceptions.
Comment 71 Martin Nowak 2014-01-07 02:55:12 UTC
http://en.wikipedia.org/wiki/Exception_handling#Exception_synchronicity
It's simply that the compiler cannot handle cleanup when every instruction could throw. In case an asynchronous exception happens in the middle of some statement your program is immediately in an invalid state. Continuing could deadlock or corrupt data, much worse than a crash.
Comment 72 Vladimir Panteleev 2014-01-11 04:06:23 UTC
(In reply to comment #71)
> It's simply that the compiler cannot handle cleanup when every instruction
> could throw. In case an asynchronous exception happens in the middle of some
> statement your program is immediately in an invalid state. Continuing could
> deadlock or corrupt data, much worse than a crash.

OK... so the problem is basically that we call destructors / finally blocks / scope(exit) blocks when Errors are thrown, and those may behave in a bad way since the program was in an indeterminate state? I imagine that it's the same for signals on POSIX? In that case, I suppose we could handle both in the same way: immediately print a stack trace and exit, but still provide a mechanism for the user to customize handling of such conditions.

I recall a discussion regarding whether thrown Errors should call finalizers on the stack, but I suppose it's not really clear-cut.
Comment 73 Martin Nowak 2014-01-11 10:53:13 UTC
(In reply to comment #72)
> (In reply to comment #71)
> OK... so the problem is basically that we call destructors / finally blocks /
> scope(exit) blocks when Errors are thrown, and those may behave in a bad way
> since the program was in an indeterminate state? I imagine that it's the same
> for signals on POSIX? In that case, I suppose we could handle both in the same
> way: immediately print a stack trace and exit, but still provide a mechanism
> for the user to customize handling of such conditions.
> 
> I recall a discussion regarding whether thrown Errors should call finalizers on
> the stack, but I suppose it's not really clear-cut.

It's a bigger topic and opinions vary. AFAIK Errors skip cleanup code, but it's possible to catch Errors to perform minimal cleanup.
Comment 74 Jonathan M Davis 2014-01-11 15:16:45 UTC
> It's a bigger topic and opinions vary. AFAIK Errors skip cleanup code, but 
> it's possible to catch Errors to perform minimal cleanup.

According to Walter, there is no guarantee that any cleanup code is run when an Error is thrown (though I'm not sure what the spec says on the matter). However, some of the folks that believe that cleanup should be done on Errors made it so that most cleanup code _is_ run when an Error is thrown. So, unless something is changed, I believe that normally finally blocks, scope(exit), scope(failure), and destructors will all be run when an Error is thrown. Where they will get skipped is when an Error is thrown from a nothrow function, and the cleanup code is outside the nothrow function, because the caller of the nothrow function will assume that the nothrow function doesn't throw anything and optimize based on that.

So, at this point, whether cleanup code is run on an Error depends on the code, and at minimum, it will never be the case that cleanup code is always run on an Error, because it won't be done for nothrow functions, or we lose the ability to optimize it based on the fact that it won't throw an exception (which is one of its benefits). However, whether it will ever be changed such that cleanup code is never run for Errors is an open question and a topic of hot debate.
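
To make the nothrow case concrete, here is a small sketch (the function names are made up). An Error is thrown from a nothrow function while the cleanup sits in its caller; whether the `scope (exit)` line runs is exactly the part that is not guaranteed, so no output is promised for it. Catching the Error further up still works.

```d
import std.stdio : writeln;

// An Error (unlike an Exception) may legally escape a nothrow function.
void fail() nothrow
{
    throw new Error("boom");
}

void caller()
{
    // fail() is nothrow, so the compiler may assume nothing is thrown here
    // and drop this cleanup. Whether this line prints is not guaranteed.
    scope (exit) writeln("cleanup in caller");
    fail();
}

// Returns the message of the Error caught at the top level.
string runAndCatch()
{
    try
    {
        caller();
    }
    catch (Error e)
    {
        return e.msg; // minimal handling is still possible
    }
    return "no error";
}

void main()
{
    writeln("caught: ", runAndCatch());
}
```

Moving the `scope (exit)` into a function whose body can throw an Exception is one way to compare the two situations on a given compiler.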
Comment 75 Vladimir Panteleev 2014-02-15 12:17:15 UTC
(In reply to comment #63)
> So we have stack traces on all platforms by now, can I close the bug?

What about line numbers? I think we only have them on Win64.

> Regarding DWARF processing, it shouldn't be too hard.

DWARF uses a weird state machine for efficient representation of file/line information. Doable but not trivial.

> But we could also dynamically load libdw.so from elfutils if it's installed
> similarly to how we load dbghelp.dll on Windows.

It's GPL just like binutils. IANAL, but I'm not sure about dynamically loading GPL libs. I know the GPL forbids redistribution of any programs that include the library... Doesn't that mean that it would make it impossible for e.g. Linux distributions to distribute non-GPL D software together with the library?

> https://github.com/bombela/backward-cpp

This seems to only support binutils and elfutils, both of which are GPL.
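
To give a feel for the state machine mentioned above, the sketch below applies a single DWARF "special opcode" to a simplified line-number state, following the formula in the DWARF line-number program description. It ignores standard and extended opcodes, the remaining state registers, and VLIW targets, so it is nowhere near a full parser. The types `LineState` and `LineHeader` and their field names are invented for this example.

```d
import std.stdio : writefln;

// Simplified subset of the line-number state machine registers.
struct LineState
{
    ulong address;
    int line = 1;
}

// Header fields that control how special opcodes are decoded.
struct LineHeader
{
    byte lineBase;
    ubyte lineRange;
    ubyte opcodeBase;
    ubyte minInstLength;
}

// A special opcode advances both the address and the line in one byte.
void applySpecialOpcode(ref LineState s, const LineHeader h, ubyte opcode)
{
    assert(opcode >= h.opcodeBase, "not a special opcode");
    const adjusted = opcode - h.opcodeBase;
    s.address += (adjusted / h.lineRange) * h.minInstLength;
    s.line += h.lineBase + (adjusted % h.lineRange);
}

void main()
{
    // Example header values; real ones come from the .debug_line section.
    const header = LineHeader(-5, 14, 13, 1);
    auto state = LineState(0x1000, 10);
    applySpecialOpcode(state, header, 75);
    // adjusted = 62, so address += 62 / 14 = 4 and line += -5 + 62 % 14 = 1
    writefln("0x%x line %d", state.address, state.line); // 0x1004 line 11
}
```

A real decoder emits a row of the address-to-line table after each such opcode and then looks up the return addresses from the stack trace in that table.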
Comment 76 Andrej Mitrovic 2014-02-15 12:23:39 UTC
(In reply to comment #64)
> On 64-bit Windows, an executable still crashes silently when an access violation
> occurs. At least with 2.064.2.
> 
> Code used to test:
> import core.stdc.stdio;
> 
> void test()
> {
>   int* i;
>   *i = 5;
> }
> 
> void main(string[] args)
> {
> test();
> printf("done\n");
> }

I can't reproduce this on Win7 x64 with 2.064. Here's the results for me:

$ dmd
> DMD32 D Compiler v2.064

$ dmd -run test.d
object.Error: Access Violation
----------------
0x00402012 in __xc_a
0x004022FF in __xi_a
0x00402217 in __xc_z
0x00402047 in __xc_a
0x75CC3677 in BaseThreadInitThunk
0x77859D72 in __RtlUserThreadStart
0x77859D45 in _RtlUserThreadStart
----------------

$ dmd -g -run test.d
object.Error: Access Violation
----------------
0x00402019 in void test.test() at C:\dev\code\d_code\test.d(6)
0x0040202C in _Dmain at C:\dev\code\d_code\test.d(12)
0x0040233C in void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll().void __lambda1()
0x0040230F in void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll()
0x00402227 in _d_run_main
0x00402054 in main
0x004140A9 in mainCRTStartup
0x75CC3677 in BaseThreadInitThunk
0x77859D72 in __RtlUserThreadStart
0x77859D45 in _RtlUserThreadStart
----------------
Comment 77 Benjamin Thaut 2014-02-15 12:31:09 UTC
@Andrej Mitrovic
You forgot the -m64 switch in your test. You are testing 32-bit binaries. The problem is not the 64-bit operating system; it is the 64-bit executable, as 64-bit executables do not implement SEH.
Comment 78 Vladimir Panteleev 2014-02-15 12:31:49 UTC
(In reply to comment #76)
> I can't reproduce this on Win7 x64 with 2.064. Here's the results for me:

The problem applies to 64-bit programs. I can reproduce it with -m64.

(In reply to comment #75)
> It's GPL just like binutils. IANAL, but I'm not sure about dynamically loading
> GPL libs. I know the GPL forbids redistribution of any programs that include
> the library... Doesn't that mean that it would make it impossible for e.g.
> Linux distributions to distribute non-GPL D software together with the library?

Just found this:
http://www.gnu.org/licenses/gpl-faq.html#IfLibraryIsGPL

Quoting:
> Q: If a library is released under the GPL (not the LGPL), does that mean that 
> any software which uses it has to be under the GPL or a GPL-compatible license?
> 
> A: Yes, because the software as it is actually run includes the library.

So I guess that gives us a definitive answer. We CANNOT use binutils or elfutils, so libdw and libbfd are out of the question. From the discussed options, that leaves FreeBSD's Elftoolchain, integrating with addr2line, or our own DWARF/ELF parser.
Comment 79 Andrej Mitrovic 2014-02-15 12:50:07 UTC
(In reply to comment #77)
> @Andrej Mitrovic
> You forgot the -m64 switch in your test. You are testing 32-bit binaries.

My bad. It's a huge thread and I didn't notice this. I can now confirm your test results with -m64.
Comment 80 Andrei Alexandrescu 2016-10-13 20:12:48 UTC
Maybe Lucia can take a look.
Comment 81 Vladimir Panteleev 2016-10-14 04:09:34 UTC
Worth noting that I think something changed within the 2 years since the last message from 2014, since we now have stack traces with line numbers on Linux.
Comment 82 Brad Roberts 2016-10-14 22:21:45 UTC
I think what's left here is for someone, probably Lucia, to survey the current state of all the platforms and file a few smaller bug reports that cover what doesn't work well.  One of them is probably a "clean up the actual output to be consistent across platforms".  This specific bug report has likely outlived its usefulness.
Comment 83 RazvanN 2021-01-18 14:26:52 UTC
Closing this based on previous comments.