Overhead associated with OutputDebugString in release build

I measured it: 10M calls take about 50 seconds. I think that's significant overhead for unused functionality.

Using a macro can get rid of this in a release build:

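#ifdef _DEBUG
    // No trailing semicolon here: the call site supplies it, so the macro is safe inside if/else.
    #define LOGMESSAGE( str ) OutputDebugString( str )
#else
    // Expands to a no-op; the argument expression is never evaluated.
    #define LOGMESSAGE( str ) ((void)0)
#endif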

This removes not only the calls but also the evaluation of their arguments, and the text strings are dropped entirely, so you will not see them in the binary file.
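
To illustrate the point about argument evaluation, here is a minimal sketch. StubDebugOutput and BuildMessage are hypothetical names made up for this example, and the stub stands in for OutputDebugString only so the snippet builds on any platform. When _DEBUG is not defined, the argument expression disappears at preprocessing time, so BuildMessage is never called.

```cpp
#include <cstdio>

// Hypothetical stand-in for OutputDebugString (assumption, for portability only).
inline void StubDebugOutput(const char* text) { std::fputs(text, stderr); }

// Counts how often the "expensive" argument expression is actually evaluated.
static int g_buildCalls = 0;
inline const char* BuildMessage() { ++g_buildCalls; return "state dump\n"; }

#ifdef _DEBUG
    #define LOGMESSAGE( str ) StubDebugOutput( str )
#else
    #define LOGMESSAGE( str ) ((void)0)
#endif

// Returns how many times BuildMessage() ran for a single LOGMESSAGE call.
int EvaluationsForOneLog() {
    g_buildCalls = 0;
    LOGMESSAGE( BuildMessage() );
    return g_buildCalls;
}
```

In a build without _DEBUG the function returns 0; with _DEBUG defined it returns 1.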


I had read in an article that OutputDebugString internally does a few interesting things:

  1. Creates or opens a mutex and waits indefinitely until the mutex is acquired.
  2. Passes the data between the application and the debugger via a 4 KB chunk of shared memory, with a mutex and two event objects protecting access to it.
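
As a rough picture of that shared chunk, the sketch below uses the layout and object names that are commonly reported in write-ups of this mechanism. None of it is an official, documented contract, and the C++ identifiers (DbwinBuffer and the k... constants) are names I made up for the illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Commonly described layout of the shared section: a process ID followed by text.
// Assumption: taken from third-party descriptions, not from official documentation.
struct DbwinBuffer {
    std::uint32_t processId;                           // process that wrote the message
    char          data[4096 - sizeof(std::uint32_t)];  // NUL-terminated ANSI text
};

static_assert(sizeof(DbwinBuffer) == 4096, "the shared section is one 4 KB chunk");

// Kernel object names as usually reported:
constexpr const char* kMutexName       = "DBWinMutex";          // serialises writers
constexpr const char* kBufferName      = "DBWIN_BUFFER";        // the shared memory section
constexpr const char* kBufferReadyName = "DBWIN_BUFFER_READY";  // listener signals: buffer is free
constexpr const char* kDataReadyName   = "DBWIN_DATA_READY";    // writer signals: message is in place

// Largest message (excluding the terminator) that fits in one transfer.
constexpr std::size_t kMaxMessageChars = sizeof(DbwinBuffer::data) - 1;
```

Every message goes through this single buffer, which is why concurrent callers end up waiting on each other.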

Even if no debugger is attached (as in release mode), there is a significant cost in calling OutputDebugString because of the various kernel objects it uses.

The performance hit is very evident if you write some sample code and test it.
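
A minimal sketch of such a test might look like the following. TimeDebugOutput and EmitDebugString are hypothetical helper names; on Windows the helper calls OutputDebugStringA, while on other platforms it is an empty placeholder so that the snippet still compiles (the numbers are meaningless there).

```cpp
#include <chrono>
#include <cstddef>

#ifdef _WIN32
    #include <windows.h>
    // Real call on Windows.
    static void EmitDebugString(const char* text) { OutputDebugStringA(text); }
#else
    // Placeholder (assumption): lets the sketch build elsewhere, measures nothing useful.
    static void EmitDebugString(const char*) {}
#endif

// Times `calls` consecutive calls and returns the elapsed time in seconds.
double TimeDebugOutput(std::size_t calls) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < calls; ++i) {
        EmitDebugString("benchmark line\n");
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Running it in a release build, once with no listener and once with a viewer such as DebugView open, should make the difference visible. Results will vary by machine and Windows version.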


I've not seen a problem in dozens of server-side release mode apps over the years, all of which have built-in metrics. It can seem slow because most of the debug-catcher applications you can find (DBWin32 et al.) are pretty slow at throwing the data up onto the screen, which gives the impression of lag.

Of course all of our applications have this output disabled by default, but it is useful to be able to turn it on in the field, since you can then view debug output from several applications, serialised in something like DBWin32. This can be a very useful debugging technique for bugs which involve communicating applications.
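
One way to get "disabled by default, switchable in the field" is a runtime flag checked before the call. This is only a sketch under my own assumptions: SetDebugOutput, LogMessage and CaptureDebugOutput are hypothetical names, the sink records messages in memory instead of calling OutputDebugString so the snippet runs anywhere, and how the flag gets flipped (registry, config file, signal) is left out.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical sink standing in for OutputDebugString (assumption, for portability).
static std::vector<std::string> g_captured;
static void CaptureDebugOutput(const char* text) { g_captured.push_back(text); }

// Off by default; some configuration mechanism (not shown) turns it on in the field.
static std::atomic<bool> g_debugOutputEnabled{false};

void SetDebugOutput(bool enabled) {
    g_debugOutputEnabled.store(enabled, std::memory_order_relaxed);
}

void LogMessage(const char* text) {
    // Cheap early-out: when disabled, the sink (and its kernel objects) is never touched.
    if (!g_debugOutputEnabled.load(std::memory_order_relaxed)) return;
    CaptureDebugOutput(text);
}
```

Unlike the compile-time macro approach, the arguments are still evaluated when output is off, so message formatting should be kept cheap or moved behind the flag check.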


I'm writing this long after this question has been answered, but the given answers miss a certain aspect:

OutputDebugString can be quite fast when no one is listening to its output. However, having a listener running in the background (be it DbgView, DBWin32, Visual Studio etc.) can make it more than 10 times slower (much more in a multithreaded environment). The reason is that those listeners hook the report event, and their handling of the event is done within the scope of the OutputDebugString call. Moreover, if several threads call OutputDebugString concurrently, they will be synchronized. For more, see Watch out: DebugView (OutputDebugString) & Performance.

As a side note, I think that unless you're running a real-time application, you should not be that worried about a facility that takes 50 seconds to run 10M calls (about 5 microseconds per call). If your log contains 10M entries, the 50 seconds wasted are the least of your problems, now that you have to somehow analyze the beast. A 10K log sounds much more reasonable, and creating that will take only 0.05 seconds as per sharptooth's measurement.

So, if your output is within a reasonable size, using OutputDebugString should not hurt you that much. However, keep in mind that a slowdown will occur once someone on the system starts listening to this output.