Is there any performance difference between ++i and i++ in C#?

Ah... Open again. OK. Here's the deal.

ILDASM is a start, but not an end. The key is: What will the JIT generate for assembly code?

Here's what you want to do.

Take a couple samples of what you are trying to look at. Obviously you can wall-clock time them if you want - but I assume you want to know more than that.
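A rough wall-clock comparison could look like the sketch below. The class name, method names, and iteration count are made up for illustration, and a single warm-up call is only an approximation: on newer runtimes with tiered compilation, the first call may not yet be running fully optimized code. Treat the numbers as a hint, not a verdict.

```csharp
using System;
using System.Diagnostics;

static class IncrementTiming
{
    // Arbitrary iteration count (an assumption); pick one large enough to measure.
    const int Iterations = 100_000_000;

    // Sums 0..n-1 with a pre-increment loop counter.
    public static long PreIncrementLoop(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; ++i) sum += i;
        return sum;
    }

    // Same work with a post-increment loop counter.
    public static long PostIncrementLoop(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    // Runs the routine once and returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Func<int, long> routine, int n)
    {
        var sw = Stopwatch.StartNew();
        long result = routine(n);
        sw.Stop();
        if (result < 0) Console.WriteLine(result); // use the result so the call is not dead code
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        // Warm-up so JIT compilation time is not included in the measurement.
        PreIncrementLoop(1000);
        PostIncrementLoop(1000);

        Console.WriteLine($"++i: {TimeMs(PreIncrementLoop, Iterations):F1} ms");
        Console.WriteLine($"i++: {TimeMs(PostIncrementLoop, Iterations):F1} ms");
    }
}
```

Build in Release mode and run it outside the debugger, several times, before reading anything into the difference.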

Here's what's not obvious. The C# compiler generates some MSIL sequences that are non-optimal in a lot of situations. The JIT is tuned to deal with these, and with quirks from other languages. The problem: only the 'quirks' someone has noticed have been tuned for.

What you really want is a sample that runs each of your implementations once, returns back up to main (or wherever), then Sleep()s or otherwise pauses so you can attach a debugger, and then runs the routines again.

You DO NOT want to start the code under the debugger, or the JIT will generate non-optimized code - and it sounds like you want to know how it will behave in a real environment. The JIT does this to maximize debug info and to keep the current source location from 'jumping around'. Never start a perf evaluation under the debugger.

OK. So once the code has run once (i.e. the JIT has generated code for it), attach the debugger during the sleep (or whatever). Then look at the x86/x64 that was generated for the two routines.
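The procedure above might be sketched like this. Everything named here (the class, the two routines, the 30-second pause) is a placeholder of mine, and marking the routines NoInlining is my own assumption to keep them as separate methods that are easy to find in the disassembly.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

static class JitInspectionHarness
{
    // Counts the odd numbers in 0..n-1 using a pre-increment loop counter.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int CountWithPreIncrement(int n)
    {
        int count = 0;
        for (int i = 0; i < n; ++i) count += i & 1;
        return count;
    }

    // Same work using a post-increment loop counter.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int CountWithPostIncrement(int n)
    {
        int count = 0;
        for (int i = 0; i < n; i++) count += i & 1;
        return count;
    }

    static void Main()
    {
        // First pass: calling each routine makes the JIT generate code for it.
        Console.WriteLine(CountWithPreIncrement(1000));
        Console.WriteLine(CountWithPostIncrement(1000));

        // Pause so a debugger can be attached to the already-running process.
        Console.WriteLine("Attach the debugger now (30 seconds)...");
        Thread.Sleep(TimeSpan.FromSeconds(30));

        // Second pass: break here and step into the routines to view the disassembly.
        Console.WriteLine(CountWithPreIncrement(1000));
        Console.WriteLine(CountWithPostIncrement(1000));
    }
}
```

Build it in Release mode and launch it without the debugger, so that the code you inspect after attaching is the code the JIT would produce in a real run.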

My gut tells me that if you are using ++i/i++ as you described - i.e. in a standalone expression where the rvalue result is not re-used - there won't be a difference. But won't it be fun to go find out and see all the neat stuff! :)
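For contrast, here is a small sketch of the case where the result of the expression is used, which is the only place the two forms differ in meaning. The class and method names are illustrative only.

```csharp
using System;

static class IncrementSemantics
{
    // i++ yields the value of i from before the increment.
    public static (int Result, int After) PostIncrement(int i)
    {
        int result = i++;
        return (result, i);
    }

    // ++i yields the value of i from after the increment.
    public static (int Result, int After) PreIncrement(int i)
    {
        int result = ++i;
        return (result, i);
    }

    static void Main()
    {
        Console.WriteLine(PostIncrement(5)); // prints (5, 6)
        Console.WriteLine(PreIncrement(5));  // prints (6, 6)
    }
}
```

When the result is discarded, as in the third clause of a for loop, both forms leave i in the same state, so the compiler has no reason to emit different code for them.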

There is no difference in the generated intermediate code for ++i and i++ in this case. Given this program:

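using System;
class Program
{
    const int counter = 1024 * 1024;
    static void Main(string[] args)
    {
        for (int i = 0; i < counter; ++i)   // pre-increment
            Console.WriteLine(i);

        for (int i = 0; i < counter; i++)   // post-increment
            Console.WriteLine(i);
    }
}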
The generated IL code is the same for both loops:

  IL_0000:  ldc.i4.0
  IL_0001:  stloc.0
  // Start of first loop
  IL_0002:  ldc.i4.0
  IL_0003:  stloc.0
  IL_0004:  br.s       IL_0010
  IL_0006:  ldloc.0
  IL_0007:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_000c:  ldloc.0
  IL_000d:  ldc.i4.1
  IL_000e:  add
  IL_000f:  stloc.0
  IL_0010:  ldloc.0
  IL_0011:  ldc.i4     0x100000
  IL_0016:  blt.s      IL_0006
  // Start of second loop
  IL_0018:  ldc.i4.0
  IL_0019:  stloc.0
  IL_001a:  br.s       IL_0026
  IL_001c:  ldloc.0
  IL_001d:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_0022:  ldloc.0
  IL_0023:  ldc.i4.1
  IL_0024:  add
  IL_0025:  stloc.0
  IL_0026:  ldloc.0
  IL_0027:  ldc.i4     0x100000
  IL_002c:  blt.s      IL_001c
  IL_002e:  ret

That said, it's possible (although highly unlikely) that the JIT compiler can do some optimizations in certain contexts that will favor one version over the other. If there is such an optimization, though, it would likely only affect the final (or perhaps the first) iteration of a loop.

In short, there will be no difference in the runtime of simple pre-increment or post-increment of the control variable in the looping construct that you've described.

If you're asking this question, you're trying to solve the wrong problem.

The first question to ask is "how do I improve customer satisfaction with my software by making it run faster?" and the answer is almost never "use ++i instead of i++" or vice versa.

From Coding Horror's post "Hardware is Cheap, Programmers are Expensive":

Rules of Optimization:
Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
-- M.A. Jackson

I read rule 2 to mean "first write clean, clear code that meets your customer's needs, then speed it up where it's too slow". It's highly unlikely that ++i vs. i++ is going to be the solution.