Using custom performance counters across appdomain recycles

The above error can also appear in the Application event log when using process-based WCF performance counters. The symptom is a block of three errors in the Application event log, for example:

A performance counter was not loaded.
Category Name: ServiceModelService 3.0.0.0
Counter Name: Calls
Exception:System.InvalidOperationException: Instance 'abc@def|service.svc' already exists with a lifetime of Process. It cannot be recreated or reused until it has been removed or until the process using it has exited.

This always occurs immediately after the following Information message in the System event logs:

A worker process with process id of 'nnnn' serving application pool 'MyAppPool' has requested a recycle because the worker process reached its allowed processing time limit.

This causes the performance monitor counters for this service to become unavailable, apparently for a few hours.

The ServiceModelService 3.0.0.0 version number will depend on the version of .NET you are using (this was tested using .NET 3.5).

Background

The fault is triggered when the worker process reaches its processing time limit and must be recycled. The limit is set in the IIS Application Pool Recycling settings (IIS 6.0 and above, therefore Windows Server 2003 and above). IIS uses overlapped recycling, where the worker process to be terminated is kept running until after the new worker process has started. During that overlap the new process tries to create a process-based performance counter instance with the same name as the one still held by the old process, which produces the error.

Reproduction

(Tested on Windows Server 2003 = IIS 6.0)

  • Create a WCF service.
  • Add the following to the web.config in the <system.serviceModel> section:
    <diagnostics performanceCounters="All" />
  • In the Application Pool properties for the service under Properties → Recycling, set the Recycle worker process to 1 minute. Manually recycling the Application Pool has no effect, as this does not create a request from the worker process to recycle (as evident in the Event Viewer System logs with W3SVC Information events).
  • In soapUI create a Test Suite / Test Case / Test Step / Test Request for the WCF operation.
  • Add the test case to either a load test in soapUI, or use loadUI, firing the call at the rate of 1 per second.
  • Every minute the worker process will request a recycle (evident in the System logs). Every 2 minutes this will result in a batch of three errors in the Application logs from System.ServiceModel 3.0.0.0.
  • The performance counters for that service will become unavailable until the worker process recycles again. (NB Setting the recycle period to a higher value at this point to see how long the performance counters are unavailable for will actually recycle the processes and make the counters available again.)

Possible solutions

Solution 1 - the red herrings

Hotfix KB981574 supersedes hotfix KB971601. The latter hotfix describes the problem:

FIX: The performance counters that monitor an application stop responding when the application exits and restarts and you receive the System.InvalidOperationException exception on a computer that is running .NET Framework 2.0

Applying the former hotfix does not remedy the problem. Applying the latter hotfix caused app pool errors.

Solution 2 - a working solution

It is possible to create a service host factory that exposes a custom service host:

using System;
using System.Diagnostics;
using System.ServiceModel;
using System.ServiceModel.Activation;

namespace MyNamespace
{
    public class WebFarmServiceHostFactory : ServiceHostFactory
    {
        protected override ServiceHost CreateServiceHost(
                   Type serviceType, Uri[] baseAddresses)
        {
            return new WebFarmServiceHost(serviceType, baseAddresses);
        }
    }

    public class WebFarmServiceHost : ServiceHost
    {
        public WebFarmServiceHost(
            Type serviceType, params Uri[] baseAddresses)
            : base(serviceType, baseAddresses) { }

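        protected override void ApplyConfiguration()
        {
            base.ApplyConfiguration();

            // Prefix the service name with the worker process ID so that the
            // name differs between the old and new worker processes while
            // they overlap during a recycle.
            Description.Name = "W3wp" + Process.GetCurrentProcess().Id +
                               Description.Name;
        }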
    }
}

And refer to the factory in the service markup .svc file:

<%@ ServiceHost Language="C#" Debug="true" Factory="MyNamespace.WebFarmServiceHostFactory" Service="WcfService1.Service1" CodeBehind="Service1.svc.cs" %>

Final tweaks

Unfortunately, although this makes the problem occur much less frequently, it does not eliminate it (as a rough estimate it occurs around 30 times less often, although I have not measured this accurately).

A simple tweak that seems to remedy the problem completely is to add a sleep call just before the base.ApplyConfiguration(); line (this needs using System.Threading; at the top of the file):

Thread.Sleep(1);

This only fires once per worker process recycle, so it should have a negligible impact on the performance of the service. You may have to raise this value (you could make it configurable), although the minimum setting of 1 ms worked for me (leaving aside the debate over how long this call actually sleeps for).
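As a rough sketch of the configurable variant, the host below reads the delay from appSettings and falls back to 1 ms. The key name PerfCounterStartupDelayMs and the helper ParseDelayMilliseconds are my own hypothetical names, not part of WCF or IIS, and the project needs a reference to System.Configuration. I have not load tested this exact version, so treat it as a starting point only.

```csharp
using System;
using System.Configuration;
using System.Diagnostics;
using System.ServiceModel;
using System.Threading;

namespace MyNamespace
{
    public class WebFarmServiceHost : ServiceHost
    {
        // Hypothetical appSettings key, chosen for this example only.
        private const string DelayKey = "PerfCounterStartupDelayMs";

        public WebFarmServiceHost(
            Type serviceType, params Uri[] baseAddresses)
            : base(serviceType, baseAddresses) { }

        protected override void ApplyConfiguration()
        {
            // Pause briefly before the configuration is applied.
            string raw = ConfigurationManager.AppSettings[DelayKey];
            Thread.Sleep(ParseDelayMilliseconds(raw));

            base.ApplyConfiguration();

            Description.Name = "W3wp" + Process.GetCurrentProcess().Id +
                               Description.Name;
        }

        // Returns the configured delay, or 1 ms when the setting is
        // missing, not a number, or negative.
        public static int ParseDelayMilliseconds(string raw)
        {
            int delay;
            return int.TryParse(raw, out delay) && delay >= 0 ? delay : 1;
        }
    }
}
```

With this in place, adding <add key="PerfCounterStartupDelayMs" value="25" /> to the <appSettings> section of web.config would raise the delay to 25 ms without recompiling.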

Solution 3 - the simplest fix

In IIS there is a DisallowOverlappingRotation Metabase Property. This can be set as follows (example given for MyAppPool application pool):

cscript %SYSTEMDRIVE%\inetpub\adminscripts\adsutil.vbs SET w3svc/AppPools/MyAppPool/DisallowOverlappingRotation TRUE

Comparison of solutions

Solution #3 will cause your site to be down longer when a worker process recycles due to the absence of IIS overlapped recycling.

On soapUI load tests of a basic web service using 100+ transactions per second with 5 threads, a 'freeze' where new transactions were blocked was evident for a couple of seconds every time the worker process recycled. This freeze was more prolonged (8+ seconds) when a more complex web service was tested.

Solution #2 produced no such blocking, had a smooth flow of responses during the recycle, and produced no perfmon conflict errors.

Conclusion

For web services where low latency is not a requirement, Solution #3 could be used. You could even set the recycling to be daily at a set time if you know the load distribution and quiet times of day (this can be done in the same tab in IIS). This could even be staggered if a web farm was used.

For web services which cannot tolerate such delays, it seems Solution #2 is the best way forwards.


IIRC, IIS will not make sure that your first AppDomain is closed before it starts the second, particularly when you are recycling it manually or automatically. I believe that when a recycle is initiated, the second AppDomain is instantiated first, and once that succeeds, new incoming requests are directed towards it, and then IIS waits for the first AppDomain (the one being shut down) to finish processing any requests it has.

The upshot is that there's an overlap where two AppDomains are in existence, both with the same value for instance_name.

However, not all is solved. I have corrected this problem in my code by including the process ID as part of the instance name. But it seems to have introduced another problem -- what I thought was a process-scoped performance counter never seems to go away without rebooting the computer. (That may be a bug on my part, so YMMV).

This is the routine I have for creating an instance name:

    // Requires: using System; using System.Diagnostics;
    private static string GetFriendlyInstanceName()
    {
        // Keep only the part of the AppDomain name before the first dash.
        string friendlyName = AppDomain.CurrentDomain.FriendlyName;
        int dashPosition = friendlyName.IndexOf('-');
        if (dashPosition > 0)
        {
            friendlyName = friendlyName.Substring(0, dashPosition);
        }
        friendlyName = friendlyName.TrimStart('_');

        // Include the process name and ID so each worker process gets its own instance name.
        using (Process process = Process.GetCurrentProcess())
        {
            string cleanedName = friendlyName.Replace('/', '_').Trim('_').Trim();
            return process.ProcessName + " " + process.Id + " " + cleanedName;
        }
    }
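
To make the transformation concrete, here is the same logic written as a pure function that takes the friendly name, process name and process ID as parameters, so it can be run outside IIS. The class name InstanceNameDemo and the sample friendly name are assumptions for illustration; the real value comes from AppDomain.CurrentDomain.FriendlyName and may look different on your server.

```csharp
using System;

public static class InstanceNameDemo
{
    // Same steps as GetFriendlyInstanceName, with the inputs passed in
    // instead of being read from the current AppDomain and process.
    public static string Build(string friendlyName, string processName, int processId)
    {
        int dashPosition = friendlyName.IndexOf('-');
        if (dashPosition > 0)
        {
            friendlyName = friendlyName.Substring(0, dashPosition);
        }
        friendlyName = friendlyName.TrimStart('_');
        return processName + " " + processId + " " +
               friendlyName.Replace('/', '_').Trim('_').Trim();
    }

    public static void Main()
    {
        // Assumed sample value in the style of an ASP.NET AppDomain name.
        string sample = "/LM/W3SVC/1/ROOT/MyApp-1-129000000000000000";
        Console.WriteLine(Build(sample, "w3wp", 1234));
        // prints "w3wp 1234 LM_W3SVC_1_ROOT_MyApp"
    }
}
```

Note that the name is cut at the first dash, so an application path that itself contains a dash would be truncated at that point.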