Azure function visibilityTimeout

I have tested this with

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Logging;
using Microsoft.WindowsAzure.Storage.Queue;

namespace WorkerFunctions
{
    public static class WorkerFunctions
    {
        [FunctionName("WorkerFunction1")]
        public static async Task Function1(
            [QueueTrigger("outputQueue")]
            CloudQueueMessage item,
            [Queue("outputQueue")]
            CloudQueue outputQueue,
            DateTimeOffset nextVisibleTime,
            DateTimeOffset expirationTime,
            DateTimeOffset insertionTime,
            ILogger log)
        {
            log.LogInformation("########## Function 1 ###############");
            log.LogInformation($"NextVisibleTime: {nextVisibleTime}");
            log.LogInformation($"NextVisibleTime: {(nextVisibleTime-insertionTime).TotalSeconds}");
            log.LogInformation($"C# Queue trigger function processed: {item.AsString}");

            Thread.Sleep(TimeSpan.FromMinutes(20));
        }

        [FunctionName("WorkerFunction2")]
        public static async Task Function2(
            [QueueTrigger("outputQueue")]
            CloudQueueMessage item,
            [Queue("outputQueue")]
            CloudQueue outputQueue,
            DateTimeOffset nextVisibleTime,
            DateTimeOffset expirationTime,
            DateTimeOffset insertionTime,
            ILogger log)
        {
            log.LogInformation("########## Function 2 ###############");
            log.LogInformation($"NextVisibleTime: {nextVisibleTime}");
            log.LogInformation($"NextVisibleTime: {(nextVisibleTime - insertionTime).TotalSeconds}");
            log.LogInformation($"C# Queue trigger function processed: {item.AsString}");

            Thread.Sleep(TimeSpan.FromMinutes(20));
        }
    }
}

With this host file

{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02",
      "visibilityTimeout": "00:00:10",
      "batchSize": 16,
      "maxDequeueCount": 5,
      "newBatchThreshold": 8
    }
  }
}

And when i put a simple message on the queue and let it run, I see the following:

  1. the function that grabs it, doesn't release it before the sleep is over
  2. i can't see it in logs that the lease is renewed, but it seems like it happens under the hood

What this tells me:

  1. if the function doesn't fail, OR the host doesn't fail, well then the lease is autorenewed according to: https://stackoverflow.com/a/31883806/21199
  2. when the visibility timeout is reached, and the function is running, the message doesn't get "readded" to the queue
  3. that the documentation about the visibilityTimeout is true: "The time interval between retries when processing of a message fails." (from https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue#hostjson-settings)

I haven't saved any links to 3. party that contradicted this (sorry I haven't saved these), but they exists. I wish someone will answer this, so I can get clarification.


It appears there are actually two different visibility timeout values used here. Both are set by the Azure WebJobs SDK but only one is configurable.

When the function fails

The queues.visibilityTimeout configuration option would be more aptly named retryDelay.

When the function throws an exception or fails for some other kind of error, the message gets returned to the queue to be retried. The message is returned with the configured visibilityTimeout (see here), which delays when the function will next attempt to run.

This allows your application to cope with transient errors. For example, if an email API or other external service is temporarily down. By delaying the retry, there is a chance that the service may be back online for the next function attempt.

Retry is limited to maxDequeueCount attempts (5 default) before a message is moved to the Poison Queue.

While the function is running

When the QueueTrigger binding runs the function, it dequeues the message with a visibility timeout of 10mins (hard-coded here). It then sets a timer to extend the visibility window when it reaches half-time as long as the function is running (see the timer and visibility update in the source).

Ordinarily you don't need to worry about this as long as your functions use CancellationTokens correctly. This 10min timeout only matters if the Azure Function/WebJob host doesn't get to shut down gracefully. For example:

  • someone "pulls the plug" on the web host
  • if the function doesn't respond to the CancellationToken in time during scale-in or other Azure shutdown events

So, as long as the function is still running, the message will remain hidden from the queue.

Verification

I did a similar experiment to check:

[FunctionName("SlowJob")]
public async Task Run(
    [QueueTrigger("slow-job-queue")] CloudQueueMessage message,
    ILogger log)
{
    for (var i = 0; i < 20; i++)
    {
        log.LogInformation($"Next visible {i}: {message.NextVisibleTime}");
        await Task.Delay(60000);
    }
}

Output:

Next visible 0: 5/11/2020 7:49:24 +00:00
Next visible 1: 5/11/2020 7:49:24 +00:00
Next visible 2: 5/11/2020 7:49:24 +00:00
Next visible 3: 5/11/2020 7:49:24 +00:00
Next visible 4: 5/11/2020 7:49:24 +00:00
Next visible 5: 5/11/2020 7:54:24 +00:00
Next visible 6: 5/11/2020 7:54:24 +00:00
Next visible 7: 5/11/2020 7:54:24 +00:00
Next visible 8: 5/11/2020 7:54:24 +00:00
Next visible 9: 5/11/2020 7:54:24 +00:00
Next visible 10: 5/11/2020 7:59:24 +00:00
Next visible 11: 5/11/2020 7:59:24 +00:00
...