Chrome S3 Cloudfront: No 'Access-Control-Allow-Origin' header on initial XHR request

You're making two requests for the same object, one from HTML, one from XHR. The second one fails, because Chrome uses the cached response from the first request, which has no Access-Control-Allow-Origin response header.

Why?

Chromium bug 409090, "Cross-origin request from cache failing after regular request is cached," describes this problem, and it's a "won't fix" -- they believe their behavior is correct. Chrome considers the cached response to be usable, apparently because the response didn't include a Vary: Origin header.

But S3 does not return Vary: Origin when an object is requested without an Origin: request header, even when CORS is configured on the bucket. Vary: Origin is only sent when an Origin header is present in the request.

And CloudFront does not add Vary: Origin even when Origin is whitelisted for forwarding, which should by definition mean that varying the header might modify the response -- that's the reason why you forward and cache against request headers.

CloudFront gets a pass, because its response would be correct if S3's were more correct: CloudFront does return Vary: Origin when S3 provides it.

S3, a little fuzzier. It is not wrong to return Vary: Some-Header when there was no Some-Header in the request.

For example, a response that contains

Vary: accept-encoding, accept-language

indicates that the origin server might have used the request's Accept-Encoding and Accept-Language fields (or lack thereof) as determining factors while choosing the content for this response. (emphasis added)

https://tools.ietf.org/html/rfc7231#section-7.1.4

Clearly, Vary: Some-Absent-Header is valid, so S3 would be correct if it added Vary: Origin to its response if CORS is configured, since that indeed could vary the response.

And, apparently, this would make Chrome do the right thing. Or, if it doesn't do the right thing in this case, it would be violating a MUST NOT. From the same section:

An origin server might send Vary with a list of fields for two purposes:

  1. To inform cache recipients that they MUST NOT use this response to satisfy a later request unless the later request has the same values for the listed fields as the original request (Section 4.1 of [RFC7234]). In other words, Vary expands the cache key required to match a new request to the stored cache entry.

...

So, S3 really SHOULD be returning Vary: Origin whenever CORS is configured on the bucket, even if Origin is absent from the request, but it doesn't.

Still, S3 is not strictly wrong for not returning the header, because it's only a SHOULD, not a MUST. Again, from the same section of RFC-7231:

An origin server SHOULD send a Vary header field when its algorithm for selecting a representation varies based on aspects of the request message other than the method and request target, ...

On the other hand, the argument could be made that Chrome should implicitly treat the Origin header as part of the cache key, because varying it could change the response in the same way that varying Authorization could.

...unless the variance cannot be crossed or the origin server has been deliberately configured to prevent cache transparency. For example, there is no need to send the Authorization field name in Vary because reuse across users is constrained by the field definition [...]

Similarly, reuse across origins is arguably constrained by the nature of Origin but this argument is not a strong one.


tl;dr: You apparently cannot successfully fetch an object from HTML and then successfully fetch it again as a CORS request with Chrome and S3 (with or without CloudFront), due to peculiarities in the implementations.
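The failure mode summarized above can be illustrated with a toy cache. This is only a sketch of the RFC's "Vary expands the cache key" rule, not Chrome's actual implementation; the ToyCache class and its store/lookup methods are made up for this example.

```javascript
'use strict';

// Toy model of an HTTP cache, only to illustrate how Vary extends the cache key.
// Hypothetical class: this is not how Chrome is actually implemented.
class ToyCache {
    constructor() {
        this.entries = new Map(); // url -> array of stored entries
    }

    // Store a response together with the request headers that produced it.
    store(url, requestHeaders, response) {
        const list = this.entries.get(url) || [];
        list.push({ requestHeaders, response });
        this.entries.set(url, list);
    }

    // Return a stored response only if every header named in its Vary
    // has the same value (or is equally absent) in the new request.
    lookup(url, requestHeaders) {
        for (const entry of this.entries.get(url) || []) {
            const vary = (entry.response.headers['vary'] || '')
                .split(',')
                .map((name) => name.trim().toLowerCase())
                .filter(Boolean);
            const matches = vary.every(
                (name) => entry.requestHeaders[name] === requestHeaders[name]
            );
            if (matches) return entry.response;
        }
        return undefined;
    }
}

const url = 'https://dzczcexample.cloudfront.net/image.png';
const xhrRequest = { origin: 'https://example.com' };

// Case 1: what S3 does today. The non-CORS response has no Vary header.
const withoutVary = new ToyCache();
withoutVary.store(url, {}, { headers: {} });
const reused = withoutVary.lookup(url, xhrRequest);

// Case 2: the non-CORS response carries Vary: Origin.
const withVary = new ToyCache();
withVary.store(url, {}, { headers: { vary: 'Origin' } });
const notReused = withVary.lookup(url, xhrRequest);

console.log(reused !== undefined);    // the cached, CORS-less response is reused
console.log(notReused === undefined); // cache miss, so a real CORS request is sent
```

In the first case the cached response, which has no Access-Control-Allow-Origin header, satisfies the XHR lookup. In the second, the mismatch on Origin forces a new request.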


Workaround:

This behavior can be worked around with CloudFront and Lambda@Edge, using the following code as an Origin Response trigger.

This adds Vary: Access-Control-Request-Headers, Access-Control-Request-Method, Origin to any response from S3 that has no Vary header. Otherwise, the Vary header in the response is not modified.

'use strict';

// If the response lacks a Vary: header, fix it in a CloudFront Origin Response trigger.

exports.handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;
    const headers = response.headers;

    if (!headers['vary'])
    {
        headers['vary'] = [
            { key: 'Vary', value: 'Access-Control-Request-Headers' },
            { key: 'Vary', value: 'Access-Control-Request-Method' },
            { key: 'Vary', value: 'Origin' },
        ];
    }
    callback(null, response);
};

Attribution: I am also the author of the original post on the AWS Support forums where this code was initially shared.
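To sanity-check the trigger before deploying it, something like the following local smoke test may help. It is a sketch: the event object is a hypothetical, heavily trimmed stand-in that contains only the fields the handler reads, and the makeEvent and run helpers are invented for this example. The handler logic is repeated so the snippet runs on its own under Node.js.

```javascript
'use strict';

// Same logic as the trigger above, repeated so this snippet is self-contained.
const handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;
    const headers = response.headers;

    if (!headers['vary']) {
        headers['vary'] = [
            { key: 'Vary', value: 'Access-Control-Request-Headers' },
            { key: 'Vary', value: 'Access-Control-Request-Method' },
            { key: 'Vary', value: 'Origin' },
        ];
    }
    callback(null, response);
};

// Hypothetical, trimmed event: only the fields the handler actually reads.
const makeEvent = (headers) => ({
    Records: [{ cf: { response: { status: '200', headers } } }],
});

// Invoke the handler (it calls back synchronously) and return the headers.
const run = (headers) => {
    let result;
    handler(makeEvent(headers), {}, (err, response) => {
        result = response.headers;
    });
    return result;
};

// Response without any Vary header, as S3 sends when Origin was absent.
const patched = run({});

// Response that already has a Vary header: expected to be left untouched.
const untouched = run({
    vary: [{ key: 'Vary', value: 'Accept-Encoding' }],
});

console.log(patched.vary.map((h) => h.value).join(', '));
console.log(untouched.vary.map((h) => h.value).join(', '));
```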


The Lambda@Edge solution above results in fully correct behavior, but here are two alternatives that you may find useful, depending on your specific needs:

Alternative/Hackaround #1: Forge the CORS headers in CloudFront.

CloudFront supports custom headers that are added to each request. If you set Origin: on every request, even those that are not cross-origin, this will enable correct behavior in S3. The configuration option is called Custom Origin Headers, with the word "Origin" meaning something entirely different than it means in CORS. Configuring a custom header like this in CloudFront overwrites what is sent in the request with the value specified, or adds it if absent. If you have exactly one origin accessing your content over XHR, e.g. https://example.com, you can add that. Using * is dubious, but might work for other scenarios. Consider the implications carefully.
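As a rough sketch of what that setting looks like outside the console, the fragment below builds a custom-header block in the Quantity/Items shape that the CloudFront API uses for an origin's CustomHeaders. The buildCustomHeaders helper is made up for this example, https://example.com is a placeholder, and the exact shape (and whether CloudFront currently accepts Origin as a custom header) should be checked against the current CloudFront documentation.

```javascript
'use strict';

// Hypothetical helper: builds the CustomHeaders fragment of a CloudFront
// origin definition. `allowedOrigin` is a placeholder for the single site
// that accesses the content over XHR.
const buildCustomHeaders = (allowedOrigin) => {
    const items = [{ HeaderName: 'Origin', HeaderValue: allowedOrigin }];
    return { Quantity: items.length, Items: items };
};

const customHeaders = buildCustomHeaders('https://example.com');
console.log(JSON.stringify(customHeaders, null, 2));
```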

Alternative/Hackaround #2: Use a "dummy" query string parameter that differs for HTML and XHR or is absent from one or the other. These parameters are typically named x-* but should not be x-amz-*.

Let's say you make up the name x-request, so the HTML becomes <img src="https://dzczcexample.cloudfront.net/image.png?x-request=html">. When accessing the object from JS, don't add the query parameter. CloudFront is already doing the right thing, by caching different versions of the objects using the Origin header or absence of it as part of the cache key, because you forwarded that header in your cache behavior. The problem is, your browser doesn't know this. The differing query string convinces the browser that this is actually a separate object that needs to be requested again, in a CORS context.
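A minimal sketch of generating the two URLs, assuming the made-up parameter name x-request and value html from the example above. The htmlUrl helper is hypothetical.

```javascript
'use strict';

// Hypothetical helper: tag a URL with the dummy parameter for use in HTML.
// "x-request" and "html" are the made-up name and value from the text.
const htmlUrl = (objectUrl) => {
    const url = new URL(objectUrl);
    url.searchParams.set('x-request', 'html');
    return url.toString();
};

const objectUrl = 'https://dzczcexample.cloudfront.net/image.png';

const forImgTag = htmlUrl(objectUrl); // goes into <img src="...">
const forXhr = objectUrl;             // used unchanged with XMLHttpRequest or fetch

console.log(forImgTag);
console.log(forXhr);
```

Because the two URLs differ, the browser keeps separate cache entries for them, so the XHR is never answered from the entry created by the image tag.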

If you use these alternative suggestions, use one or the other -- not both.