Is it possible to inject HTML into an image to provoke XSS?

I'm pretty sure bstpierre was talking about a situation where you let users upload images to display in a gallery, and you also display metadata about each image taken from its EXIF tags. If the EXIF tags were deliberately constructed to contain malicious HTML, and the app did no escaping (e.g., converting characters like <, >, and & so they are rendered in the page as &lt;, &gt;, and &amp;), you could conceivably use them for an XSS attack. Again, it is not a problem if your web app sanitizes all user input, including tags extracted from image files.
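To illustrate the escaping step, here is a minimal Python sketch. The function name render_exif_field and the table-row markup are made up for this example; only html.escape comes from the standard library, and the EXIF value is assumed to have been extracted already as a string.

```python
import html


def render_exif_field(label: str, value: str) -> str:
    """Build one HTML table row, escaping both the label and the value.

    render_exif_field is a hypothetical helper, not part of any library.
    """
    safe_label = html.escape(label, quote=True)
    safe_value = html.escape(value, quote=True)
    return "<tr><th>{}</th><td>{}</td></tr>".format(safe_label, safe_value)


# An EXIF "Artist" tag crafted by an attacker.
malicious = '<script>alert("xss")</script>'
print(render_exif_field("Artist", malicious))
# <tr><th>Artist</th><td>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</td></tr>
```

The escaped value shows up as literal text in the page instead of being parsed as markup.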

bobince was talking about a few totally different things; the closest is that someone uploads a file, say fake_image.jpg, whose content is actually an HTML document rather than an image. By content sniffing, he meant that some browsers, when you open http://example.com/fake_image.jpg directly, can figure out that fake_image.jpg is not an image but an HTML file, and will render the HTML in it. Thus, the uploaded HTML file (with an image file name) could be used in an XSS attack. For this attack to work, the victim has to use a browser that does this sort of content sniffing, and has to navigate to the URL of the image itself (not just view a page that embeds it with <img src='fake_image.jpg'>).
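One way to reject the fake_image.jpg case at upload time is to check that the bytes start with a known image signature instead of trusting the file name. The sketch below is an assumption about how such a check might look (detect_image_type and the signature table are illustrative names). It is only a first filter: a crafted file can begin with a valid signature and still carry other content.

```python
# Leading "magic" bytes of a few common image formats.
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def detect_image_type(data: bytes):
    """Return a MIME type if data starts with a known image signature, else None."""
    for magic, mime in IMAGE_SIGNATURES.items():
        if data.startswith(magic):
            return mime
    return None


# An HTML document uploaded under the name fake_image.jpg.
fake_image = b"<html><body><script>alert(1)</script></body></html>"
print(detect_image_type(fake_image))  # None
```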


Sadly, you guessed wrong. These attacks are possible without a browser bug; they exploit a design flaw in many earlier browsers. You'll want to read about MIME content-type sniffing attacks and related attacks. A lot has been written about this:

  • MIME sniffing protection

  • Why should I restrict the content type of files be uploaded to my site?

  • Using file extension and MIME type (as output by file -i -b) combination to determine unsafe files?

  • Is it safe to serve any user uploaded file under only white-listed MIME content types?

  • Does X-Content-Type-Options really prevent content sniffing attacks?

  • How can I be protected from pictures vulnerabilities?
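As a rough sketch of the server-side mitigation those questions discuss, the function below builds response headers for a user-uploaded file. The name upload_response_headers and the allow-list are assumptions for illustration. X-Content-Type-Options: nosniff is a real header, but how fully it protects depends on the browser.

```python
# MIME types this hypothetical site is willing to serve inline.
SAFE_INLINE_TYPES = {"image/jpeg", "image/png", "image/gif"}


def upload_response_headers(mime_type: str) -> dict:
    """Return headers for serving an uploaded file (illustrative helper)."""
    headers = {"X-Content-Type-Options": "nosniff"}  # ask the browser not to sniff
    if mime_type in SAFE_INLINE_TYPES:
        headers["Content-Type"] = mime_type
    else:
        # Anything unexpected is served as an opaque download, never rendered.
        headers["Content-Type"] = "application/octet-stream"
        headers["Content-Disposition"] = "attachment"
    return headers


print(upload_response_headers("image/png"))
print(upload_response_headers("text/html"))
```

Serving uploads from a separate domain is another commonly suggested measure; it is not shown here.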


An inline SVG in HTML can contain event handlers and SVG script nodes. So if I can specify an SVG image for your page to load, and get you to inline it in the page, then I can inject script via that image.
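To make the two vectors mentioned above concrete (event-handler attributes and script nodes), here is a small Python sketch that flags them in an SVG document. find_active_svg_content is a made-up name, and this is a detector for illustration only, not a sanitizer: it ignores other vectors such as javascript: URLs, and xml.etree.ElementTree is not hardened against maliciously constructed XML.

```python
import xml.etree.ElementTree as ET


def find_active_svg_content(svg_text: str) -> list:
    """List script elements and on* attributes found in an SVG string."""
    findings = []
    root = ET.fromstring(svg_text)
    for elem in root.iter():
        # Tags look like "{namespace}name"; keep only the local name.
        tag = elem.tag.rsplit("}", 1)[-1].lower()
        if tag == "script":
            findings.append("script element")
        for name in elem.attrib:
            local = name.rsplit("}", 1)[-1].lower()
            if local.startswith("on"):
                findings.append("{} attribute on <{}>".format(local, tag))
    return findings


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)">'
    "<script>alert(2)</script>"
    '<circle r="5"/>'
    "</svg>"
)
print(find_active_svg_content(svg))
# ['onload attribute on <svg>', 'script element']
```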

The HTML5 spec says:

The svg element from the SVG namespace falls into the embedded content, phrasing content, and flow content categories for the purposes of the content models in this specification.

...

The semantics of SVG elements are defined by the SVG specification and other applicable specifications. [SVG]

Mario Heiderich exploited confusion in Opera about which domain SVG content should run in to create an image that, when loaded cross-domain, attacks various layers and ends up calling his phone.

Wrap-Up

  • SVGs are not just images but mini-applications
  • tags can now deploy Java, PDF and Flash – and call you on Skype
  • In-line SVG creates small XML islands enabling XML attacks on HTML websites
  • SVG and XSLT work too, enabling DoS and other attacks
  • Web-security and XML security, they meet again!
  • And XXE is back – remember 2002's advisories?
  • SVG is not getting enough attention in the security community
  • SVG provides a lot of room for more security research

In earlier slides, Mario discusses XSS specifically, as well as problems with downloading SVG files to run locally, and notes:

  • Allowing SVG for upload == allowing HTML for upload

It is possible to write polyglots: files that are valid in multiple formats at once, such as a file that is both an HTML page and a JPEG image, or both a GIF and a JavaScript program. That second page explains:

the image above is a perfectly valid GIF file just as it is a perfectly valid javascript program (in fact – it’s even a valid Caja program!). An image tag expects its src attribute to point to content which parses correctly as an image, just as a script tag expects its src attribute to point to a javascript file. The tag specifies a context in which content of a particular type is expected. If the only information a browser used to render content was the context created for it by the surrounding tag, things would be simple. But things in the browser world are never simple. When a server sends a file, it also sends that file’s MIME type in a Content-Type header. All is well when the Content-Type the server asserts is consistent with the expected context in which that content gets used. What happens when the server does not send a Content-Type? What happens when a file with one Content-Type is sent when a different type is expected?

...

Browsers perform content-sniffing ostensibly in the interest of usability so even badly configured servers can continue to “work”. The problem here is that a browser gives different types of content different amounts of access. If you can fool the browser into thinking one type of content is actually another you can bypass the restrictions placed on the actual content’s access. For example, an HTML page is allowed to load external images, stylesheets and scripts. In this case the security context these resources execute in is derived from the URL of the page that these resources are embedded in. On the other hand, if the type of content being loaded is Flash or Java applet say, the security context is derived from the URL of the applet object itself. If the browser uses heuristics and gets confused between a Flash object and an image, there are real security implications! It was this type of confusion which was the source of the GIFAR attack.
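The GIFAR idea of one file being read as two formats can be sketched in Python, since JAR files are ZIP archives and ZIP readers locate their index from the end of the file. The example below is an assumption-laden toy: build_gif_zip_polyglot is a made-up name, and the prefix is only a GIF signature plus a few stub bytes, not a complete image as a real attack would use.

```python
import io
import zipfile


def build_gif_zip_polyglot() -> bytes:
    """Return bytes that start like a GIF but also open as a ZIP archive."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("hello.txt", "payload")
    # Stub only: GIF signature plus filler bytes, NOT a valid image.
    fake_gif_prefix = b"GIF89a" + b"\x01\x00\x01\x00\x00\x00\x00;"
    return fake_gif_prefix + archive.getvalue()


data = build_gif_zip_polyglot()

# A prefix-based check sees a GIF...
starts_like_gif = data.startswith(b"GIF89a")

# ...while a ZIP reader, which works from the end of the file, sees an archive.
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = zf.namelist()

print(starts_like_gif, names)
```

Which interpretation wins depends on which consumer reads the file, and that ambiguity is the problem the quoted passage describes.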