How to set fallback encoding to UTF-8 in Firefox?

Setting the fallback encoding to UTF-8 in Firefox has been deliberately blocked; see https://bugzilla.mozilla.org/show_bug.cgi?id=967981#c4.

Two ways around this that I've been looking at are:

1] Apply some trivial patches to the source and build Firefox yourself, adding a Unicode (UTF-8) option to the Preferences | Content | Fonts & Colors | Advanced | "Fallback Text Encoding" drop-down menu.

2] Run a local (Apache) httpd server and set up a name-based virtual host, utfx, for the UTF-8-encoded files in directory /my/utf-8/files. The server can then send a charset=utf-8 HTTP header, which Firefox will honor and display the files as UTF-8. Of course, the actual file encoding has to be UTF-8!
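Before serving a file this way, it is worth confirming that it really is UTF-8. A minimal check with the file utility (assumed to be installed; the sample path is arbitrary):

```shell
# Create a sample file containing non-ASCII characters (path is arbitrary):
printf 'æøå\n' > /tmp/utf8-sample.txt
# -b: brief output, -i: print MIME type and charset
file -bi /tmp/utf8-sample.txt
```

On a typical GNU/Linux system this reports something like text/plain; charset=utf-8 for a valid UTF-8 file.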

a) /etc/httpd/httpd.conf - add:

<VirtualHost *:80>
    # This first-listed virtual host is also the default for *:80
    ServerName localhost
    DocumentRoot "/srv/httpd/htdocs"
</VirtualHost>
<VirtualHost *:80>
    ServerName utfx
    DocumentRoot "/my/utf-8/files"
    <Directory "/my/utf-8/files">
        Options Indexes
        Require all granted
    </Directory>
    # show UTF-8 characters in file names:
    IndexOptions Charset=UTF-8
    # for files with extension html or txt:
    AddCharset UTF-8 txt html
    # for extensionless files:
    <Files *>
        ForceType 'text/plain; charset=UTF-8'
    </Files>
    # but let files with an extension keep their normal type:
    <Files *.*>
        ForceType None
    </Files>
</VirtualHost>

(Re)start the server with apachectl restart or apachectl graceful (you can run apachectl configtest first to validate the changes).

b) /etc/hosts - add the domain name for accessing the utf-8 encoded files:

127.0.0.1   utfx

The Content-Type header sent by the server can be checked with wget -S <URL>:

wget -S http://utfx/test{æø,.txt,.html} 2>&1 >/dev/null | grep Content-Type

for the three file types (testæø, test.txt, test.html).
The output should be:

Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
Content-Type: text/html; charset=utf-8

c) about:config - add New|Boolean:

browser.fixup.domainwhitelist.utfx  "true"

then just enter utfx in the Firefox address bar to get the file listing.


Update: this has been fixed as of Firefox 66:

UTF-8-encoded HTML (and plain text) files loaded from file: URLs are now supported without <meta charset="utf-8"> or the UTF-8 BOM

https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/66#HTML
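A quick way to verify the new behavior is to create a BOM-less UTF-8 file with no in-document encoding declaration and open it via a file: URL (the path below is just an example):

```shell
# Write a BOM-less UTF-8 HTML file without any <meta charset> declaration:
printf '<!doctype html>\n<title>t</title>\n<p>æøå</p>\n' > /tmp/bomless.html
# Confirm the file does NOT start with the UTF-8 BOM (ef bb bf);
# the first three bytes are those of "<!d" instead:
head -c 3 /tmp/bomless.html | od -An -tx1   # -> 3c 21 64
```

In Firefox 66 and later, file:///tmp/bomless.html should render the æøå correctly; older versions fall back to the legacy encoding.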


Historical information from 2016

The reasoning behind this behavior seems to be described in Mozilla bugs 815551 (Autodetect UTF-8 by default) and 1071816 (Support loading BOMless UTF-8 text/plain files from file: URLs).

As far as I understand it, this basically boils down to "one should always specify the encoding, as detection is too unreliable".

  • For non-local content you should leverage the protocol. With HTTP this means providing the correct charset in the Content-Type header.
  • For HTML content you may additionally declare the encoding in the document itself, i.e. <meta charset="utf-8" />
  • And for anything else the only standard way left is to specify a BOM...

Mozilla devs seem to be open to a patch that adds a preference setting, so one day it might be possible to open local BOM-less UTF-8 documents in Firefox.


As I commented on your question, I was struggling with the same goal: correctly displaying partial HTML (the encoding is known, but there is no meta tag declaring it) from Mutt in Firefox via mailcap.

In the end I figured out a command that works, and it may help you too:

  • uconv --add-signature -f %{charset} -t UTF-8 %s | sponge %s && firefox -new-tab %s & sleep 5

I've discovered that when your UTF-8-encoded file contains a BOM, Firefox assumes it's UTF-8. So I've used the uconv command to add the BOM signature. Here %{charset} is the input charset and %s is the filename (mailcap placeholders). The sponge tool (from the moreutils package) rewrites the file in place, and the sleep is just so that Mutt doesn't delete the file before Firefox finishes loading it.
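If uconv isn't available, the BOM-prepending step can be sketched with plain printf as a stand-in for uconv --add-signature (this assumes the file is already UTF-8, so only the signature is missing):

```shell
f=$(mktemp)
printf 'héllo wörld\n' > "$f"
# Prepend the three UTF-8 BOM bytes (EF BB BF, here in octal) to the content:
printf '\357\273\277' | cat - "$f" > "$f.bom" && mv "$f.bom" "$f"
# The file now starts with the BOM, which Firefox takes as proof of UTF-8:
head -c 3 "$f" | od -An -tx1   # -> ef bb bf
rm -f "$f"
```

Unlike uconv, this does no charset conversion, so it only applies when the input is already UTF-8.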

I have not found any other option to set a fallback encoding in Firefox.