Make emacs not remove the BOM from XML files

Emacs will write a BOM or not depending on what coding system it is using. Emacs automatically chooses the coding system it uses when visiting a file.

You can change the coding system to utf-8-with-signature, which will tell Emacs to write the BOM.

To change the coding system of a visited file (this re-reads the file from disk using the coding system you give):

C-x RET r utf-8-with-signature RET
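If you would rather do this from Lisp, for example with M-: in the buffer, the following is a rough sketch of the equivalent calls. Both functions are standard Emacs commands; which one you want depends on whether the file should be re-read or only saved differently.

```elisp
;; Same effect as C-x RET r: re-read the visited file from disk,
;; decoding it as UTF-8 with a BOM. Unsaved edits are discarded.
(revert-buffer-with-coding-system 'utf-8-with-signature)

;; Alternative (same as C-x RET f): keep the buffer contents as they are
;; and only change the coding system used the next time the buffer is saved.
(set-buffer-file-coding-system 'utf-8-with-signature)
```

Evaluate only one of the two forms; the second is the safer choice when the buffer has unsaved changes.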

You can set the coding system that Emacs uses for a particular file by setting a file variable. See the fine manual section 57.3.4 Local Variables in Files to learn how to do that.
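As a sketch of what that can look like: the file variable in question is named coding. Assuming your Emacs is recent enough to ship files-x.el (an assumption, older releases may lack it), you can let Emacs write the entry for you instead of typing it by hand.

```elisp
;; Evaluate with M-: while visiting the XML file.
;; Adds a "coding: utf-8-with-signature" entry to the file's
;; Local Variables list, using the comment syntax of the buffer's
;; current major mode. The buffer still has to be saved afterwards.
(add-file-local-variable 'coding 'utf-8-with-signature)
```

Check where the comment block ends up before saving: for XML it needs to sit in a position where a comment is legal, such as after the root element.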


Follow-up on Richard Hoskins’s answer: if you never want the BOM to be hidden by Emacs, you can disable the *-with-signature codings with this snippet:

(setq auto-coding-regexp-alist
      (delete (rassoc 'utf-16be-with-signature auto-coding-regexp-alist)
              (delete (rassoc 'utf-16le-with-signature auto-coding-regexp-alist)
                      (delete (rassoc 'utf-8-with-signature auto-coding-regexp-alist)
                              auto-coding-regexp-alist))))

The BOM is U+FEFF, the “zero width no-break space”. It doesn’t show up as a box in my Emacs 23.1.1. Instead, the top line of the file is moved slightly down, and a box sometimes appears around the first line. Either way, you can see that the BOM is there, and delete it if necessary.


Emacs "itself" should not mess with the BOM. If it's really doing that, then it would have to be the code implementing the Emacs "mode" you are using to edit your XML files that removes the BOM. Since you don't say which one that is, I can only refer you to the documentation for that mode, or suggest that you open the files in fundamental-mode (or a similar non-destructive mode). Or try M-x find-file-literally if all else fails.
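Before digging into a particular mode, it may help to confirm whether the BOM is actually missing on disk after a save. The following is a minimal sketch for UTF-8 files only; the function name my/file-has-utf8-bom-p is made up for illustration and is not part of Emacs.

```elisp
;; Hypothetical helper (not part of Emacs): look at the raw bytes on disk,
;; bypassing all coding-system detection and decoding.
(defun my/file-has-utf8-bom-p (file)
  "Return non-nil if FILE starts with the UTF-8 BOM bytes EF BB BF."
  (with-temp-buffer
    ;; Work in a unibyte buffer so the bytes are compared as-is.
    (set-buffer-multibyte nil)
    ;; Read only the first three bytes, with no decoding at all.
    (insert-file-contents-literally file nil 0 3)
    (string= (buffer-string) "\xEF\xBB\xBF")))
```

Calling it on the file before and after saving from Emacs, e.g. with M-: (my/file-has-utf8-bom-p "/path/to/file.xml"), shows whether the save really dropped the BOM. The path here is a placeholder.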

Tags:

Xml

Emacs