"Content is not allowed in prolog" when parsing perfectly valid XML on GAE

The encodings declared in your XML and XSD (or DTD) files are different, for example:
XML file header: <?xml version='1.0' encoding='utf-8'?>
XSD file header: <?xml version='1.0' encoding='utf-16'?>

Another possible cause is anything that comes before the XML declaration. For example, you might have something like this in the buffer:

helloworld<?xml version="1.0" encoding="utf-8"?>  

or even a space or special character.

There are some special characters called byte order marks (BOMs) that could be in the buffer. Before passing the buffer to the parser, do this:

String xml = "<?xml ...";
xml = xml.trim().replaceFirst("^([\\W]+)<","<");
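
A more explicit variant of the same idea, as a minimal sketch: it skips only leading whitespace and a leading U+FEFF (the BOM character) before handing the string to the standard JAXP parser. The class and method names (PrologCleaner, stripLeadingJunk, parse) are made up for illustration. Note that String.trim() on its own does not remove U+FEFF, which is why the BOM is handled separately here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrologCleaner {
    // Hypothetical helper: skips leading whitespace and BOM characters (U+FEFF).
    // Anything else in front of the document is left alone so real problems stay visible.
    static String stripLeadingJunk(String xml) {
        int i = 0;
        while (i < xml.length()
                && (xml.charAt(i) == '\uFEFF' || Character.isWhitespace(xml.charAt(i)))) {
            i++;
        }
        return xml.substring(i);
    }

    // Parses the cleaned string with the default JAXP DocumentBuilder.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(stripLeadingJunk(xml))));
    }

    public static void main(String[] args) throws Exception {
        // Sample input (an assumption for this sketch): a BOM followed by a tiny document.
        String raw = "\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"?><root>ok</root>";
        Document doc = parse(raw);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Unlike the regex above, this does not silently discard arbitrary non-word characters, so text such as "helloworld" in front of the declaration would still fail and point you at the real source of the stray content.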

I hit this issue after inspecting the XML file in Notepad++ and saving it, even though the file started with the UTF-8 declaration <?xml version="1.0" encoding="utf-8"?>

It was fixed by saving the file in Notepad++ with Encoding (menu) > Encode in UTF-8 selected (it had been Encode in UTF-8-BOM).
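
If you cannot control how the file is saved, you can check the raw bytes for a UTF-8 BOM (the three bytes EF BB BF) and drop it before decoding. The following is only a sketch; BomCheck, hasUtf8Bom and stripUtf8Bom are hypothetical names, and the sample bytes stand in for whatever you read from the file or request.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BomCheck {
    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // Returns a copy without the BOM, or the original array if there is none.
    static byte[] stripUtf8Bom(byte[] data) {
        return hasUtf8Bom(data) ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        // Sample input (an assumption for this sketch): BOM bytes followed by a small document.
        byte[] body = "<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[body.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(body, 0, withBom, 3, body.length);

        System.out.println("BOM present: " + hasUtf8Bom(withBom));
        String xml = new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8);
        System.out.println(xml);
    }
}
```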


This error message is typically caused by invalid content at the very beginning of the XML document, for example a stray dot “.” in front of the first element.

Any characters before the “<?xml….” will cause the above “org.xml.sax.SAXParseException: Content is not allowed in prolog” error message.

To fix it, delete any stray characters before the “<?xml”.
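
Stray characters such as a BOM are often invisible in an editor, so it can help to print their code points before deleting anything. This is a rough diagnostic sketch; PrologInspector and describePrefix are hypothetical names, not part of any library.

```java
class PrologInspector {
    // Lists the code points that appear before "<?xml" (or before the first '<'
    // if there is no XML declaration) as space-separated U+XXXX values.
    static String describePrefix(String xml) {
        int start = xml.indexOf("<?xml");
        if (start < 0) {
            start = xml.indexOf('<');
        }
        if (start < 0) {
            start = xml.length(); // no markup at all: describe the whole string
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < start; ) {
            int cp = xml.codePointAt(i);
            sb.append(String.format("U+%04X ", cp));
            i += Character.charCount(cp);
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Sample input (an assumption for this sketch): a dot and a BOM before the declaration.
        String raw = ".\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>";
        System.out.println("Before the declaration: " + describePrefix(raw));
    }
}
```

An empty result means nothing precedes the declaration, in which case the cause is more likely an encoding mismatch than stray characters.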

Ref: http://www.mkyong.com/java/sax-error-content-is-not-allowed-in-prolog/