How to stop json.Marshal from escaping < and >?

This doesn't answer the question directly but it could be an answer if you're looking for a way how to deal with json.Marshal escaping < and >...

Another way to solve the problem is to replace those escaped characters in json.RawMessage into just valid UTF-8 characters, after the json.Marshal() call.

It will work as well for any letters other than < and >. (I used to do this to make non-English letters to be human readable in JSON :D)

func _UnescapeUnicodeCharactersInJSON(_jsonRaw json.RawMessage) (json.RawMessage, error) {
    str, err := strconv.Unquote(strings.Replace(strconv.Quote(string(_jsonRaw)), `\\u`, `\u`, -1))
    if err != nil {
        return nil, err
    }
    return []byte(str), nil
}

func main() {
    // Both are valid JSON.
    var jsonRawEscaped json.RawMessage   // json raw with escaped unicode chars
    var jsonRawUnescaped json.RawMessage // json raw with unescaped unicode chars

    // '\u263a' == '☺'
    jsonRawEscaped = []byte(`{"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}`) // "\\u263a"
    jsonRawUnescaped, _ = _UnescapeUnicodeCharactersInJSON(jsonRawEscaped)                        // "☺"

    fmt.Println(string(jsonRawEscaped))   // {"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}
    fmt.Println(string(jsonRawUnescaped)) // {"HelloWorld": "안녕, 세상(世上). ☺"}
}

https://play.golang.org/p/pUsrzrrcDG-

I hope this helps someone.


In Go1.7 the have added a new option to fix this:

encoding/json: add Encoder.DisableHTMLEscaping This provides a way to disable the escaping of <, >, and & in JSON strings.

The relevant function is

func (*Encoder) SetEscapeHTML

That should be applied to a Encoder.

enc := json.NewEncoder(os.Stdout)
enc.SetEscapeHTML(false)

Simple example: https://play.golang.org/p/SJM3KLkYW-


As of Go 1.7, you still cannot do this with json.Marshal(). The source code for json.Marshal shows:

> err := e.marshal(v, encOpts{escapeHTML: true})

The reason json.Marshal always does this is:

String values encode as JSON strings coerced to valid UTF-8, replacing invalid bytes with the Unicode replacement rune. The angle brackets "<" and ">" are escaped to "\u003c" and "\u003e" to keep some browsers from misinterpreting JSON output as HTML. Ampersand "&" is also escaped to "\u0026" for the same reason.

This means you cannot even do it by writing a custom func (t *Track) MarshalJSON(), you have to use something that does not satisfy the json.Marshaler interface.

So, the workaround, is to write your own function:

func (t *Track) JSON() ([]byte, error) {
    buffer := &bytes.Buffer{}
    encoder := json.NewEncoder(buffer)
    encoder.SetEscapeHTML(false)
    err := encoder.Encode(t)
    return buffer.Bytes(), err
}

https://play.golang.org/p/FAH-XS-QMC

If you want a generic solution for any struct, you could do:

func JSONMarshal(t interface{}) ([]byte, error) {
    buffer := &bytes.Buffer{}
    encoder := json.NewEncoder(buffer)
    encoder.SetEscapeHTML(false)
    err := encoder.Encode(t)
    return buffer.Bytes(), err
}

https://play.golang.org/p/bdqv3TUGr3

Tags:

Go