Why are Craigslist posts full of question marks?

Often that happens when you cut and paste from a program that uses its own code page or character set. The local program displays those characters as apostrophes, but their bytes are not valid in the encoding the page declares, so the renderer cannot interpret them and falls back to the black diamond with a white question mark (�).

More on Unicode: http://www.joelonsoftware.com/articles/Unicode.html


Here's an example of that from Craigslist.

That page is encoded in ISO-8859-1. However, the web server announces that the page is in UTF-8 by sending the following header:

Content-Type: text/html; charset=utf-8
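
As a rough illustration of what a client does with that header, here is a minimal Python sketch that pulls out the declared charset. The header value is the one quoted above. Parsing it with the standard library's `email.message.Message` is only a convenient choice for this example, not a claim about how any particular browser does it.

```python
from email.message import Message

# The header value quoted above, as sent by the server.
header_value = "text/html; charset=utf-8"

msg = Message()
msg["Content-Type"] = header_value

declared_type = msg.get_content_type()        # media type without parameters
declared_charset = msg.get_content_charset()  # charset parameter, lowercased

print(declared_type, declared_charset)
```

The client then decodes the body using that declared charset, which is where things go wrong if the bytes were actually written in a different encoding.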

This is a bug in Craigslist. It is a fair assumption that the Craigslist programmers do not know the absolute minimum that working programmers should know about Unicode.

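Those curly apostrophes, in ISO-8859-1 (strictly speaking in Windows-1252, which browsers use when a page says ISO-8859-1), are encoded as single bytes such as 0x92, which are not valid on their own in UTF-8. Thus they appear as <?> in Firefox and squares in IE.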
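
The mismatch can be reproduced in a few lines of Python. This is only a sketch: the sample string is made up, and it uses Python's `cp1252` codec because Python's `latin-1` codec maps byte 0x92 to a control character rather than to the curly apostrophe that browsers show.

```python
# A made-up sample containing a right curly apostrophe (U+2019),
# encoded the way a Windows-1252 page would store it.
raw = "it\u2019s".encode("cp1252")  # the apostrophe becomes the single byte 0x92

# Decode using the encoding the server announced. 0x92 is a lone
# continuation byte in UTF-8, so strict decoding fails.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes U+FFFD (the replacement character)
# for the bad byte, which is what shows up as a question mark.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes are really in restores the apostrophe.
as_cp1252 = raw.decode("cp1252")

print(strict_ok, repr(as_utf8), repr(as_cp1252))
```

The last step is the code equivalent of switching the browser's character encoding by hand: the bytes are untouched, only the interpretation changes.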

To fix the problem when you are viewing the page, go up to the View menu and choose Character Encoding > Western (ISO-8859-1) to tell the browser what encoding the page is really in.