Why is the use of TAB (%09) characters in the middle of a 'javascript:' URL valid?

When an element is clicked, the link is followed and its URL is resolved. This causes the string to be processed by the basic URL parser.

In your example, the parameter is processed in the path state, during which TAB (U+0009) characters, along with carriage return and line feed characters, are ignored and therefore do not form part of the resolved URL.

Why does anchor tag remove tab character in URL specified inside href attribute?

According to the specification of the basic URL parser, ASCII tab and newline characters (U+0009, U+000A, U+000D) are removed from the URL before it is parsed, so in effect they are simply ignored.
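
To see this behaviour outside of an anchor tag, here is a minimal sketch using the WHATWG `URL` class that browsers and Node.js expose. The helper `stripTabsAndNewlines` is a hypothetical name of mine that only mimics the removal step; it is not the parser itself, and the `alert(1)` payload is just a placeholder.

```javascript
// Hypothetical helper mimicking the "remove ASCII tab or newline" step.
// It is an illustration only, not the real basic URL parser.
function stripTabsAndNewlines(input) {
  return input.replace(/[\t\n\r]/g, "");
}

// A 'javascript:' URL with a TAB in the middle of the scheme.
const raw = "java\tscript:alert(1)";

// What the input looks like once tabs and newlines are dropped.
const cleaned = stripTabsAndNewlines(raw);
console.log(cleaned); // javascript:alert(1)

// The URL class applies the same removal itself, so the raw string parses fine.
const parsed = new URL(raw);
console.log(parsed.protocol); // javascript:
console.log(parsed.href); // javascript:alert(1)
```

The same result should be visible in a browser by reading the `href` property of an `<a>` element whose attribute contains the tab.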

I found an old discussion that you might find interesting for understanding the possible historical reasons behind this choice.


That's a 19-year-old bug in Mozilla. The problem was that one website was not working as expected, because Mozilla didn't strip tab characters inside the URL of a link. The page worked as expected in Internet Explorer, which apparently ignored the tabs. Tabs are often used for indentation in HTML files, so you can sometimes expect a few tabs after a new line. Somebody cited an IETF standard suggesting that "whitespace should be ignored when extracting the URI". However, others were not fully convinced that removing all whitespace characters would be a good idea, because you might run across URIs with unencoded spaces (for example: https://www.example.com/path with spaces/), even though that would be wrong, at least according to current standards. Therefore they decided to just add tabs to the list of removed characters (carriage-return and line-feed characters were already being removed). Note, though, that spaces are allowed, and ignored, when they are at the beginning or at the end of the URI (example: <a href=" http://www.example.com "></a>).

So I suppose the historical reason for this choice is that they wanted to make sure the following code would work:

<!-- URL with new lines and TABS for indentation -->
<a href="https://www.example.com/?
	">Click on this example link</a>

<!-- URL with unencoded spaces -->
<a href="https://www.example.com/path with spaces/foo">Click here</a>

However, they did not check exactly where the spaces or tabs were in the URL; they just decided to keep the spaces and remove the tabs. As a result, the first example doesn't work if you use spaces for indentation, and tab characters can be included anywhere in the URL without affecting anything (so even java<tab>script will be OK).
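
A quick way to check the difference between tabs and spaces is the `URL` class again. This is only a sketch: the query parameter `q=1` is an assumption of mine added so the indented line has some content, and the other URLs are the ones used above.

```javascript
// Newline + TAB used for indentation: both are removed before parsing.
const tabIndented = new URL("https://www.example.com/?\n\tq=1");
console.log(tabIndented.href); // https://www.example.com/?q=1

// Newline + spaces used for indentation: the newline is removed,
// but the spaces stay in the query and get percent-encoded.
const spaceIndented = new URL("https://www.example.com/?\n    q=1");
console.log(spaceIndented.href); // https://www.example.com/?%20%20%20%20q=1

// Unencoded spaces inside the path are kept (as %20), not dropped.
const spacesInPath = new URL("https://www.example.com/path with spaces/foo");
console.log(spacesInPath.href); // https://www.example.com/path%20with%20spaces/foo

// Leading and trailing spaces are stripped.
const padded = new URL(" http://www.example.com ");
console.log(padded.href); // http://www.example.com/
```

So the tab-indented link resolves to the intended URL, while the space-indented one ends up with four encoded spaces in front of the parameter name.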