Sanitizing HTML input

You will have to decide between good and lightweight. The recommended choice is 'HTMLPurifier', because it provide no-fuss secure defaults. As faster alternative it is often advised to use 'htmLawed'.

See also this quite objective overview from the HTMLPurifier author: http://htmlpurifier.org/comparison


I really like HTML Purifier, which allows you to specify which tags and attirbutes are allowed in your HTML code -- and generates valid HTML.


Use BB codes (or like here on SO), otherwise chances are very slim. Example function...

function parse($string){

    $pattern = array(
    "/\[url\](.*?)\[\/url\]/",
    "/\[img\](.*?)\[\/img\]/",
    "/\[img\=(.*?)\](.*?)\[\/img\]/",
    "/\[url\=(.*?)\](.*?)\[\/url\]/",
    "/\[red\](.*?)\[\/red\]/",
    "/\[b\](.*?)\[\/b\]/",
    "/\[h(.*?)\](.*?)\[\/h(.*?)\]/",
    "/\[p\](.*?)\[\/p\]/",    
    "/\[php\](.*?)\[\/php\]/is"
    );

    $replacement = array(
    '<a href="\\1">\\1</a>',
    '<img alt="" src="\\1"/>',
    '<img alt="" class="\\1" src="\\2"/>',
    '<a rel="nofollow" target="_blank" href="\\1">\\2</a>',
    '<span style="color:#ff0000;">\\1</span>',
    '<span style="font-weight:bold;">\\1</span>',
    '<h\\1>\\2</h\\3>',
    '<p>\\1</p>',
    '<pre><code class="php">\\1</code></pre>'
    );

    $string = preg_replace($pattern, $replacement, $string);

    $string = nl2br($string);

    return $string;

}

...

echo parse("[h2]Lorem Ipsum[/h2][p]Dolor sit amet[/p]");

Result...

<h2>Lorem Ipsum</h2><p>Dolor sit amet</p>

enter image description here

Or just use HTML Purifier :)