Fastest way to retrieve a <title> in PHP

or making this simple function slightly more bullet proof:

function page_title($url) {

    $page = file_get_contents($url);

    if (!$page) return null;

    $matches = array();

    if (preg_match('/<title>(.*?)<\/title>/', $page, $matches)) {
        return $matches[1];
    } else {
        return null;
    }
}


echo page_title('http://google.com');

You can get it without reg expressions:

$title = '';
$dom = new DOMDocument();

if($dom->loadHTMLFile($urlpage)) {
    $list = $dom->getElementsByTagName("title");
    if ($list->length > 0) {
        $title = $list->item(0)->textContent;
    }
}

<?php
    function page_title($url) {
        $fp = file_get_contents($url);
        if (!$fp) 
            return null;

        $res = preg_match("/<title>(.*)<\/title>/siU", $fp, $title_matches);
        if (!$res) 
            return null; 

        // Clean up title: remove EOL's and excessive whitespace.
        $title = preg_replace('/\s+/', ' ', $title_matches[1]);
        $title = trim($title);
        return $title;
    }
?>

Gave 'er a whirl on the following input:

print page_title("http://www.google.com/");

Outputted: Google

Hopefully general enough for your usage. If you need something more powerful, it might not hurt to invest a bit of time into researching HTML parsers.

EDIT: Added a bit of error checking. Kind of rushed the first version out, sorry.

Tags:

Html

Php

Parsing