How to detect whether an HTML page contains a video?

There are many ways to embed video in an HTML page: as Flash video or as instances of platform-specific players through <object> and <embed> tags (but not every one of those tags is a video! The same holds for .swf, which is just the file extension of Flash files, video or not), or with the new HTML5 <video> tag. These are not impossible to detect, but catching all possible player types, formats and embed codes is a lot of work and will still produce many false positives and false negatives.
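To give an idea of what such a check might look like, here is a minimal sketch using only Python's standard library. The class name, the function name and the "certain" / "likely" / "maybe" labels are made up for illustration, and the MIME-type rules are just one possible heuristic, not a complete list of embed codes.

```python
from html.parser import HTMLParser


class VideoCandidateFinder(HTMLParser):
    """Collects tags in static HTML that may embed a video (heuristic only)."""

    def __init__(self):
        super().__init__()
        self.candidates = []  # list of (tag, confidence) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            # An HTML5 <video> element is the only unambiguous case.
            self.candidates.append((tag, "certain"))
        elif tag in ("object", "embed"):
            mime = (attrs.get("type") or "").lower()
            if mime.startswith("video/"):
                # Platform-specific player with an explicit video MIME type.
                self.candidates.append((tag, "likely"))
            elif mime == "application/x-shockwave-flash":
                # Flash can be a video player, but also a game or a banner.
                self.candidates.append((tag, "maybe"))


def find_video_candidates(html):
    """Return the (tag, confidence) pairs found in an HTML string."""
    finder = VideoCandidateFinder()
    finder.feed(html)
    finder.close()
    return finder.candidates
```

Anything reported as "maybe" would still need a closer look, which is exactly where the false positives come from.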

Then there are JavaScript libraries that initialize players only after the containing page has loaded - those are almost impossible to detect from the page source alone.

Getting video into a web page reliably is still a very complex issue, and detecting it afterwards is even more complex. Depending on what you are trying to achieve, I would consider dropping the idea.


For your case (the CNN site) you can parse the Open Graph metadata for video information.

Meta tags such as og:video:type and og:image will help you.

Video hosting services usually support structured metadata such as Open Graph or schema.org.

So you can parse that markup.
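A rough sketch of the Open Graph approach, again with only the Python standard library. The names OpenGraphParser, extract_open_graph and has_open_graph_video are hypothetical, and the sketch assumes you already have the page's HTML as a string; whether a given site actually emits og:video tags is something you would have to check.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta> tags whose property (or name) starts with "og:"."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Open Graph uses the "property" attribute; some pages use "name" instead.
        prop = attrs.get("property") or attrs.get("name") or ""
        content = attrs.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value if a property is repeated.
            self.properties.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


def has_open_graph_video(html):
    """True if the page declares og:video or any og:video:* property."""
    props = extract_open_graph(html)
    return any(k == "og:video" or k.startswith("og:video:") for k in props)
```

For example, calling has_open_graph_video() on a page whose head contains a meta tag with property="og:video:type" returns True, while a page that only has og:image returns False.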

Tags:

Html

Video