Streaming Media From Inside HTML Pages, by Example

How do streaming media players, running inside HTML pages and served by HTML servers, establish streaming (RTSP, etc.) connections with streaming media servers (serving RTSP requests)?

Common Applications

RTSP currently seems to be used more with applications/device interfaces that directly live stream (e.g. IP camera) or re-stream (like an engine) than it is for streaming saved media files from a physical location via an HTTP web playback interface with an embedded player.

It seems that RTSP is a stateful control protocol: the control connection normally runs over TCP, while the media itself is more often delivered over UDP than TCP. It is typically implemented by a server device (like an IP camera) that is connected to a TCP/IP network and feeds out streams via UDP, etc. You then connect to these feeds (the server) as a client on the same network and issue RTSP requests to control them.


Protocol directives

While similar in some ways to HTTP, RTSP defines control sequences useful in controlling multimedia playback. While HTTP is stateless, RTSP has state; an identifier is used when needed to track concurrent sessions. Like HTTP, RTSP uses TCP to maintain an end-to-end connection and, while most RTSP control messages are sent by the client to the server, some commands travel in the other direction (i.e. from server to client).

Presented here are the basic RTSP requests. Some typical HTTP requests, like the OPTIONS request, are also available. The default transport layer port number is 554 for both TCP and UDP, the latter being rarely used for the control requests.

source
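
To make the HTTP resemblance concrete, here is a minimal sketch of parsing an RTSP response head. The sample reply is hand-written for illustration (the session ID, ports and timeout are made-up values), and `parse_rtsp_response` is a hypothetical helper, not part of any library.

```python
def parse_rtsp_response(raw):
    """Parse the head of an RTSP response into (status_code, reason, headers)."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    if not version.startswith("RTSP/"):
        raise ValueError("not an RTSP response")
    headers = {}
    for line in header_lines:
        # Header syntax is "Name: value", just as in HTTP.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers


# A hand-written sample reply to a SETUP request (all values are made up).
sample = (
    b"RTSP/1.0 200 OK\r\n"
    b"CSeq: 2\r\n"
    b"Session: 12345678;timeout=60\r\n"
    b"Transport: RTP/AVP;unicast;client_port=5004-5005;server_port=6256-6257\r\n"
    b"\r\n"
)
code, reason, headers = parse_rtsp_response(sample)
# The session identifier is what lets the server track this client's state.
session_id = headers["session"].split(";")[0]
print(code, reason, session_id)
```

The status line and header layout mirror HTTP; the `Session` header is the part that has no HTTP counterpart.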


Stateless

A stateless protocol does not require the server to retain session information or status about each communications partner for the duration of multiple requests. In contrast, a protocol which requires keeping of the internal state on the server is known as a stateful protocol.

A disadvantage of statelessness is that it may be necessary to include additional information in every request, and this extra information will need to be interpreted by the server.

source
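
The difference can be sketched in a few lines. This is a toy model, not real server code: `stateless_handler` and `StatefulServer` are hypothetical names, and the request fields are assumptions chosen for illustration.

```python
import itertools


def stateless_handler(request):
    # Everything needed to answer is carried by the request itself,
    # so the server remembers nothing between calls.
    last = request["offset"] + request["length"] - 1
    return f"bytes {request['offset']}-{last} of {request['path']}"


class StatefulServer:
    """Keeps per-client session state between requests."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.sessions = {}

    def setup(self, path):
        session_id = str(next(self._ids))
        self.sessions[session_id] = {"path": path, "state": "ready"}
        return session_id

    def play(self, session_id):
        # The request only names the session; the path is looked up server-side.
        session = self.sessions[session_id]
        session["state"] = "playing"
        return f"playing {session['path']}"

    def teardown(self, session_id):
        del self.sessions[session_id]
```

In the stateless case every request repeats the path and offset; in the stateful case a short session identifier is enough, at the cost of the server having to store and clean up sessions.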


Logical Flow

The way I understand the flow of streaming media in this form is:

  • the server where the media content resides will encapsulate, compress, encode, etc. the video/audio data content in the proper formats and segments for stream delivery
  • the web server that listens for connections to access the streaming media will deliver all resources needed to stream the media
  • the client requests and downloads applicable resources and files, and then assembles them in a continuous fashion for playback via the URL pointer as configured and other parameters. The playback software at the client level assembles the packets transmitted in sequence to allow proper playback of the content.

Please see the Streaming Technologies section below for a general comparison of HTTP versus RTSP.


Furthermore

In the below 10 Reasons Why You Should Never Host Your Own Videos section I've quoted the parts that get to the point to help answer your question in "general" without being too specific.

Essentially it says that the website that has the embedded media player controls will:

  • (1) detect the client web browser settings upon "connection and request" from the client and
  • (2) this will set the codec and any other client side detection settings to applicable parameter values, and then
  • (3) it'll stream the video directly from the streaming server you host the video and audio files on based on further code in your embedded media player configurations pointing to the URL of the media file on the hosted server.

Streaming Technologies

The client browser must receive the data from the server and pass it to the streaming application for processing. The streaming application converts the data into pictures and sounds. An important factor in the success of this process is the ability of the client to receive data faster than the application can display the information. Excess data is stored in a buffer – an area of memory reserved for data storage within the application. If the data is delayed in transfer between the two systems, the buffer empties and the presentation of the material will not be smooth.
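
The buffering behaviour described above can be illustrated with a toy simulation. The tick-based model, the function name and the numbers are all assumptions made for the sake of the example; real players measure time and bytes rather than abstract units.

```python
def simulate_playback(arrivals, playback_rate, start_threshold):
    """Count stalled ticks for a simple playback buffer.

    arrivals[i] is the amount of data received during tick i. Playback starts
    once the buffer holds start_threshold units and then consumes
    playback_rate units per tick. A tick in which the buffer cannot supply
    that amount counts as a stall.
    """
    buffered = 0
    playing = False
    stalls = 0
    for received in arrivals:
        buffered += received
        if not playing:
            if buffered < start_threshold:
                continue  # still pre-buffering, not a stall
            playing = True
        if buffered >= playback_rate:
            buffered -= playback_rate
        else:
            stalls += 1  # buffer ran dry: playback is not smooth
    return stalls


steady = simulate_playback([3, 3, 3, 3, 3, 3], playback_rate=2, start_threshold=4)
delayed = simulate_playback([4, 0, 0, 0, 4, 4], playback_rate=2, start_threshold=4)
print(steady, delayed)
```

With a steady arrival rate above the playback rate the buffer only grows; with a gap in delivery it drains and playback stalls until data arrives again.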

HTTP Protocol

HTTP is the predominant way in which documents are linked on the Internet. The client makes a connection to the server containing the file to be streamed, the file is retrieved and the connection closed. The HTTP server communicates to the browser the type of file to be transferred.

Benefits Using HTTP

When streaming a file using HTTP, a special streaming server is not required. As long as your browser understands MIME types it can receive a streaming file from an HTTP server. One of the distinct advantages of streaming files using HTTP is that it can pass through firewalls and utilize proxy servers.

Some Disadvantages

HTTP streaming uses TCP/IP (Transmission Control Protocol and Internet Protocol) to ensure reliable delivery of the files. This process checks for missing packets and asks for them to be retransmitted. This becomes problematic in the streaming scenario when you want the data to be disregarded if it is lost in delivery, so dynamic files keep playing. HTTP cannot detect modem speed so server administrators must purposefully produce files at different compression rates to serve users with different types of connections. Streaming files from HTTP servers is not recommended for high-demand situations.

RTSP Protocol

RTSP is the standard protocol used by most of the streaming server vendors. RTSP servers use the UDP (User Datagram Protocol) to transfer media files. UDP does not continually check that files have arrived at their destination. This is an advantage for streaming applications because it allows for file transfers to be interrupted as long as the delay is not too long. The result of this method is that there is data loss at times, but files continue to play if the delay is small.

source
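
As a rough illustration of why loss is tolerable, the sketch below plays whatever packets arrived, in sequence order, and simply skips the gaps instead of waiting for a retransmission. It is a simplification under stated assumptions: `playout_order` is a hypothetical helper, it ignores sequence-number wraparound, and it has no notion of playout deadlines, both of which a real receiver has to handle.

```python
def playout_order(received_seqs):
    """Return (sequence numbers to play, in order; sequence numbers lost)."""
    if not received_seqs:
        return [], []
    present = set(received_seqs)
    first, last = min(present), max(present)
    play = [seq for seq in range(first, last + 1) if seq in present]
    missing = [seq for seq in range(first, last + 1) if seq not in present]
    return play, missing


# Packets 1..6 were sent; they arrived out of order and packet 4 never arrived.
play, missing = playout_order([3, 1, 2, 6, 5])
print(play, missing)
```

Playback continues across the gap left by the lost packet, which is the trade-off the quoted passage describes: some data loss in exchange for uninterrupted play.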


10 Reasons Why You Should Never Host Your Own Videos

We’re Talking About Embedding vs. Self-Hosted Video

First, you upload your video file to a third-party video hosting service like YouTube, Vimeo, or Wistia.

Then, you copy a small bit of code that they furnish to you, and paste it into your post or page on your own WordPress site. The video will appear on your site, in the location where you pasted the embed code, but the video itself is being streamed from the video host’s servers, as opposed to your own web server, where your WordPress site is hosted.

4. No Single File Format Standard for Web Video

The current HTML5 draft specification does not specify which video formats browsers should support. As a result, the major web browsers have diverged, each one supporting a different format. Internet Explorer and Safari will play H.264 (MP4) videos, but not WebM or Ogg. Firefox will play Ogg or WebM videos, but not H.264. Thankfully, Chrome will play all the major video formats, but if you want to ensure your video will play back on all the major web browsers, you’ll have to convert your video into multiple formats: .mp4, .ogv, and .webm

5. Hope you like converting videos. A lot.

Most of your audience will likely watch your videos from their desktop or laptop with the benefit of a high-speed Internet connection. For those folks, you’ll want to deliver a large, HD-quality file so they can watch it full-screen if they so choose. Generally, this means a 1080p or 720p file at a high streaming bitrate (5000 – 8000 kbps).

But you’ll also want to encode a smaller, lower-resolution version for delivery to mobile devices like phones and tablets, as well as delivery to viewers with slower Internet connections.

6. Video Players

A video player is a small piece of web software you install on your site that will automatically detect which device is requesting your video, along with its connection speed, and then deliver the appropriate version to that person.
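
A minimal sketch of that selection step might look like this. The rendition ladder is hypothetical (only the 5000 to 8000 kbps range for HD comes from the quoted text), and the `headroom` safety factor is an assumption, not something any particular player is known to use.

```python
# Hypothetical set of encoded versions, highest bitrate first.
RENDITIONS = [
    {"name": "1080p", "kbps": 8000},
    {"name": "720p", "kbps": 5000},
    {"name": "480p", "kbps": 2500},
    {"name": "240p", "kbps": 700},
]


def pick_rendition(measured_kbps, renditions=RENDITIONS, headroom=0.8):
    """Pick the highest-bitrate version that fits within the measured speed."""
    # Leave some headroom so small dips in throughput do not cause stalls.
    budget = measured_kbps * headroom
    affordable = [r for r in renditions if r["kbps"] <= budget]
    if affordable:
        return max(affordable, key=lambda r: r["kbps"])
    # Nothing fits: fall back to the smallest version rather than fail.
    return min(renditions, key=lambda r: r["kbps"])


print(pick_rendition(12000)["name"], pick_rendition(4000)["name"], pick_rendition(500)["name"])
```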

7. Cumbersome Code [or Shortcodes]

Whether you use a third-party plugin or WordPress’ built-in video capabilities, you’ll need to create a bit of code to tell the video player which formats you’ve created, as well as their location on the server. It looks something like this…

<video poster="movie.jpg" controls>
<source src="movie.webm" type='video/webm; codecs="vp8.0, vorbis"'/>
<source src="movie.ogg" type='video/ogg; codecs="theora, vorbis"'/>
<source src="movie.mp4" type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"'/>
<p>This is fallback content</p>
</video>

So what’s the best solution for adding video to your site?

Simply use a third-party video hosting service, then just embed your video into your WordPress post or page.

Step One: Upload your video to one of the popular, well-established video hosting services like Vimeo PRO.

Step Two: Once your video has been uploaded and is ready for viewing, copy the URL to your video. Return to your WordPress site and paste the URL into your post or page where you want the video to appear.


When folks view your page, the video will appear in the location where you pasted the URL. But the video file itself will be streamed from the video host’s servers, as opposed to your own server, where your WordPress site is hosted.

The embedded video player will automatically detect the user’s device, browser, and Internet connection speed, and then serve the appropriate version of the video file to them. Nothing to install on your site. No plugins to keep up to date. No tricky code.

source


I will be treating below principally your question of what goes on when a video is displayed in the browser. The subject is vast, so I will only be touching upon the relevant items.

HTML5 has introduced the <VIDEO> tag which solved the problem of integrating the displayed video into the browser while using JavaScript and CSS. The previous <OBJECT> tag required external software and was badly integrated with the page. The new tag in effect required the browser to also become a video player, although no standards were imposed. The result was total fragmentation of the standards, to which the only solution is that the video server will make available several video formats and that all these alternative sources be specified in the <VIDEO> tag, from which the browser will pick the one it supports.

An example of a tag with multiple sources :

<video width="320" height="240" controls poster="image.jpg">
   <source src="movie.mpd" type="application/dash+xml">
   <source src="movie.webm" type="video/webm">
   Your browser does not support the video tag.
</video>
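
The way the browser chooses among the sources can be modelled roughly as "first playable entry wins". The sketch below is a simplified model, not the browser's actual algorithm: the MIME types attached to each source and the sets of supported types are assumptions for illustration, and real browsers also take codec parameters into account.

```python
def pick_source(sources, supported_types):
    """Return the src of the first source whose type the browser can play."""
    for source in sources:  # sources are tried in document order
        if source["type"] in supported_types:
            return source["src"]
    return None  # nothing playable: the fallback text would be shown


sources = [
    {"src": "movie.mpd", "type": "application/dash+xml"},
    {"src": "movie.webm", "type": "video/webm"},
]

# A hypothetical browser that handles WebM and MP4 but not DASH manifests.
print(pick_source(sources, {"video/webm", "video/mp4"}))
```

Because order matters, the preferred format is normally listed first and the most widely supported one last.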

The <VIDEO> tag itself is protocol-agnostic, so it can use any protocol supported by the browser including RTSP. Support for the MPEG-DASH protocol (Dynamic Adaptive Streaming over HTTP) has lately become very comprehensive, so it will play on most devices and browsers natively, or using HTML5, which means no extra plugins are required. See this Device and Browser Compatibility chart. See also this Mozilla article for preparing your server for serving MPEG-DASH. DASH works via HTTP, so this will work as long as your HTTP server supports byte range requests and it's set up to serve .mpd files with mimetype="application/dash+xml".
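
The two server-side requirements can be sketched as follows: mapping the .mpd extension to the DASH MIME type, and interpreting a byte range request. This is an illustration only; `parse_byte_range` is a hypothetical helper that handles just the single-range forms, and a real HTTP server would do this internally.

```python
import mimetypes

# Make sure .mpd manifests are served as DASH, as the text requires.
mimetypes.add_type("application/dash+xml", ".mpd")


def parse_byte_range(header, file_size):
    """Parse a single-range header such as 'bytes=0-499' into (start, end), inclusive."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    start_text, _, end_text = spec.strip().partition("-")
    if start_text == "":
        # Suffix form 'bytes=-N' asks for the last N bytes of the file.
        length = int(end_text)
        start, end = max(file_size - length, 0), file_size - 1
    else:
        start = int(start_text)
        # An open-ended range 'bytes=N-' runs to the end of the file.
        end = int(end_text) if end_text else file_size - 1
        end = min(end, file_size - 1)
    if start > end:
        raise ValueError("unsatisfiable range")
    return start, end


print(mimetypes.guess_type("movie.mpd")[0])
print(parse_byte_range("bytes=0-499", 1000))
```

A DASH player uses such ranges to fetch individual segments out of a larger media file, which is why range support matters on the server.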

The normal interaction between client and server looks similar to the following. For HTML5 VIDEO, the browser is also the player, although it may open a new connection for playing.

image

The initial connection supplies the metadata that the client uses to display the video. If the RTSP protocol was used to get that metadata, then an RTP connection is later created for transferring the video+audio data. The RTCP protocol runs alongside RTP and carries reception statistics and synchronization information, while playback commands continue to travel over RTSP.

RTP, RTCP, and RTSP all operate on different ports. Usually when RTP is on port N, RTCP is on port N+1. An RTP session may contain multiple streams to be combined at the receiver's end; for example, audio and video may be on separate channels.
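
For a concrete view of what arrives on the RTP port, here is a sketch that decodes the 12-byte fixed RTP header as laid out in RFC 3550, together with the N / N+1 port convention mentioned above. The function names and the sample packet are made up for illustration; header extensions and CSRC lists are not handled.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header into a dict."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,           # 2 for current RTP
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,   # identifies the codec/stream format
        "sequence": sequence,            # used to detect loss and reordering
        "timestamp": timestamp,          # used to schedule playout
        "ssrc": ssrc,                    # identifies the stream source
    }


def rtcp_port_for(rtp_port):
    # Usual convention: RTCP uses the port directly above the RTP port.
    return rtp_port + 1


# A made-up packet: version 2, payload type 96, sequence 7.
packet = struct.pack("!BBHII", 0x80, 96, 7, 90000, 0x1234) + b"payload"
header = parse_rtp_header(packet)
print(header["version"], header["payload_type"], header["sequence"], rtcp_port_for(5004))
```

The SSRC and payload type are what let a receiver keep separate audio and video streams apart before combining them.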

So that nobody gets locked out of your content, you should make available both royalty-free codecs, WebM or Theora, and H.264 video, and both Vorbis and MP3 audio. (Easier said than done.)

This is what happens in detail for RTSP :

  1. The client establishes a TCP connection to the server, typically on TCP port 554, the well-known port for RTSP.

  2. The client will then commence issuing a series of RTSP requests that have a similar format to HTTP, each of which is acknowledged by the server. Within these requests, the client will describe to the server details of the session requirements, such as the version of RTSP it supports, the transport to be used for the data flow, and any associated UDP or TCP port information. This information is passed using the DESCRIBE and SETUP requests, and the server's SETUP response adds a Session ID that the client, and any intermediate proxy devices, can use to identify the stream in further exchanges.

  3. Once the negotiation of transport parameters has been completed, the client will issue a PLAY command to instruct the server to commence delivery of the RTP data stream.

  4. Once the client decides to close the stream, a TEARDOWN command is issued along with the Session ID instructing the server to cease the RTP delivery associated with that ID.
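
The four steps above can be sketched as the text a client would send over the TCP connection. This only builds the request strings and does not talk to a server. Several things here are assumptions for illustration: the URL, the `/track1` control path (a real client takes it from the DESCRIBE response), the client port, and the session ID, which in reality is returned by the server in its SETUP response rather than chosen by the client.

```python
def rtsp_session_requests(url, client_rtp_port, session_id):
    """Return the DESCRIBE, SETUP, PLAY and TEARDOWN requests as text."""
    transport = (
        f"RTP/AVP;unicast;client_port={client_rtp_port}-{client_rtp_port + 1}"
    )
    steps = [
        ("DESCRIBE", url, {"Accept": "application/sdp"}),
        # Hypothetical track URL; the real one comes from the stream description.
        ("SETUP", f"{url}/track1", {"Transport": transport}),
        ("PLAY", url, {"Session": session_id, "Range": "npt=0.000-"}),
        ("TEARDOWN", url, {"Session": session_id}),
    ]
    requests = []
    for cseq, (method, target, headers) in enumerate(steps, start=1):
        # CSeq increases with every request so replies can be matched up.
        lines = [f"{method} {target} RTSP/1.0", f"CSeq: {cseq}"]
        lines += [f"{name}: {value}" for name, value in headers.items()]
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return requests


requests = rtsp_session_requests("rtsp://media.example/movie", 5004, "12345678")
for request in requests:
    print(request)
```

Note how the Session header only appears from PLAY onwards: before SETUP has been answered there is no session for the client to refer to.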

Further reading :

  • Basics of streaming protocols
  • Understanding Application Layer Protocols - Real Time Streaming Protocol (RTSP)
  • Introduction to HTML5 Video (Opera)
  • HTML5 Audio and Video: What you Must Know
  • RTSP Analysis Wireshark