Extract timestamp of specific frames in video

Import the video with Import[] (see "How can I import a mp4 file?").

Often, MP4 files can be imported as AVI. Another way is to wrap the MP4 file with QuickTime without transcoding, and import the MOV file.

Get attributes of the video file with:

{bitDepth, colorSpace, duration, frameCount, frameRate, imageSize, 
  videoEncoding} = 
 Import["file.mp4", {"AVI", {"BitDepth", "ColorSpace", "Duration", 
    "FrameCount", "FrameRate", "ImageSize", "VideoEncoding"}}]

(*{8,"RGB",421.467,12646,30.0047,{1280,720},"H.264"}*)

How can I extract all timestamps for the video frames?

To find the timestamp for a frame that matches a slide, subtract 1 from the frame number and divide by the frame rate, which gives the time in seconds:

timeStamp[frame_] := (frame - 1)/frameRate
timeStamp[1000]

(*33.2947*)
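To get the timestamps of all frames at once, the same formula can be applied to the whole range of frame numbers. This is only a minimal sketch: allTimeStamps is a name I made up, and it assumes a constant frame rate, which may not hold for every video.

```mathematica
(* allTimeStamps is a hypothetical helper, not a built-in function *)
(* assumes a constant frame rate; frame 1 is taken to be at t = 0 *)
allTimeStamps[count_, rate_] := (Range[count] - 1)/rate

(* toy example: 5 frames at 25 frames per second *)
allTimeStamps[5, 25.]
(* {0., 0.04, 0.08, 0.12, 0.16} *)
```

For the video above this would be called as allTimeStamps[frameCount, frameRate].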

Edit: answer totyped's comment about speed

How to search frames more quickly

For a large video file, importing every frame is a slow process. One way to avoid importing the entire video is to use a sample of frames. For example, here's how to sample 12 frames from a video.

Import["file.mp4", {"AVI","ImageList", Range[1, frameCount, Round[frameCount/12]]}]

Using Range to choose only some of the video frames, the video can be easily sampled and tested using M. Stern's method. This range:

Range[1, frameCount, Round[frameRate/2]]

samples frames approximately every 1/2 second, and

Range[1, frameCount, Round[2*frameRate]]

samples at approximately 2-second intervals.

Assuming the slides in the video change about every 20 seconds, a 2- or 3-second interval should provide enough frames to test. Adjust the sample interval depending on the length of the video and how often the slides change.

Avoid testing parts of the video that don't have slides. If there's a 20-second lead-in before the first slide, start Range at Round[20*frameRate] instead of frame 1.

There's a trade-off between sample interval and how accurately a slide change will be timestamped. A large interval tests fewer frames, but the timestamp might miss the exact time that a slide change happens.
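The sampling choices above (interval, lead-in) can be wrapped into one helper. This is a rough sketch rather than a tested recipe: sampleFrames, idx and times are names I introduced, and the numbers are the example values from the Import output above.

```mathematica
(* sampleFrames is a hypothetical helper; interval and leadIn are in seconds *)
sampleFrames[count_, rate_, interval_, leadIn_ : 0] :=
  Range[Max[1, Round[leadIn*rate]], count, Max[1, Round[interval*rate]]]

(* example values: 12646 frames at 30.0047 fps, 2-second interval, 20-second lead-in *)
idx = sampleFrames[12646, 30.0047, 2, 20];
Take[idx, 3]
(* {600, 660, 720} *)

(* timestamp in seconds of each sampled frame, same formula as timeStamp *)
times = (idx - 1)/30.0047;
```

The list idx can be passed to Import in place of the Range expressions, and times[[k]] is then the timestamp of the k-th sampled frame.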


First, I have to say that I'm a bit skeptical whether what you want to do can work in general. What if the person blocks everything that is unique to that slide? What if some slides look the same?

But let's ignore those possible problems for now and try a very simple approach. This is only meant as a starting point! My answer is based on this very good answer here and this one, which contains more explanation and ways to improve.

First let us import your example data:

gif = Import["https://i.stack.imgur.com/k4ChI.gif"];
framenohead = Import["https://i.stack.imgur.com/JMthj.png"];
framewithhead = Import["https://i.stack.imgur.com/pwjwb.png"];

We scale the images down to 32x32 and obtain the pixel data using ImageData. Scaling down will increase the robustness against small differences between the slide we are searching for and the video, as well as decrease the computation time. Note that you could probably scale down the whole video beforehand. Here we search for the frame with the head on it; change this line to search for the other one if you want to try it.

seeked = Flatten[ImageData@ImageResize[framewithhead, {32, 32}], 1];
small = Flatten[ImageData[ImageResize[#, {32, 32}]], 1] & /@ gif;

In order to decide whether two colors are similar we can define the following function. Play around with the threshold value!

SimilarColor[a_, b_] := If[Total[(a - b)^2] < 0.0005, 1, 0];
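To get a feeling for the threshold, it helps to try the test on a few hand-picked colors. In this sketch similarColorT is a made-up variant of the function above that takes the threshold as a third argument; the RGB triples are arbitrary example values.

```mathematica
(* similarColorT is a hypothetical variant with an explicit threshold t *)
similarColorT[a_, b_, t_] := If[Total[(a - b)^2] < t, 1, 0]

(* squared distance is about 0.0001, below the threshold *)
similarColorT[{0.5, 0.5, 0.5}, {0.51, 0.5, 0.5}, 0.0005]
(* 1 *)

(* squared distance is about 0.01, above the threshold *)
similarColorT[{0.5, 0.5, 0.5}, {0.6, 0.5, 0.5}, 0.0005]
(* 0 *)

(* the same pair counts as similar with a looser threshold *)
similarColorT[{0.5, 0.5, 0.5}, {0.6, 0.5, 0.5}, 0.02]
(* 1 *)
```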

Now we just pick the frame with the highest score, i.e. the highest number of pixels (in the downscaled images) that are similar to the frame we are looking for.

score = Total@MapThread[SimilarColor, {#, seeked}] & /@ small;
Position[score, Max[score]]

This returns frame 15 in both cases (searching with and without the head)!

Edit: for the provided slides

Let's rename your slides so that, for example, {{1}} becomes {{001}} (zero-padded names sort in numerical order). You can do it with something like

Do[RenameFile[
  NotebookDirectory[] <> "\\frames\\{{"  <> ToString[i] <> "}}.jpg", 
  NotebookDirectory[] <> "\\frames\\{{0" <> ToString[i] <> "}}.jpg"], {i, 10, 99}]
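If the numbers do not all have two digits, the padding can also be built with IntegerString, which pads with zeros to a given length. This is just a sketch; paddedName is a name I made up, and the three-digit width is an assumption based on the {{001}} example.

```mathematica
(* paddedName is a hypothetical helper; the width of 3 digits is an assumption *)
paddedName[i_Integer] := "{{" <> IntegerString[i, 10, 3] <> "}}.jpg"

paddedName /@ {1, 10, 100}
(* {"{{001}}.jpg", "{{010}}.jpg", "{{100}}.jpg"} *)
```

The result of paddedName[i] could then be used as the new file name in the RenameFile call.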

Now we import all those images, for example:

frames = Import[#] & /@ 
   FileNames["*.jpg", NotebookDirectory[] <> "\\frames"];
slides = Import[#] & /@ 
   FileNames["*.jpg", NotebookDirectory[] <> "\\slides"];

We can scale down the images. I used a slightly higher resolution because of some details in your slides.

res = 48;
smallframes = 
  Flatten[ImageData[ImageResize[#, {res, res}]], 1] & /@ frames;
smallslides = 
  Flatten[ImageData[ImageResize[#, {res, res}]], 1] & /@ slides;

I also slightly changed the color comparison. I'm not actually sure that you need to change it, but why not try a different one :)

SimilarColor[a_, b_] := If[And @@ ((# < 0.05) & /@ ((a - b)^2)), 1, 0];

Now comes the heavy calculation (takes a couple of minutes on my laptop): We label each frame by the slide that's in the background.

labels = Monitor[Table[
    With[{score = 
        Total@MapThread[SimilarColor, {#, smallframes[[i]]}] & /@ smallslides},
         Position[score, Max[score]]][[1, 1]], {i, Length[frames]}], i]

{2,2,2,2,2,2,2,2,2,2,2,2,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,12,12,12,13,13,13,13,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,20,20,20,20,20,20,20,20,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,3,3,3,3,3,3,3,3,3,3}

This list should contain all the information you need. In particular, the first occurrence of slide 9 (with the photo mask) is

 Min@Position[labels, 9]

95
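To turn such a list of labels into timestamps, one option is to look up the first position of every label and multiply by the time between extracted frames. The following is only a sketch on made-up data: toyLabels and sampleInterval are hypothetical, and I don't know at which interval your frames were actually extracted.

```mathematica
(* toyLabels and sampleInterval are made-up example values *)
toyLabels = {2, 2, 4, 4, 4, 5, 5, 9, 9, 3};
sampleInterval = 2; (* assumed seconds between consecutive extracted frames *)

(* first position at which each slide label occurs *)
firstFrame = First /@ PositionIndex[toyLabels];

(* convert positions to seconds, counting the first extracted frame as t = 0 *)
firstTime = (# - 1)*sampleInterval & /@ firstFrame
(* <|2 -> 0, 4 -> 4, 5 -> 10, 9 -> 14, 3 -> 18|> *)
```

Keep in mind that a single mislabeled frame would show up as a wrong first occurrence, so it may be worth checking the result against the label list by eye.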