Detect MPEG4/H264 I-Frame (IDR) in RTP stream

As far as I know, MPEG4-ES stream fragments in an RTP payload usually start with an MPEG4 start code, which can be one of these:

  • 0x000001b0: visual_object_sequence_start_code (probably keyframe)
  • 0x000001b6: vop_start_code (keyframe, if the next two bits are zero)
  • 0x000001b3: group_of_vop_start_code, which is followed by three bytes of header and then, hopefully, a vop_start_code that may or may not belong to a keyframe (see above)
  • 0x00000120-0x0000012f: video_object_layer_start_code (probably keyframe)
  • 0x00000100-0x0000011f: video_object_start_code (those look like keyframes as well)
  • something else (probably not a keyframe)

I'm afraid that you'll need to parse the stream to be sure :-/
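
The list above can be turned into a small lookup on the fourth byte of the payload. This is only a sketch: the class and method names (Mpeg4StartCode, Classify) are made up for illustration, and it assumes the payload really begins with a start code, which is not guaranteed for every packet.

```csharp
using System;

public static class Mpeg4StartCode
{
    // Returns a label for the start code at the beginning of the payload,
    // or null when the payload does not begin with the 00 00 01 prefix.
    public static string Classify(byte[] payload)
    {
        if (payload == null || payload.Length < 4) return null;
        if (payload[0] != 0x00 || payload[1] != 0x00 || payload[2] != 0x01) return null;

        byte code = payload[3];
        if (code <= 0x1F) return "video_object_start_code";
        if (code <= 0x2F) return "video_object_layer_start_code";
        if (code == 0xB0) return "visual_object_sequence_start_code";
        if (code == 0xB3) return "group_of_vop_start_code";
        if (code == 0xB6) return "vop_start_code";
        return "other";
    }
}
```

A result of "vop_start_code" still needs the two bits after B6 to tell whether it is a keyframe, and "group_of_vop_start_code" needs the following VOP to be inspected.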


OK, so I figured it out for the H264 stream.

How to detect an I-Frame:

  • remove the RTP header
  • check the value of the first byte in the H264 payload
  • if the value is 124 (0x7C), it is an I-Frame (0x7C marks an FU-A fragment, and in my stream only I-Frames are fragmented)
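
A rough sketch of the steps above, in case the RTP header is not already stripped. The names RtpPayload, PayloadOffset and FirstPayloadByteIs0x7C are hypothetical. It assumes a standard RTP header: 12 fixed bytes, plus 4 bytes per CSRC entry, plus an optional extension whose length is given in 32-bit words. Trailing padding (the P bit) is ignored here.

```csharp
public static class RtpPayload
{
    // Returns the index of the first payload byte, or -1 if the packet is too short.
    public static int PayloadOffset(byte[] packet)
    {
        if (packet == null || packet.Length < 12) return -1;

        int csrcCount = packet[0] & 0x0F;            // CC field: number of CSRC entries
        bool hasExtension = (packet[0] & 0x10) != 0; // X bit: header extension present
        int offset = 12 + 4 * csrcCount;

        if (hasExtension)
        {
            if (packet.Length < offset + 4) return -1;
            // The extension length counts 32-bit words after the 4-byte extension header.
            int extWords = (packet[offset + 2] << 8) | packet[offset + 3];
            offset += 4 + 4 * extWords;
        }

        return offset < packet.Length ? offset : -1;
    }

    // The check from the list: is the first payload byte 124 (0x7C)?
    public static bool FirstPayloadByteIs0x7C(byte[] packet)
    {
        int offset = PayloadOffset(packet);
        return offset >= 0 && packet[offset] == 0x7C;
    }
}
```

For a plain header without CSRC entries or extension, PayloadOffset simply returns 12.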

I can't figure it out for the MPEG4-ES stream... any suggestions?

EDIT: H264 IDR

This works for my H264 stream (fmtp:96 packetization-mode=1; profile-level-id=420029;). You just pass the byte array that holds the H264 fragment received through RTP. If you want to pass the whole RTP packet, set RTPHeaderBytes to the length of the RTP header so that it is skipped. In my stream the I-Frame is the only frame that gets fragmented, so I always catch it; the code still checks the NAL type inside the fragment to be sure. If the I-Frame (IDR) is not fragmented, the first byte carries NAL type 5 directly (fragment_type would be 5), so this code returns true for both fragmented and non-fragmented IDRs. I use this (simplified) piece of code in my server, and it works like a charm.

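public static bool isH264iFrame(byte[] paket)
{
    // Set this to the RTP header length when passing a whole RTP packet.
    int RTPHeaderBytes = 0;

    // Two bytes are needed: the NAL header (FU indicator) and the FU header.
    if (paket == null || paket.Length < RTPHeaderBytes + 2)
    {
        return false;
    }

    int fragment_type = paket[RTPHeaderBytes + 0] & 0x1F; // NAL unit type of the packet itself
    int nal_type = paket[RTPHeaderBytes + 1] & 0x1F;      // FU header: type of the fragmented NAL unit
    int start_bit = paket[RTPHeaderBytes + 1] & 0x80;     // FU header: set on the first fragment only

    // First fragment of a fragmented IDR slice (28 = FU-A, 29 = FU-B).
    bool fragmentedIdrStart = (fragment_type == 28 || fragment_type == 29) && nal_type == 5 && start_bit == 0x80;

    // IDR slice sent as a single NAL unit packet.
    bool singlePacketIdr = fragment_type == 5;

    return fragmentedIdrStart || singlePacketIdr;
}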

Here's the table of NAL unit types:

 Type Name
    0 [unspecified]
    1 Coded slice (non-IDR picture)
    2 Data Partition A
    3 Data Partition B
    4 Data Partition C
    5 IDR (Instantaneous Decoding Refresh) Picture
    6 SEI (Supplemental Enhancement Information)
    7 SPS (Sequence Parameter Set)
    8 PPS (Picture Parameter Set)
    9 Access Unit Delimiter
   10 EoS (End of Sequence)
   11 EoS (End of Stream)
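   12 Filler Data
13-23 [extensions and reserved]
24-31 [unspecified in H264; RFC 3984 uses 24-29 for aggregation and fragmentation packets, e.g. 28 = FU-A, 29 = FU-B]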
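
The type number in the table is the low five bits of the first payload byte; the remaining three bits are the forbidden_zero_bit and the two-bit nal_ref_idc. A minimal sketch of splitting that byte (the NalHeader class and its method names are made up for this example):

```csharp
public static class NalHeader
{
    // Bit 7: must be 0 in a valid stream.
    public static int ForbiddenZeroBit(byte header) => (header >> 7) & 0x01;

    // Bits 6-5: nal_ref_idc (NRI), 0 means the NAL unit is not used for reference.
    public static int NalRefIdc(byte header) => (header >> 5) & 0x03;

    // Bits 4-0: the type number from the table.
    public static int NalUnitType(byte header) => header & 0x1F;
}
```

For example, 0x7C is 0111 1100 in binary, which splits into F = 0, NRI = 3 and type 28 (FU-A), while 0x65 splits into F = 0, NRI = 3 and type 5 (IDR).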

EDIT 2: MPEG4 I-VOP

I forgot to update this... Thanks to Che and the ISO/IEC 14496-2 document, I managed to work this out! Che was right, but not very precise in his answer, so here is how to find I, P and B frames (I-VOP, P-VOP, B-VOP) in short:

  1. A VOP (Video Object Plane, i.e. a frame) starts with the code 000001B6 (hex). It is the same for all MPEG4 frames (I, P, B).
  2. A lot more information follows, which I am not going to describe here (see the ISO/IEC doc), but (as Che said) we only need the upper 2 bits of the following byte (the two bits right after the byte with the value B6). Those 2 bits tell you the VOP_CODING_TYPE, see the table:

    VOP_CODING_TYPE (binary)  Coding method
                          00  intra-coded (I)
                          01  predictive-coded (P)
                          10  bidirectionally-predictive-coded (B)
                          11  sprite (S)
    

So, to find an I-Frame, look for the packet that starts with the four bytes 000001B6 and has the upper two bits of the next byte set to 00. This finds I-Frames in an MPEG4 stream with the simple video object type (I am not sure about advanced simple).
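
Here is a sketch of that check. The names (Mpeg4Vop, FindVopCodingType, ContainsIVop) are hypothetical, and one thing differs from the description: the code scans the buffer for 000001B6 instead of only looking at the first four bytes, on the assumption that other headers (for example a group_of_vop_start_code) may sit in front of the VOP in the same packet. It only reports the first VOP it finds.

```csharp
public static class Mpeg4Vop
{
    // Returns the VOP_CODING_TYPE of the first VOP in the buffer:
    // 0 = I, 1 = P, 2 = B, 3 = S. Returns -1 if no vop_start_code
    // followed by at least one more byte is found.
    public static int FindVopCodingType(byte[] data)
    {
        if (data == null) return -1;

        // i + 4 must be a valid index, because the coding type sits in the byte after B6.
        for (int i = 0; i + 4 < data.Length; i++)
        {
            if (data[i] == 0x00 && data[i + 1] == 0x00 && data[i + 2] == 0x01 && data[i + 3] == 0xB6)
            {
                return (data[i + 4] >> 6) & 0x03; // upper two bits of the next byte
            }
        }

        return -1;
    }

    public static bool ContainsIVop(byte[] data) => FindVopCodingType(data) == 0;
}
```

If you want the strict version from the text, check only index 0 instead of looping.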

For any other problems, you can check the document mentioned above (ISO/IEC 14496-2); it has everything you want to know about MPEG4. :)


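Actually, you were correct for the H264 stream: if the NAL value (first byte) is 0x7C, the packet is an FU-A fragment (NAL type 28, NRI 3). With packetization-mode=1 in the SDP, NAL units that are too large for one packet are fragmented, so in a stream where only the I-Frames are that large, reading 0x7C as the first byte means it is an I-Frame. Note that RFC 3984 does not restrict fragmentation to I-Frames, so a large P- or B-Frame would also arrive as 0x7C; checking the NAL type in the second byte (the FU header) is the safer test. Read more here: http://www.rfc-editor.org/rfc/rfc3984.txt.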

Tags:

Rtp

Rtsp