Get pixel value from CVPixelBufferRef in Swift

Here is a method for getting the individual rgb values from a BGRA pixel buffer. Note: Your buffer must be locked before calling this.

func pixelFrom(x: Int, y: Int, movieFrame: CVPixelBuffer) -> (UInt8, UInt8, UInt8) {
    let baseAddress = CVPixelBufferGetBaseAddress(movieFrame)
    
    let bytesPerRow = CVPixelBufferGetBytesPerRow(movieFrame)
    let buffer = baseAddress!.assumingMemoryBound(to: UInt8.self)
    
    let index = x*4 + y*bytesPerRow
    let b = buffer[index]
    let g = buffer[index+1]
    let r = buffer[index+2]
    
    return (r, g, b)
}

baseAddress is an unsafe mutable pointer or more precisely a UnsafeMutablePointer<Void>. You can easily access the memory once you have converted the pointer away from Void to a more specific type:

// Convert the base address to a safe pointer of the appropriate type
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

// read the data (returns value of type UInt8)
let firstByte = byteBuffer[0]

// write data
byteBuffer[3] = 90

Make sure you use the correct type (8, 16 or 32 bit unsigned int). It depends on the video format. Most likely it's 8 bit.

Update on buffer formats:

You can specify the format when you initialize the AVCaptureVideoDataOutput instance. You basically have the choice of:

  • BGRA: a single plane where the blue, green, red and alpha values are stored in a 32 bit integer each
  • 420YpCbCr8BiPlanarFullRange: Two planes, the first containing a byte for each pixel with the Y (luma) value, the second containing the Cb and Cr (chroma) values for groups of pixels
  • 420YpCbCr8BiPlanarVideoRange: The same as 420YpCbCr8BiPlanarFullRange but the Y values are restricted to the range 16 – 235 (for historical reasons)

If you're interested in the color values and speed (or rather maximum frame rate) is not an issue, then go for the simpler BGRA format. Otherwise take one of the more efficient native video formats.

If you have two planes, you must get the base address of the desired plane (see video format example):

Video format example

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

// Get luma value for pixel (43, 17)
let luma = byteBuffer[17 * bytesPerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)

BGRA example

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
let int32Buffer = UnsafeMutablePointer<UInt32>(baseAddress)

// Get BGRA value for pixel (43, 17)
let luma = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)

Swift 5

I had the same problem and ended up with the following solution. My CVPixelBuffer had dimensionality 68 x 68, which can be inspected by

CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
print(CVPixelBufferGetWidth(pixelBuffer))
print(CVPixelBufferGetHeight(pixelBuffer))

You also have to know the bytes per row:

print(CVPixelBufferGetBytesPerRow(pixelBuffer))

which in my case was 320.

Furthermore, you need to know the data type of your pixel buffer, which was Float32 for me.

I then constructed a byte buffer and read the bytes consecutively as follows (remember to lock the base address as shown above):

var byteBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<Float32>.self)
var pixelArray: Array<Array<Float>> = Array(repeating: Array(repeating: 0, count: 68), count: 68)
for row in 0...67{
    for col in 0...67{
        pixelArray[row][col] = byteBuffer.pointee
        byteBuffer = byteBuffer.successor()    
    }
    byteBuffer = byteBuffer.advanced(by: 12)
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

You might wonder about the part byteBuffer = byteBuffer.advanced(by: 12). The reason why we have to do this is as follows.

We know that we have 320 bytes per row. However, our buffer has width 68 and the data type is Float32, e.g. 4 bytes per value. That means that we virtually only have 272 bytes per row, followed by zero-padding. This zero-padding probably has memory layout reasons.

We, therefore, have to skip the last 48 bytes in each row which is done by byteBuffer = byteBuffer.advanced(by: 12) (12*4 = 48).

This approach is somewhat different from other solutions as we use pointers to the next byteBuffer. However, I find this easier and more intuitive.


Update for Swift3:

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0));
let int32Buffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<UInt32>.self)
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
// Get BGRA value for pixel (43, 17)
let luma = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)