How to combine video clips with different orientation using AVFoundation

This is what I do; I then use an AVAssetExportSession to create the actual file. Be warned, though: the CGAffineTransforms are sometimes applied late, so you'll see a frame or two of the original orientation before the video transforms. I have no idea why this happens; one combination of videos will yield the expected result, while a different one is off.

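// Build the composition, a single mutable video track that will hold every clip, and the video composition that carries the per-clip transforms.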
AVMutableComposition *composition = [AVMutableComposition composition];    
AVMutableCompositionTrack *compositionVideoTrack = [composition addMutableTrackWithMediaType:AVMediaTypeVideo preferredTrackID:kCMPersistentTrackID_Invalid];
AVMutableVideoComposition *videoComposition = [AVMutableVideoComposition videoComposition]; 
videoComposition.frameDuration = CMTimeMake(1,30); 
videoComposition.renderScale = 1.0;

AVMutableVideoCompositionInstruction *instruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
AVMutableVideoCompositionLayerInstruction *layerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:compositionVideoTrack];

// Get only the paths the user selected
NSMutableArray *array = [NSMutableArray array];
for (NSString *string in videoPathArray) {
    if (![string isEqualToString:@""]) {
        [array addObject:string];
    }
}

self.videoPathArray = array;

float time = 0;

for (int i = 0; i<self.videoPathArray.count; i++) {

    AVURLAsset *sourceAsset = [AVURLAsset URLAssetWithURL:[NSURL fileURLWithPath:[videoPathArray objectAtIndex:i]] options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES] forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];

    NSError *error = nil;

    BOOL ok = NO;
    AVAssetTrack *sourceVideoTrack = [[sourceAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];

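    // Run the natural size through the preferred transform to get the clip's display size (width and height swap for portrait clips).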
    CGSize temp = CGSizeApplyAffineTransform(sourceVideoTrack.naturalSize, sourceVideoTrack.preferredTransform);
    CGSize size = CGSizeMake(fabsf(temp.width), fabsf(temp.height));
    CGAffineTransform transform = sourceVideoTrack.preferredTransform;

    videoComposition.renderSize = sourceVideoTrack.naturalSize;
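    // Landscape clips keep their preferred transform as-is; portrait clips are scaled down and translated so they sit centered in the landscape frame.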
    if (size.width > size.height) {
        [layerInstruction setTransform:transform atTime:CMTimeMakeWithSeconds(time, 30)];
    } else {

        float s = size.width/size.height;

        CGAffineTransform new = CGAffineTransformConcat(transform, CGAffineTransformMakeScale(s,s));

        float x = (size.height - size.width*s)/2;

        CGAffineTransform newer = CGAffineTransformConcat(new, CGAffineTransformMakeTranslation(x, 0));

        [layerInstruction setTransform:newer atTime:CMTimeMakeWithSeconds(time, 30)];
    }

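    // Append the entire source video track at the current end of the composition.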
    ok = [compositionVideoTrack insertTimeRange:sourceVideoTrack.timeRange ofTrack:sourceVideoTrack atTime:[composition duration] error:&error];

    if (!ok) {
        // Deal with the error.
        NSLog(@"something went wrong");
    }

    NSLog(@"\n source asset duration is %f \n source vid track timerange is %f %f \n composition duration is %f \n composition vid track time range is %f %f",CMTimeGetSeconds([sourceAsset duration]), CMTimeGetSeconds(sourceVideoTrack.timeRange.start),CMTimeGetSeconds(sourceVideoTrack.timeRange.duration),CMTimeGetSeconds([composition duration]), CMTimeGetSeconds(compositionVideoTrack.timeRange.start),CMTimeGetSeconds(compositionVideoTrack.timeRange.duration));

    time += CMTimeGetSeconds(sourceVideoTrack.timeRange.duration);
}

instruction.layerInstructions = [NSArray arrayWithObject:layerInstruction];
instruction.timeRange = compositionVideoTrack.timeRange; 
videoComposition.instructions = [NSArray arrayWithObject:instruction];
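
For completeness, the export step looks roughly like this. The preset, output path, and file type below are placeholders of my own rather than anything from the code above, so adjust them for your app:

// Minimal export sketch -- preset, output path, and file type are assumptions, not part of the code above.
NSString *outputPath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"combined.mov"];
AVAssetExportSession *exporter = [AVAssetExportSession exportSessionWithAsset:composition presetName:AVAssetExportPresetHighestQuality];
exporter.videoComposition = videoComposition;
exporter.outputURL = [NSURL fileURLWithPath:outputPath];
exporter.outputFileType = AVFileTypeQuickTimeMovie;
[exporter exportAsynchronouslyWithCompletionHandler:^{
    if (exporter.status == AVAssetExportSessionStatusCompleted) {
        NSLog(@"export finished: %@", outputPath);
    } else {
        NSLog(@"export failed: %@", exporter.error);
    }
}];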
