How to write vectorized functions in MATLAB

Update 2 (to address your updated question)

MATLAB is optimized to work well with arrays. Once you get used to it, it is actually really nice to just have to type one line and have MATLAB do the full 4D looping stuff itself without having to worry about it. MATLAB is often used for prototyping / one-off calculations, so it makes sense to save time for the person coding, and giving up some of C[++|#]'s flexibility.

This is why MATLAB internally does some loops really well - often by coding them as a compiled function.

The code snippet you give doesn't really contain the relevant line of code which does the main work, namely

% Sort along given dimension
x = sort(x,dim);

In other words, the code you show only needs to access the median values by their correct index in the now-sorted multi-dimensional array x (which doesn't take much time). The actual work accessing all array elements was done by sort, which is a built-in (i.e. compiled and highly optimized) function.

Original answer (about how to built your own fast functions working on arrays)

There are actually quite a few built-ins that take a dimension parameter: min(stack, [], n), max(stack, [], n), mean(stack, n), std(stack, [], n), median(stack,n), sum(stack, n)... together with the fact that other built-in functions like exp(), sin() automatically work on each element of your whole array (i.e. sin(stack) automatically does four nested loops for you if stack is 4D), you can built up a lot of functions that you might need just be relying on the existing built-ins.

If this is not enough for a particular case you should have a look at repmat, bsxfun, arrayfun and accumarray which are very powerful functions for doing things "the MATLAB way". Just search on SO for questions (or rather answers) using one of these, I learned a lot about MATLABs strong points that way.

As an example, say you wanted to implement the p-norm of stack along dimension n, you could write

function result=pnorm(stack, p, n)
result=sum(stack.^p,n)^(1/p);

... where you effectively reuse the "which-dimension-capability" of sum.

Update

As Max points out in the comments, also have a look at the colon operator (:) which is a very powerful tool for selecting elements from an array (or even changing it shape, which is more generally done with reshape).

In general, have a look at the section Array Operations in the help - it contains repmat et al. mentioned above, but also cumsum and some more obscure helper functions which you should use as building blocks.


Could you explain to me that why is this code so effective compared to the simple nested loops? It has nested loops just like the other function.

The problem with nested loops is not the nested loops themselves. It's the operations you perform inside.

Each function call (especially to a non-built-in function) generates a little bit of overhead; more so if the function performs e.g. error checking that takes the same amount of time regardless of input size. Thus, if a function has only a 1 ms overhead, if you call it 1000 times, you will have wasted a second. If you can call it once to perform a vectorized calculation, you pay overhead only once.

Furthermore, the JIT compiler (pdf) can help vectorize simple for-loops, where you, for example, only perform basic arithmetic operations. Thus, the loops with simple calculations in your post are sped up by a lot, while the loops calling median are not.


Vectorization

In addition to whats already been said, you should also understand that vectorization involves parallelization, i.e. performing concurrent operations on data as opposed to sequential execution (think SIMD instructions), and even taking advantage of threads and multiprocessors in some cases...

MEX-files

Now although the "interpreted vs. compiled" point has already been argued, no one mentioned that you can extend MATLAB by writing MEX-files, which are compiled executables written in C, that can be called directly as normal function from inside MATLAB. This allows you to implement performance-critical parts using a lower-level language like C.

Column-major order

Finally, when trying to optimize some code, always remember that MATLAB stores matrices in column-major order. Accessing elements in that order can yield significant improvements compared to other arbitrary orders.

For example, in your previous linked question, you were computing the median of set of stacked images along some dimension. Now the order in which those dimensions are ordered greatly affect the performance. Illustration:

%# sequence of 10 images
fPath = fullfile(matlabroot,'toolbox','images','imdemos');
files = dir( fullfile(fPath,'AT3_1m4_*.tif') );
files = strcat(fPath,{filesep},{files.name}');      %'

I = imread( files{1} );

%# stacked images along the 1st dimension: [numImages H W RGB]
stack1 = zeros([numel(files) size(I) 3], class(I));
for i=1:numel(files)
    I = imread( files{i} );
    stack1(i,:,:,:) = repmat(I, [1 1 3]);   %# grayscale to RGB
end

%# stacked images along the 4th dimension: [H W RGB numImages]
stack4 = permute(stack1, [2 3 4 1]);

%# compute median image from each of these two stacks
tic, m1 = squeeze( median(stack1,1) ); toc
tic, m4 = median(stack4,4); toc
isequal(m1,m4)

The timing difference was huge:

Elapsed time is 0.257551 seconds.     %# stack1
Elapsed time is 17.405075 seconds.    %# stack4