Casting complex to real without data copy in MATLAB R2018a and newer

See this FEX submission, which can do the complex --> 2 reals reinterpretation without copying the data (it can even point to interior contiguous sub-sections of the data without a copy):

https://www.mathworks.com/matlabcentral/fileexchange/65842-sharedchild-creates-a-shared-data-copy-of-a-contiguous-subsection-of-an-existing-variable

If you are simply reading and writing interleaved complex data files in R2018a and later, see this FEX submission:

https://www.mathworks.com/matlabcentral/fileexchange/77530-freadcomplex-and-fwritecomplex


This question reminded me of some blog posts regarding in-place memory editing through MEX, about which the following was said:

[M]odifying the original data directly is both discouraged and not officially supported. Doing it incorrectly can easily crash Matlab. ref

This is, in the best case, an unmaintainable mess. ref

Having said that, I don't have a solution for you, but I might be able to suggest a workaround.

Seeing how MATLAB allows us to call python libraries, we can perform these sort of manipulations within python, and only bring the data back into MATLAB when we reach a stage where proceeding in python is impossible or undesired. Depending on when this stage might be, this idea could be either a valid approach, or a completely useless suggestion.

Take a look at the example below, it should be pretty self-explanatory:

np = py.importlib.import_module('numpy');
sp = py.importlib.import_module('scipy.fftpack');

% Create a double array in python:
arrC = sp.fftn(np.random.rand(uint8(4), uint8(3), uint8(2)));
%{
arrC = 
  Python ndarray with properties:

           T: [1×1 py.numpy.ndarray]
        base: [1×1 py.NoneType]
      ctypes: [1×1 py.numpy.core._internal._ctypes]
        data: [1×4 py.memoryview]
       dtype: [1×1 py.numpy.dtype]
       flags: [1×1 py.numpy.flagsobj]
        flat: [1×1 py.numpy.flatiter]
        imag: [1×1 py.numpy.ndarray]
    itemsize: [1×1 py.int]
      nbytes: [1×1 py.int]
        ndim: [1×1 py.int]
        real: [1×1 py.numpy.ndarray]
       shape: [1×3 py.tuple]
        size: [1×1 py.int]
     strides: [1×3 py.tuple]
    [[[ 13.99586491+0.j           0.70305071+0.j        ]
      [ -1.33719563-1.3820106j   -0.74083670+0.25893033j]
      [ -1.33719563+1.3820106j   -0.74083670-0.25893033j]]

     [[ -0.43914391+0.8336674j    0.08835445-0.50821244j]
      [  1.07089829-0.35245746j   0.44890850-0.9650458j ]
      [  2.09813180+1.34942678j  -1.20877832+0.71191772j]]

     [[ -2.93525342+0.j          -0.69644042+0.j        ]
      [  0.16165913-1.29739125j  -0.84443177+0.26884365j]
      [  0.16165913+1.29739125j  -0.84443177-0.26884365j]]

     [[ -0.43914391-0.8336674j    0.08835445+0.50821244j]
      [  2.09813180-1.34942678j  -1.20877832-0.71191772j]
      [  1.07089829+0.35245746j   0.44890850+0.9650458j ]]]
%}

% Make sure that python sees it as a "complex double" (aka complex128)
assert( isequal(arrC.dtype, np.dtype(np.complex128)) );

% Return a (real) double view:
arrR = arrC.view(np.float64);
%{
arrR = 
  Python ndarray with properties:

           T: [1×1 py.numpy.ndarray]
        base: [1×1 py.numpy.ndarray]
      ctypes: [1×1 py.numpy.core._internal._ctypes]
        data: [1×4 py.memoryview]
       dtype: [1×1 py.numpy.dtype]
       flags: [1×1 py.numpy.flagsobj]
        flat: [1×1 py.numpy.flatiter]
        imag: [1×1 py.numpy.ndarray]
    itemsize: [1×1 py.int]
      nbytes: [1×1 py.int]
        ndim: [1×1 py.int]
        real: [1×1 py.numpy.ndarray]
       shape: [1×3 py.tuple]
        size: [1×1 py.int]
     strides: [1×3 py.tuple]
    [[[ 13.99586491   0.           0.70305071   0.        ]
      [ -1.33719563  -1.3820106   -0.7408367    0.25893033]
      [ -1.33719563   1.3820106   -0.7408367   -0.25893033]]

     [[ -0.43914391   0.8336674    0.08835445  -0.50821244]
      [  1.07089829  -0.35245746   0.4489085   -0.9650458 ]
      [  2.0981318    1.34942678  -1.20877832   0.71191772]]

     [[ -2.93525342   0.          -0.69644042   0.        ]
      [  0.16165913  -1.29739125  -0.84443177   0.26884365]
      [  0.16165913   1.29739125  -0.84443177  -0.26884365]]

     [[ -0.43914391  -0.8336674    0.08835445   0.50821244]
      [  2.0981318   -1.34942678  -1.20877832  -0.71191772]
      [  1.07089829   0.35245746   0.4489085    0.9650458 ]]]
%}

% Do something else with it in python
...

% Bring data to MATLAB:
sz = cellfun(@int64, cell(arrR.shape));
B = permute(reshape(double(py.array.array('d', arrR.flatten('C').tolist())),sz),[2,1,3]);

The last stage might be inefficient enough to negate the performance/memory gains by not copying the data previously, so it would probably be a good idea to keep it for the very end, and reduce the data as much as possible beforehand.


While experimenting with this, I came to realize that there's much difficulty1 involved in transferring non-scalar complex data from MATLAB to Python and back (just compare the MATLAB output for np.asarray(1+1i) vs np.asarray([1+1i, 1+1i])). Possibly, the reason for this is the limited support for complex non-ndarray arrays in python.

If you (as opposed to me) know what to do with memoryview objects (i.e. the content of the data field of the ndarray objects) - you could get a pointer and perhaps pass this on to C to get some useful results.


1 It's possible, but you have to num2cell your data so that it can get transferred as a python list (and vice versa). The situation can also be improved by editing some MATLAB files.


Here is a wording from the documentation:

Since many mathematical libraries use an interleaved complex representation, using the same representation in your MEX functions eliminates the need to translate data. This simplifies your code and potentially speeds up the processing when large data sets are involved.

According to the example in the documentation each complex type appears to be an struct with two real and imaginary components:

typedef struct {
    double real;
    double imag;
} mxComplexDouble ;

So we want to (re-interpret) cast an array of mxComplexDouble to other double complex type that is used in a mathematical library.

For example a mathematical library may have defined its complex type as:

typedef struct {
    double real;
    double imag;
} other_complex ;

And it has defined a function to consume an array of its own complex type.

void do_something(const other_complex* math_array);

Here we want to send a MATLAB complex array to the math library:

mxComplexDouble matlab_array[5];
other_complex* math_array =  (other_complex*)matlab_array;
do_something(math_array);

Because it needs just a pointer conversion, It may be considered that it speeds up the process.

But regarding the strict aliasing rule * any use of math_array results in an undefined behavior . Even casting to array of doubles is prohibited:

double* double_array = (double*)matlab_array; 
printf("%f",double_array[0]); // Unedfined behavior

Therefor we need to allocate a new array and copy the data byte by byte using memcpy. It is the safe way that prevents undefined behavior.

The only exception to the strict aliasing rule is the standard complex data types that are defined in the headers complex.h and complex in c and c++ respectively and only floating point [float, double and long double] complex types can be defined. So we can safely cast a std::complex<double>* to double*.

Conclutions :

  1. typecast may have been prohibited for new MATLAB complex data types because of the strict aliasing rule but as explained new complex data types cannot be used so cheaply in the other libraries. Only using memcpy may be considered as an efficient way for copying the whole data.

  2. Using typcast for other data types seems that to be doubtful. I can (and probably should) assume that MATLAB has used some compiler tricks to prevent undefined behavior, otherwise when we are accessing a data element if MATLAB hasn't copied it byte by byte then it results in an undefined behavior.(Note that anyway casting between compatible types like int32 and uint32 and also casting any type to c++ char type have well defined behavior). Moreover it requires that all mex file to be compiled with the proper option to disable strict aliasing. But we currently have plenty of compiled mex files that if we send to it a result of typecast it leads to an undefined behavior. So use of typecast should be restricted as much as possible.

*I have found some blogs that have better explained the concept than the SO post. For example here or here or here.