Solve the chromatic puzzle

Octave, 334 313 bytes

Since the challenge may seem a bit daunting, I present my own solution. I did not formally prove that this method works (I guess that will come down to proving that the algorithm will never get stuck in a loop), but so far it works perfectly, doing 100x100 testcases within 15 seconds. Note that I chose to use a function with side effects rather than one that returns all the coordinates since that saved me a few bytes. Coordinates are row-major, 1-based, and formatted as row1 col1 row2 col2. Input colours are 0,1,2 since this works better with mod, at the cost of having to use numel rather than nnz. Golfed version: Edit: saved another few bytes by using a technique from Kevin Lau's answer.

function P(p)
k=0;[m,n]=size(p);t=@(v)mod(sum(v)*numel(v),3);for c=1:n
p(:,c)=V(p(:,c));end
k=1;for r=1:m
p(r,:)=V(p(r,:));end
function a=V(a)
while any(a-(g=t(a)))
isempty(q=find(diff(a)&a(1:end-1)-g,1))&&(q=find(diff(a),1));
a([q q+1])=t(a([q q+1]));if k
disp([r q r q+1])
else
disp([q c q+1 c])
end;end;end;end

Example GIF of the solving algorithm:

enter image description here

Ungolfed version:

function solveChromaticPuzzle(p)
[m,n]=size(p);                           % Get size
t=@(v)mod(sum(v)*numel(v),3);            % Target colour function
for c=1:n                                % Loop over columns
    p(:,c)=solveVec(p(:,c));             % Solve vector
end
for r=1:m                                % Loop over rows
    p(r,:)=solveVec(p(r,:));
end
    function a=solveVec(a)               % Nested function to get globals
        g=t(a);                          % Determine target colour
        while(any(a~=g))                 % While any is diff from target...
            % Do the finding magic. Working left-to-right, we find the
            % first pair that can be flipped (nonzero diff) that does not
            % have the left colour different from our goal colour
            q=min(intersect(find(diff(a)),find(a~=g)));
            if(isempty(q))               % In case we get stuck...
                q=find(diff(a),1);       % ... just flip the first possible
            end;
            a([q q+1])=t(a([q q+1]));    % Do the actual flipping.
            if(exist('r'))               % If we're working per row
                disp([r q r q+1])        % Print the pair, using global row
            else
                disp([q c q+1 c])        % Print the pari, using global col
            end
        end
    end
end

Lua, 594 575 559 Bytes

Warning There's still lots of work before I'm done with this golfing! I should be able to take that under 500 Bytes, at the very least. For the moment, it's the first solution that worked, and I'm still working on it.

I will provide a full explanation once I'm done.

function f(t)s=#t a=","for i=1,s do p=t[i]for i=1,s
do p.Q=p.Q and p.Q+p[i]or p[i]end p.Q=(p.Q*#p)%3 for i=1,s do for j=1,#p-1 do
x=p[j]y=p[j+1]u=x~=y and(p.S and p.R==p.S or x~=p.Q)v=(x+y)*2p[j]=u and v%3or x
p[j+1]=u and v%3or y print(i..a..j,i..a..j+1)end
p.R=p.S p.S=table.concat(p)end end
for i=1,s do Q=Q and Q+t[i][1]or t[i][1]end Q=(Q*s)%3 for i=1,s
do for j=1,s-1 do p=t[j]q=t[j+1]x=p[1]y=q[1]u=x~=y and(S and R==S or x~=Q)v=(x+y)*2
for k=1,#p do p[k]=u and v%3or x q[k]=u and v%3or y
print(j..a..k,j+1..a..k)end Y=Y and Y..x or x end
R=S S=Y end end

Rust, 496 495 bytes

Sadly I can't beat OP, but for a Rust answer I am quite satisfied with the bytecount.

let s=|mut v:Vec<_>,c|{
let p=|v:&mut[_],f,t|{
let x=|v:&mut[_],i|{
let n=v[i]^v[i+1];v[i]=n;v[i+1]=n;
for k in f..t+1{print!("{:?}",if f==t{(k,i,k,i+1)}else{(i,k,i+1,k)});}};
let l=v.len();let t=(1..4).find(|x|l*x)%3==v.iter().fold(0,|a,b|a+b)%3).unwrap();
let mut i=0;while i<l{let c=v[i];if c==t{i+=1;}else if c==v[i+1]{
let j=if let Some(x)=(i+1..l).find(|j|v[j+1]!=c){x}else{i-=1;i};x(v,j);}else{x(v,i);}}t};
p(&mut (0..).zip(v.chunks_mut(c)).map(|(i,x)|{p(x,i,i)}).collect::<Vec<_>>(),0,c-1usize)};

Input: a vector of numbers as well as the number of columns. E.g.

s(vec!{1,2,1,3},2);

outputs

 (row1,col1,row2,col2)

to the command line.

I first solve every row and then solve the resulting column only once ,but print the steps for all columns. So it is actually quite efficient :-P.

With formating:

let s=|mut v:Vec<_>,c|{  
    let p=|v:&mut[_],f,t|{     // solves a single row/column
        let x=|v:&mut[_],i|{   // makes a move and prints it 
            let n=v[i]^v[i+1]; // use xor to calculate the new color
            v[i]=n;
            v[i+1]=n;
            for k in f..t{
                print!("{:?}",if f==t{(k,i,k,i+1)}else{(i,k,i+1,k)});
            }
        };
        let l=v.len();
        // find target color
        // oh man i am so looking forward to sum() being stabilized
        let t=(1..4).find(|x|(l*x)%3==v.iter().fold(0,|a,b|a+b)%3).unwrap();
        let mut i=0;
        while i<l{
            let c=v[i];
            if c==t{             // if the color is target color move on
                i+=1;
            }else if c==v[i+1]{ // if the next color is the same
                                // find the next possible move
                let j=if let Some(x)=(i+1..l).find(|j|v[j+1]!=c){x}else{i-=1;i};
                x(v,j);
            }else{              // colors are different so we can make a move
                x(v,i);         
            }
        }
        t
    };
    // first solve all rows and than sovle the resulting column c times 
    p(&mut (0..).zip(v.chunks_mut(c)).map(|(i,x)|p(x,i,i)).collect::<Vec<_>>(),0,c-1usize)
};

Edit: now returns the color of the solution which saves me a semicolon^^