Find local maxima in grayscale image using OpenCV

Actually after I posted the code above I wrote a better and very very faster one .. The code above suffers even for a 640x480 picture.. I optimized it and now it is very very fast even for 1600x1200 pic. Here is the code :

void localMaxima(cv::Mat src,cv::Mat &dst,int squareSize)
{
if (squareSize==0)
{
    dst = src.clone();
    return;
}

Mat m0;
dst = src.clone();
Point maxLoc(0,0);

//1.Be sure to have at least 3x3 for at least looking at 1 pixel close neighbours
//  Also the window must be <odd>x<odd>
SANITYCHECK(squareSize,3,1);
int sqrCenter = (squareSize-1)/2;

//2.Create the localWindow mask to get things done faster
//  When we find a local maxima we will multiply the subwindow with this MASK
//  So that we will not search for those 0 values again and again
Mat localWindowMask = Mat::zeros(Size(squareSize,squareSize),CV_8U);//boolean
localWindowMask.at<unsigned char>(sqrCenter,sqrCenter)=1;

//3.Find the threshold value to threshold the image
    //this function here returns the peak of histogram of picture
    //the picture is a thresholded picture it will have a lot of zero values in it
    //so that the second boolean variable says :
    //  (boolean) ? "return peak even if it is at 0" : "return peak discarding 0"
int thrshld =  maxUsedValInHistogramData(dst,false);
threshold(dst,m0,thrshld,1,THRESH_BINARY);

//4.Now delete all thresholded values from picture
dst = dst.mul(m0);

//put the src in the middle of the big array
for (int row=sqrCenter;row<dst.size().height-sqrCenter;row++)
    for (int col=sqrCenter;col<dst.size().width-sqrCenter;col++)
    {
        //1.if the value is zero it can not be a local maxima
        if (dst.at<unsigned char>(row,col)==0)
            continue;
        //2.the value at (row,col) is not 0 so it can be a local maxima point
        m0 =  dst.colRange(col-sqrCenter,col+sqrCenter+1).rowRange(row-sqrCenter,row+sqrCenter+1);
        minMaxLoc(m0,NULL,NULL,NULL,&maxLoc);
        //if the maximum location of this subWindow is at center
        //it means we found the local maxima
        //so we should delete the surrounding values which lies in the subWindow area
        //hence we will not try to find if a point is at localMaxima when already found a neighbour was
        if ((maxLoc.x==sqrCenter)&&(maxLoc.y==sqrCenter))
        {
            m0 = m0.mul(localWindowMask);
                            //we can skip the values that we already made 0 by the above function
            col+=sqrCenter;
        }
    }
}

Here's a simple trick. The idea is to dilate with a kernel that contains a hole in the center. After the dilate operation, each pixel is replaced with the maximum of it's neighbors (using a 5 by 5 neighborhood in this example), excluding the original pixel.

Mat1b kernelLM(Size(5, 5), 1u);
kernelLM.at<uchar>(2, 2) = 0u;
Mat imageLM;
dilate(image, imageLM, kernelLM);
Mat1b localMaxima = (image > imageLM);

A pixel is considered a local maximum if it is equal to the maximum value in a 'local' neighborhood. The function below captures this property in two lines of code.

To deal with pixels on 'plateaus' (value equal to their neighborhood) one can use the local minimum property, since plateaus pixels are equal to their local minimum. The rest of the code filters out those pixels.

void non_maxima_suppression(const cv::Mat& image, cv::Mat& mask, bool remove_plateaus) {
    // find pixels that are equal to the local neighborhood not maximum (including 'plateaus')
    cv::dilate(image, mask, cv::Mat());
    cv::compare(image, mask, mask, cv::CMP_GE);

    // optionally filter out pixels that are equal to the local minimum ('plateaus')
    if (remove_plateaus) {
        cv::Mat non_plateau_mask;
        cv::erode(image, non_plateau_mask, cv::Mat());
        cv::compare(image, non_plateau_mask, non_plateau_mask, cv::CMP_GT);
        cv::bitwise_and(mask, non_plateau_mask, mask);
    }
}