How to get the coordinates of the bounding box in YOLO object detection?

If you are going to implement this in Python, there is a small Python wrapper that I created (the pyyolo module used below). Follow its README to install it; installation is straightforward.

After that, follow the example code to see how to detect objects.
If your detection is det, the bounding box values are:

top_left_x = det.bbox.x
top_left_y = det.bbox.y
width = det.bbox.w
height = det.bbox.h

If you need, you can get the midpoint by:

mid_x, mid_y = det.bbox.get_point(pyyolo.BBox.Location.MID)
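
As a rough sketch of what those four values mean, the other corners and the midpoint follow from simple arithmetic. The Box class below is a hypothetical stand-in for det.bbox (it is not part of pyyolo); it only assumes the x, y, w and h fields shown above, with x and y being the top-left corner.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for det.bbox: top-left corner plus size."""
    x: int
    y: int
    w: int
    h: int


def corners_and_midpoint(box):
    """Return ((left, top, right, bottom), (mid_x, mid_y)) for a box."""
    right = box.x + box.w
    bottom = box.y + box.h
    mid_x = box.x + box.w / 2
    mid_y = box.y + box.h / 2
    return (box.x, box.y, right, bottom), (mid_x, mid_y)


# Made-up box: top-left at (10, 20), 100 wide, 50 high.
corners, mid = corners_and_midpoint(Box(x=10, y=20, w=100, h=50))
print(corners)  # (10, 20, 110, 70)
print(mid)      # (60.0, 45.0)
```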

Hope this helps.


For Python users on Windows:

First, do some setup:

  1. Set a PYTHONPATH environment variable that points to your darknet folder:

    PYTHONPATH = 'YOUR DARKNET FOLDER'

  2. Add PYTHONPATH to the Path value by appending:

    %PYTHONPATH%

  3. Edit the coco.data file in the cfg folder, changing the names variable to the path of your coco.names file. In my case:

    names = D:/core/darknetAB/data/coco.names

With these settings, you can import darknet.py (from the AlexeyAB/darknet repository) as a Python module from any folder.
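
If you would rather not edit the environment variables, a possible alternative is to add the folder to sys.path at runtime, before the import. This is only a sketch: add_module_folder is a hypothetical helper name, and the path in the usage comment is the example folder from step 3, so substitute your own darknet folder.

```python
import os
import sys


def add_module_folder(folder):
    """Put `folder` on sys.path so modules inside it can be imported.

    Returns True if the folder exists (and is now on sys.path),
    False otherwise.
    """
    folder = os.path.abspath(folder)
    if not os.path.isdir(folder):
        return False
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return True


# Usage (assumed location; change it to your own darknet folder):
# if add_module_folder('D:/core/darknetAB'):
#     from darknet import performDetect
```

Unlike the environment variable, this only affects the running script, so it has to be repeated in every script that imports darknet.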

Start scripting:

from darknet import performDetect as scan  # import the performDetect function from darknet.py

def detect(image_path):
    ''' Use this if you only want the bounding box coordinates. '''
    cfg = 'D:/core/darknetAB/cfg/yolov3.cfg'  # change this to use a different config
    coco = 'D:/core/darknetAB/cfg/coco.data'  # you can change this too
    weights = 'D:/core/darknetAB/yolov3.weights'  # and this
    # showImage=False: only fetch the results, without producing an image, for better performance
    test = scan(imagePath=image_path, thresh=0.25, configPath=cfg, weightPath=weights, metaPath=coco, showImage=False, makeImageOnly=False, initOnly=False)

    # At this point you have the data in AlexeyAB's default format, as explained in the module.
    # Try help(scan); the result format is:
    # [(item_name, confidence_rate, (x_center_image, y_center_image, width_size_box, height_size_of_box))]
    # To convert it to the corner-based form commonly used by PIL/OpenCV, do the following:

    if not test:
        return False  # nothing was detected

    newdata = []
    for item, confidence_rate, imagedata in test:
        x1, y1, w_size, h_size = imagedata
        x_start = round(x1 - (w_size/2))
        y_start = round(y1 - (h_size/2))
        x_end = round(x_start + w_size)
        y_end = round(y_start + h_size)
        newdata.append((item, confidence_rate, (x_start, y_start, x_end, y_end), w_size, h_size))

    return newdata
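
The conversion step can also be checked on its own, without running darknet. The sketch below pulls the same arithmetic into a standalone function (center_to_corners is a hypothetical name, not part of darknet.py) and runs it on made-up numbers.

```python
def center_to_corners(x_center, y_center, width, height):
    """Convert a center-based box to (x_start, y_start, x_end, y_end)."""
    x_start = round(x_center - width / 2)
    y_start = round(y_center - height / 2)
    x_end = round(x_start + width)
    y_end = round(y_start + height)
    return x_start, y_start, x_end, y_end


# Made-up detection: center (100, 80), 40 wide, 20 high.
print(center_to_corners(100.0, 80.0, 40.0, 20.0))  # (80, 70, 120, 90)
```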

How to use it:

table = 'D:/test/image/test1.jpg'
checking = detect(table)

To get the coordinates:

If there is only one result:

x1, y1, x2, y2 = checking[0][2]

If there are many results:

for x in checking:
    item = x[0]
    x1, y1, x2, y2 = x[2]
    print(item)
    print(x1, y1, x2, y2)
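
Because the corners are derived from a center point and a size, a box near the border can end up with coordinates outside the image (for example a negative x1). If you plan to crop or draw with these values, a small clamp may help. This is a sketch: clamp_box is a hypothetical helper, and the image width and height must come from your own image.

```python
def clamp_box(box, image_width, image_height):
    """Clamp (x1, y1, x2, y2) so every corner lies inside the image."""
    x1, y1, x2, y2 = box
    x1 = max(0, min(x1, image_width - 1))
    y1 = max(0, min(y1, image_height - 1))
    x2 = max(0, min(x2, image_width - 1))
    y2 = max(0, min(y2, image_height - 1))
    return x1, y1, x2, y2


# Made-up box that spills over the left and bottom edges of a 640x480 image.
print(clamp_box((-12, 400, 90, 510), 640, 480))  # (0, 400, 90, 479)
```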

A quick solution is to modify the image.c file to print out the bounding box information:

...
if(bot > im.h-1) bot = im.h-1;

// Print bounding box values 
printf("Bounding Box: Left=%d, Top=%d, Right=%d, Bottom=%d\n", left, top, right, bot); 
draw_box_width(im, left, top, right, bot, width, red, green, blue);
...
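
Once darknet prints those lines, another program can pick them up from its output. As a sketch, the following Python snippet parses text in exactly the printf format above; parse_boxes is a hypothetical helper, and the sample text is made up rather than real darknet output.

```python
import re

# Matches the line produced by the printf format string shown above.
BOX_PATTERN = re.compile(
    r"Bounding Box: Left=(-?\d+), Top=(-?\d+), Right=(-?\d+), Bottom=(-?\d+)"
)


def parse_boxes(output):
    """Return a list of (left, top, right, bottom) tuples found in `output`."""
    return [tuple(int(v) for v in m.groups()) for m in BOX_PATTERN.finditer(output)]


# Made-up sample text; lines that do not match the pattern are ignored.
sample = (
    "dog: 99%\n"
    "Bounding Box: Left=134, Top=219, Right=312, Bottom=540\n"
    "bicycle: 98%\n"
    "Bounding Box: Left=118, Top=126, Right=567, Bottom=427\n"
)
print(parse_boxes(sample))
# [(134, 219, 312, 540), (118, 126, 567, 427)]
```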