TinyYolov2App

Copyright (c) organization

Author

btran

Functions

int main(int argc, char *argv[])

The following steps outline in detail what the code in this .cpp file does.

  1. Checks whether the number of command-line arguments is exactly 3 (the program name plus two arguments). If it is not, prints a verbose error message and exits with failure.

  2. Stores the first command-line argument as the file path to the referenced ONNX model and the second as the file path to the input image.

  3. Reads in the input image using OpenCV.

  4. Instantiates a TinyYolov2 object and initializes it with the number of classes in the custom FACE_CLASSES list (the ONNX model is not known in advance) and the file path to the referenced ONNX model.

  5. Initializes the classNames in the class object with FACE_CLASSES, as defined in Constants.hpp.

  6. Initializes a float vector named dst, sized for 3 channels (the expected input is an RGB image), a fixed height of 416 pixels, and a fixed width of 416 pixels, matching the TinyYolov2 input format used in step 7a.

  7. Calls the processOneFrame function, which is defined in this same file, and receives the detection result in the form of an image. processOneFrame performs the following sub-steps:

    a. Resizes the input RGB image proportionally to the fixed 416 by 416 pixel input format for TinyYolov2.

    b. Calls the preprocess function to convert the resized input image matrix to a 1-dimensional float array.

    c. Runs the inference with the 1-dimensional float array.

    d. Extracts the number of anchors and the number of attributes from the inference output, storing them in the numAnchors and numAttrs variables.

    e. Converts the inference output into separate vectors that hold the bounding boxes, their corresponding scores, and their class indices. Any bounding box detection that falls below the default confidence threshold of 0.5, which is pre-defined in the auxiliary function call for processOneFrame, is filtered out.

    f. If the number of bounding boxes in the inference output is zero, returns the original input image.

    g. Performs Non-Maximum Suppression on the separated vectors, filtering out bounding boxes together with their corresponding confidence scores and class indices based on an NMS threshold of 0.6. This value is defined in the auxiliary function call for processOneFrame.

    h. Stores the filtered results in the afterNmsBboxes and afterNmsIndices variables.

    i. Calls the visualizeOneImage function, which is defined in examples/Utilty.hpp and returns an output image with every bounding box drawn together with its class label and confidence score.

  8. Writes the output detection result to an image file named result.jpg.
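
Step 7b converts the resized image matrix to a 1-dimensional float array. The sketch below shows one common way such a conversion can be done, using only the standard library. It is not the project's preprocess function: the name hwcToChw is hypothetical, and the planar channel-first output layout and the absence of any scaling or mean subtraction are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the project's preprocess()): copies an interleaved
// 8-bit image laid out as height x width x channels into a planar float
// buffer laid out as channels x height x width. Pixel values are only cast to
// float; no normalization is applied (an assumption).
std::vector<float> hwcToChw(const std::vector<std::uint8_t>& src,
                            std::size_t height,
                            std::size_t width,
                            std::size_t channels)
{
    assert(src.size() == height * width * channels);

    std::vector<float> dst(channels * height * width);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            for (std::size_t c = 0; c < channels; ++c) {
                // Source index walks pixel by pixel; destination index walks
                // plane by plane.
                dst[c * height * width + y * width + x] =
                    static_cast<float>(src[(y * width + x) * channels + c]);
            }
        }
    }
    return dst;
}
```

For a 1 by 2 image with interleaved values {1, 2, 3, 4, 5, 6}, this returns {1, 4, 2, 5, 3, 6}: all values of the first channel, then the second, then the third.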

Variables

constexpr const float CONFIDENCE_THRESHOLD = 0.5
constexpr const float NMS_THRESHOLD = 0.6
const std::vector<cv::Scalar> COLORS = toCvScalarColors(Ort::VOC_COLOR_CHART)
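
The two thresholds above play different roles: CONFIDENCE_THRESHOLD discards weak detections one by one, while NMS_THRESHOLD is an overlap (IoU) limit used to drop near-duplicate boxes. The following is a minimal, standard-library-only sketch of a generic greedy Non-Maximum Suppression under stated assumptions. The Box struct and the iou and filterAndNms functions are hypothetical names, the suppression is class-agnostic for simplicity, and none of this is taken from the project's own NMS implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr float CONFIDENCE_THRESHOLD = 0.5f;
constexpr float NMS_THRESHOLD = 0.6f;

// Hypothetical box type: corner coordinates in pixels.
struct Box {
    float xmin, ymin, xmax, ymax;
};

// Intersection over union of two axis-aligned boxes.
float iou(const Box& a, const Box& b)
{
    const float w = std::max(0.0f, std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin));
    const float h = std::max(0.0f, std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin));
    const float inter = w * h;
    const float areaA = (a.xmax - a.xmin) * (a.ymax - a.ymin);
    const float areaB = (b.xmax - b.xmin) * (b.ymax - b.ymin);
    const float uni = areaA + areaB - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Returns the indices of the boxes that survive both thresholds,
// ordered from highest to lowest score.
std::vector<std::size_t> filterAndNms(const std::vector<Box>& boxes,
                                      const std::vector<float>& scores,
                                      float confThresh = CONFIDENCE_THRESHOLD,
                                      float nmsThresh = NMS_THRESHOLD)
{
    assert(boxes.size() == scores.size());

    // 1. Keep only detections at or above the confidence threshold.
    std::vector<std::size_t> order;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        if (scores[i] >= confThresh) {
            order.push_back(i);
        }
    }

    // 2. Visit the remaining candidates from most to least confident.
    std::sort(order.begin(), order.end(),
              [&scores](std::size_t l, std::size_t r) { return scores[l] > scores[r]; });

    // 3. Keep a candidate unless it overlaps an already kept box too much.
    std::vector<std::size_t> kept;
    for (std::size_t idx : order) {
        bool suppressed = false;
        for (std::size_t k : kept) {
            if (iou(boxes[idx], boxes[k]) > nmsThresh) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) {
            kept.push_back(idx);
        }
    }
    return kept;
}
```

With boxes {0,0,10,10}, {1,1,11,11}, {20,20,30,30}, {0,0,10,10} and scores 0.9, 0.8, 0.7, 0.3, the last box is removed by the confidence threshold, the second is suppressed because its IoU with the first (81/119, about 0.68) exceeds 0.6, and indices 0 and 2 are returned.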