Commit 8507468

changes before merge:
- move resize to demo
- fix other issues

1 parent 9896009 commit 8507468

File tree: 12 files changed (+177, −335 lines)

models/object_detection_yolox/README.md

Lines changed: 39 additions & 187 deletions
@@ -8,12 +8,39 @@ Key features of the YOLOX object detector
 - **SimOTA advanced label assignment strategy** reduces training time and avoids additional solver hyperparameters
 - **Strong data augmentations like MixUp and Mosaic** to boost YOLOX performance

-#### Model metrics:
-Average Precision and Recall values observed for COCO dataset classes are showed below
+Note:
+- This version of YOLOX: YOLOX_S

+## Demo
+
+Run the following command to try the demo:
+```shell
+# detect on camera input
+python demo.py
+# detect on an image
+python demo.py --input /path/to/image
+```
+Note:
+- the image result is saved as "result.jpg"
+
+
+## Results
+
+Here are some of the sample results that were observed using the model (**yolox_s.onnx**):
+
+![1_res.jpg](./samples/1_res.jpg)
+![2_res.jpg](./samples/2_res.jpg)
+![3_res.jpg](./samples/3_res.jpg)
+
+<!--
+Video inference result:
+![WebCamR.gif](./examples/results/WebCamR.gif)
+-->
+
+## Model metrics:
+
+The model is evaluated on [COCO 2017 val](https://cocodataset.org/#download). Results are shown below:

-##### YOLOX_S:
-Average forward time: 5.53 ms, Average NMS time: 1.71 ms, Average inference time: 7.25 ms
 <table>
 <tr><th>Average Precision </th><th>Average Recall</th></tr>
 <tr><td>
@@ -39,126 +66,6 @@ Average forward time: 5.53 ms, Average NMS time: 1.71 ms, Average inference time
 | large | 0.50:0.95 | 0.724 |
 </td></tr> </table>

-
-##### YOLOX_tiny:
-Average forward time: 2.07 ms, Average NMS time: 1.71 ms, Average inference time: 3.79 ms
-<table>
-<tr><th>Average Precision </th><th>Average Recall</th></tr>
-<tr><td>
-
-| area | IoU | Average Precision(AP) |
-|:-------|:------|:------------------------|
-| all | 0.50:0.95 | 0.328 |
-| all | 0.50 | 0.504 |
-| all | 0.75 | 0.346 |
-| small | 0.50:0.95 | 0.139 |
-| medium | 0.50:0.95 | 0.360 |
-| large | 0.50:0.95 | 0.501 |
-
-</td><td>
-
-| area | IoU | Average Recall(AR) |
-|:-------|:------|:----------------|
-| all | 0.50:0.95 | 0.283 |
-| all | 0.50:0.95 | 0.450 |
-| all | 0.50:0.95 | 0.485 |
-| small | 0.50:0.95 | 0.226 |
-| medium | 0.50:0.95 | 0.550 |
-| large | 0.50:0.95 | 0.687 |
-</td></tr> </table>
-
-
-##### YOLOX_nano:
-Average forward time: 1.68 ms, Average NMS time: 1.64 ms, Average inference time: 3.31 ms
-<table>
-<tr><th>Average Precision </th><th>Average Recall</th></tr>
-<tr><td>
-
-| area | IoU | Average Precision(AP) |
-|:-------|:------|:------------------------|
-| all | 0.50:0.95 | 0.258 |
-| all | 0.50 | 0.414 |
-| all | 0.75 | 0.268 |
-| small | 0.50:0.95 | 0.082 |
-| medium | 0.50:0.95 | 0.275 |
-| large | 0.50:0.95 | 0.410 |
-
-</td><td>
-
-| area | IoU | Average Recall(AR) |
-|:-------|:------|:----------------|
-| all | 0.50:0.95 | 0.241 |
-| all | 0.50:0.95 | 0.384 |
-| all | 0.50:0.95 | 0.420 |
-| small | 0.50:0.95 | 0.157 |
-| medium | 0.50:0.95 | 0.473 |
-| large | 0.50:0.95 | 0.631 |
-</td></tr> </table>
-
-
-## Demo
-
-Run the following command to try the demo:
-```shell
-# Nanodet inference on image input
-python demo.py --model /path/to/model/ --input_type image --image_path /path/to/image/
-
-# Nanodet inference on video input
-python demo.py --model /path/to/model/ --input_type video
-
-#Saving outputs
-#Image output
-python demo.py --model /path/to/model/ --input_type image --image_path /path/to/image/ --save True
-
-#Video output
-python demo.py --model /path/to/model/ --input_type video --save True
-
-other parameters
---confidence: Confidence values of the predictions (default: 0.5)
---nms: NMS threshold value for predictions (default: 0.5)
---obj: Object threshold value (default: 0.5)
-```
-Note:
-- By default input_type: image
-- image result saved as "result.jpg"
-- webcam result saved as "Webcam_result.mp4"
-
-
-## Results
-
-Here are some of the sample results that were observed using the model (**yolox_s.onnx**),
-
-<p float="left">
-<img src="./examples/results/result1.jpg" width="450" height="450">
-<img src="./examples/results/result2.jpg" width="450" height="450">
-</p>
-
-Video inference result,
-<p align="center">
-<img src="https://github.com/Sidd1609/opencv_zoo/blob/master/models/object_detection_nanodet/examples/results/WebCamR.gif" width="650" height="450">
-</p>
-
-
-## License
-
-All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
-
-
-## Reference
-
-- YOLOX article: https://arxiv.org/abs/2107.08430
-- YOLOX weight and scripts for training: https://github.com/Megvii-BaseDetection/YOLOX
-- YOLOX blog: https://arshren.medium.com/yolox-new-improved-yolo-d430c0e4cf20
-- YOLOX-lite: https://github.com/TexasInstruments/edgeai-yolox
-
-
-#### Note:
-
-- In this repo we have used the following versions of YOLOX: YOLOX_S, YOLOX_tiny, YOLOX_nano
-- The model was trained on COCO 2017 dataset, link to dataset: https://cocodataset.org/#download
-- Below, we have per class AP results on COCO dataset for the models YOLOX_S, YOLOX_tiny, YOLOX_nano respectively
-
-##### YOLOX_S
 | class | AP | class | AP | class | AP |
 |:--------------|:-------|:-------------|:-------|:---------------|:-------|
 | person | 54.109 | bicycle | 31.580 | car | 40.447 |
@@ -189,70 +96,9 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 | vase | 37.013 | scissors | 26.307 | teddy bear | 45.676 |
 | hair drier | 7.255 | toothbrush | 19.374 | | |

+## License

-##### YOLOX_tiny
-| class | AP | class | AP | class | AP |
-|:--------------|:-------|:-------------|:-------|:---------------|:-------|
-| person | 45.685 | bicycle | 22.797 | car | 29.265 |
-| motorcycle | 37.980 | airplane | 59.446 | bus | 54.878 |
-| train | 62.459 | truck | 26.850 | boat | 16.724 |
-| traffic light | 17.527 | fire hydrant | 55.155 | stop sign | 57.120 |
-| parking meter | 37.755 | bench | 17.656 | bird | 24.382 |
-| cat | 55.792 | dog | 50.964 | horse | 49.806 |
-| sheep | 39.894 | cow | 42.855 | elephant | 58.863 |
-| bear | 62.345 | zebra | 58.389 | giraffe | 62.362 |
-| backpack | 8.131 | umbrella | 33.650 | handbag | 7.777 |
-| tie | 21.907 | suitcase | 25.593 | frisbee | 48.975 |
-| skis | 16.941 | snowboard | 19.409 | sports ball | 30.718 |
-| kite | 33.956 | baseball bat | 17.912 | baseball glove | 28.793 |
-| skateboard | 38.253 | surfboard | 28.329 | tennis racket | 33.240 |
-| bottle | 23.872 | wine glass | 20.386 | cup | 26.962 |
-| fork | 21.025 | knife | 8.434 | spoon | 6.513 |
-| bowl | 34.706 | banana | 24.050 | apple | 12.745 |
-| sandwich | 28.046 | orange | 24.216 | broccoli | 18.579 |
-| carrot | 16.283 | hot dog | 30.058 | pizza | 44.371 |
-| donut | 35.957 | cake | 29.765 | chair | 22.070 |
-| couch | 41.221 | potted plant | 19.856 | bed | 44.173 |
-| dining table | 29.000 | toilet | 60.369 | tv | 49.868 |
-| laptop | 48.858 | mouse | 47.843 | remote | 14.349 |
-| keyboard | 42.412 | cell phone | 23.536 | microwave | 51.839 |
-| oven | 32.384 | toaster | 24.209 | sink | 32.607 |
-| refrigerator | 50.156 | book | 9.534 | clock | 41.661 |
-| vase | 25.548 | scissors | 17.612 | teddy bear | 39.375 |
-| hair drier | 0.000 | toothbrush | 9.933 | | |
-
-
-##### YOLOX_nano
-| class | AP | class | AP | class | AP |
-|:--------------|:-------|:-------------|:-------|:---------------|:-------|
-| person | 38.444 | bicycle | 16.922 | car | 21.708 |
-| motorcycle | 30.753 | airplane | 47.573 | bus | 49.651 |
-| train | 55.302 | truck | 20.294 | boat | 11.919 |
-| traffic light | 12.026 | fire hydrant | 48.798 | stop sign | 52.446 |
-| parking meter | 33.439 | bench | 13.565 | bird | 16.520 |
-| cat | 42.603 | dog | 43.831 | horse | 37.338 |
-| sheep | 27.807 | cow | 33.155 | elephant | 52.374 |
-| bear | 49.737 | zebra | 52.259 | giraffe | 56.445 |
-| backpack | 5.456 | umbrella | 25.288 | handbag | 2.802 |
-| tie | 17.110 | suitcase | 17.757 | frisbee | 40.878 |
-| skis | 13.245 | snowboard | 11.443 | sports ball | 22.310 |
-| kite | 28.107 | baseball bat | 10.295 | baseball glove | 20.294 |
-| skateboard | 28.285 | surfboard | 19.142 | tennis racket | 25.253 |
-| bottle | 15.064 | wine glass | 13.412 | cup | 19.357 |
-| fork | 13.384 | knife | 4.276 | spoon | 3.460 |
-| bowl | 26.615 | banana | 18.067 | apple | 9.672 |
-| sandwich | 22.817 | orange | 23.574 | broccoli | 14.710 |
-| carrot | 10.180 | hot dog | 18.646 | pizza | 38.244 |
-| donut | 24.204 | cake | 21.330 | chair | 14.644 |
-| couch | 33.018 | potted plant | 13.252 | bed | 38.034 |
-| dining table | 24.287 | toilet | 52.986 | tv | 44.978 |
-| laptop | 44.130 | mouse | 35.173 | remote | 7.349 |
-| keyboard | 33.903 | cell phone | 19.140 | microwave | 38.800 |
-| oven | 25.890 | toaster | 10.665 | sink | 23.293 |
-| refrigerator | 42.697 | book | 6.942 | clock | 35.254 |
-| vase | 18.742 | scissors | 11.866 | teddy bear | 30.907 |
-| hair drier | 0.000 | toothbrush | 7.284 | | |
-
+All files in this directory are licensed under [Apache 2.0 License](./LICENSE).

 #### Contributor Details

@@ -262,3 +108,9 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 - Organisation: OpenCV
 - Project: Lightweight object detection models using OpenCV

+## Reference
+
+- YOLOX article: https://arxiv.org/abs/2107.08430
+- YOLOX weight and scripts for training: https://github.com/Megvii-BaseDetection/YOLOX
+- YOLOX blog: https://arshren.medium.com/yolox-new-improved-yolo-d430c0e4cf20
+- YOLOX-lite: https://github.com/TexasInstruments/edgeai-yolox
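The new Demo section above drives everything through demo.py, which is not part of this excerpt. For calling the model class directly, here is a minimal sketch under stated assumptions: the module name `yolox`, the weights filename, and the plain square resize (the demo presumably letterboxes instead; see the sketch after the preprocess hunk below) are ours, not from this diff:

```python
import cv2
import numpy as np

from yolox import YoloX  # assumed module name for the file changed below

model = YoloX(modelPath='yolox_s.onnx')  # hypothetical weights path
img = cv2.imread('/path/to/image')
# infer() now expects an image already resized to the 640x640 input size
img = cv2.resize(img, model.input_size).astype(np.float32)
dets = model.infer(img)  # rows of [x1, y1, x2, y2, score, class_id], or an empty array
```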
Lines changed: 38 additions & 59 deletions
@@ -1,8 +1,8 @@
-import cv2
 import numpy as np
+import cv2

-class YoloX():
-    def __init__(self, modelPath, confThreshold=0.35, nmsThreshold=0.5, objThreshold=0.5):
+class YoloX:
+    def __init__(self, modelPath, confThreshold=0.35, nmsThreshold=0.5, objThreshold=0.5, backendId=0, targetId=0):
         self.num_classes = 80
         self.net = cv2.dnn.readNet(modelPath)
         self.input_size = (640, 640)
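The constructor now accepts backendId and targetId and forwards them to setPreferableBackend/setPreferableTarget (next hunk). A minimal sketch of selecting a backend/target pair with the standard cv2.dnn constants; the weights filename is a placeholder:

```python
import cv2

# the defaults (0, 0) correspond to cv2.dnn.DNN_BACKEND_DEFAULT and cv2.dnn.DNN_TARGET_CPU
model = YoloX('yolox_s.onnx',  # hypothetical weights path
              backendId=cv2.dnn.DNN_BACKEND_OPENCV,
              targetId=cv2.dnn.DNN_TARGET_CPU)
model.setTarget(cv2.dnn.DNN_TARGET_OPENCL)  # or switch targets after construction
```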
@@ -12,29 +12,37 @@ def __init__(self, modelPath, confThreshold=0.35, nmsThreshold=0.5, objThreshold
         self.confThreshold = confThreshold
         self.nmsThreshold = nmsThreshold
         self.objThreshold = objThreshold
+        self.backendId = backendId
+        self.targetId = targetId
+        self.net.setPreferableBackend(self.backendId)
+        self.net.setPreferableTarget(self.targetId)
+
+    @property
+    def name(self):
+        return self.__class__.__name__
+
+    def setBackend(self, backendId):
+        self.backendId = backendId
+        self.net.setPreferableBackend(self.backendId)
+
+    def setTarget(self, targetId):
+        self.targetId = targetId
+        self.net.setPreferableTarget(self.targetId)

     def preprocess(self, img):
-        padded_img = np.ones((self.input_size[0], self.input_size[1], 3)) * 114.0
-        ratio = min(self.input_size[0] / img.shape[0], self.input_size[1] / img.shape[1])
-        resized_img = cv2.resize(
-            img, (int(img.shape[1] * ratio), int(img.shape[0] * ratio)), interpolation=cv2.INTER_LINEAR
-        ).astype(np.float32)
-        padded_img[: int(img.shape[0] * ratio), : int(img.shape[1] * ratio)] = resized_img
-        image = padded_img
-
-        image = image.astype(np.float32)
-        image = image[:, :, ::-1]
-        return image, ratio
+        blob = np.transpose(img, (2, 0, 1))
+        return blob[np.newaxis, :, :, :]

     def infer(self, srcimg):
-        img, ratio = self.preprocess(srcimg)
-        blob = cv2.dnn.blobFromImage(img)
-        self.net.setInput(blob)
+        input_blob = self.preprocess(srcimg)
+
+        self.net.setInput(input_blob)
         outs = self.net.forward(self.net.getUnconnectedOutLayersNames())
-        predictions = self.postprocess(outs[0], ratio)
+
+        predictions = self.postprocess(outs[0])
         return predictions

-    def postprocess(self, outputs, ratio):
+    def postprocess(self, outputs):
         grids = []
         expanded_strides = []
         hsizes = [self.input_size[0] // stride for stride in self.strides]
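preprocess() is now just an HWC-to-NCHW transpose, so the aspect-preserving resize that used to live here must happen in the caller; per the commit message it moved into the demo, which is not shown in this excerpt. A sketch of such a helper, mirroring the deleted code above (the name letterbox is ours):

```python
import cv2
import numpy as np

def letterbox(img, input_size=(640, 640)):
    # pad to input_size with YOLOX's gray fill value 114, preserving aspect ratio
    padded = np.ones((input_size[0], input_size[1], 3), dtype=np.float32) * 114.0
    ratio = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
    resized = cv2.resize(img, (int(img.shape[1] * ratio), int(img.shape[0] * ratio)),
                         interpolation=cv2.INTER_LINEAR).astype(np.float32)
    padded[:int(img.shape[0] * ratio), :int(img.shape[1] * ratio)] = resized
    # keep ratio: postprocess() no longer divides boxes by it (the next hunk removes
    # `boxes_xyxy /= ratio`), so the caller must scale detections back by 1 / ratio
    return padded, ratio
```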
@@ -62,53 +70,24 @@ def postprocess(self, outputs, ratio):
         boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
         boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
         boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
-        boxes_xyxy /= ratio

+        # multi-class nms
         final_dets = []
-        num_classes = scores.shape[1]
-
-        for cls_ind in range(num_classes):
+        for cls_ind in range(scores.shape[1]):
             cls_scores = scores[:, cls_ind]
             valid_score_mask = cls_scores > self.confThreshold
-
             if valid_score_mask.sum() == 0:
                 continue
-
             else:
-                valid_scores = cls_scores[valid_score_mask]
-                valid_boxes = boxes_xyxy[valid_score_mask]
-
-                keep = []
-                x1 = valid_boxes[:, 0]
-                y1 = valid_boxes[:, 1]
-                x2 = valid_boxes[:, 2]
-                y2 = valid_boxes[:, 3]
-
-                areas = (x2 - x1 + 1) * (y2 - y1 + 1)
-                order = valid_scores.argsort()[::-1]
-
-                while order.size > 0:
-                    i = order[0]
-                    keep.append(i)
-                    xx1 = np.maximum(x1[i], x1[order[1:]])
-                    yy1 = np.maximum(y1[i], y1[order[1:]])
-                    xx2 = np.minimum(x2[i], x2[order[1:]])
-                    yy2 = np.minimum(y2[i], y2[order[1:]])
-
-                    w = np.maximum(0.0, xx2 - xx1 + 1)
-                    h = np.maximum(0.0, yy2 - yy1 + 1)
-                    inter = w * h
-                    ovr = inter / (areas[i] + areas[order[1:]] - inter)
-                    inds = np.where(ovr <= self.nmsThreshold)[0]
-                    order = order[inds + 1]
-                if len(keep) > 0:
-                    cls_inds = np.ones((len(keep), 1)) * cls_ind
-                    dets = np.concatenate([valid_boxes[keep], valid_scores[keep, None], cls_inds], 1)
-                    final_dets.append(dets)
-
-        res_dets = np.concatenate(final_dets, 0)
+                # call nms
+                indices = cv2.dnn.NMSBoxes(boxes_xyxy.tolist(), cls_scores.tolist(), self.confThreshold, self.nmsThreshold)
+
+                classids_ = np.ones((len(indices), 1)) * cls_ind
+                final_dets.append(
+                    np.concatenate([boxes_xyxy[indices], cls_scores[indices, None], classids_], axis=1)
+                )

         if len(final_dets) == 0:
-            res_dets = np.array([])
+            return np.array([])

-        return res_dets
+        return np.concatenate(final_dets, 0)
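The hand-rolled IoU loop above is replaced by cv2.dnn.NMSBoxes, which takes per-class boxes and scores plus the confidence and NMS thresholds and returns the indices of the boxes to keep. A standalone toy example; note that OpenCV documents the boxes argument as (x, y, w, h) rectangles:

```python
import cv2

# two heavily overlapping boxes and one distant box, each as (x, y, w, h)
boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [300, 300, 50, 50]]
scores = [0.9, 0.8, 0.7]
keep = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=0.5, nms_threshold=0.5)
print(keep)  # indices of surviving boxes; the lower-scoring overlapping box is dropped
```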
