Commit 8507468

changes before merge:
- move resize to demo
- fix other issues

1 parent 9896009 commit 8507468

File tree: 12 files changed (+177, −335 lines)

models/object_detection_yolox/README.md

Lines changed: 39 additions & 187 deletions
@@ -8,12 +8,39 @@ Key features of the YOLOX object detector
 - **SimOTA advanced label assignment strategy** reduces training time and avoids additional solver hyperparameters
 - **Strong data augmentations like MixUp and Mosaic** to boost YOLOX performance

-#### Model metrics:
-Average Precision and Recall values observed for COCO dataset classes are showed below
+Note:
+- This version of YOLOX: YOLOX_S

+## Demo
+
+Run the following command to try the demo:
+```shell
+# detect on camera input
+python demo.py
+# detect on an image
+python demo.py --input /path/to/image
+```
+Note:
+- the image result is saved as "result.jpg"
+
+
+## Results
+
+Here are some of the sample results that were observed using the model (**yolox_s.onnx**):
+
+![1_res.jpg](./samples/1_res.jpg)
+![2_res.jpg](./samples/2_res.jpg)
+![3_res.jpg](./samples/3_res.jpg)
+
+<!--
+Video inference result:
+![WebCamR.gif](./examples/results/WebCamR.gif)
+-->
+
+## Model metrics:
+
+The model is evaluated on [COCO 2017 val](https://cocodataset.org/#download). Results are shown below:

-##### YOLOX_S:
-Average forward time: 5.53 ms, Average NMS time: 1.71 ms, Average inference time: 7.25 ms
 <table>
 <tr><th>Average Precision </th><th>Average Recall</th></tr>
 <tr><td>
@@ -39,126 +66,6 @@ Average forward time: 5.53 ms, Average NMS time: 1.71 ms, Average inference time
 | large | 0.50:0.95 | 0.724 |
 </td></tr> </table>

-
-##### YOLOX_tiny:
-Average forward time: 2.07 ms, Average NMS time: 1.71 ms, Average inference time: 3.79 ms
-<table>
-<tr><th>Average Precision </th><th>Average Recall</th></tr>
-<tr><td>
-
-| area | IoU | Average Precision(AP) |
-|:-------|:------|:------------------------|
-| all | 0.50:0.95 | 0.328 |
-| all | 0.50 | 0.504 |
-| all | 0.75 | 0.346 |
-| small | 0.50:0.95 | 0.139 |
-| medium | 0.50:0.95 | 0.360 |
-| large | 0.50:0.95 | 0.501 |
-
-</td><td>
-
-| area | IoU | Average Recall(AR) |
-|:-------|:------|:----------------|
-| all | 0.50:0.95 | 0.283 |
-| all | 0.50:0.95 | 0.450 |
-| all | 0.50:0.95 | 0.485 |
-| small | 0.50:0.95 | 0.226 |
-| medium | 0.50:0.95 | 0.550 |
-| large | 0.50:0.95 | 0.687 |
-</td></tr> </table>
-
-
-##### YOLOX_nano:
-Average forward time: 1.68 ms, Average NMS time: 1.64 ms, Average inference time: 3.31 ms
-<table>
-<tr><th>Average Precision </th><th>Average Recall</th></tr>
-<tr><td>
-
-| area | IoU | Average Precision(AP) |
-|:-------|:------|:------------------------|
-| all | 0.50:0.95 | 0.258 |
-| all | 0.50 | 0.414 |
-| all | 0.75 | 0.268 |
-| small | 0.50:0.95 | 0.082 |
-| medium | 0.50:0.95 | 0.275 |
-| large | 0.50:0.95 | 0.410 |
-
-</td><td>
-
-| area | IoU | Average Recall(AR) |
-|:-------|:------|:----------------|
-| all | 0.50:0.95 | 0.241 |
-| all | 0.50:0.95 | 0.384 |
-| all | 0.50:0.95 | 0.420 |
-| small | 0.50:0.95 | 0.157 |
-| medium | 0.50:0.95 | 0.473 |
-| large | 0.50:0.95 | 0.631 |
-</td></tr> </table>
-
-
-## Demo
-
-Run the following command to try the demo:
-```shell
-# Nanodet inference on image input
-python demo.py --model /path/to/model/ --input_type image --image_path /path/to/image/
-
-# Nanodet inference on video input
-python demo.py --model /path/to/model/ --input_type video
-
-#Saving outputs
-#Image output
-python demo.py --model /path/to/model/ --input_type image --image_path /path/to/image/ --save True
-
-#Video output
-python demo.py --model /path/to/model/ --input_type video --save True
-
-other parameters
---confidence: Confidence values of the predictions (default: 0.5)
---nms: NMS threshold value for predictions (default: 0.5)
---obj: Object threshold value (default: 0.5)
-```
-Note:
-- By default input_type: image
-- image result saved as "result.jpg"
-- webcam result saved as "Webcam_result.mp4"
-
-
-## Results
-
-Here are some of the sample results that were observed using the model (**yolox_s.onnx**),
-
-<p float="left">
-<img src="./examples/results/result1.jpg" width="450" height="450">
-<img src="./examples/results/result2.jpg" width="450" height="450">
-</p>
-
-Video inference result,
-<p align="center">
-<img src="https://github.com/Sidd1609/opencv_zoo/blob/master/models/object_detection_nanodet/examples/results/WebCamR.gif" width="650" height="450">
-</p>
-
-
-## License
-
-All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
-
-
-## Reference
-
-- YOLOX article: https://arxiv.org/abs/2107.08430
-- YOLOX weight and scripts for training: https://github.com/Megvii-BaseDetection/YOLOX
-- YOLOX blog: https://arshren.medium.com/yolox-new-improved-yolo-d430c0e4cf20
-- YOLOX-lite: https://github.com/TexasInstruments/edgeai-yolox
-
-
-#### Note:
-
-- In this repo we have used the following versions of YOLOX: YOLOX_S, YOLOX_tiny, YOLOX_nano
-- The model was trained on COCO 2017 dataset, link to dataset: https://cocodataset.org/#download
-- Below, we have per class AP results on COCO dataset for the models YOLOX_S, YOLOX_tiny, YOLOX_nano respectively
-
-##### YOLOX_S
 | class | AP | class | AP | class | AP |
 |:--------------|:-------|:-------------|:-------|:---------------|:-------|
 | person | 54.109 | bicycle | 31.580 | car | 40.447 |
@@ -189,70 +96,9 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 | vase | 37.013 | scissors | 26.307 | teddy bear | 45.676 |
 | hair drier | 7.255 | toothbrush | 19.374 | | |

+## License

-##### YOLOX_tiny
-| class | AP | class | AP | class | AP |
-|:--------------|:-------|:-------------|:-------|:---------------|:-------|
-| person | 45.685 | bicycle | 22.797 | car | 29.265 |
-| motorcycle | 37.980 | airplane | 59.446 | bus | 54.878 |
-| train | 62.459 | truck | 26.850 | boat | 16.724 |
-| traffic light | 17.527 | fire hydrant | 55.155 | stop sign | 57.120 |
-| parking meter | 37.755 | bench | 17.656 | bird | 24.382 |
-| cat | 55.792 | dog | 50.964 | horse | 49.806 |
-| sheep | 39.894 | cow | 42.855 | elephant | 58.863 |
-| bear | 62.345 | zebra | 58.389 | giraffe | 62.362 |
-| backpack | 8.131 | umbrella | 33.650 | handbag | 7.777 |
-| tie | 21.907 | suitcase | 25.593 | frisbee | 48.975 |
-| skis | 16.941 | snowboard | 19.409 | sports ball | 30.718 |
-| kite | 33.956 | baseball bat | 17.912 | baseball glove | 28.793 |
-| skateboard | 38.253 | surfboard | 28.329 | tennis racket | 33.240 |
-| bottle | 23.872 | wine glass | 20.386 | cup | 26.962 |
-| fork | 21.025 | knife | 8.434 | spoon | 6.513 |
-| bowl | 34.706 | banana | 24.050 | apple | 12.745 |
-| sandwich | 28.046 | orange | 24.216 | broccoli | 18.579 |
-| carrot | 16.283 | hot dog | 30.058 | pizza | 44.371 |
-| donut | 35.957 | cake | 29.765 | chair | 22.070 |
-| couch | 41.221 | potted plant | 19.856 | bed | 44.173 |
-| dining table | 29.000 | toilet | 60.369 | tv | 49.868 |
-| laptop | 48.858 | mouse | 47.843 | remote | 14.349 |
-| keyboard | 42.412 | cell phone | 23.536 | microwave | 51.839 |
-| oven | 32.384 | toaster | 24.209 | sink | 32.607 |
-| refrigerator | 50.156 | book | 9.534 | clock | 41.661 |
-| vase | 25.548 | scissors | 17.612 | teddy bear | 39.375 |
-| hair drier | 0.000 | toothbrush | 9.933 | | |
-
-
-##### YOLOX_nano
-| class | AP | class | AP | class | AP |
-|:--------------|:-------|:-------------|:-------|:---------------|:-------|
-| person | 38.444 | bicycle | 16.922 | car | 21.708 |
-| motorcycle | 30.753 | airplane | 47.573 | bus | 49.651 |
-| train | 55.302 | truck | 20.294 | boat | 11.919 |
-| traffic light | 12.026 | fire hydrant | 48.798 | stop sign | 52.446 |
-| parking meter | 33.439 | bench | 13.565 | bird | 16.520 |
-| cat | 42.603 | dog | 43.831 | horse | 37.338 |
-| sheep | 27.807 | cow | 33.155 | elephant | 52.374 |
-| bear | 49.737 | zebra | 52.259 | giraffe | 56.445 |
-| backpack | 5.456 | umbrella | 25.288 | handbag | 2.802 |
-| tie | 17.110 | suitcase | 17.757 | frisbee | 40.878 |
-| skis | 13.245 | snowboard | 11.443 | sports ball | 22.310 |
-| kite | 28.107 | baseball bat | 10.295 | baseball glove | 20.294 |
-| skateboard | 28.285 | surfboard | 19.142 | tennis racket | 25.253 |
-| bottle | 15.064 | wine glass | 13.412 | cup | 19.357 |
-| fork | 13.384 | knife | 4.276 | spoon | 3.460 |
-| bowl | 26.615 | banana | 18.067 | apple | 9.672 |
-| sandwich | 22.817 | orange | 23.574 | broccoli | 14.710 |
-| carrot | 10.180 | hot dog | 18.646 | pizza | 38.244 |
-| donut | 24.204 | cake | 21.330 | chair | 14.644 |
-| couch | 33.018 | potted plant | 13.252 | bed | 38.034 |
-| dining table | 24.287 | toilet | 52.986 | tv | 44.978 |
-| laptop | 44.130 | mouse | 35.173 | remote | 7.349 |
-| keyboard | 33.903 | cell phone | 19.140 | microwave | 38.800 |
-| oven | 25.890 | toaster | 10.665 | sink | 23.293 |
-| refrigerator | 42.697 | book | 6.942 | clock | 35.254 |
-| vase | 18.742 | scissors | 11.866 | teddy bear | 30.907 |
-| hair drier | 0.000 | toothbrush | 7.284 | | |
-
+All files in this directory are licensed under [Apache 2.0 License](./LICENSE).

 #### Contributor Details

@@ -262,3 +108,9 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 - Organisation: OpenCV
 - Project: Lightweight object detection models using OpenCV

+## Reference
+
+- YOLOX article: https://arxiv.org/abs/2107.08430
+- YOLOX weight and scripts for training: https://github.com/Megvii-BaseDetection/YOLOX
+- YOLOX blog: https://arshren.medium.com/yolox-new-improved-yolo-d430c0e4cf20
+- YOLOX-lite: https://github.com/TexasInstruments/edgeai-yolox
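The new Demo section above drives everything through demo.py, which is not part of this excerpt. For calling the model class directly, here is a minimal sketch under stated assumptions: the module name `yolox`, the weights filename, and the plain square resize (the demo presumably letterboxes instead; see the sketch after the preprocess hunk below) are ours, not from this diff:

```python
import cv2
import numpy as np

from yolox import YoloX  # assumed module name for the file changed below

model = YoloX(modelPath='yolox_s.onnx')  # hypothetical weights path
img = cv2.imread('/path/to/image')
# infer() now expects an image already resized to the 640x640 input size
img = cv2.resize(img, model.input_size).astype(np.float32)
dets = model.infer(img)  # rows of [x1, y1, x2, y2, score, class_id], or an empty array
```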
Lines changed: 38 additions & 59 deletions
@@ -1,8 +1,8 @@
-import cv2
 import numpy as np
+import cv2

-class YoloX():
-    def __init__(self, modelPath, confThreshold=0.35, nmsThreshold=0.5, objThreshold=0.5):
+class YoloX:
+    def __init__(self, modelPath, confThreshold=0.35, nmsThreshold=0.5, objThreshold=0.5, backendId=0, targetId=0):
         self.num_classes = 80
         self.net = cv2.dnn.readNet(modelPath)
         self.input_size = (640, 640)
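The constructor now accepts backendId and targetId and forwards them to setPreferableBackend/setPreferableTarget (next hunk). A minimal sketch of selecting a backend/target pair with the standard cv2.dnn constants; the weights filename is a placeholder:

```python
import cv2

# the defaults (0, 0) correspond to cv2.dnn.DNN_BACKEND_DEFAULT and cv2.dnn.DNN_TARGET_CPU
model = YoloX('yolox_s.onnx',  # hypothetical weights path
              backendId=cv2.dnn.DNN_BACKEND_OPENCV,
              targetId=cv2.dnn.DNN_TARGET_CPU)
model.setTarget(cv2.dnn.DNN_TARGET_OPENCL)  # or switch targets after construction
```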
@@ -12,29 +12,37 @@ def __init__(self, modelPath, confThreshold=0.35, nmsThreshold=0.5, objThreshold
         self.confThreshold = confThreshold
         self.nmsThreshold = nmsThreshold
         self.objThreshold = objThreshold
+        self.backendId = backendId
+        self.targetId = targetId
+        self.net.setPreferableBackend(self.backendId)
+        self.net.setPreferableTarget(self.targetId)
+
+    @property
+    def name(self):
+        return self.__class__.__name__
+
+    def setBackend(self, backendId):
+        self.backendId = backendId
+        self.net.setPreferableBackend(self.backendId)
+
+    def setTarget(self, targetId):
+        self.targetId = targetId
+        self.net.setPreferableTarget(self.targetId)

     def preprocess(self, img):
-        padded_img = np.ones((self.input_size[0], self.input_size[1], 3)) * 114.0
-        ratio = min(self.input_size[0] / img.shape[0], self.input_size[1] / img.shape[1])
-        resized_img = cv2.resize(
-            img, (int(img.shape[1] * ratio), int(img.shape[0] * ratio)), interpolation=cv2.INTER_LINEAR
-        ).astype(np.float32)
-        padded_img[: int(img.shape[0] * ratio), : int(img.shape[1] * ratio)] = resized_img
-        image = padded_img
-
-        image = image.astype(np.float32)
-        image = image[:, :, ::-1]
-        return image, ratio
+        blob = np.transpose(img, (2, 0, 1))
+        return blob[np.newaxis, :, :, :]

     def infer(self, srcimg):
-        img, ratio = self.preprocess(srcimg)
-        blob = cv2.dnn.blobFromImage(img)
-        self.net.setInput(blob)
+        input_blob = self.preprocess(srcimg)
+
+        self.net.setInput(input_blob)
         outs = self.net.forward(self.net.getUnconnectedOutLayersNames())
-        predictions = self.postprocess(outs[0], ratio)
+
+        predictions = self.postprocess(outs[0])
         return predictions

-    def postprocess(self, outputs, ratio):
+    def postprocess(self, outputs):
         grids = []
         expanded_strides = []
         hsizes = [self.input_size[0] // stride for stride in self.strides]
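preprocess() is now just an HWC-to-NCHW transpose, so the aspect-preserving resize that used to live here must happen in the caller; per the commit message it moved into the demo, which is not shown in this excerpt. A sketch of such a helper, mirroring the deleted code above (the name letterbox is ours):

```python
import cv2
import numpy as np

def letterbox(img, input_size=(640, 640)):
    # pad to input_size with YOLOX's gray fill value 114, preserving aspect ratio
    padded = np.ones((input_size[0], input_size[1], 3), dtype=np.float32) * 114.0
    ratio = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
    resized = cv2.resize(img, (int(img.shape[1] * ratio), int(img.shape[0] * ratio)),
                         interpolation=cv2.INTER_LINEAR).astype(np.float32)
    padded[:int(img.shape[0] * ratio), :int(img.shape[1] * ratio)] = resized
    # keep ratio: postprocess() no longer divides boxes by it (the next hunk removes
    # `boxes_xyxy /= ratio`), so the caller must scale detections back by 1 / ratio
    return padded, ratio
```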
@@ -62,53 +70,24 @@ def postprocess(self, outputs, ratio):
         boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
         boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
         boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
-        boxes_xyxy /= ratio

+        # multi-class nms
         final_dets = []
-        num_classes = scores.shape[1]
-
-        for cls_ind in range(num_classes):
+        for cls_ind in range(scores.shape[1]):
             cls_scores = scores[:, cls_ind]
             valid_score_mask = cls_scores > self.confThreshold
-
             if valid_score_mask.sum() == 0:
                 continue
-
             else:
-                valid_scores = cls_scores[valid_score_mask]
-                valid_boxes = boxes_xyxy[valid_score_mask]
-
-                keep = []
-                x1 = valid_boxes[:, 0]
-                y1 = valid_boxes[:, 1]
-                x2 = valid_boxes[:, 2]
-                y2 = valid_boxes[:, 3]
-
-                areas = (x2 - x1 + 1) * (y2 - y1 + 1)
-                order = valid_scores.argsort()[::-1]
-
-                while order.size > 0:
-                    i = order[0]
-                    keep.append(i)
-                    xx1 = np.maximum(x1[i], x1[order[1:]])
-                    yy1 = np.maximum(y1[i], y1[order[1:]])
-                    xx2 = np.minimum(x2[i], x2[order[1:]])
-                    yy2 = np.minimum(y2[i], y2[order[1:]])
-
-                    w = np.maximum(0.0, xx2 - xx1 + 1)
-                    h = np.maximum(0.0, yy2 - yy1 + 1)
-                    inter = w * h
-                    ovr = inter / (areas[i] + areas[order[1:]] - inter)
-                    inds = np.where(ovr <= self.nmsThreshold)[0]
-                    order = order[inds + 1]
-                if len(keep) > 0:
-                    cls_inds = np.ones((len(keep), 1)) * cls_ind
-                    dets = np.concatenate([valid_boxes[keep], valid_scores[keep, None], cls_inds], 1)
-                    final_dets.append(dets)
-
-        res_dets = np.concatenate(final_dets, 0)
+                # call nms
+                indices = cv2.dnn.NMSBoxes(boxes_xyxy.tolist(), cls_scores.tolist(), self.confThreshold, self.nmsThreshold)
+
+                classids_ = np.ones((len(indices), 1)) * cls_ind
+                final_dets.append(
+                    np.concatenate([boxes_xyxy[indices], cls_scores[indices, None], classids_], axis=1)
+                )

         if len(final_dets) == 0:
-            res_dets = np.array([])
+            return np.array([])

-        return res_dets
+        return np.concatenate(final_dets, 0)
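The hand-rolled IoU loop above is replaced by cv2.dnn.NMSBoxes, which takes per-class boxes and scores plus the confidence and NMS thresholds and returns the indices of the boxes to keep. A standalone toy example; note that OpenCV documents the boxes argument as (x, y, w, h) rectangles:

```python
import cv2

# two heavily overlapping boxes and one distant box, each as (x, y, w, h)
boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [300, 300, 50, 50]]
scores = [0.9, 0.8, 0.7]
keep = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=0.5, nms_threshold=0.5)
print(keep)  # indices of surviving boxes; the lower-scoring overlapping box is dropped
```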
