IMPROVING OBJECT DETECTION PERFORMANCE USING VGG16 WITH TRANSFER LEARNING

Huynh Tan Loc; Duong Thanh Linh; Nguyen Thi Ngoc Anh

doi:10.61591/jslhu.26.997

Keywords:

Transfer Learning; Object Recognition; VGG16; Custom Dataset; Small Object Detection.

Abstract

Transfer learning is an effective way to improve object recognition performance, particularly when training data are limited. This study develops a model based on the VGG16 architecture that exploits pre-trained deep features to recognize objects in a custom dataset of 16 categories of common objects. The model was initialized with weights pre-trained on the ImageNet dataset, fine-tuned on the custom dataset, and evaluated over five independent training runs to verify the stability of the results. It achieved an average accuracy of 90.56% with a standard deviation of 2.15% across the runs, indicating high recognition performance, stable behavior between runs, and a clear advantage over training from scratch. The method also achieved a mean Average Precision (mAP) of 74.8% at IoU = 0.5 and 59.3% at IoU = 0.75, indicating robust object detection across evaluation thresholds. Detection of small objects also improved (AP_small = 38.2%), which supports the effectiveness of transfer learning in extracting fine-grained features. However, the model still struggled to detect extremely small or occluded objects accurately, which points to limitations in feature representation at low scales. Overall, the study confirms that transfer learning is effective on academic-scale datasets and provides experimental evidence of its practical applicability. Future work will integrate multi-scale feature representations and explore more advanced architectures to further improve detection performance.
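The two mAP figures differ only in the IoU threshold a predicted box must reach to count as a correct detection. The following is a minimal sketch of that criterion, not the authors' evaluation code: the `iou` helper, the `(x1, y1, x2, y2)` box format, and the example coordinates are all illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: a 100x100 ground-truth box and a prediction
# of the same size shifted 20 pixels to the right.
ground_truth = (10, 10, 110, 110)
prediction = (30, 10, 130, 110)

overlap = iou(ground_truth, prediction)

# The same prediction is judged against both thresholds from the abstract.
is_match = {threshold: overlap >= threshold for threshold in (0.5, 0.75)}
print(f"IoU = {overlap:.3f}, match at 0.5: {is_match[0.5]}, match at 0.75: {is_match[0.75]}")
```

In this made-up case the prediction overlaps the ground truth with an IoU of about 0.667, so it counts as a correct detection at IoU = 0.5 but not at IoU = 0.75. Localization errors of this kind are one reason mAP is typically lower at the stricter threshold.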
