State of the Art: Object detection (1/2)

This article surveys the state of the art in object detection, as evaluated on COCO, with methods classified by architecture type. It then explains transformers, starting from their origins in NLP and moving to their adaptation to computer vision with the Swin Transformer and the Focal Transformer. It also covers the methods presented in the SwinV2-G paper for scaling the Swin Transformer to a 3-billion-parameter model.

MOT, TF model customization and distributed training

Python project, TensorFlow.

This article first describes how to extend a simple object detector into a Multi-Object Tracking (MOT) system that preserves identities, so that each subject can be followed along a sequence. It then shows how to customize and retrain a model from the TensorFlow Object Detection API. Unlike the previous article, it parses the VOC2012 dataset with modern methods instead of implementing a parser from scratch. Training is also distributed across multiple GPUs.

SSD300 implementation

Python project, TensorFlow.

This article describes how to implement a deep learning algorithm for object detection, following the Single Shot Detector (SSD) architecture. It explains the implementation of the VGG16 backbone network, the SSD cone, the default box principle, and the convolutions used to predict the box classes and to regress the offsets for their locations. Finally, it shows how to convert the .xml annotations into the data such a network uses for training.
