Friday, September 17, 2021

Azure Custom Vision Object Detection

Computer Vision and Custom Vision are two of the services provided by Azure Cognitive Services.


Computer Vision: Analyze content in images.

  1. OCR: Optical Character Recognition
  2. Image Analysis: extracts visual features from images (objects, faces)
  3. Spatial Analysis: analyzes the presence and movement of people in a video feed and produces events that other systems can respond to.


Custom Vision: Customize image recognition to fit your business needs.

  1. Image Classification: applies label(s) to an image
  2. Object Detection: returns the coordinates in the image where each applied label can be found.


When using the Object Detection Prediction API, the response returned from Azure is a JSON document in the following format, shown here as the C# classes it can be deserialized into.


    public class PredictionResponse
    {
        public string id { get; set; }
        public string project { get; set; }
        public string iteration { get; set; }
        public string created { get; set; }        // timestamp, kept as text
        public Prediction[] predictions { get; set; }
    }

    public class Prediction
    {
        public string tagId { get; set; }
        public string tagName { get; set; }
        public double probability { get; set; }    // confidence, 0.0 to 1.0
        public BoundingBox boundingBox { get; set; }
    }

    // Values are fractions of the image width/height (0.0 to 1.0), not pixels.
    public class BoundingBox
    {
        public double left { get; set; }
        public double top { get; set; }
        public double width { get; set; }
        public double height { get; set; }
    }


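Each BoundingBox object in the response is represented graphically by a red box, as shown in the sample image below. The left, top, width, and height values are fractions of the image's width and height (0 to 1), not pixels.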
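
To draw a box on the actual image, the normalized values have to be scaled by the image size. Here is a minimal sketch of that conversion, assuming the image's pixel dimensions are known. BoundingBoxConverter and ToPixels are hypothetical names, not part of any Azure SDK.

```csharp
using System;

public static class BoundingBoxConverter
{
    // Scales normalized box values (fractions of the image size, 0.0 to 1.0)
    // to a pixel rectangle for an image of the given dimensions.
    public static (int X, int Y, int Width, int Height) ToPixels(
        double left, double top, double width, double height,
        int imageWidth, int imageHeight)
    {
        int x = (int)Math.Round(left * imageWidth);    // horizontal values scale by width
        int y = (int)Math.Round(top * imageHeight);    // vertical values scale by height
        int w = (int)Math.Round(width * imageWidth);
        int h = (int)Math.Round(height * imageHeight);
        return (x, y, w, h);
    }
}
```

For an 800 x 600 image, a box with left 0.25, top 0.5, width 0.5, and height 0.25 maps to a rectangle at (200, 300) that is 400 pixels wide and 150 pixels tall.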




In addition, here are some gotchas to watch out for when working with Object Detection:


  1. Be sure to use the same login for the Custom Vision portal as the one used for the Azure portal.
  2. Use the same "Directory" in the Custom Vision portal and the Azure portal. This setting can be found in the top right corner of both portals.
  3. When training the model, you must use a minimum of 15 images for every tag. More images with different lighting, angles, and backgrounds will produce better results.
  4. The image types used for training must be .JPG, .PNG, or .BMP, and each file must be smaller than 4 MB.
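
The training limits above are easy to check locally before uploading anything. The following is a rough sketch that uses the numbers quoted in this post; TrainingImageRules and its members are hypothetical names, and treating 4 MB as 4 x 1024 x 1024 bytes and accepting .jpeg alongside .jpg are both assumptions.

```csharp
using System;
using System.IO;
using System.Linq;

public static class TrainingImageRules
{
    // Limits as quoted in this post; adjust them if the service limits differ.
    public const int MinImagesPerTag = 15;
    public const long MaxBytes = 4L * 1024 * 1024;   // assumption: 4 MB = 4 MiB

    // Assumption: .jpeg is treated the same as .jpg.
    private static readonly string[] AllowedExtensions = { ".jpg", ".jpeg", ".png", ".bmp" };

    // True when the file has an allowed extension and is under the size limit.
    public static bool IsAcceptable(string fileName, long sizeInBytes)
    {
        string extension = Path.GetExtension(fileName).ToLowerInvariant();
        return AllowedExtensions.Contains(extension) && sizeInBytes < MaxBytes;
    }

    // True when a tag has at least the minimum number of training images.
    public static bool HasEnoughImages(int imageCountForTag)
    {
        return imageCountForTag >= MinImagesPerTag;
    }
}
```

A typical use would be to run IsAcceptable over each file in a training folder (for example with new FileInfo(path).Length as the size) and HasEnoughImages over the per-tag counts, then fix any failures before starting a training run.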



