Sam Nasr: January 2022

Microsoft Azure offers a wide range of features and capabilities. One of them is Cognitive Services, a set of APIs that help applications mimic human responses. Features include converting text to spoken speech, converting speech to text, and interpreting the meaning of a spoken phrase much as a person would. The services are divided into four major categories, listed below. Note that “Computer Vision” and “Custom Vision” sound very similar, but their capabilities differ, as outlined in the “Vision” section below.

Decision

Anomaly Detector: Identify potential problems in time series data.
Content Moderator: Detect potentially offensive or unwanted content.
Personalizer: Create rich, personalized experiences for every user.

Language

LUIS (Language Understanding Intelligent Service): Build natural language understanding into apps, bots, and IoT devices.
QnA Maker: Create a conversational question and answer layer over your data.
Text Analytics: Detect sentiment, key phrases, and named entities.
Translator: Detect and translate more than 90 supported languages.
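
As a rough illustration of how a Language service is called, the sketch below builds (but does not send) a request to the Translator v3.0 REST endpoint using only the Python standard library. The helper name build_translate_request, the sample text, and the key and region placeholders are assumptions for illustration; a real call needs the key and region of your own Translator resource. Because no source language is given, the service is left to detect it.

```python
import json
import urllib.parse
import urllib.request

# Global Translator endpoint for the v3.0 REST API.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text, to_lang, key, region):
    """Build (but do not send) a Translator v3.0 request for one string.

    Hypothetical helper for illustration; key and region are placeholders.
    """
    # No "from" parameter is set, so the service auto-detects the source language.
    query = urllib.parse.urlencode({"api-version": "3.0", "to": to_lang})
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


req = build_translate_request("Hello, world", "de", "<your-key>", "<your-region>")
print(req.get_method(), req.full_url)
# Sending the request requires a real key: urllib.request.urlopen(req)
```

The response is a JSON array with one entry per input string, each holding the detected language and a list of translations.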

Speech

Speech to Text: Transcribe audible speech into readable, searchable text.
Text to Speech: Convert text to lifelike speech for more natural interfaces.
Speech Translation: Integrate real-time speech translation into your apps.
Speaker Recognition: Identify and verify the people speaking based on audio.

Vision

Computer Vision: Analyze content in images.

OCR (Optical Character Recognition): Extracts printed and handwritten text from images.
Image Analysis: Extracts visual features from images (objects, faces, adult content).
Spatial Analysis: Analyzes the presence and movement of people on a video feed and produces events that other systems can respond to.

Custom Vision: Customize image recognition to fit your business needs.

Image Classification: Applies one or more labels to an image as a whole.
Object Detection: Returns the coordinates in the image where each applied label can be found.

Note: A trained model can be exported for use outside the service: https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/export-your-model
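
To make the difference between the two Custom Vision project types concrete, here is a minimal sketch of what each kind of result might look like. The class names Tag and DetectedObject, the sample values, and the helper to_pixels are all hypothetical, and the normalized 0-to-1 box convention is an assumption rather than a statement about the SDK's exact types.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """Image classification result: a label for the whole image."""
    label: str
    probability: float


@dataclass
class DetectedObject:
    """Object detection result: a label plus where it was found."""
    label: str
    probability: float
    left: float    # box values are assumed to be normalized to the 0..1 range
    top: float
    width: float
    height: float


def to_pixels(obj, image_width, image_height):
    """Convert a normalized box to integer pixel values (x, y, w, h)."""
    return (
        round(obj.left * image_width),
        round(obj.top * image_height),
        round(obj.width * image_width),
        round(obj.height * image_height),
    )


# Classification: labels only, no location information.
classification = [Tag("dog", 0.97), Tag("outdoor", 0.81)]

# Detection: each label comes with a bounding box.
detection = [DetectedObject("dog", 0.95, left=0.25, top=0.5, width=0.5, height=0.25)]

print(to_pixels(detection[0], 800, 600))  # (200, 300, 400, 150)
```

Classification answers "what is in this image?", while detection also answers "where is it?", which is why only the second result type carries coordinates.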

Face: Detect and identify people and emotions in images.
Video Indexer: Analyze the visual and audio channels of a video, and index its content.
Form Recognizer: Extract text, key-value pairs and tables from documents.
Ink Recognizer: Recognize digital ink and handwriting, and pinpoint common shapes.

Sam Nasr

Friday, January 28, 2022

Overview of Cognitive Services