Microsoft Azure offers a wide range of features and capabilities. One of them is Cognitive Services, a collection of APIs that lets applications approximate human abilities. Examples include converting text to spoken audio, transcribing speech to text, and interpreting the meaning of a spoken phrase much as a person would. The services are divided into four major categories, listed below. Note that “Computer Vision” and “Custom Vision” sound very similar but have different capabilities, as outlined in the “Vision” section below.
Decision
- Anomaly Detector: Identify potential problems in time series data.
- Content Moderator: Detect potentially offensive or unwanted content.
- Personalizer: Create rich, personalized experiences for every user.
Language
- LUIS (Language Understanding Intelligent Service): Extract intents and entities from natural-language phrases.
- QnA Maker: Create a conversational question and answer layer over your data.
- Text Analytics: Detect sentiment, key phrases, and named entities.
- Translator: Detect and translate more than 90 supported languages.
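Most of these services are called over REST with a subscription key. As a rough illustration of what a Text Analytics sentiment call looks like, the sketch below builds (but does not send) such a request using only the Python standard library. The path `/text/analytics/v3.0/sentiment` and the `Ocp-Apim-Subscription-Key` header follow the v3.0 REST API and should be checked against the version you use; the endpoint, key, and helper name `build_sentiment_request` are placeholders invented for this example.

```python
import json
import urllib.request


def build_sentiment_request(endpoint, key, texts, language="en"):
    """Build (but do not send) a Text Analytics v3.0 sentiment request."""
    # Each document carries a unique id so results can be matched to inputs.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/text/analytics/v3.0/sentiment",
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder endpoint and key (assumptions); use the values from your own resource.
request = build_sentiment_request(
    "https://example-resource.cognitiveservices.azure.com",
    "<your-key>",
    ["The new release is fantastic.", "Support never answered my ticket."],
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen(request)` would send it. The other Language endpoints (key phrases, entity recognition) accept the same `documents` payload shape, so only the final path segment would change.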
Speech
- Speech to Text: Transcribe audible speech into readable, searchable text.
- Text to Speech: Convert text to lifelike speech for more natural interfaces.
- Speech Translation: Integrate real-time speech translation into your apps.
- Speaker Recognition: Identify and verify the people speaking based on audio.
Vision
- Computer Vision: Analyze content in images.
  - OCR: Optical Character Recognition
  - Image Analysis: Extracts visual features from images (objects, faces, adult content).
  - Spatial Analysis: Analyzes the presence and movement of people in a video feed and produces events that other systems can respond to.
- Custom Vision: Customize image recognition to fit your business needs.
  - Image Classification: Applies one or more labels to an image.
  - Object Detection: Returns the coordinates in the image where the applied label(s) can be found.
  - Note: A trained model can be exported for use outside the service: https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/export-your-model
- Face: Detect and identify people and emotions in images.
- Video Indexer: Analyze the visual and audio channels of a video, and index its content.
- Form Recognizer: Extract text, key-value pairs and tables from documents.
- Ink Recognizer: Recognize digital ink and handwriting, and pinpoint common shapes.
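To make the Object Detection entry above more concrete: Custom Vision reports each detected object's position as a bounding box whose values are fractions of the image size, so they usually need to be scaled before drawing or cropping. The sketch below assumes a prediction shaped like the service's JSON output (`tagName`, `probability`, and a `boundingBox` with `left`, `top`, `width`, `height`). The sample values and the helper name `to_pixel_box` are made up for illustration.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a normalized box (fractions of image size) to pixel coordinates."""
    left = round(bounding_box["left"] * image_width)
    top = round(bounding_box["top"] * image_height)
    right = round((bounding_box["left"] + bounding_box["width"]) * image_width)
    bottom = round((bounding_box["top"] + bounding_box["height"]) * image_height)
    return left, top, right, bottom


# Hypothetical prediction; the field names are assumed to match the service's output.
prediction = {
    "tagName": "dog",
    "probability": 0.92,
    "boundingBox": {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25},
}

box = to_pixel_box(prediction["boundingBox"], image_width=640, image_height=480)
print(prediction["tagName"], box)  # dog (160, 240, 480, 360)
```

The returned tuple is in (left, top, right, bottom) order, which is the form most imaging libraries expect for crop and rectangle operations.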