Azure AI Foundry offers nine deployment types for models. The type you choose affects one or more factors, such as cost, latency, efficiency when processing large datasets, and compliance. The table below describes each deployment type, along with its main advantage and disadvantage. For more details, see https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-models/concepts/deployment-types

| Deployment Type | Description | Advantage | Disadvantage |
| --- | --- | --- | --- |
| Global Standard | Shared global infrastructure for general-purpose model inference. | Cost-effective and easy to scale. | Performance may vary under high demand. |
| Global Provisioned | Dedicated global infrastructure for consistent performance. | Reliable throughput and latency. | Higher cost due to dedicated resources. |
| Global Batch | Asynchronous global batch processing for large-scale inference jobs. | Efficient for processing large datasets. | Not suitable for real-time applications. |
| Data Zone Standard | Shared infrastructure within a specific data zone for compliance needs. | Meets data residency requirements affordably. | Limited performance consistency. |
| Data Zone Provisioned | Dedicated infrastructure in a data zone for high-performance workloads. | Combines compliance with consistent performance. | More expensive than shared options. |
| Data Zone Batch | Batch processing within a data zone for regulated data workflows. | Ideal for compliant, large-scale processing. | Slower response times; not real-time. |
| Standard | Shared deployment in a single Azure region for general use; the regional counterpart to Global Standard. | Simple setup and broad compatibility. | Performance may vary under high demand; no dedicated capacity. |
| Regional Provisioned | Dedicated infrastructure in a specific region for localized performance. | Optimized for regional latency and control. | Higher cost and limited to regional availability. |
| Developer (Fine-tuned) | Lightweight deployment for testing and iterating fine-tuned models. | Fast iteration and low cost for development. | Not suitable for production-scale workloads. |
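
The trade-offs in the table can be read as a small decision procedure. The sketch below is one hypothetical way to encode it in Python. The `Requirements` class, the `suggest_deployment_type` function, and the decision rules are illustrative assumptions, not part of any Azure SDK or an official selection algorithm. The labels are copied from the table, and treating "Standard" as the regional shared option is also an assumption.

```python
from dataclasses import dataclass

# Labels are copied from the table above. The grouping by scope and the
# decision rules are illustrative assumptions, not an official Azure algorithm.
_BATCH = {"global": "Global Batch", "data_zone": "Data Zone Batch"}
_PROVISIONED = {
    "global": "Global Provisioned",
    "data_zone": "Data Zone Provisioned",
    "regional": "Regional Provisioned",
}
_STANDARD = {
    "global": "Global Standard",
    "data_zone": "Data Zone Standard",
    "regional": "Standard",  # assumption: "Standard" is the regional shared option
}


@dataclass(frozen=True)
class Requirements:
    """Hypothetical description of a workload's needs."""
    scope: str = "global"                  # "global", "data_zone", or "regional"
    real_time: bool = True                 # False means asynchronous batch is acceptable
    consistent_performance: bool = False   # True means dedicated capacity is wanted
    dev_fine_tuned: bool = False           # True means testing a fine-tuned model only


def suggest_deployment_type(req: Requirements) -> str:
    """Return the table row that best matches the stated requirements."""
    if req.scope not in _STANDARD:
        raise ValueError(f"unknown scope: {req.scope!r}")
    if req.dev_fine_tuned:
        return "Developer (Fine-tuned)"
    if not req.real_time:
        # The table lists batch types only for the global and data zone scopes.
        if req.scope not in _BATCH:
            raise ValueError("the table lists no regional batch deployment type")
        return _BATCH[req.scope]
    if req.consistent_performance:
        return _PROVISIONED[req.scope]
    return _STANDARD[req.scope]


if __name__ == "__main__":
    # A regulated workload that can tolerate asynchronous processing.
    print(suggest_deployment_type(Requirements(scope="data_zone", real_time=False)))
```

For example, a latency-sensitive workload that must stay within a data zone and needs predictable throughput would be described as `Requirements(scope="data_zone", consistent_performance=True)`, which maps to the Data Zone Provisioned row. Real selection should also weigh pricing, quota, and model availability, which this sketch ignores.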