Microsoft Announces Azure AI with Copilot GA and Meta Llama 4 Integration
Microsoft has announced the general availability (GA) of Copilot in Azure and the addition of Meta's new Llama 4 models to Azure AI Foundry and Azure Databricks.
Copilot in Azure has officially reached GA, with Microsoft confirming that its current capabilities will continue to be available at no additional cost. Copilot, initially launched in public preview in May 2024, has been widely adopted across enterprises of all sizes, Microsoft said.
"Since we launched Copilot in Azure in an ungated public preview in May 2024, Copilot in Azure has served millions of prompts for hundreds of thousands of users," said Microsoft's Ruhiyyih Mahalati in a blog post. "Within Microsoft alone, we estimate that Copilot saves more than 30,000 developer hours every month."
Microsoft said the GA release introduces significant improvements in performance, accessibility, and availability, including:
- Copilot response times have improved by over 30 percent through frontend enhancements like streaming and backend optimizations in the orchestrator and skills layer.
- The UI has been updated to meet high accessibility standards.
- Microsoft commits to 99.9 percent uptime for Copilot in Azure.
- Built in line with Microsoft's Responsible AI principles, Copilot undergoes rigorous testing to prevent harmful behaviors.
- Copilot now supports 19 languages.
- Users can now leverage features like Terraform configuration authoring and Azure Kubernetes Service diagnostics.
Additionally, Copilot is now generally available on the Azure Mobile App. New features include real-time AI chat streaming, enhanced entry points, a cost management skill, and improved accessibility and localization.
Meta's Llama 4 Models Now in Azure AI Foundry and Databricks
Microsoft also announced the availability of Meta's Llama 4 models -- Scout and Maverick -- in Azure AI Foundry and Azure Databricks. These models are designed to support complex, multimodal tasks by integrating text and vision data within a unified model architecture.
- Llama 4 Scout: Tailored for summarization, personalization and long-context reasoning, Scout supports a context window of up to 10 million tokens and can operate efficiently on a single H100 GPU.
- Llama 4 Maverick: With 17 billion active parameters and a Mixture of Experts (MoE) architecture, Maverick is designed for multilingual and multimodal chat applications.
These models are equipped with safety measures at every development stage and are integrated with Azure's security and compliance standards. The early fusion of multimodal input and MoE scalability make the Llama 4 family ideal for enterprises aiming to build advanced AI solutions without compromising performance or cost-efficiency, according to the company.
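For readers unfamiliar with the Mixture of Experts design mentioned above, the idea is that a lightweight "router" scores a pool of expert sub-networks and sends each input through only the top-scoring few, so compute grows with the number of active experts rather than the total parameter count. The following is a toy, illustrative sketch of top-k MoE routing in plain Python; it is not Meta's implementation, and every name, dimension, and weight here is made up for demonstration:

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

class MoELayer:
    """Toy Mixture-of-Experts layer: a router scores every expert,
    only the top-k experts actually run, and their outputs are
    combined weighted by the renormalized router scores."""

    def __init__(self, num_experts, dim, k=2, seed=0):
        rng = random.Random(seed)
        self.k = k
        # Each "expert" is a random dim x dim linear map (illustration only).
        self.experts = [
            [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
            for _ in range(num_experts)
        ]
        # Router: one score weight vector per expert.
        self.router = [
            [rng.uniform(-1, 1) for _ in range(dim)]
            for _ in range(num_experts)
        ]

    def __call__(self, x):
        # Score every expert, but only run the top-k of them.
        scores = softmax([
            sum(w * xi for w, xi in zip(row, x)) for row in self.router
        ])
        top = sorted(range(len(scores)), key=lambda i: scores[i])[-self.k:]
        total = sum(scores[i] for i in top)
        out = [0.0] * len(x)
        for i in top:
            weight = scores[i] / total  # renormalize over chosen experts
            y = [sum(w * xi for w, xi in zip(row, x))
                 for row in self.experts[i]]
            out = [o + weight * yi for o, yi in zip(out, y)]
        return out

layer = MoELayer(num_experts=8, dim=4, k=2)
hidden = layer([1.0, 0.5, -0.3, 0.2])  # only 2 of the 8 experts ran
```

In a production model like Maverick the experts are full feed-forward blocks and routing happens per token, which is how a model with a large total parameter count can keep only 17 billion parameters active per inference step.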