Microsoft Releases Phi-2 Small Language Model -- Redmondmag.com

Microsoft Releases Phi-2 Small Language Model

By Chris Paoli
12/15/2023

Microsoft this week made available the latest version of its suite of small language models (SLMs), Phi-2, in the Azure AI Studio model catalog. Currently, the SLM is only available for research purposes.

SLMs are language models trained only on domain-specific data, making them ideal for academic use. Also, because of their smaller size, they are more cost-effective and are designed to handle tasks that don't require the massive compute power of large language models.

A unique aspect of Phi-2 is its focus on "textbook-quality" data for training. The company's approach with its SLM is to emphasize educational value and content quality. "With its compact size, Phi-2 is an ideal playground for researchers, including for exploration around mechanistic interpretability, safety improvements, or fine-tuning experimentation on a variety of tasks," said Microsoft.

"Our training data mixture contains synthetic datasets specifically created to teach the model common sense reasoning and general knowledge, including science, daily activities and theory of mind, among others," said Microsoft. "We further augment our training corpus with carefully selected web data that is filtered based on educational value and content quality."

Although "small" is in the name, Microsoft's latest model has 2.7 billion parameters, roughly double Phi-1.5's 1.3 billion. And thanks to the company's approach to scaling, Microsoft claims Phi-2 can outperform models that are up to 25 times larger.

In the company's own benchmark tests, Phi-2, with its 2.7 billion parameters, outperformed competing models Mistral (7 billion parameters) and Llama-2 (7 billion to 70 billion parameters) in commonsense reasoning, language understanding, math and coding. Microsoft also said Phi-2 holds up against Google's recently announced SLM offering: "Furthermore, Phi-2 matches or outperforms the recently-announced Google Gemini Nano 2, despite being smaller in size."

Microsoft said that because of the nature of Phi-2, which was trained on specifically designated data and has not been reinforced by human feedback (as other language models are), it also shows less bias and toxicity than Llama-2 and Microsoft's own earlier versions of Phi.

About the Author

Chris Paoli (@ChrisPaoli5) is the associate editor for Converge360.
