Microsoft Releases Phi-2 Small Language Model -- Redmondmag.com

By Chris Paoli
12/15/2023

Microsoft this week made the latest version of its suite of small language models (SLMs), Phi-2, available in the Azure AI Studio model catalog. Currently, the SLM is available only for research purposes.

SLMs are language models trained only on data from a specific domain, making them ideal for academic use. Because of their smaller size, they are also more cost-effective and are designed to handle tasks that don't require the massive compute power of large language models.

A unique aspect of Phi-2 is its focus on "textbook-quality" data for training. The company's approach with its SLM is to emphasize educational value and content quality. "With its compact size, Phi-2 is an ideal playground for researchers, including for exploration around mechanistic interpretability, safety improvements, or fine-tuning experimentation on a variety of tasks," said Microsoft.

"Our training data mixture contains synthetic datasets specifically created to teach the model common sense reasoning and general knowledge, including science, daily activities and theory of mind, among others," said Microsoft. "We further augment our training corpus with carefully selected web data that is filtered based on educational value and content quality."

Although "small" is in the name, Microsoft's latest model boasts 2.7 billion parameters, more than double Phi-1.5's 1.3 billion. And thanks to the company's approach to scaling, Microsoft claims Phi-2 can outperform models that are 25 times larger.

In its own benchmark tests, the company showed that Phi-2, with its 2.7 billion parameters, outperformed competing models Mistral (7 billion parameters) and Llama-2 (7 billion to 70 billion parameters) in commonsense reasoning, language understanding, math and coding. It also said Phi-2 would outperform Google's recently announced SLM offering: "Furthermore, Phi-2 matches or outperforms the recently-announced Google Gemini Nano 2, despite being smaller in size."

Microsoft said that due to the nature of Phi-2, which was trained on specifically designated data and was not aligned through reinforcement learning from human feedback (as many other language models are), the model also shows less bias and toxicity than Llama-2 and Microsoft's own earlier versions of Phi.

About the Author

Chris Paoli (@ChrisPaoli5) is the associate editor for Converge360.
