Microsoft's AI Roadshow -- Redmondmag.com

Microsoft's AI Roadshow

With partner Nvidia, the company discussed its latest AI efforts, including OpenAI, and CEO Nadella shared how he imagines our AI-powered future.

By Joey D'Antoni
01/26/2024

On Thursday I attended and participated in a stop on the Microsoft AI Tour, a series of events in major tech hubs (including San Francisco, Bangalore, Singapore and more) where Microsoft showcases its AI technologies and presents alongside event co-sponsor Nvidia.

Speakers at the event included Azure CTO Mark Russinovich, Corporate Vice President for Microsoft Cloud for Industry Corey Sanders, and Microsoft CEO Satya Nadella, who did a fireside chat with Merrie Williamson, the CVP for Azure Infrastructure.

For disclosure, I attended as a Microsoft MVP to proctor data science labs on Microsoft Fabric, not as media. I participate in many tech events, and the crowd's attendance and energy were some of the best I've seen post-pandemic. I don't have numbers, but from my limited vantage point, attendees packed our large lab room for all three of our sessions and both keynote sessions (on a miserable January day).

In the opening technical keynote, Russinovich and Sanders presented how Microsoft is building Azure infrastructure to support the AI efforts of OpenAI, Microsoft and its customers. Sanders spoke about how Microsoft has evolved in the era of AI and how using Copilot can improve work, reduce toil and enhance software quality. Sanders also discussed the Microsoft tooling stack around AI -- including AI Studio, which allows developers to build their own copilots and leverage the Azure AI software development kit.

Russinovich talked about AI infrastructure and traced the rise of AI back to the early 2010s, when graphics processing units (GPUs) became part of mainstream computing and the public cloud developed. Mark commented on how AI models have gotten very large. For example, the popular GPT model grew to 5 times its size between version 3 and version 4.

Russinovich would later speak about how he and Microsoft are working on tiny models for specific use cases, thinking they could better fit particular types of AI work.

Mark then spoke about the computing infrastructure Microsoft has been building in Azure to support OpenAI. In 2020, OpenAI's model ran on a computer with 10,000 V100 GPUs; in 2023, the same model used 14,400 H100 GPUs, and Russinovich implied that the latest machine Microsoft is building for OpenAI is much larger than that. Russinovich highlighted how Microsoft uniquely uses InfiniBand networking between GPU nodes and has deployed over 29,000 miles of cabling. (Note: AWS has a network fabric built with some of the concepts of InfiniBand, but does not use that protocol. Google has also implemented its own protocol called Aquila.)

He also discussed how this networking stack improved the training time for the small GPT-3 model with 175 billion parameters; in 2023, this took 10.9 minutes, and it now takes 4 minutes on 1,344 ND H100 v5 VMs. Mark also commented that the VM has only a 2 percent overhead compared to a bare-metal machine, with all of the benefits of virtualization, including flexibility, security and availability.
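To put those figures in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the numbers are the ones quoted in the keynote, the variable names are illustrative, and the bare-metal estimate assumes the 2 percent overhead applies evenly across the whole run, which the talk did not specify.

```python
# Figures as quoted in the keynote; variable names are illustrative.
earlier_minutes = 10.9   # GPT-3 (175B parameters) training time cited for 2023
current_minutes = 4.0    # time cited on 1,344 ND H100 v5 VMs
vm_overhead = 0.02       # VM overhead relative to bare metal

# How many times faster the newer run is.
speedup = earlier_minutes / current_minutes

# Share of the original wall-clock time that was eliminated.
time_saved_pct = (1 - current_minutes / earlier_minutes) * 100

# Assumption: the 2 percent overhead is uniform over the run, so the
# equivalent bare-metal time is the VM time divided by 1.02.
bare_metal_estimate = current_minutes / (1 + vm_overhead)

print(f"Speedup: {speedup:.2f}x")
print(f"Time saved: {time_saved_pct:.1f}%")
print(f"Estimated bare-metal time: {bare_metal_estimate:.2f} minutes")
```

In other words, the quoted improvement is a bit under a threefold speedup, and the virtualization penalty on a 4-minute run would amount to only a few seconds under that assumption.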

Adel El Hallak, Senior Director, Enterprise AI Product Management at Nvidia, talked about the innovations Microsoft and Nvidia have made together:

Optimized models
Specialized tools
ML Libraries

El Hallak also spoke about the services Nvidia offers on Azure like DGX, the company's platform-as-a-service AI offering.

Russinovich then talked about some of the software work Microsoft has done to support AI, including Project Forge, and a stack of tooling Microsoft has built with OpenAI to better optimize the performance and availability of these workloads. Project Forge was particularly interesting to me -- one of the problems distributed systems have is scheduling work in an optimized fashion to ensure all nodes are doing close to equal amounts of work.

As part of this project, Microsoft has built a global scheduler that aims to eliminate silos and treat all AI infrastructure globally as a single shared pool. This shared pool would mean that any VM in any Azure region could be available to run jobs. The scheduler is serverless, highly available and designed to be resilient (jobs can restart in the event of failures without losing state).
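A toy example helps show why pooling matters. The sketch below is not how Project Forge works internally (the keynote did not go into that); it simply compares a scheduler that pins each job to its home region with one that may place a job in any region. The `Job` class, the region names, the GPU counts and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    home_region: str
    gpus: int


def schedule_siloed(capacity, jobs):
    """Place each job only in its home region, if enough GPUs are free there."""
    free = dict(capacity)
    placed = {}
    for job in jobs:
        if free.get(job.home_region, 0) >= job.gpus:
            free[job.home_region] -= job.gpus
            placed[job.name] = job.home_region
    return placed, free


def schedule_global(capacity, jobs):
    """Treat all regions as one pool: try the home region first, then the rest."""
    free = dict(capacity)
    placed = {}
    for job in jobs:
        # Home region sorts first; remaining regions by most free GPUs.
        candidates = sorted(free, key=lambda r: (r != job.home_region, -free[r]))
        for region in candidates:
            if free[region] >= job.gpus:
                free[region] -= job.gpus
                placed[job.name] = region
                break
    return placed, free


# Hypothetical capacity and workload, chosen to overload one region.
capacity = {"east": 8, "west": 8, "north": 8}
jobs = [
    Job("a", "east", 6),
    Job("b", "east", 6),
    Job("c", "east", 4),
    Job("d", "west", 2),
]

siloed_placed, siloed_free = schedule_siloed(capacity, jobs)
global_placed, global_free = schedule_global(capacity, jobs)

print("Siloed:", len(siloed_placed), "jobs placed,", sum(siloed_free.values()), "GPUs idle")
print("Global:", len(global_placed), "jobs placed,", sum(global_free.values()), "GPUs idle")
```

With this made-up workload, the siloed scheduler leaves two jobs waiting while most of the fleet sits idle, and the pooled scheduler places all four. A real scheduler would also have to weigh data locality, network topology and priorities, which this sketch ignores.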

Mark mentioned that without that global scheduler, capacity can become fragmented, with some nodes just waiting for work. This datacenter fragmentation concept has been a theme in Russinovich's latest talks -- he discussed the problem extensively in an Ignite talk a couple of years ago, which tells me that fragmentation is a problem Microsoft (and other providers) are working to solve. He also mentioned Project Flywheel, deployed in Azure ML on top of Forge, which processes model tokens in batches, so execution is interleaved and more efficient than traditional processing models.
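The keynote did not detail how Flywheel is implemented, so the following is only a generic illustration of what interleaving work in fixed-size token batches can look like. It compares running each request to completion with a simple round-robin over token batches; the function names, request sizes and the one-tick-per-token cost model are all assumptions made for the example.

```python
from collections import deque


def run_to_completion(requests):
    """Process every token of a request before starting the next one."""
    clock, finish = 0, {}
    for name, tokens in requests:
        clock += tokens  # assumed cost: one tick per token
        finish[name] = clock
    return finish


def interleaved(requests, batch_size):
    """Round-robin over requests, processing at most batch_size tokens per turn."""
    queue = deque(requests)
    clock, finish = 0, {}
    while queue:
        name, remaining = queue.popleft()
        step = min(batch_size, remaining)
        clock += step
        remaining -= step
        if remaining:
            queue.append((name, remaining))  # not done yet, go to the back
        else:
            finish[name] = clock
    return finish


# Hypothetical requests: (name, number of tokens to process).
requests = [("long", 1000), ("short", 50), ("medium", 200)]

sequential_finish = run_to_completion(requests)
interleaved_finish = interleaved(requests, batch_size=100)

print("Run to completion:", sequential_finish)
print("Interleaved:      ", interleaved_finish)
```

In this toy model the total work is identical, but the short and medium requests no longer wait behind the long one. Whatever efficiency gains Flywheel delivers on real hardware would come from details this sketch does not attempt to model.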

Mark and Corey talked about prompts and retrieval augmented generation (RAG), which Mark demoed in Azure AI Studio by asking a question about the winners of the 2023 New York marathon, which his original model could not answer. He then added the Wikipedia entry about the 2023 marathon to his data set, and the model could answer his question correctly.
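The idea behind that demo can be sketched in a few lines. This is not the Azure AI Studio implementation: it uses a naive word-overlap score in place of a real search index or embeddings, it stops at building the prompt rather than calling a model, and the documents are placeholder strings rather than real article text. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Prepend the best-matching document to the question as context."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Placeholder documents standing in for the user's own data set.
documents = [
    "Placeholder text standing in for an article about the 2023 New York marathon results.",
    "Placeholder text about Azure virtual machine sizes.",
]

question = "Who won the New York marathon in 2023?"
prompt = build_prompt(question, documents)
print(prompt)
```

The point is the same one the demo made: the model itself does not change, but the prompt it receives now carries the relevant document, so it has something to answer from.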

Corey also talked about the aim of Microsoft Fabric: to offer an experience to unify data without using ETL. What was interesting was that we saw a recorded video of a demo of the new database mirroring feature I mentioned in my Ignite column -- this time showing a connection to Snowflake.

Nadella's Fireside Chat
On the day Microsoft's stock crossed the $400 barrier, and in the week the company crossed a $3 trillion valuation (yes, you read that correctly), it was interesting to hear from the Microsoft CEO, who I thought made some very salient points in his brief chat.

Nadella discussed how he had been at Microsoft for 32 years and had observed three major platform shifts over that time, with AI being the fourth:

Client Server
Internet
Mobile and Cloud
AI

When asked when he started to believe AI could be a revolutionary technology, Nadella said he became a believer when he saw GitHub Copilot with GPT-3.5. He highlighted how AI has a different user interface and that we can have computers that understand spoken or written language.

He also mentioned that he had never seen anything like AI technology, where the most skeptical audience (software developers) became believers so quickly. He talked about the drudgery often involved with beginning a software development effort and how Copilot can simplify that and allow developers to start building sooner.

One statistic Nadella mentioned was the use of Office Copilot tools. Excel is where Copilot is being used the most in the Office suite, although he had expected it to be Word or Outlook. He then talked about how this can change our financial planning and forecasting by using the more advanced prediction and modeling tools that machine learning and AI offer.

He spoke about how AI accelerators are the new unit of computing and said Moore's law is very much alive in them, even though they aren't going to look like GPUs. Developers should understand this hardware and how current state-of-the-art systems are designed and built for it. He also mentioned the importance of understanding your data estate and bringing all of your data to a reasoning engine on top of infrastructure (which aligns with the goals of Fabric).

He spoke about AI safety and how Microsoft has designed an audit function and engineering principles to support ethical AI. This process started in 2016 with a digital safety board, which reports to the Microsoft Board of Directors. He also compared misinformation to botnets: law enforcement and tech companies worked together to stop botnets, and he said they need to do the same with misinformation.

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.
