Containers 101: Containers vs. Virtual Machines (And Why Containers Are the Future of IT Infrastructure)
What exactly is a container, and what makes it different from -- and in some cases better than -- a virtual machine? To answer this question, Joey explains why we needed containers in the first place.
By Joey D'Antoni
Containers and Kubernetes, the popular container orchestration system, have been the buzziest of IT buzzwords in recent years. You may wonder why and how this container technology is different from the virtual machines (VMs) you have worked with for the past decade-plus.
First, let's understand how containers happened. There's a long history in Unix, and then in Linux, of configuring process isolation. The first implementations I worked on were Sun Microsystems Solaris zones (zones were effectively containers that shared a kernel with the base operating system, but otherwise ran fully isolated). These were introduced to higher-end Sun servers in 2004, but by the time the technology was mature, many organizations were migrating from proprietary Unix servers to commodity Linux servers.
At the same time, Google and Netflix were rapidly growing companies whose demand was both massive and highly variable. Netflix made a very early and aggressive move to migrate its workloads to Amazon Web Services (AWS) in 2008. At the beginning of that migration, nearly all of the company's workloads ran on VMs. However, Netflix ran into challenges in its largely VM-based environment.
Each VM requires an entire copy of an operating system to deploy, which is a large amount of data to move around and limits both mobility and VM density. With VMs, it is also harder to ensure full version compatibility between software libraries across dev/test/QA/prod environments. And not all workloads fit into the predefined VM sizes that cloud providers offer: some are dramatically overprovisioned even in the smallest VM, while others sit mostly idle but need to scale rapidly during periods of high load.
So how are containers different from just running a VM? The biggest difference is that your containers have to run the same base operating system as your host (the server or the VM hosting your containers). There is some flexibility here. If you are running Linux containers, you can potentially run containers from a different distribution than your host OS. This is because most of the differences between Linux distributions are in the user space, while the kernel space -- where the system calls from the containers are handled -- is common across distros. You will notice all the references to Linux; while Windows has made great strides in containers in recent releases, the space is still dominated by the Linux operating system.
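One informal way to see the shared kernel for yourself is to ask the operating system which kernel a process is running on. The sketch below uses only Python's standard `platform` module; the function name `kernel_info` is made up for this illustration. If you run it on a Linux host and again inside a Linux container on that same host, both runs should report the same kernel release, even when the container image comes from a different distribution. (On a laptop where the container engine itself runs inside a hidden Linux VM, the container will report that VM's kernel instead.)

```python
import platform


def kernel_info() -> dict:
    """Return the kernel name and release as seen by the current process."""
    uname = platform.uname()
    return {"system": uname.system, "release": uname.release}


if __name__ == "__main__":
    info = kernel_info()
    # On a Linux host and in a Linux container on that host, this line
    # should match, because the container does not bring its own kernel.
    print(f"{info['system']} kernel {info['release']}")
```

A VM on the same host, by contrast, would report whatever kernel its own guest operating system booted.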
Containers can run in a nested format, which means that you can install container management software like Docker or Kubernetes inside a VM (or a bare-metal host) and run container workloads inside that VM. Networking and storage also work differently in containers; they are properties of the container orchestration system, as opposed to being properties of the VM.
Having the host share a kernel with the container helps you achieve density. Since you no longer need to have multiple copies of an entire operating system running, you reduce the amount of CPU, memory and storage overhead associated with virtualizing your workloads. This means that you can run far more containers on a given host than you could VMs.
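To make the density argument concrete, here is a back-of-the-envelope sketch. Every number in it is a hypothetical assumption chosen for illustration (host size, application footprint, per-VM and per-container overhead), not a measurement, and the helper `max_instances` is an invented name. It looks only at memory and ignores CPU and storage.

```python
def max_instances(host_memory_gb: float, app_memory_gb: float, overhead_gb: float) -> int:
    """Count how many instances fit in host memory when each instance
    needs the application's memory plus a fixed per-instance overhead."""
    per_instance_gb = app_memory_gb + overhead_gb
    return int(host_memory_gb // per_instance_gb)


# Hypothetical figures, for illustration only.
HOST_MEMORY_GB = 64
APP_MEMORY_GB = 0.5
VM_OVERHEAD_GB = 1.5            # assumed footprint of a full guest OS per VM
CONTAINER_OVERHEAD_GB = 0.0625  # assumed per-container overhead (64MB)

vm_count = max_instances(HOST_MEMORY_GB, APP_MEMORY_GB, VM_OVERHEAD_GB)
container_count = max_instances(HOST_MEMORY_GB, APP_MEMORY_GB, CONTAINER_OVERHEAD_GB)

print(f"VMs that fit:        {vm_count}")         # 32
print(f"Containers that fit: {container_count}")  # 113
```

Under these assumed numbers, the same host holds more than three times as many containers as VMs. The real ratio depends entirely on your workloads, but the direction of the effect is what matters.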
Another benefit of containers is their place in the modern development workflow. Containers become your unit of development and deployment. Instead of building a set of libraries and binaries to be deployed to a server, the developer simply builds a container, defined in a format known as a container image, which is then uploaded to a container registry. While a VM image may be 10GB or 20GB, a container image can be orders of magnitude smaller -- maybe 10MB or 20MB. Because containers are so small and need so few resources to build, they can go straight from a developer's laptop into production (after passing a set of automated QA tests, of course). This aligns with most modern continuous integration and continuous delivery (CI/CD) pipelines, which are built around the notion of using the container as the unit of deployment.
Container registries can be globally distributed, and the small size of containers allows these images to be deployed easily to local and remote datacenters and public cloud regions all over the world. There are both private and public container registries, depending on the nature of the code. Custom applications may use both private and public images in their deployment stack, as your development team may have dependencies on public containers (like Microsoft SQL Server) and want to keep proprietary code containing business logic in a private repository. Both AWS and Microsoft offer private container registry services in their respective clouds.
There are some other benefits of container orchestration systems over VM hypervisors. In the next installment, we will dive deeper into Kubernetes, the most popular container orchestration system. For now, though, I'll give you one good example of a key feature: container autoscaling. This does not apply to database servers, but many application and Web tiers have stateless servers that benefit from having more instances under load. By increasing the number of instances of your Web or app tier, you spread the work across multiple workers.
Kubernetes can autoscale pods based on CPU utilization. (A pod is a unit of deployment in Kubernetes and can comprise one or more containers.) For example, you could define an autoscale threshold of 50 percent average CPU utilization across pods. When CPU utilization rises above 50 percent, more pods (workers) will be deployed in your cluster until utilization has dropped back below 50 percent, or until you reach the maximum number of pods you defined in your application manifest.
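The scaling decision can be sketched with the proportional rule described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up. The code below is a simplified illustration and not the autoscaler itself. It leaves out the tolerance band, stabilization windows and pod-readiness handling that the real controller applies, and the function name, parameter names and pod counts are assumptions made for this example. Only the 50 percent target comes from the scenario above.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_percent: float,
    target_cpu_percent: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Proportional scaling: replicas * (observed / target), rounded up,
    then clamped to the configured minimum and maximum."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


# Target of 50 percent CPU; the limits of 2 and 15 pods are hypothetical.
print(desired_replicas(4, 80, 50, 2, 15))    # 7:  4 * 80/50 = 6.4, rounded up
print(desired_replicas(4, 25, 50, 2, 15))    # 2:  load dropped, scale in
print(desired_replicas(10, 100, 50, 2, 15))  # 15: wants 20, capped at the maximum
```

The third call shows why the maximum matters: without it, a sustained spike would keep adding pods for as long as the cluster had room.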
Kubernetes also provides load-balancing functionality natively. Implementing similar functionality with VMs would require both additional-cost plug-ins and extensive customization. Even after you had all of that configured, scaling up with VMs would take much longer simply because of their sheer size.
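To illustrate why more instances help a stateless tier, here is a toy sketch that spreads requests across workers in simple round-robin order. Round-robin is an assumption made for this example; it is not a description of how Kubernetes actually routes traffic, and the pod names and the `distribute` helper are invented.

```python
from collections import Counter
from itertools import cycle
from typing import Sequence


def distribute(num_requests: int, workers: Sequence[str]) -> Counter:
    """Assign requests to workers in round-robin order and return
    how many requests each worker received."""
    if not workers:
        raise ValueError("at least one worker is required")
    counts = Counter({worker: 0 for worker in workers})
    next_worker = cycle(workers)
    for _ in range(num_requests):
        counts[next(next_worker)] += 1
    return counts


# The same 12 requests, before and after scaling out from 2 to 4 pods.
print(distribute(12, ["pod-a", "pod-b"]))
print(distribute(12, ["pod-a", "pod-b", "pod-c", "pod-d"]))
```

With two workers each one handles six requests; with four, each handles three. Because the workers are stateless, any of them can serve any request, which is what makes this kind of scaling straightforward.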
Containers provide many benefits to developers, which is the reason containers rose to prominence in IT organizations. By packaging libraries and code together, and by running inside an existing OS, containers help automate modern DevOps workflows and allow for faster delivery with better testing and results. In the next installment in this series, we will take a more in-depth look at Kubernetes and why it's a game-changer for IT operations.
Joseph D'Antoni is an Architect and SQL Server MVP with over a decade of experience working in both Fortune 500 and smaller firms. He is currently Principal Consultant for Denny Cherry and Associates Consulting. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. Joey is the co-president of the Philadelphia SQL Server Users Group. He is a frequent speaker at PASS Summit, TechEd, Code Camps, and SQLSaturday events.