Docker and Containers
Containers
Containers are similar to virtual machines, except that all of the containers on a host share that host's operating system kernel instead of each running its own. Docker is the de facto standard in this space. Applications built as Docker images can easily be deployed in new environments as long as the new system can run the Docker engine. Containers can also be used to deploy large clusters of computers, ensuring that they are all set up the same way. Companies like Amazon offer Containers as a Service (CaaS): they provide the infrastructure, and you run your containers on their hardware.
The following description of the top technologies is taken from the article The Container Landscape:
- Docker is the container technology with most public traction and is “almost” a de facto container standard right now. Docker is open source with several vendors behind it. Though, Docker Inc. is the one “official” company controlling the project. They also recently added the orchestration engine Docker Swarm to the main container project. Others like Red Hat and IBM contribute to the open source code. Various vendors offer support, consulting and cloud services (such as a public or private Docker Registry).
- CoreOS’ rkt (pronounced “rocket”) is another container technology emerging together with its container orchestration engine Fleet. It is a low-level framework built directly on systemd, often used as “foundation layer” for higher-level solutions. rkt focuses on security, composability (e.g. native Unix integration), and standards / compatibility – as differentiator to Docker. rkt also can run Docker images natively and has native Kubernetes integration via “rktnetes”. CoreOS therefore also offers an “Enterprise Kubernetes Solution” called Tectonic. This might be very important for future adoption in more projects.
- Cloud Foundry’s Garden: Garden is used under the hood of the open source PaaS CloudFoundry. As many relevant software vendors like IBM, SAP or Pivotal base their PaaS strategy on CloudFoundry, Garden containers get used by many enterprises “under the hood”. In contrast to Docker and rkt, there is no real market outside of CloudFoundry for Garden containers.
- Kubernetes is an orchestration engine for containers with a huge community behind it. This project was released as open source by Google earlier in the year; with many other contributors including software vendors like Red Hat, CoreOS or Mesosphere. Kubernetes is open to run different container technologies such as Docker (mostly used today) or CoreOS’ rkt (pronounced “rocket”). The two most well-known offerings of Kubernetes are: Google Container Engine (public Kubernetes service) and Red Hat’s open source PaaS OpenShift (based on Kubernetes, for hybrid cloud deployments). The latter adds some useful features on top of Kubernetes like an enhanced web user interface and an automated ‘source-to-deployment’ system that does not require knowledge about the underlying container and Docker subsystems.
- Amazon AWS ECS: This is a public CaaS to manage Docker images (that can be stored in the accompanying ECS Registry), run Docker containers (ECS Runtime) and schedule / orchestrate / monitor these container instances (AWS CloudWatch). These can also be combined with other AWS services like Elastic Load Balancer (AWS ELB), Virtual Private Cloud (AWS VPC) or Identity and Access Management (AWS IAM). AWS Simplified Workflow is also tightly integrated with AWS ECS to use Docker CLI commands (e.g. push, pull, list, tag).
Orchestration Tools
Closely related to containers are orchestration frameworks, which manage the deployment of containers across a large cluster of computers and keep track of what is running where. The following table lists key orchestration tools and the container technologies they work with:
Tool | Description |
---|---|
Amazon EC2 Container Service (ECS) | The Amazon EC2 Container Service (ECS) supports Docker containers and lets you run applications on a managed cluster of Amazon EC2 instances. |
Azure Container Service (ACS) | ACS lets you create a cluster of virtual machines that act as container hosts along with master machines that are used to manage your application containers. |
Cloud Foundry's Diego | Diego is a container management system that combines a scheduler, runner, and health manager. It is a rewrite of the Cloud Foundry runtime. |
CoreOS Fleet | Fleet is a container management tool that lets you deploy Docker containers on hosts in a cluster as well as distribute services across a cluster. |
Docker Swarm | Docker Swarm provides native clustering functionality for Docker containers, which lets you turn a group of Docker engines into a single, virtual Docker engine. |
Google Container Engine | Google Container Engine, which is built on Kubernetes, lets you run Docker containers on the Google Cloud platform. It schedules containers into the cluster and manages them based on user-defined requirements. |
Kubernetes | Kubernetes is an orchestration system for Docker containers. It handles scheduling and manages workloads based on user-defined parameters. |
Mesosphere Marathon | Marathon is a container orchestration framework for Apache Mesos that is designed to launch long-running applications. It offers key features for running applications in a clustered environment. |
NOTE: This table was taken from the article 8 Container Orchestration Tools to Know. The article also mentions that Kubernetes has been selected by the Cloud Native Computing Foundation (CNCF) as its first standard technology.
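Several entries in the table describe scheduling containers onto a cluster "based on user-defined requirements". To make that idea concrete, here is a toy first-fit scheduler in Python. It is only an illustrative sketch: the `Host` and `ContainerSpec` classes, the `schedule` function, and all of the names and numbers are made up for this example, and none of the tools above is claimed to work this way internally.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A machine in the cluster (hypothetical model for illustration)."""
    name: str
    cpu_free: float          # free CPU cores
    mem_free_mb: int         # free memory in MB
    containers: list = field(default_factory=list)


@dataclass
class ContainerSpec:
    """User-defined requirements for one container (hypothetical model)."""
    name: str
    cpu: float               # CPU cores requested
    mem_mb: int              # memory requested in MB


def schedule(specs, hosts):
    """Place each container on the first host with enough free resources.

    Returns a dict mapping container name to host name, or to None when
    no host can satisfy the request.
    """
    placement = {}
    for spec in specs:
        placement[spec.name] = None
        for host in hosts:
            if host.cpu_free >= spec.cpu and host.mem_free_mb >= spec.mem_mb:
                # Reserve the resources so later containers see what is left.
                host.cpu_free -= spec.cpu
                host.mem_free_mb -= spec.mem_mb
                host.containers.append(spec.name)
                placement[spec.name] = host.name
                break
    return placement


hosts = [
    Host("node-a", cpu_free=2.0, mem_free_mb=2048),
    Host("node-b", cpu_free=4.0, mem_free_mb=8192),
]
specs = [
    ContainerSpec("web", cpu=1.0, mem_mb=512),
    ContainerSpec("worker", cpu=2.0, mem_mb=4096),
    ContainerSpec("cache", cpu=1.0, mem_mb=1024),
    ContainerSpec("batch", cpu=4.0, mem_mb=1024),
]
placement = schedule(specs, hosts)
print(placement)
```

Real orchestrators layer much more on top of this (health checks, restarts, rescheduling when a host fails), but the core loop of matching declared requirements against available capacity is the part the table descriptions refer to.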
When to Use Docker
This article provides a good overview of when to use (and when not to use) Docker for your projects. It helped clarify my thinking, and it may get you past the "everything should be dockerized!" versus "doesn't this seem too hard/complicated?" roadblock. Basically, if you expect to spend a lot of time setting up an environment and you expect to do this repeatedly, Docker may be a good solution.
I'm not sure a simple Java web app meets this use case. If you're building self-contained ".jar" files that contain the web server and all dependencies, your only other dependency is having Java installed. Provisioning a Linux server, installing Java, copying over your web app, and starting it up seems simple enough. It's definitely simpler than provisioning a Linux server, installing Docker, building a Docker image for your web app, copying over the image, and running the image.
If your application needs a database, this also doesn't seem like a great use case for Docker. Databases tend to be shared resources that you install and configure separately from your app. Once again, it is probably easier to stand up a Linux server, install your database, and configure it than to build the database into a Docker image and do all the configuration in there. (For similar reasons, it also doesn't seem beneficial to bundle your app with a database using Kubernetes.) Unless you are standing up databases all the time and find yourself doing the same installation and configuration over and over again, Docker probably makes things more difficult rather than easier.
So when would you use Docker? Any time you find yourself doing the same server setup and configuration over and over again. Running R jobs on remote servers is one use case our team keeps bumping into. R is a bit difficult to install, especially when you need several libraries and have to make sure you have the right versions of them. A Docker image for something like this seems to make a lot of sense, especially when we talk about running jobs remotely on a cluster. Python web apps might also be a good use case. Unlike Java, Python doesn't produce self-contained executables for web apps. You end up installing a lot of dependencies and managing virtual environments to keep them all separate. If you had a consistent set of dependencies that you could put into a base Docker image, that might make life easier. And once you get to the point where most of your apps are being deployed through Docker, it would probably make sense to start doing this for all of your apps. There should be a simple way to build a Linux image with Java installed to bundle your Java apps pretty painlessly. But unless this becomes your normal mode of working, it is probably more trouble and complexity than it's worth.
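As a rough illustration of the Python web app case, the sketch below generates Dockerfile text from a few parameters, so that several apps sharing the same setup could reuse one template. Everything specific here is an assumption for the example: the `render_dockerfile` helper is hypothetical, `python:3.12-slim` is just one of the official Python images on Docker Hub, and `requirements.txt`, `myapp`, and port 8000 are placeholders.

```python
def render_dockerfile(base_image, requirements_file, app_module, port):
    """Return Dockerfile text for a Python web app with pinned dependencies.

    Hypothetical helper for illustration; adjust the steps to your own app.
    """
    lines = [
        f"FROM {base_image}",
        "WORKDIR /app",
        # Copy and install dependencies before the app code, so Docker can
        # reuse the cached dependency layer when only the code changes.
        f"COPY {requirements_file} .",
        f"RUN pip install --no-cache-dir -r {requirements_file}",
        "COPY . .",
        f"EXPOSE {port}",
        f'CMD ["python", "-m", "{app_module}"]',
    ]
    return "\n".join(lines) + "\n"


# Placeholder values; none of these come from a real project.
dockerfile = render_dockerfile(
    base_image="python:3.12-slim",
    requirements_file="requirements.txt",
    app_module="myapp",
    port=8000,
)
print(dockerfile)
```

Writing the result to a file named `Dockerfile` next to the app and running `docker build` in that directory would produce an image with the dependencies already installed, which is the repeatable setup step that Docker is meant to save you from redoing by hand.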