Terraform, Kubernetes and Public Clouds - Part 1 (AWS)

On July 8th I published an article testing some deployment times to get Kubernetes up and running on different public cloud providers. I used Terraform and a bash script to automate these deployments and test accurately. I received some requests to publish the code and a walkthrough.

Here it is! Ask and you shall receive.. sometimes 😜

In the first part (this one) we’ll cover deploying to AWS. If you don’t know what AWS stands for today, well, it’s Awesome Webbing Spiders! No, but really, you should know what it is if you’re reading this.

Table of Contents

  1. Table of Contents
    1. Topology
    2. Diagram
    3. Why Terraform
    4. AWS Module
    5. Building Main.tf
    6. Using the Module
      1. Setting up our provider
      2. Setting up the Module for Use
    7. AWS Cluster Implementation
    8. Creating the VPC, Subnets and Routing
      1. Creating the VPC
      2. Defining the Subnets
      3. Internet Access
      4. Routing and Route Tables
    9. Security Groups
      1. Master Security Group
      2. The Worker Nodes Security Groups
      3. Security Group architecture
    10. IAM Policies
      1. Master IAM Policy
      2. The Worker Roles
    11. Building the Cluster
      1. EKS Cluster
      2. The Worker Nodes (minions, I like that name)
    12. That’s It
      1. Running Terraform
    13. PS

Topology

This is what the modules build in all three of the major cloud providers.

  • Managed Kubernetes Cluster (Master / API Server)
  • Three (3) Worker Nodes (Minions)
  • Policies to form node-to-node networking
  • Where appropriate, correct SGs (security groups) for communication

Diagram

In general terms (high-level) this is what the architecture looks like for each deployment. The specifics of the different cloud providers are not included, as this is just to give you a visual representation.

Kubernetes Diagram

Why Terraform

The real question is: why automation? Or why should you care to automate? But this is not that post. I chose Terraform because of:

  • Idempotency
    • predictability
    • a.k.a. the same result when applied multiple times
  • Built for infrastructure orchestration
  • Integration with cloud providers
  • Simple to learn and code
  • Integration with other provisioners such as Ansible
  • Also, it’s cool!

AWS Module

I start with AWS because it’s the most complicated setup; there are many more resources to code with Terraform than with the other providers. If you’re not familiar with Terraform modules, read the docs: they are a way to write re-usable Terraform code.

Building Main.tf

This is where I put the providers, anything global that’s required and some global(ish) variables.

modules/aws/main.tf

terraform {
  required_version = ">= 0.11.0"
}

resource "random_id" "rand1" {
  byte_length = 2
}

locals {
  rand1 = "${random_id.rand1.dec}"
}

This section is pretty simple; all we do is:

  • Require a Terraform version
    • In this scenario, version 0.11.0 or greater
  • Create a random value (the next two blocks)
    • Simply to randomize the naming of resources later, as we’ll see

Using the Module

Here we simply utilize the module we’ll describe later, pass in the required variables and set up our backend. The backend is where the Terraform state file is stored. For this example it’s Amazon S3, but it could be many other things.

A few are:

  • Amazon S3
  • etcd
  • Consul
  • azurerm
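As a minimal sketch of what an S3 backend block looks like (the bucket, key and region here are placeholders, not the values from my repo; also note that backend blocks in Terraform 0.11 cannot use interpolation):

terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket"          # placeholder bucket name
    key    = "aws-k8s-cluster/terraform.tfstate"  # placeholder state key
    region = "us-east-1"                          # placeholder region
  }
}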

Setting up our provider

Here we set up aws as our provider for this example.

provider "aws" {
  region                  = "${var.aws_region}"
  shared_credentials_file = "${var.credentials}"
  profile                 = "${var.profile}"
}

The region variable is mandatory, while shared_credentials_file and profile are optional. If they’re not set, whatever is in the environment variables or the standard AWS config path (usually $HOME/.aws) will be used.
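For completeness, a sketch of the matching variable declarations (the defaults here are placeholders, not the values from my repo):

variable "aws_region" {
  description = "AWS region to deploy into"
  type        = "string"
  default     = "us-east-1"
}

variable "credentials" {
  description = "Path to the shared AWS credentials file"
  type        = "string"
  default     = "~/.aws/credentials"
}

variable "profile" {
  description = "Profile to use from the credentials file"
  type        = "string"
  default     = "default"
}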

Setting up the Module for Use

module "aws-k8s-cluster" {
  source = "../../modules/aws/"

  cluster_name    = "${var.cluster_name}"
  cidr_block      = "${var.cidr_block}"
  public_subnets  = "${var.public_subnets}"
  private_subnets = "${var.private_subnets}"
  instance_type   = "${var.t_type[2]}"
  worker_ami      = "${var.ami[0]}"

  env                  = "dev"
  asg_desired_capacity = 3
  asg_max_size         = 6
  asg_min_size         = 3

  # END-USER-INFO
  pc_ip          = "${var.pc_ip}"      # not used right now
  aws_account_id = "${var.account_id}"
}

We source the module, which is where all the smarts happen, give it the required variables and that’s it. The variables are declared in a file called variables.tf. The actual filename does not matter, but it’s good practice for showing intent. The values of any variables containing sensitive information go in a file called terraform.tfvars; please add this file to your .gitignore.
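For example, the instance_type and worker_ami arguments above are indexed out of list variables. A sketch of what those declarations and definitions could look like (the instance types and the AMI ID below are placeholders, not the values from my repo):

examples/aws-k8s-cluster/variables.tf

variable "t_type" {
  description = "List of instance types to choose from"
  type        = "list"
  default     = ["t2.small", "t2.medium", "t2.large"]
}

variable "ami" {
  description = "List of EKS-optimized worker AMI IDs"
  type        = "list"
  default     = []
}

examples/aws-k8s-cluster/terraform.tfvars

ami = ["ami-xxxxxxxx"] # placeholder -- use an EKS-optimized AMI for your region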

Let’s take a look at the actual implementation next.

AWS Cluster Implementation

Creating the VPC, Subnets and Routing

Creating the VPC and its respective subnets is a critical piece of the process; without it, nothing else gets created, at least in this version.

Let’s break this down a bit into the specific pieces.

Creating the VPC

As in any traditional data center, the core networking is critical to the infrastructure. The VPC is analogous to a network segment. Though the name (Virtual Private Cloud) can be misleading, many cloud architectures have, and should contain, quite a few VPCs. I think of a VPC as a network segment (VRF) or a per-application network.

Let’s look at the code.

resource "aws_vpc" "k8s_vpc" {
  count                = "${var.create_vpc ? 1 : 0}"
  cidr_block           = "${var.cidr_block}"
  instance_tenancy     = "${var.instance_tenancy}"
  enable_dns_hostnames = "${var.enable_dns_hostnames}"
  enable_dns_support   = "${var.enable_dns_support}"

  tags = "${
    map(
      "Name", "${var.cluster_name}-${var.env}-${local.rand1}-vpc",
      "kubernetes.io/cluster/${var.cluster_name}-${var.env}-${local.rand1}", "shared",
    )
  }"
}

The VPC CIDR declaration and definition:

modules/aws/variables.tf

variable "cidr_block" {
  description = "defines the CIDR block to use within the module"
  type        = "string"
  default     = ""
}

examples/aws-k8s-cluster/variables.tf

variable "cidr_block" {
  description = "defines the CIDR block to use within the module"
  type        = "string"
  default     = ""
}

examples/aws-k8s-cluster/terraform.tfvars

cidr_block = "10.0.0.0/16"

  • Define the AWS VPC resource
    • the name is arbitrary but needs to be unique throughout the module
  • count variable decides to create or not to create (true or false)
    • false would mean nothing else gets created
  • The CIDR block for the VPC defined in a variable
  • The instance tenancy defined in a variable
    • default (shared)
    • dedicated
  • The rest of the variables are self explanatory
  • The tags
    • Name is optional
    • The “kubernetes.io/cluster/…” tag is mandatory for the cluster masters to read

If you’re having trouble understanding the “${var.variable_name}” syntax, please read the Terraform docs on interpolation syntax.
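As a quick illustration of the 0.11 interpolation syntax (the values here are made up):

variable "cluster_name" {
  default = "demo"
}

variable "env" {
  default = "dev"
}

locals {
  # "${...}" interpolates variables into strings; this renders to "demo-dev"
  name_prefix = "${var.cluster_name}-${var.env}"
}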

Defining the Subnets

Every VPC needs subnets, private or public; for our example we’ll use public subnets. I’ve yet to figure out why private subnets don’t work, as the masters never pick up the nodes. I need to spend some time on this and figure out if it’s possible through EKS.

resource "aws_subnet" "k8s_public_subnets" {
  count = "${var.create_vpc && length(var.public_subnets) > 0 ? length(var.public_subnets) : 0}"

  vpc_id     = "${aws_vpc.k8s_vpc.id}"
  cidr_block = "${var.public_subnets[count.index]}"

  availability_zone       = "${element(data.aws_availability_zones.azs.names, count.index+1)}"
  map_public_ip_on_launch = "${var.public_ip_on_launch}"

  tags = "${
    map(
      "Name", "${var.cluster_name}-${var.env}-${local.rand1}-pub-subnet-${random_id.rand1.dec}-${var.public_subnets[count.index]}",
      "kubernetes.io/cluster/${var.cluster_name}-${var.env}-${local.rand1}", "shared",
    )
  }"
}

The count argument defines how many subnets, if any, to create based on:

  • whether the VPC is created
  • and whether the length of the public_subnets variable is not 0
    • The public_subnets variable is a list (otherwise known as an array)
  • public_subnets declaration and definitions

modules/aws/variables.tf

variable "public_subnets" {
  description = "Defines the public subnets to use"
  default     = []
}

examples/aws-k8s-cluster/variables.tf

variable "public_subnets" {
  description = "Defines the public subnets to use"
  default     = []
}

examples/aws-k8s-cluster/terraform.tfvars

public_subnets = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]

If you’re keen-eyed, you’ve noticed by now that variables need to be declared in both the module and the implementation of the module, but the definition (the actual value) need only exist within the implementation of the module. I won’t be using any more variable declaration or definition examples, to keep this a reasonable length, but now you can see the pattern.

  • Next we attach to the previously created VPC with vpc_id
  • We use a boolean variable to tell it whether to map a public IP on launch
    • default = true for a public subnet
  • We use similar tags to associate with the EKS service

The availability_zone is gathered from the aws_availability_zones data source; we count the number of public subnets and map an AZ to each subnet.

  • data source
data "aws_availability_zones" "azs" {}

Internet Access

To access the internet, expose services through load-balancers or update our nodes, we need an internet gateway.

resource "aws_internet_gateway" "k8s_inet_gateway" {
  vpc_id = "${aws_vpc.k8s_vpc.id}"

  tags {
    Name = "${var.cluster_name}-${var.env}-${local.rand1}-inet-gateway"
  }
}

The internet gateway needs to be created first because the route table refers to it for its default route, though from a Terraform perspective, where it sits in the configuration does not matter. This is one of the beauties of Terraform: its ability to detect and handle dependencies.

Routing and Route Tables

Any AWS VPC has a default route table, a set of custom-defined public and private route tables, or a combination of both. For this example, we’ll have one public route table associated with our VPC.

  • Create the route table
resource "aws_route_table" "k8s_public_rt" {
  vpc_id = "${aws_vpc.k8s_vpc.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.k8s_inet_gateway.id}"
  }

  tags {
    Name = "${var.cluster_name}-${var.env}-${local.rand1}-public-route-table"
  }
}

Simply create the aws_route_table resource; what makes it public is the default route and the attachment of the internet gateway.

  • We then associate the subnets to the created public route-table
resource "aws_route_table_association" "public_rt_association" {
  count = "${length(var.public_subnets)}"

  subnet_id      = "${aws_subnet.k8s_public_subnets.*.id[count.index]}"
  route_table_id = "${aws_route_table.k8s_public_rt.id}"
}

Security Groups

Let’s secure our deployment with security groups first. We’ll create some rules to allow the master to talk to the workers and vice versa, as well as outbound communication.

Master Security Group

resource "aws_security_group" "k8s_master_sg" {
  name        = "${var.cluster_name}-${var.env}-${local.rand1}-master-sg"
  description = "Cluster communication with worker nodes"
  vpc_id      = "${aws_vpc.k8s_vpc.id}"

  tags {
    Name = "${var.cluster_name}-${var.env}-${local.rand1}-master-sg"
  }
}

The security group is created but no rules are applied. Is this allow all? Or deny any? Will it even take? Try it!

  • Let’s build the outbound rules
  • For this scenario, only one rule permitting everything
  • In real-world production, please be more specific
resource "aws_security_group_rule" "k8s_allow_master_out" {
  description       = "Allow master outbound communication"
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  security_group_id = "${aws_security_group.k8s_master_sg.id}"
  type              = "egress"
}

This rule permits everything and is equivalent to a permit ip any any in Cisco terms. It is attached to the master security group by setting security_group_id, using interpolation to reference the k8s_master_sg ID.

resource "aws_security_group_rule" "k8s_allow_master_in" {
  description       = "Allow workstation to communicate with the cluster API Server"
  cidr_blocks       = ["your.pc.ip.here/32"]
  from_port         = 443
  protocol          = "tcp"
  security_group_id = "${aws_security_group.k8s_master_sg.id}"
  to_port           = 443
  type              = "ingress"
}

For the first inbound rule we’re going to permit our workstation: change the “your.pc.ip.here” line to the public internet IP you’ll be using kubectl from. If you’re using a bastion EC2 instance for this, you can also change this rule to use a security group as the source instead of a CIDR block, as sketched below.
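A hedged sketch of that bastion variant, which is optional and not part of this module (the bastion security group referenced here is hypothetical):

resource "aws_security_group_rule" "k8s_allow_bastion_in" {
  description              = "Allow a bastion host to reach the cluster API server"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.k8s_master_sg.id}"
  source_security_group_id = "${aws_security_group.bastion_sg.id}" # hypothetical bastion SG
  type                     = "ingress"
}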

resource "aws_security_group_rule" "k8s_pods_to_master_rule" {
  description              = "Allow pods to communicate with the cluster API Server"
  from_port                = 443
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.k8s_master_sg.id}"
  source_security_group_id = "${aws_security_group.k8s_worker_sg.id}"
  to_port                  = 443
  type                     = "ingress"
}

The second and last inbound rule permits the worker nodes and pods to communicate with the API server. Remember, we do this by setting source_security_group_id to that of the worker nodes (which we’ll define in the next section).

The Worker Nodes Security Groups

Let’s run through the code for the worker nodes.

resource "aws_security_group" "k8s_worker_sg" {
  name        = "${var.cluster_name}-${var.env}-${local.rand1}-worker-sg"
  description = "Security group for all nodes in the cluster"
  vpc_id      = "${aws_vpc.k8s_vpc.id}"

  tags = "${
    map(
     "Name", "${var.cluster_name}-worker-sg-${random_id.rand1.dec}",
     "kubernetes.io/cluster/${var.cluster_name}-${var.env}-${local.rand1}", "owned",
    )
  }"
}

As before we define the security group and leave the rules blank. You can also specify ingress and egress rules inside the security group configuration itself, though some things are easier to do by separating the logic, and it’s always best to standardize on one method. If you do it one way for one, do it the same way for all. Keep consistency and intent, which makes it easier for your team to understand.

resource "aws_security_group_rule" "k8s_allow_worker_out" {
  description       = "Allow worker outbound communication"
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  security_group_id = "${aws_security_group.k8s_worker_sg.id}"
  type              = "egress"
}

This is a similar outbound rule to the master’s and has the same effect.

resource "aws_security_group_rule" "k8s_worker_self_rule" {
  description              = "Allow node to communicate with each other"
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = "${aws_security_group.k8s_worker_sg.id}"
  source_security_group_id = "${aws_security_group.k8s_worker_sg.id}"
  to_port                  = 65535
  type                     = "ingress"
}

The first inbound rule allows the nodes to communicate with each other. This covers pod-to-pod traffic, CNI overlays, services, proxy load-balancing, service discovery and so on. We set source_security_group_id to the workers’ own security group.

resource "aws_security_group_rule" "k8s_worker_master_rule" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.k8s_worker_sg.id}"
  source_security_group_id = "${aws_security_group.k8s_master_sg.id}"
  to_port                  = 65535
  type                     = "ingress"
}

The last inbound rule allows traffic from the master control plane. This is the API server, but not necessarily API calls; simply cluster maintenance and control information.

Security Group architecture

This is a visual representation of the security group architecture and communication path.

Security Group Architecture

IAM Policies

We also need to tell our instances what type of roles they have. For example, our master needs to assume an EKS-compatible role, and our worker nodes need access to the container registry and an EC2-compatible role.

Master IAM Policy

Let’s write the master policy and role.

resource "aws_iam_role" "k8s_master_role" {
  name = "${var.cluster_name}-${var.env}-${local.rand1}-master-role"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

Here we’re saying that this role can be assumed by the EKS service. The predefined policies within AWS that we attach next give all the permissions needed for the master.

The specific lines in question are…

"Service": "eks.amazonaws.com"

and

"Action": "sts:AssumeRole"
  • Next we attache the policies to the role.
resource "aws_iam_role_policy_attachment" "k8s_master_policy_attach" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = "${aws_iam_role.k8s_master_role.name}"
}

resource "aws_iam_role_policy_attachment" "k8s_master_policy_attach_service" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"
  role       = "${aws_iam_role.k8s_master_role.name}"
}

  • Attach the Cluster Policy
  • Attach the Service Policy

The Worker Roles

resource "aws_iam_role" "k8s_worker_role" {
  name = "${var.cluster_name}-${var.env}-${local.rand1}-worker-role"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "k8s_worker_nodes-AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = "${aws_iam_role.k8s_worker_role.name}"
}

resource "aws_iam_role_policy_attachment" "k8s_worker_nodes-AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = "${aws_iam_role.k8s_worker_role.name}"
}

resource "aws_iam_role_policy_attachment" "k8s_worker_nodes-AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = "${aws_iam_role.k8s_worker_role.name}"
}

resource "aws_iam_instance_profile" "k8s_worker_instance_profile" {
  name = "${var.cluster_name}-instace-profile"
  role = "${aws_iam_role.k8s_worker_role.name}"
}

This example IAM role allows the worker nodes to…

  • join the cluster
  • manage and retrieve data from other AWS services
    • data such as container images from the EC2 Container Registry (ECR)

Building the Cluster

Finally (I know, right… this has been long) we begin to build the master and the worker node cluster. Let’s begin with the simplest part, which is building the masters through EKS.

EKS Cluster

resource "aws_eks_cluster" "this" {
  name     = "${var.cluster_name}-${var.env}-${local.rand1}"
  role_arn = "${aws_iam_role.k8s_master_role.arn}"

  vpc_config {
    security_group_ids = ["${aws_security_group.k8s_master_sg.id}"]
    subnet_ids         = ["${aws_subnet.k8s_public_subnets.*.id}"]
  }

  depends_on = [
    "aws_iam_role_policy_attachment.k8s_master_policy_attach",
    "aws_iam_role_policy_attachment.k8s_master_policy_attach_service",
  ]
}

One thing to know is that sometimes Terraform cannot account for all dependencies. This is why there’s the depends_on list: to manually tell Terraform to wait for the resources this one requires. Most likely this is because EKS itself is relatively new and API support is not 100% complete.

  • We give it a name
    • throughout the code I use an env variable and a random number for uniqueness
    • feel free to choose your own method
  • We attach the master role previously created
  • We tell it which security groups to use
  • Finally which subnet IDs the masters live on
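Once the cluster is created, the resource exposes the API endpoint and cluster CA certificate, which you’ll need to build a kubeconfig. A minimal sketch of outputs that surface them (the output names are my own, not from the repo):

output "eks_cluster_endpoint" {
  # API server endpoint for kubectl / kubeconfig
  value = "${aws_eks_cluster.this.endpoint}"
}

output "eks_cluster_ca_data" {
  # Base64-encoded cluster CA certificate
  value = "${aws_eks_cluster.this.certificate_authority.0.data}"
}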

The Worker Nodes (minions, I like that name)

We’ll create the nodes by utilizing an auto scaling group and a launch configuration.

resource "aws_launch_configuration" "k8s_worker_launch_cfg" {
  associate_public_ip_address = true
  iam_instance_profile        = "${aws_iam_instance_profile.k8s_worker_instance_profile.name}"
  image_id                    = "${var.worker_ami}"
  instance_type               = "${var.instance_type}"
  name_prefix                 = "${var.cluster_name}-${var.env}-${local.rand1}-launch-cfg"
  security_groups             = ["${aws_security_group.k8s_worker_sg.id}"]
  user_data_base64            = "${base64encode(local.k8s_worker_data)}"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "k8s_worker_autoscaling_grp" {
  desired_capacity     = "${var.asg_desired_capacity}"
  launch_configuration = "${aws_launch_configuration.k8s_worker_launch_cfg.id}"
  max_size             = "${var.asg_max_size}"
  min_size             = "${var.asg_min_size}"
  name                 = "${var.cluster_name}-${var.env}-${local.rand1}-worker-asg"
  vpc_zone_identifier  = ["${aws_subnet.k8s_public_subnets.*.id}"]

  tag {
    key                 = "Name"
    value               = "${var.cluster_name}-${var.env}-${local.rand1}-worker-asg"
    propagate_at_launch = true
  }

  tag {
    key                 = "kubernetes.io/cluster/${var.cluster_name}-${var.env}-${local.rand1}"
    value               = "owned"
    propagate_at_launch = true
  }
}

The user_data_base64 reference can be found in the GitHub repo for this post. I left it out to avoid making this post longer than it should be.

like it isn’t already I know 😜
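For a rough idea of what that local typically contains, here’s a minimal sketch assuming an EKS-optimized AMI that ships the /etc/eks/bootstrap.sh helper; the actual k8s_worker_data in the repo differs:

locals {
  # Sketch only: the bootstrap helper on EKS-optimized AMIs wires the kubelet
  # to the cluster's API endpoint and CA certificate for you.
  k8s_worker_data = <<USERDATA
#!/bin/bash -xe
/etc/eks/bootstrap.sh "${aws_eks_cluster.this.name}"
USERDATA
}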

I also left the AMI images referenced in the variables.tf file of example/aws-k8s-cluster and not in terraform.tfvars. This is because we need to use EKS-optimized images and searching for them is a pain in the 🍑 sometimes.

That’s It

Now you can just run the corresponding Terraform commands and have a working EKS cluster. The time this takes seems to be decreasing as AWS adds efficiencies on the backend: creating a cluster used to take 15 minutes or so at the time of my last post, while right now I’ve noticed about 10 minutes to fully operational.

Running Terraform

  • terraform plan
cloud/examples/aws-k8s-cluster took 5s on  master [!]
•100% [I] ➜ terraform plan -out plan.tfplan 2>&1 | tee

...output truncated...

Plan: 30 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

This plan was saved to: plan.tfplan

This runs and plans out the changes to be made; if, as in this case, it’s a completely new environment, it will just add the necessary resources.

  • Terraform apply
cloud/examples/aws-k8s-cluster took 6s on  master []
•100% [I] ➜ terraform apply plan.tfplan  2>&1 | tee apply.txt

...output truncated...

Apply complete! Resources: 30 added, 0 changed, 0 destroyed.

If successful, you’ll see a message similar to the above, with all resources added and deployed.

  • Testing
cloud/examples/aws-k8s-cluster on  master []
•100% [I] ➜ kubectl get nodes
NAME                         STATUS     ROLES     AGE       VERSION
ip-10-0-0-179.ec2.internal   Ready     <none>    4m        v1.10.3
ip-10-0-1-62.ec2.internal    Ready     <none>    4m        v1.10.3
ip-10-0-2-191.ec2.internal   Ready     <none>    4m        v1.10.3

After a couple of minutes, once the EC2 instances finish booting, you’ll see the nodes come up and reach the Ready state.

  • Destroying
cloud/examples/aws-k8s-cluster on  master []
•100% [I] ➜ terraform destroy -force 2>&1 | tee destroy.txt

With the terraform destroy command you can bring down the entire environment in a matter of minutes as well.

PS

All the code for this is available on GitHub; feel free to star it, fork it or clone it. I’ll keep updating it, optimizing it and adding more providers.

Available cloud providers

  • AWS
  • Azure
  • GCP (Google)

There are also scripts in the example directory which automate the process of deploying, destroying and configuring kubectl upon deployment and destruction.
