IaC with Terraform

Terraform is used to provision infrastructure on AWS.

We make heavy use of upstream modules, such as github.com/terraform-aws-modules.
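For illustration, consuming one of these upstream modules typically looks like the sketch below; the file name, local name, and arguments are assumptions, not the actual configuration.

# infra/vpc.tf (illustrative sketch only)
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.env}-vpc"
  cidr = var.eks.cidr

  azs             = ["${var.region}a", "${var.region}b", "${var.region}c"]
  private_subnets = [for i in range(3) : cidrsubnet(var.eks.cidr, 4, i)]
  public_subnets  = [for i in range(3) : cidrsubnet(var.eks.cidr, 4, i + 8)]

  enable_nat_gateway = true
}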

All Terraform files are located in the infra directory:

> tree -d
.
├── bootstrap
├── clusters    # .tfvars files per cluster
├── files       # static files (templates, ...)
├── vars        # common .tfvars files
└── *.tf        # Terraform files

Provisioning a Cluster

To provision a new cluster (or update an existing one), add a corresponding ./clusters/<cluster-name>.tfvars file.

The following variables are supported:

# infra/clusters/dev.tfvars
region = "ap-southeast-2"
env    = "dev"

eks = {
  kubernetes_version = "1.33"
  cidr               = "10.0.0.0/16"
  min_size           = 2
  max_size           = 10
  instance_types     = ["t3.xlarge"]
}

access = {
  "constellation_admin" = {
    principal_arn = "arn:aws:iam::799468650620:role/aws-reserved/sso.amazonaws.com/eu-west-1/AWSReservedSSO_ConstellationAdmin_64b9f879eabd03dc"
    policies = {
      "cluster_admin" = {
        policy_arn   = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
        access_scope = "cluster"
      }
    }
  }
  "constellation_engineering" = {
    principal_arn = "arn:aws:iam::799468650620:role/aws-reserved/sso.amazonaws.com/eu-west-1/AWSReservedSSO_ConstellationEngineering_9dda2a0a0849d418"
    policies = {
      "read_only" = {
        policy_arn   = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAdminViewPolicy"
        access_scope = "cluster"
      }
    }
  }
}

rds = {
  engine            = "postgres"
  version           = "17"
  family            = "postgres17"
  port              = 5432
  instances         = 1
  instance_class    = "db.t3.medium"
  allocated_storage = 5
}

irsa = {
  grafana_cloudwatch = {
    namespace            = "monitoring"
    service_account_name = "kube-prometheus-stack-grafana"
    iam_policy = {
      # https://github.com/monitoringartist/grafana-aws-cloudwatch-dashboards?tab=readme-ov-file
      "AWSBilling" = {
        actions = [
          "cloudwatch:ListMetrics",
          "cloudwatch:GetMetricStatistics",
          "cloudwatch:GetMetricData",
          "logs:DescribeLogGroups",
          "logs:DescribeLogStreams",
          "logs:GetLogEvents",
          "logs:FilterLogEvents"
        ]
        resources = ["*"]
      }
    }
  }
  cluster_autoscaler = {
    namespace            = "kube-system"
    service_account_name = "cluster-autoscaler-aws-cluster-autoscaler"
    iam_policy = {
      # https://docs.aws.amazon.com/eks/latest/best-practices/cas.html
      "ClusterAutoscalerAll" = {
        actions = [
          "autoscaling:DescribeAutoScalingGroups",
          "autoscaling:DescribeAutoScalingInstances",
          "autoscaling:DescribeLaunchConfigurations",
          "autoscaling:DescribeTags",
          "autoscaling:SetDesiredCapacity",
          "autoscaling:TerminateInstanceInAutoScalingGroup",
          "ec2:DescribeLaunchTemplateVersions",
          "ec2:DescribeInstanceTypes",
          "eks:DescribeNodegroup"
        ]
        resources = ["*"]
      }
    }
  }
  external_secrets_operator = {
    namespace            = "external-secrets"
    service_account_name = "external-secrets"
    iam_policy = {
      "ExternalSecretsOperatorAll" = {
        actions = [
          "secretsmanager:ListSecrets",
          "secretsmanager:BatchGetSecretValue",
          "secretsmanager:GetResourcePolicy",
          "secretsmanager:GetSecretValue",
          "secretsmanager:DescribeSecret",
          "secretsmanager:ListSecretVersionIds"
        ]
        resources = ["*"]
      }
    }
  }
  clearcomply = {
    namespace            = "app-clearcomply"
    service_account_name = "clearcomply"
    iam_policy = {
      "S3Access" = {
        actions = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:DeleteObject"
        ]
        resources = ["*"]
      }
    }
  }
}

irsa_clearroute_account = {
  external_dns = {
    namespace            = "external-dns"
    service_account_name = "external-dns"
    iam_policy = {
      # https://kubernetes-sigs.github.io/external-dns/latest/docs/tutorials/aws/#iam-policy
      "ChangeResourceRecordSets" = {
        actions   = ["route53:ChangeResourceRecordSets"]
        resources = ["arn:aws:route53:::hostedzone/*"]
      }
      "Route53Records" = {
        actions = [
          "route53:ListHostedZones",
          "route53:ListResourceRecordSets",
          "route53:ListTagsForResources"
        ]
        resources = ["*"]
      }
    }
  }
  cert_manager = {
    namespace            = "cert-manager"
    service_account_name = "cert-manager"
    iam_policy = {
      "ChangeResourceRecordSets" = {
        actions = [
          "route53:ChangeResourceRecordSets",
          "route53:ListResourceRecordSets"
        ]
        resources = ["arn:aws:route53:::hostedzone/*"]
        conditions = {
          "ForAllValues:StringEquals" = {
            variable = "route53:ChangeResourceRecordSetsRecordTypes"
            values   = ["TXT"]
          }
        }
      }
      "ListHostedZones" = {
        actions = [
          "route53:ListHostedZones",
          "route53:ListHostedZonesByName",
          "route53:GetChange",
          "route53:GetHostedZone",
        ]
        resources = ["*"]
      }
    }
  }
}
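The irsa and irsa_clearroute_account maps drive the creation of IAM Roles for Service Accounts (IRSA). As a rough illustration, the irsa map might be consumed as in the sketch below; the file name, module choice, and referenced outputs are assumptions, not the actual implementation.

# infra/irsa.tf (illustrative sketch only)

# One customer-managed policy per IRSA entry, built from the declared statements.
data "aws_iam_policy_document" "irsa" {
  for_each = var.irsa

  dynamic "statement" {
    for_each = each.value.iam_policy
    content {
      sid       = statement.key
      actions   = statement.value.actions
      resources = statement.value.resources
    }
  }
}

resource "aws_iam_policy" "irsa" {
  for_each = var.irsa
  name     = "${var.env}-${each.key}"
  policy   = data.aws_iam_policy_document.irsa[each.key].json
}

# One IAM role per entry, trusted only by the given namespace/service account
# via the cluster's OIDC provider.
module "irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
  version = "~> 5.0"

  for_each = var.irsa

  create_role      = true
  role_name        = "${var.env}-${each.key}"
  provider_url     = module.eks.oidc_provider
  role_policy_arns = [aws_iam_policy.irsa[each.key].arn]

  oidc_fully_qualified_subjects = [
    "system:serviceaccount:${each.value.namespace}:${each.value.service_account_name}"
  ]
}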

When a PR is opened (and merged), the GitHub Actions pipelines automatically pick up any changes and apply them.

Provisioning a cluster includes:

  • VPC
  • EKS
  • RDS
  • App-specific RDS database credentials for each app in AWS Secrets Manager (see the sketch after this list)
  • IAM settings (especially IRSA)
  • Route53 settings
  • An ECR repository for each app (if the app is a github.com/clear-route repository)
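As a sketch of the per-app database credentials mentioned above, the setup could look roughly like this; the resource names, the apps variable, and the RDS output are assumptions.

# infra/rds_secrets.tf (illustrative sketch only)
resource "random_password" "app_db" {
  for_each = toset(var.apps) # hypothetical list of app names
  length   = 32
  special  = false
}

resource "aws_secretsmanager_secret" "app_db" {
  for_each = toset(var.apps)
  name     = "${var.env}/${each.key}/rds"
}

resource "aws_secretsmanager_secret_version" "app_db" {
  for_each  = toset(var.apps)
  secret_id = aws_secretsmanager_secret.app_db[each.key].id

  secret_string = jsonencode({
    username = each.key
    password = random_password.app_db[each.key].result
    host     = module.rds.db_instance_address # assumed output of the upstream RDS module
    port     = var.rds.port
  })
}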

Furthermore, ArgoCD is installed on the EKS cluster using Helm.
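For illustration, this bootstrap installation could be expressed as a helm_release sketch like the one below; the file name and values are assumptions.

# infra/argocd.tf (illustrative sketch only)
resource "helm_release" "argocd" {
  name             = "argocd"
  namespace        = "argocd"
  create_namespace = true

  repository = "https://argoproj.github.io/argo-helm"
  chart      = "argo-cd"
}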

GitOps Bridge: Handover from Terraform to ArgoCD

We aim for a clear separation between Terraform-managed resources and Kubernetes (EKS)-managed resources. We strictly avoid managing Kubernetes manifests or Helm charts with Terraform, as the Kubernetes and ArgoCD reconciliation loops conflict with Terraform's ad-hoc, plan/apply-driven approach.

However, ArgoCD sometimes needs values that are managed by Terraform. This "gap" problem is known as the GitOps bridge.

To pass computed values to ArgoCD securely, an ArgoCD cluster secret is created during cluster provisioning that contains all required Terraform values as annotations. These annotations can then be consumed in ArgoCD by an ApplicationSet with a cluster generator, using {{ metadata.annotations.<annotation> }}.
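For illustration, such a cluster secret might be created from Terraform roughly as follows; the file name, secret name, annotation values, and referenced outputs are assumptions.

# infra/gitops_bridge.tf (illustrative sketch only)
resource "kubernetes_secret" "argocd_cluster" {
  metadata {
    name      = "in-cluster"
    namespace = "argocd"

    labels = {
      "argocd.argoproj.io/secret-type" = "cluster"
    }

    # Terraform-computed values handed over to ArgoCD.
    annotations = {
      cluster_name          = var.env
      tld                   = var.tld    # assumed variable
      region                = var.region
      external_dns_role_arn = module.external_dns_irsa.iam_role_arn # assumed module output
    }
  }

  data = {
    name   = "in-cluster"
    server = "https://kubernetes.default.svc"
  }
}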

Here is a short example, the ExternalDNS ApplicationSet:

# applications/argocd/dev/external-dns.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: external-dns
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            argocd.argoproj.io/secret-type: cluster
  template:
    metadata:
      name: external-dns
    spec:
      project: default
      sources:
        - repoURL: https://kubernetes-sigs.github.io/external-dns/
          chart: external-dns
          targetRevision: 1.19.0
          helm:
            values: |
              provider:
                name: aws

              domainFilters:
                - {{ metadata.annotations.cluster_name }}.{{ metadata.annotations.tld }}

              env:
                - name: AWS_DEFAULT_REGION
                  value: {{ metadata.annotations.region }}

              serviceAccount:
                annotations:
                  eks.amazonaws.com/role-arn: {{ metadata.annotations.external_dns_role_arn }}
      destination:
        server: https://kubernetes.default.svc
        namespace: external-dns
      syncPolicy:
        syncOptions:
          - CreateNamespace=true
        automated:
          prune: true
          selfHeal: true