Karpenter

We use Karpenter for Cluster Autoscaling.

Karpenter dynamically adds and removes nodes in the EKS cluster based on:

  • pending pod requirements
  • node utilization
  • available AWS instance types
  • current AWS pricing
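
The actual node pools are configured outside this page, but as a rough illustration, a Karpenter NodePool for the spot tier might look something like the sketch below. Names and values here (the pool name, the EC2NodeClass name, the instance requirements) are illustrative assumptions, not the cluster's real configuration, and depending on the installed Karpenter version the API version and fields may differ slightly:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot # illustrative name
spec:
  template:
    metadata:
      labels:
        workload-tier: spot # the label workloads select on (see "Workload pinning" below)
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default # assumed EC2NodeClass name
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized # let Karpenter consolidate underutilized nodes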

Node types

Generally, C11N has 3 types of nodes:

1. EKS Managed Node Group (workload-tier: baseline)

The EKS managed node group (MNG) created by Terraform. The MNG nodes are also tainted, which prevents non-critical workloads from being scheduled on them.
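
A cluster-critical workload that has to run on these nodes therefore needs a matching toleration in addition to the nodeSelector. A minimal sketch, assuming the taint is workload-tier=baseline with effect NoSchedule (the real taint key/value is defined in Terraform and may differ):

spec:
  template:
    spec:
      nodeSelector:
        workload-tier: baseline
      tolerations:
        - key: workload-tier # assumed taint key; check the Terraform node group definition
          operator: Equal
          value: baseline
          effect: NoSchedule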

2. On-Demand Instances (workload-tier: on-demand)

Regular EC2 instances billed per-second/hour with no commitment. They provide guaranteed capacity (within AWS service limits) and are not interrupted.

3. Spot Instances (workload-tier: spot)

Spot instances use spare AWS capacity at a large discount (typically 60–90% cheaper), but AWS may interrupt and reclaim the instance with 2 minutes notice.

Tip

Using Spot instances significantly reduces compute cost for Constellation clusters.

Constellations Compute Strategy

Our strategy is as follows:

  • Cluster Critical Workloads (ArgoCD, CoreDNS, Cilium, Karpenter, Traefik) -> workload-tier: baseline
  • Business Critical Workloads (Passport, ...) -> workload-tier: on-demand
  • All other Applications -> workload-tier: spot

This strategy reduces compute costs while keeping workloads relatively stable and predictable.

Workload pinning

using spec.template.spec.nodeSelector

Add a spec.template.spec.nodeSelector to your Deployments, e.g. workload-tier: on-demand to pin them to On-Demand nodes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment
spec:
  replicas: 1
  template:
    spec:
      nodeSelector:
        workload-tier: on-demand # or: spot | baseline

kustomize

Or set it for all Deployments with a Kustomize patch:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

patches:
  - target:
      kind: Deployment # applies to all Deployments
    patch: |-
      - op: add
        path: /spec/template/spec/nodeSelector
        value: 
          workload-tier: on-demand

PodDisruptionBudgets

PodDisruptionBudgets (PDBs) ensure Kubernetes does not evict too many pods at once during events such as:

  • node scale-down
  • Karpenter consolidation
  • Spot interruptions
  • planned maintenance

Example: a PDB for a Deployment labelled app=my-service running 2+ replicas:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  selector:
    matchLabels:
      app: my-service
  minAvailable: 1

  • This ensures at least 1 pod is always running.
  • Voluntary disruptions (Karpenter draining, node upgrades, etc.) are allowed as long as that condition stays true.
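
For Deployments with a larger replica count, a percentage-based budget can be a better fit than a fixed minAvailable; a sketch:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  selector:
    matchLabels:
      app: my-service
  maxUnavailable: 25% # at most a quarter of the pods may be disrupted at a time

Either form works; just make sure the budget still leaves room for at least one voluntary disruption, otherwise Karpenter cannot drain nodes and consolidation or Spot replacement will stall.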