# Compute

## Karpenter
We use Karpenter for cluster autoscaling.
Karpenter dynamically adds and removes nodes in the EKS cluster based on:
- pending pod requirements
- node utilization
- available AWS instance types
- current AWS pricing
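The `workload-tier` label referenced throughout this page is typically applied by the Karpenter NodePools (and by the Terraform-managed node group for the baseline tier). As an illustration only, a Spot NodePool could look roughly like the sketch below; the API version, NodePool name, and `EC2NodeClass` name are assumptions, not the actual cluster configuration.

```yaml
# Illustrative sketch only, assuming Karpenter's v1 API and an existing
# EC2NodeClass named "default"; not the actual cluster configuration.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot
spec:
  template:
    metadata:
      labels:
        workload-tier: spot            # label that workloads select on
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                  # assumed name
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]             # "on-demand" for the on-demand tier
```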
## Node types
Generally, C11N has 3 types of nodes:
### 1. EKS Managed Node Group (workload-tier: baseline)
The EKS managed node group (MNG) is created by Terraform. MNG nodes are also tainted, which prevents non-critical workloads from being scheduled on them.
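Cluster-critical workloads that should run on these nodes therefore need a matching toleration in addition to the baseline node selector. A rough sketch follows; the actual taint key, value, and effect come from the Terraform node group definition, and the ones below are placeholders.

```yaml
# Placeholder taint key/value; check the Terraform MNG definition for the real ones.
spec:
  template:
    spec:
      nodeSelector:
        workload-tier: baseline
      tolerations:
        - key: workload-tier
          operator: Equal
          value: baseline
          effect: NoSchedule
```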
### 2. On-Demand Instances (workload-tier: on-demand)
Regular EC2 instances billed per-second/hour with no commitment. They provide guaranteed capacity (within AWS service limits) and are not interrupted.
### 3. Spot Instances (workload-tier: spot)
Spot instances use spare AWS capacity at a large discount (typically 60–90% cheaper), but AWS may interrupt and reclaim the instance with two minutes' notice.
> **Tip:** Using Spot instances significantly reduces compute cost for Constellation clusters.
## Constellation's Compute Strategy
Our strategy is as follows:

- Cluster-critical workloads (ArgoCD, CoreDNS, Cilium, Karpenter, Traefik) -> `workload-tier: baseline`
- Business-critical workloads (Passport, ...) -> `workload-tier: on-demand`
- All other applications -> `workload-tier: spot`
This strategy reduces compute costs while keeping workloads relatively stable and predictable.
## Workload pinning

### Using `spec.template.spec.nodeSelector`
Add a `spec.template.spec.nodeSelector` to your Deployments, e.g. with `workload-tier: on-demand` for On-Demand nodes:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment
spec:
  replicas: 1
  template:
    spec:
      nodeSelector:
        workload-tier: on-demand # or: spot / baseline
```
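If no node with the requested label currently has capacity, the pod stays Pending and Karpenter provisions a matching node based on the pending pod's requirements.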
### kustomize
Or in a kustomize patch:
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
patches:
  - target:
      kind: Deployment # applies to all Deployments
    patch: |-
      - op: add
        path: /spec/template/spec/nodeSelector
        value:
          workload-tier: on-demand
```
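Running `kustomize build` (or `kubectl kustomize`) on the overlay lets you confirm that the `nodeSelector` was injected into every Deployment before committing the change.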
## PodDisruptionBudgets
PodDisruptionBudgets (PDBs) ensure Kubernetes does not evict too many pods at once during events such as:
- node scale-down
- Karpenter consolidation
- Spot interruptions
- planned maintenance
Example: for a Deployment labeled `app: my-service` running 2 or more replicas:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  selector:
    matchLabels:
      app: my-service
  minAvailable: 1
```
- This ensures at least 1 pod remains available at all times
- Voluntary disruptions (Karpenter draining, node upgrades, etc.) are allowed as long as that condition holds
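To confirm the budget is in effect, `kubectl get pdb my-service-pdb` shows how many voluntary disruptions are currently allowed.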