VPA

The Vertical Pod Autoscaler (VPA) adjusts the CPU and memory requests and limits of Pods so that they match observed usage.

Complexity

Unlike HPA, VPA is not part of core Kubernetes. It requires deploying three components (Recommender, Admission plug-in, Updater) and depends on the Metrics Server.

It is less commonly used than HPA because of this added complexity and its narrower set of use cases.

Components of VPA

Recommender:

Analyzes resource usage from PodMetrics and suggests optimal CPU/memory requests.

Admission Plug-in:

Applies updated resource requests/limits to newly created Pods.

Updater:

Evicts running Pods whose resource settings are out of date, so that the Admission plug-in can apply the new values when the Pods are recreated.

Example


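# VPA targets a workload controller rather than a bare Pod,
# so that evicted Pods are recreated by the controller.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
        - name: sample
          image: sample-image:1.0
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
            limits:
              cpu: 100m
              memory: 50Mi
---
# autoscaling.k8s.io/v1 is the stable VPA API; v1beta2 is deprecated.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample
spec:
  # The controller whose Pods VPA manages
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample
  resourcePolicy:
    containerPolicies:
      - containerName: '*'   # apply to every container in the Pod
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: 1
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
  updatePolicy:
    updateMode: Recreate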

Key Settings

minAllowed and maxAllowed: Lower and upper bounds on the resource requests VPA may set.

controlledResources: Specifies which resources (CPU/memory) to scale.

updateMode:

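Auto: The default when updateMode is omitted; has historically behaved the same as Recreate.
Recreate: Evicts running Pods so they are recreated with the recommended values.
Initial: Sets resource values only when a Pod is created; never evicts running Pods.
Off: Only computes recommendations without making changes.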
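
As a minimal sketch of the settings above, the following manifest runs VPA in recommendation-only mode and restricts it to memory. It assumes that a Deployment named sample exists and that the autoscaling.k8s.io/v1 API is installed in the cluster; the object name is hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-recommend-only    # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample                 # assumed workload
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources: ["memory"]   # leave CPU requests untouched
        minAllowed:
          memory: 50Mi
        maxAllowed:
          memory: 500Mi
  updatePolicy:
    # Quoted: a bare Off may be read as a boolean by YAML 1.1 parsers.
    updateMode: "Off"
```

Moving later to Initial or Recreate should only require changing updateMode; the bounds and controlled resources stay the same.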

Use Cases

Primary Use Case: Resource tuning for workloads with uncertain CPU/memory profiles in production.

Often used in Off mode to recommend optimal values for manual adjustments.
Useful for under- or over-requested resources, ensuring better utilization and reducing infrastructure waste.

Autoscaling: Rarely used in full Recreate mode due to risks like Pod eviction during peak usage.
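
For the Off-mode workflow, the Recommender's output is written to the status of the VPA object, where it can be read (for example with kubectl get vpa sample -o yaml) and copied into the workload manifest by hand. The field names below follow the autoscaling.k8s.io/v1 API; the numbers are invented for illustration and are not real measurements.

```yaml
# Excerpt of a VerticalPodAutoscaler object's status (values are made up)
status:
  recommendation:
    containerRecommendations:
      - containerName: sample
        lowerBound:          # requests below this are likely too small
          cpu: 100m
          memory: 80Mi
        target:              # the value VPA would apply
          cpu: 250m
          memory: 120Mi
        uncappedTarget:      # target before minAllowed/maxAllowed are applied
          cpu: 250m
          memory: 120Mi
        upperBound:          # requests above this are likely wasteful
          cpu: 1
          memory: 400Mi
```

In practice the target values are the ones usually copied into the container's resources.requests.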

Advantages

Provides actionable resource recommendations based on live usage.

Helps optimize resource allocation without immediate changes, giving engineers control over implementation.

Limitations

Complexity: Requires additional components and configuration compared to HPA.

Limited Applicability: Less effective than horizontal scaling for absorbing load changes; primarily a resource tuning aid.

Eviction Risks: In Recreate mode, Pods are evicted and restarted to apply new values, which can disrupt workloads.
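
One way to reduce the eviction risk is a PodDisruptionBudget, since the Updater removes Pods through the eviction API, which honors such budgets. A minimal sketch, assuming the workload's Pods carry the label app: sample (a hypothetical label) and run with at least two replicas:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: sample-pdb       # hypothetical name
spec:
  minAvailable: 1        # keep at least one Pod running during evictions
  selector:
    matchLabels:
      app: sample        # assumed Pod label
```

With a single replica, minAvailable: 1 would block evictions entirely, so VPA could never apply a new recommendation to the running Pod.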