VPA

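The Vertical Pod Autoscaler (VPA) adjusts the CPU and memory requests (and, with them, the limits) of a Pod's containers based on observed usage, so that each container requests roughly what it actually needs.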
Complexity
  • Unlike HPA, VPA is not part of core Kubernetes; it requires deploying three components (Recommender, Admission Controller, Updater) in addition to the Metrics Server.
  • Less commonly used than HPA because of this extra setup and its narrower set of use cases.
Components of VPA
  1. Recommender:
      • Analyzes resource usage from PodMetrics and suggests optimal CPU/memory requests.
  2. Admission Controller (admission plug-in):
      • Applies the recommended resource requests/limits to newly created Pods.
  3. Updater:
      • Evicts running Pods whose requests have drifted from the recommendation, so that the recreated Pods pick up the new resource configuration at admission.
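To make the Recommender's role more concrete, the fragment below sketches the kind of recommendation it writes into the status of a VerticalPodAutoscaler object. The field names follow the autoscaling.k8s.io/v1 API; the container name and every number are invented for illustration, not measured output.

```yaml
# Illustrative status of a VerticalPodAutoscaler object (all values invented).
status:
  recommendation:
    containerRecommendations:
      - containerName: sample      # hypothetical container name
        lowerBound:                # lower end of the recommended range
          cpu: 25m
          memory: 64Mi
        target:                    # recommended requests for new Pods
          cpu: 120m
          memory: 128Mi
        uncappedTarget:            # target before minAllowed/maxAllowed capping
          cpu: 120m
          memory: 128Mi
        upperBound:                # upper end of the recommended range
          cpu: 600m
          memory: 512Mi
```

Roughly speaking, the admission step applies target to new Pods, while the Updater uses the lower and upper bounds to judge whether a running Pod has drifted far enough to be worth evicting.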
Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
        - name: sample
          image: sample-image:1.0
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
            limits:
              cpu: 100m
              memory: 50Mi
---
# VPA targets the controller that owns the Pods (here a Deployment), not a bare Pod.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: 1
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
  updatePolicy:
    updateMode: Recreate
Key Settings
  • minAllowed and maxAllowed: Lower and upper bounds on the resource requests VPA may recommend for a container.
  • controlledResources: Specifies which resources (CPU/memory) to scale.
  • updateMode:
    • Recreate: Evicts Pods to apply changes.
    • Initial: Sets resource values at creation but doesn't evict Pods.
    • Off: Only recommends values without making changes.
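As a sketch of how these settings combine, the manifest below scales only memory for one container, excludes a sidecar entirely, and uses Initial mode so that nothing is evicted. The object and container names (app-vpa, app, log-sidecar) and the bounds are hypothetical; the per-container mode field is part of the autoscaling.k8s.io/v1 API but is not covered in the notes above.

```yaml
# Sketch only: "app-vpa", "app", and "log-sidecar" are hypothetical names.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  updatePolicy:
    updateMode: "Initial"                 # set values at Pod creation, never evict
  resourcePolicy:
    containerPolicies:
      - containerName: app
        controlledResources: ["memory"]   # leave CPU requests untouched
        minAllowed:
          memory: 64Mi
        maxAllowed:
          memory: 1Gi
      - containerName: log-sidecar
        mode: "Off"                       # exclude this container from autoscaling
```

Limiting controlledResources to memory is one way to let HPA keep scaling on CPU while VPA tunes memory, since both autoscalers acting on the same metric tend to conflict.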
 
Use Cases
  • Primary Use Case: Resource tuning for workloads with uncertain CPU/memory profiles in production.
    • Often used in Off mode to recommend optimal values for manual adjustments.
    • Useful for spotting under- or over-requested resources, which improves utilization and reduces infrastructure waste.
  • Autoscaling: Rarely used in full Recreate mode due to risks like Pod eviction during peak usage.
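A minimal sketch of the recommendation-only setup described above, assuming a Deployment named api already exists (both names below are hypothetical). With updateMode set to Off, VPA computes recommendations but neither the admission step nor the Updater changes any Pod.

```yaml
# Recommendation-only VPA; "api" and "api-recommend-only" are hypothetical names.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-recommend-only
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"    # only publish recommendations, never apply them
```

Once some usage history has accumulated, the recommendation can be read with kubectl describe vpa api-recommend-only and copied into the Deployment's resource requests by hand.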
Advantages
  • Provides actionable resource recommendations based on live usage.
  • In Off mode, helps optimize resource allocation without changing running workloads, leaving engineers in control of when and how new values are rolled out.
Limitations
  • Complexity: Requires additional components and configuration compared to HPA.
  • Limited Applicability: Less effective as a scaling mechanism than horizontal scaling; primarily a resource tuning aid.
  • Eviction Risks: Using Recreate mode can disrupt workloads during scaling.
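One common way to soften the eviction risk is a PodDisruptionBudget. The Updater evicts Pods through the Kubernetes eviction API, so a budget caps how many Pods can be down at once. The sketch below assumes a workload labelled app: app with at least two replicas; the names and the minAvailable value are illustrative, not taken from the notes above.

```yaml
# Sketch only: assumes Pods labelled "app: app" running with 2+ replicas.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb            # hypothetical name
spec:
  minAvailable: 1          # evictions are refused if fewer than 1 Pod would remain
  selector:
    matchLabels:
      app: app
```

A budget only limits concurrent disruption; it does not stop evictions from happening at peak load, so Off or Initial mode remains the safer choice for sensitive workloads.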