1. Prerequisites
- EKS Cluster: Ensure you have an EKS cluster set up and configured with kubectl access.
- Helm Installed: Helm should be installed on your machine and configured to communicate with the EKS cluster.
- IAM Permissions: Ensure your worker nodes have permissions to access CloudWatch, EC2, and any other AWS services necessary for metrics collection and logging.
- kubectl Configured: kubectl should be configured to manage your EKS cluster.
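A quick way to sanity-check these prerequisites before continuing (standard kubectl, Helm, and AWS CLI calls; adjust profile and region to your environment):
# Confirm kubectl is pointed at the intended EKS cluster
kubectl config current-context
kubectl get nodes
# Confirm Helm is installed and can reach the cluster
helm version
# Confirm which AWS identity you are operating as
aws sts get-caller-identity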
2. Add the Prometheus Community Helm Repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
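You can verify the chart is now visible from the repository before installing it:
helm search repo prometheus-community/kube-prometheus-stack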
3. Create a Dedicated Namespace
kubectl create namespace monitoring
4. Deploy kube-prometheus-stack
Customize the Helm values for production so the stack matches your needs, such as longer retention, persistent storage, higher resource requests, and multiple replicas.
- Create a values.yaml file for production configuration:
prometheus:
  prometheusSpec:
    retention: 15d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi
    replicas: 2
grafana:
  persistence:
    enabled: true
    size: 10Gi
  adminPassword: "your-secure-password"
alertmanager:
  alertmanagerSpec:
    replicas: 2
- Deploy the stack using Helm:
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  -f values.yaml
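You can confirm the release installed successfully before continuing:
helm status kube-prometheus-stack -n monitoring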
5. Expose Prometheus and Grafana Using a Load Balancer (Production)
- Edit the Prometheus service:
kubectl edit svc kube-prometheus-stack-prometheus -n monitoring
Change type: ClusterIP to type: LoadBalancer.
- Edit the Grafana service similarly (a non-interactive alternative is sketched at the end of this step):
kubectl edit svc kube-prometheus-stack-grafana -n monitoring
- Verify the services:
kubectl get svc -n monitoring
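If you prefer not to hand-edit Helm-managed services (manual edits can be overwritten on the next helm upgrade), you can patch the service type non-interactively, or set the service type in values.yaml instead; a sketch of the patch approach:
# Switch both services to type LoadBalancer without opening an editor
kubectl patch svc kube-prometheus-stack-prometheus -n monitoring \
  -p '{"spec": {"type": "LoadBalancer"}}'
kubectl patch svc kube-prometheus-stack-grafana -n monitoring \
  -p '{"spec": {"type": "LoadBalancer"}}'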
6. Set up IAM Roles for Service Accounts (IRSA)
Enable secure communication between EKS and AWS services:
eksctl create iamserviceaccount \
  --name prometheus \
  --namespace monitoring \
  --cluster <cluster-name> \
  --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
  --approve \
  --override-existing-serviceaccounts
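If you want Prometheus to run under this IRSA-enabled service account rather than the one the chart creates for it, you can point the chart at it in values.yaml. This is a sketch; verify the exact value names against your chart version:
prometheus:
  serviceAccount:
    create: false
    name: prometheus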
7. Secure Grafana with Ingress and SSL
Create an Ingress resource for Grafana:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
spec:
  rules:
  - host: grafana.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kube-prometheus-stack-grafana
            port:
              number: 80
  tls:
  - hosts:
    - grafana.example.com
    secretName: grafana-tls
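This Ingress assumes an ingress controller (for example the AWS Load Balancer Controller or NGINX) is installed in the cluster and that the grafana-tls secret exists. A sketch of creating the secret from an existing certificate and applying the manifest (file names are placeholders):
# Create the TLS secret referenced by the Ingress
kubectl create secret tls grafana-tls \
  --cert=grafana.crt --key=grafana.key \
  -n monitoring
# Apply the Ingress manifest, assuming it was saved as grafana-ingress.yaml
kubectl apply -f grafana-ingress.yaml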
8. Set Up Resource Limits
Add resource requests and limits for Prometheus and Grafana in values.yaml (Prometheus resources are set under prometheusSpec):
prometheus:
  prometheusSpec:
    resources:
      limits:
        cpu: 2
        memory: 4Gi
      requests:
        cpu: 1
        memory: 2Gi
grafana:
  resources:
    limits:
      cpu: 1
      memory: 2Gi
    requests:
      cpu: 500m
      memory: 1Gi
9. Monitor the Deployment
- Check the status of your pods:
kubectl get pods -n monitoring
- Verify the services:
kubectl get svc -n monitoring
10. Access Grafana and Prometheus
- Access Grafana via the LoadBalancer DNS name or Ingress URL.
- Retrieve the Grafana admin password:
kubectl get secret --namespace monitoring kube-prometheus-stack-grafana -o jsonpath="{.data.admin-password}" | base64 --decode
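If you have not exposed Grafana externally yet, you can also reach it temporarily with a port-forward and log in as admin with the password above:
kubectl port-forward svc/kube-prometheus-stack-grafana 3000:80 -n monitoring
# Then open http://localhost:3000 in your browser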
11. Set Up Alerting
Add the Alertmanager configuration to your values.yaml:
alertmanager:
  config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['alertname']
      receiver: 'slack-notifications'
    receivers:
    - name: 'slack-notifications'
      slack_configs:
      - api_url: https://hooks.slack.com/services/xxxxxx/xxxxxx/xxxxxx
        channel: '#alerts'
12. Configure Horizontal Pod Autoscaling (HPA)
Prometheus itself is created by the operator as a StatefulSet and is scaled through prometheusSpec.replicas in values.yaml (set to 2 in step 4), so it is not a candidate for kubectl autoscale. For stateless components that run as Deployments, such as Grafana, you can add an HPA based on CPU usage (this requires metrics-server in the cluster, and note that with a ReadWriteOnce persistent volume additional Grafana replicas can only schedule on the node where the volume is attached):
kubectl autoscale deployment kube-prometheus-stack-grafana -n monitoring --cpu-percent=80 --min=1 --max=3
With these steps, you can deploy a robust, production-ready kube-prometheus-stack on AWS EKS, ensuring high availability, scalability, and secure monitoring.