HPA and VPA Autoscaling
2025-06-15
Autoscaling is one of Kubernetes' core capabilities: it adjusts resources dynamically in response to load. This article walks through the principles and practice of HPA (horizontal autoscaling), VPA (vertical autoscaling), KEDA (event-driven autoscaling), and the Cluster Autoscaler.
Autoscaling Landscape
HPA v2 (Horizontal Pod Autoscaler)
Scaling on CPU/Memory
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 3
  maxReplicas: 50
  metrics:
    # CPU utilization
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70      # target 70% CPU utilization
    # Memory usage
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 500Mi         # target average memory usage
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60   # scale-up stabilization window
      policies:
        - type: Percent
          value: 100                   # grow by at most 100% per period
          periodSeconds: 60
        - type: Pods
          value: 10                    # add at most 10 Pods per period
          periodSeconds: 60
      selectPolicy: Max                # pick the larger allowance
    scaleDown:
      stabilizationWindowSeconds: 300  # 5-minute scale-down window
      policies:
        - type: Percent
          value: 10                    # shrink by at most 10% per period
          periodSeconds: 60
      selectPolicy: Min                # pick the smaller allowance (more conservative)
```

The HPA formula:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / targetMetricValue)]

For example, with 5 replicas at 90% CPU utilization against a 70% target: ceil[5 * (90/70)] = ceil[6.43] = 7.
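The formula can be sketched in Python. This is a simplified model: the real controller also skips scaling when the ratio is within a tolerance band (default 0.1) to avoid churn, and the function name here is illustrative:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Simplified HPA replica calculation.

    Mirrors: desired = ceil[current * (current_metric / target_metric)].
    The tolerance band approximates the controller's default behavior.
    """
    ratio = current_metric / target_metric
    if abs(1.0 - ratio) <= tolerance:
        return current_replicas  # within tolerance: no change
    return math.ceil(current_replicas * ratio)

# Example from the text: 5 replicas at 90% CPU, target 70%
print(desired_replicas(5, 90, 70))  # → 7
```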
Scaling on Custom Metrics
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa-custom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 30
  metrics:
    # Pod-level custom metric
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"   # 1000 RPS per Pod on average
    # External metric (e.g. message queue depth)
    - type: External
      external:
        metric:
          name: queue_messages_ready
          selector:
            matchLabels:
              queue: orders
        target:
          type: Value
          value: "100"           # keep queue depth below 100
    # Object metric (e.g. Ingress QPS)
    - type: Object
      object:
        describedObject:
          apiVersion: networking.k8s.io/v1
          kind: Ingress
          name: app-ingress
        metric:
          name: requests_per_second
        target:
          type: Value
          value: "5000"
```

Metrics Server
Metrics Server is the prerequisite for HPA to work with resource metrics (CPU/Memory):
```bash
# Install metrics-server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Verify
kubectl top nodes
kubectl top pods -n production
```

Prometheus Adapter
Prometheus Adapter exposes custom metrics to the HPA:
```yaml
# prometheus-adapter configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
      - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          matches: "^(.*)_total$"
          as: "${1}_per_second"
        metricsQuery: 'rate(<<.Series>>{<<.LabelMatchers>>}[2m])'
      - seriesQuery: 'rabbitmq_queue_messages_ready'
        resources:
          overrides:
            namespace: {resource: "namespace"}
        name:
          as: "queue_messages_ready"
        metricsQuery: '<<.Series>>{<<.LabelMatchers>>}'
```

VPA (Vertical Pod Autoscaler)
VPA automatically adjusts a Pod's CPU and memory requests:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  updatePolicy:
    updateMode: "Auto"   # Auto | Recreate | Initial | Off
  resourcePolicy:
    containerPolicies:
      - containerName: app
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 4
          memory: 8Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits
      - containerName: sidecar
        mode: "Off"      # leave the sidecar container untouched
```

VPA Component Architecture
VPA Update Modes
| Mode | Description |
|---|---|
| Off | Only generates recommendations; never applies them |
| Initial | Applies recommendations only at Pod creation |
| Recreate | Applies recommendations by evicting and recreating Pods |
| Auto | Currently equivalent to Recreate; may support in-place updates in the future |
Note: HPA and VPA must not scale on the same metric (such as CPU) at the same time, or they will fight each other. A recommended combination: HPA driven by custom metrics, with VPA managing resource requests.
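How VPA arrives at a recommendation can be sketched roughly as follows. This is a heavily simplified illustration, not the actual recommender (which maintains decaying histograms per container); the percentile, margin value, and function name are assumptions made for the sketch:

```python
import math

def vpa_recommendation(usage_samples_millicores, min_allowed=100,
                       max_allowed=4000, percentile=0.9, margin=0.15):
    """Simplified sketch of a VPA-style CPU recommendation: take a high
    percentile of observed usage, add a safety margin, then clamp to
    the [minAllowed, maxAllowed] bounds from the resourcePolicy."""
    samples = sorted(usage_samples_millicores)
    idx = min(len(samples) - 1, math.ceil(percentile * len(samples)) - 1)
    target = samples[idx] * (1 + margin)
    return max(min_allowed, min(max_allowed, round(target)))

# 100 samples ramping from 10m to 1000m
samples = [10 * i for i in range(1, 101)]
print(vpa_recommendation(samples))  # → 1035 (p90 ≈ 900m, +15% margin)
```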
KEDA (Kubernetes Event-Driven Autoscaling)
KEDA scales workloads based on event sources and supports scaling from 0 to N:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  pollingInterval: 15
  cooldownPeriod: 300
  minReplicaCount: 0     # can scale to zero!
  maxReplicaCount: 100
  triggers:
    # RabbitMQ queue
    - type: rabbitmq
      metadata:
        host: amqp://user:pass@rabbitmq.default.svc:5672/
        queueName: orders
        queueLength: "50"     # one replica per 50 messages
    # Kafka topic lag
    - type: kafka
      metadata:
        bootstrapServers: kafka.default.svc:9092
        consumerGroup: order-group
        topic: orders
        lagThreshold: "100"
    # Prometheus metric
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: http_requests_total
        threshold: "1000"
        query: sum(rate(http_requests_total{deployment="order-processor"}[2m]))
```

KEDA supports 60+ event sources, including AWS SQS, Azure Service Bus, GCP Pub/Sub, Redis, MySQL, PostgreSQL, Cron, and more.
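The per-trigger arithmetic that KEDA feeds into its underlying HPA can be approximated as follows. This is a simplified sketch that ignores activation thresholds and HPA behavior policies; the function and parameter names are illustrative:

```python
import math

def keda_desired_replicas(trigger_values, max_replicas=100, min_replicas=0):
    """Simplified sketch of KEDA-style scaling: each trigger maps a
    metric value and its threshold to a replica demand, and the
    highest demand across triggers wins, clamped to the replica
    bounds from the ScaledObject."""
    demands = [math.ceil(value / threshold) for value, threshold in trigger_values]
    desired = max(demands, default=0)
    return max(min_replicas, min(max_replicas, desired))

# 230 messages in RabbitMQ (queueLength 50), Kafka lag of 620 (lagThreshold 100)
print(keda_desired_replicas([(230, 50), (620, 100)]))  # → 7
```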
Cluster Autoscaler
The Cluster Autoscaler adds and removes nodes based on Pod scheduling demand:
```yaml
# Example Cluster Autoscaler Deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
          command:
            - ./cluster-autoscaler
            - --v=4
            - --cloud-provider=aws
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
            - --balance-similar-node-groups=true
            - --skip-nodes-with-local-storage=false
            - --skip-nodes-with-system-pods=true
            - --scale-down-delay-after-add=10m
            - --scale-down-unneeded-time=10m
            - --scale-down-utilization-threshold=0.5
            - --max-graceful-termination-sec=600
```

Scale-up trigger: Pending Pods that cannot be scheduled due to insufficient resources. Scale-down trigger: a node's resource utilization stays below the threshold (default 50%) for a sustained period.
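The scale-down condition can be sketched as a utilization check. This is highly simplified: the real autoscaler also evaluates memory, PodDisruptionBudgets, and the skip flags shown in the config above, and requires the condition to hold for scale-down-unneeded-time; names here are illustrative:

```python
def node_is_scale_down_candidate(pod_requests_millicores, allocatable_millicores,
                                 threshold=0.5):
    """Simplified sketch of the Cluster Autoscaler scale-down check:
    a node whose requested-to-allocatable CPU ratio is below the
    utilization threshold (default 0.5) becomes a removal candidate."""
    utilization = sum(pod_requests_millicores) / allocatable_millicores
    return utilization < threshold

# Node with 4000m allocatable CPU running pods requesting 500m + 800m
print(node_is_scale_down_candidate([500, 800], 4000))  # → True (32.5% < 50%)
```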
Choosing a Scaling Strategy
| Scenario | Recommended approach |
|---|---|
| Web service with fluctuating traffic | HPA (CPU/RPS) + Cluster Autoscaler |
| Message consumers | KEDA (queue depth) |
| Batch jobs | KEDA (Cron/queue), scale to zero |
| Misallocated resources | VPA (observe in Off mode first) |
| Mixed workloads | HPA (custom metrics) + VPA (resource tuning) |
Summary
Key practices for autoscaling:
- Set sensible resource requests/limits first; they are the foundation HPA and VPA build on
- Configure HPA behavior to control scaling speed and avoid flapping
- Be more conservative scaling down than up: use a longer stabilization window
- Run VPA in Off mode to observe its recommendations before applying them automatically
- KEDA fits event-driven workloads and saves cost by scaling to zero
- Pair the Cluster Autoscaler with HPA for end-to-end elasticity