
# Kubernetes Core Architecture

## Architecture Overview

```
Control Plane:
  ├── API Server          → entry point for every operation, RESTful API
  ├── etcd                → cluster state store
  ├── Scheduler           → Pod scheduling decisions
  └── Controller Manager  → built-in controllers (Deployment/ReplicaSet/Service, etc.)

Worker Node:
  ├── kubelet             → node agent, manages the Pod lifecycle
  ├── kube-proxy          → network proxy, implements Service load balancing
  └── Container Runtime (containerd / CRI-O)
```

## Core Resource Objects

### Pod

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: order-pod
  labels:
    app: order-service
    version: v1.2.0
spec:
  containers:
    - name: order-service
      image: order-service:1.2.0
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "100m"
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"
      readinessProbe:
        httpGet:
          path: /actuator/health/readiness
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 3
      livenessProbe:
        httpGet:
          path: /actuator/health/liveness
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 10
        failureThreshold: 3
      env:
        - name: SPRING_PROFILES_ACTIVE
          value: "production"
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
      volumeMounts:
        - name: config
          mountPath: /app/config
  volumes:
    - name: config
      configMap:
        name: order-config
```
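As a rough model of the probe settings above: the kubelet needs `failureThreshold` consecutive failures before acting, so the earliest moment it can withdraw traffic (readiness) or restart the container (liveness) can be approximated (this ignores `timeoutSeconds`, probe jitter, and container startup time):

```python
def earliest_failure_time(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Earliest moment (seconds after container start) at which the kubelet
    can declare the probe failed: the first probe runs after initial_delay,
    and failure_threshold consecutive failures are required."""
    return initial_delay + (failure_threshold - 1) * period

# livenessProbe values above: no liveness-triggered restart before ~50s.
print(earliest_failure_time(30, 10, 3))   # 50
# readinessProbe values above: traffic can be withdrawn from ~20s.
print(earliest_failure_time(10, 5, 3))    # 20
```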

### Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most 1 extra Pod during the update
      maxUnavailable: 0  # no Pod may become unavailable during the update
  template:
    metadata:
      labels:
        app: order-service
        version: v1.2.0
    spec:
      # Pod anti-affinity: prefer spreading replicas across nodes
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: order-service
                topologyKey: kubernetes.io/hostname
      containers:
        - name: order-service
          image: order-service:1.2.0
          # ... same as the Pod spec above
```
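The strategy above can be checked with a small sketch of the documented rolling-update arithmetic (percentage values round up for `maxSurge` and down for `maxUnavailable`):

```python
import math

def rolling_update_bounds(replicas: int, max_surge, max_unavailable):
    """Pod-count bounds during a rolling update. Percentage strings follow
    the documented rounding: maxSurge rounds up, maxUnavailable rounds down."""
    def resolve(value, round_up):
        if isinstance(value, str) and value.endswith("%"):
            frac = int(value[:-1]) / 100 * replicas
            return math.ceil(frac) if round_up else math.floor(frac)
        return value
    surge = resolve(max_surge, round_up=True)
    unavailable = resolve(max_unavailable, round_up=False)
    return {"max_total": replicas + surge, "min_available": replicas - unavailable}

# The Deployment above: at most 4 Pods exist at once and 3 must stay
# available, so an old Pod is removed only after its replacement is Ready.
print(rolling_update_bounds(3, 1, 0))          # {'max_total': 4, 'min_available': 3}
# The default strategy (25%/25%) on 3 replicas gives the same bounds:
print(rolling_update_bounds(3, "25%", "25%"))  # {'max_total': 4, 'min_available': 3}
```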

### Service

```yaml
# ClusterIP (default; reachable only inside the cluster)
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  selector:
    app: order-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

---
# NodePort (exposed on a port of every node; spec fragment only)
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080

---
# LoadBalancer (cloud provider load balancer; spec fragment only)
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
```
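Inside the cluster, CoreDNS publishes every Service under a predictable name; a tiny helper illustrating the pattern (`cluster.local` is the default cluster domain and may differ per cluster):

```python
def service_fqdn(name: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """In-cluster DNS name for a Service. Pods in the same namespace can
    use just the short name; the full form works from any namespace."""
    return f"{name}.{namespace}.svc.{cluster_domain}"

print(service_fqdn("order-service", "production"))
# order-service.production.svc.cluster.local
```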

### Ingress

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    # Requests per second per client IP (ingress-nginx)
    nginx.ingress.kubernetes.io/limit-rps: "100"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
```

## Configuration and Secret Management

### ConfigMap

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-config
data:
  application.yaml: |
    server:
      port: 8080
    order:
      timeout: 5000
      max-retry: 3
  LOG_LEVEL: "INFO"
```

### Secret

```bash
# Create a Secret from literal values
kubectl create secret generic db-secret \
  --from-literal=username=admin \
  --from-literal=password=secret123

# Create a Secret from files
kubectl create secret generic tls-secret \
  --from-file=tls.crt=server.crt \
  --from-file=tls.key=server.key
```
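Note that Secret values are only base64-encoded, not encrypted: anyone who can read the Secret object can recover them. The stored form of the password above can be reproduced directly:

```python
import base64

# This mirrors what shows up under .data.password when the Secret
# created above is read back with `kubectl get secret db-secret -o yaml`.
encoded = base64.b64encode(b"secret123").decode()
print(encoded)                             # c2VjcmV0MTIz
print(base64.b64decode(encoded).decode())  # secret123
```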

## HPA (Horizontal Pod Autoscaling)

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    # Custom metric (requires the Prometheus Adapter)
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300  # scale down more conservatively
```
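The scaling decision follows the documented formula `desired = ceil(current * currentMetric / targetMetric)`, clamped to the min/max bounds; a sketch (the real controller adds a tolerance band, stabilization windows, and readiness handling on top of this):

```python
import math

def hpa_desired_replicas(current_replicas: int, current_value: float,
                         target_value: float, min_replicas: int,
                         max_replicas: int) -> int:
    """Core HPA scaling formula, clamped to [min_replicas, max_replicas]."""
    desired = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(desired, max_replicas))

# 5 replicas averaging 140% CPU against the 70% target: scale to 10.
print(hpa_desired_replicas(5, 140, 70, 2, 20))  # 10
# Load drops to 20%: the formula says 2, which minReplicas also enforces.
print(hpa_desired_replicas(5, 20, 70, 2, 20))   # 2
```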

## Resource Quotas and Limits

```yaml
# Namespace-level resource quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
    services: "20"

---
# LimitRange (default per-container resource limits)
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: "200m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      max:
        cpu: "2"
        memory: "2Gi"
```
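A sketch (not the real admission plugin, which also derives requests from explicit limits and validates against `max`) of how the LimitRange defaults get filled in for containers that omit a `resources` stanza:

```python
# Defaults taken from the LimitRange above: `default` fills missing
# limits, `defaultRequest` fills missing requests.
DEFAULTS = {"limits": {"cpu": "200m", "memory": "256Mi"},
            "requests": {"cpu": "100m", "memory": "128Mi"}}

def apply_limit_range(container: dict) -> dict:
    resources = container.setdefault("resources", {})
    for section, default in DEFAULTS.items():
        resources.setdefault(section, dict(default))
    return container

# A container submitted with no resources stanza gets both filled in:
print(apply_limit_range({"name": "order-service"})["resources"])
```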

## Network Policies

```yaml
# Restrict access to order-service: allow traffic only from the production
# namespace or from frontend Pods (the two `from` entries are OR'd)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: order-service-netpol
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: order-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: production
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: mysql
      ports:
        - protocol: TCP
          port: 3306
    - to:  # allow DNS
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
```
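A subtlety worth remembering: separate entries under `from` are OR'd, while combining `namespaceSelector` and `podSelector` inside a single entry would AND them. A toy evaluator of the OR rule above (the peer dictionaries and their label keys are illustrative, not a Kubernetes API):

```python
def ingress_allowed(peer: dict, from_entries: list) -> bool:
    """A peer is allowed if it matches ANY entry in `from` (OR semantics).
    A bare podSelector only matches Pods in the policy's own namespace."""
    def matches(entry):
        if "namespaceSelector" in entry:
            labels = entry["namespaceSelector"].get("matchLabels", {})
            return all(peer.get("ns_labels", {}).get(k) == v for k, v in labels.items())
        if "podSelector" in entry:
            labels = entry["podSelector"].get("matchLabels", {})
            return all(peer.get("pod_labels", {}).get(k) == v for k, v in labels.items())
        return False
    return any(matches(e) for e in from_entries)

FROM = [{"namespaceSelector": {"matchLabels": {"name": "production"}}},
        {"podSelector": {"matchLabels": {"role": "frontend"}}}]

print(ingress_allowed({"ns_labels": {"name": "production"}}, FROM))  # True
print(ingress_allowed({"pod_labels": {"role": "frontend"}}, FROM))   # True
print(ingress_allowed({"pod_labels": {"role": "batch"}}, FROM))      # False
```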

## Troubleshooting Cases

### Case 1: Pod stuck in Pending

Diagnosis:

```bash
kubectl describe pod <pod-name> -n production

# Common causes:
# 1. Insufficient resources ("Insufficient cpu/memory" events)
kubectl top nodes

# 2. Node selector / affinity rules cannot be satisfied
kubectl get nodes --show-labels

# 3. PVC not bound
kubectl get pvc -n production
```
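For cause 1, the scheduler's resource fit check boils down to simple arithmetic; a toy version (CPU in millicores):

```python
def fits(node_allocatable_m: int, node_requested_m: int, pod_request_m: int) -> bool:
    """A Pod fits on a node only if its requests fit into the node's
    allocatable capacity minus what already-scheduled Pods have requested."""
    return node_requested_m + pod_request_m <= node_allocatable_m

# Node with 4000m allocatable and 3950m already requested: even the
# modest 100m request from the Pod spec earlier cannot be placed there.
print(fits(4000, 3950, 100))  # False
print(fits(4000, 3800, 100))  # True
```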

### Case 2: Pod in CrashLoopBackOff

```bash
# Logs of the previous (crashed) container
kubectl logs <pod-name> -n production --previous

# Events
kubectl describe pod <pod-name> -n production

# Common causes:
# - Application fails to start (bad config, dependent service unavailable)
# - OOMKilled (memory limit too low)
# - Failing health checks (misconfigured readinessProbe/livenessProbe)
```
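For reference, the kubelet restarts a crashing container with an exponential backoff: starting at 10s, doubling per restart, capped at 5 minutes (and reset after 10 minutes of stable running). A sketch of the resulting delays:

```python
def backoff_delays(restarts: int, base: int = 10, cap: int = 300) -> list:
    """Restart delays in seconds for a container in CrashLoopBackOff:
    base delay doubles each restart, capped at `cap`."""
    return [min(base * 2 ** i, cap) for i in range(restarts)]

print(backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```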

### Case 3: Service unreachable

```bash
# Check the Endpoints
kubectl get endpoints order-service -n production

# If the Endpoints list is empty, check that the selector matches
kubectl get pods -n production -l app=order-service

# Check kube-proxy
kubectl get pods -n kube-system | grep kube-proxy

# Test DNS resolution
kubectl run debug --image=busybox --rm -it -- nslookup order-service.production.svc.cluster.local
```

## Monitoring Metrics

```bash
# Node resource usage
kubectl top nodes

# Pod resource usage
kubectl top pods -n production --sort-by=memory

# Cluster events
kubectl get events -n production --sort-by='.lastTimestamp'
```

PaaS Middleware Ecosystem: Deep-Dive Study Notes