Istio: The Service Mesh Control Plane

Architecture Overview

Istio is currently the most mature service mesh implementation: it pushes service governance down into the infrastructure layer, so applications gain traffic management, security, and observability without code changes.

┌─────────────────────────────────────────────────────────┐
│                 Control Plane (Istiod)                  │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────────┐   │
│  │  Pilot   │  │ Citadel  │  │        Galley        │   │
│  │ (traffic)│  │  (certs) │  │ (config validation)  │   │
│  └──────────┘  └──────────┘  └──────────────────────┘   │
└─────────────────────────────────────────────────────────┘
                         │ xDS API
┌─────────────────────────────────────────────────────────┐
│                       Data Plane                        │
│  Pod A                          Pod B                   │
│  ┌──────────────────┐           ┌──────────────────┐    │
│  │ App Container    │           │ App Container    │    │
│  │ Envoy Sidecar ◄──┼───mTLS───►│ Envoy Sidecar   │    │
│  └──────────────────┘           └──────────────────┘    │
└─────────────────────────────────────────────────────────┘

Sidecar Injection

bash
# Namespace-level automatic injection
kubectl label namespace production istio-injection=enabled

# Per-workload opt-in. Injection happens at Pod creation time, so the
# sidecar.istio.io/inject label must be set on the Pod template;
# labeling or annotating an already-running Pod has no effect:
kubectl patch deployment my-app --type merge \
  -p '{"spec":{"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"true"}}}}}'

# Per-workload opt-out (same mechanism, value "false")
kubectl patch deployment my-app --type merge \
  -p '{"spec":{"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"false"}}}}}'

Traffic Management

VirtualService

yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    # Canary testing: requests carrying the x-canary header go to v2
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: order-service
            subset: v2
    # Default: 90% to v1, 10% to v2
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10
      # Request timeout
      timeout: 5s
      # Retry policy
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: gateway-error,connect-failure,retriable-4xx

DestinationRule

yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    # Connection pool limits
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    # Circuit breaking (outlier detection)
    outlierDetection:
      consecutive5xxErrors: 5        # eject after 5 consecutive 5xx errors
      interval: 30s                  # detection scan interval
      baseEjectionTime: 30s          # minimum ejection duration
      maxEjectionPercent: 50         # eject at most 50% of instances
      minHealthPercent: 30           # keep at least 30% of instances healthy
    # Load balancing
    loadBalancer:
      simple: LEAST_CONN
  # Version subsets
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
      trafficPolicy:
        connectionPool:
          http:
            http2MaxRequests: 500

Gateway (Ingress)

yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: api-tls-cert
      hosts:
        - api.example.com

---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-vs
spec:
  hosts:
    - api.example.com
  gateways:
    - api-gateway
  http:
    - match:
        - uri:
            prefix: /orders
      route:
        - destination:
            host: order-service
            port:
              number: 8080
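
The SIMPLE TLS mode above reads the server certificate from a Kubernetes secret named by credentialName. The secret must live in the same namespace as the ingress gateway pods (istio-system in a default install). A minimal sketch, with placeholder base64 payloads:

yaml
apiVersion: v1
kind: Secret
metadata:
  name: api-tls-cert          # must match credentialName in the Gateway
  namespace: istio-system     # same namespace as the gateway pods
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>   # placeholder
  tls.key: <base64-encoded private key>   # placeholder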

Security: mTLS and Zero Trust

yaml
# Mesh-wide mTLS (strict mode)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system  # root namespace, so this applies mesh-wide
spec:
  mtls:
    mode: STRICT
---
# Namespace level
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
# Allow plaintext for one service during migration
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: order-service-permissive
  namespace: production
spec:
  selector:
    matchLabels:
      app: order-service
  mtls:
    mode: PERMISSIVE  # accepts both mTLS and plaintext
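
PeerAuthentication governs what the server side accepts; whether the client side originates mTLS is controlled by a DestinationRule. A sketch (the resource name order-service-tls is illustrative) that pins clients to mTLS while the server stays permissive:

yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service-tls   # illustrative name
  namespace: production
spec:
  host: order-service
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL   # clients present Istio-issued certificates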

AuthorizationPolicy (Access Control)

yaml
# Allow only the frontend service to call order-service
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: order-service-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: order-service
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - cluster.local/ns/production/sa/frontend-service
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/orders*"]
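
ALLOW rules like the one above are most effective on top of a default-deny baseline. An AuthorizationPolicy with an empty spec selects every workload in its namespace and, having no rules, matches no request, which denies all traffic not explicitly allowed elsewhere:

yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: production
spec: {}  # no rules: nothing matches, so every request is denied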

Traffic Mirroring (Shadow Testing)

yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service
            subset: v1
      # Mirror 100% of traffic to v2; mirrored responses are
      # discarded, so live responses are unaffected
      mirror:
        host: order-service
        subset: v2
      mirrorPercentage:
        value: 100.0

Fault Injection (Chaos Engineering)

yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service-fault
spec:
  hosts:
    - order-service
  http:
    - fault:
        # Delay injection: 50% of requests delayed by 5s
        delay:
          percentage:
            value: 50
          fixedDelay: 5s
        # Abort injection: 10% of requests return 503
        abort:
          percentage:
            value: 10
          httpStatus: 503
      route:
        - destination:
            host: order-service
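
As written, the fault hits a percentage of all live traffic. A safer common variation scopes the fault to requests carrying a dedicated test header, so chaos experiments never touch real users; a sketch, where the header name x-chaos-test is made up:

yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service-fault
spec:
  hosts:
    - order-service
  http:
    # Inject faults only for tagged test requests
    - match:
        - headers:
            x-chaos-test:        # hypothetical header set by the test client
              exact: "true"
      fault:
        delay:
          percentage:
            value: 100
          fixedDelay: 5s
      route:
        - destination:
            host: order-service
    # All other traffic is routed normally
    - route:
        - destination:
            host: order-service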

Observability

bash
# Service topology (Kiali)
kubectl port-forward svc/kiali 20001:20001 -n istio-system

# Distributed tracing (Jaeger)
kubectl port-forward svc/tracing 16686:80 -n istio-system

# Metrics dashboards (Grafana)
kubectl port-forward svc/grafana 3000:3000 -n istio-system

Troubleshooting Case Studies

Case 1: 503s Between Services

Symptom: service A gets 503 responses when calling service B, yet calling B directly works fine.

Diagnosis

bash
# Inspect Envoy access logs for the failing requests
kubectl logs <pod-name> -c istio-proxy | grep "503"

# Check cluster/circuit-breaker configuration as Envoy sees it
istioctl proxy-config cluster <pod-name> | grep order-service

# Check mTLS configuration. Older istioctl releases had
# `istioctl authn tls-check`; it has since been removed, and
# `istioctl x describe` now reports the effective mTLS settings:
istioctl x describe pod <pod-name>

Common causes

  • Circuit breaker tripped (outlierDetection)
  • mTLS mismatch (one side STRICT, the other side without an injected Sidecar)
  • Connection pool exhaustion

Case 2: Application Fails to Start After Sidecar Injection

Symptom: after Sidecar injection, the application Pod fails to start with connection-refused errors.

Cause: the application calls external dependencies during startup, but the Envoy Sidecar is not ready yet, so those connections are refused.

Fix

yaml
# Delay application start until the Sidecar is ready
metadata:
  annotations:
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: true
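
The annotation fixes one workload at a time; the same setting can also be enabled mesh-wide through meshConfig (a sketch via the IstioOperator API; note this slightly delays startup of every injected Pod):

yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true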

Case 3: Traffic Rules Not Taking Effect

Diagnosis

bash
# Check the applied configuration for common mistakes
istioctl analyze -n production

# Inspect Envoy's route configuration
istioctl proxy-config routes <pod-name> --name 8080

# Inspect Envoy's listeners
istioctl proxy-config listeners <pod-name>

Performance Overhead

Approximate resource cost of the Istio Sidecar (figures vary by Istio version and traffic profile; measure in your own environment):

  • CPU: roughly 0.5m~1m when idle, around 50m under heavy load
  • Memory: roughly 50~100 MB
  • Added latency: about 0.2 ms at P50 and 1 ms at P99

Tuning suggestions:

yaml
# Reduce the Sidecar's resource footprint (these values are applied
# through the sidecar injector's configuration, not on the app container)
resources:
  requests:
    cpu: 10m
    memory: 40Mi
  limits:
    cpu: 200m
    memory: 128Mi
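
Where these values are set depends on how Istio was installed; a per-workload alternative is the injector's resource annotations on the Pod template. A sketch mirroring the numbers above:

yaml
metadata:
  annotations:
    sidecar.istio.io/proxyCPU: "10m"
    sidecar.istio.io/proxyMemory: "40Mi"
    sidecar.istio.io/proxyCPULimit: "200m"
    sidecar.istio.io/proxyMemoryLimit: "128Mi"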

PaaS Middleware Ecosystem Deep-Dive Documentation