# Istio — Service Mesh Control Plane

## Architecture Overview

Istio is currently the most mature service mesh implementation. It pushes service-governance capabilities down into the infrastructure layer, so applications need no code changes. Since Istio 1.5 the historical control-plane components (Pilot, Citadel, Galley) are consolidated into a single `istiod` binary:
```
┌──────────────────────────────────────────────────────────┐
│                  Control Plane (Istiod)                  │
│  ┌───────────┐  ┌───────────┐  ┌──────────────────────┐  │
│  │   Pilot   │  │  Citadel  │  │        Galley        │  │
│  │ (traffic) │  │  (certs)  │  │ (config validation)  │  │
│  └───────────┘  └───────────┘  └──────────────────────┘  │
└──────────────────────────────────────────────────────────┘
                            │ xDS API
┌──────────────────────────────────────────────────────────┐
│                        Data Plane                        │
│   Pod A                           Pod B                  │
│  ┌──────────────────┐            ┌──────────────────┐    │
│  │  App Container   │            │  App Container   │    │
│  │  Envoy Sidecar ◄─┼───mTLS───► │  Envoy Sidecar   │    │
│  └──────────────────┘            └──────────────────┘    │
└──────────────────────────────────────────────────────────┘
```

## Sidecar Injection
```bash
# Automatic injection at the namespace level
kubectl label namespace production istio-injection=enabled

# Opt a single workload in. Injection happens at Pod creation via the
# admission webhook, so the label must be on the Pod template (e.g. in the
# Deployment spec) -- labeling an already-running Pod has no effect.
kubectl patch deployment my-app -p \
  '{"spec":{"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"true"}}}}}'

# Opt a single workload out
kubectl patch deployment my-app -p \
  '{"spec":{"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"false"}}}}}'
```

## Traffic Management
### VirtualService

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
  - order-service
  http:
  # Canary: requests carrying the x-canary header go to v2
  - match:
    - headers:
        x-canary:
          exact: "true"
    route:
    - destination:
        host: order-service
        subset: v2
  # Default: 90% to v1, 10% to v2
  - route:
    - destination:
        host: order-service
        subset: v1
      weight: 90
    - destination:
        host: order-service
        subset: v2
      weight: 10
    # Timeout
    timeout: 5s
    # Retries
    retries:
      attempts: 3
      perTryTimeout: 2s
      retryOn: gateway-error,connect-failure,retriable-4xx
```
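The routing rules above can be read as: a header match takes priority, otherwise traffic is split by subset weight. A minimal sketch of these semantics (illustrative only — Envoy's real weighted scheduler differs, and `pick_subset` is a made-up name):

```python
# Sketch of VirtualService-style routing: an exact header match wins,
# otherwise a deterministic modulo scheduler approximates the 90/10 split.

def pick_subset(headers: dict, counter: int) -> str:
    # Rule 1: requests carrying x-canary: "true" always go to v2.
    if headers.get("x-canary") == "true":
        return "v2"
    # Rule 2: weighted split -- 90 of every 100 requests go to v1.
    return "v1" if counter % 100 < 90 else "v2"

counts = {"v1": 0, "v2": 0}
for i in range(1000):
    counts[pick_subset({}, i)] += 1
print(counts)                                 # {'v1': 900, 'v2': 100}
print(pick_subset({"x-canary": "true"}, 0))   # v2
```

Note that the canary header rule is listed first in the VirtualService because rules are evaluated in order; putting the weighted default first would shadow it.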
### DestinationRule

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    # Connection pool
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    # Circuit breaking
    outlierDetection:
      consecutive5xxErrors: 5   # eject after 5 consecutive 5xx errors
      interval: 30s             # analysis interval
      baseEjectionTime: 30s     # minimum ejection time
      maxEjectionPercent: 50    # eject at most 50% of the instances
      minHealthPercent: 30      # keep at least 30% of instances healthy
    # Load balancing
    loadBalancer:
      simple: LEAST_CONN
  # Version subsets
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      connectionPool:
        http:
          http2MaxRequests: 500
```
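The `outlierDetection` block implements passive health checking: a host that returns enough consecutive 5xx responses is temporarily ejected from the load-balancing pool, capped by `maxEjectionPercent`. A small sketch of that logic (an illustration of the semantics, not Envoy's implementation; `OutlierDetector` is a made-up class):

```python
# Sketch of consecutive-5xx outlier detection with an ejection cap.

class OutlierDetector:
    def __init__(self, hosts, threshold=5, max_ejection_percent=50):
        self.hosts = list(hosts)
        self.threshold = threshold                  # consecutive5xxErrors
        self.max_ejection_percent = max_ejection_percent
        self.consecutive_5xx = {h: 0 for h in hosts}
        self.ejected = set()

    def record(self, host, status):
        if 500 <= status < 600:
            self.consecutive_5xx[host] += 1
            if self.consecutive_5xx[host] >= self.threshold:
                self._try_eject(host)
        else:
            self.consecutive_5xx[host] = 0          # any success resets the streak

    def _try_eject(self, host):
        # Enforce maxEjectionPercent: never eject past the cap.
        would_be = len(self.ejected | {host}) / len(self.hosts) * 100
        if would_be <= self.max_ejection_percent:
            self.ejected.add(host)

d = OutlierDetector(["a", "b", "c", "d"])
for _ in range(5):
    d.record("a", 503)
print(d.ejected)   # {'a'}
```

In real Envoy, ejected hosts also return to the pool after `baseEjectionTime` (multiplied by how often they have been ejected), which is omitted here.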
### Gateway (Ingress)

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: api-tls-cert
    hosts:
    - api.example.com
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-vs
spec:
  hosts:
  - api.example.com
  gateways:
  - api-gateway
  http:
  - match:
    - uri:
        prefix: /orders
    route:
    - destination:
        host: order-service
        port:
          number: 8080
```

## Security: mTLS and Zero Trust
```yaml
# Mesh-wide mTLS (strict mode)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace = applies mesh-wide
spec:
  mtls:
    mode: STRICT
---
# Namespace level
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
# Allow plaintext for one service (during migration)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: order-service-permissive
  namespace: production
spec:
  selector:
    matchLabels:
      app: order-service
  mtls:
    mode: PERMISSIVE   # accepts both mTLS and plaintext
```

### AuthorizationPolicy (Access Control)
```yaml
# Only allow the frontend service to call order-service
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: order-service-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: order-service
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/production/sa/frontend-service
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/orders*"]
```
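With `action: ALLOW`, a request is permitted only if some rule matches it on all of its constraints (principal from the mTLS identity, HTTP method, and path, where a trailing `*` is a wildcard). A sketch of that evaluation (illustrative only; `allowed` and `RULES` are made-up names, and `fnmatch` stands in for Istio's own path matcher):

```python
# Sketch of ALLOW-rule evaluation for the policy above.
from fnmatch import fnmatch

RULES = [{
    "principals": ["cluster.local/ns/production/sa/frontend-service"],
    "methods": ["GET", "POST"],
    "paths": ["/api/orders*"],   # trailing * matches any suffix
}]

def allowed(principal: str, method: str, path: str) -> bool:
    # A request is allowed if ANY rule matches on ALL three dimensions.
    return any(
        principal in r["principals"]
        and method in r["methods"]
        and any(fnmatch(path, p) for p in r["paths"])
        for r in RULES
    )

fe = "cluster.local/ns/production/sa/frontend-service"
print(allowed(fe, "GET", "/api/orders/42"))                           # True
print(allowed(fe, "DELETE", "/api/orders/42"))                        # False
print(allowed("cluster.local/ns/production/sa/other", "GET", "/api/orders"))  # False
```

Because an ALLOW policy exists on the workload, any request matching no rule is denied by default.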
## Traffic Mirroring (Shadow Testing)

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
  - order-service
  http:
  - route:
    - destination:
        host: order-service
        subset: v1
    # Mirror 100% of traffic to v2; mirrored responses are discarded,
    # so callers are unaffected
    mirror:
      host: order-service
      subset: v2
    mirrorPercentage:
      value: 100.0
```
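Mirroring is fire-and-forget: the caller only ever sees the primary's response, and mirror failures are swallowed. A minimal sketch of that contract (illustrative only; `handle` is a made-up name):

```python
# Sketch of fire-and-forget traffic mirroring.

def handle(request, primary, mirror=None, mirror_percent=100.0, roll=0.0):
    """Return the primary's response; the mirror's response is discarded."""
    if mirror is not None and roll < mirror_percent:
        try:
            mirror(request)        # response intentionally ignored
        except Exception:
            pass                   # a broken mirror must never affect the caller
    return primary(request)

seen = []
def crashing_mirror(request):
    seen.append(request)           # v2 received the shadow copy...
    raise RuntimeError("v2 crashed")  # ...then failed, harmlessly

resp = handle({"path": "/orders"}, lambda r: ("v1", 200), crashing_mirror)
print(resp)   # ('v1', 200)
print(seen)   # [{'path': '/orders'}]
```

This is why mirroring is useful for shadow testing a new version against production traffic: v2 can crash or return garbage without any user-visible impact.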
## Fault Injection (Chaos Engineering)

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service-fault
spec:
  hosts:
  - order-service
  http:
  - fault:
      # Inject latency: delay 50% of requests by 5s
      delay:
        percentage:
          value: 50
        fixedDelay: 5s
      # Inject errors: 10% of requests get a 503
      abort:
        percentage:
          value: 10
        httpStatus: 503
    route:
    - destination:
        host: order-service
```
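Each fault applies to a percentage of requests: a request falling under the delay percentage is held for `fixedDelay`, and one falling under the abort percentage gets the error status without reaching the upstream. A sketch of the per-request decision (a simplification — Envoy rolls delay and abort independently, while this sketch uses one roll for both; `apply_fault` is a made-up name):

```python
# Sketch of percentage-based fault injection for a single request.

def apply_fault(roll: float, delay_pct=50.0, fixed_delay=5.0,
                abort_pct=10.0, abort_status=503):
    """roll is a draw in [0, 100); returns (extra_delay_s, status_or_None)."""
    delay = fixed_delay if roll < delay_pct else 0.0       # 50% of requests
    status = abort_status if roll < abort_pct else None    # 10% of requests
    return delay, status

print(apply_fault(5.0))    # (5.0, 503)  -> delayed and aborted
print(apply_fault(30.0))   # (5.0, None) -> delayed only
print(apply_fault(90.0))   # (0.0, None) -> untouched
```

Running such a policy against a staging environment quickly reveals whether callers have sane timeouts and retry budgets (e.g. a 5s injected delay should trip the 5s `timeout` configured earlier).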
## Observability

```bash
# Service topology (Kiali)
kubectl port-forward svc/kiali 20001:20001 -n istio-system

# Distributed tracing (Jaeger)
kubectl port-forward svc/tracing 16686:80 -n istio-system

# Metrics dashboards (Grafana)
kubectl port-forward svc/grafana 3000:3000 -n istio-system
```

## Troubleshooting Cases
### Case 1: 503s between services

Symptom: service A gets 503s when calling service B, but hitting B directly works.

Investigation:

```bash
# Check the Envoy access log
kubectl logs <pod-name> -c istio-proxy | grep "503"

# Check circuit-breaker state from the DestinationRule
istioctl proxy-config cluster <pod-name> | grep order-service

# Check the mTLS configuration
# (note: `istioctl authn tls-check` was removed in newer Istio releases;
#  there, use `istioctl x describe pod <pod-name>` instead)
istioctl authn tls-check <pod-name> order-service.production.svc.cluster.local
```

Common causes:

- Circuit breaker tripped (outlierDetection)
- Mismatched mTLS config (one side STRICT, the other side has no sidecar injected)
- Connection pool exhausted
### Case 2: App fails to start after sidecar injection

Symptom: after the sidecar is injected, the application Pod cannot start and reports connection refused.

Cause: the application depends on external services at startup, but the Envoy sidecar is not ready yet, so outbound connections fail.

Fix:

```yaml
# Hold the application container until the sidecar is ready
metadata:
  annotations:
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: true
```

### Case 3: Traffic rules not taking effect
Investigation:

```bash
# Validate that the VirtualService and related config are applied correctly
istioctl analyze -n production

# Inspect the Envoy route configuration
istioctl proxy-config routes <pod-name> --name 8080

# Inspect the Envoy listeners
istioctl proxy-config listeners <pod-name>
```

## Performance Overhead
Approximate resource cost of an Istio sidecar (varies with traffic volume and configuration):

- CPU: roughly 0.5m–1m when idle, around 50m under high load
- Memory: roughly 50–100 MB
- Added latency: about 0.2 ms at P50, about 1 ms at P99

Tuning suggestion:

```yaml
# Trim the sidecar's resource footprint (set globally via the proxy
# resources in the IstioOperator config, or per workload with the
# sidecar.istio.io/proxyCPU / proxyMemory annotations)
resources:
  requests:
    cpu: 10m
    memory: 40Mi
  limits:
    cpu: 200m
    memory: 128Mi
```