# Istio — Service Mesh Control Plane

## Architecture Overview

Istio is currently the most mature service mesh implementation. It pushes service-governance capabilities down into the infrastructure layer, so applications need no code changes. Since Istio 1.5 the historical control-plane components (Pilot, Citadel, Galley) are consolidated into a single `istiod` binary:
```
┌──────────────────────────────────────────────────────────┐
│                  Control Plane (Istiod)                  │
│  ┌───────────┐  ┌───────────┐  ┌──────────────────────┐  │
│  │   Pilot   │  │  Citadel  │  │        Galley        │  │
│  │ (traffic) │  │  (certs)  │  │ (config validation)  │  │
│  └───────────┘  └───────────┘  └──────────────────────┘  │
└──────────────────────────────────────────────────────────┘
                            │ xDS API
┌──────────────────────────────────────────────────────────┐
│                        Data Plane                        │
│   Pod A                           Pod B                  │
│  ┌──────────────────┐            ┌──────────────────┐    │
│  │  App Container   │            │  App Container   │    │
│  │  Envoy Sidecar ◄─┼───mTLS───► │  Envoy Sidecar   │    │
│  └──────────────────┘            └──────────────────┘    │
└──────────────────────────────────────────────────────────┘
```

## Sidecar Injection
```bash
# Automatic injection at the namespace level
kubectl label namespace production istio-injection=enabled

# Opt a single workload in. Injection happens at Pod creation via the
# admission webhook, so the label must be on the Pod template (e.g. in the
# Deployment spec) -- labeling an already-running Pod has no effect.
kubectl patch deployment my-app -p \
  '{"spec":{"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"true"}}}}}'

# Opt a single workload out
kubectl patch deployment my-app -p \
  '{"spec":{"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"false"}}}}}'
```

## Traffic Management
### VirtualService

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
  - order-service
  http:
  # Canary: requests carrying the x-canary header go to v2
  - match:
    - headers:
        x-canary:
          exact: "true"
    route:
    - destination:
        host: order-service
        subset: v2
  # Default: 90% to v1, 10% to v2
  - route:
    - destination:
        host: order-service
        subset: v1
      weight: 90
    - destination:
        host: order-service
        subset: v2
      weight: 10
    # Timeout
    timeout: 5s
    # Retries
    retries:
      attempts: 3
      perTryTimeout: 2s
      retryOn: gateway-error,connect-failure,retriable-4xx
```
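The routing rules above can be read as: a header match takes priority, otherwise traffic is split by subset weight. A minimal sketch of these semantics (illustrative only — Envoy's real weighted scheduler differs, and `pick_subset` is a made-up name):

```python
# Sketch of VirtualService-style routing: an exact header match wins,
# otherwise a deterministic modulo scheduler approximates the 90/10 split.

def pick_subset(headers: dict, counter: int) -> str:
    # Rule 1: requests carrying x-canary: "true" always go to v2.
    if headers.get("x-canary") == "true":
        return "v2"
    # Rule 2: weighted split -- 90 of every 100 requests go to v1.
    return "v1" if counter % 100 < 90 else "v2"

counts = {"v1": 0, "v2": 0}
for i in range(1000):
    counts[pick_subset({}, i)] += 1
print(counts)                                 # {'v1': 900, 'v2': 100}
print(pick_subset({"x-canary": "true"}, 0))   # v2
```

Note that the canary header rule is listed first in the VirtualService because rules are evaluated in order; putting the weighted default first would shadow it.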
### DestinationRule

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    # Connection pool
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    # Circuit breaking
    outlierDetection:
      consecutive5xxErrors: 5   # eject after 5 consecutive 5xx errors
      interval: 30s             # analysis interval
      baseEjectionTime: 30s     # minimum ejection time
      maxEjectionPercent: 50    # eject at most 50% of the instances
      minHealthPercent: 30      # keep at least 30% of instances healthy
    # Load balancing
    loadBalancer:
      simple: LEAST_CONN
  # Version subsets
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      connectionPool:
        http:
          http2MaxRequests: 500
```
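The `outlierDetection` block implements passive health checking: a host that returns enough consecutive 5xx responses is temporarily ejected from the load-balancing pool, capped by `maxEjectionPercent`. A small sketch of that logic (an illustration of the semantics, not Envoy's implementation; `OutlierDetector` is a made-up class):

```python
# Sketch of consecutive-5xx outlier detection with an ejection cap.

class OutlierDetector:
    def __init__(self, hosts, threshold=5, max_ejection_percent=50):
        self.hosts = list(hosts)
        self.threshold = threshold                  # consecutive5xxErrors
        self.max_ejection_percent = max_ejection_percent
        self.consecutive_5xx = {h: 0 for h in hosts}
        self.ejected = set()

    def record(self, host, status):
        if 500 <= status < 600:
            self.consecutive_5xx[host] += 1
            if self.consecutive_5xx[host] >= self.threshold:
                self._try_eject(host)
        else:
            self.consecutive_5xx[host] = 0          # any success resets the streak

    def _try_eject(self, host):
        # Enforce maxEjectionPercent: never eject past the cap.
        would_be = len(self.ejected | {host}) / len(self.hosts) * 100
        if would_be <= self.max_ejection_percent:
            self.ejected.add(host)

d = OutlierDetector(["a", "b", "c", "d"])
for _ in range(5):
    d.record("a", 503)
print(d.ejected)   # {'a'}
```

In real Envoy, ejected hosts also return to the pool after `baseEjectionTime` (multiplied by how often they have been ejected), which is omitted here.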
### Gateway (Ingress)

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: api-tls-cert
    hosts:
    - api.example.com
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-vs
spec:
  hosts:
  - api.example.com
  gateways:
  - api-gateway
  http:
  - match:
    - uri:
        prefix: /orders
    route:
    - destination:
        host: order-service
        port:
          number: 8080
```

## Security: mTLS and Zero Trust
```yaml
# Mesh-wide mTLS (strict mode)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace = applies mesh-wide
spec:
  mtls:
    mode: STRICT
---
# Namespace level
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
# Allow plaintext for one service (during migration)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: order-service-permissive
  namespace: production
spec:
  selector:
    matchLabels:
      app: order-service
  mtls:
    mode: PERMISSIVE   # accepts both mTLS and plaintext
```

### AuthorizationPolicy (Access Control)
```yaml
# Only allow the frontend service to call order-service
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: order-service-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: order-service
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/production/sa/frontend-service
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/orders*"]
```
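With `action: ALLOW`, a request is permitted only if some rule matches it on all of its constraints (principal from the mTLS identity, HTTP method, and path, where a trailing `*` is a wildcard). A sketch of that evaluation (illustrative only; `allowed` and `RULES` are made-up names, and `fnmatch` stands in for Istio's own path matcher):

```python
# Sketch of ALLOW-rule evaluation for the policy above.
from fnmatch import fnmatch

RULES = [{
    "principals": ["cluster.local/ns/production/sa/frontend-service"],
    "methods": ["GET", "POST"],
    "paths": ["/api/orders*"],   # trailing * matches any suffix
}]

def allowed(principal: str, method: str, path: str) -> bool:
    # A request is allowed if ANY rule matches on ALL three dimensions.
    return any(
        principal in r["principals"]
        and method in r["methods"]
        and any(fnmatch(path, p) for p in r["paths"])
        for r in RULES
    )

fe = "cluster.local/ns/production/sa/frontend-service"
print(allowed(fe, "GET", "/api/orders/42"))                           # True
print(allowed(fe, "DELETE", "/api/orders/42"))                        # False
print(allowed("cluster.local/ns/production/sa/other", "GET", "/api/orders"))  # False
```

Because an ALLOW policy exists on the workload, any request matching no rule is denied by default.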
## Traffic Mirroring (Shadow Testing)

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
  - order-service
  http:
  - route:
    - destination:
        host: order-service
        subset: v1
    # Mirror 100% of traffic to v2; mirrored responses are discarded,
    # so callers are unaffected
    mirror:
      host: order-service
      subset: v2
    mirrorPercentage:
      value: 100.0
```
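Mirroring is fire-and-forget: the caller only ever sees the primary's response, and mirror failures are swallowed. A minimal sketch of that contract (illustrative only; `handle` is a made-up name):

```python
# Sketch of fire-and-forget traffic mirroring.

def handle(request, primary, mirror=None, mirror_percent=100.0, roll=0.0):
    """Return the primary's response; the mirror's response is discarded."""
    if mirror is not None and roll < mirror_percent:
        try:
            mirror(request)        # response intentionally ignored
        except Exception:
            pass                   # a broken mirror must never affect the caller
    return primary(request)

seen = []
def crashing_mirror(request):
    seen.append(request)           # v2 received the shadow copy...
    raise RuntimeError("v2 crashed")  # ...then failed, harmlessly

resp = handle({"path": "/orders"}, lambda r: ("v1", 200), crashing_mirror)
print(resp)   # ('v1', 200)
print(seen)   # [{'path': '/orders'}]
```

This is why mirroring is useful for shadow testing a new version against production traffic: v2 can crash or return garbage without any user-visible impact.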
## Fault Injection (Chaos Engineering)

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service-fault
spec:
  hosts:
  - order-service
  http:
  - fault:
      # Inject latency: delay 50% of requests by 5s
      delay:
        percentage:
          value: 50
        fixedDelay: 5s
      # Inject errors: 10% of requests get a 503
      abort:
        percentage:
          value: 10
        httpStatus: 503
    route:
    - destination:
        host: order-service
```
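Each fault applies to a percentage of requests: a request falling under the delay percentage is held for `fixedDelay`, and one falling under the abort percentage gets the error status without reaching the upstream. A sketch of the per-request decision (a simplification — Envoy rolls delay and abort independently, while this sketch uses one roll for both; `apply_fault` is a made-up name):

```python
# Sketch of percentage-based fault injection for a single request.

def apply_fault(roll: float, delay_pct=50.0, fixed_delay=5.0,
                abort_pct=10.0, abort_status=503):
    """roll is a draw in [0, 100); returns (extra_delay_s, status_or_None)."""
    delay = fixed_delay if roll < delay_pct else 0.0       # 50% of requests
    status = abort_status if roll < abort_pct else None    # 10% of requests
    return delay, status

print(apply_fault(5.0))    # (5.0, 503)  -> delayed and aborted
print(apply_fault(30.0))   # (5.0, None) -> delayed only
print(apply_fault(90.0))   # (0.0, None) -> untouched
```

Running such a policy against a staging environment quickly reveals whether callers have sane timeouts and retry budgets (e.g. a 5s injected delay should trip the 5s `timeout` configured earlier).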
## Observability

```bash
# Service topology (Kiali)
kubectl port-forward svc/kiali 20001:20001 -n istio-system

# Distributed tracing (Jaeger)
kubectl port-forward svc/tracing 16686:80 -n istio-system

# Metrics dashboards (Grafana)
kubectl port-forward svc/grafana 3000:3000 -n istio-system
```

## Troubleshooting Cases
### Case 1: 503s between services

Symptom: service A gets 503s when calling service B, but hitting B directly works.

Investigation:

```bash
# Check the Envoy access log
kubectl logs <pod-name> -c istio-proxy | grep "503"

# Check circuit-breaker state from the DestinationRule
istioctl proxy-config cluster <pod-name> | grep order-service

# Check the mTLS configuration
# (note: `istioctl authn tls-check` was removed in newer Istio releases;
#  there, use `istioctl x describe pod <pod-name>` instead)
istioctl authn tls-check <pod-name> order-service.production.svc.cluster.local
```

Common causes:

- Circuit breaker tripped (outlierDetection)
- Mismatched mTLS config (one side STRICT, the other side has no sidecar injected)
- Connection pool exhausted
### Case 2: App fails to start after sidecar injection

Symptom: after the sidecar is injected, the application Pod cannot start and reports connection refused.

Cause: the application depends on external services at startup, but the Envoy sidecar is not ready yet, so outbound connections fail.

Fix:

```yaml
# Hold the application container until the sidecar is ready
metadata:
  annotations:
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: true
```

### Case 3: Traffic rules not taking effect
Investigation:

```bash
# Validate that the VirtualService and related config are applied correctly
istioctl analyze -n production

# Inspect the Envoy route configuration
istioctl proxy-config routes <pod-name> --name 8080

# Inspect the Envoy listeners
istioctl proxy-config listeners <pod-name>
```

## Performance Overhead
Approximate resource cost of an Istio sidecar (varies with traffic volume and configuration):

- CPU: roughly 0.5m–1m when idle, around 50m under high load
- Memory: roughly 50–100 MB
- Added latency: about 0.2 ms at P50, about 1 ms at P99

Tuning suggestion:

```yaml
# Trim the sidecar's resource footprint (set globally via the proxy
# resources in the IstioOperator config, or per workload with the
# sidecar.istio.io/proxyCPU / proxyMemory annotations)
resources:
  requests:
    cpu: 10m
    memory: 40Mi
  limits:
    cpu: 200m
    memory: 128Mi
```