Kubernetes DaemonSet

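📌 1. What is a DaemonSet?

A DaemonSet is a Kubernetes workload resource that guarantees:

A copy of a Pod runs, and keeps running, on every node (or on a selected subset of nodes).

Typical uses are node-level agents and daemons:

Log collection (Fluentd, Vector, Filebeat)
Monitoring agents (Node Exporter, Prometheus Agent)
CNI / container network plugins (Calico, Cilium)
Storage plugins (CSI Node Plugin)
Security agents (Falco)
kube-proxy (unless an eBPF data plane has replaced it)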

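📌 2. Core mechanics of a DaemonSet (scheduling logic)

The DaemonSet controller does not place Pods on nodes itself. Instead it:

2.1 Decides which nodes should run a Pod

Based on:

nodeSelector
nodeAffinity
taints & tolerations
node conditions and pressure (Ready, DiskPressure, ...), which surface as node taints

2.2 Creates exactly one Pod for each eligible node

The new Pod is then handed to the scheduler for binding (the default behavior since Kubernetes 1.12).

2.3 Watches for node changes

New node joins -> a Pod is created for it automatically
Node becomes NotReady -> the Pod stays bound to the node; DaemonSet Pods tolerate the not-ready and unreachable taints, so they are not evicted the way ordinary Pods are
Node is cordoned/drained -> cordoning does not affect DaemonSet Pods; kubectl drain refuses to proceed unless run with --ignore-daemonsets, and then leaves them in place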
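
To make the node-change behavior concrete, here is a sketch of the tolerations that the DaemonSet controller adds to the Pods it creates. The list follows the upstream Kubernetes documentation; the exact set can differ between versions, so treat it as illustrative rather than authoritative.

```yaml
# Tolerations added automatically to DaemonSet Pods
# (illustrative; verify against your cluster version).
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute        # Pod is not evicted when the node is NotReady
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute        # Pod is not evicted when the node is unreachable
- key: node.kubernetes.io/disk-pressure
  operator: Exists
  effect: NoSchedule
- key: node.kubernetes.io/memory-pressure
  operator: Exists
  effect: NoSchedule
- key: node.kubernetes.io/pid-pressure
  operator: Exists
  effect: NoSchedule
- key: node.kubernetes.io/unschedulable
  operator: Exists
  effect: NoSchedule       # lets the Pod be placed on cordoned nodes
```

Because none of the NoExecute tolerations carries tolerationSeconds, the Pod is tolerated indefinitely instead of being evicted after a timeout.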

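📌 3. DaemonSet vs. Deployment (key differences)

Feature	DaemonSet	Deployment
Pod count	One per eligible node	Set by replicas
Placement	Controller pins each Pod to a node; scheduler binds it	Scheduler picks any suitable node
Use case	Node-level daemons	Application workloads
Relation to nodes	One-to-one	No binding to a specific node
Eviction behavior	Pods follow the node's lifecycle	Standard eviction; Pods are recreated on other nodes

In short:

Deployment = runs application services
DaemonSet = runs services that look after the nodes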

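📌 4. DaemonSet scheduling flow

The DaemonSet controller and the Kubernetes scheduler cooperate as follows:

Node joins or is updated
   ↓
DaemonSet controller finds that the node is eligible
   ↓
Controller creates a DaemonSet Pod for that node (not yet bound)
   ↓
Scheduler filters nodes (only the Pod's target node passes)
   ↓
Scheduler binds the Pod to that node
   ↓
kubelet starts the Pod

Note:

The controller adds a required nodeAffinity term matching the target node's name to every DaemonSet Pod, so the Pod can only be scheduled onto that one node.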
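
The node affinity mentioned in the note looks roughly like the fragment below, which follows the form shown in the upstream DaemonSet documentation. The node name worker-1 is a hypothetical placeholder; the controller fills in the real node name for each Pod.

```yaml
# Affinity injected into a DaemonSet Pod by the controller (sketch).
# "worker-1" is a placeholder for the actual target node name.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchFields:
        - key: metadata.name
          operator: In
          values:
          - worker-1
```

Because the term uses matchFields on metadata.name rather than a label, exactly one node can satisfy it, which is what makes the scheduler's filtering step deterministic.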

📌 5. Key DaemonSet fields explained

A typical DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      tolerations:
      - operator: Exists  # tolerate every taint so the Pod can run on tainted nodes
      hostNetwork: true   # use the node's network namespace
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 10%

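⭐ Key fields

🔹 updateStrategy

A DaemonSet supports two update strategies:

RollingUpdate (recommended, and the default)

Lets you control:

maxUnavailable (how many nodes may be without a ready Pod at the same time)
Batch-by-batch rollout

Suitable for:

kube-proxy
CNI plugins
Monitoring agents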
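
As an illustration of the rollout controls, the fragment below sketches a surge-style rolling update, available in recent Kubernetes versions. The values 30 and 1 are arbitrary example values, not recommendations.

```yaml
# Sketch: start the new Pod on a node before the old one is removed.
spec:
  minReadySeconds: 30      # a new Pod must stay ready this long before it counts as available
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one node runs old and new Pods side by side
      maxUnavailable: 0    # must be 0 when maxSurge is non-zero
```

Surging is usually a poor fit for Pods that bind a fixed host port, because the old and new Pods would compete for the same port on the node; in that case maxUnavailable alone is the safer knob.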

OnDelete

A new Pod is created only after an administrator manually deletes the old one.

Used for:

High-risk network plugin upgrades (e.g. Calico)
In-house node agents
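
A minimal sketch of selecting this strategy is shown below. With it in place, changing the Pod template has no effect on running Pods until each one is deleted by hand.

```yaml
# Sketch: opt out of automatic rollouts.
spec:
  updateStrategy:
    type: OnDelete   # the controller replaces a Pod only after it has been deleted
```

This gives the operator node-by-node control over the upgrade order, at the cost of having to track which nodes still run the old version.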

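🔹 tolerations

A DaemonSet usually needs to tolerate the taints on control-plane (master) and infra nodes:

tolerations:
  - operator: "Exists"

Otherwise its Pods cannot be placed on critical nodes that carry taints. Note that a bare operator: Exists matches every taint, including NoExecute ones.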
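
Where tolerating every taint is broader than needed, a narrower toleration can be used. The sketch below assumes the nodes carry the standard control-plane taint that kubeadm applies; other distributions may use different keys.

```yaml
# Sketch: tolerate only the control-plane NoSchedule taint.
# Assumes the kubeadm-style taint key; adjust for your distribution.
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule
```

The trade-off is that any other taint, such as a custom dedicated=infra taint, then needs its own explicit entry.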

🔹 nodeSelector / nodeAffinity

To target specific nodes:

nodeSelector:
  node-role.kubernetes.io/worker: ""

Or, with the more expressive nodeAffinity form:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-role.kubernetes.io/worker
          operator: Exists
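
nodeAffinity also supports set-based operators that nodeSelector cannot express. The sketch below uses the well-known kubernetes.io/os and kubernetes.io/arch node labels; the specific values are example choices.

```yaml
# Sketch: run only on Linux nodes with an amd64 or arm64 CPU.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:      # expressions within one term are ANDed
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
```

Multiple entries under nodeSelectorTerms are ORed, so a second term could admit a different class of nodes.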

🔹 hostNetwork / hostPath

Commonly used by agents:

hostNetwork: true
hostPID: true

Storage / logging plugins:

volumeMounts:
- mountPath: /var/log
  name: log-dir
volumes:
- hostPath:
    path: /var/log
  name: log-dir

📌 6. Common DaemonSet use cases (essential)

🧩 6.1 Monitoring & log collection

Prometheus Node Exporter
Promtail
Vector
Filebeat
Fluentd

🧩 6.2 Kubernetes system components

calico-node
cilium-agent
kube-proxy (when no eBPF replacement is enabled)

🧩 6.3 Security tools

Falco
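(Open Policy Agent Gatekeeper is sometimes listed here, but it runs as a Deployment-based admission webhook, not as a per-node agent.)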

🧩 6.4 Storage plugins (CSI)

Longhorn
Ceph RBD CSI Node Plugin

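📌 7. DaemonSet tuning and best practices (production-grade)

7.1 Add tolerations

Without them a custom agent cannot run on control-plane (master) or infra nodes:

tolerations:
- operator: Exists

7.2 Set resource requests and limits

This keeps a system agent from exhausting the node's CPU and memory: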

resources:
  limits:
    cpu: 200m
    memory: 200Mi
  requests:
    cpu: 50m
    memory: 50Mi

7.3 Prefer RollingUpdate over OnDelete

Unless you are upgrading a high-risk plugin such as Calico.

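7.4 Know the limits of a PodDisruptionBudget (PDB)

A PDB only guards against evictions made through the Eviction API. DaemonSet rolling updates delete Pods directly, and kubectl drain skips DaemonSet Pods, so a PDB does little for the DaemonSet itself. Use maxUnavailable to cap the impact of a large rollout, and use PDBs to protect the application workloads that share those nodes.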

7.5 Mount host paths read-only (reduces risk)

readOnly: true
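
For context, the fragment below sketches where readOnly sits in a Pod template. The container name log-agent and the image example.com/log-agent:1.0 are hypothetical placeholders.

```yaml
# Sketch: a log agent that reads /var/log from the host but cannot modify it.
# Container name and image are hypothetical.
containers:
- name: log-agent
  image: example.com/log-agent:1.0
  volumeMounts:
  - name: log-dir
    mountPath: /var/log
    readOnly: true         # the container sees the host directory read-only
volumes:
- name: log-dir
  hostPath:
    path: /var/log
    type: Directory        # fail if the path is missing instead of creating it
```

An agent that needs to persist state, such as read offsets, would need a separate writable volume for that purpose.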

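7.6 Restrict a DaemonSet to a subset of nodes

A DaemonSet runs one Pod per eligible node, not several replicas; node labels decide which nodes it covers. To run different agents or configurations on different node groups, create one DaemonSet per group.

For example: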

nodeSelector:
  app-monitor: "true"

📌 8. Common DaemonSet problems

❗ 8.1 Why won't my DaemonSet Pods start?

Check in this order:

kubectl describe nodes
kubectl describe ds xxx
kubectl get events
kubectl logs -n kube-system ds/calico-node

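Common causes:

The node has taints but the Pod has no matching tolerations
nodeSelector is too restrictive
Image pull failures (e.g. no credentials for a private registry)
CNI or container runtime errors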
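
For the private-registry case, one common fix is to reference a pull secret from the Pod template. The sketch below assumes a Secret of type kubernetes.io/dockerconfigjson already exists in the DaemonSet's namespace; the name regcred is a hypothetical placeholder.

```yaml
# Sketch: let the kubelet authenticate to a private registry.
# "regcred" is a hypothetical, pre-created registry Secret.
spec:
  template:
    spec:
      imagePullSecrets:
      - name: regcred
```

The Secret must live in the same namespace as the DaemonSet; otherwise the pull keeps failing with the same error.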

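❗ 8.2 Do DaemonSet Pods move to other nodes automatically?

No.

If the node a Pod is bound to goes offline, that Pod is not rescheduled onto another node, because a DaemonSet is a node-bound workload.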

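📌 9. Summary

DaemonSet = guarantees one Pod on every eligible node, for node-level services.

tolerations and affinity control which nodes are covered; the controller creates the Pods and the scheduler binds them to nodes.

Typical uses: monitoring, logging, networking, security, and storage plugins.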