
Support GPU monitoring installation

Created by: zhu733756

Signed-off-by: zhu733756 <talonzhu@yunify.com>

This PR adds support for installing GPU monitoring. Fixes https://github.com/kubesphere/kubesphere/issues/4082

The notable changes are:

  • Add the following definition to the cluster-configuration YAML:
monitoring:
  gpu:
    enabled: true
    nvidia_device_plugin:
      enabled: true
    nvidia_dcgm_exporter:
      enabled: true
  • Integrate a GPU monitoring task into the monitoring section; the steps are in gpu-monitoring.yaml.
  • Update the cluster role with the following rule:
- apiGroups:
  - monitoring.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
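
For context, here is a minimal sketch of how the rule above might sit inside a complete ClusterRole manifest. The role name `example-installer-role` is hypothetical and used only for illustration; the actual role touched by this PR is not named in the description.

```yaml
# Sketch only: metadata.name is a hypothetical placeholder, not the role changed in this PR.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-installer-role
rules:
# Grants all verbs on every resource in the monitoring.kubesphere.io API group,
# as described in the rule listed in this PR.
- apiGroups:
  - monitoring.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
```

The wildcard resources and verbs cover any resource in that API group, which is broad; that scope is taken directly from the rule listed above.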

/cc @benjaminhuo @pixiake
