Version: v2.5.0

Enable dynamic-mig feature

Introduction

HAMi now supports dynamic-mig, using mig-parted to adjust MIG devices dynamically. This includes:

Dynamic MIG instance management: Users don't need to operate on the GPU node or run 'nvidia-smi -i 0 -mig 1' or other commands to manage MIG instances; HAMi-device-plugin handles all of this.

Dynamic MIG adjustment: Each MIG device managed by HAMi adjusts its MIG template dynamically, when necessary, according to the tasks submitted.

Device MIG observation: Each MIG instance generated by HAMi is shown in the scheduler monitor, including task information, so users get a clear overview of MIG nodes.

Compatible with HAMi-core nodes: HAMi can manage a unified GPU pool of HAMi-core nodes and MIG nodes. A task can be scheduled to either kind of node unless a mode is specified manually with the nvidia.com/vgpu-mode annotation.

Unified API with HAMi-core: No extra work is needed to make a job compatible with the dynamic-mig feature.

Prerequisites

  • NVIDIA Blackwell, Hopper™, and Ampere devices
  • HAMi > v2.5.0
  • NVIDIA Container Toolkit

Enabling Dynamic-mig Support

  • Install the chart using Helm; see the 'Enabling vGPU support in Kubernetes' section.

  • Set operatingmode to mig in the device-plugin ConfigMap for each MIG node:

kubectl describe cm hami-device-plugin -n kube-system
{
  "nodeconfig": [
    {
      "name": "MIG-NODE-A",
      "operatingmode": "mig",
      "filterdevices": {
        "uuid": [],
        "index": []
      }
    }
  ]
}
  • Restart the following pods for the change to take effect:
    • hami-scheduler
    • hami-device-plugin on 'MIG-NODE-A'
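If you manage this node configuration from a script, the change above amounts to setting one field per node entry. The following is a minimal sketch, assuming the JSON layout shown above; the helper name set_node_mode is hypothetical, and the starting value "hami-core" is an assumption made for illustration.

```python
import json


def set_node_mode(config: dict, node_name: str, mode: str) -> dict:
    """Return a copy of config in which node_name has operatingmode set to mode.

    Hypothetical helper: it only edits the JSON document and does not talk to
    the cluster. A node without an entry gets a new one with empty filters.
    """
    nodes = [dict(node) for node in config.get("nodeconfig", [])]
    for node in nodes:
        if node.get("name") == node_name:
            node["operatingmode"] = mode
            break
    else:
        nodes.append({
            "name": node_name,
            "operatingmode": mode,
            "filterdevices": {"uuid": [], "index": []},
        })
    return {**config, "nodeconfig": nodes}


# Starting point; the "hami-core" value here is assumed for illustration.
config = {
    "nodeconfig": [
        {
            "name": "MIG-NODE-A",
            "operatingmode": "hami-core",
            "filterdevices": {"uuid": [], "index": []},
        }
    ]
}

updated = set_node_mode(config, "MIG-NODE-A", "mig")
print(json.dumps(updated, indent=2))
```

The result still has to be written back to the ConfigMap, and the pods listed above restarted, before it takes effect.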

Custom mig configuration (Optional)

HAMi currently ships with a built-in MIG configuration.

You can customize the mig configuration by following the steps below:

Change the content of 'device-configmap.yaml' in charts/hami/templates/scheduler as follows:

nvidia:
  resourceCountName: {{ .Values.resourceName }}
  resourceMemoryName: {{ .Values.resourceMem }}
  resourceMemoryPercentageName: {{ .Values.resourceMemPercentage }}
  resourceCoreName: {{ .Values.resourceCores }}
  resourcePriorityName: {{ .Values.resourcePriority }}
  overwriteEnv: false
  defaultMemory: 0
  defaultCores: 0
  defaultGPUNum: 1
  deviceSplitCount: {{ .Values.devicePlugin.deviceSplitCount }}
  deviceMemoryScaling: {{ .Values.devicePlugin.deviceMemoryScaling }}
  deviceCoreScaling: {{ .Values.devicePlugin.deviceCoreScaling }}
  knownMigGeometries:
  - models: [ "A30" ]
    allowedGeometries:
      -
        - name: 1g.6gb
          memory: 6144
          count: 4
      -
        - name: 2g.12gb
          memory: 12288
          count: 2
      -
        - name: 4g.24gb
          memory: 24576
          count: 1
  - models: [ "A100-SXM4-40GB", "A100-40GB-PCIe", "A100-PCIE-40GB", "A100-SXM4-40GB" ]
    allowedGeometries:
      -
        - name: 1g.5gb
          memory: 5120
          count: 7
      -
        - name: 2g.10gb
          memory: 10240
          count: 3
        - name: 1g.5gb
          memory: 5120
          count: 1
      -
        - name: 3g.20gb
          memory: 20480
          count: 2
      -
        - name: 7g.40gb
          memory: 40960
          count: 1
  - models: [ "A100-SXM4-80GB", "A100-80GB-PCIe", "A100-PCIE-80GB"]
    allowedGeometries:
      -
        - name: 1g.10gb
          memory: 10240
          count: 7
      -
        - name: 2g.20gb
          memory: 20480
          count: 3
        - name: 1g.10gb
          memory: 10240
          count: 1
      -
        - name: 3g.40gb
          memory: 40960
          count: 2
      -
        - name: 7g.79gb
          memory: 80896
          count: 1
Note: Helm installations and upgrades use the configuration in this file, which overrides the built-in configuration.

Note: HAMi goes through the geometries in the order they appear in this ConfigMap and uses the first MIG template that suits the task.
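To make the ordering rule concrete, here is a minimal sketch of a first-fit search over the A100 40GB geometries from the configuration above. It is not HAMi's actual scheduler code: the function first_fit and its fit test (template memory and count both cover the request) are assumptions for illustration, and the sketch ignores instances already in use on a device.

```python
from typing import List, Optional, Tuple

# Allowed geometries for the A100 40GB models, copied from the configuration
# above. Memory values are the numbers listed there.
A100_40GB_GEOMETRIES = [
    [{"name": "1g.5gb", "memory": 5120, "count": 7}],
    [{"name": "2g.10gb", "memory": 10240, "count": 3},
     {"name": "1g.5gb", "memory": 5120, "count": 1}],
    [{"name": "3g.20gb", "memory": 20480, "count": 2}],
    [{"name": "7g.40gb", "memory": 40960, "count": 1}],
]


def first_fit(geometries: List[list], instances: int,
              memory: int) -> Optional[Tuple[int, str]]:
    """Return (geometry index, template name) for the first template that fits.

    Assumed fit test: a template fits when each instance has at least
    `memory` and the geometry offers at least `instances` of that template.
    """
    for index, geometry in enumerate(geometries):
        for template in geometry:
            if template["memory"] >= memory and template["count"] >= instances:
                return index, template["name"]
    return None


# Two instances with at least 8000 memory each: the 1g.5gb geometry is too
# small, so the search moves on to the next geometry in the list.
print(first_fit(A100_40GB_GEOMETRIES, instances=2, memory=8000))
```

Because the search stops at the first match, placing smaller geometries earlier in the list favors finer-grained partitioning under this simplified model.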

Running MIG jobs

A container can now request MIG instances the same way it requests hami-core devices, by specifying the nvidia.com/gpu and nvidia.com/gpumem resource types.

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
  annotations:
    nvidia.com/vgpu-mode: "mig" # (Optional) if not set, this pod can be assigned to a MIG instance or a hami-core instance
spec:
  containers:
    - name: ubuntu-container
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: 2
          nvidia.com/gpumem: 8000

In the example above, the task requests two MIG instances, each with at least 8000 MB (about 8 GB) of device memory.

Monitor MIG Instance

MIG instances managed by HAMi are displayed in the scheduler monitor (<scheduler node IP>:31993/metrics), as follows:

# HELP nodeGPUMigInstance GPU Sharing mode. 0 for hami-core, 1 for mig, 2 for mps
# TYPE nodeGPUMigInstance gauge
nodeGPUMigInstance{deviceidx="0",deviceuuid="GPU-936619fc-f6a1-74a8-0bc6-ecf6b3269313",migname="3g.20gb-0",nodeid="aio-node15",zone="vGPU"} 1
nodeGPUMigInstance{deviceidx="0",deviceuuid="GPU-936619fc-f6a1-74a8-0bc6-ecf6b3269313",migname="3g.20gb-1",nodeid="aio-node15",zone="vGPU"} 0
nodeGPUMigInstance{deviceidx="1",deviceuuid="GPU-30f90f49-43ab-0a78-bf5c-93ed41ef2da2",migname="3g.20gb-0",nodeid="aio-node15",zone="vGPU"} 1
nodeGPUMigInstance{deviceidx="1",deviceuuid="GPU-30f90f49-43ab-0a78-bf5c-93ed41ef2da2",migname="3g.20gb-1",nodeid="aio-node15",zone="vGPU"} 1
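For a quick per-GPU overview, the samples can be grouped by device. The following is a minimal sketch that parses the text shown above with regular expressions; the function names are hypothetical, and the sketch only groups the sample values without interpreting what 0 or 1 means. Fetching the endpoint is left out.

```python
import re
from collections import defaultdict

# Sample lines copied from the metrics output above.
SAMPLE = '''\
# TYPE nodeGPUMigInstance gauge
nodeGPUMigInstance{deviceidx="0",deviceuuid="GPU-936619fc-f6a1-74a8-0bc6-ecf6b3269313",migname="3g.20gb-0",nodeid="aio-node15",zone="vGPU"} 1
nodeGPUMigInstance{deviceidx="0",deviceuuid="GPU-936619fc-f6a1-74a8-0bc6-ecf6b3269313",migname="3g.20gb-1",nodeid="aio-node15",zone="vGPU"} 0
nodeGPUMigInstance{deviceidx="1",deviceuuid="GPU-30f90f49-43ab-0a78-bf5c-93ed41ef2da2",migname="3g.20gb-0",nodeid="aio-node15",zone="vGPU"} 1
nodeGPUMigInstance{deviceidx="1",deviceuuid="GPU-30f90f49-43ab-0a78-bf5c-93ed41ef2da2",migname="3g.20gb-1",nodeid="aio-node15",zone="vGPU"} 1
'''

LINE_RE = re.compile(r'^nodeGPUMigInstance\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_mig_samples(text):
    """Return a list of (labels dict, float value) for nodeGPUMigInstance lines."""
    samples = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        samples.append((labels, float(match.group("value"))))
    return samples


def group_by_device(samples):
    """Map each device UUID to {MIG instance name: sample value}."""
    grouped = defaultdict(dict)
    for labels, value in samples:
        grouped[labels["deviceuuid"]][labels["migname"]] = value
    return dict(grouped)


samples = parse_mig_samples(SAMPLE)
for uuid, instances in group_by_device(samples).items():
    print(uuid, instances)
```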

Notes

  1. You don't need to do anything on the MIG node; everything is managed by mig-parted in hami-device-plugin.

  2. NVIDIA devices older than the Ampere architecture can't use 'mig' mode.

  3. You won't see any MIG resources (e.g., nvidia.com/mig-1g.10gb) on the node; HAMi uses a unified resource name for both 'mig' and 'hami-core' nodes.

HAMi is a CNCF Sandbox project