Intel-GPU

Prerequisites:

  • The GPU is isolated on the host when using a VM
  • The GPU is passed through to your Talos machine when using a VM
  • Node Feature Discovery is installed in your cluster

It's important to add the following extensions to your talconfig.yaml for bootstrap:

schematic:
  customization:
    systemExtensions:
      officialExtensions:
        - siderolabs/i915
        - siderolabs/intel-ucode
        - siderolabs/mei

If it's a fresh bootstrap, you can simply follow the clustertool guide on how to bootstrap your cluster. If it is an existing cluster, you will need to run clustertool talos upgrade to add the extensions to your cluster.

Add the following repo to your cluster if using fluxcd:

---
# yaml-language-server: $schema=https://kubernetes-schemas.pages.dev/source.toolkit.fluxcd.io/helmrepository_v1.json
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: home-ops-mirror
  namespace: flux-system
spec:
  type: oci
  interval: 2h
  url: oci://ghcr.io/home-operations/charts-mirror

Add the intel-device-plugin-operator to your cluster. Example HelmRelease configuration:

---
# yaml-language-server: $schema=https://kubernetes-schemas.pages.dev/helm.toolkit.fluxcd.io/helmrelease_v2.json
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: intel-device-plugin-operator
  namespace: system
spec:
  interval: 30m
  chart:
    spec:
      chart: intel-device-plugins-operator
      version: 0.32.0
      sourceRef:
        kind: HelmRepository
        name: home-ops-mirror
        namespace: flux-system
  install:
    crds: CreateReplace
    remediation:
      retries: 3
  upgrade:
    cleanupOnFail: true
    crds: CreateReplace
    remediation:
      strategy: rollback
      retries: 3
  dependsOn:
    - name: node-feature-discovery
      namespace: kube-system
  values:
    controllerExtraArgs: |
      - --devices=gpu

Add the intel-device-plugin-gpu to your cluster. Example HelmRelease configuration:

---
# yaml-language-server: $schema=https://kubernetes-schemas.pages.dev/helm.toolkit.fluxcd.io/helmrelease_v2.json
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: intel-device-plugin-gpu
  namespace: system
spec:
  interval: 30m
  chart:
    spec:
      chart: intel-device-plugins-gpu
      version: 0.32.0
      sourceRef:
        kind: HelmRepository
        name: home-ops-mirror
        namespace: flux-system
  install:
    remediation:
      retries: 3
  upgrade:
    cleanupOnFail: true
    remediation:
      strategy: rollback
      retries: 3
  dependsOn:
    - name: intel-device-plugin-operator
      namespace: system
  values:
    name: intel-gpu-plugin
    sharedDevNum: 5
    nodeFeatureRule: true

Once both releases are deployed, check that each node advertises the GPU resource:

kubectl get nodes -o=jsonpath="{range .items[*]}{.metadata.name}{'\n'}{' i915: '}{.status.allocatable.gpu\.intel\.com/i915}{'\n'}"
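
With sharedDevNum: 5, each physical GPU can be handed out to up to five containers, so a node with a single Intel GPU would be expected to report five allocatable gpu.intel.com/i915 units. As a rough illustration only (assuming one GPU per node, with all other fields omitted), the relevant part of kubectl get node <node-name> -o yaml would look roughly like this:

```yaml
# Excerpt of a Node object, not a manifest to apply.
# Assumes one Intel GPU on the node and sharedDevNum: 5.
status:
  allocatable:
    gpu.intel.com/i915: "5"
```

If the value is empty or missing, the plugin is probably not running on that node, or the node was not labelled by Node Feature Discovery.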

The following example shows how to add the GPU to a chart. Depending on the chart, you may need to adapt the workload name.

resources:
  limits:
    gpu.intel.com/i915: 1
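
For a quick test outside of any chart, the same limit can be set directly on a container. The manifest below is a minimal sketch, not part of the original guide: the pod name gpu-test and the busybox image are placeholders chosen for illustration. The container only lists the device nodes under /dev/dri that the GPU plugin is expected to expose to it, then exits.

```yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test # hypothetical name, pick your own
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: docker.io/library/busybox:1.36 # placeholder image
      # List the GPU device nodes made available to the container.
      command: ["ls", "-l", "/dev/dri"]
      resources:
        limits:
          gpu.intel.com/i915: 1
```

After applying it, kubectl logs gpu-test should show the device entries if the allocation worked. A pod stuck in Pending usually means that no node currently has a free gpu.intel.com/i915 unit.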