Releases: NVIDIA/k8s-device-plugin

v0.17.3

24 Jul 09:53
e0a461e

What's Changed

  • Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.6 to 1.17.8 by @dependabot[bot] in #1275
  • Bump nvidia/cuda from 12.9.0-base-ubi9 to 12.9.1-base-ubi9 in /deployments/container by @dependabot[bot] in #1300
  • Bump github.com/NVIDIA/go-nvml from 0.12.4-1 to 0.12.9-0 by @dependabot[bot] in #1287
  • Bump golang from 1.23.9 to 1.23.10 in /deployments/devel by @dependabot[bot] in #1283
  • Bump golang from 1.23.10 to 1.23.11 in /deployments/devel by @dependabot[bot] in #1318
  • Bump release v0.17.3 by @elezar in #1326
  • Backport: Bump golang.org/x/oauth2 from 0.23.0 to 0.27.0 by @cdesiniotis in #1328
  • Updated .release:staging to stage device-plugin images in nvstaging by @elezar in #1329

Full Changelog: v0.17.2...v0.17.3

v0.17.2

13 May 18:21
390b1f6

What's Changed

  • Update nvidia.com/gpu.product label to include Blackwell architectures
  • Update documentation to indicate that nvidia.com/gpu.memory label is in MiB instead of MB
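
The MiB/MB distinction in the documentation fix above matters because the two units differ by roughly 5%. The following is a minimal sketch of the arithmetic only, not the plugin's code; the 80 GiB byte count and the helper names toMiB and toMB are hypothetical values chosen for illustration.

```go
package main

import "fmt"

const (
	bytesPerMB  = 1000 * 1000 // decimal megabyte
	bytesPerMiB = 1024 * 1024 // binary mebibyte
)

// toMiB converts a byte count to whole mebibytes (rounding down).
func toMiB(b uint64) uint64 { return b / bytesPerMiB }

// toMB converts a byte count to whole decimal megabytes (rounding down).
func toMB(b uint64) uint64 { return b / bytesPerMB }

func main() {
	// Hypothetical total memory of an 80 GiB device, in bytes.
	var total uint64 = 80 * 1024 * 1024 * 1024

	fmt.Printf("MiB: %d\n", toMiB(total)) // MiB: 81920
	fmt.Printf("MB:  %d\n", toMB(total))  // MB:  85899
}
```

A selector that compares the label value against a threshold would therefore pick different nodes depending on which unit it assumes.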

Full Changelog: v0.17.1...v0.17.2

v0.17.1

12 Mar 09:59
3c37819

What's Changed

  • Bump golang from 1.23.2 to 1.23.3 in /deployments/devel by @dependabot in #1063
  • Bump the k8sio group across 1 directory with 5 updates by @dependabot in #1066
  • Ensure FAIL_ON_INIT_ERROR boolean env is quoted by @elezar in #1076
  • Bump nvidia/cuda from 12.6.2-base-ubi9 to 12.6.3-base-ubi9 in /deployments/container by @dependabot in #1084
  • Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.0 to 1.17.2 by @dependabot in #1068
  • Bump google.golang.org/grpc from 1.65.0 to 1.65.1 by @dependabot in #1069
  • Bump sigs.k8s.io/node-feature-discovery from 0.15.4 to 0.15.7 by @dependabot in #1070
  • Bump NVIDIA/holodeck from 0.2.3 to 0.2.4 by @dependabot in #1064
  • Honor fail-on-init-error when no resources are found by @elezar in #1061
  • Bump github.com/opencontainers/selinux from 1.11.0 to 1.11.1 by @dependabot in #1067
  • Add ada-lovelace architecture label for compute capability 8.9 by @elezar in #1090
  • Switch to context package in go stdlib by @elezar in #1114
  • Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.2 to 1.17.4 by @dependabot in #1138
  • Bump nvidia/cuda from 12.6.3-base-ubi9 to 12.8.0-base-ubi9 in /deployments/container by @dependabot in #1142
  • Bump NVIDIA/holodeck from 0.2.4 to 0.2.5 by @dependabot in #1131
  • Bump slackapi/slack-github-action from 1.27.0 to 2.0.0 by @dependabot in #1065
  • Bump github.com/NVIDIA/go-nvlib from 0.7.0 to 0.7.1 by @dependabot in #1151
  • Ignore XID error 109 by @elezar in #1171
  • Remove nvidia.com/gpu.imex-domain label by @elezar in #1152
  • Bump azure/setup-helm from 4.2.0 to 4.3.0 by @dependabot in #1176
  • Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.4 to 1.17.5-rc.1 by @elezar in #1192

Full Changelog: v0.17.0...v0.17.1

v0.17.0

31 Oct 15:36
d475b2c

What's Changed

v0.17.0

  • Promote v0.17.0-rc.1 to GA

v0.17.0-rc.1

  • Add CAP_SYS_ADMIN if volume-mounts list strategy is included
  • Remove unneeded DEVICE_PLUGIN_MODE envvar
  • Fix applying SELinux label for MPS
  • Use a base image that aligns with the ubi-minimal base image
  • Switch to a ubi9-based base image
  • Remove namespace field from cluster-scoped resources
  • Generate labels for IMEX clique and domain
  • Add optional injection of the default IMEX channel
  • Allow kubelet-socket to be specified as command line argument

v0.17.0-rc.1

31 Oct 15:40
a2c760c
Pre-release

What's Changed

  • Add CAP_SYS_ADMIN if volume-mounts list strategy is included
  • Remove unneeded DEVICE_PLUGIN_MODE envvar
  • Fix applying SELinux label for MPS
  • Use a base image that aligns with the ubi-minimal base image
  • Switch to a ubi9-based base image
  • Remove namespace field from cluster-scoped resources
  • Generate labels for IMEX clique and domain
  • Add optional injection of the default IMEX channel
  • Allow kubelet-socket to be specified as command line argument

v0.16.2

08 Aug 11:02
42a0fa9

What's Changed

  • Fix applying SELinux label for MPS
  • Remove unneeded DEVICE_PLUGIN_MODE envvar
  • Add CAP_SYS_ADMIN if volume-mounts list strategy is included (fixes #856)

Full Changelog: v0.16.1...v0.16.2

v0.16.1

26 Jul 18:37
cb6e45e

What's Changed

  • Bump nvidia-container-toolkit to v1.16.1 to fix a bug with CDI spec generation for MIG devices

Full Changelog: v0.16.0...v0.16.1

v0.16.0

16 Jul 13:29
d2eea55
Compare
Choose a tag to compare

Changelog

v0.16.0

  • Fixed logic of atomic writing of the feature file
  • Replaced WithDialer with WithContextDialer
  • Fixed SELinux context of MPS pipe directory
  • Changed behavior for empty MIG devices to issue a warning instead of an error when the mixed strategy is selected
  • Added a GFD node label for the GPU mode
  • Update CUDA base image version to 12.5.1

v0.16.0-rc.1

  • Skip container updates if only CDI is selected
  • Allow cdi hook path to be set
  • Add nvidiaDevRoot config option
  • Detect devRoot for driver installation
  • Changed the automatically created MPS /dev/shm to half of the total memory as obtained from /proc/meminfo
  • Remove redundant version log
  • Remove provenance information from image manifests
  • Add NGC image signing job for automatic signing
  • fix: target should be binaries
  • Allow device discovery strategy to be specified
  • Refactor cdi handler construction
  • Add addMigMonitorDevices field to nvidia-device-plugin.options helper
  • Fix allPossibleMigStrategiesAreNone helm chart helper
  • Use the Helm quote function to wrap boolean values in quotes
  • Fix usage of hasConfigMap
  • Make info, nvml, and device lib construction explicit
  • Clean up construction of WSL devices
  • Remove unused function
  • Don't require node-name to be set if not needed
  • Make vgpu failures non-fatal
  • Use HasTegraFiles over IsTegraSystem
  • Raise error for MPS when using MIG
  • Align container driver root envvars
  • Update github.com/NVIDIA/go-nvml to v0.12.0-6
  • Add unit test cases for sanitise func
  • Improve logic to sanitize GFD generated node labels
  • Add newline to pod logs
  • Add vfio manager
  • Add prepare-release.sh script
  • Don't require node-name to be set if not needed
  • Remove GitLab pipeline .gitlab.yml
  • E2E test: fix object names
  • Strip parentheses from the GPU product name
  • E2E test: instantiate a logger for helm outputs
  • E2E test: enhance logging via ginkgo/gomega
  • E2E test: remove e2elogs helper pkg
  • E2E test: Create HelmClient during Framework init
  • E2E test: Add -ginkgo.v flag to increase verbosity
  • E2E test: Create DiagnosticsCollector
  • Update vendoring
  • Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
  • Add dependabot updates for release-0.15

Full Changelog: v0.15.0...v0.16.0

v0.15.1

25 Jun 12:07
682d9fa

Changelog

  • Fix inconsistent usage of hasConfigMap helm template. This addresses cases where certain resources (roles and service accounts) would be created even if they were not required.
  • Raise an error in GFD when MPS is used with MIG. This ensures that the behavior across GFD and the Device Plugin is consistent.
  • Remove provenance information from published images.
  • Use half of total memory for size of MPS tmpfs by default.

v0.16.0-rc.1

18 Jun 15:02
0403911
Pre-release

Changelog

  • Add script to create release
  • Fix handling of device-discovery-strategy for GFD
  • Skip README updates for rc releases
  • Fix generate-changelog.sh script
  • Skip container updates if only CDI is selected
  • Allow cdi hook path to be set
  • Add nvidiaDevRoot config option
  • Detect devRoot for driver installation
  • Set /dev/shm size from /proc/meminfo
  • Remove redundant version log
  • Remove provenance information from image manifests
  • Add NGC image signing job for automatic signing
  • fix: target should be binaries
  • Allow device discovery strategy to be specified
  • Refactor cdi handler construction
  • Add addMigMonitorDevices field to nvidia-device-plugin.options helper
  • Fix allPossibleMigStrategiesAreNone helm chart helper
  • Use the Helm quote function to wrap boolean values in quotes
  • Fix usage of hasConfigMap
  • Make info, nvml, and device lib construction explicit
  • Clean up construction of WSL devices
  • Remove unused function
  • Don't require node-name to be set if not needed
  • Make vgpu failures non-fatal
  • Use HasTegraFiles over IsTegraSystem
  • Raise error for MPS when using MIG
  • Align container driver root envvars
  • Update github.com/NVIDIA/go-nvml to v0.12.0-6
  • Add unit test cases for sanitise func
  • Improve logic to sanitize GFD generated node labels
  • Add newline to pod logs
  • Add vfio manager
  • Add prepare-release.sh script
  • Don't require node-name to be set if not needed
  • Remove GitLab pipeline .gitlab.yml
  • E2E test: fix object names
  • Strip parentheses from the GPU product name
  • E2E test: instantiate a logger for helm outputs
  • E2E test: enhance logging via ginkgo/gomega
  • E2E test: remove e2elogs helper pkg
  • E2E test: Create HelmClient during Framework init
  • E2E test: Add -ginkgo.v flag to increase verbosity
  • E2E test: Create DiagnosticsCollector
  • Update vendoring
  • Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
  • Add dependabot updates for release-0.15