Releases: NVIDIA/k8s-device-plugin
v0.17.3
What's Changed
- Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.6 to 1.17.8 by @dependabot[bot] in #1275
- Bump nvidia/cuda from 12.9.0-base-ubi9 to 12.9.1-base-ubi9 in /deployments/container by @dependabot[bot] in #1300
- Bump github.com/NVIDIA/go-nvml from 0.12.4-1 to 0.12.9-0 by @dependabot[bot] in #1287
- Bump golang from 1.23.9 to 1.23.10 in /deployments/devel by @dependabot[bot] in #1283
- Bump golang from 1.23.10 to 1.23.11 in /deployments/devel by @dependabot[bot] in #1318
- Bump release v0.17.3 by @elezar in #1326
- Backport: Bump golang.org/x/oauth2 from 0.23.0 to 0.27.0 by @cdesiniotis in #1328
- Updated .release:staging to stage device-plugin images in nvstaging by @elezar in #1329
Full Changelog: v0.17.2...v0.17.3
v0.17.2
What's Changed
- Update nvidia.com/gpu.product label to include Blackwell architectures
- Update documentation to indicate that nvidia.com/gpu.memory label is in MiB instead of MB
Full Changelog: v0.17.1...v0.17.2
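The MiB/MB distinction noted for the nvidia.com/gpu.memory label matters when the label value is compared against decimal (vendor-style) figures. The arithmetic can be sketched as follows; the helper name mibToMB and the sample value 40960 are made up for illustration and are not taken from the plugin.

```go
package main

import "fmt"

const (
	bytesPerMiB = 1 << 20   // 1 MiB = 1,048,576 bytes (binary unit)
	bytesPerMB  = 1_000_000 // 1 MB  = 1,000,000 bytes (decimal unit)
)

// mibToMB converts a size in MiB to decimal megabytes.
// Hypothetical helper for illustration only.
func mibToMB(mib uint64) float64 {
	return float64(mib*bytesPerMiB) / bytesPerMB
}

func main() {
	labelValue := uint64(40960) // assumed example label value, in MiB
	fmt.Printf("%d MiB = %.2f MB\n", labelValue, mibToMB(labelValue))
	// prints "40960 MiB = 42949.67 MB"
}
```

Reading the same number as MB would understate the memory by roughly 4.9%.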
v0.17.1
What's Changed
- Bump golang from 1.23.2 to 1.23.3 in /deployments/devel by @dependabot in #1063
- Bump the k8sio group across 1 directory with 5 updates by @dependabot in #1066
- Ensure FAIL_ON_INIT_ERROR boolean env is quoted by @elezar in #1076
- Bump nvidia/cuda from 12.6.2-base-ubi9 to 12.6.3-base-ubi9 in /deployments/container by @dependabot in #1084
- Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.0 to 1.17.2 by @dependabot in #1068
- Bump google.golang.org/grpc from 1.65.0 to 1.65.1 by @dependabot in #1069
- Bump sigs.k8s.io/node-feature-discovery from 0.15.4 to 0.15.7 by @dependabot in #1070
- Bump NVIDIA/holodeck from 0.2.3 to 0.2.4 by @dependabot in #1064
- Honor fail-on-init-error when no resources are found by @elezar in #1061
- Bump github.com/opencontainers/selinux from 1.11.0 to 1.11.1 by @dependabot in #1067
- Add ada-lovelace architecture label for compute capability 8.9 by @elezar in #1090
- Switch to context package in go stdlib by @elezar in #1114
- Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.2 to 1.17.4 by @dependabot in #1138
- Bump nvidia/cuda from 12.6.3-base-ubi9 to 12.8.0-base-ubi9 in /deployments/container by @dependabot in #1142
- Bump NVIDIA/holodeck from 0.2.4 to 0.2.5 by @dependabot in #1131
- Bump slackapi/slack-github-action from 1.27.0 to 2.0.0 by @dependabot in #1065
- Bump github.com/NVIDIA/go-nvlib from 0.7.0 to 0.7.1 by @dependabot in #1151
- Ignore XID error 109 by @elezar in #1171
- Remove nvidia.com/gpu.imex-domain label by @elezar in #1152
- Bump azure/setup-helm from 4.2.0 to 4.3.0 by @dependabot in #1176
- Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.4 to 1.17.5-rc.1 by @elezar in #1192
Full Changelog: v0.17.0...v0.17.1
v0.17.0
What's Changed
v0.17.0
- Promote v0.17.0-rc.1 to GA
v0.17.0-rc.1
- Add CAP_SYS_ADMIN if volume-mounts list strategy is included
- Remove unneeded DEVICE_PLUGIN_MODE envvar
- Fix applying SELinux label for MPS
- Use a base image that aligns with the ubi-minimal base image
- Switch to a ubi9-based base image
- Remove namespace field from cluster-scoped resources
- Generate labels for IMEX clique and domain
- Add optional injection of the default IMEX channel
- Allow kubelet-socket to be specified as command line argument
v0.17.0-rc.1
What's Changed
- Add CAP_SYS_ADMIN if volume-mounts list strategy is included
- Remove unneeded DEVICE_PLUGIN_MODE envvar
- Fix applying SELinux label for MPS
- Use a base image that aligns with the ubi-minimal base image
- Switch to a ubi9-based base image
- Remove namespace field from cluster-scoped resources
- Generate labels for IMEX clique and domain
- Add optional injection of the default IMEX channel
- Allow kubelet-socket to be specified as command line argument
v0.16.2
What's Changed
- Fix applying SELinux label for MPS
- Remove unneeded DEVICE_PLUGIN_MODE envvar
- Add CAP_SYS_ADMIN if volume-mounts list strategy is included (fixes #856)
Full Changelog: v0.16.1...v0.16.2
v0.16.1
Changelog
What's Changed
- Bump nvidia-container-toolkit to v1.16.1 to fix a bug with CDI spec generation for MIG devices
Full Changelog: v0.16.0...v0.16.1
v0.16.0
Changelog
v0.16.0
- Fixed logic of atomic writing of the feature file
- Replaced WithDialer with WithContextDialer
- Fixed SELinux context of MPS pipe directory.
- Changed behavior for empty MIG devices to issue a warning instead of an error when the mixed strategy is selected
- Added a GFD node label for the GPU mode.
- Update CUDA base image version to 12.5.1
v0.16.0-rc.1
- Skip container updates if only CDI is selected
- Allow cdi hook path to be set
- Add nvidiaDevRoot config option
- Detect devRoot for driver installation
- Changed the automatically created MPS /dev/shm to half of the total memory as obtained from /proc/meminfo
- Remove redundant version log
- Remove provenance information from image manifests
- add ngc image signing job for auto signing
- fix: target should be binaries
- Allow device discovery strategy to be specified
- Refactor cdi handler construction
- Add addMigMonitorDevices field to nvidia-device-plugin.options helper
- Fix allPossibleMigStrategiesAreNone helm chart helper
- use the helm quote function to wrap boolean values in quotes
- Fix usage of hasConfigMap
- Make info, nvml, and device lib construction explicit
- Clean up construction of WSL devices
- Remove unused function
- Don't require node-name to be set if not needed
- Make vgpu failures non-fatal
- Use HasTegraFiles over IsTegraSystem
- Raise error for MPS when using MIG
- Align container driver root envvars
- Update github.com/NVIDIA/go-nvml to v0.12.0-6
- Add unit test cases for sanitise func
- Improving logic to sanitize GFD generated node labels
- Add newline to pod logs
- Adding vfio manager
- Add prepare-release.sh script
- Don't require node-name to be set if not needed
- Remove GitLab pipeline .gitlab.yml
- E2E test: fix object names
- strip parentheses from the gpu product name
- E2E test: instantiate a logger for helm outputs
- E2E test: enhance logging via ginkgo/gomega
- E2E test: remove e2elogs helper pkg
- E2E test: Create HelmClient during Framework init
- E2E test: Add -ginkgo.v flag to increase verbosity
- E2E test: Create DiagnosticsCollector
- Update vendoring
- Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
- Add dependabot updates for release-0.15
Full Changelog: v0.15.0...v0.16.0
v0.15.1
Changelog
- Fix inconsistent usage of the hasConfigMap helm template. This addresses cases where certain resources (roles and service accounts) would be created even if they were not required.
- Raise an error in GFD when MPS is used with MIG. This ensures that the behavior across GFD and the Device Plugin is consistent.
- Remove provenance information from published images.
- Use half of total memory for size of MPS tmpfs by default.
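The MPS tmpfs default described above (half of total memory, as read from /proc/meminfo) can be illustrated with a short sketch. It is written under stated assumptions and is not the plugin's code: the function names are hypothetical, and the sample text only mimics the MemTotal line format of /proc/meminfo, where the value is reported in kB.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// memTotalKiB scans meminfo-formatted text and returns the MemTotal value.
func memTotalKiB(meminfo string) (uint64, error) {
	scanner := bufio.NewScanner(strings.NewReader(meminfo))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		// Expected shape: "MemTotal:" "<value>" "kB"
		if len(fields) >= 2 && fields[0] == "MemTotal:" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	if err := scanner.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("MemTotal not found")
}

// defaultShmSizeKiB is a hypothetical helper returning half of MemTotal.
func defaultShmSizeKiB(meminfo string) (uint64, error) {
	total, err := memTotalKiB(meminfo)
	if err != nil {
		return 0, err
	}
	return total / 2, nil
}

func main() {
	// Sample input standing in for the contents of /proc/meminfo.
	sample := "MemTotal:       16384000 kB\nMemFree:         1234567 kB\n"
	size, err := defaultShmSizeKiB(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("tmpfs size: %dk\n", size)
	// prints "tmpfs size: 8192000k"
}
```

Taking the text as a string rather than opening the file keeps the sketch testable on systems without /proc; a real caller would read the file first and pass its contents in.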
v0.16.0-rc.1
Changelog
- Add script to create release
- Fix handling of device-discovery-strategy for GFD
- Skip README updates for rc releases
- Fix generate-changelog.sh script
- Skip container updates if only CDI is selected
- Allow cdi hook path to be set
- Add nvidiaDevRoot config option
- Detect devRoot for driver installation
- Set /dev/shm size from /proc/meminfo
- Remove redundant version log
- Remove provenance information from image manifests
- add ngc image signing job for auto signing
- fix: target should be binaries
- Allow device discovery strategy to be specified
- Refactor cdi handler construction
- Add addMigMonitorDevices field to nvidia-device-plugin.options helper
- Fix allPossibleMigStrategiesAreNone helm chart helper
- use the helm quote function to wrap boolean values in quotes
- Fix usage of hasConfigMap
- Make info, nvml, and device lib construction explicit
- Clean up construction of WSL devices
- Remove unused function
- Don't require node-name to be set if not needed
- Make vgpu failures non-fatal
- Use HasTegraFiles over IsTegraSystem
- Raise error for MPS when using MIG
- Align container driver root envvars
- Update github.com/NVIDIA/go-nvml to v0.12.0-6
- Add unit test cases for sanitise func
- Improving logic to sanitize GFD generated node labels
- Add newline to pod logs
- Adding vfio manager
- Add prepare-release.sh script
- Don't require node-name to be set if not needed
- Remove GitLab pipeline .gitlab.yml
- E2E test: fix object names
- strip parentheses from the gpu product name
- E2E test: instantiate a logger for helm outputs
- E2E test: enhance logging via ginkgo/gomega
- E2E test: remove e2elogs helper pkg
- E2E test: Create HelmClient during Framework init
- E2E test: Add -ginkgo.v flag to increase verbosity
- E2E test: Create DiagnosticsCollector
- Update vendoring
- Replace go-nvlib/pkg/nvml with go-nvml/pkg/nvml
- Add dependabot updates for release-0.15