This repository was archived by the owner on Jan 22, 2024. It is now read-only.

Request nvidia-docker2 debian download not from repository #635

Closed
tfboyd opened this issue Feb 13, 2018 · 9 comments

Comments

@tfboyd

tfboyd commented Feb 13, 2018

You provided nvidia-docker downloads for nvidia-docker1 back in November when we (TensorFlow) asked, but the new nvidia-docker2 release does not have a binary we can download. We are not permitted to install from the repository, but due to a loophole we can download a binary and install it. Without this I cannot continue development on nvidia-docker or debug issues for users. This will be true for all internal contributors to TensorFlow, as we were all "force" upgraded.

Thank you for your help. I go out of my way to keep us moving forward faster on upgrading CUDA and cuDNN at the request of your team at our monthly meetings. If it helps, I can escalate this concern there as well. Thank you again for helping us find a path forward.

@3XX0
Member

3XX0 commented Feb 13, 2018

You can clone each repository and set them up as local repositories on your machine.
Something like this:

# Directory that will hold the local package mirrors
LOCALDIR=/var/lib/nvidia-docker-repo

mkdir -p $LOCALDIR && cd $LOCALDIR

# Each gh-pages branch contains the prebuilt debian packages
git clone -b gh-pages https://github.com/NVIDIA/libnvidia-container.git
git clone -b gh-pages https://github.com/NVIDIA/nvidia-container-runtime.git
git clone -b gh-pages https://github.com/NVIDIA/nvidia-docker.git

# Point apt at the local clones instead of the hosted repositories
sudo tee /etc/apt/sources.list.d/nvidia-docker.list <<< \
"deb file://$LOCALDIR/libnvidia-container/debian9/amd64 /
deb file://$LOCALDIR/nvidia-container-runtime/debian9/amd64 /
deb file://$LOCALDIR/nvidia-docker/debian9/amd64 /"

sudo apt-key add $LOCALDIR/nvidia-docker/gpgkey
sudo apt-get update

Updating these is just a matter of doing a git pull in each repository.
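For instance, a minimal sketch of that update step, assuming the $LOCALDIR layout from the snippet above:

# Refresh the local mirrors after new packages are published,
# then refresh apt's view of them.
LOCALDIR=/var/lib/nvidia-docker-repo
for repo in libnvidia-container nvidia-container-runtime nvidia-docker; do
    git -C $LOCALDIR/$repo pull
done
sudo apt-get update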

If you can't clone either, then download a snapshot of the branch from github as a tarball (i.e. replace git clone with curl | tar -xzf)

https://api.github.com/repos/nvidia/libnvidia-container/tarball/gh-pages
https://api.github.com/repos/nvidia/nvidia-container-runtime/tarball/gh-pages
https://api.github.com/repos/nvidia/nvidia-docker/tarball/gh-pages
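As a rough sketch of that tarball variant for one of the repositories, assuming the same $LOCALDIR layout as above (GitHub tarballs unpack into a directory named after the commit, hence the --strip-components):

LOCALDIR=/var/lib/nvidia-docker-repo
mkdir -p $LOCALDIR/nvidia-docker && cd $LOCALDIR

# Download the gh-pages snapshot and extract it into a fixed directory,
# stripping the commit-named top-level directory from the tarball.
curl -sL https://api.github.com/repos/nvidia/nvidia-docker/tarball/gh-pages \
  | tar -xzf - -C nvidia-docker --strip-components=1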


@tfboyd
Author

tfboyd commented Feb 13, 2018

I will give this a try. This seems less than ideal, but I understand priorities.


@tfboyd tfboyd closed this as completed Feb 13, 2018
@flx42
Member

flx42 commented Feb 13, 2018

@tfboyd What would be the ideal solution then?

a) The best solution is to use our package repository. The packages are signed with our own GPG key and behind GitHub's https. That's the best option for security and ease of upgrade.
By the way, are you also prevented from using our CUDA package repository?

b) The second best solution is to use git clone as described by @3XX0 above. Getting new packages requires a git pull, so it's a bigger hassle.

c) Third best is to use wget to download packages through https. It's painful since you have to dig out the new download links, install the packages in the right order, etc.

d) Least optimal option is to download tarballs through our website. It means you now install our software outside of your package manager, it's more difficult to track versions, and it's easier to shoot yourself in the foot by downloading the wrong tarball (i.e., today for nvidia-container-runtime I would need to publish 2 flavors of the tarball, with and without apparmor).

An alternative to c) would be to provide the infamous curl https://[...]/local_install.sh | sh mechanism, which would basically do the right wget and dpkg calls. Unfortunately, I would have to update the versions manually in this file, so I would like to avoid that if I can.
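As a rough illustration of what option c) involves by hand: the filenames and versions below are placeholders rather than real download links, and the install order reflects my reading of the nvidia-docker2 stack's package dependencies:

# Download each .deb over https (exact filenames/versions are placeholders)
wget https://nvidia.github.io/libnvidia-container/debian9/amd64/libnvidia-container1_<version>_amd64.deb
wget https://nvidia.github.io/libnvidia-container/debian9/amd64/libnvidia-container-tools_<version>_amd64.deb
wget https://nvidia.github.io/nvidia-container-runtime/debian9/amd64/nvidia-container-runtime-hook_<version>_amd64.deb
wget https://nvidia.github.io/nvidia-container-runtime/debian9/amd64/nvidia-container-runtime_<version>_amd64.deb
wget https://nvidia.github.io/nvidia-docker/debian9/amd64/nvidia-docker2_<version>_all.deb

# Install in dependency order: library, tools, runtime hook, runtime, then nvidia-docker2
sudo dpkg -i libnvidia-container1_*.deb libnvidia-container-tools_*.deb
sudo dpkg -i nvidia-container-runtime-hook_*.deb nvidia-container-runtime_*.deb
sudo dpkg -i nvidia-docker2_*.deb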

@tfboyd
Author

tfboyd commented Feb 13, 2018

@flx42

b/c seem fine. I likely will not upgrade after it is running unless something is broken. Not great, but not worth the hassle. I have four DGX-1s (V100s) for real testing and they are upgraded via your official releases.

We are not allowed to use external repositories. If I want your repository added, I need to go through a longish approval process and find at least two people who will maintain it and answer questions from other users. It is a good security process. Downloading a binary directly is allowed because I am basically taking personal responsibility for verifying the code/binary is not malicious.

I may look at going down the official path, but to be honest it is a bunch of work, and if I can hack it to run myself I am not sure of the value of going the official route. I might do it anyway, but then I have to manually verify your releases before adding them to our internal repository. It is very painful and no one is going to thank me for doing it. Not your problem, but it is nice to understand that other companies with decent security may have this same issue.

Thank you for the details. These problems are a pain.

@tfboyd
Author

tfboyd commented Feb 13, 2018

Oh, and a funny note: I answer a bunch of TensorFlow questions from random people, so I know how fun being technical support and helping people can be. Sorry for rattling your cage. You were really helpful. I was forced to upgrade my workstation and I am a bit frustrated at things not working. :-(

@flx42
Member

flx42 commented Feb 13, 2018

I answer a bunch of TensorFlow questions from random people, so I know how fun being technical support and helping people can be. Sorry for rattling your cage. You were really helpful. I was forced to upgrade my workstation and I am a bit frustrated at things not working. :-(

No, it's totally fine. TensorFlow was one of our earliest adopters and is probably our largest user base today. We want you to succeed :)

Let me know if I can do something to ease your hacks.

@heshchris

heshchris commented May 2, 2018

Can option (c) above be explored further? What would the path be to the packages themselves?

[Edit: answering my own question]

I found the method. If, for example, you have the following errors:

W: Failed to fetch https://nvidia.github.io/libnvidia-container/ubuntu14.04/amd64/Packages  gnutls_handshake() failed: Handshake failed
W: Failed to fetch https://nvidia.github.io/nvidia-container-runtime/ubuntu14.04/amd64/Packages  gnutls_handshake() failed: Handshake failed
W: Failed to fetch https://nvidia.github.io/nvidia-docker/ubuntu14.04/amd64/Packages  gnutls_handshake() failed: Handshake failed

Then:

  1. Get the Packages index:

wget https://nvidia.github.io/libnvidia-container/ubuntu14.04/amd64/Packages

  2. Look for "Filename:" lines in Packages.

  3. Download the listed .deb files from the same location, e.g.:

wget https://nvidia.github.io/libnvidia-container/ubuntu14.04/amd64/libnvidia-container1_1.0.0~beta.1-1_amd64.deb
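A rough sketch automating those steps for one repository, assuming the Filename: entries in Packages are relative paths (the repo URL is just the example above; swap in your distribution):

# Example repository path from above; adjust for your distribution.
REPO=https://nvidia.github.io/libnvidia-container/ubuntu14.04/amd64

# Fetch the package index, then download every .deb it lists.
wget -q "$REPO/Packages" -O Packages
grep '^Filename:' Packages | awk '{print $2}' | while read -r f; do
    wget "$REPO/${f#./}"
done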

@tfboyd
Author

tfboyd commented May 2, 2018

This just happened and may not be relevant to the issues others are having, but I wanted to report back the final solution for our situation.

For our case, we set up a small group of people to own pulling the builds into internal cloned repos. That may not be an option for everyone at companies that lock things down, though I find our policies often cover these types of options. It still means it will only be maintained as long as that small group keeps pulling down updates.

@flx42
Member

flx42 commented May 2, 2018

@tfboyd thanks for the feedback! Let us know if you have any other problems with the repo or nvidia-docker itself.
