Closed
Description
Hi, thanks for your great work!
But when I try to run the finetune command in the default af2 conda environment, I get errors that seem to come from Haiku/Jax version mismatches. Could you tell me which versions of the dependencies you used? It would be great if you could add an env.yaml or requirements.txt to the code!
Thanks again.
Activity
wangy9711 commented on Oct 21, 2022
In addition, I picked specific versions of the dependencies to get the code to run, but found that training used 73 GB of GPU memory. Is this normal?
ena2016 commented on Dec 2, 2022
Hey, what version of jax did you use to get the program running?
I have the same issue: the newer jax versions have changed their syntax.
wangy9711 commented on Dec 2, 2022
Here are the versions of the key third-party libraries; with these versions I can run finetune normally~
jax 0.2.19
jaxlib 0.1.69+cuda111
dm-haiku 0.0.5
dm-tree 0.1.6
jmp 0.0.2
optax 0.1.3
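A minimal sketch for comparing an existing environment against the versions listed above, assuming Python 3.8+ (for importlib.metadata). The helper name check_pins is hypothetical and not part of the repository, and the "+cuda111" suffix on jaxlib is ignored because the installed metadata may or may not report it.

```python
from importlib import metadata

# Versions reported to work in this thread. The jaxlib "+cuda111" local tag
# is dropped here; only the public version number is compared.
PINS = {
    "jax": "0.2.19",
    "jaxlib": "0.1.69",
    "dm-haiku": "0.0.5",
    "dm-tree": "0.1.6",
    "jmp": "0.0.2",
    "optax": "0.1.3",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Strip any "+local" suffix before comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

If the script prints nothing, the six packages match the versions above; anything it prints is a candidate cause of the Haiku/Jax errors.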
MeiMunick commented on Dec 3, 2022
Hey Wang, I got stuck when running the peptide-MHC fine-tuning (both on the tiny dataset and with the full model) as instructed on the author's GitHub page, and got:
FileNotFoundError: [Errno 2] No such file or directory: '/home/pbradley/csdat/alphafold/data/params/params_model_2_ptm.npz'
Have you ever had the same problem? Thank you for helping.
phbradley commented on Dec 3, 2022
Hi all, thanks for the helpful discussions and sorry I was not in town to reply sooner. For the above error, you need to use the --data_dir command line flag and provide the path to the directory that contains the AlphaFold params/ folder. I've updated the README to clarify this.
phbradley commented on Dec 3, 2022
With regard to memory usage, in our experience the training should not take more than 11-12 GB of GPU memory.
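For the FileNotFoundError reported earlier in this thread, a quick pre-flight check along these lines may help confirm that the directory passed to --data_dir is the right one. This is only a sketch: the function name missing_params is hypothetical, and the params/params_<model>.npz layout is inferred from the path in the error message rather than taken from the repository code.

```python
from pathlib import Path


def missing_params(data_dir, model_names=("model_2_ptm",)):
    """Return the expected parameter files that are absent under data_dir."""
    # Layout assumed from the error message: <data_dir>/params/params_<model>.npz
    params_dir = Path(data_dir) / "params"
    expected = [params_dir / f"params_{name}.npz" for name in model_names]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    import sys

    # Pass the same directory you intend to give to --data_dir.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in missing_params(data_dir):
        print(f"missing: {path}")
```

Running it with the directory intended for --data_dir should print nothing when params_model_2_ptm.npz is in place; a "missing:" line means the path points at the wrong level (for example at params/ itself rather than its parent).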
phbradley commented on Dec 3, 2022
Sorry for the trouble with jax/python/conda environments. I am not an expert in this, and it took some work to get things to work on our machines originally, but I think much of that was specific to the GPUs that we were using and the various versions of CUDA and associated libraries. Plus it took a combination of conda and pip and some more conda and some more pip. So, that's all to say that I'm not sure that our requirements.txt would be generally useful, and I'm not even sure it would contain all the relevant information given the conda/pip mixing.
Nevertheless, our jax version is 0.2.22.
Here is the result of running conda list --explicit (not sure why that doesn't show jax!?!)
and here is what conda list gives. This one does seem to show jax:
Hope that helps!
wangy9711 commented on Dec 5, 2022
Thanks for this information. The main reason for the high memory consumption was that I forgot to configure the remat parameter (rematerialization, also called memory checkpointing) when changing the code. With it configured correctly, GPU memory consumption is also below 10 GB~
jsko-arontier commented on Mar 10, 2023
Thank you for the great work!
I was able to run the program with no errors in a conda environment created with the following options. You can create the conda environment from a yml file.
phbradley commented on Jun 12, 2023
Thanks jsko-arontier for posting that helpful info! I realized that the info on environments that I posted above was for a slightly older variant than the one I used for the calculations in the paper. My apologies for that; here is the output of
conda env export
with the correct environment. I would try the one in the previous post first, since it looks cleaner, but if it doesn't work this could be an alternative: