
Problem with too many processes starting in dclpar step #68

Closed

@lkelly3

Description

Hello

I have seen that a few other people have had problems at the step using dclpar, but I'm not sure if the error I am getting is due to the same issue.
I have tried to restart an analysis that previously didn't complete (in hindsight, it probably died due to the error I became aware of when restarting it). I restarted the analysis using the "-ft" option. It starts fine, and the log reports "16 thread(s) for highly parallel tasks (BLAST searches etc.)" and "1 thread(s) for OrthoFinder algorithm".

However, at the "Reconciling gene and species trees" step I start to get a lot of errors to the screen:
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2583728 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2583728 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2583728 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1024 current, 2583728 max
Traceback (most recent call last):
  File "/share/apps/orthofinder/1.1.4/venv/bin/dlcpar_search", line 15, in <module>
    import dlcpar.simplerecon
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/dlcpar/simplerecon.py", line 15, in <module>
    from compbio import coal
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/dlcpar/deps/compbio/coal.py", line 37, in <module>
    from scipy.optimize import brentq
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/scipy/__init__.py", line 61, in <module>
    from numpy import show_config as show_numpy_config
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/share/apps/orthofinder/1.1.4/venv/lib/python2.7/site-packages/numpy/core/__init__.py", line 16, in <module>
    from . import multiarray

I had to kill the job after this because it was putting too heavy a load on our cluster. I've been told that "OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable" is caused by too many processes being spawned on our machine, and that this would only happen if c. 1000 processes were spawned at once. My understanding is that with the default settings OrthoFinder should not be running more than 16 processes at once (as indicated in the log).
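(A possible explanation, not confirmed anywhere in this thread: OpenBLAS by default starts one thread per CPU core, so 16 simultaneous dlcpar_search instances on a many-core node could together attempt on the order of 1024 threads, right at the RLIMIT_NPROC shown in the log. If that is what's happening, capping OpenBLAS before launching OrthoFinder is a common workaround:)

```shell
# Hypothetical workaround: limit each process to a single OpenBLAS/OpenMP
# thread so 16 dlcpar_search instances don't multiply up to the 1024
# per-user process limit reported in the log.
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1

# Show the current per-user process limit for comparison with the log.
ulimit -u
```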

Do you have any ideas about the cause of this and how to fix it? As with other users, I have run analyses without problems on smaller datasets, but run into this issue when increasing the number of taxa (and input proteins).

Many thanks
Laura

Activity

davidemms (Owner) commented on May 15, 2017

Hi Laura

I haven't seen this issue with dlcpar before; I will try to investigate it. The 16 processes that OrthoFinder is using means that at this stage of the algorithm it will run 16 instances of dlcpar_search. Each of these instances of dlcpar_search should, I think, only run a single process, so it would seem odd if c. 1000 processes were spawned.

Just to check: do the error messages that start at
Traceback (most recent call last): File "/share/apps/orthofinder/1.1.4/venv/bin/dlcpar_search", line 15, in <module>
appear after you kill the job, or are they the root of the problem? If they are the error, they indicate that it occurs when dlcpar tries to import a standard scipy module with the command "from scipy.optimize import brentq", rather than during any calculations. You could check that typing this command into a python interpreter doesn't result in the same error messages, but if you've run OrthoFinder before on smaller datasets I'd guess this is a red herring.
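(That interpreter check can also be scripted; a minimal sketch, run with the venv's own python so it sees the same site-packages. The `importable` helper is mine, not part of OrthoFinder or dlcpar:)

```python
import importlib

def importable(name):
    """Return True if the named module imports cleanly in this interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False

# The dependency chain dlcpar_search walks, per the traceback above.
for mod in ["numpy", "scipy", "scipy.optimize"]:
    print(mod, "OK" if importable(mod) else "MISSING")
```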

All the best
David

davidemms (Owner) commented on May 15, 2017

It looks like it could be a problem that has developed with your numpy installation, see:
http://stackoverflow.com/questions/33506042/openblas-error-when-importing-numpy-pthread-creat-error-in-blas-thread-init-fu

and it may be related to resource limits being set on your machine:
http://stackoverflow.com/questions/39725880/numpy-import-fails-in-virtualenv-when-ulimit-v-is-set-openblas-resource-tempo

You'll need to be able to run the command "from scipy.optimize import brentq" successfully in python. If you can't, then there is a problem on the machine that needs to be resolved.

All the best
David

lkelly3 (Author) commented on May 15, 2017

Hi David

In answer to your first question, the error messages that start at
Traceback (most recent call last): File "/share/apps/orthofinder/1.1.4/venv/bin/dlcpar_search", line 15,
are from before I kill the job. I'll try running the "from scipy.optimize import brentq" command as you suggest and see what happens.

Thanks for your help
Laura

lkelly3 (Author) commented on May 15, 2017

Hi again

I tried the command in python and I get "ImportError: No module named scipy.optimize" so I guess that's the problem! I'll speak to our cluster administrators to see if they can fix it.

Many thanks
Laura

davidemms (Owner) commented on Jul 20, 2017

Hi

Any news on this? Were you able to track down the issue/find a solution?

Many thanks
David

lkelly3 (Author) commented on Jul 21, 2017

lkelly3 (Author) commented on Jul 25, 2017
