Handling GRNboost output #40
I think that we are waiting for this function to be implemented: Also referred to here:
I've found SCENIC tremendously useful, but I look forward to being able to skip the GENIE3 step, as it is very time-consuming compared with what we've seen with GRNBoost. |
What worked for me was to rename the columns of the GRNBoost output .tsv to the names required by the downstream wrapper runSCENIC_1_coexNetwork2modules() and save the file in the respective SCENIC int/ directory, e.g.: |
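The renaming step described above can be sketched in Python before the table is handed to R. This is only a minimal sketch under stated assumptions: that the GRNBoost/pySCENIC adjacency table has the columns TF, target and importance, and that the SCENIC link list uses TF, Target and weight. The function name `rename_grnboost_columns` is hypothetical, and the result would still need to be read into R and saved in the form SCENIC expects in int/.

```python
import csv
import io

# Assumed mapping (not confirmed in this thread): GRNBoost column -> SCENIC link-list column.
RENAME = {"TF": "TF", "target": "Target", "importance": "weight"}


def rename_grnboost_columns(src, dst, delimiter="\t"):
    """Copy an adjacency table from src to dst with renamed columns; return the row count."""
    reader = csv.DictReader(src, delimiter=delimiter)
    missing = [c for c in RENAME if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    writer = csv.DictWriter(
        dst, fieldnames=list(RENAME.values()), delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    n_rows = 0
    for row in reader:
        # Keep only the three mapped columns, under their new names.
        writer.writerow({new: row[old] for old, new in RENAME.items()})
        n_rows += 1
    return n_rows


# Tiny in-memory demo with made-up values; real use would pass open file handles.
demo_in = io.StringIO("TF\ttarget\timportance\nSOX2\tNES\t12.5\n")
demo_out = io.StringIO()
rename_grnboost_columns(demo_in, demo_out)
print(demo_out.getvalue())
```

Raising on missing columns is a deliberate choice here, so that a file with an unexpected header fails early rather than producing an empty link list.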
Thanks for the suggestion! I tried that and ended up with the following error:
It seems like the GENIE3 weights are <1, while the pySCENIC weights/importance values range from 0.02 to 160. |
Indeed GRNboost has a different range for the scores as compared to GENIE3. But so far I have not observed an impact for my analysis and it also should not have one.
I checked whether having the weight column as a factor in the GRNBoost output table gives me the same error message. Indeed, this is the case: if the "weight" column is of class() factor, I get the identical error message. You could check the class() of the columns in your input table and convert the weight column to numeric. |
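The same kind of check can be done on the table itself before it reaches R. The sketch below reports the range of the weight column and any rows whose weight does not parse as a number, which is the kind of value that could cause a column to be read as non-numeric. The column name `weight` and the helper name `summarize_weights` are assumptions for illustration.

```python
import csv
import io


def summarize_weights(handle, column="weight", delimiter="\t"):
    """Return (min, max, bad_lines) for a numeric column of a link-list table.

    bad_lines holds the 1-based file line numbers whose value is not a number.
    """
    values, bad_lines = [], []
    # Data rows start on file line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(handle, delimiter=delimiter), start=2):
        try:
            values.append(float(row[column]))
        except (TypeError, ValueError):
            bad_lines.append(line_no)
    if not values:
        return None, None, bad_lines
    return min(values), max(values), bad_lines


# Made-up rows for illustration only.
demo = io.StringIO("TF\tTarget\tweight\nA\tB\t0.02\nA\tC\t160\nA\tD\tNA\n")
print(summarize_weights(demo))
```

If the list of bad lines is non-empty, fixing or dropping those rows before import is one way to keep the column numeric.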
Hi @dpcook @JBreunig @jpezoldt, my dataset is too big for GENIE3 and I have been trying to run GRNboost using the suggested vignette. However, I feel like I have missed something, because my code does not seem to run. Could you please tell me which SCENIC output you use as input for GRNboost: 1.1_genesKept.Rds, 1.2_corrMat.Rds, cellInfo.Rds, colVars.Rds or scenicOptions.Rds? Thank you |
I moved on to a scanpy-->pyscenic workflow. I don't believe that I ever got things to work going from pySCENIC to SCENIC but that was before the tutorials were written. |
Hi @JBreunig Thank you. I was wondering if you used Jupyter Notebook or the CLI for the grn step of pySCENIC?
!pyscenic grn {f_loom_path_scenic} {f_tfs} -o adj.csv --num_workers 20
I am not sure whether it is a version compatibility issue with dependencies and dask or something else. Thank you |
I use spyder. But for that particular step, I just go back to the CLI and run it as a typical linux command (from the appropriate directory).
|
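For anyone who wants to stay in Python but still run the grn step as an ordinary command, one option is to assemble the argument list and pass it to a subprocess. The sketch below only builds and prints the command quoted above; the file names `expr.loom` and `tfs.txt` and the helper `build_grn_command` are placeholders, not names from the thread.

```python
import shlex


def build_grn_command(loom_path, tfs_path, output="adj.csv", num_workers=20):
    """Build the argument list for the `pyscenic grn` call quoted in this thread."""
    return [
        "pyscenic", "grn",
        str(loom_path), str(tfs_path),
        "-o", str(output),
        "--num_workers", str(num_workers),
    ]


cmd = build_grn_command("expr.loom", "tfs.txt")  # placeholder input files
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the appropriate directory would then run it outside the notebook kernel, assuming pyscenic is installed and on the PATH.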
Thank you @JBreunig Just one more question, especially this for tornado: Thanks |
It looks like: |
Thank you very much @JBreunig. Hopefully it works. Thanks again |
Hi @JBreunig Sorry to bug again, Thank you
/home/kbaral/anaconda3/lib/python3.7/site-packages/dask/config.py:161: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
2020-04-10 19:58:06,483 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.
2020-04-10 19:58:12,336 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
'infer_data failed for target KIAA2013' Retry (9/10). Failure caused by ValueError("Regression for target gene KIAA2013 failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target AGTRAP' Retry (9/10). Failure caused by ValueError("Regression for target gene AGTRAP failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target AC092807.2' Retry (10/10). Failure caused by ValueError("Regression for target gene AC092807.2 failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target PIK3CD' Retry (8/10). Failure caused by ValueError("Regression for target gene PIK3CD failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target FBXO6' Retry (9/10). Failure caused by ValueError("Regression for target gene FBXO6 failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target PI4KB' Retry (1/10). Failure caused by ValueError("Regression for target gene PI4KB failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target GIPC2' Retry (10/10). Failure caused by ValueError("Regression for target gene GIPC2 failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target TMCO4' Retry (8/10). Failure caused by ValueError("Regression for target gene TMCO4 failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target PIK3CD' Retry (10/10). Failure caused by ValueError("Regression for target gene PIK3CD failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target TRMT1L' Retry (1/10). Failure caused by ValueError("Regression for target gene TRMT1L failed. Cause ValueError('buffer source array is read-only').").
'infer_data failed for target AC093158.1' Retry (10/10). Failure caused by ValueError("Regression for target gene AC093158.1 failed. Cause ValueError('buffer source array is read-only').").
not shutting down client, client was created externally
Expected partition of type |
No, I haven't seen those errors. I did have to play with the num_workers argument, but those warnings were different. I adjusted my code from this: But I've now used it on over a dozen datasets without issue (10X and Smart-seq v4). I might recommend just running this tutorial to try to determine whether it is a version issue, the data format, or something else. |
Thank you very much @JBreunig. I tweaked it and got it to work, except for the part where I have to put the outputs from GRN, AUCell and the regulons together into a loom file. The problem is that I ran AUCell in the CLI, so I don't have the AUC as a matrix, and when I try to run this excerpt of code: add_scenic_metadata(adata, auc_mtx, regulons) I keep running into an AssertionError: Thank you. I appreciate your help. |
The particular step "add_scenic_metadata(adata, auc_mtx, regulons)" is only adding the metadata to the anndata structure. The loom output is farther down (export2loom) but I don't think that's the issue here. The more likely issue is that I think that there were some variable name and folder name ambiguities in that particular notebook that you have to adjust/correct or maybe your matrices aren't matching up. For example if I recollect correctly with auc_mtx = aucell(exp_mtx, regulons, num_workers=20), I don't think that exp_mtx is loaded and so you have to do that elsewhere. For the second potential issue, did you check that the size of your auc_mtx and exp_mtx correspond in the appropriate dimension (i.e. cell number only as genes vs. regulons will be different between the two)? I recently was unable to perform this step on a 300,000K cell dataset, leading to me reinstalling anaconda with no luck--still troubleshooting that. But otherwise I haven't had an error as long as my inputs (adata[and thus exp_mtx], and auc_mtx) matched up. If I make an upstream change in adata that doesn't correspond to the raw matrix I'd passed to pySCENIC, it will cause issues because of size mismatches. |
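The dimension check suggested above can be sketched as a comparison of cell identifiers. This is a minimal, library-free sketch: it assumes you can extract the cell IDs of both objects as plain sequences (for example from `adata.obs_names` and `auc_mtx.index`, which is an assumption about how the objects were loaded), and the helper name `compare_cell_ids` is hypothetical.

```python
def compare_cell_ids(expr_cells, auc_cells):
    """Report cells present in only one of the two matrices and whether the order matches."""
    expr_set, auc_set = set(expr_cells), set(auc_cells)
    return {
        "only_in_expr": sorted(expr_set - auc_set),
        "only_in_auc": sorted(auc_set - expr_set),
        "same_order": list(expr_cells) == list(auc_cells),
    }


# Made-up cell IDs: the expression matrix has one cell the AUC matrix lacks.
report = compare_cell_ids(["c1", "c2", "c3"], ["c1", "c3"])
print(report)
```

Any non-empty difference would point to an upstream change in adata that was not reflected in the matrix passed to pySCENIC.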
Hi, I keep getting this error: I saw this error on GitHub but found no solution to it. Did you ever run into this error? Thank you. Much appreciated. And sorry I keep bugging you. Thank you |
No worries...happy to give what little assistance I can offer ;) Unfortunately, I haven't had time to troubleshoot that function. I tried it once, but it didn't include the features that I wanted in SCOPE because I didn't have them in the appropriate format to add to the loom file. It may have been that same error, and it caused me to drop some of the data (i.e., if you remove the cell_annotations or tree structure argument, does it complete successfully? If so, something in one of those is incompatible with the HDF5 format). See linnarsson-lab/loompy#12. In general, I am just saving things as an h5ad file. |
I see, thank you. The regulon name does not seem to be compatible with SCOPE. It should include a space to allow selection of the TF. I believe that this might be the issue here. My main concerns are: (i) whether my regulons are in the correct format (I followed the vignette you linked for regulons), (ii) where and how I perform the data analysis, and (iii) how I compile everything together. Thank you again. |
I'm mainly looking at my data in Scanpy. An alternate way I was going to try to make the loom was to start at [65] in this notebook: https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/PBMC10k_SCENIC-protocol-CLI.ipynb I haven't had a chance yet...good luck! |
I believe the "header" parameter for the function "read.delim" should be "TRUE". |
Hi, |
I haven't used the package since 2018, so I am not aware of any updates. Stein's team should be on top of this. Maybe you already figured it out.
Sorry for the late reply and the zero input.
Regardless, have a nice Easter.
Joern
Jörn Pezoldt
On Thursday, 16 March 2023 at 02:12:48 CET, J_A wrote:
Hi,
I have been running pySCENIC using singularity and got the 3 main important files: auc_mtx.csv, adjacencies.tsv, regulons.csv.
Was there any updated method to import those outputs into SCENIC in R?
Is there a way to save them to be read directly with the ScenicOptions class in R?
source: https://pyscenic.readthedocs.io/en/latest/installation.html
Thank you!
|
The easiest way is to download 1.4_GENIE3_linkList.Rds and move it to your int folder: |
Hi there. I've been going through the vignettes and I can't seem to figure out how to get scenicOptions to point to the tsv output file from GRNboost (or a data frame of the results after importing it into R).
Any advice on how to use GRNboost results in the SCENIC pipeline would be greatly appreciated!