Skip to content

getPeak2GeneLinks Error #442

Closed
Closed
@YanFang0620

Description

@YanFang0620

Hi,
I try to follow the steps for chapter"Peak2GeneLinkage with ArchR" using the turorial data.

After run the first step:

projHeme5 <- addPeak2GeneLinks(
ArchRProj = projHeme5,
reducedDims = "IterativeLSI"
)

I got this log file:

ArchR-addPeak2GeneLinks-c9f637af053-Date-2020-11-27_Time-12-49-47.log

However, when I run next step:

p2g <- getPeak2GeneLinks(
ArchRProj = projHeme5,
corCutOff = 0.45,
resolution = 1,
returnLoops = TRUE
)

I got an error message:

Error in .validInput(input = start, name = "start", valid = c("integer")) :
Input value for 'start' is not a integer, (start = numeric) please supply valid input!
In addition: Warning message:
In min(abs(c(input%%1, input%%1 - 1)), na.rm = TRUE) :
no non-missing arguments to min; returning Inf

Is there anyone could help me to solve this problem? Thanks a lot!

Activity

rcorces

rcorces commented on Nov 27, 2020

@rcorces
Collaborator

I cannot reproduce your error. Perhaps you have missed some steps in the tutorial, such as adding a peak set or a peak matrix.

YanFang0620

YanFang0620 commented on Nov 27, 2020

@YanFang0620
Author

I cannot reproduce your error. Perhaps you have missed some steps in the tutorial, such as adding a peak set or a peak matrix.

Thank you for your reply!
I checked the available matrix in projHeme5:

getAvailableMatrices(projHeme5)
[1] "ATACMatrix" "ChIPMatrix"
[3] "EncodeTFBSMatrix" "GeneIntegrationMatrix"
[5] "GeneScoreMatrix" "MotifMatrix"
[7] "PeakMatrix" "TileMatrix"

It seems like there already added a PeakMatrix in the ArchRProject.

So how could I check whether there's any peak set in my ArchRProject?

YanFang0620

YanFang0620 commented on Nov 27, 2020

@YanFang0620
Author

I cannot reproduce your error. Perhaps you have missed some steps in the tutorial, such as adding a peak set or a peak matrix.

I also tried to check:

head(projHeme5@peakSet)
GRanges object with 6 ranges and 12 metadata columns:
seqnames ranges strand | score
|
Mono chr1 752495-752995 * | 27.3149
CD4.N chr1 757871-758371 * | 5.55037
CD4.N chr1 762690-763190 * | 17.9242
GMP chr1 773670-774170 * | 4.90925
B chr1 801006-801506 * | 15.2793
B chr1 805039-805539 * | 37.7072
replicateScoreQuantile groupScoreQuantile Reproducibility

Mono 0.865 0.737 2
CD4.N 0.835 0.541 2
CD4.N 0.977 0.917 2
GMP 0.513 0.176 2
B 0.685 0.418 2
B 0.955 0.895 2
GroupReplicate distToGeneStart nearestGene

Mono Mono..scATAC_BMMC_R1 10156 LINC00115
CD4.N CD4.N.
.scATAC_PBMC_R1 4780 LINC00115
CD4.N CD4.N..scATAC_PBMC_R1 30 LINC01128
GMP GMP.
.scATAC_CD34_BMMC_R1 10948 LINC01128
B B..scATAC_PBMC_R1 10925 FAM41C
B B.
.scATAC_BMMC_R1 6892 FAM41C
peakType distToTSS nearestTSS GC idx

Mono Distal 10156 uc001aau.3 0.485 1
CD4.N Distal 4780 uc001aau.3 0.5609 2
CD4.N Promoter 30 uc021oeh.1 0.6966 3
GMP Intronic 10741 uc021oeh.1 0.4571 4
B Distal 10925 uc021oei.1 0.4371 5
B Intronic 6892 uc021oei.1 0.7285 6


seqinfo: 23 sequences from an unspecified genome; no seqlengths

So there should be peak set in the ArchRProject.
Actually, all of steps were ran smoothly except the getPeak2GeneLinks() step.

rcorces

rcorces commented on Nov 28, 2020

@rcorces
Collaborator

Can you run the following code line by line and tell me what happens?

ArchRProj = projHeme5
corCutOff = 0.45
resolution = 1

coA <- metadata(ArchRProj@peakSet)$CoAccessibility
coA <- coA[coA$correlation >= corCutOff,,drop=FALSE]
peakSummits <- resize(metadata(coA)$peakSet, 1, "center")
summitTiles <- floor(start(peakSummits) / resolution) * resolution + floor(resolution / 2)

loops <- ArchR:::.constructGR(
  seqnames = seqnames(peakSummits[coA[,1]]),
  start = summitTiles[coA[,1]],
  end = summitTiles[coA[,2]]
)
YanFang0620

YanFang0620 commented on Nov 28, 2020

@YanFang0620
Author

Can you run the following code line by line and tell me what happens?

ArchRProj = projHeme5
corCutOff = 0.45
resolution = 1

coA <- metadata(ArchRProj@peakSet)$CoAccessibility
coA <- coA[coA$correlation >= corCutOff,,drop=FALSE]
peakSummits <- resize(metadata(coA)$peakSet, 1, "center")
summitTiles <- floor(start(peakSummits) / resolution) * resolution + floor(resolution / 2)

loops <- ArchR:::.constructGR(
  seqnames = seqnames(peakSummits[coA[,1]]),
  start = summitTiles[coA[,1]],
  end = summitTiles[coA[,2]]
)

Hi, I ran the code line by line,and finally I got the loops:

> loops
GRanges object with 96566 ranges and 0 metadata columns:
          seqnames              ranges strand
             <Rle>           <IRanges>  <Rle>
      [1]     chr1       845576-856623      *
      [2]     chr1       845576-856623      *
      [3]     chr1       856623-901555      *
      [4]     chr1       894703-895398      *
      [5]     chr1       894703-895398      *
      ...      ...                 ...    ...
  [96562]     chrX 153665228-153686239      *
  [96563]     chrX 153941234-153959650      *
  [96564]     chrX 153941234-153959650      *
  [96565]     chrX 153959650-153960343      *
  [96566]     chrX 153959650-153960343      *
  -------
  seqinfo: 23 sequences from an unspecified genome; no seqlengths
rcorces

rcorces commented on Nov 28, 2020

@rcorces
Collaborator

That just doesnt make sense to me.

If you are running this command:

p2g <- getPeak2GeneLinks(
ArchRProj = projHeme5,
corCutOff = 0.45,
resolution = 1,
returnLoops = TRUE
)

EDIT NOTE: The below is actually not correct. Copy-paste error.
Then the below code is exactly what is being run. If you can run this and it doesnt return an error, then you should try to re-run your original command.

ArchRProj = projHeme5
corCutOff = 0.45
resolution = 1
returnLoops = TRUE



if(is.null(ArchRProj@peakSet)){
  return(NULL)
}


if(is.null(metadata(ArchRProj@peakSet)$CoAccessibility)){
  return(NULL)
}else{
  
  coA <- metadata(ArchRProj@peakSet)$CoAccessibility
  coA <- coA[coA$correlation >= corCutOff,,drop=FALSE]
  
  if(returnLoops){
    
    peakSummits <- resize(metadata(coA)$peakSet, 1, "center")
    
    if(!is.null(resolution)){
      summitTiles <- floor(start(peakSummits) / resolution) * resolution + floor(resolution / 2)
    }else{
      summitTiles <- start(peakSummits)
    }
    
    loops <- ArchR:::.constructGR(
      seqnames = seqnames(peakSummits[coA[,1]]),
      start = summitTiles[coA[,1]],
      end = summitTiles[coA[,2]]
    )
    metadata(coA) <- list()
    mcols(loops) <- coA[,-c(1:3)]
    mcols(loops)$value <- coA$correlation
    
    loops <- loops[order(mcols(loops)$value, decreasing=TRUE)]
    loops <- unique(loops)
    loops <- loops[width(loops) > 0]
    loops <- sort(sortSeqlevels(loops))
    
    loops <- SimpleList(CoAccessibility = loops)

    return(loops)
  }else{
    return(coA)
  }
  
}
YanFang0620

YanFang0620 commented on Nov 28, 2020

@YanFang0620
Author

ArchRProj = projHeme5
corCutOff = 0.45
resolution = 1
returnLoops = TRUE

Thank you so much for your suggestion.
However, if I try to run the original codes you provide above,finally I got an error:

Error: no function to return from, jumping to top level
rcorces

rcorces commented on Nov 29, 2020

@rcorces
Collaborator

Of course you cannot run the line that says return(loops) because you are not in a function to return from. Can you run the code line by line instead?

Also, have you tried re-running your original command:

p2g <- getPeak2GeneLinks(
ArchRProj = projHeme5,
corCutOff = 0.45,
resolution = 1,
returnLoops = TRUE
)
YanFang0620

YanFang0620 commented on Nov 29, 2020

@YanFang0620
Author

If I run:

p2g <- getPeak2GeneLinks(
ArchRProj = projHeme5,
corCutOff = 0.45,
resolution = 1,
returnLoops = TRUE
)

It still showed me:

Error in .validInput(input = start, name = "start", valid = c("integer")) : 
  Input value for 'start' is not a integer, (start = numeric) please supply valid input!
In addition: Warning message:
In min(abs(c(input%%1, input%%1 - 1)), na.rm = TRUE) :
  no non-missing arguments to min; returning Inf
rcorces

rcorces commented on Nov 29, 2020

@rcorces
Collaborator

What happens when you run all of this code? It should print one number (1 - 4).

ArchRProj = projHeme5, 
corCutOff = 0.45, 
FDRCutOff = 0.0001,
varCutOffATAC = 0.25,
varCutOffRNA = 0.25,
resolution = 1, 
returnLoops = TRUE

if(is.null(ArchRProj@peakSet)){
  print("1")
}

if(is.null(metadata(ArchRProj@peakSet)$Peak2GeneLinks)){
  print("2")
}else{
  
  p2g <- metadata(ArchRProj@peakSet)$Peak2GeneLinks
  p2g <- p2g[which(p2g$Correlation >= corCutOff & p2g$FDR <= FDRCutOff), ,drop=FALSE]
  
  if(!is.null(varCutOffATAC)){
    p2g <- p2g[which(p2g$VarQATAC > varCutOffATAC),]
  }
  
  if(!is.null(varCutOffRNA)){
    p2g <- p2g[which(p2g$VarQRNA > varCutOffRNA),]
  }
  
  if(returnLoops){
    
    peakSummits <- resize(metadata(p2g)$peakSet, 1, "center")
    geneStarts <- resize(metadata(p2g)$geneSet, 1, "start")
    
    if(!is.null(resolution)){
      summitTiles <- floor(start(peakSummits) / resolution) * resolution + floor(resolution / 2)
      geneTiles <- floor(start(geneStarts) / resolution) * resolution + floor(resolution / 2)
    }else{
      summitTiles <- start(peakSummits)
      geneTiles <- start(geneTiles)
    }
    
    loops <- ArchR:::.constructGR(
      seqnames = seqnames(peakSummits[p2g$idxATAC]),
      start = summitTiles[p2g$idxATAC],
      end = geneTiles[p2g$idxRNA]
    )
    mcols(loops)$value <- p2g$Correlation
    mcols(loops)$FDR <- p2g$FDR
    
    loops <- loops[order(mcols(loops)$value, decreasing=TRUE)]
    loops <- unique(loops)
    loops <- loops[width(loops) > 0]
    loops <- sort(sortSeqlevels(loops))
    
    loops <- SimpleList(Peak2GeneLinks = loops)
    
    print("3")
    
  }else{
    
    print("4")
    
  }
  
}
YanFang0620

YanFang0620 commented on Nov 29, 2020

@YanFang0620
Author

It didn't print 1-4. It showed me:

image

rcorces

rcorces commented on Nov 29, 2020

@rcorces
Collaborator

Ok well at least that is consistent. Try this:

run this code:

ArchRProj = projHeme5, 
corCutOff = 0.45, 
FDRCutOff = 0.0001,
varCutOffATAC = 0.25,
varCutOffRNA = 0.25,
resolution = 1, 
returnLoops = TRUE

if(is.null(ArchRProj@peakSet)){
  print("1")
}

if(is.null(metadata(ArchRProj@peakSet)$Peak2GeneLinks)){
  print("2")
}else{
  
  p2g <- metadata(ArchRProj@peakSet)$Peak2GeneLinks
  p2g <- p2g[which(p2g$Correlation >= corCutOff & p2g$FDR <= FDRCutOff), ,drop=FALSE]
  
  if(!is.null(varCutOffATAC)){
    p2g <- p2g[which(p2g$VarQATAC > varCutOffATAC),]
  }
  
  if(!is.null(varCutOffRNA)){
    p2g <- p2g[which(p2g$VarQRNA > varCutOffRNA),]
  }
  
  if(returnLoops){
    
    peakSummits <- resize(metadata(p2g)$peakSet, 1, "center")
    geneStarts <- resize(metadata(p2g)$geneSet, 1, "start")
    
    if(!is.null(resolution)){
      summitTiles <- floor(start(peakSummits) / resolution) * resolution + floor(resolution / 2)
      geneTiles <- floor(start(geneStarts) / resolution) * resolution + floor(resolution / 2)
    }else{
      summitTiles <- start(peakSummits)
      geneTiles <- start(geneTiles)
    }
  }
}

Then show me the output of each of these lines:

head(seqnames(peakSummits[p2g$idxATAC]))

str(summitTiles)
str(geneTiles)

str(summitTiles[p2g$idxATAC])
str(geneTiles[p2g$idxRNA])
YanFang0620

YanFang0620 commented on Nov 29, 2020

@YanFang0620
Author

The following are the output I got:

> head(seqnames(peakSummits[p2g$idxATAC]))
factor-Rle of length 0 with 0 runs
  Lengths: 
  Values : 
Levels(23): chr1 chr2 chr3 chr4 chr5 chr6 ... chr19 chr20 chr21 chr22 chrX
> str(summitTiles)
 num [1:140865] 752745 758121 762940 773920 801256 ...
> str(geneTiles)
 num [1:18601] 69091 762902 812182 860530 894679 ...
> str(summitTiles[p2g$idxATAC])
 num(0) 
> str(geneTiles[p2g$idxRNA])
 num(0) 
rcorces

rcorces commented on Nov 29, 2020

@rcorces
Collaborator

Ok. That shows where the problem is coming from. Show me:
metadata(projHeme5@peakSet)$Peak2GeneLinks

YanFang0620

YanFang0620 commented on Nov 29, 2020

@YanFang0620
Author
> metadata(projHeme5@peakSet)$Peak2GeneLinks
DataFrame with 900358 rows and 6 columns
         idxATAC    idxRNA         Correlation       FDR
       <integer> <integer>           <numeric> <numeric>
1              1         2  -0.137592741888392       NaN
2              2         2  -0.014714218205002       NaN
3              3         2  -0.275911420781701       NaN
4              4         2   0.213618225737381       NaN
5              5         2  0.0532897125162421       NaN
...          ...       ...                 ...       ...
900354    140861     18601 -0.0632468248590322       NaN
900355    140862     18601   -0.20948768020154       NaN
900356    140863     18601  -0.154581524969957       NaN
900357    140864     18601 -0.0844574975792414       NaN
900358    140865     18601  0.0175566266875722       NaN
                 VarQATAC           VarQRNA
                <numeric>         <numeric>
1       0.717133425620275 0.490672544486856
2      0.0561104603698577 0.490672544486856
3       0.886913001810244 0.490672544486856
4        0.22905618854932 0.490672544486856
5       0.268434316544209 0.490672544486856
...                   ...               ...
900354  0.107315514854648 0.480619321541853
900355  0.104277144784013 0.480619321541853
900356  0.536797643133497 0.480619321541853
900357  0.397692826465055 0.480619321541853
900358  0.219422851666489 0.480619321541853
rcorces

rcorces commented on Nov 29, 2020

@rcorces
Collaborator

I guess I owe you an apology. Sorry - this is related to the issue #385 where you originally posted but I didnt catch it until now. As in that issue, the FDRs are all NaN.

So this line:
p2g <- p2g[which(p2g$Correlation >= corCutOff & p2g$FDR <= FDRCutOff), ,drop=FALSE]
is causing p2g to be empty which causes your error.

I'm going to close this and we can continue the discussion on the previous thread. Sorry that this is not working at the moment. We will get it fixed.

jgranja24

jgranja24 commented on Nov 29, 2020

@jgranja24
Contributor

Sorry I think the bug is now fixed in the new release branch -----

To install --
devtools::install_github("GreenleafLab/ArchR", ref="release_1.0.1", repos = BiocManager::repositories())

The change --
o$TStat <- (o$Correlation / sqrt((pmax(1-o$Correlation^2, 0.00000000000000001, na.rm = TRUE))/(ncol(seATAC)-2))) #T-statistic P-value

YanFang0620

YanFang0620 commented on Nov 29, 2020

@YanFang0620
Author

I guess I owe you an apology. Sorry - this is related to the issue #385 where you originally posted but I didnt catch it until now. As in that issue, the FDRs are all NaN.

So this line:
p2g <- p2g[which(p2g$Correlation >= corCutOff & p2g$FDR <= FDRCutOff), ,drop=FALSE]
is causing p2g to be empty which causes your error.

I'm going to close this and we can continue the discussion on the previous thread. Sorry that this is not working at the moment. We will get it fixed.

No, its OK. Thank you so much for your patience and let me know about this.
Hope it could be fixed soon. This is a really great package for scATAC-seq analysis!

sylestiel

sylestiel commented on Dec 4, 2020

@sylestiel

@rcorces
If installation of the revised version is the solution it is still not addressing the problem for me.
devtools::install_github("GreenleafLab/ArchR", ref="release_1.0.1", repos = BiocManager::repositories())

rcorces

rcorces commented on Dec 4, 2020

@rcorces
Collaborator

That doesnt sound possible. did you re-run addPeak2GeneLinks() as I suggested?

sylestiel

sylestiel commented on Dec 4, 2020

@sylestiel

I just did it and still does not work!!
Screen Shot 2020-12-04 at 1 01 54 PM

sylestiel

sylestiel commented on Dec 4, 2020

@sylestiel

Screen Shot 2020-12-04 at 1 03 14 PM

jgranja24

jgranja24 commented on Dec 4, 2020

@jgranja24
Contributor

Can you copy the output if you paste addPeak2GeneLinks into R which returns the function. Are you sure you reset your r session and reloading ArchR?

rcorces

rcorces commented on Dec 4, 2020

@rcorces
Collaborator

@sylestiel - "it doesnt work" is relative. It is no longer giving you the error and the function is running to completion.

sylestiel

sylestiel commented on Dec 4, 2020

@sylestiel

addPeak2GeneLinks
function (ArchRProj = NULL, reducedDims = "IterativeLSI", useMatrix = "GeneIntegrationMatrix",
dimsToUse = 1:30, scaleDims = NULL, corCutOff = 0.75, cellsToUse = NULL,
k = 100, knnIteration = 500, overlapCutoff = 0.8, maxDist = 250000,
scaleTo = 10^4, log2Norm = TRUE, predictionCutoff = 0.4,
addEmpiricalPval = FALSE, seed = 1, threads = max(floor(getArchRThreads()/2),
1), verbose = TRUE, logFile = createLogFile("addPeak2GeneLinks"))
{
.validInput(input = ArchRProj, name = "ArchRProj", valid = c("ArchRProj"))
.validInput(input = reducedDims, name = "reducedDims", valid = c("character"))
.validInput(input = dimsToUse, name = "dimsToUse", valid = c("numeric",
"null"))
.validInput(input = scaleDims, name = "scaleDims", valid = c("boolean",
"null"))
.validInput(input = corCutOff, name = "corCutOff", valid = c("numeric",
"null"))
.validInput(input = cellsToUse, name = "cellsToUse", valid = c("character",
"null"))
.validInput(input = k, name = "k", valid = c("integer"))
.validInput(input = knnIteration, name = "knnIteration",
valid = c("integer"))
.validInput(input = overlapCutoff, name = "overlapCutoff",
valid = c("numeric"))
.validInput(input = maxDist, name = "maxDist", valid = c("integer"))
.validInput(input = scaleTo, name = "scaleTo", valid = c("numeric"))
.validInput(input = log2Norm, name = "log2Norm", valid = c("boolean"))
.validInput(input = threads, name = "threads", valid = c("integer"))
.validInput(input = verbose, name = "verbose", valid = c("boolean"))
.validInput(input = logFile, name = "logFile", valid = c("character"))
tstart <- Sys.time()
.startLogging(logFile = logFile)
.logThis(mget(names(formals()), sys.frame(sys.nframe())),
"addPeak2GeneLinks Input-Parameters", logFile = logFile)
.logDiffTime(main = "Getting Available Matrices", t1 = tstart,
verbose = verbose, logFile = logFile)
AvailableMatrices <- getAvailableMatrices(ArchRProj)
if ("PeakMatrix" %ni% AvailableMatrices) {
stop("PeakMatrix not in AvailableMatrices")
}
if (useMatrix %ni% AvailableMatrices) {
stop(paste0(useMatrix, " not in AvailableMatrices"))
}
ArrowFiles <- getArrowFiles(ArchRProj)
tstart <- Sys.time()
dfAll <- .safelapply(seq_along(ArrowFiles), function(x) {
DataFrame(cellNames = paste0(names(ArrowFiles)[x], "#",
h5read(ArrowFiles[x], paste0(useMatrix, "/Info/CellNames"))),
predictionScore = h5read(ArrowFiles[x], paste0(useMatrix,
"/Info/predictionScore")))
}, threads = threads) %>% Reduce("rbind", .)
.logDiffTime(sprintf("Filtered Low Prediction Score Cells (%s of %s, %s)",
sum(dfAll[, 2] < predictionCutoff), nrow(dfAll), round(sum(dfAll[,
2] < predictionCutoff)/nrow(dfAll), 3)), t1 = tstart,
verbose = verbose, logFile = logFile)
keep <- sum(dfAll[, 2] >= predictionCutoff)/nrow(dfAll)
dfAll <- dfAll[which(dfAll[, 2] > predictionCutoff), ]
set.seed(seed)
peakSet <- getPeakSet(ArchRProj)
.logThis(peakSet, "peakSet", logFile = logFile)
geneSet <- .getFeatureDF(ArrowFiles, useMatrix, threads = threads)
geneStart <- GRanges(geneSet$seqnames, IRanges(geneSet$start,
width = 1), name = geneSet$name, idx = geneSet$idx)
.logThis(geneStart, "geneStart", logFile = logFile)
rD <- getReducedDims(ArchRProj, reducedDims = reducedDims,
corCutOff = corCutOff, dimsToUse = dimsToUse)
if (!is.null(cellsToUse)) {
rD <- rD[cellsToUse, , drop = FALSE]
}
idx <- sample(seq_len(nrow(rD)), knnIteration, replace = !nrow(rD) >=
knnIteration)
.logDiffTime(main = "Computing KNN", t1 = tstart, verbose = verbose,
logFile = logFile)
knnObj <- .computeKNN(data = rD, query = rD[idx, ], k = k)
.logDiffTime(main = "Identifying Non-Overlapping KNN pairs",
t1 = tstart, verbose = verbose, logFile = logFile)
keepKnn <- determineOverlapCpp(knnObj, floor(overlapCutoff *
k))
knnObj <- knnObj[keepKnn == 0, ]
.logDiffTime(paste0("Identified ", nrow(knnObj), " Groupings!"),
t1 = tstart, verbose = verbose, logFile = logFile)
knnObj <- lapply(seq_len(nrow(knnObj)), function(x) {
rownames(rD)[knnObj[x, ]]
}) %>% SimpleList
chri <- gtools::mixedsort(unique(paste0(seqnames(peakSet))))
chrj <- gtools::mixedsort(unique(paste0(seqnames(geneStart))))
chrij <- intersect(chri, chrj)
geneDF <- mcols(geneStart)
peakDF <- mcols(peakSet)
geneDF$seqnames <- seqnames(geneStart)
peakDF$seqnames <- seqnames(peakSet)
.logDiffTime(main = "Getting Group RNA Matrix", t1 = tstart,
verbose = verbose, logFile = logFile)
groupMatRNA <- .getGroupMatrix(ArrowFiles = getArrowFiles(ArchRProj),
featureDF = geneDF, groupList = knnObj, useMatrix = useMatrix,
threads = threads, verbose = FALSE)
rawMatRNA <- groupMatRNA
.logThis(groupMatRNA, "groupMatRNA", logFile = logFile)
.logDiffTime(main = "Getting Group ATAC Matrix", t1 = tstart,
verbose = verbose, logFile = logFile)
groupMatATAC <- .getGroupMatrix(ArrowFiles = getArrowFiles(ArchRProj),
featureDF = peakDF, groupList = knnObj, useMatrix = "PeakMatrix",
threads = threads, verbose = FALSE)
rawMatATAC <- groupMatATAC
.logThis(groupMatATAC, "groupMatATAC", logFile = logFile)
.logDiffTime(main = "Normalizing Group Matrices", t1 = tstart,
verbose = verbose, logFile = logFile)
groupMatRNA <- t(t(groupMatRNA)/colSums(groupMatRNA)) * scaleTo
groupMatATAC <- t(t(groupMatATAC)/colSums(groupMatATAC)) *
scaleTo
if (log2Norm) {
groupMatRNA <- log2(groupMatRNA + 1)
groupMatATAC <- log2(groupMatATAC + 1)
}
names(geneStart) <- NULL
seRNA <- SummarizedExperiment(assays = SimpleList(RNA = groupMatRNA,
RawRNA = rawMatRNA), rowRanges = geneStart)
metadata(seRNA)$KNNList <- knnObj
.logThis(seRNA, "seRNA", logFile = logFile)
names(peakSet) <- NULL
seATAC <- SummarizedExperiment(assays = SimpleList(ATAC = groupMatATAC,
RawATAC = rawMatATAC), rowRanges = peakSet)
metadata(seATAC)$KNNList <- knnObj
.logThis(seATAC, "seATAC", logFile = logFile)
rm(groupMatRNA, groupMatATAC)
gc()
.logDiffTime(main = "Finding Peak Gene Pairings", t1 = tstart,
verbose = verbose, logFile = logFile)
o <- DataFrame(findOverlaps(.suppressAll(resize(seRNA, 2 *
maxDist + 1, "center")), resize(rowRanges(seATAC), 1,
"center"), ignore.strand = TRUE))
o$distance <- distance(rowRanges(seRNA)[o[, 1]], rowRanges(seATAC)[o[,
2]])
colnames(o) <- c("B", "A", "distance")
if (addEmpiricalPval) {
.logDiffTime(main = "Computing Background Correlations",
t1 = tstart, verbose = verbose, logFile = logFile)
nullCor <- .getNullCorrelations(seATAC, seRNA, o, 1000)
}
.logDiffTime(main = "Computing Correlations", t1 = tstart,
verbose = verbose, logFile = logFile)
o$Correlation <- rowCorCpp(as.integer(o$A), as.integer(o$B),
assay(seATAC), assay(seRNA))
o$VarAssayA <- .getQuantiles(matrixStats::rowVars(assay(seATAC)))[o$A]
o$VarAssayB <- .getQuantiles(matrixStats::rowVars(assay(seRNA)))[o$B]
o$TStat <- (o$Correlation/sqrt((max(1 - o$Correlation^2,
1e-17))/(ncol(seATAC) - 2)))
o$Pval <- 2 * pt(-abs(o$TStat), ncol(seATAC) - 2)
o$FDR <- p.adjust(o$Pval, method = "fdr")
out <- o[, c("A", "B", "Correlation", "FDR", "VarAssayA",
"VarAssayB")]
colnames(out) <- c("idxATAC", "idxRNA", "Correlation", "FDR",
"VarQATAC", "VarQRNA")
mcols(peakSet) <- NULL
names(peakSet) <- NULL
metadata(out)$peakSet <- peakSet
metadata(out)$geneSet <- geneStart
if (addEmpiricalPval) {
out$EmpPval <- 2 * pnorm(-abs(((out$Correlation - mean(nullCor[[2]]))/sd(nullCor[[2]]))))
out$EmpFDR <- p.adjust(out$EmpPval, method = "fdr")
}
dir.create(file.path(getOutputDirectory(ArchRProj), "Peak2GeneLinks"),
showWarnings = FALSE)
outATAC <- file.path(getOutputDirectory(ArchRProj), "Peak2GeneLinks",
"seATAC-Group-KNN.rds")
.safeSaveRDS(seATAC, outATAC, compress = FALSE)
outRNA <- file.path(getOutputDirectory(ArchRProj), "Peak2GeneLinks",
"seRNA-Group-KNN.rds")
.safeSaveRDS(seRNA, outRNA, compress = FALSE)
metadata(out)$seATAC <- outATAC
metadata(out)$seRNA <- outRNA
metadata(ArchRProj@peakSet)$Peak2GeneLinks <- out
.logDiffTime(main = "Completed Peak2Gene Correlations!",
t1 = tstart, verbose = verbose, logFile = logFile)
.endLogging(logFile = logFile)
ArchRProj
}
<bytecode: 0x1076420a0>
<environment: namespace:ArchR>

jgranja24

jgranja24 commented on Dec 4, 2020

@jgranja24
Contributor

You need to reload R and ArchR.

This issue has been fixed in the branch you are referring to (See Above) --

o$TStat <- (o$Correlation/sqrt((max(1 - o$Correlation^2,
1e-17))/(ncol(seATAC) - 2)))
sylestiel

sylestiel commented on Dec 4, 2020

@sylestiel

Ok. Many Thanks!!! I will do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @rcorces@jgranja24@YanFang0620@sylestiel

        Issue actions

          getPeak2GeneLinks Error · Issue #442 · GreenleafLab/ArchR