Difference in results between uwot & umap-learn? #2025

Closed
samuel-marsh opened this issue Aug 26, 2019 · 2 comments

Comments

@samuel-marsh
Collaborator

samuel-marsh commented Aug 26, 2019

Hi,

I noticed that Seurat recently moved to uwot as the default implementation of UMAP. I understand the decision, given the preference for running natively in R over depending on Python. I'm just wondering whether you have any guidance or thoughts on differences in results between the two implementations.

I ran a side-by-side comparison on a recent dataset. Below are the code and the results from the two methods. The only difference between the plots is the umap.method argument.

As you can see, the two methods produce plots that differ in a number of ways, both in the relative location of clusters and in the spacing between clusters (most noticeably cluster 5).

Thanks very much!
Sam

Edit
Just to add: this analysis was performed in Seurat v3.1, installed from CRAN this morning into a fresh packrat library, on R 3.5.1 and macOS Mojave.

# Normalization and PCA, shared by both runs
all_cns <- SCTransform(all_cns, vars.to.regress = "percent.mt", verbose = TRUE, variable.features.n = NULL)
all_cns <- RunPCA(all_cns, verbose = FALSE, npcs = 75)

# Run 1: UMAP via uwot (R-native)
all_cns_uwot <- RunUMAP(all_cns, dims = 1:60, verbose = FALSE, n.neighbors = 30, min.dist = 0.3, umap.method = "uwot")
all_cns_uwot <- FindNeighbors(all_cns_uwot, dims = 1:60, verbose = FALSE)
all_cns_uwot <- FindClusters(all_cns_uwot, resolution = 0.4, verbose = FALSE)
DimPlot(all_cns_uwot, label = TRUE)

# Run 2: UMAP via umap-learn (Python); all other arguments are identical
all_cns_umap <- RunUMAP(all_cns, dims = 1:60, verbose = FALSE, n.neighbors = 30, min.dist = 0.3, umap.method = "umap-learn")
all_cns_umap <- FindNeighbors(all_cns_umap, dims = 1:60, verbose = FALSE)
all_cns_umap <- FindClusters(all_cns_umap, resolution = 0.4, verbose = FALSE)
DimPlot(all_cns_umap, label = TRUE)

uwot Results:
uwot.pdf

umap-learn Results:
umap-learn.pdf

@samuel-marsh
Collaborator Author

Sorry, one more update. I realized I hadn't updated my umap-learn Python package in a little while.
The plots in the original post above were generated with umap-learn 0.3.2.

I updated to the newest release, 0.3.10, and re-ran the code from scratch.

Here is the plot:
umap-learn (updated).pdf

While the major spacing differences have gone away, there are still many differences apparent between the uwot implementation and the native Python implementation.
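
Because the umap-learn version (0.3.2 versus 0.3.10) changed the result here, it can help to confirm which version is installed before comparing plots. The following is a minimal sketch using only the Python standard library (Python 3.8+). The helper name `installed_version` is hypothetical, and the check reports on whichever Python interpreter runs it, which is assumed (not guaranteed) to be the same one R is linked to.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this Python environment.
        return None


if __name__ == "__main__":
    # Prints the umap-learn version for this interpreter, or None if not installed.
    print("umap-learn:", installed_version("umap-learn"))
```

Recording this value alongside each plot makes it easier to tell version effects apart from implementation effects.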

Thanks,
Sam

@andrewwbutler
Collaborator

Hi Sam,

For a comparison of uwot with the Python implementation of UMAP, please see this part of the uwot documentation. Based on this, it is expected that the two implementations won't always agree, but we can't really advise on which you "should use". We opted to make uwot the default as it seemed generally comparable in terms of results and runtime, while also hopefully making installation a bit simpler (there were quite a number of issues from people trying to link their Python packages with R). We left both options available to maintain reproducibility for those wanting to regenerate old plots.
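
Since the two implementations are not expected to agree exactly, one way to judge whether two layouts differ in substance or only in appearance is to compare each cell's nearest neighbours in the two embeddings. The sketch below is not part of Seurat, uwot, or umap-learn; the function names and the choice of k-nearest-neighbour overlap as the metric are assumptions made for illustration. It uses brute-force distances, so it is only suitable for a modest number of cells, and it assumes both embeddings have been exported as arrays with rows in the same cell order.

```python
import numpy as np


def knn_indices(points: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbours of each row, excluding the row itself."""
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)   # squared Euclidean distances, shape (n, n)
    np.fill_diagonal(d2, np.inf)      # a point is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def mean_knn_overlap(emb_a: np.ndarray, emb_b: np.ndarray, k: int = 15) -> float:
    """Mean fraction of each point's k nearest neighbours shared by both embeddings."""
    if emb_a.shape[0] != emb_b.shape[0]:
        raise ValueError("embeddings must contain the same cells in the same order")
    if not 0 < k < emb_a.shape[0]:
        raise ValueError("k must be between 1 and n - 1")
    nn_a = knn_indices(emb_a, k)
    nn_b = knn_indices(emb_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(200, 2))   # stand-in for a real 2-D embedding
    mirrored = emb * np.array([-1.0, 1.0])
    # A mirrored layout looks different on a plot but keeps the same neighbours.
    print(mean_knn_overlap(emb, mirrored, k=10))
```

An overlap near 1 suggests the layouts differ mainly by rotation, reflection, or cluster placement, while a low overlap suggests the local structure itself differs. Which threshold counts as "close enough" is a judgment call that this sketch does not settle.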
