Skip to content

Plotting environmental vectors in CCA #274

Closed
@kfontanez

Description

@kfontanez

Joey-

I've been using CCA to plot functional pathways. I'd like to plot the environmental vectors in a similar way that vegan does. Is there a way to do this in phyloseq?

I can make a pretty plot in phyloseq, which has everything I want, but it lacks the biplot arrows:

c=plot_ordination(TOP5,cca.deptreat,shape="TREATMENT",label="Definition",type="biplot",color="DEPTH",title="CCA of Top5 KEGG orthologs constrained by treatment*depth")
c+scale_color_gradient(limits=c(150,500),low="red",high="blue",expression(Depth[meters]))

screen shot 2013-12-04 at 11 30 08 am

In vegan I use the output of your ordinate function to make a plot with the arrows included:

cca.deptreat

Call: cca(formula = OTU ~ DEPTH * TREATMENT, data = data)

Inertia Proportion Rank
Total 0.4983 1.0000
Constrained 0.3211 0.6443 3
Unconstrained 0.1773 0.3557 3
Inertia is mean squared contingency coefficient

Eigenvalues for constrained axes:
CCA1 CCA2 CCA3
0.14973 0.11848 0.05285

Eigenvalues for unconstrained axes:
CA1 CA2 CA3
0.08029 0.05853 0.03846

plot(cca.deptreat,choices=c(1,2),display=c("wa","bp"),type="points",xlim=c(-4,1.5),scaling=2)
points(cca.deptreat,disp="sites",pch=21,col="red",bg="red",cex=1.3)
text(cca.deptreat,"sites",pos=3,axis.bp=TRUE)
screen shot 2013-12-04 at 11 28 44 am

I'd like to got those biplot arrows onto the phyloseq plot. Can this be done?

Thanks,
Kristina

Activity

joey711

joey711 commented on Dec 4, 2013

@joey711
Owner

Yes, see the geom_segment documentation for ggplot2:

http://docs.ggplot2.org/0.9.3.1/geom_segment.html

There is an arrow argument.

Tt would look something like:

arrowdata = ...
c + geom_segment(data=arrowdata, xend = 0, yend = 0, arrow = arrow(length = unit(0.1,"cm")))

You would have to get the arrow location data from the relevant ordination results to define arrowdata. I'll look into it more when I get a chance, but if you already know the coordinates, the above is how you would add it to your ggplot2 object.

kfontanez

kfontanez commented on Dec 4, 2013

@kfontanez
Author

Joey-

I might have figured it out but the results aren't as pretty as I'd like.

To get the biplot scores out of the ordination object:

scrs<-scores(cca.deptreat,display="bp")
scrs
CCA1 CCA2
DEPTH 0.05250891 -0.8341712
TREATMENTLive -0.98775033 -0.1287367
DEPTH:TREATMENTLive -0.79181802 -0.1913112

To get the appropriate scaling multiplier used in the vegan plot in my previous post:

mul<-vegan:::ordiArrowMul(scrs)
mul
[1] 2.605453

Next, multiply your biplot scores by the scaling multiplier:

arrowdata<-scrs*mul
arrowdata
CCA1 CCA2
DEPTH 0.1368095 -2.1733941
TREATMENTLive -2.5735373 -0.3354176
DEPTH:TREATMENTLive -2.0630449 -0.4984525

Use the scaled biplot scores to draw the DEPTH arrow.

c+geom_segment(aes(x=0,y=0,xend=0.137,yend=-2.173,arrow=arrow(length=unit(0.5
screen shot 2013-12-04 at 5 07 02 pm
,"cm"))))

The arrowhead is missing and the colors look odd so I must have done something wrong here. I wasn't able to use your data= flag correctly so I could use some tips about how to use the arrowdata variable as input to the geom_segment function.

When I tried it, I got this:

c+geom_segment(data=arrowdata,xend=0,yend=0,arrow=arrow(length=unit(0.5,"cm")))
Error: ggplot2 doesn't know how to deal with data of class matrix

I must be missing something basic. I tried changing the arrowdata to numeric with as.numeric but that failed as well.

dep<-arrowdata[1,]
dep
CCA1 CCA2
0.1368095 -2.1733941
dep<-as.numeric(arrowdata[1,])
dep
[1] 0.1368095 -2.1733941
c+geom_segment(data=dep,xend=0,yend=0,arrow=arrow(length=unit(0.5,"cm")))
Error: ggplot2 doesn't know how to deal with data of class numeric

What class does it have to be to work?

Thanks,
Kristina

joey711

joey711 commented on Dec 5, 2013

@joey711
Owner

That's a basic ggplot2 thing. Most data input to ggplot commands must be a data.frame. Try the as.data.frame command. The columns of the data.frame will need names. It will probably help to read a little more about how ggplot2 expects data and arguments to work.

kfontanez

kfontanez commented on Dec 6, 2013

@kfontanez
Author

I think I've got it working.

First, change arrowdata to data.frame:

arrowdata3<-as.data.frame(arrowdata)
arrowdata3
CCA1 CCA2
DEPTH 0.1368095 -2.1733941
TREATMENTLive -2.5735373 -0.3354176
DEPTH:TREATMENTLive -2.0630449 -0.4984525
class(arrowdata3)
[1] "data.frame"

Then, set the x and y limits of the arrow for the first variable, Depth:

xlim=arrowdata3[1,1]
ylim=arrowdata3[1,2]

Now, plot:

c+geom_segment(aes(xend=xlim,yend=ylim,x=0,y=0),colour="black", size=0.5,linetype=1, arrow=arrow(type="closed",angle=15))
screen shot 2013-12-06 at 11 12 12 am

The line is a bit wavy, which is not ideal but it works! You can do multiple arrows at a time by just adding multiple geom_segment calls like this:

c+geom_segment(aes(xend=xlim,yend=ylim,x=0,y=0),colour="black", size=0.5,linetype=1, arrow=arrow(type="closed",angle=15))+geom_segment(aes(xend=xlim2,yend=ylim2,x=0,y=0),colour="black",size=0.5,linetype=1,arrow=arrow(type="closed",angle=15))

screen shot 2013-12-06 at 11 17 06 am

joey711

joey711 commented on Dec 9, 2013

@joey711
Owner

That's cool!

The line waviness is probably an artifact of how you are viewing/saving the graphic images. ggplot2 creates pure vector graphics, so you should always be able to avoid pixelation/waviness if you need to / want to.

kfontanez

kfontanez commented on Dec 10, 2013

@kfontanez
Author

Great, thanks for your help, as always!

Kristina

joey711

joey711 commented on Dec 10, 2013

@joey711
Owner

I'm going to keep this open until I post some more general code for adding this to a plot. I think it is a commonly-needed graphical annotation.

joey711

joey711 commented on Dec 13, 2013

@joey711
Owner

Fully Reproducible Example for CCA

library("phyloseq")
packageVersion("phyloseq")
## [1] '1.7.10'
library("vegan")
packageVersion("vegan")
## [1] '2.0.9'
library("ggplot2")
packageVersion("ggplot2")
## [1] '0.9.3.1'
library("grid")
packageVersion("grid")
## [1] '3.0.2'
datafilepath = system.file("extdata/study_816_split_library_seqs_and_mapping.zip", 
    package = "phyloseq")
physeq = microbio_me_qiime(datafilepath)
## Found biom-format file, now parsing it... 
## Done parsing biom... 
## Importing Sample Metdadata from mapping file...
## Merging the imported objects... 
## Successfully merged, phyloseq-class created. 
##  Returning...
# Rename replicate description to just 'Litter'
sample_data(physeq)$Litter <- sample_data(physeq)$Description_duplicate
# Add a fake dummy variable Depth so that there are two with interaction,
# like in example
sample_data(physeq)$Depth <- sample(150:500, nsamples(physeq), replace = TRUE)
# Create ordination
cca_litdir = ordinate(physeq ~ Litter * Depth, "CCA")
# Mimic biplot from example
p0 = plot_ordination(physeq, cca_litdir, type = "biplot", color = "Depth", shape = "Litter")
p0
ggsave("issue274-0.png", p0)
## Saving 7 x 7 in image

issue274-0

# Now add the environmental variables as arrows
arrowmat = vegan::scores(cca_litdir, display = "bp")
# Add labels, make a data.frame
arrowdf <- data.frame(labels = rownames(arrowmat), arrowmat)
# Define the arrow aesthetic mapping
arrow_map = aes(xend = CCA1, yend = CCA2, x = 0, y = 0, shape = NULL, color = NULL, 
    label = labels)
label_map = aes(x = 1.2 * CCA1, y = 1.2 * CCA2, shape = NULL, color = NULL, 
    label = labels)
# Make a new graphic
arrowhead = arrow(length = unit(0.05, "npc"))
p1 = p0 + geom_segment(arrow_map, size = 0.5, data = arrowdf, color = "gray", 
    arrow = arrowhead) + geom_text(label_map, size = 2, data = arrowdf)
p1
ggsave("issue274-1.png", p1)
## Saving 7 x 7 in image

issue274-1

How to make labels have same angle as arrows

arrowdf = transform(arrowdf, radians = atan(CCA2/CCA1))
arrowdf = transform(arrowdf, angle = 360 * radians/(2 * pi))
# Quadrants II, III, IV
arrowdf$quad234 <- apply(arrowdf[, c("CCA1", "CCA2")], 1, function(x) {
    any(x < 0)
})
arrowdf$quad4 <- apply(arrowdf[, c("CCA1", "CCA2")], 1, function(x) {
    all(x < 0)
})
# If quadrant II, III, IV, add 180 to angle
if (any(arrowdf$quad234)) {
    arrowdf[arrowdf$quad234, "angle"] <- arrowdf[arrowdf$quad234, "angle"] + 
        180
}
# If quadrant IV, add additional 180
if (any(arrowdf$quad4)) {
    arrowdf[arrowdf$quad4, "angle"] <- arrowdf[arrowdf$quad4, "angle"] + 180
}
# For printing text, we want to flip if its greater than 90
if (any(arrowdf$angle > 90)) {
    arrowdf[arrowdf$angle > 90, "angle"] <- arrowdf[arrowdf$angle > 90, "angle"] - 
        180
}
# Make a new graphic
label_map = aes(x = 1.2 * CCA1, y = 1.2 * CCA2, shape = NULL, color = NULL, 
    label = labels, angle = angle)
p2 = p0 + geom_segment(arrow_map, size = 0.5, data = arrowdf, color = "gray", 
    arrow = arrowhead) + geom_text(label_map, size = 2, data = arrowdf, vjust = -1)
p2
ggsave("issue274-2.png", p2)
## Saving 7 x 7 in image

issue274-2

joey711

joey711 commented on Dec 13, 2013

@joey711
Owner

Thanks again, @kfontanez for the feedback, and for getting this example started. I think it is potentially very helpful for others wanting to use this display.

chadsmith123

chadsmith123 commented on Aug 7, 2015

@chadsmith123

Can either you comment on how to interpret the arrows for the interaction terms in this analysis? Thanks!

montoyah

montoyah commented on Mar 16, 2017

@montoyah

Hi all,

I'm trying to reproduce this example to add arrows to my CAP plot, following this tutorial but have no idea where the arguments for "display" like "sites" or "species" in "vegan::scores()" function come from. Must they be variable names in the ps object? I even replaced one of my sam_data columns to "sites" but throws an error:

Error in scores.default(ordcapPlot, display = "sites") : Can't find scores`

In my case, I have taxa down to genus, no species. The Vegan help pages I found didn't clarify my doubts.

abundancesTop20NoNA <- abundancesTop20 %>% subset_samples( !is.na(phosphate..mM..alkaline) & !is.na(phosphate..mM..acidic) & !is.na(Acetate..mM.) & !is.na(sulfate..mM.) )

bray_no_na <- phyloseq::distance(physeq = abundancesTop20NoNA, method = "bray")

ordcap = ordinate(abundancesTop20NoNA, #abundances, method = "CAP", distance = bray_no_na, formula = ~ sites * phosphate..mM..alkaline)

Create plot:

ordcapPlot <- plot_ordination(abundancesTop20NoNA, ordination = ordcap, #type = "biplot", color="sites", axes = c(1, 2)) + aes(shape = sampleReceived) + geom_point(aes(color = type),alpha = 0.4, size = 4) + geom_point(colour = "grey90", size = 1.5) + scale_color_manual(values = c("#a65628", "red")) #, "#ffae19", "#4daf4a", "#1919ff", "darkorchid3", "magenta")

Here'e were the error appears:
arrowmat <- scores(ordcapPlot, display = "sites")

Thanks in advance for any help,

Oscar.

psObject.zip

jflater

jflater commented on Jul 26, 2018

@jflater

Try

arrowmat <- vegan::scores(ordcapPlot, display = "sites") or try changing display to bp display = "bp"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @joey711@kfontanez@chadsmith123@montoyah@jflater

        Issue actions

          Plotting environmental vectors in CCA · Issue #274 · joey711/phyloseq