Skip to content

as_tibble.matrix should keep rownames attribute #288

@ClaytonJY

Description

@ClaytonJY

Currently, as_tibble.matrix() throws away rownames:

library(tibble)

mtmatrix <- as.matrix(mtcars)
rownames(mtmatrix)
#>  [1] "Mazda RX4"           "Mazda RX4 Wag"       "Datsun 710"         
#>  [4] "Hornet 4 Drive"      "Hornet Sportabout"   "Valiant"            
#>  [7] "Duster 360"          "Merc 240D"           "Merc 230"           
#> [10] "Merc 280"            "Merc 280C"           "Merc 450SE"         
#> [13] "Merc 450SL"          "Merc 450SLC"         "Cadillac Fleetwood" 
#> [16] "Lincoln Continental" "Chrysler Imperial"   "Fiat 128"           
#> [19] "Honda Civic"         "Toyota Corolla"      "Toyota Corona"      
#> [22] "Dodge Challenger"    "AMC Javelin"         "Camaro Z28"         
#> [25] "Pontiac Firebird"    "Fiat X1-9"           "Porsche 914-2"      
#> [28] "Lotus Europa"        "Ford Pantera L"      "Ferrari Dino"       
#> [31] "Maserati Bora"       "Volvo 142E"
rownames(as_tibble(mtmatrix))
#>  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14"
#> [15] "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28"
#> [29] "29" "30" "31" "32"

This is inconsistent with as_tibble.data.frame, which makes sure to keep rownames

all.equal(
  rownames(mtcars),
  rownames(as_tibble(mtcars))
)
#> [1] TRUE

If a user has a rownamed matrix and wants a tibble with those rownames in it (e.g. confusion matrices from the randomForest package), they have to convert to a base data.frame first

mtmatrix %>%
  as.data.frame() %>%
  as_tibble() %>%
  rownames_to_column("rname")

Fixing this bug would remove the need for the as.data.frame() call, and make the steps match those for doing the same with a data.frame, e.g.

mtcars %>%
  as_tibble() %>%
  rownames_to_column("rname")

This only saves one step, but standardizing the tibble interface across matrices and data.frames where possible seems like a worthy UX improvement.

Session Info

Used newest github version to make sure this wasn't fixed already

devtools::session_info("tibble")
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.1 (2017-06-30)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2017-08-02
#> Packages ------------------------------------------------------------------
#>  package * version    date       source                           
#>  Rcpp      0.12.12    2017-07-15 cran (@0.12.12)                  
#>  rlang     0.1.1      2017-05-18 CRAN (R 3.4.0)                   
#>  tibble  * 1.3.3.9001 2017-08-02 Github (tidyverse/tibble@12767ae)

Activity

ClaytonJY

ClaytonJY commented on Aug 2, 2017

@ClaytonJY
Author

Looks like this conversion is done in C-land with matrixToDataFrame; does the fix need to happen in there?

That's a bit outside my skills, but I can try re-setting the attribute after the conversion, inside as_tibble.matrix, and put that in a PR with a test or two.

krlmlr

krlmlr commented on Sep 12, 2017

@krlmlr
Member

Thanks. I agree that there is some inconsistency, but I'd rather not support row names in tibbles altogether. We may want to give a warning at some point, so there seems to be no point in enabling this functionality now.

We may want to consider adding a key argument instead, once we figure out support for keys (tidyverse/dplyr#1792), a special value would then convert the row names to a column and also work for matrices.

ClaytonJY

ClaytonJY commented on Sep 12, 2017

@ClaytonJY
Author

@krlmlr I think the key idea is really interesting, and something I've definitely missed from data.table, so it's exciting to see work in that direction.

I agree that tibble shouldn't support rownames, but I think it's important to make it easy for users to work with data that has rownames through no fault of theirs. That's why I made the PR, to make it as easy to get matrix rownames into a column as it is for data.frames.

Even better might be to have the as_tibble methods for matrix/data.frames accept a "rownames" argument, which will do what rownames_to_column does up-front, thus avoiding the need to keep the rownames attribute on the tibble at all. Is that what you're thinking the keys will allow for?

ClaytonJY

ClaytonJY commented on Sep 12, 2017

@ClaytonJY
Author

If that wouldn't be handled best with new keys functionality, I'd be happy to write up a PR to modify as_tibble.matrix and as_tibble.data.frame to do the rownames-to-column conversion up front!

krlmlr

krlmlr commented on Oct 4, 2017

@krlmlr
Member

as_tibble(rownames = "id") works now for matrices and data frames.

added a commit that references this issue on Nov 3, 2017
added a commit that references this issue on Dec 28, 2017
github-actions

github-actions commented on Dec 12, 2020

@github-actions
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

locked and limited conversation to collaborators on Dec 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @krlmlr@ClaytonJY

      Issue actions

        as_tibble.matrix should keep rownames attribute · Issue #288 · tidyverse/tibble