Error in catboost.from_matrix(as.matrix(float_and_cat_features_data), : Unsupported label type, expecting double or integer, got character #1874
Comments
Hello, @rtedesco1197! Given this message, I assume that your label is either a factor or a character column. Unfortunately, CatBoost in R does not support either of those yet. You can work around this by manually converting your label to integer type:
If your problem goes beyond this, could you please provide a reproducible example for further investigation?
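The snippet originally posted with this comment did not survive in the thread. As a minimal sketch of the conversion being described, assuming a hypothetical toy data frame `df` with a two-level factor column `label` (these names are illustrative, not taken from the issue), it could look like this:

```r
# Hypothetical toy data: two numeric predictors and a two-level factor label.
df <- data.frame(
  x1 = c(0.1, 0.4, 0.35, 0.8, 0.9),
  x2 = c(1.2, 0.7, 0.3, 0.5, 0.1),
  label = factor(c("no", "yes", "yes", "no", "no"))
)

# as.integer() on a factor returns the level codes (1, 2, ...),
# so subtract 1 to obtain 0/1 labels.
# For a character column, wrap it in factor() first.
label_int <- as.integer(df$label) - 1L

# Build the pool only if the catboost package is installed.
if (requireNamespace("catboost", quietly = TRUE)) {
  pool <- catboost::catboost.load_pool(
    data = df[, c("x1", "x2")],
    label = label_int
  )
}
```

Note that the level order decides which class becomes 0 and which becomes 1, so it is worth checking `levels(df$label)` before converting.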
Thanks for the advice, @Glemhel. However, when I use an integer vector (0,1,1,0,0) as the outcome in classification mode, I get:
When I take away the classification mode definition I get:
My data is:
Hello again, @rtedesco1197! The problem is that some of the libraries you use (tune, workflows; I am not sure which one does the actual training) require the target to be a factor in classification mode. Without this mode, as you mentioned, classification-specific functions such as class probability predictions are not available ("For probability predictions, the object should be a classification model").
Unfortunately, CatBoost in R is not capable of handling a factor target yet, and that is why you get the initial error message about an incorrect label type (the factor is converted to character on the way inside catboost.R). That means there is no simple fix for your problem yet. But supporting a factor target is a useful feature indeed, thank you for raising this problem!
However, I discovered that for this workflow to work, one has to install the treesnip library. Am I right that you are using it? I experimented with this a bit: I forked treesnip and added that condition. You can install my version via
So, to sum up: thank you for reporting the need for factor support in R, and try installing my fork of treesnip as a workaround for your problem! Feel free to ask in case of any difficulties!
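The fork's actual change is not shown in the thread. As an illustration only of what a wrapper has to do when the modelling framework supplies a factor outcome while the engine expects integers, here is a sketch using two hypothetical helpers, `encode_outcome()` and `decode_outcome()`. Neither is part of catboost or treesnip.

```r
# Hypothetical helpers; not part of catboost or treesnip.

# Turn a factor (or character) outcome into 0-based integer codes
# and remember the levels so predictions can be mapped back.
encode_outcome <- function(y) {
  if (is.character(y)) y <- factor(y)
  if (is.factor(y)) {
    list(codes = as.integer(y) - 1L, levels = levels(y))
  } else {
    list(codes = y, levels = NULL)
  }
}

# Map 0-based integer codes back to a factor with the original levels.
decode_outcome <- function(codes, lvls) {
  if (is.null(lvls)) return(codes)
  factor(lvls[codes + 1L], levels = lvls)
}

y <- factor(c("ctrl", "case", "case", "ctrl"))
enc <- encode_outcome(y)
round_trip <- decode_outcome(enc$codes, enc$levels)
```

Keeping the levels alongside the codes is what lets the class predictions be returned as a factor, which is what classification mode expects.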
That is working great, thank you so much for taking the time. Another problem, however: when I try to utilize my GPU with:
I get the error:
However, an rsm parameter is not present in the code I shared with you. Any idea what is going on there? Apologies in advance if this should be a different issue.
It looks like a bug in the catboost interface of treesnip: rsm was incorrectly divided by the number of features in all cases.
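To make the described bug concrete: CatBoost's `rsm` is a fraction of features in (0, 1], whereas `mtry` is a count of features, so an interface has to divide one by the other. A minimal sketch of a guarded conversion follows, using a hypothetical helper `mtry_to_rsm()`. It is not the treesnip code and not necessarily what the linked proposal does. It leaves `rsm` unset when `mtry` was never supplied, so that no `rsm` value is passed to CatBoost at all.

```r
# Hypothetical helper; not the actual treesnip implementation.
mtry_to_rsm <- function(mtry, n_features) {
  # If the user did not set mtry, return NULL so that no rsm value
  # is passed on and CatBoost applies its own default.
  if (is.null(mtry)) return(NULL)
  # rsm is a fraction of features, so cap the ratio at 1.
  min(1, mtry / n_features)
}

rsm_unset <- mtry_to_rsm(NULL, n_features = 2)  # NULL
rsm_half  <- mtry_to_rsm(1, n_features = 2)     # 0.5
rsm_cap   <- mtry_to_rsm(5, n_features = 2)     # 1
```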
A better solution is proposed here: curso-r/treesnip#20
Thanks for all your help :), but now I am getting:
Sorry for dragging you down the rabbit hole.
I only seem to get this error if I do not do
Same error occurs although I installed your modified treesnip package:
remotes::install_github("Glemhel/treesnip")
library(tidymodels)  # assumed: supplies boost_tree(), workflow(), tune_grid()
library(treesnip)

# Model specification; model_recipe and miR_cv are defined elsewhere
catboost_model <-
  boost_tree(
    mode = "classification",
    mtry = tune(),      # default [1, ?]
    trees = 1000,       # default [1, 2000]
    min_n = 20,         # default [2, 40]
    tree_depth = 6,     # default [1, 15]
    learn_rate = 0.05,  # default [-10, -1]
    engine = "catboost"
  )

catboost_wf <-
  workflow() %>%
  add_model(catboost_model) %>%
  add_recipe(model_recipe)

catboost_results <-
  catboost_wf %>%
  tune_grid(
    resamples = miR_cv,
    grid = 5,
    control = control_grid(save_pred = TRUE),
    metrics = metric_set(accuracy, roc_auc)
  )
# Warning message:
# All models failed. Run `show_notes(.Last.tune.result)` for more information.

show_notes(.Last.tune.result)
# unique notes:
# ------------------------------------------------------------------------------------------
# Error in `check_spec_mode_engine_val()`:
# ! Engine 'catboost' is not supported for `boost_tree()`. See `show_engines('boost_tree')`.

show_engines('boost_tree')
# A tibble: 9 x 2
#   engine   mode
#   <chr>    <chr>
# 1 xgboost  classification
# 2 xgboost  regression
# 3 C5.0     classification
# 4 spark    classification
# 5 spark    regression
# 6 catboost regression
# 7 catboost classification
# 8 lightgbm regression
# 9 lightgbm classification
Problem: Error in catboost.from_matrix(as.matrix(float_and_cat_features_data), : Unsupported label type, expecting double or integer, got character
catboost version: 1.0.0
Operating System: Windows 10
CPU: Ryzen 5?
GPU: Nvidia GTX 1660
Hello, I am fitting a simple model with 2 numerical predictors and a binomial outcome.
When I try to tune_grid this model in R with tidymodels, I get this error even though there are no categorical predictors in my dataset:
Error in catboost.from_matrix(as.matrix(float_and_cat_features_data), : Unsupported label type, expecting double or integer, got character