[New Feature] Monotonic Constraints in Tree Construction #1514

Closed
tqchen opened this issue Aug 27, 2016 · 46 comments
tqchen commented Aug 27, 2016

I have received a few requests to support monotonic constraints on certain features with respect to the output,

i.e., when all other features are held fixed, force the prediction to be monotonically increasing with respect to a specified feature. I am opening this issue to gauge general interest in this feature; I can add it if there is enough interest.
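That is, for a feature j constrained to be increasing, the learned function f satisfies f(x_1, ..., x_j, ..., x_k) <= f(x_1, ..., x_j', ..., x_k) whenever x_j <= x_j' and all other features are equal; a decreasing constraint reverses the inequality.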

I would need help from volunteers in the community to test the beta feature and to contribute documentation and a tutorial on using it. Please reply to this issue if you are interested.

@tqchen tqchen added this to the v0.6.1 milestone Aug 27, 2016
tqchen commented Aug 27, 2016

An experimental version is provided in #1516. To use it before it gets merged, clone the repo https://github.com/tqchen/xgboost,

then turn on the following option (accessible via the Python and R APIs):

monotone_constraints = "(0,1,1,0)"

There are two arguments:

  • updater = "grow_monotone_colmaker,prune" selects the experimental monotonic tree updater (a later update in this thread makes this selection automatic once monotone constraints are given).
  • monotone_constraints is a list whose length is the number of features: 1 indicates monotonically increasing, -1 decreasing, and 0 no constraint. If it is shorter than the number of features, it is padded with 0s. (A minimal usage sketch follows below.)
    • Currently it supports Python's tuple format; pass it as a string when using R.
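For concreteness, here is a minimal end-to-end sketch of the option; the data and hyperparameters are made up for illustration:

import numpy as np
import xgboost as xgb

# Hypothetical data: four features, with the middle two constrained
# to be monotonically increasing.
rng = np.random.RandomState(0)
X = rng.uniform(size=(1000, 4))
y = X[:, 1] + X[:, 2] + rng.normal(scale=0.01, size=1000)
dtrain = xgb.DMatrix(X, label=y)

params = {
    'max_depth': 2,
    'eta': 0.1,
    'monotone_constraints': '(0,1,1,0)',  # 0 = unconstrained, 1 = increasing, -1 = decreasing
}
bst = xgb.train(params, dtrain, num_boost_round=100)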

Things to verify

  • The speed of the original tree boosters does not slow down (I changed the code structure a bit; in theory the template optimizations will inline the changes away, but this needs to be confirmed)
  • The speed and correctness of monotonic regression
  • The performance impact of introducing this constraint

Known limitations

Currently only the exact greedy algorithm on multi-core is supported; the feature is not yet available in the distributed version.


madrury commented Aug 29, 2016

@tqchen I got a request at work today to build some GBMs with monotone constraints to test against the performance of some other models. This would be with a Tweedie deviance loss, so as it stands today I would have to go with a custom loss function.

In any case, seems like a good chance to help out and get some work done at the same time.

yanyachen commented Aug 30, 2016

Based on the talk here, GBM (the R package) only enforces monotonicity locally.
Could you clarify how XGBoost enforces monotonic constraints?
It would be great if XGBoost could enforce global constraints.

tqchen commented Aug 30, 2016

I do not understand what you mean by a local or global constraint, can you elaborate?

@yanyachen

Sorry, I pasted the wrong link; here is the right one (Link).
Each tree may obey the monotonic constraint only on a certain subset of the range of the feature of interest, so that an ensemble of many trees together may violate the overall monotonicity across the whole range of that feature.

tqchen commented Aug 30, 2016

OK, in my understanding it is enforced globally. You are welcome to try it out.

@XiaoxiaoWang87

Just did some simple tests of the monotonicity constraint in the context of a univariate regression. You can find the code and some very brief documentation here:

https://github.com/XiaoxiaoWang87/xgboost_mono_test/blob/master/xgb_monotonicity_constraint_testing1-univariate.ipynb

Some initial observations:

  • For a single-variable regression problem, the monotonic constraint = +1 seems to work well.
  • For a single-variable regression problem, on my dataset the monotonic constraint = -1 does not seem to yield a monotonically decreasing function; rather, it gives a constant. But this could also be due to a lack of improvement when forcing the constraint. To be confirmed (per Tianqi's suggestion, try flipping the dataset and setting the constraint to +1; see the sketch after this list).
  • Adding the constraint (correctly) can potentially prevent overfitting and bring some performance / interpretation benefit.
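A sketch of that suggested check on made-up data: fit with constraint -1, then fit the negated feature with constraint +1; the two models should produce near-identical predictions.

import numpy as np
import xgboost as xgb

# Hypothetical univariate data with a decreasing trend.
rng = np.random.RandomState(0)
x = rng.uniform(size=1000)
y = -3 * x + rng.normal(scale=0.1, size=1000)
params = {'max_depth': 2, 'eta': 0.1}

# Direct fit with a decreasing constraint.
d_dec = xgb.DMatrix(x.reshape(-1, 1), label=y)
bst_dec = xgb.train(dict(params, monotone_constraints='(-1)'), d_dec, 100)

# Flipped fit: negate the feature and constrain increasing instead.
d_inc = xgb.DMatrix(-x.reshape(-1, 1), label=y)
bst_inc = xgb.train(dict(params, monotone_constraints='(1)'), d_inc, 100)

# Maximum disagreement between the two fits; should be near zero.
print(np.abs(bst_dec.predict(d_dec) - bst_inc.predict(d_inc)).max())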

tqchen commented Sep 3, 2016

Turns out I introduced a bug in the constraint = -1 case. I pushed a fix; please see if the newest version works well. Please also check whether it works when there are multiple constraints.

madrury commented Sep 3, 2016

@tqchen I tested your fix for the decreasing bug; it seems like it's working now.

[plot: xgboost-no-constraint]
[plot: xgboost-with-constraint]

tqchen commented Sep 3, 2016

Let us confirm whether there is a speed decrease vs. the original version on some standard datasets, then we can merge it in.

madrury commented Sep 3, 2016

@tqchen I tested a two variable model, one with an increasing constraint and one with a decreasing:

params_constrained = params.copy()
params_constrained['updater'] = "grow_monotone_colmaker,prune"
params_constrained['monotone_constraints'] = "(1,-1)"

The results are good:

[plot: xgboost-two-vars-increasing]
[plot: xgboost-two-vars-decreasing]

I'll try to find a little time to do some timing tests this afternoon.

tqchen commented Sep 6, 2016

I made an update to #1516 to allow automatic detection of the monotone options; now users only need to pass in monotone_constraints = "(0,1,1,0)". Please check if it works.

I will merge this in if the speed tests go OK, and then let us move on to the next stage of adding tutorials.

@madrury @XiaoxiaoWang87

XiaoxiaoWang87 commented Sep 6, 2016

Added tests for the multivariate case here:

https://github.com/XiaoxiaoWang87/xgboost_mono_test/blob/master/xgb_monotonicity_constraint_testing2-multivariate.ipynb

  • I confirm that both monotonic constraint = 1 and = -1 now work as expected.

  • Constraining monotonicity does not lead to obvious speed* degradation (a sketch of the timing method appears after this list).
    *speed = avg [ time until early stopping / number of boosting iterations until early stopping ]

    no constraint: 964.9 microseconds per iteration
    with constraint: 861.7 microseconds per iteration

    (please comment if you have a better way to do the speed test)

  • One needs to be careful when constraining the direction of a non-monotonic variable; this can lead to performance degradation.

  • Seeing the code crash with Check failed: (wleft) <= (wright) when playing around with different hyperparameters.
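A sketch of a per-iteration timing comparison on made-up data (no early stopping; all hyperparameters here are arbitrary):

import time
import numpy as np
import xgboost as xgb

# Average wall time per boosting iteration for a given parameter set.
def time_per_iteration(params, dtrain, num_boost_round=500):
    start = time.time()
    xgb.train(params, dtrain, num_boost_round=num_boost_round)
    return (time.time() - start) / num_boost_round

rng = np.random.RandomState(0)
X = rng.uniform(size=(10000, 4))
y = X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=10000)
dtrain = xgb.DMatrix(X, label=y)

base = {'max_depth': 4, 'eta': 0.1}
t0 = time_per_iteration(base, dtrain)
t1 = time_per_iteration(dict(base, monotone_constraints='(0,1,-1,0)'), dtrain)
print('no constraint:   %.1f us/iter' % (t0 * 1e6))
print('with constraint: %.1f us/iter' % (t1 * 1e6))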

madrury commented Sep 6, 2016

I ran a couple of timing experiments in a Jupyter notebook.

First test: some simple simulated data. There are two features, one increasing and one decreasing, but with a small sinusoidal wave superimposed so that neither feature is truly monotonic:

import numpy as np

N, K = 10000, 2  # sample size and number of features (assumed values)
X = np.random.random(size=(N, K))
y = (5*X[:, 0] + np.sin(5*2*np.pi*X[:, 0])
     - 5*X[:, 1] - np.cos(5*2*np.pi*X[:, 1])
     + np.random.normal(loc=0.0, scale=0.01, size=N))

Here are timing results for xgboost with and without monotone constraints. I turned off early stopping and boosted a set number of iterations for each.

First without monotone constraints:

%%timeit -n 100
model_no_constraints = xgb.train(params, dtrain, 
                                 num_boost_round = 2500, 
                                 verbose_eval = False)

100 loops, best of 3: 246 ms per loop

And here with monotonicity constraints

%%timeit -n 100
model_with_constraints = xgb.train(params_constrained, dtrain, 
                                 num_boost_round = 2500, 
                                 verbose_eval = False)

100 loops, best of 3: 196 ms per loop

Second test: California Housing data from sklearn. Without constraints:

%%timeit -n 10
model_no_constraints = xgb.train(params, dtrain, 
                                 num_boost_round = 2500, 
                                 verbose_eval = False)

10 loops, best of 3: 5.9 s per loop

Here are the constraints I used

print(params_constrained['monotone_constraints'])

(1,1,1,0,0,1,0,0)

And the timing for the constrained model

%%timeit -n 10
model_with_constraints = xgb.train(params_constrained, dtrain, 
                                 num_boost_round = 2500, 
                                 verbose_eval = False)

10 loops, best of 3: 6.08 s per loop

tqchen commented Sep 7, 2016

@XiaoxiaoWang87 I have pushed another PR to loosen the check on wleft and wright; please see if it works.
@madrury Can you also compare against the previous version of XGBoost without the constraint feature?

madrury commented Sep 7, 2016

@tqchen Sure. Can you recommend a commit hash to compare against? Should I just use the commit prior to your addition of the monotone constraints?

tqchen commented Sep 7, 2016

Yes, the previous one will do.

madrury commented Sep 7, 2016

@tqchen On rebuilding the updated version, I'm getting some errors that I was not getting before. I'm hoping the reason jumps out at you clearly.

If I try to run the same code as before, I get an exception; here is the full traceback:

---------------------------------------------------------------------------
XGBoostError                              Traceback (most recent call last)
<ipython-input-14-63a9f6e16c9a> in <module>()
      8    model_with_constraints = xgb.train(params, dtrain, 
      9                                        num_boost_round = 1000, evals = evallist,
---> 10                                    early_stopping_rounds = 10)  

/Users/matthewdrury/anaconda/lib/python2.7/site-packages/xgboost-0.6-py2.7.egg/xgboost/training.pyc in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, learning_rates, xgb_model, callbacks)
    201                            evals=evals,
    202                            obj=obj, feval=feval,
--> 203                            xgb_model=xgb_model, callbacks=callbacks)
    204 
    205 

/Users/matthewdrury/anaconda/lib/python2.7/site-packages/xgboost-0.6-py2.7.egg/xgboost/training.pyc in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     72         # Skip the first update if it is a recovery step.
     73         if version % 2 == 0:
---> 74             bst.update(dtrain, i, obj)
     75             bst.save_rabit_checkpoint()
     76             version += 1

/Users/matthewdrury/anaconda/lib/python2.7/site-packages/xgboost-0.6-py2.7.egg/xgboost/core.pyc in update(self, dtrain, iteration, fobj)
    804 
    805         if fobj is None:
--> 806             _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, iteration, dtrain.handle))
    807         else:
    808             pred = self.predict(dtrain)

/Users/matthewdrury/anaconda/lib/python2.7/site-packages/xgboost-0.6-py2.7.egg/xgboost/core.pyc in _check_call(ret)
    125     """
    126     if ret != 0:
--> 127         raise XGBoostError(_LIB.XGBGetLastError())
    128 
    129 

XGBoostError: [14:08:41] src/tree/tree_updater.cc:18: Unknown tree updater grow_monotone_colmaker

If I switch out everything for the keyword argument you implemented I also get an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-ef7671f72925> in <module>()
      8                                    monotone_constraints="(1)",
      9                                    num_boost_round = 1000, evals = evallist,
---> 10                                    early_stopping_rounds = 10)  

TypeError: train() got an unexpected keyword argument 'monotone_constraints'

tqchen commented Sep 7, 2016

Remove the updater argument and keep the monotone constraints argument in the parameters; the monotone constraint updater is now activated automatically when monotone constraints are present.

madrury commented Sep 7, 2016

@tqchen My buddy @amontz helped me figure that out immediately after I posted the message. I had interpreted your comment as passing monotone_constraints as a kwarg to .train.

It works with those adjustments. Thanks.

tqchen commented Sep 8, 2016

@madrury can you confirm the speed?

tqchen commented Sep 8, 2016

Also, @madrury and @XiaoxiaoWang87: since this feature is now close to being merged, it would be great if you could coordinate to create a tutorial introducing it to users.

We cannot take an IPython notebook directly into the main repo, but images can be pushed to https://github.com/dmlc/web-data/tree/master/xgboost and the markdown to the main repo.

tqchen commented Sep 8, 2016

We also need to change the front-end interface string conversion, so that an integer tuple can be converted into the string tuple format accepted by the backend.

@hetong007 for changes in R and @slundberg for Julia
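For illustration, a sketch in Python of the kind of conversion a front end would do (the helper name is hypothetical):

# Convert an integer sequence into the "(1,-1,0)" string format the
# backend expects.
def constraints_to_str(constraints):
    return '(' + ','.join(str(int(c)) for c in constraints) + ')'

print(constraints_to_str([1, -1, 0]))  # prints (1,-1,0)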

@slundberg

@tqchen Julia is currently pinned to the 0.4 version of XGBoost, so the next time I need to use it and have time set aside, I'll update the bindings if no one else has by then. At that point this change can also be added.

madrury commented Sep 8, 2016

Here's the comparison between models without a monotone constraint, from before the implementation and after.

Commit 8cac37: Before implementation of monotone constraints.
Simulated Data: 100 loops, best of 3: 232 ms per loop
California Data: 10 loops, best of 3: 5.89 s per loop

Commit b1c224: After implementation of monotone constraint.
Simulated Data: 100 loops, best of 3: 231 ms per loop
California Data: 10 loops, best of 3: 5.61 s per loop

The speedup for California after the implementation looks suspicious to me, but I tried it twice each way, and it's consistent.

madrury commented Sep 8, 2016

I'd be happy to take a shot at writing a tutorial. I'll look around at the existing documentation and put something together in the next few days.

tqchen commented Sep 8, 2016

This is great. The PR is now officially merged into master. Looking forward to seeing the tutorial.

@XiaoxiaoWang87

Thanks @madrury, looking forward to it. Let me know how I can help. I'd certainly be willing to do more studies on this topic.

@hetong007

I will enhance it tomorrow. I'm just curious about the reason for communicating with C++ via a string instead of an array.

@hetong007

I am testing from R. I randomly generated two-variable data and tried to make predictions.

However, I found that

  1. xgboost doesn't constrain the prediction.
  2. the parameter monotone_constraints makes the predictions slightly different.

Please point out any mistakes I've made.

The code to reproduce it (tested on the latest github version, not from drat):

library(xgboost)

set.seed(1024)
x1 = rnorm(1000, 10)
x2 = rnorm(1000, 10)
y = -1*x1 + rnorm(1000, 0.001) + 3*sin(x2)
train = cbind(x1, x2)

bst = xgboost(data = train, label = y, max_depth = 2,
              eta = 0.1, nthread = 2, nrounds = 10,
              monotone_constraints = '(1,-1)')

pred = predict(bst, train)
ind = order(train[,1])
pred.ord = pred[ind]
plot(train[,1], y, main = 'with constraint')
lines(pred.ord)

[plot: with constraint]

bst = xgboost(data = train, label = y, max_depth = 2,
              eta = 0.1, nthread = 2, nrounds = 10)

pred = predict(bst, train)
ind = order(train[,1])
pred.ord = pred[ind]
plot(train[,1], y, main = 'without constraint')
lines(pred.ord)

[plot: without constraint]

tqchen commented Sep 8, 2016

The constraint is enforced on the partial order: it only applies when we move along the constrained (monotone) axis while keeping the other axes fixed.
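A quick way to see what that guarantees, as a sketch on made-up data: scan the constrained feature with the other feature held fixed, and the predictions should come out sorted.

import numpy as np
import xgboost as xgb

# Two features: the first constrained increasing, the second decreasing.
rng = np.random.RandomState(1)
X = rng.uniform(size=(2000, 2))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=2000)
dtrain = xgb.DMatrix(X, label=y)
params = {'max_depth': 3, 'eta': 0.1, 'monotone_constraints': '(1,-1)'}
bst = xgb.train(params, dtrain, num_boost_round=200)

grid = np.linspace(0, 1, 100)
X_scan = np.column_stack([grid, np.full(100, 0.5)])  # hold feature 2 at 0.5
pred = bst.predict(xgb.DMatrix(X_scan))
assert np.all(np.diff(pred) >= 0)  # non-decreasing along the monotone axis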

madrury commented Sep 8, 2016

@hetong007 To make my plots I:

  • Created an array containing the grid of x-coordinates at which I wanted to predict the variable, then joined them up into the line plot. This would use seq in R.
  • Set all the other variables equal to their average values in the training data. This would be something like colMeans in R.

Here's the Python code that I used for the plots I included above; it should convert pretty easily to equivalent R code.

import numpy as np
import matplotlib.pyplot as plt
import xgboost as xgb

def plot_one_feature_effect(model, X, y, idx=1):

    # Grid of values for the feature of interest.
    x_scan = np.linspace(0, 1, 100)
    X_scan = np.empty((100, X.shape[1]))
    X_scan[:, idx] = x_scan

    # Hold every other feature at its training-data mean.
    left_feature_means = np.tile(X[:, :idx].mean(axis=0), (100, 1))
    right_feature_means = np.tile(X[:, (idx+1):].mean(axis=0), (100, 1))
    X_scan[:, :idx] = left_feature_means
    X_scan[:, (idx+1):] = right_feature_means

    X_plot = xgb.DMatrix(X_scan)
    y_plot = model.predict(X_plot, ntree_limit=model.best_ntree_limit)

    plt.plot(x_scan, y_plot, color='black')
    plt.plot(X[:, idx], y, 'o', alpha=0.25)

XiaoxiaoWang87 commented Sep 8, 2016

Here is how I do partial dependence plots (for an arbitrary model):

  • Scan a grid of values for feature X.
  • For every grid value of feature X:
    - Set the entire feature X column (all rows) to this value, leaving the other features unchanged.
    - Make predictions for all rows.
    - Take the average of the predictions.
  • The resulting (X feature value, average prediction) pairs give you the partial dependence of feature X.

Code:

import numpy as np
import matplotlib.pyplot as plt
import xgboost as xgb

def plot_partial_dependency(bst, X, y, f_id):

    X_temp = X.copy()

    # Grid spanning the (0.1, 99.5) percentile range of the feature.
    x_scan = np.linspace(np.percentile(X_temp[:, f_id], 0.1),
                         np.percentile(X_temp[:, f_id], 99.5), 50)
    y_partial = []

    for point in x_scan:

        # Set the entire column to the grid value; other features unchanged.
        X_temp[:, f_id] = point

        dpartial = xgb.DMatrix(X_temp)
        y_partial.append(np.average(bst.predict(dpartial)))

    y_partial = np.array(y_partial)

    # Plot partial dependence

    fig, ax = plt.subplots()
    fig.set_size_inches(5, 5)
    plt.subplots_adjust(left=0.17, right=0.94, bottom=0.15, top=0.9)

    ax.plot(x_scan, y_partial, '-', color='black', linewidth=1)
    ax.plot(X[:, f_id], y, 'o', color='blue', alpha=0.02)

    ax.set_xlim(min(x_scan), max(x_scan))
    ax.set_xlabel('Feature X', fontsize=10)
    ax.set_ylabel('Partial Dependence', fontsize=12)

hetong007 commented Sep 8, 2016

Thanks for the guidance! I realized that I made a silly mistake in the plot. Here's another test on univariate data; the plot seems fine:

set.seed(1024)
x = rnorm(1000, 10)
y = -1*x + rnorm(1000, 0.001) + 3*sin(x)
train = matrix(x, ncol = 1)

bst = xgboost(data = train, label = y, max_depth = 2,
               eta = 0.1, nthread = 2, nrounds = 100,
               monotone_constraints = '(-1)')
pred = predict(bst, train)
ind = order(train[,1])
pred.ord = pred[ind]
plot(train[,1], y, main = 'with constraint', pch=20)
lines(train[ind,1], pred.ord, col=2, lwd = 5)

[plot: with constraint]

bst = xgboost(data = train, label = y, max_depth = 2,
               eta = 0.1, nthread = 2, nrounds = 100)
pred = predict(bst, train)
ind = order(train[,1])
pred.ord = pred[ind]
plot(train[,1], y, main = 'without constraint', pch=20)
lines(train[ind,1], pred.ord, col=2, lwd = 5)

[plot: without constraint]

tqchen commented Sep 8, 2016

@hetong007 So the goal for the R interface is to enable users to pass an R vector in addition to strings:

monotone_constraints=c(1,-1)

tqchen commented Sep 12, 2016

Please let us know when you PR the tutorial.

@hetong007 You are also more than welcome to make an R-bloggers version.

madrury commented Sep 15, 2016

@tqchen Sorry guys, I've been on a work trip for the week.

I sent a couple of pull requests for a monotonic constraint tutorial. Please let me know what you think; I'm happy with any criticism or critique.

JoshuaC3 commented Dec 19, 2016

Hopefully it is appropriate to ask this here: will this now work if we update using the usual git clone --recursive https://github.com/dmlc/xgboost?

I ask because I saw the new tutorial but nothing about a change to the code itself. Thank you all!

tqchen commented Dec 19, 2016

Yes, the new feature was merged before the tutorial got merged.

@tqchen tqchen closed this as completed Dec 19, 2016
TrJUDD commented Dec 29, 2016

Hello,

I'm not sure that you successfully implemented global monotonicity; from what I've seen in your code, it corresponds more to local monotonicity.

Here is a simple example breaking monotonicity:

df <- data.frame(y = c(2, rep(6,100), 1, rep(11,100)),
                 x1 = c(rep(1,101), rep(2,101)),
                 x2 = c(1, rep(2,100), 1, rep(2,100)))

library(xgboost)
set.seed(0)
XGB <- xgboost(data = data.matrix(df[,-1]), label = df[,1],
               objective = "reg:linear",
               bag.fraction = 1, nround = 100,
               monotone_constraints = c(1,0),
               eta = 0.1)

sans_corr <- data.frame(x1 = c(1,2,1,2), x2 = c(1,1,2,2))

sans_corr$prediction <- predict(XGB, data.matrix(sans_corr))

I hope my understanding of your code and my example are not mistaken.

@carsonyan

Currently this feature is not in the sklearn API. Can you or someone please help add it? Thanks!

@davidADSP

Is it possible to enforce general monotonicity on a variable, without specifying whether it should be increasing or decreasing?

@cxu60-zz

@davidADSP you can do a Spearman correlation check between the desired predictor and the target to see whether increasing or decreasing is appropriate.
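A minimal sketch of that check (the helper name and data are made up), using scipy's spearmanr:

import numpy as np
from scipy.stats import spearmanr

# Pick a constraint sign from the rank correlation between a feature
# and the target.
def suggest_constraint(feature, target):
    rho, _ = spearmanr(feature, target)
    return 1 if rho > 0 else -1

rng = np.random.RandomState(0)
x = rng.uniform(size=500)
y = -2 * x + rng.normal(scale=0.1, size=500)
print(suggest_constraint(x, y))  # -1: a decreasing constraint looks appropriate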

ccmien commented Jul 27, 2017

This feature seems to have no effect when 'tree_method': 'hist' is used. @tqchen any help? Thanks all.

@dksahuji

How does the constraint work for a multiclass objective like mlogloss? Is the monotonicity constraint supported for multiclass losses? If yes, how is it enforced, given that there is a tree for each class?

junegit commented Jun 15, 2018

Is there any whitepaper on the monotonicity algorithm enforced in XGBoost? Is it global or local? Local would mean specific to certain nodes, such that nodes in other parts of the tree might create a violation of the overall monotonicity. Also, can anyone please help me understand lines L412-417: why is "w" bounded above and below, and how does this help maintain monotonicity? And line 457: why is "mid" used?

@lock lock bot locked as resolved and limited conversation to collaborators Oct 25, 2018