Description
With scikit-learn 1.6.0, __sklearn_tags__ was introduced; see Estimator Tags.
This has caused issues in other packages:
- lightgbm: [python-package] simplify scikit-learn 1.6+ tags support microsoft/LightGBM#6735; [python-package][R-package] adapt to scikit-learn 1.6 testing changes, pin more packages in R 3.6 CI jobs microsoft/LightGBM#6718
- xgboost: Incompability between scikit-learn and xgboost dmlc/xgboost#11093
Further, this breaks stacked global learners, as used in the Example Gallery.
Minimal Example
import doubleml as dml
import sklearn
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from doubleml.rdd.datasets import make_simple_rdd_data
from doubleml.rdd import RDFlex
from doubleml.utils.global_learner import GlobalRegressor
from sklearn.ensemble import StackingRegressor
print(sklearn.__version__)
print(dml.__version__)
np.random.seed(42)
data_dict = make_simple_rdd_data(n_obs=1000, fuzzy=False)
cov_names = ['x' + str(i) for i in range(data_dict['X'].shape[1])]
df = pd.DataFrame(np.column_stack((data_dict['Y'], data_dict['D'], data_dict['score'], data_dict['X'])), columns=['y', 'd', 'score'] + cov_names)
dml_data = dml.DoubleMLData(df, y_col='y', d_cols='d', x_cols=cov_names, s_col='score')
ml_g = StackingRegressor([("global", GlobalRegressor(LassoCV())),
                          ("local", LassoCV())], final_estimator=LassoCV())
rdflex_obj = RDFlex(dml_data, ml_g, fuzzy=False)
rdflex_obj.fit()
Output:
1.6.0
0.9.3
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-deff49535681> in <cell line: 0>()
20 ("local", LassoCV())], final_estimator=LassoCV())
21 rdflex_obj = RDFlex(dml_data, ml_g, fuzzy=False)
---> 22 rdflex_obj.fit()
6 frames
/usr/local/lib/python3.11/dist-packages/sklearn/ensemble/_base.py in _validate_estimators(self)
232 for est in estimators:
233 if est != "drop" and not is_estimator_type(est):
--> 234 raise ValueError(
235 "The estimator {} should be a {}.".format(
236 est.__class__.__name__, is_estimator_type.__name__[3:]
ValueError: The estimator GlobalRegressor should be a regressor.
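A possible cause, stated as an assumption and not verified against the DoubleML source: if GlobalRegressor lists BaseEstimator before RegressorMixin in its base classes, the mixin's __sklearn_tags__ override may never be reached, so the estimator type stays unset and the regressor check in StackingRegressor fails. The sketch below reproduces that mechanism with simplified stand-in classes. All class and function names in it are hypothetical and are not scikit-learn's real implementation.

```python
# Simplified stand-ins for scikit-learn's tag resolution (NOT the real classes).
# Assumption: the base estimator's __sklearn_tags__ does not call super(),
# while the regressor mixin's version does and then sets the estimator type.


class BaseEstimatorSketch:
    def __sklearn_tags__(self):
        # Default tags: no estimator type set.
        return {"estimator_type": None}


class RegressorMixinSketch:
    def __sklearn_tags__(self):
        # Cooperative call: fetch tags from the next class in the MRO,
        # then mark the estimator as a regressor.
        tags = super().__sklearn_tags__()
        tags["estimator_type"] = "regressor"
        return tags


class BaseFirst(BaseEstimatorSketch, RegressorMixinSketch):
    """Base class listed first: the mixin's override is never reached."""


class MixinFirst(RegressorMixinSketch, BaseEstimatorSketch):
    """Mixin listed first: the mixin's override runs and sets the type."""


def is_regressor_sketch(est):
    # Stand-in for a tag-based regressor check.
    return est.__sklearn_tags__()["estimator_type"] == "regressor"


print(is_regressor_sketch(BaseFirst()))   # False
print(is_regressor_sketch(MixinFirst()))  # True
```

If this is indeed what happens here, listing RegressorMixin before BaseEstimator in GlobalRegressor (and likewise for any classifier counterpart) might be enough to resolve the error, but that would need to be checked against the actual class definitions.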