[dask] add cv() function #3847

Closed
@jameslamb

Description

Summary

As of this writing, lightgbm.dask only supports model classes that mimic the scikit-learn API. It should also support a cv() function equivalent to lightgbm.engine.cv.

def cv(params, train_set, num_boost_round=100,

Because cv() expects a LightGBM Dataset object, this also implies creating a new class, lightgbm.dask.DaskDataset. cv() should take in train_set as a lightgbm.dask.DaskDataset and should return a regular LightGBM Booster.
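To make the proposal a bit more concrete, here is a minimal, standard-library-only sketch of what a partition-aware dataset container and a fold assignment step might look like. Everything in it is an assumption for illustration: `DaskDatasetSketch` and `fold_partitions` are hypothetical names, plain Python lists stand in for the partitions of a Dask collection, and splitting folds at partition granularity is only one possible design, not something LightGBM or this issue specifies.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DaskDatasetSketch:
    """Hypothetical stand-in for the proposed lightgbm.dask.DaskDataset.

    Each entry of data_parts / label_parts plays the role of one partition
    of a Dask collection; no real Dask or LightGBM objects are involved.
    """

    data_parts: List[Sequence[Sequence[float]]]
    label_parts: List[Sequence[float]]

    def __post_init__(self) -> None:
        # Features and labels must be partitioned the same way.
        if len(self.data_parts) != len(self.label_parts):
            raise ValueError("data and label must have the same number of partitions")
        for X_part, y_part in zip(self.data_parts, self.label_parts):
            if len(X_part) != len(y_part):
                raise ValueError("each data partition must match its label partition in length")

    def num_partitions(self) -> int:
        return len(self.label_parts)

    def num_data(self) -> int:
        # Total number of rows across all partitions.
        return sum(len(y_part) for y_part in self.label_parts)


def fold_partitions(n_parts: int, nfold: int) -> List[Tuple[List[int], List[int]]]:
    """Assign partition indices to folds round-robin (an assumed strategy).

    Returns one (train_indices, valid_indices) pair per fold, so whole
    partitions are held out rather than individual rows.
    """
    if nfold < 2:
        raise ValueError("nfold must be at least 2")
    if nfold > n_parts:
        raise ValueError("nfold cannot exceed the number of partitions")
    folds = []
    for k in range(nfold):
        valid = [i for i in range(n_parts) if i % nfold == k]
        train = [i for i in range(n_parts) if i % nfold != k]
        folds.append((train, valid))
    return folds
```

Holding out whole partitions avoids moving rows between workers, at the cost of fold sizes depending on how the data happens to be partitioned. A real implementation would need to decide whether that trade-off is acceptable.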

Motivation

Having a functional interface would make the transition from non-Dask to Dask easier for users who are already using lightgbm.engine.cv.

References

See the DaskDMatrix in xgboost.dask for some inspiration on how DaskDataset might be implemented.

https://github.com/dmlc/xgboost/blob/a275f4026728ed14fbc70da142ef7a4a1d3de04d/python-package/xgboost/dask.py#L181-L186

DaskDataset may be implemented outside of this feature, to support #3846.
